Too big record size
Hi,
When indexing website content via the CLI command “wp algolia reindex --all”, I get the following error on a number of records.
Record at the position 144 objectID=133092-0 is too big size=11814/10000 bytes. Please have a look at https://www.algolia.com/doc/guides/sending-and-managing-data/prepare-your-data/in-depth/index-and-records-size-and-usage-limitations/#record-size-limits
Does the plugin provide a way to split post content into multiple records?
BTW, is there a public plugin roadmap where we can learn about new developments and upcoming features?
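For anyone reading along, here is a rough way to sanity-check a record against the 10000-byte limit quoted in the error above. It is only a sketch in Python, not the plugin's PHP and not a diagnosis. It assumes the size is roughly the UTF-8 byte length of the record's JSON form (Algolia's exact accounting may differ), and record_size_bytes is a made-up helper name. The one certain part is that each CJK character takes 3 bytes in UTF-8, so the character count understates the byte count.

```python
import json

LIMIT_BYTES = 10_000  # limit quoted in the error message in this thread


def record_size_bytes(record: dict) -> int:
    """Approximate a record's size as the UTF-8 byte length of its compact JSON.

    Assumption: this only approximates how Algolia measures records.
    """
    payload = json.dumps(record, ensure_ascii=False, separators=(",", ":"))
    return len(payload.encode("utf-8"))


# Hypothetical record: 2,000 CJK characters of content.
text = "股" * 2000
record = {"objectID": "133092-0", "post_content": text}

print("characters:", len(text))
print("approx. bytes:", record_size_bytes(record))
print("over limit:", record_size_bytes(record) > LIMIT_BYTES)
```

Running this against the real attribute array for a failing record shows how far over the limit it is and which attribute dominates.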
The plugin is already splitting post content into multiple records for you, to be fair. https://github.com/WebDevStudios/wp-search-with-algolia/blob/main/includes/indices/class-algolia-searchable-posts-index.php#L140-L172
For example, 133092-0, 133092-1, and 133092-2 would be 3 records for the post with ID 133092.
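To illustrate the "<post ID>-<n>" naming, here is a minimal Python sketch of splitting content into suffix-numbered records. It is not the plugin's code (see the linked source for the real logic). split_into_records and the 2000-byte chunk budget are assumptions made up for the example, and the sketch deliberately counts UTF-8 bytes rather than characters.

```python
def split_into_records(post_id: int, content: str, max_bytes: int = 2000) -> list[dict]:
    """Split content into chunks of at most max_bytes UTF-8 bytes each.

    Each chunk becomes one record with objectID "<post_id>-<n>", n starting at 0.
    max_bytes=2000 is a hypothetical budget, not a value taken from the plugin.
    """
    chunks, current, size = [], [], 0
    for ch in content:
        ch_size = len(ch.encode("utf-8"))
        # Start a new chunk when adding this character would exceed the budget.
        if current and size + ch_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += ch_size
    # Flush the last chunk; empty content still yields a single record.
    if current or not chunks:
        chunks.append("".join(current))
    return [
        {"objectID": f"{post_id}-{i}", "post_content": chunk}
        for i, chunk in enumerate(chunks)
    ]


records = split_into_records(133092, "股" * 1500)
print([r["objectID"] for r in records])
```

With 1,500 three-byte characters and a 2000-byte budget this yields three records, matching the 133092-0, 133092-1, 133092-2 example above.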
Why the size isn’t getting calculated correctly for post 133092 isn’t something I can necessarily answer on my own, though I’m curious what content may be in it.
Regarding any type of public roadmap for things, the GitHub repo page would be best: https://github.com/WebDevStudios/wp-search-with-algolia
For example, here’s what we presently have slated for 2.9.0: https://github.com/WebDevStudios/wp-search-with-algolia/milestone/20
Thank you for your detailed answer.
While I see records in Algolia with IDs ending in -1 or -2, I don’t see the other parts of these records, meaning -0, -1, -2. I only see one of the parts. Do you know how to see all parts of a split record to check what the data in them looks like?
Here is the data of the record that failed to be indexed.
$attributes: Array
(
[post_id] => 133092
[post_title] => 年年翻倍!多空都贏的飆股投資法 (Traditional Chinese Edition)
[post_excerpt] =>
[images] => Array
(
[medium_large] => Array
(
[url] => https://readtoinvest.com/wp-content/uploads/2024/11/91QUZJnHIL-768x1080.jpg
[width] => 768
[height] => 1080
)
)
[post_content] => <div>
<div>28歲從職場退休、35歲財富自由!</div>
<div>曾經在股市賠光資產,他痛定思痛,</div>
<div>以工程師的精密思維與反覆試驗,</div>
<div>耗費心血研究出最佳賺錢方程式:</div>
<div>大盤紅綠燈判斷法、阿水一式、阿水二式,</div>
<div>並幫助許多人找到最適合自己的投資法!</div>
<div>◎「你想不想聽一種技法,簡單易懂,還可以操作一輩子?」</div>
<div>◎「我的學員中,如果專職操作,增值標準是一年翻一倍。」</div>
<div>◎「一個正常的上班族,要是採用我的技法還賺不到第一桶金,絕對是你沒有學好、沒有用心學。」</div>
<div>阿水非常重視C/P值。小時候家境曾由富轉債,所以只要有時間,他就會思考「如何在最短時間內賺最多的錢」。工作一年後,他開始投資股票,學了財務報表、技術分析、權證,還學了許多大師的選股方法,也懂得在上沖下洗時靈活運用不同招式,卻依然成為股市落水狗,4年慘賠近300萬。</div>
<div>本來沮喪到不敢再碰股票,直到有天被電影「洛基六:勇者無懼」激勵,重拾信心,決定開發一個「高C/P值的股票投資方式」:平日不需要花太多時間選股、確定可以使用一輩子。</div>
<div>經過長時間摸索,他透過「布林通道」理出頭緒,研發出獨有的選股方式:「阿水一式」,將之高度精進並系統化之後,以200多萬選股操作:第一年翻到近500萬!第二年,1000萬!第三年,2000萬!第五年,4000萬!</div>
<div>阿水從2015年正式開班授課至2018年為止,經由他的技法選出的,竟都是每月漲最多的飆股,同時還經過台股1600檔以上的回測證明。這一套技法不怕騙線,而且再多人使用也不會失靈。藉由本書,阿水將獨門技法不藏私的呈現,內容真金不怕火煉,希望你讀了之後,不再跌跌撞撞,能找到最適合自己的投資法。</div>
<div>★大盤紅綠燈判斷法:多年操作經驗觀察出漲跌機率,將大盤的多空趨勢化成「紅、綠、不明」3個燈號。</div>
<div>★阿水一式:做多!三招就能使用一輩子,並教你擁有「一眼瞬選飆股」的神奇技法。</div>
<div>★阿水二式:掌握關鍵四要素,檢視個股是否可做空,穩穩賺之外,還幫股票做健檢。</div>
<div>────────│6萬網友感激推薦!│────────</div>
<div>★★★★★精準分析如股市明燈!</div>
<div>★★★★★不只教表面功夫,而是從最根本的觀念進行教學。</div>
<div>★★★★★專業投資心法將一般投資人的10年血淚史都寫出來了!</div>
<div>★★★★★透過簡單技術分析,就可明確知道買賣點。</div>
<div>★★★★★水哥的分析幫我度過不少股市危機。</div>
<div>★★★★★說明得很清楚,而且有見地,幫我釐清許多重要觀念。</div>
<div>★★★★★讓我快狠準的精確挑飆股。</div>
<div>★★★★★分析詳細又客觀,跟市場上許多自稱老師的人風格截然不同,很中肯。</div>
<div>本書獻給──</div>
<div> 不想再被股票漲跌操弄的你;</div>
<div> 不想再當股票盲(忙)人的你;</div>
<div> 下列情況有任何一點符合,阿水的技法就非常適合你:</div>
<div>◎ 小資族,沒有太多錢投資</div>
<div>◎ 希望找到一個投資方式,讓生活品質更好</div>
<div>◎ 簡單易懂,只需要規律進行</div>
<div>◎ 在乎正期望值</div>
<div>◎ 喜歡系統化</div>
<div>◎ 不想花時間一直研究財報</div>
<div>◎ 不想當沖,因為好緊張</div>
<div>◎ 看到K線要背好多就頭痛</div>
<div>◎ 不想看垃圾資訊</div>
<div>◎ 不必管主力怎麼做</div>
<div>◎ 想找到飆股</div>
<div>◎ 希望價差交易,賺波段</div>
<div>◎ 想知道如何判斷大盤走多頭還是空頭?</div>
<div>◎ 想知道如何判斷個股走空?</div>
<div>◎ 想在多頭、空頭都能賺?</div>
<div>◎ 想知道何時該換股?</div>
<div>◎ 想了解資金注入的比例要多少?</div>
<div>◎ 想穿越股市上下振盪之迷霧</div>
<div>作者 / 股市阿水</div>
<div>13歲就對電腦展現出高度興趣,自學到接近駭客等級。19歲以《窮爸爸,富爸爸》的概念,建議開早餐店的親戚應該積極創造被動收入,並設計了當時仍未問世的POS系統程式,想為其優化工作流程,無奈當時大人認為他太年輕不支持他。27歲成為那斯達克上市公司的IT主管;28歲提早從職場退休;29歲開班授課,分析個股及期貨的投資要訣。現為專職投資人,並已從台股提款4000萬(數字持續增漲中)。</div>
<div>小時候家境曾由富轉債,認為自己的人生起點並不高,而且沒有退路,造就他十分重視C/P值的性格,時刻都在思考,該如何在最短時間內賺最多錢。出社會後,曾在股市慘賠300萬,但憑藉著注重邏輯、腳踏實地與重視風險的性格,使用布林通道反覆試驗摸索出獨門股市投資技法──「大盤紅綠燈判斷法」「阿水一式」「阿水二式」等,讓自己成為錢的主人,不再擔心中年失業危機。</div>
<div>抱持著「分享」的信念,他決定不藏私自己的獨門技法,持續幫助願意學習的散戶朋友,讓大家不會在股市裡「窮忙」,能穩定獲利。2015年正式開班授課至2018年為止,每個月漲最多的飆股,都是經由他研究出的技法選出,同時還經過台股1600檔以上的回測證明,技法內涵真金不怕火煉。</div>
<div>從2015年2月開始寫看盤日記,始終有三個堅持:絕不刪文、發文後不再修改、講錯了一定跟大家道歉;粉絲專頁追蹤人數至今已累積近6萬人。他認為投資和年紀無關,只和「看不看得清楚」有關,一旦看得清楚,從幾歲開始都不晚。</div>
<div>「財經狙擊手-股市阿水」粉絲專頁 https://www.facebook.com/i.warrant</div>
<div>文字協力 / 廖翊君</div>
<div>從事文字工作至今逾20年,是作家、文字內容供應者,也是經紀人;同時有一群實力堅強的專案團隊夥伴,共同完成本本好書及好內容,寫作書籍累積近150本。</div>
</div>
[_sku] => B07NY9F87F
[_product_url] => https://www.amazon.com/%E5%B9%B4%E5%B9%B4%E7%BF%BB%E5%80%8D%EF%BC%81%E5%A4%9A%E7%A9%BA%E9%83%BD%E8%B4%8F%E7%9A%84%E9%A3%86%E8%82%A1%E6%8A%95%E8%B3%87%E6%B3%95-Traditional-Chinese-%E8%82%A1%E5%B8%82%E9%98%BF%E6%B0%B4-ebook/dp/B07NY9F87F/
[_regular_price] => 8.99
[reviews_amazon] => 2
[print_length] => 119
[ranking] => 15563
[published_date] => 1
[published_year] => 2019
[price_highest] => 8.99
[rating_amazon] => 4.5
[published_date_formatted] => February 1, 2019
)
Around 30% of my records failed to be indexed when I added the “post_content” attribute to the record.
Any idea what might be causing this issue?
This reply was modified 2 weeks, 3 days ago by Martin Kilarski.
We don’t log or track how many records a given post ends up creating, so the best general way to see how many a post makes is to log into the Algolia dashboard and, in the listing, use “ObjectID” instead of “Search”, then enter the post ID with -0 appended, for example 133092-0, and keep incrementing the suffix. Once you no longer find a result, the last suffix that worked is the final record. Since the suffixes are zero-based, the number of records is that suffix plus one.
Regarding the indexing, best I have is something specific to the install, somehow.
I managed to get it indexed after publishing, a bulk re-index, and an edit to the published post. All three made it through without issue. While not exact attribute matches, I have enough to make it equivalent, as my average record sizes are 1.89KB
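The ObjectID lookup described above (try -0, -1, -2, and so on until nothing is found) can be sketched in Python like this. count_parts is a made-up helper, and exists stands in for whatever lookup you actually use (the dashboard by hand, or an API client). Here it is backed by a plain set so the example runs on its own.

```python
from typing import Callable


def count_parts(post_id: int, exists: Callable[[str], bool], max_probe: int = 1000) -> int:
    """Count the records for a post by probing "<post_id>-0", "<post_id>-1", ...

    Stops at the first objectID that is not found. Suffixes are zero-based,
    so the count is the last found suffix plus one.
    `exists` is a hypothetical stand-in for a real objectID lookup.
    """
    n = 0
    while n < max_probe and exists(f"{post_id}-{n}"):
        n += 1
    return n


# Stand-in index contents for the example.
indexed = {"133092-0", "133092-1", "133092-2", "133092-3"}
print(count_parts(133092, indexed.__contains__))  # 4
```

Note that probing stops at the first gap, so a missing middle part would hide any later parts.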
Thank you for letting me know about the “ObjectID” search. Following your instructions, I was able to find all parts of the record.
Could you tell me more about the following, please? I am not sure what you mean by it.
“Regarding the indexing, best I have is something specific to the install, somehow.
I managed to get it indexed after publishing, a bulk re-index, and an edit to the published post. All three made it through without issue. While not exact attribute matches, I have enough to make it equivalent, as my average record sizes are 1.89KB”
You’re experiencing issues with that specific post; I’m not. Thus my comment about it possibly being something specific to your install.
I copy/pasted the post_content attribute from your paste earlier, and created a post entry with it as my post content. I don’t have ALL the same indexed object properties as you, but at present I’m not suspecting that as being part of the issue.
What got indexed for me is shown above. This was with the initial publish, then a bulk re-index, and then an individual edit to the post. Both publishing and editing push a given post to the index automatically.
That said, I’m wondering if you’d succeed with this troublesome post if you edited and saved it with a very minor edit, like a punctuation change.
Thank you for testing.
After updating this page without changing anything, I see in my index this post’s parts 133092-0, 133092-1, 133092-2, 133092-3 with all data. However, the index is still showing 49k records, while before it had around 66k records, so some are still missing.
I will try bulk reindex all posts and see if the issue persists.
Makes sense in that saving the individual troublesome post got it indexed, but anything found after it with the bulk processing would still be missing.
It ends up coming down to why bulk processing fails when it reaches that post.