Microsoft says its Agent Mode in Excel has an accuracy rate of 57.2 percent in SpreadsheetBench, a benchmark for evaluating an AI model's ability to edit real-world spreadsheets.
It generates 42.8% bullshit.
They probably view that as a statistic worth bragging about. It's not. If Excel got calculations right 57.2% of the time it would be completely worthless.
I asked Copilot to look through every one of my spreadsheets and find how many instances of a category occurred. I was curious to see if it was any good. It gave me 2 different numbers. Neither was correct.
Copilot: Putting the "Artificial" in Artificial Intelligence.
The tech behind LLMs could have just been Clippy and everyone would be happy.
Fartificial Intelligence
Did you read the next sentence? Humans only get like 72% right. It's not far off at all.
I wonder where that "human accuracy" statistic is coming from. Plenty of people don't know how to read and interpret data, much less use Excel in the first place. There's a difference between 1/4 of people in the workforce not being able to complete a task, and a specialized AI not being able to complete a task. Additionally, this is how you get into the issue of treating a KPI as a goal rather than a proxy. AI will never understand context that isn't directly provided in the workbook. If you introduced a new drink at your restaurant in 2020, AI will tell you that the introduction of the drink caused a 100% decrease in foot traffic, since there's no line item for "global pandemic". I'm not saying AI will never be there, but people using this version of AI instead of actual analysis don't care about the facts; they just want an answer, and they want that answer to be cheap.
As I've said many times, though not in this topic - AI is a tool to be used, and using it is a skill that needs to be learned.
For your pandemic example, that's something that you would need to provide the AI with the context for. The joke of a "prompt engineer" being a job soon actually has merit, in that you want people who know how to use their tools the best. It's constantly learning through iteration to give the AI a specific instruction set to get the results you want/need.
Depending on where you go to school, 70% is passing while 50% is not. While "not far off," one is a C, the other an F.
That's not at all what this means. In this instance, 70% is basically "human level". For AI to already get 57% it means that it's approaching the same level as people do in Excel.
So it achieved the actual proficiency of a middle manager…
Decades ago. The company that replaced its CEO with an LLM thrives.
Just keep regenerating data until it's something the stockholders like. Doesn't matter if it's BS. They're already accustomed to that.
Nice. Basically a coin flip
Slightly better than Vegas. Unfortunately, plenty of people are okay with Vegas odds.
Not enough accuracy to be useful. Not enough bullshit for politics.
deleted by creator
The best you can do in any job is to care as little about them as they care about you.
They will barely read it, and they won't care nearly as much as you do.
I resign my position as a [position], effective [DATE].
The best cancers of both worlds.
So let me fast-forward a bit -> underpaid, stressed-out tech workers in the global south pretending to be AI for incompetent upper management in wealthy countries?
Not related, but does global south refer to south of the equator or just everything south of North America?
I don't know if it is a perfect term, but it doesn't literally refer to any specific "South". Rather, I think it is a reference to the coincidence that many of the heavily industrialized empires of the 18th, 19th and 20th centuries have been in the northern hemisphere, and the general colonial power dynamic set up as a result has led to the term "Global South" meaning pretty much anywhere that has gotten the short end of the colonialism stick, vs the long end.
https://en.m.wikipedia.org/wiki/Global_North_and_Global_South
They're outsmarting the sheet, that's for sure.
Excel is one place where AI makes sense. All the data is there, in a nice structured and typed format with headings etc. It's easy to verify the results and to have the AI provide the reasoning for its work.
LLMs can't count. Can't add. Can't deal with actually large datasets.
How is Excel a good fit for vibe-coding?
This isn't just an LLM. It uses Excel functions and features to do the counting and adding and dealing with large data sets.
It's not "vibe coding" as much as "vibe performing steps in Excel".
Also LLMs absolutely can deal with large data sets anyway. Not sure where you got that from.
LLMs lose context over a short session. They all have input limits. Very small input limits usually. Best it can probably do is suggest formulas for you based on your natural language, maybe some copy/paste. Which means it can beat a 9-year-old, great news everyone! Or show a help article on pivot tables (which the help function already does!)
Excel is very simple to work with, hence its ubiquity. LLMs also get shit wrong about half the time, way more than half with difficult things ime. Meaning they cost experienced operators time; a few studies are showing this now with coding. And are expensive as fuck. And slow as fuck. And reduce capacity for learning. Meaning they actually cap what Excel can achieve, as the user won't grow at the same rate, removing the one advantage Excel actually has: the learning rate is phenomenal.
The C-suite that insisted on this integration is basically a subservient idiot itself at this stage, one that doesn't understand its product, its market fit, or its userbase. They should replace themselves with an LLM.
I think you didn't even read the article or read up about this integration.
This isn't just an LLM, it's Agentic AI.
AI is a tool that needs to be learned how to be used properly. Anyone can pick it up and get results that are "good enough", but in the right hands what can be done is incredible - just like with any tool.
Look at something like Minecraft as a perfect example of what can be done with a tool in the right hands.
Most people don't understand "AI" as it is, and mistakenly think it's just a school assignment cheating tool and a chat bot that makes things up.
People in here have been saying that since LLMs can't do maths perfectly they're terrible for numbers, but they can't see that it doesn't need to do maths here because it's in Excel, and Excel has formulas and functions that can.
It's crazy how the mere mention of AI makes some people lose any and all semblance of critical thinking and intelligence.
Excel is very simple to work with
Ok so your idea of Excel is just what your average person might do at home with it - that's not what or who this stuff is for.
And it will fuck up around half of even the simple formulas. This is really bad, and the idiots in charge should feel bad. Excel basically runs the world and they are about to fuck it up
And people will fuck up those same formulas 30% of the time (vs 45% of agent mode).
AI isn't being forced into being the only way to do spreadsheets.
deleted by creator
You tell it not to.
I swear none of you guys have even attempted to use AI to do data analysis. I have. I built an MCP and integrated a Copilot agent into Teams which has access to specific database data, and refined the rules for it to the point where the CFO rigorously tested it (and still does) and trusts the results it returns.
The problem with being a pragmatic LLM user is that you have on one side corporate America shoehorning the tech into mediocre products no one wants, and on the other side a large portion of the internet who loathe it but don't use it and don't even know what it does. Those conversations never go anywhere, man. You're talking to someone who thinks accuracy of 57% on SpreadsheetBench means the model gives wrong answers 42% of the time.
Hate to agree with Microsoft but yeah, Excel is probably a great place to introduce an LLM. It's in that sweet spot between natural language and light programming, in an environment with math baked in so you don't really care about the model's accuracy or exact recall. All the data is here, and the model only has to manipulate cell numbers and write formulas in this dumbed-down language.
I'm sure you can get away with pretty small models too. It doesn't need superhuman knowledge to implement 90% of common Excel use cases, and I suspect in real-world scenarios the accuracy must be pretty interesting.
It could be good to layer in standard machine learning (ML), and it already does have some features (like line of best fit).
However, in today's context AI means LLMs, and that is not a good fit due to its unpredictability.










