r/analytics • We just found out our AI has been making up analytics data for 3 months and I'm gonna throw up. So we've been using an AI agent since November to answer leadership questions about metrics. It seemed amazing at first: fast answers, detailed explanations, everyone loved it. I just found out it's been hallucinating numbers this entire time. Our VP of sales made territory decisions based on data that didn't exist. Our CFO showed the board a deck with fake insights. The AI was just inventing plausible-sounding percentages. I only caught it by accident when someone asked me to double-check something. I started digging, and holy shit, it's bad.
"Agentic AI" is your automation script, created from natural-language input, running on someone else's computer, with access to all your data.
There is a benefit for people who can't code, but I'm absolutely unsympathetic to software engineers who are full of praise for these tools. You should know how they work. You should be aware that there's nothing new in them.
And once again, it's not intelligent; it doesn't think or reason. It generates the output that is statistically most likely to be what you are looking for.
Reasoning models, by the way, just use a multi-step approach: they take the generated output as additional context to generate a (maybe) better-fitting answer. #GenAI #AgenticAI
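The multi-step loop described above can be sketched in a few lines. This is a hedged illustration, not any vendor's actual implementation: `generate` is a hypothetical stand-in for an LLM completion call, and the refinement prompt wording is an assumption.

```python
def generate(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    return f"draft answer for: {prompt[:40]}"

def reason(question: str, steps: int = 3) -> str:
    """Feed each draft back in as extra context, so later passes
    can (maybe) produce a better-fitting answer."""
    context = question
    answer = ""
    for _ in range(steps):
        answer = generate(context)
        # The previous output becomes additional context for the next pass.
        context = f"{question}\n\nPrevious attempt:\n{answer}\n\nRefine the answer."
    return answer

print(reason("Why did Q3 churn rise?"))
```

The point of the sketch is that there is no separate "reasoning" machinery: it is the same statistical next-token generator, called repeatedly with its own output appended to the prompt.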
Today we launch the Agentic AI Foundation (AAIF) with project contributions of MCP (Anthropic), goose (Block) and AGENTS.md (OpenAI), creating a shared ecosystem for tools, standards, and community-driven innovation.
I haven't tested this model; there is magentic-UI as well. It looks dubious, as none of the models I tested were reliable or actually performed as advertised, locally or otherwise, none. So this looks to me like an open-air experiment for now. That they raised so much money on a promise that it will work one day is even more surreal to me, especially since insurers don't seem to warm up to the idea.
Figure 2: Data-generation workflow, from proposing tasks from various seeds (such as URLs), to solving those tasks with the Magentic-One multi-agent framework to generate demonstrations for training, and finally verifying/filtering the completed trajectories.
If you take the stance that technical debt is code nobody understands, then current LLM-based code generators are technical debt generators until somebody reads and understands their output.
If you take the stance that writing is thinking--that writing is, among other things, a process by which we order our thoughts--then understanding code-generator output will require substantial rewriting of the code by whoever is tasked with converting it from technical debt to technical asset.
"AI agents have already demonstrated that they may misinterpret goals and cause some modest amount of harm. When the Washington Post tech columnist Geoffrey Fowler asked Operator, OpenAI’s computer-using agent, to find the cheapest eggs available for delivery, he expected the agent to browse the internet and come back with some recommendations. Instead, Fowler received a notification about a $31 charge from Instacart, and shortly after, a shopping bag containing a single carton of eggs appeared on his doorstep. The eggs were far from the cheapest available, especially with the priority delivery fee that Operator added. Worse, Fowler never consented to the purchase, even though OpenAI had designed the agent to check in with its user before taking any irreversible actions.
That’s no catastrophe. But there’s some evidence that LLM-based agents could defy human expectations in dangerous ways. In the past few months, researchers have demonstrated that LLMs will cheat at chess, pretend to adopt new behavioral rules to avoid being retrained, and even attempt to copy themselves to different servers if they are given access to messages that say they will soon be replaced. Of course, chatbot LLMs can’t copy themselves to new servers. But someday an agent might be able to.
Bengio is so concerned about this class of risk that he has reoriented his entire research program toward building computational “guardrails” to ensure that LLM agents behave safely."
"Agentic Windows Might Install Malware On Your Computer" 👀👏