Dhananjay Lakkawar in genaiguru.hashnode.dev
Stop Overpaying for VectorDBs: Architecting Serverless RAG on AWS
Building a Retrieval-Augmented Generation (RAG) prototype takes a weekend. Taking that prototype to production without burning through your infrastructure budget is a completely different engineering...
7h ago · 5 min read

zecheng li in lizecheng.hashnode.dev
AI-Assisted Automated Performance Engineering: The Autoresearch Pattern That Made Ruby Templates 53% Faster
Originally published at lizecheng.net. The autoresearch pattern is a closed-loop methodology where an AI coding agent autonomously proposes, executes, benchmarks, and accepts or discards code changes, running dozens of experiments per hour with no hu...
14h ago · 7 min read

zecheng li in lizecheng.hashnode.dev
The Retrieval Layer Is Your AI System's Control Plane - and It's Underprotected
Originally published at lizecheng.net. Two separate technical developments this week point at the same underappreciated architectural problem. Put them together and the implication is serious for anyone building systems that let AI agents reason over ...
1d ago · 5 min read

zecheng li in lizecheng.hashnode.dev
BitNet's 100B-on-a-CPU Achievement Isn't What You Think It Is
Originally published at lizecheng.net. Microsoft open-sourced bitnet.cpp yesterday. The headline landed with predictable breathlessness: a 100-billion-parameter model running on a single commodity CPU at 5-7 tokens per second. That's approximately hu...
2d ago · 5 min read

fraser sequeira in roohai.hashnode.dev
Building Voice Agents with Rooh
A prevailing shortcoming of today’s voice agents is deceptively simple: they don’t truly understand the rhythm of human conversation. I experienced this firsthand during a skill-based interview condu...
2d ago · 12 min read