This Week In Security: The AI Hacker, FortMajeure, And Project Zero

One of the hot topics right now is using LLMs for security research. Poor-quality reports written by LLMs have become the bane of vulnerability disclosure programs. But there is an equally interesting effort going on to put LLMs to work doing actually useful research. One such story is [Romy Haik] at ULTRARED, trying to build an AI Hacker. This isn’t an over-eager newbie naively asking an AI to find vulnerabilities; [Romy] knows what he’s doing. We know this because he tells us plainly that the LLM-driven hacker failed spectacularly.

The plan was to build a multi-LLM orchestra, with a single AI sitting at the top that maintains state through the entire process. Multiple LLMs sit below that one, deciding what to do next, working out exactly how to approach the problem, and generating the actual tool commands. Then yet another AI takes the output and figures out if the attack was successful. The tooling was assembled, and [Romy] set it loose on a few intentionally vulnerable VMs.
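To make that layering concrete, here is a minimal Python sketch of the arrangement as described: one stateful orchestrator on top, with separate roles for choosing a goal, choosing an approach, generating a command, and judging the output. Every name here (`Orchestrator`, `plan`, `approach`, `make_command`, `run`, `judge`) is hypothetical, and the "LLMs" are plain stub functions. This is not ULTRARED's code, just one way the described flow could be wired together.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Orchestrator:
    """Top-level agent: the only component that keeps state across steps."""
    plan: Callable[[List[str]], str]      # decides what to do next
    approach: Callable[[str], str]        # decides how to go about it
    make_command: Callable[[str], str]    # turns the approach into a command
    run: Callable[[str], str]             # executes the command (stubbed here)
    judge: Callable[[str, str], bool]     # separate agent: did the attack work?
    history: List[str] = field(default_factory=list)
    findings: List[str] = field(default_factory=list)

    def step(self) -> bool:
        goal = self.plan(self.history)
        method = self.approach(goal)
        command = self.make_command(method)
        output = self.run(command)
        success = self.judge(command, output)
        # Only the orchestrator records what happened.
        self.history.append(f"{goal} | {command} | {'hit' if success else 'miss'}")
        if success:
            self.findings.append(goal)
        return success


# Stub "agents" standing in for LLM calls; nothing here touches a real target.
def stub_plan(history: List[str]) -> str:
    return "test login form for SQL injection" if not history else "test search box for XSS"

def stub_approach(goal: str) -> str:
    return f"send a crafted input to {goal}"

def stub_make_command(method: str) -> str:
    return f"simulated-tool --task '{method}'"

def stub_run(command: str) -> str:
    return "database syntax error" if "SQL" in command else "ok"

def stub_judge(command: str, output: str) -> bool:
    return "syntax error" in output


orch = Orchestrator(stub_plan, stub_approach, stub_make_command, stub_run, stub_judge)
results = [orch.step() for _ in range(2)]
print(results)
print(orch.findings)
```

In this sketch the `judge` role is deliberately separate from the roles that generate the attack, mirroring the description above. It is also the stage the write-up identifies as the weak point.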

As hinted above, the results were fascinating but dismal. The LLM stack successfully found one Remote Code Execution (RCE), one SQL injection, and three Cross-Site Scripting (XSS) flaws. This whole post is sort of sneakily an advertisement for ULTRARED’s actual automated scanner, which uses more conventional methods to scan for vulnerabilities. But it’s a useful comparison: that scanner found nearly 100 vulnerabilities among the same collection of targets.

The AI did what you’d expect, finding plenty of false positives. Ask an AI to describe a vulnerability, and it will gladly do so, no real vulnerability required. But the real problem was the multitude of times that the AI stack did demonstrate a problem, and failed to realize it. [Romy] has thoughts on why this attempt failed, and two points stand out. The first is that while the LLM can be creative in making attacks, it’s really terrible at accurately analyzing the results. The second is one of the most important things to keep in mind regarding today’s AIs: an LLM doesn’t actually want to find a vulnerability. One of the marks of security researchers is the near obsession they have with finding a great score.


Hackaday Links: January 22, 2023

The media got their collective knickers in a twist this week with the news that Wyoming is banning the sale of electric vehicles in the state. Headlines like that certainly raise eyebrows, which is the intention, of course, but even a quick glance at the proposed legislation might have revealed that the “ban” was nothing more than a non-binding resolution, making this little more than a political stunt. The bill, which would only “encourage” the phase-out of EV sales in the state by 2035, is essentially meaningless, especially since it died in committee before ever coming close to a vote. But it does present a somewhat lengthy list of the authors’ beefs with EVs, which mainly focus on the importance of the fossil fuel industry in Wyoming. It’s all pretty boneheaded, but then again, outright bans on ICE vehicle sales by some arbitrary and unrealistically soon deadline don’t seem too smart either. Couldn’t people just decide what car works best for them?

Speaking of which, a man in neighboring Colorado might have had some buyer’s remorse when he learned that it would take five days to fully charge his brand-new electric Hummer at home. Granted, he bought the biggest battery pack possible (250 kWh) and is using a standard 120-volt wall outlet and the stock Hummer charging dongle, which adds one mile (1.6 km) to the vehicle’s range every hour. The owner doesn’t actually seem all that surprised by the results, nor does he seem particularly upset by them; he appears to know enough about the realities of EVs to recognize the need for a Level 2 charger. That entails extra expense, of course, both to procure the charger and to run the 240-volt circuit needed to power it, not to mention paying for the electricity. It’s a problem that will only get worse as more chargers are added to our creaky grid; we’re not sure what the solution is, but we’re pretty sure it’ll be found closer to the engineering end of the spectrum than the political end.


NASA Mission Off To Rough Start After Astra Failure

When Astra’s diminutive Rocket 3.3 lifted off from its pad at the Cape Canaveral Space Force Station on June 12th, everything seemed to be going well. In fact, the mission was progressing exactly to plan right up until the end — the booster’s second stage Aether engine appeared to be operating normally until it abruptly shut down roughly a minute ahead of schedule. Unfortunately, orbital mechanics are nothing if not exacting, and an engine burn that ends a minute early might as well never have happened at all.

According to the telemetry values shown on-screen during the live coverage of the launch, the booster’s upper stage topped out at a velocity of 6.573 kilometers per second, well short of the 7.8 km/s required to attain a stable low Earth orbit. While the video feed was cut as soon as it was clear something had gone wrong, the rigid physics of spaceflight means there’s little question about the sequence of events that followed. Without the necessary energy to stay in orbit, the upper stage of the rocket would have been left in a sub-orbital trajectory, eventually reentering the atmosphere and burning up a few thousand kilometers downrange from where it started.
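For a sense of how large that gap is, the two velocities quoted above can be compared directly. The short Python sketch below uses only those two figures. The energy comparison is a simplification: it considers kinetic energy alone (which scales with the square of velocity) and ignores altitude and the direction of travel, so treat it as a rough illustration rather than a trajectory analysis.

```python
# Figures taken from the text above.
reached_km_s = 6.573   # peak upper-stage velocity shown in the launch telemetry
orbital_km_s = 7.8     # velocity quoted for a stable low Earth orbit

# How much velocity the stage still needed when the engine shut down.
shortfall_km_s = orbital_km_s - reached_km_s

# Kinetic energy scales with v^2, so this is the fraction of orbital
# kinetic energy actually reached (simplified: ignores altitude).
energy_fraction = (reached_km_s / orbital_km_s) ** 2

print(f"velocity shortfall: {shortfall_km_s:.3f} km/s")
print(f"fraction of orbital kinetic energy: {energy_fraction:.1%}")
```

Running it gives a shortfall of about 1.227 km/s and roughly 71% of the kinetic energy needed, which helps explain why a burn that ends a minute early leaves the stage on a sub-orbital path.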