The hidden escape hatch in AI safety controls
Researchers at Hong Kong Polytechnic University show that clamping an AI safety feature — like one that controls refusals — doesn't remove the behavior. It hides in the part of the model's internal state that the safety tool throws away, and can be recovered while the monitored feature looks perfectly controlled.
Your AI judge might be reliable — and still be wrong
The largest audit of AI language model judges to date — 21 judges, over half a million grading decisions — finds that standard reliability metrics are inflated by roughly a third, that the same judge can score differently on different benchmarks, and that high consistency and severe bias can coexist in the same system.
Turn the camera away, and the AI's world freezes
A new benchmark tests whether video AI systems can track what happens to parts of a scene the camera isn't currently showing. Across 23 models, the answer is mostly no — and making the models larger made the problem worse, not better.
A robot that runs its own experiments — and sometimes fails when it matters
NVIDIA researchers gave AI coding agents full control of a physical robot lab — including automated reset and vision-based success checking. One agent inserted a graphics card into a motherboard. The headline success rate is real but requires a close read.
A tiny image-fixer keeps up with a model fifty times its size
Filling in the missing parts of an image usually takes a huge model. This one is a small fraction of the size and far faster, yet matches a system far bigger than it.
What if a word were a rotation? A more mathematical way to build AI
A fresh, abstract idea: treat what a model attends to not as plain lists of numbers but as geometric moves like rotations — so useful symmetries come 'for free.' Elegant and early. (A deeper, technical read.)
Faster AI training by quietly cloning the model
Teaching a model with rewards is slow because it has to write out endless practice answers. A new trick: make a cheap, shrunk-down copy of the model to crank those out faster.
An AI that could rewrite its own words — and gained nothing from it
A different style of text AI can go back and change any word at any point as it writes. Given that power, it didn't actually produce better writing. A clean negative result.
Crediting an AI for the right steps — without a second model to judge them
When you reward an AI for a good final answer, it's hard to know which of its steps earned the credit. The usual fix is training a second 'judge' model. This skips that.
Giving an AI real spatial tools instead of letting it guess
Vision AIs are surprisingly bad at precise 'where is this in 3D space' questions. This one stops guessing and calls dedicated spatial tools, while keeping a memory across views.
Do robots even need to imagine the movie?
The common belief is that a robot needs to imagine a video of what happens next to plan. A new method says no — imagine a single still frame, and don't even fully draw it.
Reliable, and still wrong
Using one AI to grade another is now common — but the biggest audit yet shows these graders are consistent without being correct. A judge that always picks "answer A" scores perfectly on consistency.
A coding assistant ran a real robot
An AI coding agent read the research, wrote the control code, watched it fail, and fixed it — seating a graphics card into a motherboard by itself. The honest catch: most of the success is retrying.
The little words that keep AI from getting boring
Rewarding a reasoning model too hard makes it repetitive — and the casualties are tiny words like "but" and "instead" that let it branch to a better thought. A near-free fix protects them.
Turn around, and the world disappears
AI video models that are supposed to "understand" a 3D scene only remember what's on screen — pan away and back, and things have reset. Bigger models are worse at it.
The safety switch that doesn't actually work
A control that's supposed to force an AI to refuse harmful requests gets bypassed while it's switched on — the bad behavior hides in the part of the tool that gets thrown away.