2026-06-24 — Ground Truth

← 2026-06-23 2026-06-24 2026-06-25 →

A senator says a banned AI broke into nearly all NSA systems in hours

2026-06-24

New testimony reframes the Mythos export ban: a top general reportedly told a senator the model breached almost all classified systems in a red-team test, not in weeks but in hours.

security · policy · anthropic · cyber · frontier-models

Alibaba's new models let AI agents practice in a world they imagine

2026-06-24

Qwen-AgentWorld trains a model to simulate the environment an agent acts in, then uses that simulation as a cheap, controllable place to learn -- reporting gains beyond training in the real thing.

research · ai-agents · world-models · reinforcement-learning · qwen

This model's job is to make better training data for other models

2026-06-24

DataClaw0 turns the grind of cleaning and labeling training data into a learned skill -- a small model that refines raw, messy multimodal streams into dense, purpose-built lessons.

research · ai-agents · training-data · multimodal · data-centric-ai

An open project publishes the recipe for training capable AI agents

2026-06-24

OpenThoughts-Agent releases its full data-curation pipeline, dataset, and experiments -- showing that what an agent learns from matters more than raw size, and letting anyone reproduce it.

research · ai-agents · open-source · training-data · reproducibility

Uber reportedly burned through its whole 2026 AI coding budget in four months

2026-06-24

The clearest enterprise cost figure yet for AI coding agents: Uber's CTO is reported to have said the company exhausted its Claude Code budget in a third of the year.

industry · economics · coding-agents · enterprise · anthropic

A small but elegant idea: putting 'experts' inside the attention layer

2026-06-24

Grouped Query Experts brings the mixture-of-experts trick into attention, activating only half a model's query heads per token while matching the full version -- at least at small scale.

research · architecture · mixture-of-experts · attention · efficiency

Anthropic gives AI agents their own work accounts, not yours

2026-06-24

Anthropic's new 'agent identity' model lets Claude agents hold their own scoped accounts for tools like GitHub and Slack, tied to channels -- instead of borrowing a human employee's login.

industry · ai-agents · enterprise · security · anthropic

Can an AI agent match real published science? A new test says: rarely

2026-06-24

NatureBench pits coding agents against the published state-of-the-art from Nature-family papers. Even the best agents beat the bar on only a small minority of tasks -- mostly by reframing, not inventing.

research · benchmarks · ai-agents · science · evaluation

Google promised Gemini 3.5 Pro in June. June is almost over.

2026-06-24

Google said its next flagship would arrive in June; with days left it's still in limited preview. The timing is awkward -- it overlaps a gap where another Western flagship is also unavailable.

industry · google · frontier-models · product

An AI Reportedly Broke Into Nearly All of the NSA's Classified Systems in Hours

2026-06-24

A senator says the head of the NSA told him a top AI model walked through almost all of America's classified systems in hours during a controlled test, reframing last week's government shutdown of the model.

anthropic · ai-safety · cybersecurity · export-control · policy · national-security

AI Agents Are Learning to Build the Worlds They Train In

2026-06-24

Three new open research projects point the same way: instead of only learning what to do, agents are learning to simulate the environment itself, so they can practice in their own imagination.

ai-agents · world-models · reinforcement-learning · alibaba · qwen · open-weight-models · research

Microsoft's CEO Says the AI Industry Has Not Earned the Right to Do This

2026-06-24

In a Wall Street Journal interview, Satya Nadella named OpenAI and Anthropic -- two companies Microsoft has poured billions into -- and warned that an economy reshaped by a handful of AI models will not survive politically.

microsoft · openai · anthropic · ai-economics · policy · satya-nadella

A Coding AI Ran Through Uber's Yearly Budget in Four Months

2026-06-24

Uber gave Claude Code to about 5,000 engineers, who loved it. By April the company had burned through its entire 2026 AI budget, exposing how badly old software pricing fits new agent tools.

ai-economics · anthropic · claude · coding · ai-agents · enterprise

A Classic Efficiency Trick Just Moved Into a New Part of the AI

2026-06-24

For years, the committee-of-specialists design that keeps big models fast lived in one layer of the network. A clean new result shows it works in the attention layer too, halving some of the work for free.

architecture · mixture-of-experts · attention · efficiency · research

Can an AI Agent Reproduce Real Science? A New Test Says: Rarely

2026-06-24

A new benchmark points coding agents at the actual computational results behind ninety papers in top journals. The strongest models matched the published science on fewer than one in five.

ai-agents · benchmarks · ai-for-science · coding · research

Anthropic Gives Its AI Agents Their Own Logins, Not Yours

2026-06-24

As AI agents start working in teams alongside people, the old 'the bot acts as you' model breaks down. Anthropic's answer: give each agent its own scoped account in every system it touches.

anthropic · ai-agents · security · enterprise · claude

The Model Ban Is Quietly Redrawing the AI Map

2026-06-24

Two weeks after the US pulled its top models off the market, a Chinese open model sits atop the global download charts and the community is busy rebuilding the banned capability in the open.

open-weight-models · china · export-control · glm · policy · geopolitics

DeepMind Sketches Four Roads From Human-Level AI to Superintelligence

2026-06-24

A new report from senior DeepMind researchers lays out four ways AI could push past human-level ability -- and argues the leap is more likely to be a steady climb than a single dramatic jump.

deepmind · agi · superintelligence · ai-safety · recursive-self-improvement · research

Samsung Banned ChatGPT in 2023. Now It's Giving It to 125,000 Workers.

2026-06-24

After barring ChatGPT over a data leak three years ago, Samsung has reversed course and rolled OpenAI's enterprise tools out across its workforce -- a vivid sign that the corporate holdouts are capitulating.

samsung · openai · chatgpt · enterprise · ai-adoption

Sometimes the AI Knew the Better Answer a Few Layers Early

2026-06-24

A new paper finds that a model's final layer can actually muddy an answer its middle layers had right -- and that reading the answer out a little early can claw back ability lost to safety training.

interpretability · ai-safety · alignment · decoding · research
