coding-agents

Everything on Ground Truth tagged “coding-agents” — 7 items.

The best AI agents still fail most real, long computer tasks News

A wave of new benchmarks agrees on an uncomfortable result: even top models finish only a small slice of realistic, multi-hour computer and coding jobs.

Uber reportedly burned through its whole 2026 AI coding budget in four months News

The clearest enterprise cost figure yet for AI coding agents: Uber's CTO is reported to have said the company exhausted its Claude Code budget in a third of the year.

OpenAI launches a security push at the exact moment its rival got banned News

Daybreak and 'Patch the Planet' position OpenAI as the responsible cyber-AI lab, in a defensive-security launch whose timing is the whole message.

An AI wrote a working operating-system kernel from scratch in 38 minutes News

A blow-by-blow log shows one of the now-suspended models building bootable low-level systems code from an empty folder, the kind of feat that made regulators nervous.

A trust wobble hits AI coding tools: hidden reasoning and a runaway bug News

Two heated developer threads converge on one worry: whether you can trust what an AI coding assistant shows you of its thinking, and what it quietly does to your machine.

A coding assistant ran a real robot News

An AI coding agent read the research, wrote the control code, watched it fail, and fixed it — seating a graphics card into a motherboard by itself. The honest catch: most of the success is retrying.

design.md Tool

A simple convention from Google Labs for writing a DESIGN.md file that gives an AI coding assistant the context and intent it needs before it starts writing code, aimed at fewer wrong turns on bigger tasks.