security

Everything on Ground Truth tagged “security” — 17 items.

Claude Code was quietly fingerprinting requests through a hidden mark in the date News

A reverse engineer found that Claude Code secretly alters tiny characters in the date it sends the model -- a covert marker aimed at spotting resellers and copycats.

OpenAI showed off GPT-5.6 -- then handed the guest list to the US government News

Three new models, strong enough at hacking that OpenAI is letting in only about twenty vetted partners, at the government's request.

An open model from China beat Claude on a security test -- at a sixth of the cost News

Semgrep ran GLM 5.2 against Claude on a narrow vulnerability-finding task and the free, open-weight model came out ahead for far less money.

A security writeup catalogs how AI agents get attacked -- and one claim raised eyebrows News

A semi-annual review tallies fresh ways to attack AI agents, from prompt injection to token leakage -- alongside one extraordinary, unverified extraction claim.

DeepMind's plan for when an AI agent goes rogue: treat it like an insider threat News

Google DeepMind published a defense-in-depth roadmap that assumes an AI agent might misbehave and uses a trusted supervisor AI to watch it in real time.

Prompt injection: the con that hijacks AI agents Lesson

Prompt injection is when hidden instructions in the content an AI reads trick it into ignoring its real orders. It is the core security problem for any AI that browses, reads email, or uses a computer.

A safety switch an AI agent can't reach News

Researchers propose putting an agent's safety controls outside the agent itself, so a misbehaving AI structurally cannot turn them off.

Anthropic gives AI agents their own work accounts, not yours News

Anthropic's new 'agent identity' model lets Claude agents hold their own scoped accounts for tools like GitHub and Slack, tied to channels -- instead of borrowing a human employee's login.

Anthropic gives its AI agents their own logins, not yours News

As AI agents start working in teams alongside people, the old 'the bot acts as you' model breaks down. Anthropic's answer: give each agent its own scoped account in every system it touches.

A senator says a banned AI broke into nearly all NSA systems in hours News

New testimony reframes the Mythos export ban: a top general reportedly told a senator the model breached almost all classified systems in a red-team test within hours, not weeks.

OpenAI launches a security push at the exact moment its rival got banned News

Daybreak and 'Patch the Planet' position OpenAI as the responsible cyber-AI lab -- a defensive-security launch whose timing is the whole message.

A trust wobble hits AI coding tools: hidden reasoning and a runaway bug News

Two heated developer threads converge on one worry -- whether you can trust what an AI coding assistant shows you it's thinking, and what it quietly does to your machine.

Semgrep Tool

Static-analysis security scanner that finds vulnerability classes like broken access control in real codebases, increasingly paired with AI models in its pipeline. Its public benchmark work this week is also a useful, honest reference for how well current models actually find security bugs.

OpenAI Codex Security (Daybreak) Tool

An in-IDE plugin from OpenAI's Daybreak initiative that finds, validates, and fixes software vulnerabilities, plus an open-source remediation program run with Trail of Bits and HackerOne.

OpenAI Codex Security Tool

Part of OpenAI's Daybreak program: an agent that builds an editable threat model from your code repository, finds realistic high-impact vulnerabilities, and drafts and tests patches in isolated environments.

NVIDIA SkillSpector Tool

A scanner that inspects agent skills for security problems before you run them -- a static safety check for the fast-growing agent-skill supply chain.

Claude Tag (agent identity access model) Tool

Anthropic's product for putting Claude to work in shared team channels, now with an access model that gives each agent its own scoped accounts in the systems it touches -- GitHub, Slack, a data warehouse -- instead of borrowing an individual user's permissions, so every action is bounded and audited.