long-context

Everything on Ground Truth tagged “long-context” — 6 items.

The KV cache: why AI gets slower and hungrier the longer it talks Lesson

The hidden notebook that lets a model avoid re-reading every previous word, and the single biggest reason long context is expensive.

DeepSeek's new open models give everyone a million-word memory by default News

DeepSeek previewed two free-to-download V4 models that can read a million tokens at once, offered as the standard setting rather than as a premium add-on.

What is a context window? Lesson

A model's context window is how much text it can hold in mind at once — its working memory. Bigger is useful, but a long window isn't the same as a good memory. Here's how it works and where it breaks.

MiniMax-M3 Tool

A natively multimodal open model trained on text, image, and video from the first step, with a million-token context and a sparse-attention design built for speed; downloadable for self-hosting and also offered through MiniMax's own API and agent platform.

GLM-5.2 Tool

A flagship openly available language model with a very large context window for long documents and code. Free to download and run yourself, with compressed versions for more modest hardware.

DeepSeek-V4 (Pro & Flash) Tool

Two newly previewed open-weight models, a large mixture-of-experts flagship and a smaller, fast everyday model, both with a 1-million-token context window on by default. Downloadable weights plus an API.