architecture
Transformers: the engine inside almost every modern AI Lesson
The neural-network design behind GPT, Claude, and nearly every modern AI model, and the one idea, attention, that made it work.
A language model that writes by erasing, and now keeps up with the classics News
Almost every chatbot writes one word at a time, left to right. A newly released model of real size writes the way image AIs paint, refining a whole passage at once, and finally holds its own against them.
A language model that doesn't write left to right News
iLLaDA is an 8-billion-parameter model that generates text by refining a blurry whole rather than one word at a time, and it's catching up to the mainstream.
Mixture of Experts: The Committee Inside a Giant Model Lesson
Why the biggest AI models are not really one big brain but a large team of specialists, only a few of whom wake up for any given word: the trick that lets a model be huge and fast at the same time.
A small but elegant idea: putting 'experts' inside the attention layer News
Grouped Query Experts brings the mixture-of-experts trick into attention, activating only half a model's query heads per token while matching the full version, at least at small scale.
A Classic Efficiency Trick Just Moved Into a New Part of the AI News
For years, the committee-of-specialists design that keeps big models fast lived in one layer of the network. A clean new result shows it works in the attention layer too, halving some of the work for free.
What is a context window? Lesson
A model's context window is how much text it can hold in mind at once — its working memory. Bigger is useful, but a long window isn't the same as a good memory. Here's how it works and where it breaks.
Scaling laws — does bigger always mean better? Lesson
For years, AI progress ran on a simple recipe: make the model bigger, feed it more data, get a better model. That pattern is real and predictable — but it has limits and surprises. Here's what scaling laws actually say.
An openly released text model that writes by refining, not word-by-word News
Most language models write one word after another, left to right. A new openly released model of real size generates text the way image AIs make pictures — refining a whole draft at once.
A world model that thinks in loops instead of stacking layers News
Instead of building an ever-deeper neural network to simulate the future, a new design re-runs one small block over and over — doing comparable work with a fraction of the size.
What if a word were a rotation? A more mathematical way to build AI News
A fresh, abstract idea: treat what a model attends to not as plain lists of numbers but as geometric moves like rotations — so useful symmetries come 'for free.' Elegant and early. (A deeper, technical read.)
What are diffusion language models? Lesson
Most AI writes one word at a time and can never go back. Diffusion language models start from noise and clarify it iteratively — and some versions can revise any word at any step. A growing alternative to the standard left-to-right approach.