llm
Qwen3.6 (open weights) Tool
Alibaba's stable Qwen3.6 release: open-weight general chat and coding models you can self-host, the same family at the center of this week's open-vs-closed pricing debate.
Mercury 2 (Inception Labs) Tool
An API-only diffusion language model pitched on raw speed, claiming to outpace open diffusion models on tokens per second for latency-sensitive generation.
GLM-5.2 on Baseten Tool
The top-trending open-weight model served as a fast hosted endpoint, reported at 280+ tokens/sec on Blackwell-class hardware: an open model you can call like a closed one.
GLM-5.2 Tool
A flagship openly available language model with a very large context window for long documents and code. Free to download and run yourself, with compressed versions for more modest hardware.
GLM-5.2 (GGUF, runnable locally) Tool
Zhipu AI's open, MIT-licensed mixture-of-experts model with a roughly million-token context, now packaged as ready-to-run quantized files you can host on your own machine. Strong on agent and coding workflows; this week it beat Claude on a narrow security benchmark at a fraction of the cost.
DeepSeek-V4 (Pro & Flash) Tool
Two newly previewed open-weight models with a 1-million-token context window on by default: a large mixture-of-experts flagship and a smaller, fast everyday model. Downloadable weights plus an API.
DeepSeek V4 Pro (API) Tool
A strong open-weight reasoning and coding model now offered through DeepSeek's own API at a permanently reduced per-token price, undercutting frontier closed models for high-volume work.