AI assistant / coding agent
Kimi (Kimi K2.6) →
Moonshot AI's web assistant and agent, running the open-weight Kimi K2.6 model; free to use in the browser for chat and long-horizon agent tasks, with the weights also downloadable for self-hosting.
AI compiler / runtime
Modular MAX + Mojo →
A programming language (Mojo) and compiler/runtime (MAX) for running AI models efficiently across different hardware instead of being locked to one chip vendor; now being acquired by Qualcomm but still openly available to developers.
AI in the browser
Gemma-4 WebGPU Kernels →
A demo running Google's Gemma-4 model directly inside a web browser using your device's graphics hardware — private, on-device AI with no server and no data leaving your machine.
AI video production
OpenMontage →
An open-source system that turns an AI coding assistant into an automated video-production studio, with a large library of pipelines, tools, and agent skills for editing and assembling video.
Agent / automation
Gemini 3.5 Flash computer use →
Google's fast model can now operate a browser, phone, or desktop directly as a built-in tool, with optional confirm-before-acting and auto-stop-on-attack safeguards for building automation agents.
Agent deployment infra
Cloudflare Temporary Accounts →
Lets an automated agent deploy and run on Cloudflare before a human signs up, removing the account-creation step from agent workflows.
Agent framework
DeerFlow →
ByteDance's open-source agent harness that breaks a long task into specialist sub-agents running in parallel, executes code safely in sandboxes, keeps memory across sessions, and produces reports, slides, and pages; built on LangChain and works with multiple model providers.
Agent security scanner
NVIDIA SkillSpector →
A scanner that inspects agent skills for security problems before you run them: a static safety check for the fast-growing agent-skill supply chain.
Build with your own documents
RAGFlow →
An open engine for building AI question-answering over your own files and documents.
Coding agent
Claude Code →
Anthropic's command-line coding agent that reads a whole codebase, edits files, runs tests, and fixes failures on its own; it is the tool behind Anthropic's disclosure that Claude now writes most of the company's production code.
Create images & video
ComfyUI →
A visual, node-based studio for generating images and video with open models. Powerful and endlessly extensible.
Cut AI agent costs
Headroom →
A drop-in proxy that sits between your coding assistant and the AI model and automatically compresses bulky tool outputs, logs, and retrieved text before they reach the model — cutting token usage sharply without changing your code.
Diffusion LLM API
Mercury 2 (Inception Labs) →
An API-only diffusion language model pitched on raw speed, claiming to out-pace open diffusion models on tokens-per-second for latency-sensitive generation.
Enterprise agent platform
Claude Tag (agent identity access model) →
Anthropic's product for putting Claude to work in shared team channels. Its access model now gives each agent its own scoped accounts in the systems it touches (GitHub, Slack, a data warehouse) instead of borrowing an individual user's permissions, so every action is bounded and audited.
Find models & datasets
Hugging Face →
The main hub for finding, downloading, and trying open AI models and datasets — the field's town square.
Forecasting model
TimesFM →
Google's pre-trained foundation model for time-series forecasting — predicting things that change over time, like demand, traffic, or sensor readings — usable out of the box without training your own model.
Give AI agents code memory
codebase-memory-mcp →
Indexes an entire codebase into a persistent, queryable knowledge graph so AI agents can understand large projects fast. Supports a huge range of programming languages, answers queries near-instantly, and ships as a single dependency-free binary.
Hosted open-model API
GLM-5.2 on Baseten →
The top trending open-weight model served as a fast hosted endpoint, reported at 280+ tokens/sec on Blackwell-class hardware: an open model you can call like a closed one.
Local coding model
Gemma-4 12B Coder (GGUF) →
A fine-tuned, locally-runnable version of Google's Gemma-4 model specialized for programming tasks, packaged in a format that runs efficiently on everyday consumer hardware.
MCP app framework
Skybridge →
A framework for building MCP-native apps: interactive tools an AI assistant can open and use directly. Pitched as 'MCP apps are the new website.'
Model-orchestration API
Sakana Fugu →
A single OpenAI-compatible endpoint that dynamically routes each request across several frontier models, so you call one API and get a coordinated multi-model answer.
Open language model
LLaDA / iLLaDA →
An openly released diffusion language model (weights and code) that generates text by refining a whole passage at once rather than one word at a time, useful for experimenting with non-autoregressive generation and infilling.
Open large language model
GLM-5.2 →
A flagship openly-available language model with a very large context window for long documents and code. Free to download and run yourself, with compressed versions for more modest hardware.
Open model download
Kimi K2.6 weights (Hugging Face) →
The actual Kimi K2.6 model weights, published under a modified-MIT license for anyone to download, run, and build on; large enough that full-strength use needs a multi-GPU node.
Open-weight model
MiniMax-M3 →
A natively multimodal open model trained on text, image, and video from the first step, with a million-token context and a sparse-attention design built for speed; downloadable for self-hosting and also offered through MiniMax's own API and agent platform.
Open-weight model (agent world model)
Qwen-AgentWorld →
Alibaba's open language world model that simulates agent environments (browser, terminal, phone, coding workspace, and more) so other agents can be trained inside the simulation. Released with open weights and code in two sizes.
Open-weight model (self-host)
DiffusionGemma →
Google's open-weight text-diffusion model that generates text in parallel blocks instead of one token at a time; Apache-2.0, runnable locally, with community tooling already shipping.
Run AI models efficiently
SGLang v0.5.13 →
A high-performance open serving engine for language models. The new version turns on faster speculative ('guess-ahead') decoding by default and trims scheduling overhead for quicker responses.
vLLM v0.23.0 →
The widely-used open engine for serving language models fast and cheaply. The latest release adds smarter memory handling for long conversations and faster GPU execution.
Run models on your computer
LM Studio →
A friendly desktop app to find, download, and chat with open models on your own machine — no command line needed.
Ollama →
Download and run open AI models locally with a single command. The easiest on-ramp to running your own model.
Open WebUI →
A polished, ChatGPT-style web interface for the open models you run yourself.
llama.cpp →
The lean, fast engine that makes big models run on ordinary laptops; powers much of the local-AI ecosystem.
Security coding assistant
OpenAI Codex Security (Daybreak) →
An in-IDE plugin from OpenAI's Daybreak initiative that finds, validates, and fixes software vulnerabilities, plus an open-source remediation program run with Trail of Bits and HackerOne.
Serve at scale
vLLM →
The popular open engine for serving AI models fast and efficiently when you need to handle real traffic.
Train & fine-tune AI models
veRL →
An open RL post-training framework widely used by research labs training reasoning models. Run GRPO, PPO, and related reward-training methods on your own models.
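
For readers new to the GRPO method named in the veRL entry, here is a minimal sketch of its core idea: each sampled answer is scored relative to the other answers drawn for the same prompt, rather than against a learned value model. This is not veRL's API. The function name `group_relative_advantages`, the `eps` parameter, the example rewards, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (GRPO-style).

    rewards: scores for several answers sampled from the same prompt.
    eps: small constant (an assumption) that avoids division by zero
         when every answer in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: four answers to one prompt, two judged correct.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Answers that beat their group's average get a positive advantage and are reinforced; answers below it get a negative one. When every answer in a group earns the same reward, all advantages are zero and that group contributes no learning signal.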