open-source
Microsoft's new memory system lets AI agents remember more by storing less News
Memora keeps the rich detail of a conversation but searches it using tiny six-word labels, cutting the cost of remembering by up to 98 percent. The code is public.
The trick that makes AI type faster just hit the top of Hacker News News
A small model guesses ahead and a big model checks the work in parallel. This week two efforts pushing that idea, DeepSeek's DSpark and JetSpec, lit up the front page while the community argued over whether it's truly 'lossless.'
AI video has a consistency problem. This model targets it. News
DomainShuttle goes after the tug-of-war in subject-driven text-to-video: keeping a specific character or object recognizable across frames while still letting the scene move freely.
The quiet race to turn messy documents into AI-ready text News
Mistral released a new document-reading model the same week an open-source rival surged, both chasing the unglamorous job that quietly decides how well AI can read your files.
An open-source 'AI crew' that turns a coding assistant into a video studio News
A project called OpenMontage shot to the top of GitHub in a day, claiming to be the first open-source system that lets AI agents handle a whole video production from script to final cut.
A language model that writes by erasing, and now keeps up with the classics News
Almost every chatbot writes one word at a time, left to right. A newly released full-size model writes the way image AIs paint, refining a whole passage at once, and finally holds its own against conventional models.
An open project publishes the recipe for training capable AI agents News
OpenThoughts-Agent releases its full data-curation pipeline, dataset, and experiments, showing that what an agent learns from matters more than raw size and letting anyone reproduce it.
A tiny image-editing AI now runs entirely inside your web browser News
Moebius is a small inpainting model claiming far-larger-model quality, and a developer ported it to run on your own machine in a browser tab, with no server and no upload.
The best free AI model just landed — but almost nobody can run it at home News
A powerful open model anyone can legally download has reignited the open-vs-closed debate — but it's so large that 'open' now means 'open if you own a small server.'
Researchers turn the internet's hobbyist art 'filters' into training fuel News
Cleanly separating 'what's in a picture' from 'what style it's in' usually needs scarce data. A new method mines the huge public library of community-made style add-ons instead.
Open vs. closed AI models — what "open weights" really means Lesson
Some AI models you can only rent through a company's interface; others you can download and run yourself. That difference — open weights vs. closed — shapes privacy, research, cost, and who controls the technology.
An openly released text model that writes by refining, not word-by-word News
Most language models write one word after another, left to right. A new openly released full-size model generates text the way image AIs make pictures, refining a whole draft at once.
A powerful open model lands and reignites the open-vs-closed debate News
A Chinese lab released a flagship model anyone can download and run, with a huge memory for long documents — and a viral claim that it makes things up less than a top closed model.
veRL Tool
The open RL post-training framework used by most research labs training reasoning models today. Run GRPO, PPO, and related reward-training methods on your own models.
vLLM v0.23.0 Tool
The widely-used open engine for serving language models fast and cheaply. The latest release adds smarter memory handling for long conversations and faster GPU execution.
vLLM Tool
The popular open engine for serving AI models fast and efficiently when you need to handle real traffic.
llama.cpp Tool
The lean, fast engine that makes big models run on ordinary laptops; powers much of the local-AI ecosystem.
design.md Tool
A simple convention from Google Labs for writing a DESIGN.md file that gives an AI coding assistant the context and intent it needs before it starts writing code, aimed at fewer wrong turns on bigger tasks.
codebase-memory-mcp Tool
Indexes an entire codebase into a persistent, queryable knowledge graph so AI agents can understand large projects fast. Supports a huge range of programming languages, answers queries near-instantly, and ships as a single dependency-free binary.
Unsloth Tool
Toolkit and documentation for running and fine-tuning large open models faster and on smaller hardware, including aggressive dynamic quantization recipes that shrink models like GLM-5.2 by 80-plus percent while keeping most of their accuracy. The practical on-ramp to running near-frontier models privately.
TimesFM Tool
Google's pre-trained foundation model for time-series forecasting — predicting things that change over time, like demand, traffic, or sensor readings — usable out of the box without training your own model.
SGLang v0.5.13 Tool
A high-performance open serving engine for language models. The new version turns on faster 'guess-ahead' decoding by default and trims scheduling overhead for quicker responses.
RAGFlow Tool
An open engine for building AI question-answering over your own files and documents.
OpenMontage Tool
An open-source system that turns an AI coding assistant into an automated video-production studio, with a large library of pipelines, tools, and agent skills for editing and assembling video.
Open WebUI Tool
A polished, ChatGPT-style web interface for the open models you run yourself.
Ollama 0.31 Tool
Run open models on your own computer; the new version nearly doubles Gemma's speed on Apple Silicon using multi-token prediction, on by default.
MinerU Tool
Open-source tool that converts complex PDFs and office files into clean markdown and structured data that AI models can read reliably. Run it yourself for free, with nothing leaving your machine.
Microsoft Memora Tool
Open-source memory system for AI agents that stores rich content but searches it via tiny abstraction labels and cue anchors, cutting token cost on long-horizon tasks. Includes a distillable retriever.
Headroom Tool
A drop-in proxy that sits between your coding assistant and the AI model and automatically compresses bulky tool outputs, logs, and retrieved text before they reach the model — cutting token usage sharply without changing your code.
Gemma-4 12B Coder (GGUF) Tool
A fine-tuned, locally runnable version of Google's Gemma-4 model specialized for programming tasks, packaged in a format that runs efficiently on everyday consumer hardware.
GLM-5.2 Tool
A flagship openly available language model with a very large context window for long documents and code. Free to download and run yourself, with compressed versions for more modest hardware.
DeerFlow Tool
ByteDance's open-source agent harness that breaks a long task into specialist sub-agents running in parallel, executes code safely in sandboxes, keeps memory across sessions, and produces reports, slides, and pages; built on LangChain and works with multiple model providers.
DeepSeek DSpark Tool
Open-source speculative-decoding implementation using parallel tree drafting to speed up text generation with no change to the model's output. The project topped Hacker News this week. Drop-in inference speedups for self-hosted models.
ComfyUI Tool
A visual, node-based studio for generating images and video with open models. Powerful and endlessly extensible.