local-AI

Everything on Ground Truth tagged “local-AI” — 7 items.

Ollama nearly doubles Gemma's speed on Macs by guessing ahead News

A free local-AI tool now runs Google's Gemma model nearly twice as fast on Apple computers using a trick in which a small model drafts words and the big one checks them in bulk.

A model that rivals the frontier now squeezes onto a single high-end desktop News

Aggressive compression shrinks GLM 5.2 by more than 80 percent while keeping most of its accuracy, putting a near-frontier model within reach of local hardware.

Unsloth Tool

Toolkit and documentation for running and fine-tuning large open models faster and on smaller hardware, including aggressive dynamic quantization recipes that shrink models like GLM 5.2 by more than 80 percent while keeping most of their accuracy. The practical on-ramp to running near-frontier models privately.

Ollama 0.31 Tool

Run open models on your own computer; the new version nearly doubles Gemma's speed on Apple Silicon using multi-token prediction, which is enabled by default.

Gemma-4 12B Coder (GGUF) Tool

A fine-tuned, locally runnable version of Google's Gemma-4 model specialized for programming tasks, packaged in a format that runs efficiently on everyday consumer hardware.

GLM-5.2 Tool

A flagship openly available language model with a very large context window for long documents and code. Free to download and run yourself, with compressed versions for more modest hardware.

GLM 5.2 (GGUF, runnable locally) Tool

Zhipu AI's open, MIT-licensed mixture-of-experts model with a roughly million-token context, now packaged as ready-to-run quantized files you can host on your own machine. Strong on agent and coding workflows; this week it beat Claude on a narrow security benchmark at a fraction of the cost.