serving
Everything on Ground Truth tagged “serving” — 3 items.
vLLM v0.23.0 Tool
The widely-used open engine for serving language models fast and cheaply. The latest release adds smarter memory handling for long conversations and faster GPU execution.
vLLM Tool
The popular open engine for serving AI models quickly and efficiently when you need to handle real traffic.
SGLang v0.5.13 Tool
A high-performance open serving engine for language models. The new version turns on faster speculative ('guess-ahead') decoding by default and trims scheduling overhead for quicker responses.