# VRAMPilot

> Runs GGUF models locally and recovers from out-of-memory at runtime instead of crashing.

VRAMPilot is a UX/automation layer on top of llama.cpp: it auto-fits a GGUF model to your GPU, recovers from out-of-memory at runtime, remembers the configuration that actually booted, and reports honestly what it traded away.

## Pages

- [VRAMPilot](https://vrampilot.com/en/): Runs GGUF models locally and recovers from out-of-memory at runtime instead of crashing.
- [Documentation](https://vrampilot.com/en/docs/): Installation, how the OOM-recovery ladder works, persistence, watchdog behavior, and requirements.
- [FAQ](https://vrampilot.com/en/faq/): Blunt answers to the questions a skeptic actually asks — including what VRAMPilot does not do.
- [Comparison](https://vrampilot.com/en/compare/): VRAMPilot vs Ollama vs LM Studio vs plain llama.cpp -fit, on the out-of-memory axis only — with the probe date stated.
- [Benchmarks](https://vrampilot.com/en/benchmarks/): Curated digest of every validated result — one card per claim, raw log linked, machine and date stated.
- [Changelog](https://vrampilot.com/en/changelog/): Release history of VRAMPilot, with measured figures only.
- [Legal notice](https://vrampilot.com/en/legal/): Legal notice for vrampilot.com (LCEN) — publisher ZMLabs, SIREN, registered office, contact, and hosting provider.
- [Privacy](https://vrampilot.com/en/privacy/): This site collects nothing, and the tool transmits nothing to ZMLabs. This page is short because there is little to disclose — which is the point.

## Full content

- [llms-full.txt](https://vrampilot.com/llms-full.txt): All pages in one Markdown file.