Engineering · June 5, 2026 · 5 min

AI Mode: the OS that tunes itself for local inference

Most distros treat a large-language-model workload like any other process. Genesi OS does not. When you start Ollama, llama.cpp, vLLM or LocalAI, a background daemon called genesi-aid notices and reconfigures the machine for inference.
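To make the detection step concrete, here is a minimal sketch of how a daemon could spot an inference runtime by scanning process command lines. It is an illustration only, not the genesi-aid source: the `KNOWN_RUNTIMES` table, the binary names in it, and the `classify` and `scan_proc` helpers are assumptions made for this example.

```python
import os
from typing import Dict, List, Optional

# Hypothetical match table. The binary names are assumptions for this sketch;
# the rules genesi-aid actually uses are not described in this post.
KNOWN_RUNTIMES = {
    "ollama": "Ollama",
    "llama-server": "llama.cpp",
    "llama-cli": "llama.cpp",
    "vllm": "vLLM",
    "local-ai": "LocalAI",
}


def classify(argv: List[str]) -> Optional[str]:
    """Return a runtime name if one of the first argv entries matches."""
    # Look past argv[0] so that "python /venv/bin/vllm serve" is also caught.
    for arg in argv[:3]:
        name = KNOWN_RUNTIMES.get(os.path.basename(arg))
        if name is not None:
            return name
    return None


def scan_proc(proc_root: str = "/proc") -> Dict[int, str]:
    """Map PID -> runtime name for every matching process under proc_root."""
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited or is not readable
        argv = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
        runtime = classify(argv)
        if runtime is not None:
            found[int(entry)] = runtime
    return found
```

A polling loop around `scan_proc()` would be the simplest way to drive this. A matcher this naive misses runtimes launched as `python -m ...`, so a real daemon would need more careful rules.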

Concretely: the CPU governor switches to performance, vm.swappiness drops to 10, 2 MB transparent huge pages are enabled and pre-allocated, the inference threads are reniced to -5 and pinned to physical performance cores, and I/O readahead is tuned for large GGUF files.
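For readers who want to see what these knobs look like on a stock Linux system, the sketch below maps each one to its standard procfs or sysfs path. It is a rough approximation under stated assumptions, not the daemon's actual code: the helper names, the `always` huge-page setting, and the readahead value and block device passed in by the caller are all placeholders. Writing these files requires root.

```python
import os


def build_plan(cpu_ids, block_device, readahead_kb):
    """Return (path, value) pairs for the system-wide tuning knobs."""
    plan = [
        (f"/sys/devices/system/cpu/cpu{c}/cpufreq/scaling_governor", "performance")
        for c in cpu_ids
    ]
    plan.append(("/proc/sys/vm/swappiness", "10"))
    # Assumption: "enabled" is taken to mean the "always" THP mode.
    plan.append(("/sys/kernel/mm/transparent_hugepage/enabled", "always"))
    # Assumption: the readahead size is caller-chosen; the post gives no value.
    plan.append((f"/sys/block/{block_device}/queue/read_ahead_kb", str(readahead_kb)))
    return plan


def apply_plan(plan):
    """Write every value in the plan (needs root)."""
    for path, value in plan:
        with open(path, "w") as f:
            f.write(value)


def prioritize(pid, perf_cores, nice=-5):
    """Renice and pin every thread of pid to the given set of core ids."""
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            # On Linux both calls act on a single thread when given a thread id.
            os.setpriority(os.PRIO_PROCESS, int(tid), nice)
            os.sched_setaffinity(int(tid), perf_cores)
        except ProcessLookupError:
            continue  # thread exited while we were iterating
```

For example, `build_plan([0, 1], "nvme0n1", 4096)` yields five entries: two governor writes, swappiness, the huge-page mode, and readahead. Which cores count as performance cores is hardware-specific and left to the caller here.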

The result is a 15–25% speedup on CPU-only inference in our testing. Crucially, every change is reversible: the moment inference stops, genesi-aid restores the previous state so your compiler gets its resources back.
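The reversibility guarantee is essentially a snapshot-and-restore pattern. A minimal sketch, assuming each setting is a plain text file that can be read and written: the `tuned` context manager and `current_value` helper are hypothetical names for this example, and `demo` uses a temporary file as a stand-in for `/proc/sys/vm/swappiness` so it runs without root.

```python
import contextlib
import os
import tempfile


def current_value(text):
    """Extract the active setting from a knob's contents."""
    text = text.strip()
    # Some sysfs files list all choices and bracket the active one,
    # e.g. "always [madvise] never".
    if "[" in text and "]" in text:
        return text[text.index("[") + 1 : text.index("]")]
    return text


@contextlib.contextmanager
def tuned(plan):
    """Apply (path, value) pairs, then restore whatever was there before."""
    saved = []
    try:
        for path, value in plan:
            with open(path) as f:
                saved.append((path, current_value(f.read())))
            with open(path, "w") as f:
                f.write(value)
        yield
    finally:
        # Undo in reverse order, even if applying failed partway through.
        for path, old in reversed(saved):
            with open(path, "w") as f:
                f.write(old)


def demo():
    """Show the round trip on a temporary stand-in file."""
    with tempfile.TemporaryDirectory() as d:
        knob = os.path.join(d, "swappiness")
        with open(knob, "w") as f:
            f.write("60\n")
        with tuned([(knob, "10")]):
            with open(knob) as f:
                during = f.read()
        with open(knob) as f:
            after = f.read()
    return during, after
```

Calling `demo()` returns `("10", "60")`: the tuned value while the block is active and the original value afterwards. Restoring saved values rather than fixed defaults is what keeps a user's own custom settings intact.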

You can watch all of this happen. The AI Mode Plasma widget lights up and lists the detected process, and the dedicated Genesi Monitor app shows the optimizations applied along with a live tokens/s readout.
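A live tokens/s figure is typically a sliding-window average. The sketch below shows one way to compute such a readout. How Genesi Monitor actually obtains token counts is not covered in this post, so the `TokenRate` class and its inputs are purely illustrative.

```python
from collections import deque


class TokenRate:
    """Sliding-window tokens-per-second meter (illustrative only)."""

    def __init__(self, window_s=5.0):
        self.window_s = window_s
        self.samples = deque()  # (timestamp_s, tokens_generated)

    def record(self, timestamp, tokens):
        """Add a sample and drop any that fell out of the window."""
        self.samples.append((timestamp, tokens))
        cutoff = timestamp - self.window_s
        while self.samples and self.samples[0][0] <= cutoff:
            self.samples.popleft()

    def rate(self):
        """Tokens per second averaged over the window length."""
        return sum(n for _, n in self.samples) / self.window_s
```

Timestamps are passed in explicitly, which keeps the meter easy to test. In a real monitor they would come from a monotonic clock such as `time.monotonic()`. Dividing by the full window length means the figure under-reports until the first window has filled.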
