AI Mode

Automatic, reversible system tuning for local AI inference — the defining feature of Genesi OS.

AI Mode is a background daemon (genesi-aid) that detects when you're running a local AI workload and tunes the system for inference — then restores every setting when the workload ends. It's fully reversible: it captures each value before changing it and puts it back on exit (and survives a daemon restart via a /run snapshot).

What it tunes

Setting	Normal	AI Mode
CPU governor	`powersave`	`performance`
Swappiness	60	10 (keep weights in RAM)
Transparent huge pages	`madvise`	`always`
Inference process priority	0	−5 (higher)

On machines with a GPU, AI Mode also detects available VRAM (via nvidia-smi, DRM sysfs, or a vendor-agnostic Vulkan probe) to inform how much of a model can be offloaded. The guiding principle is "runs great on a potato" — every knob is RAM/VRAM-gated and degrades gracefully rather than pushing a weak machine into OOM.

Supported runtimes

Ollama · llama.cpp (llama-server, llama-cli) · vLLM · LocalAI · text-generation-webui · KoboldCPP · Oobabooga.

How much faster?

Based on testing with Ollama on CPU-only systems:

Tokens/second: +15–25%
Model load time: −30–40%
Memory stability: reduced swap usage

With a GPU the inference gains are smaller, but CPU/memory tuning still helps.

Using it

AI Mode activates automatically — just run a model. To control it manually, use the genesi-ai-mode CLI:

genesi-ai-mode status     # current state + detected processes
genesi-ai-mode on         # force on
genesi-ai-mode off        # force off
genesi-ai-mode auto       # back to automatic detection

Inspect the daemon directly:

sudo systemctl status genesi-aid
sudo journalctl -u genesi-aid -f

AI Mode widget

The desktop ships a Plasma 6 widget (and a Monitor app) showing whether AI Mode is active, the detected processes, the applied optimizations, and live tokens/second.

AI Mode

What it tunes

Supported runtimes

How much faster?

Using it

On this page