The Most Intuitive Operating System.
The Most Powerful Agentic Platform on the Planet.
It started as a tailor-made environment for agentic development. It ended up solving the biggest problems users face when switching to Linux.
Arch Linux + Hyprland + Ollama + Claude Code. Free download, open source.
Run LLMs Locally, Sub-500ms
A local LLM runs on your GPU, automatically selected to fit your VRAM. System queries are answered in under 500ms without sending a single byte to the internet. Your data never leaves your machine unless you choose to escalate.
Routed to local model (best fit for your VRAM). Runs wpctl via MCP. <500ms, 0 tokens sent anywhere.
Local model queries /sys/class/drm and radeontop. Returns structured data via MCP tools.
systemctl restart docker via MCP system_command. No escalation, no cloud roundtrip.
Auto-escalates to Claude Sonnet. Router detects code_debug category, selects cloud model.
Multi-Model Routing
Every query is automatically routed to the best model for the job, backed by verified benchmark data from 12 sources. Works with zero API keys out of the box.
| Query Type | Routes To | Cost |
|---|---|---|
| Math & reasoning | Gemini 3 Flash / Pro | $0.50/M tokens |
| Frontend & web dev | Claude (Sonnet / Opus) | Plan or API key |
| General coding | Claude (Sonnet / Opus) | Plan or API key |
| System commands | Best local model (VRAM-aware) | $0 |
| Local reasoning | GPT-OSS:20b | $0 |
| Quick web queries | Groq / Gemini Flash | $0 |
| Budget code tasks | Devstral / qwen3-coder | $0 |
| DevOps & terminal | GPT-5.3 Codex | API key |
Local Ollama + Groq free (14,400 req/day, includes 70B) + Gemini free (15 RPM) + Devstral (unlimited). Zero API keys needed.
Bring your own API keys for Anthropic, OpenAI, Google, Groq, or Mistral. All providers except Anthropic use one OpenAI-compatible endpoint.
PyTorch MLP classifier retrains every 50 queries on your usage patterns. Fallback chain: ML classifier, regex patterns, default route.
Why We Still Use Claude for Most Things
The routing table sends math to Gemini and quick lookups to Groq. But the majority of queries still go to Claude. Here is why.
Claude Pro / Max, not API
Costa OS authenticates with your Claude subscription. No API billing, no metered usage. You pay for the plan you already have, and the OS routes through it. This makes Claude effectively free for plan subscribers while other cloud providers charge per token.
Chatbot Arena #1, SWE-bench 80.8%
Claude leads Chatbot Arena (overall quality), tops WebDev Arena for frontend work, and holds 80.8% on SWE-bench Verified for real-world code fixes. Gemini wins on specific benchmarks (GPQA, AIME), but Claude is the most reliable generalist across task types.
System prompts that actually work
The local router depends on models following structured instructions: output format, tool selection, safety constraints. Claude follows complex multi-step system prompts more reliably than alternatives, which matters when the output drives real system commands.
The Integration Infrastructure
This is the real reason. Benchmark scores shift every quarter, but the tooling ecosystem is a durable advantage that no other provider matches.
The agent runtime that powers Costa OS. Hooks, plugins, slash commands, custom agents, autonomous sessions, and MCP server integration. No other LLM has an equivalent local agent framework with this depth of system access.
The protocol that connects the agent to 30+ system tools. Screen reading, window management, file ops, Obsidian vault, CLI wrappers. Claude created MCP and has first-class support. Other models can use MCP tools through Claude Code, but the native integration is deeper.
PreToolUse / PostToolUse hooks validate every action before execution. Custom agents with scoped tools and resource queues. Session scheduling with budget caps. Costa Flow YAML workflows with claude-code step types. This is infrastructure, not a chat wrapper.
21 knowledge files, CLAUDE.md project instructions, Obsidian vault via MCP, persistent memory across sessions. The agent accumulates context about your system, your preferences, and your projects. Switching the underlying model would lose all of this integration.
The Agent Has Real System Access
Not a chatbot. The AI executes commands, manages windows, controls audio, and navigates apps through a proprietary MCP server. "Close Firefox and open VS Code on workspace 3" just works.
30+ MCP Tools
AT-SPI screen reading, window management, typing, clicking, vault search, CLI registry. The agent interacts with your desktop through real APIs, not screenshots.
CLI-Anything Fast Path
12 CLI wrappers respond in ~50ms with 0 LLM tokens. Firefox tabs, VS Code workspace, OBS status, Strawberry playback. Deterministic, extensible registry.
Invisible Virtual Monitor
The agent operates on HEADLESS-2, a virtual monitor. Opens browsers, navigates pages, fills forms, and researches without touching your displays.
Claude Code + 5 MCP Servers
5 MCP servers (costa-system, code-review-graph, context7, claude-code-enhanced, voicemode), knowledge files, Obsidian vault, and 7 specialized agents. costa-session schedules autonomous Claude Code sessions with budget caps. Costa Flow defines YAML workflows.
Talk to Your Computer
Push-to-talk voice that actually understands your system. GPU-accelerated transcription, noise cancellation, 2-5 second end-to-end response.
Push-to-Talk
Hold a hotkey and speak. Release to submit. Say 'draft' to review first.
GPU Transcription
Vulkan-accelerated speech-to-text. Your voice becomes text in half a second.
Noise Cancellation
LADSPA noise reduction crushes background noise. Auto-detects when you stop talking.
Every Input Works
Speak, type in the bar, use the terminal, paste a screenshot. All inputs feed the same agent.
Just Describe What You Want
You don't need to know Linux. You don't need to know how to code. Just describe what you want in plain English.
Beautiful and Developer-Ready
A custom Mediterranean coastal theme across 15+ config domains. Pre-configured dev tools. Purpose-built desktop shell. Or change it all with one prompt.
Music Widget
Floating MPRIS controller with album art, queue browsing, library search, playlist switching, live audio quality badge, and support for 10+ players.
Keybind Editor
Visual GUI with keyboard recorder, conflict detection, mouse button discovery, per-device bindings, and support for every Hyprland bind type.
Settings Hub
Central GTK4 panel for display, input, model configuration, Claude Plan login or API keys, dev tools, and system updates.
Clipboard Intelligence
Auto-classifies pasted content including errors, URLs, JSON, commands, and code, then offers contextual actions.
Screenshot Analysis
Select any region, get instant analysis with OCR extraction, error detection, and auto-classification.
Persistent Memory
Obsidian vault connected via MCP. Claude reads and writes notes to remember your preferences, track projects, and store corrections across sessions.
Autonomous Sessions
costa-session schedules Claude Code sessions with budget caps and tool restrictions. Come back to commits, summaries, and desktop notifications.
7-Agent System
Specialized agents for deploys, server ops, architecture review, ISO builds, monitoring, cleanup, and screen navigation.
AI-Assisted Updates
Run costa-update and Claude pulls from GitHub, reviews every change, and fixes breakage automatically. Updates never touch our servers.
Why This Requires Linux
These capabilities depend on architectural features that Windows and macOS do not expose to applications.
Direct Compositor Access
Hyprland exposes every window, workspace, and input event via IPC. The agent reads and controls your desktop directly without accessibility hacks, screen scraping, or COM automation. Windows has no equivalent.
GPU Memory Control
Linux lets you query, allocate, and release VRAM programmatically. The VRAM manager hot-swaps ML models in real time based on what you're doing. Windows locks GPU memory behind driver abstractions you can't touch.
System-Wide Agent Integration
PipeWire, systemd, pacman, and hyprctl all have scriptable interfaces. The agent wires into audio routing, service management, package installs, and window control natively. On Windows, each one requires a different proprietary API with different permissions.
Zero-Overhead Voice Pipeline
Raw audio capture, LADSPA noise reduction, Vulkan-accelerated transcription, and direct text injection, all in user space with no kernel mode switches.
Filesystem as API
Everything is a file. Config changes take effect immediately. No registry, no restart, no 'applying changes.' The agent edits a config and reloads. Response times are measured in milliseconds.
Full Root Intelligence
The agent has the same access you do. Install any package, modify any service, create any systemd unit, change any config. No UAC, no Group Policy, no Defender blocking legitimate operations.
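The "Filesystem as API" point above can be sketched concretely: a config change is just a text edit followed by a reload command. This is a hypothetical illustration; the key name and reload call are examples, not Costa OS internals.

```python
import re

def set_option(conf_text: str, key: str, value: str) -> str:
    """Set `key = value` in a Hyprland-style config, replacing an
    existing line or appending a new one."""
    pattern = re.compile(rf"^(\s*){re.escape(key)}\s*=.*$", re.M)
    if pattern.search(conf_text):
        return pattern.sub(rf"\g<1>{key} = {value}", conf_text)
    return conf_text.rstrip("\n") + f"\n{key} = {value}\n"

conf = "general {\n    gaps_in = 5\n}\n"
conf = set_option(conf, "gaps_in", "10")
print(conf)

# After writing the file back, applying it is one process call away:
# subprocess.run(["hyprctl", "reload"], check=True)
```

No registry, no restart: the edit and the reload are the whole transaction.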
Companion Projects
Open source tools built alongside Costa OS. Ship with the ISO and work standalone on any Linux distribution.
airpods-helper
Native AirPods Pro on Linux & Windows
Full AirPods Pro integration for Linux and Windows. A Rust daemon speaks the Apple Accessory Protocol over raw Bluetooth for features Apple restricts to macOS/iOS. Optional install with Costa OS. Works standalone on any distro.
Rust + Tokio + BlueZ + Tauri + PipeWire + D-Bus
costa-terminal
Multi-Provider AI Terminal
A native terminal app for local and cloud AI. Routes queries across Ollama, Groq, Gemini, Mistral, and Claude using an ML classifier. Streaming responses, full Claude Code sessions with tool call visibility, and a settings wizard that auto-detects your hardware.
Rust + Tauri v2 + SolidJS + Tailwind + ONNX
Included in v1.3.4
Adapts to Your Hardware
The VRAM manager automatically selects the largest model your GPU can fit. Launch a game and models unload. Close it and they reload in seconds.
| Model | Params | Q4 VRAM | Q8 VRAM | Notes |
|---|---|---|---|---|
| qwen3.5:0.8b | 0.8B | 1.5 GB | 2 GB | |
| qwen3.5:2b | 2B | 2.5 GB | 3.5 GB | |
| qwen3.5:4b | 4.7B | 4 GB | 6.5 GB | |
| qwen3.5:9b | 9.7B | 7 GB | 12 GB | |
| qwen3:14b | 14B | 9 GB | 16 GB | |
| apriel-15b-thinker | 15B | 10 GB | 17 GB | |
| gpt-oss:20b | 21B (3.6B active) | 16 GB (MXFP4) | — | MoE |
| llama3.1:8b | 8B | 5.5 GB | 9.5 GB | |
| llama3.3:70b | 70B | 40 GB | 74 GB | |
| mistral-small3.1:24b | 24B | 14 GB | 26 GB | |
| mistral-nemo:12b | 12B | 8 GB | 14 GB | |
| gemma3:4b | 4B | 3.5 GB | 5.5 GB | |
| gemma3:12b | 12B | 8 GB | 14 GB | |
| gemma3:27b | 27B | 17 GB | 30 GB | |
| phi4:14b | 14B | 9 GB | 16 GB | |
| deepseek-r1:7b | 7B | 5 GB | 8.5 GB | |
| deepseek-r1:14b | 14B | 9 GB | 16 GB | |
| deepseek-r1:32b | 32B | 19 GB | 35 GB | |
| devstral | 24B | 14 GB | 26 GB | MoE |
| qwen3-coder | 30B (3.3B active) | 8.5 GB | 18 GB | MoE |
| qwen2.5-coder:14b | 14B | 10 GB | 16 GB | |
VRAM figures are approximate at default context length. MoE models load all parameters but only activate a fraction per token.
Install Costa OS
From download to a working desktop in about 15 minutes.
Download the ISO
~2.1 GB file. Save it anywhere on your computer.
Flash to USB
Use balenaEtcher (free, works on Windows/Mac/Linux). Select the ISO, select your USB drive, click Flash.
Boot from USB
Restart your computer and press the boot menu key (F12 for Dell, F9 for HP, F12 for Lenovo, F8 for ASUS).
Run the installer
Graphical installer launches automatically. Pick a disk, set a username, done.
First boot setup
Log into Claude Code first (it can fix everything else). Then hardware detection, model download, and voice setup run automatically.
Requirements
64-bit processor, 4GB+ RAM
Any GPU from the last ~10 years (including integrated graphics)
USB drive, 8GB or larger
Internet connection for first boot setup
Anthropic account (Claude Pro, Max, or API key) for cloud models
Support Costa OS
Costa OS is free and fully functional. If you'd like to support development, a one-time $9.99 purchase removes the small "Costa" watermark from the status bar. That's it. No features locked, no subscriptions, no telemetry.
Costa OS Pro
One-time purchase. No subscription.
- ✓ Remove status bar watermark
- ✓ Offline license. Works forever, no account needed
- ✓ Support an independent developer
The Costa OS intelligence layer is open source under the Apache License 2.0. The installer and ISO distribution are proprietary.