v1.3.4 Live

The Most Intuitive Operating System.
The Most Powerful Agentic Platform on the Planet.

It started as a purpose-built environment for agentic development. It ended up solving the biggest problems users face when switching to Linux.

Arch Linux + Hyprland + Ollama + Claude Code. Free download, open source.

Run LLMs Locally, Sub-500ms

A local LLM runs on your GPU, automatically selected to fit your VRAM. System queries are answered in under 500ms without sending a single byte to the internet. Your data never leaves your machine unless you choose to escalate.

$ "Turn up the volume"

Routed to local model (best fit for your VRAM). Runs wpctl via MCP. <500ms, 0 tokens sent anywhere.

$ "What's using my GPU?"

Local model queries /sys/class/drm and radeontop. Returns structured data via MCP tools.

$ "Restart Docker"

systemctl restart docker via MCP system_command. No escalation, no cloud roundtrip.
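The three local queries above all resolve to ordinary system commands. A hypothetical sketch of that intent-to-command mapping; the tool names and sysfs path are illustrative, not Costa OS internals:

```python
# Hypothetical intent-to-command registry. The real MCP tool names and
# routing logic in Costa OS may differ; this only shows the shape.
COMMAND_MAP = {
    "volume_up":      ["wpctl", "set-volume", "@DEFAULT_AUDIO_SINK@", "5%+"],
    "restart_docker": ["systemctl", "restart", "docker"],
    "gpu_usage":      ["cat", "/sys/class/drm/card0/device/gpu_busy_percent"],
}

def build_command(intent: str) -> list[str]:
    """Return the argv for a recognized intent, or raise for unknown ones."""
    try:
        return COMMAND_MAP[intent]
    except KeyError:
        raise ValueError(f"no local command for intent {intent!r}")
```

Because the mapping is deterministic, a recognized intent never needs a cloud roundtrip; unknown intents fall through to the model router.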

$ "Debug this segfault in my Rust code"

Auto-escalates to Claude Sonnet. Router detects code_debug category, selects cloud model.

<500ms local response time
$0 for system queries
0 bytes sent to cloud
Benchmark-verified

Multi-Model Routing

Every query is automatically routed to the best model for the job, backed by verified benchmark data from 12 sources. Works with zero API keys out of the box.

Query Type         | Routes To                     | Cost
-------------------|-------------------------------|----------------
Math & reasoning   | Gemini 3 Flash / Pro          | $0.50/M tokens
Frontend & web dev | Claude (Sonnet / Opus)        | Plan or API key
General coding     | Claude (Sonnet / Opus)        | Plan or API key
System commands    | Best local model (VRAM-aware) | $0
Local reasoning    | GPT-OSS:20b                   | $0
Quick web queries  | Groq / Gemini Flash           | $0
Budget code tasks  | Devstral / qwen3-coder        | $0
DevOps & terminal  | GPT-5.3 Codex                 | API key
Free tier

Local Ollama + Groq free (14,400 req/day, includes 70B) + Gemini free (15 RPM) + Devstral (unlimited). Zero API keys needed.

BYOK

Bring your own API keys for Anthropic, OpenAI, Google, Groq, or Mistral. All providers except Anthropic use one OpenAI-compatible endpoint.
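Because every non-Anthropic provider speaks the same OpenAI-compatible request shape, switching providers is just switching base URLs. A sketch; the endpoints shown are the providers' published OpenAI-compatible base URLs at the time of writing, so verify them before use:

```python
# Illustrative provider registry: one request shape, many base URLs.
PROVIDERS = {
    "openai":  "https://api.openai.com/v1",
    "groq":    "https://api.groq.com/openai/v1",
    "mistral": "https://api.mistral.ai/v1",
    "google":  "https://generativelanguage.googleapis.com/v1beta/openai",
    "ollama":  "http://localhost:11434/v1",  # local, no key needed
}

def chat_request(provider: str, model: str, prompt: str) -> dict:
    """Build the uniform chat/completions request; only the URL varies."""
    return {
        "url": f"{PROVIDERS[provider]}/chat/completions",
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```

Anthropic is the one exception: its Messages API has a different shape, which is why it gets a dedicated code path.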

Self-improving

PyTorch MLP classifier retrains every 50 queries on your usage patterns. Fallback chain: ML classifier, regex patterns, default route.
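The fallback chain can be sketched as a three-stage function. The regex patterns and category labels below are illustrative, not Costa OS internals:

```python
import re

# Stage order matches the chain above: ML classifier, then regex, then default.
REGEX_ROUTES = [
    (re.compile(r"\b(segfault|traceback|stack trace|debug)\b", re.I), "code_debug"),
    (re.compile(r"\b(install|restart|systemctl|volume)\b", re.I), "system_command"),
]

def route(query: str, ml_classifier=None, default="local_general") -> str:
    # Stage 1: learned classifier (may be untrained or unavailable early on)
    if ml_classifier is not None:
        try:
            return ml_classifier(query)
        except Exception:
            pass  # fall through to regex
    # Stage 2: hand-written regex patterns
    for pattern, category in REGEX_ROUTES:
        if pattern.search(query):
            return category
    # Stage 3: default route
    return default
```

The regex stage means routing degrades gracefully: a fresh install with no training data still sends system queries to the right place.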

Why We Still Use Claude for Most Things

The routing table sends math to Gemini and quick lookups to Groq. But the majority of queries still go to Claude. Here is why.

Plan Usage

Claude Pro / Max, not API

Costa OS authenticates with your Claude subscription. No API billing, no metered usage. You pay for the plan you already have, and the OS routes through it. This makes Claude effectively free for plan subscribers while other cloud providers charge per token.

Quality + Reliability

Chatbot Arena #1, SWE-bench 80.8%

Claude leads Chatbot Arena (overall quality), tops WebDev Arena for frontend work, and holds 80.8% on SWE-bench Verified for real-world code fixes. Gemini wins on specific benchmarks (GPQA, AIME), but Claude is the most reliable generalist across task types.

Instruction Following

System prompts that actually work

The local router depends on models following structured instructions: output format, tool selection, safety constraints. Claude follows complex multi-step system prompts more reliably than alternatives, which matters when the output drives real system commands.

The Integration Infrastructure

This is the real reason. Benchmark scores shift every quarter, but the tooling ecosystem is a durable advantage that no other provider matches.

Claude Code

The agent runtime that powers Costa OS. Hooks, plugins, slash commands, custom agents, autonomous sessions, and MCP server integration. No other LLM has an equivalent local agent framework with this depth of system access.

MCP (Model Context Protocol)

The protocol that connects the agent to 30+ system tools. Screen reading, window management, file ops, Obsidian vault, CLI wrappers. Claude created MCP and has first-class support. Other models can use MCP tools through Claude Code, but the native integration is deeper.

Agent SDK + Hooks

PreToolUse / PostToolUse hooks validate every action before execution. Custom agents with scoped tools and resource queues. Session scheduling with budget caps. Costa Flow YAML workflows with claude-code step types. This is infrastructure, not a chat wrapper.
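A PreToolUse-style validation hook can be sketched as a small decision function. Claude Code hooks read a JSON event on stdin and can block a call by exiting nonzero with a message on stderr; the deny list below is illustrative, not the Costa OS one:

```python
# Illustrative deny list; the real hook's rules are more extensive.
DENY_SUBSTRINGS = ["rm -rf /", "mkfs", "dd if=/dev/zero", ":(){"]

def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message): 0 allows the tool call, 2 blocks it."""
    command = event.get("tool_input", {}).get("command", "")
    for bad in DENY_SUBSTRINGS:
        if bad in command:
            return 2, f"blocked: command contains {bad!r}"
    return 0, ""

# Wiring (omitted here): json.load(sys.stdin) for the event, print the
# message to stderr, sys.exit with the returned code.
```

Validating before execution, rather than auditing after, is what makes it safe to let the agent drive real system commands.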

Knowledge + Memory

21 knowledge files, CLAUDE.md project instructions, Obsidian vault via MCP, persistent memory across sessions. The agent accumulates context about your system, your preferences, and your projects. Switching the underlying model would lose all of this integration.

The Agent Has Real System Access

Not a chatbot. The AI executes commands, manages windows, controls audio, and navigates apps through a proprietary MCP server. "Close Firefox and open VS Code on workspace 3" just works.

30+ MCP Tools

30+ tools

AT-SPI screen reading, window management, typing, clicking, vault search, CLI registry. The agent interacts with your desktop through real APIs, not screenshots.

CLI-Anything Fast Path

~50ms agents

12 CLI wrappers respond in ~50ms with 0 LLM tokens. Firefox tabs, VS Code workspace, OBS status, Strawberry playback. Deterministic, extensible registry.
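A deterministic registry of this kind is essentially a dispatch table: look up the intent, run the handler, no model in the loop. The intents and stub handlers below are illustrative:

```python
import time

# Illustrative registry; real wrappers shell out to firefox/code/obs CLIs.
REGISTRY = {
    "firefox.tabs":     lambda: ["docs - Mozilla Firefox", "news - Mozilla Firefox"],
    "vscode.workspace": lambda: "costa-os",
    "obs.status":       lambda: {"recording": False, "streaming": False},
}

def dispatch(intent: str):
    """Run a registered handler deterministically; zero LLM tokens."""
    handler = REGISTRY.get(intent)
    if handler is None:
        return None  # unknown intent: fall back to the model-based router
    start = time.perf_counter()
    result = handler()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {"result": result, "ms": round(elapsed_ms, 2)}
```

Returning None for unknown intents keeps the fast path extensible: new wrappers are just new registry entries, and everything else falls through to the router.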

Invisible Virtual Monitor

112x

The agent operates on HEADLESS-2, a virtual monitor. Opens browsers, navigates pages, fills forms, and researches without touching your displays.

Claude Code + 5 MCP Servers

7 agents

5 MCP servers (costa-system, code-review-graph, context7, claude-code-enhanced, voicemode), knowledge files, Obsidian vault, 7 specialized agents. costa-session schedules autonomous Claude Code sessions with budget caps. Costa Flow defines YAML workflows.

Talk to Your Computer

Push-to-talk voice that actually understands your system. GPU-accelerated transcription, noise cancellation, 2-5 second end-to-end response.

One key

Push-to-Talk

Hold a hotkey and speak. Release to submit. Say 'draft' to review first.

0.5s

GPU Transcription

Vulkan-accelerated speech-to-text. Your voice becomes text in half a second.

99.8%

Noise Cancellation

LADSPA noise reduction crushes background noise. Auto-detects when you stop talking.

Any input

Every Input Works

Speak, type in the bar, use the terminal, paste a screenshot. All inputs feed the same agent.

Just Describe What You Want

You don't need to know Linux. You don't need to know how to code. Just describe what you want in plain English.

Learn terminal commands → Just say what you need: "Install Python and set up a project"
Edit config files by hand → Describe the change you want: "Make my desktop background darker"
Search forums for fixes → Ask and it gets fixed: "My sound isn't working, fix it"
MCP Workflows
$ Build an MCP server that controls my smart lights from the desktop
Neural Routing
$ Retrain the ML router with my latest query logs and evaluate accuracy
Agent Workflows
$ Set up a custom Claude Code workflow for automated code review on every commit
Full Apps
$ Create a voice command that deploys my staging branch to production
Persistent Memory
$ Check your notes about the auth refactor and what I told you about the API design
Workflow Automation
$ Run the security scan workflow and send results to my Telegram

Beautiful and Developer-Ready

A custom Mediterranean coastal theme across 15+ config domains. Pre-configured dev tools. Purpose-built desktop shell. Or change it all with one prompt.

Music Widget

Floating MPRIS controller with album art, queue browsing, library search, playlist switching, live audio quality badge, and support for 10+ players.

Keybind Editor

Visual GUI with keyboard recorder, conflict detection, mouse button discovery, per-device bindings, and support for every Hyprland bind type.

Settings Hub

Central GTK4 panel for display, input, model configuration, Claude Plan login or API keys, dev tools, and system updates.

Clipboard Intelligence

Auto-classifies pasted content including errors, URLs, JSON, commands, and code, then offers contextual actions.
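That kind of auto-classification can be sketched with ordinary parsing and regex; the category names below match the list above, but the rules are hypothetical:

```python
import json
import re

def classify_clipboard(text: str) -> str:
    """Illustrative classifier for pasted clipboard content."""
    text = text.strip()
    # JSON: only attempt a parse if it plausibly is one
    if text[:1] in "{[":
        try:
            json.loads(text)
            return "json"
        except ValueError:
            pass
    if re.match(r"https?://\S+$", text):
        return "url"
    if re.search(r"Traceback|Segmentation fault|error:|Exception", text):
        return "error"
    if re.match(r"(\$ |sudo |git |systemctl |pacman )", text):
        return "command"
    if re.search(r"(?m)^(def |class |fn |#include|import )", text):
        return "code"
    return "text"
```

The classification then drives the contextual actions: an "error" paste offers "debug this", a "command" paste offers "run it", and so on.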

Screenshot Analysis

Select any region, get instant analysis with OCR extraction, error detection, and auto-classification.

Persistent Memory

Obsidian vault connected via MCP. Claude reads and writes notes to remember your preferences, track projects, and store corrections across sessions.

Autonomous Sessions

costa-session schedules Claude Code sessions with budget caps and tool restrictions. Come back to commits, summaries, and desktop notifications.

7-Agent System

Specialized agents for deploys, server ops, architecture review, ISO builds, monitoring, cleanup, and screen navigation.

AI-Assisted Updates

Run costa-update and Claude pulls from GitHub, reviews every change, and fixes breakage automatically. Updates never touch our servers.

Why This Requires Linux

These capabilities depend on architectural features that Windows and macOS do not expose to applications.

Direct Compositor Access

Hyprland exposes every window, workspace, and input event via IPC. The agent reads and controls your desktop directly without accessibility hacks, screen scraping, or COM automation. Windows has no equivalent.

GPU Memory Control

Linux lets you query, allocate, and release VRAM programmatically. The VRAM manager hot-swaps ML models in real time based on what you're doing. Windows locks GPU memory behind driver abstractions you can't touch.

System-Wide Agent Integration

PipeWire, systemd, pacman, and hyprctl all have scriptable interfaces. The agent wires into audio routing, service management, package installs, and window control natively. On Windows, each one requires a different proprietary API with different permissions.

Zero-Overhead Voice Pipeline

Raw audio capture, LADSPA noise reduction, Vulkan-accelerated transcription, and direct text injection, all in user space with no kernel mode switches.

Filesystem as API

Everything is a file. Config changes take effect immediately. No registry, no restart, no 'applying changes.' The agent edits a config and reloads. Response times are measured in milliseconds.
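That edit-and-reload loop can be sketched in a few lines. The `key = value` format matches Hyprland-style configs; the key names are illustrative, and applying the change (e.g. via `hyprctl reload`) is left out:

```python
from pathlib import Path

def set_config_value(path: Path, key: str, value: str) -> None:
    """Rewrite one `key = value` line in place, appending it if missing."""
    lines = path.read_text().splitlines()
    updated, found = [], False
    for line in lines:
        if line.split("=")[0].strip() == key:
            updated.append(f"{key} = {value}")
            found = True
        else:
            updated.append(line)
    if not found:
        updated.append(f"{key} = {value}")
    path.write_text("\n".join(updated) + "\n")
```

The whole cycle is a file write plus a reload signal, which is why the response times are measured in milliseconds rather than in "applying changes" dialogs.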

Full Root Intelligence

The agent has the same access you do. Install any package, modify any service, create any systemd unit, change any config. No UAC, no Group Policy, no Defender blocking legitimate operations.

Companion Projects

Open source tools built alongside Costa OS. Ship with the ISO and work standalone on any Linux distribution.

airpods-helper

Native AirPods Pro on Linux & Windows

Full AirPods Pro integration for Linux and Windows. A Rust daemon speaks the Apple Accessory Protocol over raw Bluetooth for features Apple restricts to macOS/iOS. Optional install with Costa OS. Works standalone on any distro.

Rust + Tokio + BlueZ + Tauri + PipeWire + D-Bus

Learn More
ANC control (Off / Noise Cancel / Transparency / Adaptive)
Per-bud + case battery with charging status
Conversational awareness & ear detection
Parametric EQ via PipeWire filter chains
Desktop app (Tauri) + GTK4 bar widget
CLI + D-Bus (Linux) / HTTP API (Windows)

costa-terminal

Multi-Provider AI Terminal

A native terminal app for local and cloud AI. Routes queries across Ollama, Groq, Gemini, Mistral, and Claude using an ML classifier. Streaming responses, full Claude Code sessions with tool call visibility, and a settings wizard that auto-detects your hardware.

Rust + Tauri v2 + SolidJS + Tailwind + ONNX

Included in v1.3.4
ML-based query routing across local and cloud providers
Streaming responses via Tauri events
Claude Code sessions with tool call activity panel
Auto-discovery of Ollama models and cloud API keys
4-step onboarding with GPU/VRAM detection
Provider management with tier-based worker pools

Adapts to Your Hardware

The VRAM manager automatically selects the largest model your GPU can fit. Launch a game and models unload. Close it and they reload in seconds.

Model       | Ollama tag            | Params                | Q4 VRAM      | Q8 VRAM
------------|-----------------------|-----------------------|--------------|--------
Qwen 3.5    | qwen3.5:0.8b          | 0.8B                  | 1.5 GB       | 2 GB
            | qwen3.5:2b            | 2B                    | 2.5 GB       | 3.5 GB
            | qwen3.5:4b            | 4.7B                  | 4 GB         | 6.5 GB
            | qwen3.5:9b            | 9.7B                  | 7 GB         | 12 GB
Qwen 3      | qwen3:14b             | 14B                   | 9 GB         | 16 GB
Apriel      | apriel-15b-thinker    | 15B                   | 10 GB        | 17 GB
GPT-OSS     | gpt-oss:20b           | 21B (3.6B active) MoE | MXFP4: 16 GB | —
Llama 3     | llama3.1:8b           | 8B                    | 5.5 GB       | 9.5 GB
            | llama3.3:70b          | 70B                   | 40 GB        | 74 GB
Mistral     | mistral-small3.1:24b  | 24B                   | 14 GB        | 26 GB
            | mistral-nemo:12b      | 12B                   | 8 GB         | 14 GB
Gemma 3     | gemma3:4b             | 4B                    | 3.5 GB       | 5.5 GB
            | gemma3:12b            | 12B                   | 8 GB         | 14 GB
            | gemma3:27b            | 27B                   | 17 GB        | 30 GB
Phi         | phi4:14b              | 14B                   | 9 GB         | 16 GB
DeepSeek R1 | deepseek-r1:7b        | 7B                    | 5 GB         | 8.5 GB
            | deepseek-r1:14b       | 14B                   | 9 GB         | 16 GB
            | deepseek-r1:32b       | 32B                   | 19 GB        | 35 GB
Devstral    | devstral              | 24B MoE               | 14 GB        | 26 GB
Qwen Coder  | qwen3-coder           | 30B (3.3B active) MoE | 8.5 GB       | 18 GB
            | qwen2.5-coder:14b     | 14B                   | 10 GB        | 16 GB

VRAM figures are approximate at the default context length. MoE models load all parameters but only activate a fraction per token.

System Requirements
Minimum: 4 GB RAM, any GPU, 20 GB disk — desktop + cloud models, no local LLM
Recommended: 16 GB RAM, 8 GB VRAM, 40 GB disk — local 7B model + voice + cloud escalation
Full: 32 GB RAM, 12-16 GB VRAM, 80 GB disk — local 14B model, gaming + models simultaneously

Install Costa OS

From download to a working desktop in about 15 minutes.

1

Download the ISO

A ~2.1 GB file. Save it anywhere on your computer.

2

Flash to USB

Use balenaEtcher (free, works on Windows/Mac/Linux). Select the ISO, select your USB drive, click Flash.

3

Boot from USB

Restart your computer and press the boot menu key (F12 for Dell, F9 for HP, F12 for Lenovo, F8 for ASUS).

4

Run the installer

Graphical installer launches automatically. Pick a disk, set a username, done.

5

First boot setup

Log into Claude Code first (it can fix everything else). Then hardware detection, model download, and voice setup run automatically.

Download Costa OS v1.3.4 (2.1 GB)
Get balenaEtcher (USB flasher)

Requirements

64-bit processor, 4GB+ RAM

Any GPU from the last ~10 years (including integrated graphics)

USB drive, 8GB or larger

Internet connection for first boot setup

Anthropic account (Claude Pro, Max, or API key) for cloud models

Full installation guide with troubleshooting →
What's New
v1.3.4: Costa Terminal + model tier overhaul
New Costa Terminal app with multi-model orchestrator and worker pool. Apriel-15b as default model, gpt-oss:20b as premium tier. VRAM manager rewritten — fixes model swap budget bug. ISO manifest system for reproducible builds. Removed deprecated waybar configs.

v1.3.2: Adversarial security review
Sanitized personal data from training files. Hardened MCP command deny list, context gatherer secret redaction, and router command execution. Moved runtime files to XDG_RUNTIME_DIR. Fixed Firecrawl default credentials.

v1.3.1: Benchmark infrastructure overhaul
2048-token benchmark runner, free-tier LLM judge, VRAM manager updated for Qwen 3.5. Both ML router stages retrained.

v1.3.0: Security hardening + Firecrawl + shell stability
UFW firewall, Bluetooth encryption, kernel hardening, security scanning agent. Self-hosted Firecrawl web scraping. AGS crash supervisor with auto-restart.

Support Costa OS

Costa OS is free and fully functional. If you'd like to support development, a one-time $9.99 purchase removes the small "Costa" watermark from the status bar. That's it. No features locked, no subscriptions, no telemetry.

Costa OS Pro

One-time purchase. No subscription.

$9.99
  • Remove status bar watermark
  • Offline license. Works forever, no account needed
  • Support an independent developer

The Costa OS intelligence layer is open source under the Apache License 2.0. The installer and ISO distribution are proprietary.