// Reference
AI Model Selection
Switch the AI brain powering your session at any time, directly from the chat. The agent supports cloud providers, local models running on your hardware, and fully custom configurations.
/model — Switch Active Model
Opens an interactive menu to pick the AI model for the current session. The change takes effect immediately — no restart needed.
/model
The menu shows all available models grouped by type:
| Model | Type | Requires |
|---|---|---|
| GPT 5.5 | Cloud | OPENAI_API_KEY |
| GPT 4o | Cloud | OPENAI_API_KEY |
| Gemini 2.5 Flash | Cloud | GEMINI_API_KEY |
| Gemini 3.5 Flash | Cloud | GEMINI_API_KEY |
| Claude 4.8 Opus | Cloud | ANTHROPIC_API_KEY |
| Claude 3.5 Sonnet | Cloud | ANTHROPIC_API_KEY |
| DeepSeek R1 (API) | Cloud | DEEPSEEK_API_KEY |
| DeepSeek V4 Flash | Cloud | DEEPSEEK_API_KEY |
| Llama 3.2 1B Q4 | Local | Downloaded automatically |
| Qwen 2.5 1.5B Q8 | Local | Downloaded automatically |
| DeepSeek R1 Distill 8B Q4 | Local | Downloaded automatically |
| Phi-4 14B Q4 | Local | Downloaded automatically |
| Others (custom) | Custom | Model spec string |
Local models are downloaded automatically the first time you select them. No additional setup is required: the agent handles the download in the background.
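The "Requires" column can be checked before you pick a cloud model. The following is a minimal sketch, not part of the agent: the `REQUIRED_ENV` mapping and the `missing_key` helper are hypothetical names, and the mapping copies only a few rows from the table above.

```python
import os

# Hypothetical lookup copied from the "Requires" column of the table above.
# Local models need no API key, so they map to None.
REQUIRED_ENV = {
    "GPT 4o": "OPENAI_API_KEY",
    "Gemini 2.5 Flash": "GEMINI_API_KEY",
    "Claude 3.5 Sonnet": "ANTHROPIC_API_KEY",
    "DeepSeek R1 (API)": "DEEPSEEK_API_KEY",
    "Llama 3.2 1B Q4": None,
}


def missing_key(model, env=None):
    """Return the unset environment variable a model needs, or None if it is ready.

    Raises KeyError for a model name that is not in REQUIRED_ENV.
    """
    env = os.environ if env is None else env
    required = REQUIRED_ENV[model]
    if required is None:
        return None  # local model: nothing to configure
    # An empty string counts as unset.
    return None if env.get(required) else required


if __name__ == "__main__":
    for name in REQUIRED_ENV:
        var = missing_key(name)
        status = "ready" if var is None else f"set {var} first"
        print(f"{name}: {status}")
```

Running the script lists each model with either "ready" or the variable that still has to be exported in your shell.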
/model <spec> — Set Model Directly
Pass the model name directly to skip the menu:
/model claude-3-5-sonnet-latest
/model gpt-4o
/model gemini-2.5-flash
/model deepseek-r1-8b-q4    # local model
/model openai/gpt-4o        # explicit provider prefix
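The spec strings above come in two shapes: a bare model name, or a name with an explicit provider prefix separated by a slash. The sketch below shows one plausible way to split such a string. The `ModelSpec` type and `parse_model_spec` function are hypothetical illustrations, and the agent's actual parsing rules are not documented here.

```python
from typing import NamedTuple, Optional


class ModelSpec(NamedTuple):
    provider: Optional[str]  # None when no explicit prefix was given
    model: str


def parse_model_spec(spec: str) -> ModelSpec:
    """Split 'provider/model' or 'model' into its parts (assumed format)."""
    spec = spec.strip()
    if not spec:
        raise ValueError("empty model spec")
    # Split on the first slash only, so a model name may itself contain slashes.
    provider, sep, model = spec.partition("/")
    if not sep:
        return ModelSpec(None, spec)
    if not provider or not model:
        raise ValueError(f"malformed model spec: {spec!r}")
    return ModelSpec(provider, model)


if __name__ == "__main__":
    print(parse_model_spec("gpt-4o"))
    print(parse_model_spec("openai/gpt-4o"))
```

A bare name leaves `provider` as `None`, which matches the idea that the prefix is optional and only needed to be explicit about the provider.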
/models — Multi-Brain Routing
Assign different AI models to different tasks within the same session. The agent automatically routes each request to the appropriate brain based on what you're doing.
# Use Claude for security tasks, GPT-4o for everything else
/models security:claude-3-5-sonnet-latest default:gpt-4o
# Use a local model for code and a cloud model for analysis
/models code:deepseek-r1-8b-q4 analyze:gemini-2.5-flash
Once activated, the routing is shown in the confirmation and saved to your project config. The agent logs which brain handled each request.
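To make the `task:model` syntax concrete, here is a minimal sketch of how a routing table could be built from the `/models` arguments and consulted per request. `parse_routes` and `pick_model` are hypothetical names. How the agent classifies a request into a task such as `security` or `code` is not described here, and the behavior when no `default` entry exists is an assumption (the sketch returns `None`).

```python
from typing import Dict, Optional


def parse_routes(args: str) -> Dict[str, str]:
    """Turn 'task:model task:model ...' into a {task: model} mapping."""
    routes: Dict[str, str] = {}
    for pair in args.split():
        # Split on the first colon only; the rest belongs to the model spec.
        task, sep, model = pair.partition(":")
        if not sep or not task or not model:
            raise ValueError(f"expected task:model, got {pair!r}")
        routes[task] = model
    return routes


def pick_model(routes: Dict[str, str], task: str) -> Optional[str]:
    """Return the model for a task, falling back to the 'default' entry.

    Returns None when neither exists (assumption: the real agent may
    instead keep using the session's active model).
    """
    return routes.get(task, routes.get("default"))


if __name__ == "__main__":
    routes = parse_routes("security:claude-3-5-sonnet-latest default:gpt-4o")
    print(pick_model(routes, "security"))
    print(pick_model(routes, "code"))
```

With the first example configuration, a `security` request resolves to `claude-3-5-sonnet-latest`, and any other task resolves to the `default` entry, `gpt-4o`.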
Multi-brain routing lets you optimize for both cost and quality — use a fast local model for everyday tasks and a powerful cloud model only when you need it.