
Supported Providers

Fermi supports cloud APIs and local inference servers. Use fermi init to configure any combination.

| Provider | Models | Auth |
|---|---|---|
| Anthropic | Claude Haiku 4.5, Sonnet 4.6, Opus 4.6 (+ 1M context variants), Opus 4.7 | ANTHROPIC_API_KEY |
| OpenAI | GPT-5.2, 5.2 Codex, 5.3 Codex, 5.4, 5.4 Mini, 5.4 Nano, 5.5 | OPENAI_API_KEY or OAuth |
| GitHub Copilot | Fetched live from your plan’s catalog, e.g. Claude Opus 4.8/4.7, Sonnet 4.6, GPT-5.3 Codex, 5.4, 5.4 Mini, 5.5, 5 Mini | /copilot device-flow login |
| DeepSeek | V4 Flash, V4 Pro | Managed slot (FERMI_DEEPSEEK_*) |
| Kimi / Moonshot | K2.6, K2.5, K2 Instruct (Global, China, Code variants) | Managed slots (FERMI_KIMI_*) |
| MiniMax | M2.5, M2.5 Highspeed, M2.7, M2.7 Highspeed (Global, China) | Managed slots (FERMI_MINIMAX_*) |
| GLM / Zhipu | GLM-5.1, 5, 5 Turbo, 5V Turbo, 4.7 (Global, China, Code variants) | Managed slots (FERMI_GLM_*) |
| Xiaomi (MiMo) | V2.5, V2.5 Pro | Managed slot (FERMI_XIAOMI_*) |
| Qwen / DashScope | Qwen3.6 Plus, Qwen3.7 Max (China, Singapore, US regions) | Managed slots (FERMI_QWEN_*) |
| OpenRouter | Multi-vendor curated presets (Claude, GPT, Kimi, MiniMax, GLM, DeepSeek, Qwen, Xiaomi) + any custom model | OPENROUTER_API_KEY |
| Ollama | Any local model (dynamic discovery) | |
| oMLX | Any local MLX model (dynamic discovery) | |
| LM Studio | Any local GGUF model (dynamic discovery) | |

Cloud providers require either an API key or an OAuth login. The init wizard prompts for API keys and stores them in ~/.fermi/.env. Kimi, MiniMax, GLM, DeepSeek, Xiaomi, and Qwen use Fermi-managed internal slots (the FERMI_* entries in the table above). GitHub Copilot uses its own device-flow OAuth, started with /copilot. When you sign in to OpenAI with the ChatGPT login, the OAuth tokens are stored in ~/.fermi/state/oauth.json.
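To see which cloud keys the wizard has already stored, a small script can read the file directly. The following is a minimal sketch, not part of Fermi: it assumes ~/.fermi/.env uses plain KEY=value lines (the exact file layout is not documented here), and the helper names parse_env and configured_providers are hypothetical. Only the three key names come from the provider table.

```python
from pathlib import Path

# Key names taken from the Auth column of the provider table.
CLOUD_KEYS = {
    "Anthropic": "ANTHROPIC_API_KEY",
    "OpenAI": "OPENAI_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines. The dotenv-style layout is an assumption."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(env: dict[str, str]) -> dict[str, bool]:
    """Map each provider to True when its key is present and non-empty."""
    return {name: bool(env.get(var)) for name, var in CLOUD_KEYS.items()}


if __name__ == "__main__":
    env_path = Path.home() / ".fermi" / ".env"
    text = env_path.read_text() if env_path.exists() else ""
    for name, ok in configured_providers(parse_env(text)).items():
        print(f"{name}: {'key found' if ok else 'no key'}")
```

A missing OPENAI_API_KEY does not necessarily mean OpenAI is unconfigured, since the OAuth login stores its tokens in a separate file.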

Local providers (Ollama, oMLX, LM Studio) connect to a server on your machine. No API key needed. During fermi init, the wizard queries the server’s model endpoint to discover available models.
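Model discovery against a local server can be approximated as follows. This is a sketch under stated assumptions rather than Fermi's actual implementation: the document does not say which endpoints the wizard calls. The example relies on two response shapes that are publicly documented, Ollama's GET /api/tags (a "models" list with "name" fields) and the OpenAI-compatible GET /v1/models served by LM Studio (a "data" list with "id" fields). The function names model_names and discover are hypothetical.

```python
import http.client
import json
import urllib.request


def model_names(payload: dict) -> list[str]:
    """Extract model identifiers from either response shape."""
    if "models" in payload:  # Ollama: GET /api/tags
        return [m["name"] for m in payload["models"]]
    if "data" in payload:  # OpenAI-compatible: GET /v1/models
        return [m["id"] for m in payload["data"]]
    return []


def discover(url: str, timeout: float = 2.0) -> list[str]:
    """Query a model endpoint; return [] if the server is down or replies oddly."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return []
```

With default ports (assumed here, and changeable in each server's settings), discover("http://localhost:11434/api/tags") would query Ollama and discover("http://localhost:1234/v1/models") would query LM Studio. An empty list usually means the server is not running.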

Use /model during a session to switch between any configured model. For providers with missing keys, selecting a model can prompt you to import or paste the key on the spot.

Use /tier to assign models to high/medium/low tiers for sub-agents.

See Model Switching for details.

Third-party coding plans (Kimi-Code, GLM-Code) use whitelist-based access control. Unless your account has explicit access, these endpoints will reject requests. Standard API endpoints work normally.