# Model registry
Ironspire ships with 46 static model definitions across 12 providers, plus runtime discovery for local models. This page lists every built-in model with its context window, maximum output length, and capability flags.
Capability columns are marked Yes or No for quick scanning. Tools indicates whether the model supports function calling (MCP tools, file operations, and so on); Vision indicates whether the model can process images.
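Conceptually, each static definition bundles these limits and flags into a single record. A minimal sketch of that shape (field and variable names here are illustrative, not Ironspire's actual schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelDef:
    """One registry entry (illustrative fields, not Ironspire's schema)."""
    name: str
    context_window: int  # total tokens of input and history the model can attend to
    max_output: int      # maximum tokens in a single response
    tools: bool          # function calling: MCP tools, file operations, etc.
    vision: bool         # image inputs

# Example entries transcribed from the Claude table below.
SONNET_4_5 = ModelDef("Sonnet 4.5", 1_000_000, 64_000, tools=True, vision=True)
HAIKU_3_5 = ModelDef("Haiku 3.5", 200_000, 8_000, tools=True, vision=True)
```

A frozen dataclass keeps registry entries immutable, which suits data that is fixed at build time.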
## Claude Models
Available through both the Claude SDK and Claude API providers. Both providers offer the same models; the difference is billing (subscription vs. pay-per-token).
| Model | Context Window | Max Output | Tools | Vision |
|---|---|---|---|---|
| Opus 4.6 | 1,000,000 | 32,000 | Yes | Yes |
| Opus 4.5 | 1,000,000 | 32,000 | Yes | Yes |
| Sonnet 4.5 | 1,000,000 | 64,000 | Yes | Yes |
| Sonnet 4 | 200,000 | 64,000 | Yes | Yes |
| Haiku 4.5 | 200,000 | 8,000 | Yes | Yes |
| Haiku 3.5 | 200,000 | 8,000 | Yes | Yes |
Opus 4.6 is the flagship model, best suited for complex reasoning, architecture decisions, and multi-step problem solving. Sonnet 4.5 pairs a 64K output window (the largest in the Claude family, matched only by Sonnet 4) with a strong balance of capability and speed. Haiku models are the fastest and cheapest options, ideal for simple tasks, quick lookups, and high-volume parallel agents.
## OpenAI Models
Available through the OpenAI provider.
| Model | Context Window | Max Output | Tools | Vision |
|---|---|---|---|---|
| GPT-4o | 128,000 | 16,000 | Yes | Yes |
| GPT-4o-mini | 128,000 | 16,000 | Yes | Yes |
| GPT-4.1 | 1,000,000 | 32,000 | Yes | Yes |
| GPT-4.1-mini | 1,000,000 | 32,000 | Yes | Yes |
| GPT-4.1-nano | 1,000,000 | 32,000 | Yes | Yes |
| o3 | 200,000 | 100,000 | Yes | Yes |
| o3-mini | 200,000 | 100,000 | Yes | No |
| o4-mini | 200,000 | 100,000 | Yes | Yes |
GPT-4.1 and its variants offer million-token context windows, making them well suited for large codebases. The o-series models (o3, o3-mini, o4-mini) are reasoning-focused models with very large output windows, useful for complex analysis and code generation. Note that o3-mini does not support vision.
## Google Gemini Models
Available through the Google Gemini provider.
| Model | Context Window | Max Output | Tools | Vision |
|---|---|---|---|---|
| Gemini 2.5 Pro | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 2.5 Flash | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 2.5 Flash-Lite | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3.1 Pro | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3 Flash | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3.1 Flash-Lite | 1,000,000 | 64,000 | Yes | Yes |
All Gemini models share million-token context windows and 64K output limits. The entire family supports both tools and vision. Flash variants prioritise speed over depth, while Pro variants offer higher quality reasoning.
## Mistral Models
Available through the Mistral provider.
| Model | Context Window | Max Output | Tools | Vision |
|---|---|---|---|---|
| Large 3 | 262,000 | Default | Yes | No |
| Small 4 | 256,000 | Default | Yes | No |
| Codestral | 256,000 | Default | Yes | No |
| Devstral 2 | 262,000 | Default | Yes | No |
| Ministral 8B | 262,000 | Default | Yes | No |
A Max Output of "Default" means Ironspire does not set an explicit cap and defers to the provider's default response limit. Mistral models do not support vision. Codestral and Devstral 2 are code-specialised models designed for software engineering tasks. Ministral 8B is a lightweight model suited to simple, high-throughput work.
## DeepSeek Models
Available through the DeepSeek provider.
| Model | Context Window | Max Output | Tools | Vision |
|---|---|---|---|---|
| V3 | 131,000 | Default | Yes | No |
| R1 | 131,000 | Default | No | No |
DeepSeek R1 does not support tools or vision. Agents using R1 cannot call MCP tools, read files, or process images. Use R1 only for pure reasoning and text generation tasks.
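A tool-using agent should check the capability flag before it is bound to a model. A minimal guard, using capability data transcribed from the table above (the function and data structure are illustrative, not Ironspire's internal API):

```python
# Capability flags from the DeepSeek table (illustrative structure).
DEEPSEEK_MODELS = {
    "V3": {"tools": True, "vision": False},
    "R1": {"tools": False, "vision": False},
}

def assert_supports_tools(model: str) -> None:
    """Refuse to start a tool-using agent on a model without function calling."""
    if not DEEPSEEK_MODELS[model]["tools"]:
        raise ValueError(
            f"{model} does not support tools; use it only for pure reasoning "
            "and text generation"
        )

assert_supports_tools("V3")    # fine: V3 supports tools
# assert_supports_tools("R1")  # would raise ValueError
```

Failing fast here is preferable to an agent that silently loses access to MCP tools and file operations mid-session.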
## xAI Grok Models
Available through the xAI Grok provider.
| Model | Context Window | Max Output | Tools | Vision |
|---|---|---|---|---|
| Grok 4.20 | 2,000,000 | Default | Yes | Yes |
| Grok 4-1-fast | 2,000,000 | Default | Yes | Yes |
Grok models offer the largest context windows of any provider at 2 million tokens, suitable for processing very large codebases or lengthy documents in a single context. Both variants support tools and vision.
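To gauge roughly whether a codebase fits in a given window, a common rule of thumb is about 4 characters per token for English text and code. This is a heuristic, not a tokenizer count, and the function below is a sketch under that assumption:

```python
def estimated_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; real tokenizers vary by model and content."""
    return int(len(text) / chars_per_token)

def fits_in_context(text: str, context_window: int, reserve: int = 32_000) -> bool:
    """Leave headroom (`reserve`) for the system prompt and the model's reply."""
    return estimated_tokens(text) + reserve <= context_window

# A 6 MB codebase is ~1.5M estimated tokens: over a 1M-token window,
# but within a 2M-token Grok window.
codebase = "x" * 6_000_000
print(fits_in_context(codebase, 1_000_000))  # False
print(fits_in_context(codebase, 2_000_000))  # True
```

For anything close to the limit, count tokens with the provider's actual tokenizer rather than relying on this estimate.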
## AWS Bedrock Models
Available through the AWS Bedrock provider. Model availability varies by AWS region.
| Model | Context Window | Max Output | Tools | Vision |
|---|---|---|---|---|
| Nova Micro | Default | Default | Yes | No |
| Nova Lite | Default | Default | Yes | Yes |
| Nova Pro | Default | Default | Yes | Yes |
| Nova Premier | Default | Default | Yes | Yes |
| Llama 4 Scout | Default | Default | Yes | No |
| Llama 4 Maverick | Default | Default | Yes | No |
| Llama 3.3 70B | Default | Default | Yes | No |
| Llama 3.1 70B | Default | Default | Yes | No |
| Llama 3.1 8B | Default | Default | Yes | No |
| Mistral Large | Default | Default | Yes | No |
| Mistral Small | Default | Default | Yes | No |
Bedrock provides access to Amazon's own Nova family alongside popular open-weight models (Llama, Mistral) running on AWS infrastructure. This is useful if your organisation mandates that all API traffic stays within your AWS account.
## Ollama Models (Runtime Discovery)
The Ollama provider does not ship with a static model list. Instead, Ironspire queries your local Ollama server at runtime and populates the model dropdown with whatever models you have pulled.
To add a model:
- Open a terminal
- Run `ollama pull <model-name>` (e.g. `ollama pull llama3.1`, `ollama pull codellama`)
- Return to Ironspire and refresh the Ollama provider card
- The new model appears in the dropdown
Context windows and capabilities depend on the specific model. Ironspire reads metadata from the Ollama server where available.
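This kind of discovery amounts to calling the Ollama server's tags endpoint (`GET /api/tags` on the default port 11434) and reading the model list out of the JSON reply. A minimal sketch of that shape (this is not Ironspire's implementation, and the sample payload is abbreviated):

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address

def parse_tags(payload: str) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in json.loads(payload).get("models", [])]

def discover_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Query a running Ollama server for the models it has pulled."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return parse_tags(resp.read().decode())

# Offline example using the response shape /api/tags returns:
sample = '{"models": [{"name": "llama3.1:latest"}, {"name": "codellama:latest"}]}'
print(parse_tags(sample))  # ['llama3.1:latest', 'codellama:latest']
```

The real response carries additional per-model metadata (size, digest, modification time) alongside each name.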
Check the Ollama model library for the full list of available models. Popular choices for coding tasks include Llama 3.1, CodeLlama, DeepSeek Coder, and Mistral.
## Pricing
Model pricing varies by provider and changes frequently. Ironspire tracks token usage and computes estimated costs using built-in pricing data, but you should always verify current rates on each provider's pricing page.
Cost tracking is visible in the chat header HUD (per-message and per-session) and in the Analytics panel (aggregate fleet costs over time). See Analytics for details.
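At its core, a cost estimate is just token counts multiplied by per-million-token rates. A sketch of that arithmetic, using placeholder rates that are not real prices for any provider:

```python
def estimated_cost(input_tokens: int, output_tokens: int,
                   input_per_mtok: float, output_per_mtok: float) -> float:
    """Estimated USD cost, with rates quoted per million tokens."""
    return (input_tokens * input_per_mtok
            + output_tokens * output_per_mtok) / 1_000_000

# Hypothetical rates: $3 per million input tokens, $15 per million output tokens.
cost = estimated_cost(12_000, 1_500, input_per_mtok=3.0, output_per_mtok=15.0)
print(f"${cost:.4f}")  # $0.0585
```

Output tokens typically cost several times more than input tokens, which is why long responses dominate the per-message figures in the HUD.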