Model registry

Ironspire ships with 46 static model definitions across 12 providers, plus runtime discovery for local models. This page lists every built-in model with its context window, maximum output length, and capability flags.

Capability columns use Yes and No flags for quick scanning. Tools indicates whether the model supports function calling (MCP tools, file operations, and so on); Vision indicates whether the model can process images.
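As a rough mental model, each registry entry bundles a name, two token limits, and two capability flags. A sketch of such a record (field names here are illustrative, not Ironspire's internal schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelDef:
    name: str
    context_window: int  # total tokens the model can attend to
    max_output: int      # maximum tokens in a single response
    tools: bool          # supports function calling (MCP tools, file operations)
    vision: bool         # can process images

# Example entry, using the flagship Claude model from the table below.
opus = ModelDef("Opus 4.6", 1_000_000, 32_000, tools=True, vision=True)
```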

Claude Models

Available through both the Claude SDK and Claude API providers. Both providers offer the same models; the difference is billing (subscription vs. pay-per-token).

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Opus 4.6 | 1,000,000 | 32,000 | Yes | Yes |
| Opus 4.5 | 1,000,000 | 32,000 | Yes | Yes |
| Sonnet 4.5 | 1,000,000 | 64,000 | Yes | Yes |
| Sonnet 4 | 200,000 | 64,000 | Yes | Yes |
| Haiku 4.5 | 200,000 | 8,000 | Yes | Yes |
| Haiku 3.5 | 200,000 | 8,000 | Yes | Yes |

Opus 4.6 is the flagship model, best suited for complex reasoning, architecture decisions, and multi-step problem solving. Sonnet 4.5 offers a 64K output window, the largest in the Claude family (shared with Sonnet 4), and strikes a strong balance between capability and speed. Haiku models are the fastest and cheapest option, ideal for simple tasks, quick lookups, and high-volume parallel agents.

OpenAI Models

Available through the OpenAI provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| GPT-4o | 128,000 | 16,000 | Yes | Yes |
| GPT-4o-mini | 128,000 | 16,000 | Yes | Yes |
| GPT-4.1 | 1,000,000 | 32,000 | Yes | Yes |
| GPT-4.1-mini | 1,000,000 | 32,000 | Yes | Yes |
| GPT-4.1-nano | 1,000,000 | 32,000 | Yes | Yes |
| o3 | 200,000 | 100,000 | Yes | Yes |
| o3-mini | 200,000 | 100,000 | Yes | No |
| o4-mini | 200,000 | 100,000 | Yes | Yes |

GPT-4.1 and its variants offer million-token context windows, making them well suited for large codebases. The o-series models (o3, o3-mini, o4-mini) are reasoning-focused models with very large output windows, useful for complex analysis and code generation. Note that o3-mini does not support vision.

Google Gemini Models

Available through the Google Gemini provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Gemini 2.5 Pro | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 2.5 Flash | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 2.5 Flash-Lite | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3.1 Pro | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3 Flash | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3.1 Flash-Lite | 1,000,000 | 64,000 | Yes | Yes |

All Gemini models share million-token context windows and 64K output limits. The entire family supports both tools and vision. Flash variants prioritise speed over depth, while Pro variants offer higher quality reasoning.

Mistral Models

Available through the Mistral provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Large 3 | 262,000 | Default | Yes | No |
| Small 4 | 256,000 | Default | Yes | No |
| Codestral | 256,000 | Default | Yes | No |
| Devstral 2 | 262,000 | Default | Yes | No |
| Ministral 8B | 262,000 | Default | Yes | No |

Mistral models do not support vision. Codestral and Devstral 2 are code-specialised models designed for software engineering tasks. Ministral 8B is a lightweight model suitable for simple, high-throughput work.

DeepSeek Models

Available through the DeepSeek provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| V3 | 131,000 | Default | Yes | No |
| R1 | 131,000 | Default | No | No |

DeepSeek R1 does not support tools or vision. Agents using R1 cannot call MCP tools, read files, or process images. Use R1 only for pure reasoning and text generation tasks.
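Because R1 lacks these capabilities, anything that routes work across models needs to check capability flags before assignment. A minimal sketch of such a guard (the model identifiers and capability map are illustrative, not Ironspire's actual registry API):

```python
# Capability flags for the two DeepSeek models, as listed in the table above.
DEEPSEEK_MODELS = {
    "deepseek-v3": {"tools": True, "vision": False},
    "deepseek-r1": {"tools": False, "vision": False},
}

def can_run(model: str, needs_tools: bool = False, needs_vision: bool = False) -> bool:
    """Return True if the model's flags satisfy the task's requirements."""
    caps = DEEPSEEK_MODELS[model]
    return (caps["tools"] or not needs_tools) and (caps["vision"] or not needs_vision)
```

For example, a pure text-generation task passes on R1, but any task that needs MCP tools should be routed to V3 instead.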

xAI Grok Models

Available through the xAI Grok provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Grok 4.20 | 2,000,000 | Default | Yes | Yes |
| Grok 4-1-fast | 2,000,000 | Default | Yes | Yes |

Grok models offer the largest context windows of any provider at 2 million tokens, suitable for processing very large codebases or lengthy documents in a single context. Both variants support tools and vision.

AWS Bedrock Models

Available through the AWS Bedrock provider. Model availability varies by AWS region.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Nova Micro | Default | Default | Yes | No |
| Nova Lite | Default | Default | Yes | Yes |
| Nova Pro | Default | Default | Yes | Yes |
| Nova Premier | Default | Default | Yes | Yes |
| Llama 4 Scout | Default | Default | Yes | No |
| Llama 4 Maverick | Default | Default | Yes | No |
| Llama 3.3 70B | Default | Default | Yes | No |
| Llama 3.1 70B | Default | Default | Yes | No |
| Llama 3.1 8B | Default | Default | Yes | No |
| Mistral Large | Default | Default | Yes | No |
| Mistral Small | Default | Default | Yes | No |

Bedrock provides access to Amazon's own Nova family alongside popular open-weight models (Llama, Mistral) running on AWS infrastructure. This is useful if your organisation mandates that all API traffic stays within your AWS account.

Ollama Models (Runtime Discovery)

The Ollama provider does not ship with a static model list. Instead, Ironspire queries your local Ollama server at runtime and populates the model dropdown with whatever models you have pulled.

To add a model:

  1. Open a terminal
  2. Run `ollama pull <model-name>` (e.g. `ollama pull llama3.1`, `ollama pull codellama`)
  3. Return to Ironspire and refresh the Ollama provider card
  4. The new model appears in the dropdown

Context windows and capabilities depend on the specific model. Ironspire reads metadata from the Ollama server where available.
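Ollama's HTTP API exposes the pulled models via `GET /api/tags` on the default port 11434, so a discovery step like the one Ironspire performs can be sketched as follows (the helper names are ours, not Ironspire's):

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address

def extract_model_names(tags_payload: dict) -> list[str]:
    """Pull model names out of a /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]

def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Query a local Ollama server for every model that has been pulled."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return extract_model_names(json.load(resp))
```

Calling `list_local_models()` with an Ollama server running returns names like `llama3.1:latest`, which is what populates the model dropdown.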

Check the Ollama model library for the full list of available models. Popular choices for coding tasks include Llama 3.1, CodeLlama, DeepSeek Coder, and Mistral.

Pricing

Model pricing varies by provider and changes frequently. Ironspire tracks token usage and computes estimated costs using built-in pricing data, but you should always verify current rates on each provider's pricing page.
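The estimate itself is simple arithmetic: token counts divided by one million, multiplied by per-million-token rates. A sketch using made-up rates (real rates vary by model and provider, and change over time):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate USD cost from token counts and per-million-token rates."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Hypothetical rates: $3 per million input tokens, $15 per million output.
cost = estimate_cost(input_tokens=50_000, output_tokens=4_000,
                     input_rate=3.0, output_rate=15.0)
# 50,000 input tokens cost $0.15 and 4,000 output tokens cost $0.06,
# so the session estimate is $0.21.
```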

Cost tracking is visible in the chat header HUD (per-message and per-session) and in the Analytics panel (aggregate fleet costs over time). See Analytics for details.

Next steps