Model registry

Ironspire ships with 46 static model definitions across 12 providers, plus runtime discovery for local models. This page lists every built-in model with its context window, maximum output length, and capability flags.

Capability columns use Yes and No flags for quick scanning. Tools indicates whether the model supports function calling (MCP tools, file operations, and so on); Vision indicates whether the model can process images.
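As a rough mental model, each registry entry bundles a name, two token limits, and two capability flags. A sketch of such a record (field names here are illustrative, not Ironspire's internal schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelDef:
    name: str
    context_window: int  # total tokens the model can attend to
    max_output: int      # maximum tokens in a single response
    tools: bool          # supports function calling (MCP tools, file operations)
    vision: bool         # can process images

# Example entry, using the flagship Claude model from the table below.
opus = ModelDef("Opus 4.6", 1_000_000, 32_000, tools=True, vision=True)
```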

Claude Models

Available through both the Claude SDK and Claude API providers. Both providers offer the same models; the difference is billing (subscription vs. pay-per-token).

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Opus 4.6 | 1,000,000 | 32,000 | Yes | Yes |
| Opus 4.5 | 1,000,000 | 32,000 | Yes | Yes |
| Sonnet 4.5 | 1,000,000 | 64,000 | Yes | Yes |
| Sonnet 4 | 200,000 | 64,000 | Yes | Yes |
| Haiku 4.5 | 200,000 | 8,000 | Yes | Yes |
| Haiku 3.5 | 200,000 | 8,000 | Yes | Yes |

Opus 4.6 is the flagship model, best suited for complex reasoning, architecture decisions, and multi-step problem solving. Sonnet 4.5 offers a 64K output window, the largest in the Claude family (shared with Sonnet 4), and strikes a strong balance between capability and speed. Haiku models are the fastest and cheapest option, ideal for simple tasks, quick lookups, and high-volume parallel agents.

OpenAI Models

Available through the OpenAI provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| GPT-4o | 128,000 | 16,000 | Yes | Yes |
| GPT-4o-mini | 128,000 | 16,000 | Yes | Yes |
| GPT-4.1 | 1,000,000 | 32,000 | Yes | Yes |
| GPT-4.1-mini | 1,000,000 | 32,000 | Yes | Yes |
| GPT-4.1-nano | 1,000,000 | 32,000 | Yes | Yes |
| o3 | 200,000 | 100,000 | Yes | Yes |
| o3-mini | 200,000 | 100,000 | Yes | No |
| o4-mini | 200,000 | 100,000 | Yes | Yes |

GPT-4.1 and its variants offer million-token context windows, making them well suited for large codebases. The o-series models (o3, o3-mini, o4-mini) are reasoning-focused models with very large output windows, useful for complex analysis and code generation. Note that o3-mini does not support vision.

Google Gemini Models

Available through the Google Gemini provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Gemini 2.5 Pro | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 2.5 Flash | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 2.5 Flash-Lite | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3.1 Pro | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3 Flash | 1,000,000 | 64,000 | Yes | Yes |
| Gemini 3.1 Flash-Lite | 1,000,000 | 64,000 | Yes | Yes |

All Gemini models share million-token context windows and 64K output limits. The entire family supports both tools and vision. Flash variants prioritise speed over depth, while Pro variants offer higher quality reasoning.

Mistral Models

Available through the Mistral provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Large 3 | 262,000 | Default | Yes | No |
| Small 4 | 256,000 | Default | Yes | No |
| Codestral | 256,000 | Default | Yes | No |
| Devstral 2 | 262,000 | Default | Yes | No |
| Ministral 8B | 262,000 | Default | Yes | No |

Mistral models do not support vision. Codestral and Devstral 2 are code-specialised models designed for software engineering tasks. Ministral 8B is a lightweight model suitable for simple, high-throughput work.

DeepSeek Models

Available through the DeepSeek provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| V3 | 131,000 | Default | Yes | No |
| R1 | 131,000 | Default | No | No |

DeepSeek R1 does not support tools or vision. Agents using R1 cannot call MCP tools, read files, or process images. Use R1 only for pure reasoning and text generation tasks.
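Because R1 lacks these capabilities, anything that routes work across models needs to check capability flags before assignment. A minimal sketch of such a guard (the model identifiers and capability map are illustrative, not Ironspire's actual registry API):

```python
# Capability flags for the two DeepSeek models, as listed in the table above.
DEEPSEEK_MODELS = {
    "deepseek-v3": {"tools": True, "vision": False},
    "deepseek-r1": {"tools": False, "vision": False},
}

def can_run(model: str, needs_tools: bool = False, needs_vision: bool = False) -> bool:
    """Return True if the model's flags satisfy the task's requirements."""
    caps = DEEPSEEK_MODELS[model]
    return (caps["tools"] or not needs_tools) and (caps["vision"] or not needs_vision)
```

For example, a pure text-generation task passes on R1, but any task that needs MCP tools should be routed to V3 instead.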

xAI Grok Models

Available through the xAI Grok provider.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Grok 4.20 | 2,000,000 | Default | Yes | Yes |
| Grok 4-1-fast | 2,000,000 | Default | Yes | Yes |

Grok models offer the largest context windows of any provider at 2 million tokens, suitable for processing very large codebases or lengthy documents in a single context. Both variants support tools and vision.

AWS Bedrock Models

Available through the AWS Bedrock provider. Model availability varies by AWS region.

| Model | Context Window | Max Output | Tools | Vision |
| --- | --- | --- | --- | --- |
| Nova Micro | Default | Default | Yes | No |
| Nova Lite | Default | Default | Yes | Yes |
| Nova Pro | Default | Default | Yes | Yes |
| Nova Premier | Default | Default | Yes | Yes |
| Llama 4 Scout | Default | Default | Yes | No |
| Llama 4 Maverick | Default | Default | Yes | No |
| Llama 3.3 70B | Default | Default | Yes | No |
| Llama 3.1 70B | Default | Default | Yes | No |
| Llama 3.1 8B | Default | Default | Yes | No |
| Mistral Large | Default | Default | Yes | No |
| Mistral Small | Default | Default | Yes | No |

Bedrock provides access to Amazon's own Nova family alongside popular open-weight models (Llama, Mistral) running on AWS infrastructure. This is useful if your organisation mandates that all API traffic stays within your AWS account.

Ollama Models (Runtime Discovery)

The Ollama provider does not ship with a static model list. Instead, Ironspire queries your local Ollama server at runtime and populates the model dropdown with whatever models you have pulled.

To add a model:

  1. Open a terminal
  2. Run `ollama pull <model-name>` (e.g. `ollama pull llama3.1`, `ollama pull codellama`)
  3. Return to Ironspire and refresh the Ollama provider card
  4. The new model appears in the dropdown

Context windows and capabilities depend on the specific model. Ironspire reads metadata from the Ollama server where available.
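Ollama's HTTP API exposes the pulled models via `GET /api/tags` on the default port 11434, so a discovery step like the one Ironspire performs can be sketched as follows (the helper names are ours, not Ironspire's):

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address

def extract_model_names(tags_payload: dict) -> list[str]:
    """Pull model names out of a /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]

def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Query a local Ollama server for every model that has been pulled."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return extract_model_names(json.load(resp))
```

Calling `list_local_models()` with an Ollama server running returns names like `llama3.1:latest`, which is what populates the model dropdown.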

Check the Ollama model library for the full list of available models. Popular choices for coding tasks include Llama 3.1, CodeLlama, DeepSeek Coder, and Mistral.

Pricing

Model pricing varies by provider and changes frequently. Ironspire tracks token usage and computes estimated costs using built-in pricing data, but you should always verify current rates on each provider's pricing page.
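The estimate itself is simple arithmetic: token counts divided by one million, multiplied by per-million-token rates. A sketch using made-up rates (real rates vary by model and provider, and change over time):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Estimate USD cost from token counts and per-million-token rates."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Hypothetical rates: $3 per million input tokens, $15 per million output.
cost = estimate_cost(input_tokens=50_000, output_tokens=4_000,
                     input_rate=3.0, output_rate=15.0)
# 50,000 input tokens cost $0.15 and 4,000 output tokens cost $0.06,
# so the session estimate is $0.21.
```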

Cost tracking is visible in the chat header HUD (per-message and per-session) and in the Analytics panel (aggregate fleet costs over time). See Analytics for details.

Next steps