Custom endpoints
When the built-in providers do not cover your needs, you can add custom OpenAI-compatible endpoints. This lets you connect to self-hosted models, API proxies, internal gateways, or any provider that implements the OpenAI chat completions API format.
Custom endpoints use the custom-* provider ID pattern and appear alongside the built-in providers in the agent model selector.
When to Use Custom Endpoints
Custom endpoints are useful for several scenarios:
- Self-hosted models: connect to vLLM, TGI, LocalAI, LM Studio, or any other inference server running on your network
- API proxies and gateways: route requests through a corporate proxy, rate-limiting gateway, or logging middleware
- Unlisted providers: connect to a new provider that Ironspire does not yet include as a built-in option
- Development and testing: point agents at a mock server or staging endpoint during development
- Cost optimisation: use a provider that offers better pricing for your specific workload
Any server that accepts OpenAI-format chat completion requests (POST /v1/chat/completions) should work as a custom endpoint.
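In practice, "OpenAI-format" means a JSON POST body like the one sketched below. The URL and model ID are placeholders for illustration, not values Ironspire requires:

```python
import json

# Minimal sketch of an OpenAI-format chat completion request.
# Substitute your own base URL and model ID.
base_url = "https://my-server.example.com/v1"
payload = {
    "model": "meta-llama/Llama-3.1-70B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "stream": True,
}

# The request is sent as POST {base_url}/chat/completions with a JSON body.
url = f"{base_url}/chat/completions"
body = json.dumps(payload)
```

Any server that can accept this request and return an OpenAI-shaped response is a candidate for a custom endpoint.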
Adding a Custom Endpoint
- Open Settings with Ctrl+, (or Cmd+, on macOS)
- Go to the Providers tab
- Scroll to the bottom of the provider list
- Click Add Custom Endpoint
- Fill in the configuration fields (detailed below)
- Click Save
Ironspire runs a health check immediately after saving. If the endpoint is reachable and responds correctly, the status dot turns green.
Configuration Fields
| Field | Required | Description |
|---|---|---|
| Name | Yes | A display name for the endpoint (shown in the model selector) |
| Base URL | Yes | The root URL of the API (e.g. https://my-server.example.com/v1) |
| API Key | No | An authentication token sent as a Bearer header. Leave blank if the endpoint does not require authentication. |
| Model ID | Yes | The model identifier sent in the request body (e.g. meta-llama/Llama-3.1-70B-Instruct) |
| Display Name | No | A friendly model name shown in the UI. Defaults to the Model ID if left blank. |
| Context Window | No | The model's context window size in tokens. Used for compaction calculations and the context gauge. |
| Max Output | No | The maximum number of output tokens. Used to set the max_tokens parameter. |
The Base URL should point to the root of the API, not to the completions path itself: Ironspire appends the completions segment automatically.
If your endpoint uses a non-standard path (e.g. /api/generate instead of /v1/chat/completions), set the Base URL to everything up to but not including the final completions segment; Ironspire appends that final segment itself.
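Roughly, the fields above map onto an outgoing request as sketched here. The dictionary keys and values are illustrative, not Ironspire's internal schema:

```python
# How the configuration fields shape an outgoing request (illustrative only).
endpoint = {
    "base_url": "https://my-server.example.com/v1",    # Base URL field
    "api_key": "sk-example",                           # API Key field (optional)
    "model_id": "meta-llama/Llama-3.1-70B-Instruct",   # Model ID field
    "max_output": 4096,                                # Max Output field
}

# The API key, when set, is sent as a Bearer header.
headers = {"Content-Type": "application/json"}
if endpoint["api_key"]:
    headers["Authorization"] = f"Bearer {endpoint['api_key']}"

# Model ID and Max Output land in the request body; the completions
# path is appended to the Base URL.
url = endpoint["base_url"].rstrip("/") + "/chat/completions"
body = {
    "model": endpoint["model_id"],
    "max_tokens": endpoint["max_output"],
    "messages": [{"role": "user", "content": "ping"}],
}
```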
Provider Quirks
Not all OpenAI-compatible servers behave identically. The Provider Quirks section lets you adjust how Ironspire communicates with the endpoint.
Stream Toggle
Toggle streaming on or off. Most modern inference servers support streaming (stream: true in the request body), but some older or simpler servers only support non-streaming responses.
- Streaming on (default): Ironspire sends stream: true and processes server-sent events in real time. You see tokens appear as they are generated.
- Streaming off: Ironspire waits for the complete response. The entire reply appears at once. Use this if your server does not support streaming or returns malformed stream events.
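With streaming on, the server sends a sequence of server-sent events, each carrying a small delta of the reply. A simplified sketch of how such a stream is consumed, parsing a canned sample rather than a live HTTP response:

```python
import json

# Canned sample of a streamed chat completion, in the standard SSE shape:
# each event is a "data: " line, and "[DONE]" marks the end of the stream.
sample_stream = """\
data: {"choices": [{"delta": {"content": "Hel"}}]}

data: {"choices": [{"delta": {"content": "lo!"}}]}

data: [DONE]
"""

tokens = []
for line in sample_stream.splitlines():
    if not line.startswith("data: "):
        continue  # skip the blank separator lines between events
    data = line[len("data: "):]
    if data == "[DONE]":  # end-of-stream sentinel
        break
    chunk = json.loads(data)
    delta = chunk["choices"][0]["delta"]
    tokens.append(delta.get("content", ""))

print("".join(tokens))  # → Hello!
```

A real client appends each delta to the visible reply as it arrives, which is why tokens appear incrementally in the UI.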
Tool Call Format
Select how the endpoint handles tool calls (function calling). Ironspire supports two formats:
| Format | Description |
|---|---|
| OpenAI-native (default) | Standard OpenAI tool call format with tool_calls in the assistant message and tool role for results. Use this for servers that fully implement the OpenAI tools API. |
| Legacy function | Older format using function_call in the assistant message and function role for results. Use this for servers that have not adopted the newer tools format. |
If your server does not support tool calling at all, agents using this endpoint will operate in chat-only mode (no MCP tools, no file access).
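The difference between the two formats is easiest to see in the message shapes themselves. These follow the public OpenAI API; the function name and arguments are made up for illustration:

```python
# OpenAI-native format: a list of tool_calls, each with an id,
# and results carried in a "tool" role message.
native = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
    }],
}
native_result = {"role": "tool", "tool_call_id": "call_1", "content": "2°C"}

# Legacy format: a single function_call (no id, no list),
# and results carried in a "function" role message.
legacy = {
    "role": "assistant",
    "content": None,
    "function_call": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
}
legacy_result = {"role": "function", "name": "get_weather", "content": "2°C"}
```

The legacy format only supports one function call per assistant turn, which is why servers that implement it may behave differently when an agent tries to invoke several tools at once.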
Editing a Custom Endpoint
- Open Settings > Providers
- Find your custom endpoint card
- Click the card to expand it
- Edit any field
- Changes save automatically
Ironspire re-runs the health check after each edit. If you change the base URL or API key, the health status resets and re-validates.
Removing a Custom Endpoint
- Open Settings > Providers
- Find your custom endpoint card
- Click the delete icon (trash) in the card header
- Confirm the deletion in the modal
Deleting a custom endpoint immediately disconnects any agents that are using it. Reassign those agents to a different provider before deleting, or they will show a provider error on the next message.
Agents that were using the deleted endpoint retain their conversation history. You can reassign them to a new provider and continue chatting without losing any messages.
Validation and Health Checks
When you save a custom endpoint, Ironspire performs several checks:
- URL validation: confirms the base URL is well-formed and uses HTTPS (or HTTP for localhost/private network addresses)
- Connection test: sends a lightweight request to the endpoint to verify it is reachable
- Authentication check: if an API key is provided, confirms the server does not reject it
- Model verification: sends a minimal chat completion request to verify the model ID is valid
The health dot reflects the result:
| Dot | Meaning |
|---|---|
| Green | All checks passed |
| Yellow | Endpoint reachable but rate-limited |
| Red | Connection failed, authentication rejected, or model not found |
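The mapping from check results to dot colour can be pictured as a small classifier. The HTTP status codes below are the conventional ones for each failure mode, not an exhaustive list of what Ironspire inspects:

```python
def health_dot(status_code: int) -> str:
    """Map an HTTP status from the health-check request to a dot colour.

    Illustrative only: any 2xx passes, 429 means rate-limited, and
    everything else (401/403 auth rejections, 404 unknown model,
    5xx server errors) shows red.
    """
    if 200 <= status_code < 300:
        return "green"
    if status_code == 429:
        return "yellow"
    return "red"
```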
Health checks continue to run periodically during active sessions, so the status stays current.
Example: Connecting to LM Studio
LM Studio provides a local OpenAI-compatible server. Here is a typical configuration:
| Field | Value |
|---|---|
| Name | LM Studio |
| Base URL | http://localhost:1234/v1 |
| API Key | (leave blank) |
| Model ID | meta-llama/Llama-3.1-8B-Instruct |
| Context Window | 131072 |
| Streaming | On |
| Tool Call Format | OpenAI-native |
After saving, the LM Studio endpoint appears in the model selector for any agent. You can assign it to specific agents while keeping other agents on cloud providers.
Example: Connecting to a Corporate Proxy
If your organisation routes all API traffic through an internal proxy:
| Field | Value |
|---|---|
| Name | Internal Gateway |
| Base URL | https://ai-gateway.corp.example.com/v1 |
| API Key | Your internal auth token |
| Model ID | gpt-4o (or whatever model the proxy routes to) |
| Streaming | On |
| Tool Call Format | OpenAI-native |
This is especially useful in enterprise environments where direct access to external APIs is restricted.