Model Routing
Orloj decouples agents from specific model providers through ModelEndpoint resources. A ModelEndpoint declares a provider, base URL, default model, and authentication -- and agents reference it by name. This lets you swap providers, manage credentials centrally, and route different agents to different models without modifying agent manifests.
Model Endpoints
A ModelEndpoint resource configures a connection to a model provider.
```yaml
apiVersion: orloj.dev/v1
kind: ModelEndpoint
metadata:
  name: openai-default
spec:
  provider: openai
  base_url: https://api.openai.com/v1
  default_model: gpt-4o-mini
  auth:
    secretRef: openai-api-key
```
Supported Providers
| Provider | `provider` value | Default `base_url` |
|---|---|---|
| OpenAI | openai | https://api.openai.com/v1 |
| Anthropic | anthropic | https://api.anthropic.com/v1 |
| Azure OpenAI | azure-openai | (must be set explicitly) |
| Ollama | ollama | http://127.0.0.1:11434 |
| Mock | mock | (no network calls) |
Provider-Specific Options
Some providers require additional configuration via the options field:
Anthropic:
```yaml
spec:
  provider: anthropic
  base_url: https://api.anthropic.com/v1
  default_model: claude-3-5-sonnet-latest
  options:
    anthropic_version: "2023-06-01"
    max_tokens: "1024"
  auth:
    secretRef: anthropic-api-key
```
Azure OpenAI:
```yaml
spec:
  provider: azure-openai
  base_url: https://YOUR_RESOURCE_NAME.openai.azure.com
  default_model: gpt-4o-deployment
  options:
    api_version: "2024-10-21"
  auth:
    secretRef: azure-openai-api-key
```
Ollama (local, no auth required):
```yaml
spec:
  provider: ollama
  base_url: http://127.0.0.1:11434
  default_model: llama3.1
```
Binding Agents to Models
Agents configure model routing through `spec.model_ref`, which points to a ModelEndpoint:
```yaml
apiVersion: orloj.dev/v1
kind: Agent
metadata:
  name: writer-agent
spec:
  model_ref: openai-default
  prompt: |
    You are a writing agent.
```
How Routing Works
When a worker executes an agent turn:
- The runtime resolves the agent's referenced ModelEndpoint from `model_ref`.
- The model gateway constructs a provider-specific API request using the endpoint's `base_url`, `default_model`, `options`, and auth credentials.
- The request is sent to the provider and the response is returned to the agent execution loop.
ModelEndpoint references are resolved by name within the same namespace, or by namespace/name for cross-namespace references.
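The reference-resolution rule can be sketched as a small parser: a bare name resolves within the agent's own namespace, while a `namespace/name` form addresses another namespace. (The function below is an illustrative sketch, not an Orloj API.)

```python
# Hypothetical sketch of ModelEndpoint reference resolution.
# A model_ref is either "name" (same namespace) or "namespace/name".
def resolve_ref(model_ref: str, agent_namespace: str) -> tuple[str, str]:
    """Return (namespace, name) for a model_ref string."""
    if "/" in model_ref:
        namespace, name = model_ref.split("/", 1)
        return namespace, name
    return agent_namespace, model_ref
```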
Authentication
Model authentication is managed through Secret resources referenced by auth.secretRef. The simplest way to create one is the imperative CLI command:
```shell
orlojctl create secret openai-api-key --from-literal value=sk-your-api-key-here
```
Or with a YAML manifest via `orlojctl apply -f`:
```yaml
apiVersion: orloj.dev/v1
kind: Secret
metadata:
  name: openai-api-key
spec:
  stringData:
    value: sk-your-api-key-here
```
`stringData` values are base64-encoded into `data` during normalization and then cleared (write-only semantics). The runtime reads from `data` at execution time.
In production, you can also skip Secret resources entirely and inject values via environment variables (`ORLOJ_SECRET_<name>`). See Secret Handling for details.
Governance Integration
AgentPolicy resources can restrict which models an agent is allowed to use via the allowed_models field:
```yaml
apiVersion: orloj.dev/v1
kind: AgentPolicy
metadata:
  name: cost-policy
spec:
  allowed_models:
    - gpt-4o
  max_tokens_per_run: 50000
```
If an agent's resolved endpoint `default_model` is not in the policy's `allowed_models` list, execution is denied.