Security and Isolation
This page describes current runtime security controls and expected operator practices.
Current Controls
- `AgentPolicy` gates model/tool/token usage.
- `AgentRole` and `ToolPermission` enforce per-tool authorization.
- `ToolApproval` and `TaskApproval` pause risky actions or sensitive outputs for explicit human review.
- Tool runtime enforces timeout/retry/isolation policy from `Tool.spec.runtime`.
- Unsupported tools and disallowed runtime requests fail closed.
- Permission denials are terminal for the current execution path.
For regulated environments, TaskApproval checkpoints add a second control plane beyond tool authorization: a human can review and approve, deny, or request changes on sensitive agent output before the workflow continues.
Namespace Isolation
Namespaces are an organizational boundary, not a security boundary. Any authenticated user with the correct role (e.g., reader, writer, admin) can access resources in any namespace. There is no per-namespace access control by default.
For deployments that require per-namespace or per-resource authorization, the server exposes a ResourceAuthorizer extension point (see ServerOptions.ResourceAuthorizer in api/auth_context.go). A custom authorization layer can implement this interface to enforce fine-grained policies based on the caller's identity, the target namespace, resource type, and HTTP method. This hook is nil by default and all requests that pass the role check are permitted.
Control plane API tokens
The HTTP API (including orlojctl) authenticates automation with Authorization: Bearer <token> when you enable token validation on the server. Orloj does not mint or email API keys: the operator chooses a secret string, configures it on orlojd, and distributes the same value to people and CI that need API access.
See also: Remote CLI and API access — end-to-end flow for self-hosters (env vars, orlojctl config, config.json lifecycle).
This is separate from native UI sign-in (--auth-mode=native), which uses an admin username/password and session cookies in the browser. The CLI does not use that password for API calls; use a bearer token as below (or run with auth disabled in trusted dev environments only).
1. Generate a token
Use a cryptographically random value (length is flexible; treat it like a password):
```bash
# Hex (64 characters); easy to paste into env files
openssl rand -hex 32

# Or base64 (~44 characters)
openssl rand -base64 32
```

Store the output in your secrets manager, Kubernetes Secret, or password manager -- not in git.
2. Configure the server
Pick one of these (same token string you generated):
- Flag: `orlojd --api-key='<token>'`
- Environment: `ORLOJ_API_TOKEN='<token>'` (also read when `--api-key` is unset; see server help)
For multiple distinct tokens with different roles (reader vs admin-style access), use:
```bash
export ORLOJ_API_TOKENS='reader-bot:reader-token-here:reader,automation-bot:automation-token-here:admin'
```

The format is comma-separated `name:token:role` entries. Legacy `token:role` entries are still accepted for backward compatibility. When `ORLOJ_API_TOKENS` is set, it populates the token map, and a single `ORLOJ_API_TOKEN` is only used if that list is empty (see `loadAuthConfig` in `api/authz.go`).
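As a sketch of how such a token list parses, assuming the `name:token:role` and legacy `token:role` formats described above (illustrative Go; names and types here are not the actual `loadAuthConfig` code):

```go
package main

import (
	"fmt"
	"strings"
)

// tokenEntry mirrors the name:token:role format described above.
type tokenEntry struct {
	Name, Token, Role string
}

// parseAPITokens accepts comma-separated name:token:role entries and
// legacy token:role entries (name defaults to empty).
func parseAPITokens(raw string) ([]tokenEntry, error) {
	var out []tokenEntry
	for _, item := range strings.Split(raw, ",") {
		parts := strings.Split(strings.TrimSpace(item), ":")
		switch len(parts) {
		case 3: // name:token:role
			out = append(out, tokenEntry{parts[0], parts[1], parts[2]})
		case 2: // legacy token:role
			out = append(out, tokenEntry{Token: parts[0], Role: parts[1]})
		default:
			return nil, fmt.Errorf("invalid token entry %q", item)
		}
	}
	return out, nil
}

func main() {
	entries, _ := parseAPITokens("reader-bot:tok1:reader,tok2:admin")
	for _, e := range entries {
		fmt.Printf("name=%q role=%q\n", e.Name, e.Role)
	}
}
```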
For runtime-managed tokens (no server restart required), use:
```bash
orlojctl create token <name> --role <role>
orlojctl get tokens
orlojctl delete token <name>
```

3. Configure clients (orlojctl and automation)
Use the same token the server expects:
- Environment: `ORLOJ_API_TOKEN` or `ORLOJCTL_API_TOKEN`
- Flag: `orlojctl --api-token '<token>' ...`
- Profile: `orlojctl config set-profile ... --token-env VAR` so the token stays in the environment, not on disk
See Remote CLI and API access for client precedence, default --server resolution, and profiles.
4. Native auth mode and APIs
If you use --auth-mode=native, the UI still requires a bearer token (or session cookie) for protected API routes. Configure ORLOJ_API_TOKEN / --api-key on the server so orlojctl and other API clients can authenticate with Authorization: Bearer—the admin password alone is not used for programmatic access.
5. Initial setup protection
When deploying with --auth-mode=native on a network-exposed instance, set ORLOJ_SETUP_TOKEN to prevent unauthorized admin account creation. When this variable is set, the /v1/auth/setup endpoint requires a matching setup_token field in the JSON request body:
```json
{
  "username": "admin",
  "password": "...",
  "setup_token": "your-setup-token-here"
}
```

The server compares the token in constant time to prevent timing side-channels. Without ORLOJ_SETUP_TOKEN, the setup endpoint is open to the first caller (protected only by rate limiting).
6. Authentication rate limiting
Authentication endpoints (/v1/auth/login, /v1/auth/setup, /v1/auth/change-password, /v1/auth/admin/reset-password) are rate-limited per client IP address. The default policy allows 10 requests per minute sustained with a burst of 20 to accommodate legitimate multi-step flows. Requests that exceed the limit receive HTTP 429.
Trusted proxy configuration
By default, the rate limiter ignores X-Forwarded-For and X-Real-IP headers and uses the TCP peer address (RemoteAddr) to identify clients. This prevents attackers from bypassing rate limits by rotating spoofed forwarding headers.
If Orloj runs behind a reverse proxy or load balancer, configure --trusted-proxies (env: ORLOJ_TRUSTED_PROXIES) with the CIDR(s) of your proxy so the server can extract the real client IP from forwarding headers:
```bash
# Single proxy
orlojd --trusted-proxies='10.0.0.0/8'

# Multiple proxies
orlojd --trusted-proxies='10.0.0.0/8,172.16.0.0/12'

# Single IP (treated as /32)
export ORLOJ_TRUSTED_PROXIES='192.168.1.50'
```

When trusted proxies are configured, X-Forwarded-For is parsed right-to-left: entries added by trusted proxies are skipped, and the first untrusted entry is used as the client IP. If the immediate peer is not in the trusted set, forwarding headers are ignored regardless of their content.
Without --trusted-proxies, all requests arriving through a proxy share a single rate-limit bucket (the proxy's IP). The server logs a warning when it detects forwarding headers but has no trusted proxies configured.
The same trust gate applies to X-Forwarded-Proto for session cookie security: the Secure flag is only set based on the forwarding header when the peer is a trusted proxy.
Tool Types
All tool types (http, external, grpc, webhook-callback, mcp, cli, wasm) flow through the governed runtime pipeline, so policy enforcement, retry, timeout, and error handling behave identically regardless of transport. See Tools for type details.
gRPC TLS
gRPC tool connections require TLS (minimum TLS 1.2) by default. Plaintext gRPC is available as an opt-in for development environments only. Production deployments should always use the default TLS transport.
SSRF Protection
Outbound HTTP, gRPC, and MCP connections validate the target endpoint twice: once at call time (URL parsing and scheme allowlist) and again at dial time via a net.Dialer.Control hook that inspects the actual IP the kernel is about to connect to. Dial-time enforcement closes the hostname-bypass and DNS-rebinding gaps that a URL-only check cannot catch. For generic tool and MCP egress, the following destinations are blocked regardless of configuration:
- Loopback addresses (`127.0.0.0/8`, `::1`, and IPv4-mapped IPv6 equivalents like `::ffff:127.0.0.1`)
- Link-local addresses (`169.254.0.0/16`, `fe80::/10`)
- Cloud metadata endpoints (`169.254.169.254` for AWS/GCP/Azure IMDS, `fd00:ec2::254` for AWS IMDSv2 IPv6)
- Unspecified addresses (`0.0.0.0`, `::`)
Private network addresses (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, fc00::/7) and RFC 6598 carrier-grade NAT (100.64.0.0/10) are also blocked unless the caller explicitly opts in.
ModelEndpoint resources use a model-gateway-specific safe client. Set spec.allowPrivate: true only for trusted local/private model servers; it permits loopback plus private/CGNAT addresses for model traffic while still blocking cloud metadata, link-local, and unspecified addresses. The default is false for all providers except ollama, which defaults to true because Ollama is a local-first runtime.
Upgrading from earlier versions: if you run an OpenAI-compatible server (vLLM, LM Studio, LocalAI, LiteLLM proxy, TGI, etc.) on localhost or a private network under provider: openai-compatible, add spec.allowPrivate: true to those ModelEndpoint resources before upgrading, or the gateway will fail at dial time with an error that names the resolved IP and the exact field to change.
MCP Server Security
McpServer resources connect to external MCP (Model Context Protocol) servers that expose tools for agent use. Security considerations vary by transport:
- stdio (`transport: stdio`): The MCP server runs as a subprocess managed by Orloj. The `command` and `args` fields control exactly what binary is executed. The subprocess inherits only the environment variables explicitly listed in `spec.env` and resolved `spec.env[].secretRef` values -- no host environment leaks into the child process.
- HTTP (`transport: http`): The MCP server is a remote endpoint. SSRF validation (above) applies to the `spec.endpoint` URL, blocking loopback, link-local, and private-network targets by default. Use `spec.auth` to attach bearer or API-key credentials to outbound requests.
Tool scoping: Use spec.tool_filter.include to restrict which tools the MCP server exposes. Without a filter, all tools reported by tools/list are generated as Tool resources. In production, prefer an explicit allowlist to minimize attack surface.
Credential injection: Secrets referenced via spec.env[].secretRef follow the same secret resolution chain as other resources. Avoid placing credentials in spec.env[].value plaintext fields outside of development.
Governed runtime: Tools discovered from MCP servers are generated as standard Tool resources with spec.type: mcp. They flow through the same governed runtime pipeline (policy enforcement, retry, auth injection, approvals) as all other tool types.
See MCP Server concept and the Connect an MCP Server guide for setup details.
Isolation Modes
- `none` -- direct execution with real HTTP/gRPC calls (no isolation boundary)
- `sandboxed` -- restricted container with secure defaults (see below)
- `container` -- per-invocation isolated container
- `wasm` -- WebAssembly module with host-guest stdin/stdout boundary
The container backend supports constrained execution for high-risk paths.
The WASM backend uses executor-factory boundaries and command-backed runtime execution (default runtime binary: `wasmtime`). Invalid WASM runtime configuration fails closed with deterministic, non-retryable policy errors.
Sandboxed Container Defaults
When isolation_mode=sandboxed (the default for high/critical risk tools), the container backend enforces these security constraints:
| Control | Value |
|---|---|
| Filesystem | --read-only |
| Linux capabilities | --cap-drop=ALL |
| Privilege escalation | --security-opt no-new-privileges |
| Network | --network none |
| User | 65532:65532 (non-root) |
| Memory | 128m |
| CPU | 0.50 cores |
| Process limit | 64 PIDs |
These defaults are enforced by SandboxedContainerDefaults() in the runtime and validated by conformance tests. Override with --tool-container-* flags only when necessary.
CLI Tool Isolation
CLI tools (spec.type: cli) default to container isolation regardless of risk level. Unlike HTTP container isolation, CLI containers use bridge networking by default because the binary itself is the network client (e.g., kubectl reaches the K8s API, gh reaches GitHub).
- No shell: invocations use `exec.CommandContext` (execve-style argv). There is no `sh -c` path and no opt-in for shell mode.
- Arg templates are per-entry: each Go template produces exactly one argv element. No shell splitting or word expansion occurs.
- Secrets via env_from only: process environment is constructed exclusively from `spec.cli.env` (literals) and `spec.cli.env_from` (resolved secrets). No host environment variables leak into the container.
- Binary allowlist (optional): `--cli-tool-allowed-commands` rejects commands not on the list before exec.
- Argv length limit: `--cli-tool-max-argv-length` (default 4096 bytes) prevents oversized argument lists.
- `spec.auth` rejected: CLI tools must use `env_from` for credentials; setting `spec.auth` produces a validation error to prevent silent misconfiguration.
Operators can restrict per-tool networking with spec.cli.network: none for tools that do not need outbound access (e.g., jq, yq).
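The no-shell, one-template-per-argv-element rules above can be illustrated in Go (`renderArgv` is a hypothetical helper, not the runtime's actual code):

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"text/template"
)

// renderArgv expands per-entry Go templates: each template yields exactly
// one argv element, so values containing spaces or shell metacharacters
// cannot split into extra arguments.
func renderArgv(entries []string, data any) ([]string, error) {
	argv := make([]string, 0, len(entries))
	for _, e := range entries {
		t, err := template.New("arg").Parse(e)
		if err != nil {
			return nil, err
		}
		var sb strings.Builder
		if err := t.Execute(&sb, data); err != nil {
			return nil, err
		}
		argv = append(argv, sb.String())
	}
	return argv, nil
}

func main() {
	argv, _ := renderArgv([]string{"get", "pods", "-n", "{{.Namespace}}"},
		map[string]string{"Namespace": "prod; rm -rf /"})
	// The hostile value stays a single argv element -- no shell ever
	// interprets it.
	fmt.Printf("%q\n", argv)
	// Execution uses execve-style argv, never `sh -c`:
	cmd := exec.CommandContext(context.Background(), "kubectl", argv...)
	_ = cmd // environment would be built only from spec.cli.env / env_from
}
```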
Secret Handling
Orloj resolves secrets referenced by secretRef fields (on ModelEndpoint and Tool resources) using a chain of resolvers, tried in order:
- Resource Store -- looks up a `Secret` resource by name and reads the base64-encoded value from `spec.data`.
- Environment Variables -- looks up `ORLOJ_SECRET_<name>` (configurable prefix via `--model-secret-env-prefix` / `--tool-secret-env-prefix`).
The first resolver that returns a value wins.
Development
Use Secret resources for local development. The fastest way is the imperative CLI command -- no YAML file needed:
```bash
orlojctl create secret openai-api-key --from-literal value=sk-your-key-here
```

Or with a YAML manifest:

```yaml
apiVersion: orloj.dev/v1
kind: Secret
metadata:
  name: openai-api-key
spec:
  stringData:
    value: sk-your-key-here
```

Encryption at Rest
When using the Postgres storage backend, enable encryption at rest for Secret resources by passing a 256-bit AES key to both orlojd and orlojworker:
```bash
# Generate a key (hex-encoded, 64 characters)
openssl rand -hex 32

# Pass via flag
orlojd --secret-encryption-key=<hex-key> ...
orlojworker --secret-encryption-key=<hex-key> ...

# Or via environment variable
export ORLOJ_SECRET_ENCRYPTION_KEY=<hex-key>
```

When enabled, all Secret.spec.data values are encrypted with AES-256-GCM before being written to the database and decrypted transparently on read. This protects secrets against direct database access, backup exposure, and log/dump leaks.
The key must be identical across all server and worker processes that share the same database. Both hex-encoded (64 characters) and base64-encoded (44 characters) formats are accepted.
Without an encryption key, Secret data is stored as base64-encoded plaintext in the JSONB payload -- suitable for development but not for production.
On orlojd, the same --secret-encryption-key / ORLOJ_SECRET_ENCRYPTION_KEY setting also wraps the private key used for SealedSecret decryption when sealing is enabled. If no encryption key is configured, SealedSecret resources remain storable but reconcile to Error, and GET /v1/sealing-key/public returns 503.
Git-safe Sealed Secrets
Secret resources protect values in the API and optionally at rest in Postgres, but they are still plaintext manifests before apply. Use SealedSecret when you need to commit encrypted secret manifests to git.
The workflow is:
- `orlojd` creates or loads one active sealing keypair in the control plane.
- Clients fetch the public key from `GET /v1/sealing-key/public` or `orlojctl seal public-key`.
- Clients convert a normal `Secret` manifest into a `SealedSecret` manifest locally with `orlojctl seal secret -f secret.yaml`, or generate one directly from literals with `orlojctl seal secret <name> --from-literal key=value`.
- `orlojd` decrypts the `SealedSecret` and writes a normal `Secret` through the existing secret store path.
- Workers continue to read the generated `Secret` exactly as they do for manually created secrets.
SealedSecret and the generated Secret use the same name and namespace in v1. Generated Secrets are marked with orloj.dev/sealedsecret-owner=<namespace>/<name>. If a Secret with that name already exists and is not owned by the same SealedSecret, reconcile fails closed instead of overwriting user-managed data.
Examples:
```bash
# Seal an existing Secret manifest into secret.sealed.yaml
orlojctl seal secret -f secret.yaml

# Generate a SealedSecret file directly from literals
orlojctl seal secret openai-api-key \
  --from-literal value=sk-prod-123 \
  --out secrets/openai-api-key.sealed.yaml
```

Sealing Key Security Model
Orloj v1 uses one active control-plane sealing keypair per backing store.
- `orlojd` only generates a sealing key if no active key exists and `ORLOJ_SECRET_ENCRYPTION_KEY` is set. Startup loads an existing active key when present; it does not generate a new key on every restart.
- The generated sealing keypair is RSA-4096.
- The sealing private key is stored in Postgres encrypted with AES-256-GCM under `ORLOJ_SECRET_ENCRYPTION_KEY`.
- Each `SealedSecret` entry uses a fresh random 32-byte AES data key. The entry plaintext is encrypted with AES-256-GCM, and the AES data key is wrapped with RSA-OAEP-SHA256.
- The AES-GCM authenticated data binds the ciphertext to `<namespace>`, `<name>`, and the secret entry key. A ciphertext copied to a different secret identity will fail to decrypt.
Operationally, this means:
- A committed `SealedSecret` manifest is safe to store in git as long as the control-plane private key remains protected.
- If an attacker gets both the database and `ORLOJ_SECRET_ENCRYPTION_KEY`, they can recover the stored sealing private key.
- If an attacker gets code execution on `orlojd`, they can unseal secrets.
- Losing `ORLOJ_SECRET_ENCRYPTION_KEY` makes both encrypted `Secret` data and the stored sealing private key unrecoverable.
- Orloj v1 does not rotate sealing keys automatically yet; it keeps one active key until a future manual rotation flow is introduced.
Production
For production, choose one or both of the following approaches:
1. Encrypted Secret resources -- enable --secret-encryption-key and continue using Secret resources as in development. This is the simplest upgrade path.
2. SealedSecret manifests -- keep declarative secret manifests in git without exposing plaintext. This works well when you want resource-driven configuration and reviewable manifests, but do not want plaintext Secret YAML in the repository.
3. Environment variables -- bypass Secret resources entirely by injecting provider keys into the runtime environment:

```bash
export ORLOJ_SECRET_openai_api_key="sk-prod-key"
```

The resolver normalizes the secret name: a `secretRef: openai-api-key` looks up `ORLOJ_SECRET_openai_api_key` (hyphens become underscores).
4. External secret managers -- inject secrets as environment variables using your platform's native mechanism:
- Kubernetes: Use external-secrets-operator or the CSI secrets driver to sync Vault/AWS Secrets Manager/GCP Secret Manager values into pod env vars.
- HashiCorp Vault: Use Vault Agent sidecar to render secrets into env or files.
- Cloud providers: Use AWS Secrets Manager, GCP Secret Manager, or Azure Key Vault with their respective injection mechanisms.
Approaches 3 and 4 do not require Secret resources -- the env-var resolver handles resolution directly.
API Redaction
The REST API never returns plaintext secret data. All GET responses for Secret resources replace every value in spec.data with "***". This applies to both individual resource fetches and list responses. Secret data is write-only through the API; to verify a secret value, use the resource it references (e.g., test a model endpoint or tool that depends on it).
Event bus messages for secret create/update operations are also redacted before publication.
SealedSecret resources are returned as ciphertext blobs. The API never exposes the control-plane private key.
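The redaction rule for Secret data is simple to sketch (illustrative helper, not the server's actual code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// redactData returns a copy of a Secret's spec.data with every value
// replaced by "***", as GET responses do for both fetches and lists.
func redactData(data map[string]string) map[string]string {
	out := make(map[string]string, len(data))
	for k := range data {
		out[k] = "***" // keys survive so callers can see which entries exist
	}
	return out
}

func main() {
	data := map[string]string{"value": "c2stcHJvZC0xMjM="}
	b, _ := json.Marshal(redactData(data))
	fmt.Println(string(b)) // {"value":"***"}
}
```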
Security Requirements
- Raw secret values must not appear in logs or trace payloads.
- Store the encryption key itself in a secure location (e.g., a KMS, Vault, or hardware security module). Do not commit it to version control.
- Validate redaction behavior during incident drills.
- Back up
ORLOJ_SECRET_ENCRYPTION_KEYseparately from the database. Losing it prevents decrypting encryptedSecretvalues and the storedSealedSecretprivate key.
Tool Auth Profiles
Tools can authenticate using one of four profiles via spec.auth.profile:
| Profile | Suitable for | Notes |
|---|---|---|
| `bearer` (default) | API tokens, service keys | Injected as `Authorization: Bearer <token>` |
| `api_key_header` | APIs using custom header auth (e.g., `X-Api-Key`) | Requires `auth.headerName` |
| `basic` | Legacy HTTP basic auth | Secret must be `username:password` |
| `oauth2_client_credentials` | Machine-to-machine OAuth2 | Requires `auth.tokenURL`; uses a multi-key secret with `client_id` and `client_secret` |
Auth in Container Isolation
For container-isolated tools, auth is injected as environment variables rather than HTTP headers. The container's entrypoint script maps these to the appropriate curl headers:
| Env Var | Auth Profile |
|---|---|
| `TOOL_AUTH_BEARER` | `bearer`, `oauth2_client_credentials` |
| `TOOL_AUTH_BASIC` | `basic` |
| `TOOL_AUTH_HEADER_NAME` + `TOOL_AUTH_HEADER_VALUE` | `api_key_header` |
Auth Error Handling
Auth failures produce distinct error codes (auth_invalid for HTTP 401, auth_forbidden for HTTP 403) that are non-retryable. For oauth2_client_credentials, a 401 triggers automatic token cache eviction and one retry with a fresh token.
Auth Audit Trail
Every tool invocation records tool_auth_profile and tool_auth_secret_ref (the secret name, not the resolved value) in the task trace. Use these fields for audit queries and compliance reporting.
Risk-Tier Routing and Approvals
Tools can declare operation classes (read, write, delete, admin) via spec.operation_classes. Policy rules in ToolPermission.spec.operation_rules define per-class verdicts: allow, deny, or approval_required.
When a tool call triggers approval_required:
- The task enters the `WaitingApproval` phase.
- A `ToolApproval` resource is created for the pending decision.
- An operator approves or denies via the REST API.
- Approval outcomes produce `approval_pending`, `approval_denied`, or `approval_timeout` error codes.
All approval-related outcomes are non-retryable and do not consume retry budget.
Operational Guidance
- Use `operation_rules` with `verdict: approval_required` for destructive operations (`delete`, `admin`) in production environments.
- Set appropriate TTLs on `ToolApproval` resources (default: 10 minutes) to prevent tasks from waiting indefinitely.
- Monitor `WaitingApproval` task counts and approval latencies to detect bottlenecks.
Operational Requirements
- Enforce least-privilege tool permissions.
- Monitor denial and runtime policy error trends.
- Monitor auth failure rates by profile for early detection of expired credentials.
- Monitor approval request volume and response latency for `WaitingApproval` tasks.