Kubernetes Deployment (Helm + Manifest Fallback)
Purpose
Deploy Orloj on Kubernetes with a Helm chart (recommended) or with raw manifests (fallback).
Prerequisites
- Kubernetes cluster access (kubectl context configured)
- Helm 3 (helm)
- curl and jq for verification (and go, if running orlojctl from source)
The release workflow publishes orlojd and orlojworker container images plus the Helm chart to GHCR — you do not need to build anything yourself unless you're deploying from a local checkout.
Install
1. Install with Helm (Recommended)
The chart is published as an OCI artifact on every v* release:
helm upgrade --install orloj oci://ghcr.io/orlojhq/charts/orloj \
--version 0.14.2 \
--namespace orloj \
--create-namespace \
--set postgresql.auth.password='<strong-password>' \
--set secretEncryptionKey="$(openssl rand -hex 32)" \
--set auth.mode=native \
--set auth.setupToken="$(openssl rand -hex 32)"
Notes:
- The chart defaults image.registry, image.server.repository, and image.worker.repository to the published GHCR images, so you do not need to set them.
- secretEncryptionKey is a 256-bit AES key used to encrypt provider API keys at rest in Postgres. Generate it with openssl rand -hex 32 and store it as you would any other root secret.
- auth.mode=native requires auth.setupToken for first-user bootstrap. See Operations > Security.
- Model provider API keys (Anthropic, OpenAI, Bedrock, etc.) are not chart values. They are encrypted Secret resources you create via orlojctl after the control plane is up, and ModelEndpoint resources reference them by name. See ModelEndpoint.
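Because the install command above evaluates openssl rand inline, every re-run would pass a freshly generated secretEncryptionKey and setupToken to the chart. One way to avoid that is to generate the values once and keep them in a values file. The sketch below assumes nothing about the chart beyond the value names already shown; the file name and function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: generate the root secrets once and keep them in a values file,
# so repeated `helm upgrade --install` runs reuse the same key.
# File and function names are illustrative, not part of the chart.

# Print 32 random bytes as 64 hex characters.
gen_hex32() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is unavailable: read 32 bytes from the kernel RNG.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Write a values file with fresh secrets unless one already exists.
# $1: path of the values file, $2: Postgres password
# (assumed to contain no double quotes or backslashes).
write_secret_values() {
  local file="$1" pg_password="$2"
  if [ -e "$file" ]; then
    echo "keeping existing $file" >&2
    return 0
  fi
  (
    umask 077  # owner-only permissions for the new file
    cat > "$file" <<EOF
postgresql:
  auth:
    password: "${pg_password}"
secretEncryptionKey: "$(gen_hex32)"
auth:
  mode: native
  setupToken: "$(gen_hex32)"
EOF
  )
}
```

After calling write_secret_values orloj-secrets.yaml '<strong-password>', pass the file to the install command with -f orloj-secrets.yaml in place of the four --set flags, and back the file up wherever you keep other root secrets.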
To inspect effective values:
helm get values orloj --namespace orloj
Install from a source checkout
If you've cloned the repo and want to deploy a development build, you can install from the chart directory directly. Subchart deps must be resolved first:
helm dependency update charts/orloj
helm upgrade --install orloj ./charts/orloj \
--namespace orloj \
--create-namespace \
--set postgresql.auth.password='<strong-password>' \
--set secretEncryptionKey="$(openssl rand -hex 32)" \
--set auth.mode=native \
--set auth.setupToken="$(openssl rand -hex 32)"
To pin custom image tags (for example, a locally-built image pushed to your own registry):
--set image.registry=ghcr.io/<your-org> \
--set image.server.repository=<your-org>/orloj-orlojd \
--set image.server.tag=<your-tag> \
--set image.worker.repository=<your-org>/orloj-orlojworker \
--set image.worker.tag=<your-tag>
GitOps (ArgoCD, Flux)
ArgoCD Application example pointing at the OCI chart:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orloj
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ghcr.io/orlojhq/charts
    chart: orloj
    targetRevision: 0.14.2
    helm:
      valueFiles:
        - values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: orloj
  syncPolicy:
    automated: { prune: true, selfHeal: true }
    syncOptions: [ CreateNamespace=true ]
2. Manifest Fallback (No Helm)
If you cannot use Helm, apply the baseline manifest set:
- Edit the image references in docs/deploy/kubernetes/orloj-stack.yaml and rotate the baseline secrets (Postgres password, secret encryption key, setup token).
- Apply manifests:
kubectl apply -f docs/deploy/kubernetes/orloj-stack.yaml
Verify
Wait for rollouts. Resources created by the Helm chart follow the <release>-<component> naming convention; with a release named orloj you get:
kubectl -n orloj rollout status statefulset/orloj-postgresql
kubectl -n orloj rollout status statefulset/orloj-nats
kubectl -n orloj rollout status deploy/orloj-server
kubectl -n orloj rollout status deploy/orloj-worker
If you used the manifest fallback, the names are unprefixed:
kubectl -n orloj rollout status deploy/postgres
kubectl -n orloj rollout status deploy/nats
kubectl -n orloj rollout status deploy/orlojd
kubectl -n orloj rollout status deploy/orlojworker
Port-forward the API service:
# Helm install
kubectl -n orloj port-forward svc/orloj-server 8080:8080
# Manifest fallback
kubectl -n orloj port-forward svc/orlojd 8080:8080
In another terminal:
curl -s http://127.0.0.1:8080/healthz | jq .
orlojctl --server http://127.0.0.1:8080 get workers
orlojctl --server http://127.0.0.1:8080 apply -f examples/blueprints/pipeline/ --run
orlojctl --server http://127.0.0.1:8080 get task bp-pipeline-task
Done means:
- All rollouts are successful.
- The API service is reachable through port-forward.
- At least one worker is Ready.
- The sample task reaches Succeeded.
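The reachability check can be scripted so it tolerates the few seconds a port-forward takes to come up. This is a minimal sketch that assumes only that /healthz answers with a successful HTTP status when the server is up; the response body is not inspected, and the function name, retry count, and delay are illustrative.

```shell
# Sketch: poll the health endpoint until it answers or the retry budget runs out.
# $1: base URL (for example http://127.0.0.1:8080)
# $2: number of attempts (default 30), $3: delay between attempts in seconds (default 2)
wait_for_healthz() {
  local base_url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses (4xx/5xx).
    if curl -fsS --max-time 5 "${base_url}/healthz" >/dev/null 2>&1; then
      echo "healthz ok after ${i} attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "healthz not reachable after ${attempts} attempts" >&2
  return 1
}
```

With the port-forward running, wait_for_healthz http://127.0.0.1:8080 returns 0 as soon as the API answers, which makes it usable as a gate before the orlojctl commands.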
Operate
Scale workers (Helm install):
kubectl -n orloj scale deploy/orloj-worker --replicas=3
kubectl -n orloj rollout status deploy/orloj-worker
For long-term scaling, prefer the HPA values:
worker:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70
Restart control plane:
kubectl -n orloj rollout restart deploy/orloj-server
kubectl -n orloj rollout status deploy/orloj-server
View logs:
kubectl -n orloj logs deploy/orloj-server --tail=200
kubectl -n orloj logs deploy/orloj-worker --tail=200
Upgrade chart release:
helm upgrade orloj oci://ghcr.io/orlojhq/charts/orloj \
--version <new-version> --namespace orloj --reuse-values
Rollback:
helm rollback orloj <revision> --namespace orloj
Troubleshoot
- Pods in ImagePullBackOff: verify image names/tags and registry access.
- Workers not processing: verify ORLOJ_AGENT_MESSAGE_CONSUME=true and message-bus env values.
- Tasks not created: verify the API endpoint is reachable from orlojctl.
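The first two checks can be gathered in one pass with standard kubectl commands. The sketch below assumes a Helm install with release name orloj, so the worker Deployment is deploy/orloj-worker; the function name is illustrative.

```shell
# Sketch: collect pod status, recent events, and the worker's ORLOJ_ env vars.
# $1: namespace (default orloj)
orloj_triage() {
  local ns="${1:-orloj}"
  echo "== pod status =="
  kubectl -n "$ns" get pods -o wide
  echo "== recent events (image pull errors show up here) =="
  kubectl -n "$ns" get events --sort-by=.lastTimestamp | tail -n 20
  echo "== worker ORLOJ_ env =="
  # Lists the env vars defined on the worker Deployment's pod template,
  # including those sourced from Secrets or ConfigMaps (shown as comments).
  kubectl -n "$ns" set env deploy/orloj-worker --list | grep 'ORLOJ_' \
    || echo "no ORLOJ_ variables found on deploy/orloj-worker"
}
```

For the manifest fallback, replace deploy/orloj-worker with deploy/orlojworker.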
Tool Isolation: Kubernetes Backend
When toolIsolation.kubernetes.enabled=true, Orloj runs tool invocations with isolation_mode: kubernetes as ephemeral Kubernetes Jobs in the cluster. This eliminates the need for a Docker socket on worker nodes.
RBAC Requirements
The Helm chart automatically creates a Role (and RoleBinding) for the worker ServiceAccount with the following permissions:
| API Group | Resource | Verbs |
|---|---|---|
| batch | jobs | create, get, list, watch, delete |
| (core) | pods | get, list |
| (core) | pods/log | get |
| (core) | secrets | get |
The Role is scoped to the namespace configured by toolIsolation.kubernetes.namespace (defaults to the release namespace).
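To confirm the Role took effect, each row of the table can be checked with kubectl auth can-i while impersonating the worker ServiceAccount. This is a sketch, not part of the chart: the ServiceAccount name depends on your release, so it is passed in as an argument, the function name is illustrative, and your own user needs permission to impersonate ServiceAccounts.

```shell
# Sketch: verify the worker ServiceAccount holds the permissions in the RBAC table.
# $1: namespace where tool Jobs run
# $2: worker ServiceAccount name
# $3: namespace of the ServiceAccount (defaults to $1)
check_tool_rbac() {
  local ns="$1" sa="$2" sa_ns="${3:-$1}" failed=0 verb resource sub answer
  # One "verb resource [subresource]" entry per line, mirroring the table.
  while read -r verb resource sub; do
    # kubectl prints "yes" or "no"; it exits non-zero on "no", hence "|| true".
    answer="$(kubectl auth can-i "$verb" "$resource" \
      ${sub:+--subresource="$sub"} \
      --namespace "$ns" \
      --as "system:serviceaccount:${sa_ns}:${sa}" 2>/dev/null || true)"
    echo "${verb} ${resource}${sub:+/$sub}: ${answer:-no}"
    [ "$answer" = "yes" ] || failed=1
  done <<EOF
create jobs.batch
get jobs.batch
list jobs.batch
watch jobs.batch
delete jobs.batch
get pods
list pods
get pods log
get secrets
EOF
  return "$failed"
}
```

Call it as check_tool_rbac orloj <worker-service-account>; a non-zero exit status means at least one permission is missing.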
Helm Values
Configure the Kubernetes tool isolation backend under toolIsolation.kubernetes:
toolIsolation:
  kubernetes:
    enabled: false                         # Set to true to enable
    namespace: ""                          # Namespace for tool Jobs (default: release namespace)
    serviceAccount: ""                     # Service account for tool Pods (default: worker SA)
    jobTTLSeconds: 300                     # TTL seconds after Job finishes (automatic cleanup)
    defaultImage: "curlimages/curl:8.8.0"  # Fallback image for HTTP tools
When enabled, the chart sets ORLOJ_TOOL_K8S_ENABLED=true plus related env vars on both the orlojd server and orlojworker deployments.
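As a sketch of how this might be switched on for an existing release and then verified, the functions below reuse the upgrade command from the Operate section and look for the env var named above on both Deployments. They assume release name orloj in namespace orloj; the function names are illustrative.

```shell
# Sketch: enable the Kubernetes tool isolation backend on an existing release.
# $1: chart version to upgrade to
enable_k8s_tool_isolation() {
  local version="$1"
  helm upgrade orloj oci://ghcr.io/orlojhq/charts/orloj \
    --version "$version" \
    --namespace orloj \
    --reuse-values \
    --set toolIsolation.kubernetes.enabled=true
}

# Confirm ORLOJ_TOOL_K8S_ENABLED=true is set on the server and worker Deployments.
verify_k8s_tool_isolation() {
  local deploy missing=0
  for deploy in orloj-server orloj-worker; do
    if kubectl -n orloj set env "deploy/${deploy}" --list \
        | grep -q '^ORLOJ_TOOL_K8S_ENABLED=true$'; then
      echo "${deploy}: enabled"
    else
      echo "${deploy}: ORLOJ_TOOL_K8S_ENABLED=true not found"
      missing=1
    fi
  done
  return "$missing"
}
```

Running verify_k8s_tool_isolation after the rollout finishes returns non-zero if either Deployment is missing the variable.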
Coexistence with Container Backend
Both container and kubernetes isolation backends can be active simultaneously. Each tool's spec.runtime.isolation_mode selects which backend handles that tool:
- isolation_mode: container runs the tool via docker run on the worker host
- isolation_mode: kubernetes runs the tool as a Kubernetes Job in the cluster
This allows gradual migration from Docker-based isolation to Kubernetes-native execution.
Security Defaults
- This baseline is not HA: server.replicaCount defaults to 1. Multi-replica orlojd requires leader election (see roadmap).
- Rotate secrets before non-test use:
  - postgresql.auth.password (or postgresql.auth.existingSecret for a pre-sealed value).
  - secretEncryptionKey: losing this makes every encrypted Orloj Secret unrecoverable.
  - auth.setupToken: single-use bootstrap; rotate after the first admin account is created.
  - auth.apiToken: set this only if you also need a static bearer for CLI/automation; otherwise rely on user-issued tokens minted through the native auth flow.
- ORLOJ_AUTH_MODE defaults to native (the chart's auth.mode value).
- auth.mode=off disables authentication entirely and is intended only for local development.
- Restrict namespace and service exposure based on cluster policy. The chart's server.ingress is opt-in and emits a networking.k8s.io/v1 Ingress; for Gateway API environments, leave server.ingress.enabled=false and ship an HTTPRoute alongside the release.