Frequently Asked Questions

Core Concepts

What is Orkestra?

Orkestra is a declarative operator runtime for Kubernetes. It turns CRDs into fully functional operators without controllers, reconcilers, or conversion code.

You declare what a CRD should do — create a Deployment and a Service, apply defaults, validate fields, convert between versions. Orkestra runs the operator. The code you would have written does not exist.

The one-sentence version

Every operator framework before Orkestra reduced the code you write. Orkestra removes the need to write code at all.

See Your CRD Is Enough for the full picture.

Do I need to write Go code?

No — for the common case.

Orkestra provides these capabilities declaratively, with no Go:

Informers watching your exact GVK and version
Workqueue with configurable depth, backoff, and rate limiting
Worker pool with configurable concurrency
Drift correction (reconcile: true on any template resource)
Owner references and cascade deletion
Kubernetes event emission
Leader election
Health endpoints and Prometheus metrics
Multi-version CRD conversion
Admission-time validation and mutation
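
To make the backoff item in the list above concrete, here is a minimal Python sketch of per-item exponential backoff in the style commonly used by Kubernetes workqueues. It is not Orkestra's implementation: the `ItemBackoff` class, its method names, and the default delays are illustrative assumptions.

```python
class ItemBackoff:
    """Per-item exponential backoff: the delay doubles on each failure, up to a cap."""

    def __init__(self, base_seconds=0.005, max_seconds=60.0):
        # Both defaults are illustrative, not Orkestra's documented values.
        self.base = base_seconds
        self.max = max_seconds
        self.failures = {}  # item key -> consecutive failure count

    def when(self, key):
        """Record a failure for key and return how long to wait before retrying."""
        n = self.failures.get(key, 0)
        self.failures[key] = n + 1
        return min(self.base * (2 ** n), self.max)

    def forget(self, key):
        """Reset the failure count after a successful reconcile."""
        self.failures.pop(key, None)


if __name__ == "__main__":
    backoff = ItemBackoff(base_seconds=1.0, max_seconds=8.0)
    # Five consecutive failures of the same item: 1, 2, 4, 8, then capped at 8.
    print([backoff.when("default/website-a") for _ in range(5)])
```

Because the failure count is tracked per key, one repeatedly failing CR backs off on its own schedule without delaying retries for healthy CRs in the same queue.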

Go hooks are available when you need them — external API calls, complex conditional logic, type-safe struct access. But hooks are additive. The declarative layer handles everything else.

When Go becomes necessary

The 20% of operator logic that genuinely requires code — creating a user inside PostgreSQL, calling an external API, reading another cluster’s state — is handled by hooks. Hooks coexist with declarative templates. You do not choose one or the other.

How does Orkestra differ from Helm or Kustomize?

Different category entirely.

	Helm	Kustomize	Orkestra
What it does	Renders templates once	Patches manifests once	Runs a continuous operator loop
When it runs	At deploy time	At deploy time	Continuously, while the cluster runs
Drift correction	No	No	Yes — corrects on every reconcile cycle
Watches CRs	No	No	Yes — every change event triggers reconcile
Versioning	Chart versions	Kustomization	Per-CRD operator stacks, declarative conversion
Dependencies	Chart dependencies	Kustomization bases	`dependsOn` ordering with ready signals

Orkestra is an operator runtime. Helm and Kustomize are deployment tools. They solve adjacent problems and compose naturally — you can use a Helm chart as a Katalog source in a Komposer.
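
The difference between rendering once and reconciling continuously can be sketched in a few lines of Python. This is an illustration only: `reconcile_once`, the in-memory `cluster` dict, and the field names are hypothetical stand-ins for the API server and Orkestra's reconciler.

```python
def diff(desired, observed):
    """Return the fields whose observed value differs from the desired value."""
    return {k: v for k, v in desired.items() if observed.get(k) != v}


def reconcile_once(name, desired, cluster):
    """Create the resource if missing, otherwise patch only the drifted fields."""
    observed = cluster.get(name)
    if observed is None:
        cluster[name] = dict(desired)
        return "created"
    drift = diff(desired, observed)
    if drift:
        observed.update(drift)
        return "corrected"
    return "in-sync"


if __name__ == "__main__":
    cluster = {}
    desired = {"image": "nginx:1.27", "replicas": 3}
    print(reconcile_once("website", desired, cluster))  # first pass creates
    cluster["website"]["replicas"] = 1  # someone edits the resource by hand
    print(reconcile_once("website", desired, cluster))  # next pass repairs it
    print(reconcile_once("website", desired, cluster))  # nothing left to do
```

A deploy-time tool corresponds to calling `reconcile_once` a single time. An operator loop calls it on every change event and resync, which is why the manual edit is reverted.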

What is a Katalog?

A Katalog is a YAML document that declares how Orkestra should manage one or more CRDs. It is not a Kubernetes CRD itself — it is a file.

apiVersion: orkestra.orkspace.io/v1
kind: Katalog
metadata:
  name: website-operator

spec:
  crds:
    website:
      apiTypes:
        group: demo.orkestra.io
        version: v1alpha1
        kind: Website
        plural: websites
      operatorBox:
        default: true
        onCreate:
          deployments:
            - image: "{{ .spec.image }}"
              replicas: "{{ .spec.replicas }}"
              reconcile: true
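
Values such as "{{ .spec.image }}" in the Katalog above are filled in from the CR being reconciled. The Python sketch below shows the idea for simple dotted-path lookups only. The `resolve` and `lookup` helpers are hypothetical, and Orkestra's resolver may support more than this sketch does.

```python
import re

# Matches placeholders of the form {{ .some.dotted.path }}
PLACEHOLDER = re.compile(r"\{\{\s*\.([A-Za-z0-9_.]+)\s*\}\}")


def lookup(obj, path):
    """Walk a dotted path such as 'spec.image' through nested dicts."""
    for part in path.split("."):
        obj = obj[part]  # raises KeyError if the CR lacks the field
    return obj


def resolve(template, cr):
    """Replace every {{ .path }} placeholder with the value found in the CR."""
    return PLACEHOLDER.sub(lambda m: str(lookup(cr, m.group(1))), template)


if __name__ == "__main__":
    cr = {"spec": {"image": "nginx:1.27", "replicas": 2}}
    print(resolve("{{ .spec.image }}", cr))
    print(resolve("{{ .spec.replicas }}", cr))
```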

Why Katalog is not a CRD

Orkestra deliberately keeps Katalog and Komposer as plain YAML files, not Kubernetes CRDs. See Why Katalog and Komposer Are Not CRDs for the full reasoning. The short version: your CRD should be the focus, not Orkestra’s management infrastructure.

See the Katalog Schema for all available fields.

What is a Komposer?

A Komposer composes multiple Katalogs from different sources into one unified runtime configuration.

apiVersion: orkestra.orkspace.io/v1
kind: Komposer
metadata:
  name: platform-komposer

imports:
  registry:
    - url: ghcr.io/orkspace/orkestra-registry/postgres@v14
      oci: true
  files:
    - ./katalogs/website.yaml
  helm:
    - repo: https://charts.myorg.io
      chart: platform-crds
      version: 2.1.0

spec:
  crds:
    postgres:
      workers: 8      # override for production

The spec.crds inline block always wins on name conflict — it is the override mechanism. Platform teams publish Katalogs; application teams compose and override.
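
The override rule can be pictured as a merge in which the inline block is applied last. The Python sketch below makes two assumptions: the override is a shallow, field-level merge (suggested by the `workers: 8` example, which sets a single field), and when two imports define the same name the later import wins. The `compose` function is hypothetical.

```python
def compose(imported_katalogs, inline_crds):
    """Merge CRD entries from imports, then apply inline overrides last."""
    merged = {}
    for katalog in imported_katalogs:
        for name, entry in katalog.items():
            merged[name] = dict(entry)  # copy so the import is not mutated
    for name, override in inline_crds.items():
        # Inline fields win on conflict; untouched fields are kept.
        merged.setdefault(name, {}).update(override)
    return merged


if __name__ == "__main__":
    imported = [
        {"postgres": {"kind": "Postgres", "workers": 3}},
        {"website": {"kind": "Website"}},
    ]
    print(compose(imported, {"postgres": {"workers": 8}}))
```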

See the Komposer Schema for all options.

What is the OrkestraRegistry?

The OrkestraRegistry is two things:

1. The internal resource library (pkg/orkestra-registry/) — Go implementations of Create, Update, Delete, and Resolve for every common Kubernetes resource type: Deployments, Services, Secrets, ConfigMaps, Jobs, CronJobs, Pods, ServiceAccounts. These are called by the reconciler when it processes declarative templates. You never call them directly unless you are writing hooks.

2. The public pattern registry (orkspace/orkestra-registry) — versioned operator patterns distributed as OCI artifacts. Pull a Postgres operator pattern with one line in a Komposer. No binary. No deployment. Just a Katalog.

The npm analogy

The OrkestraRegistry is Orkestra’s package manager for operator behavior. Patterns are versioned, composable, and overridable. You import them like dependencies, not like binaries.

See the Reference for full schema documentation.

Running Orkestra

Can Orkestra manage multiple CRDs?

Yes — any number. This is the point.

Each CRD in a Katalog gets its own complete, isolated operator stack:

Dedicated informer watching its exact GVK and API version
Dedicated workqueue with independent depth and backoff
Dedicated worker pool — other CRDs cannot consume its workers
Dedicated health endpoint at /katalog/{crd}/health
Dedicated Prometheus metrics labeled by GVK

All of these operator stacks run inside one Orkestra process. The isolation is at the logic level. The shared infrastructure — API server connection, informer factory, health server, leader election — is paid once.

The economics

15 separate operator processes: ~750 MB–3 GB memory, 15 health endpoints, 15 metric schemas, 15 upgrade procedures.

Orkestra managing 15 CRDs: ~50 MB memory, 1 health server, 1 metric schema, 1 upgrade procedure.

How do I start Orkestra?

Locally, for development:

ork run --file katalog.yaml

In a cluster, via Helm:

helm repo add orkestra https://orkspace.github.io/orkestra
helm install orkestra orkestra/orkestra \
  --namespace orkestra-system \
  --create-namespace \
  --set runtime.katalog.existingConfigMap=my-katalog-configmap

See Deploying the Control Center for full cluster setup, including TLS, RBAC, and production tuning.

What does `ork validate` do?

ork validate runs the complete Katalog loading sequence without starting the runtime. It surfaces every configuration error — bad YAML, unknown kinds, circular dependencies, missing registry files, empty pattern files — before any cluster changes are made.

ork validate --file katalog.yaml

✓ website
    kind: Website
    group: demo.orkestra.io / version: v1alpha1 / plural: websites
    mode: dynamic / workers: 3 / resync: 15s
    validation: 2 rules / mutation: 1 rule

✗ application
    error: circular dependency: application → namespace → application
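
A circular-dependency check like the one reported above is typically a depth-first search over the dependency graph. The Python sketch below shows one way to find and report a cycle. The `find_cycle` function and the shape of its input (a dict mapping each CRD name to the names it depends on) are assumptions, not Orkestra's internals.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of names, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {name: WHITE for name in depends_on}
    stack = []

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for dep in depends_on.get(node, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: we have closed a loop
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        colour[node] = BLACK
        return None

    for name in list(depends_on):
        if colour[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    graph = {"application": ["namespace"], "namespace": ["application"]}
    cycle = find_cycle(graph)
    if cycle:
        print("circular dependency: " + " -> ".join(cycle))
```

When no cycle exists, the same traversal order can be reused to start CRDs after the ones they depend on.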

Run in CI

ork validate exits with a non-zero code on any error. Add it to your CI pipeline to catch Katalog errors before they reach the cluster:

- name: Validate Katalog
  run: ork validate --file katalog.yaml

It requires no cluster connection — safe to run in any CI environment.

Does Orkestra require cert-manager?

No. Orkestra needs TLS certificates for its HTTPS server (used by conversion and admission webhooks) when ENABLE_CONVERSION=true or ENABLE_ADMISSION_WEBHOOK=true. Where those certificates come from is your choice.

Approach	Suitable for
Self-signed (via `generate-certs.sh`)	Development and testing
cert-manager `Certificate` resource	Production — automated renewal
External PKI / corporate CA	Enterprise environments with existing PKI
Cloud provider ACM / GCP managed certs	Cloud-native deployments

The Helm chart includes optional cert-manager integration. Set certManager.enabled: true and the chart creates a Certificate resource and mounts the resulting Secret automatically.

Conversion and webhooks share one certificate

/convert, /validate, and /mutate all run on the same HTTPS server on :8443 with the same TLS certificate. One certificate covers all three endpoints.

What environment variables does Orkestra read?

Variable	Default	Description
`ORKESTRA_PORT`	`8080`	HTTP server port
`ENABLE_CONVERSION`	`false`	Enable the `/convert` HTTPS endpoint
`ENABLE_ADMISSION_WEBHOOK`	`false`	Enable `/validate` and `/mutate` (requires `ENABLE_CONVERSION`)
`TLS_CERT`	—	Path to TLS certificate
`TLS_KEY`	—	Path to TLS key
`ORK_REGISTRY`	—	Default registry URL for `imports.registry` entries without explicit URL
`DEFAULT_WORKERS`	`3`	Worker count per CRD when not set in Katalog
`DEFAULT_RESYNC`	`15s`	Resync interval when not set in Katalog
`MAX_QUEUE_DEPTH`	`100`	Max queue depth when not set in Katalog
`LOG_LEVEL`	`info`	Log verbosity: `debug`, `info`, `warn`, `error`
`NAMESPACE`	—	Namespace where Orkestra runs — used in webhook configurations
`ORKESTRA_SERVICE_NAME`	`orkestra`	Service name for webhook clientConfig
`CONVERSION_WINDOW`	`1000`	Rolling window size for conversion and admission latency percentiles
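
To illustrate what a rolling window such as `CONVERSION_WINDOW` controls, the Python sketch below keeps only the most recent samples and computes percentiles over them. The `LatencyWindow` class is hypothetical, and the nearest-rank percentile method is an assumption; the source does not say which method Orkestra uses.

```python
from collections import deque
import math


class LatencyWindow:
    """Keep the most recent `size` samples and report percentiles over them."""

    def __init__(self, size=1000):
        # deque(maxlen=...) silently evicts the oldest sample when full
        self.samples = deque(maxlen=size)

    def record(self, latency_ms):
        self.samples.append(latency_ms)

    def percentile(self, p):
        """Nearest-rank percentile (0 < p <= 100) over the current window."""
        if not self.samples:
            return None
        ordered = sorted(self.samples)
        rank = math.ceil(p / 100 * len(ordered))
        return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    window = LatencyWindow(size=5)
    for value in [10, 1, 2, 3, 4, 5]:  # the first sample falls out of the window
        window.record(value)
    print(window.percentile(50), window.percentile(99))
```

A larger window smooths the percentiles at the cost of more memory and slower reaction to recent changes; a smaller one does the opposite.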

CRDs and Operators

What is the super-operator model?

The super-operator model is the principle that each CRD gets a complete, isolated operator stack while sharing the runtime infrastructure.

In traditional frameworks, one-operator-per-CRD means one binary, one deployment, one informer factory, one leader election lease per CRD. The isolation is at the process level — expensive.

In Orkestra, one-operator-per-CRD means one informer, one queue, one worker pool, one reconciler per CRD — all inside a single process. The isolation is at the logic level. The runtime infrastructure (API server connection, informer factory, health server, leader election) is shared.

This gives you the isolation guarantee of the one-operator-per-CRD principle at a fraction of the resource cost.

The kube-controller-manager analogy

This is exactly how kube-controller-manager works. It runs the Deployment controller, the ReplicaSet controller, the Job controller, and dozens of others in one process. Each controller is isolated — they share only the infrastructure. Orkestra applies this proven model to custom resources.

Can Orkestra manage built-in Kubernetes resources?

Yes. kind: Deployment, kind: Pod, kind: Service, and 30+ other built-in Kubernetes kinds are supported without declaring group, version, or plural — Orkestra enriches them automatically from its internal registry:

- name: deployment-governance
  apiTypes:
    kind: Deployment   # ← only field needed for built-in kinds
  validation:
    - field: metadata.labels.team
      operator: exists
      message: "all deployments must declare a team owner"
      action: warn
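
The enrichment step for a kind-only declaration amounts to a lookup from kind to group, version, and plural. The Python sketch below shows the idea with a handful of built-in kinds. The `enrich` function and the rule that explicitly declared fields take precedence are assumptions; the group, version, and plural values are the standard Kubernetes ones.

```python
BUILT_IN_KINDS = {
    # kind: (group, version, plural); the core group is the empty string
    "Pod": ("", "v1", "pods"),
    "Service": ("", "v1", "services"),
    "ConfigMap": ("", "v1", "configmaps"),
    "Deployment": ("apps", "v1", "deployments"),
    "CronJob": ("batch", "v1", "cronjobs"),
}


def enrich(api_types):
    """Fill in group, version, and plural for a kind-only declaration."""
    kind = api_types["kind"]
    if kind not in BUILT_IN_KINDS:
        return dict(api_types)  # custom kinds must declare their own fields
    group, version, plural = BUILT_IN_KINDS[kind]
    enriched = {"group": group, "version": version, "plural": plural}
    enriched.update(api_types)  # assumption: declared fields take precedence
    return enriched


if __name__ == "__main__":
    print(enrich({"kind": "Deployment"}))
```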

Governance without a separate policy engine

This is how you apply governance to Kubernetes built-in resources without OPA, Kyverno, or a separate admission controller. Orkestra watches the resource, validates it at reconcile time, and optionally intercepts at admission time when ENABLE_ADMISSION_WEBHOOK=true.

Run ork validate --file katalog.yaml to see exactly what Orkestra resolves for a kind-only declaration.

Does Orkestra support multi-version CRDs?

Yes — with zero conversion code.

Each CRD version is a separate entry in the Katalog with its own complete operator stack. Each entry’s informer watches its specific GVK — the API server converts objects to the requested version before delivering them. Conversion rules are declared alongside reconcile templates and evaluated by the same resolver:

- name: website-v1
  conversion:
    storageVersion: v1
    paths:
      - from: v1alpha1
        to: v1
        spec:
          image: "{{ .spec.image }}"
          seo:
            enabled: false   # v1alpha1 has no seo field — supply default
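
Conceptually, a conversion path like the one above builds the target spec from the source object: templated values are read from the source and literal values are passed through as defaults. The Python sketch below mirrors that for simple dotted-path templates. The `convert` and `render` helpers are hypothetical and do not reflect Orkestra's actual resolver.

```python
import re

# A value that consists solely of one {{ .dotted.path }} placeholder
PLACEHOLDER = re.compile(r"^\{\{\s*\.([A-Za-z0-9_.]+)\s*\}\}$")


def render(node, source):
    """Build the target value: templates read from source, literals pass through."""
    if isinstance(node, dict):
        return {key: render(value, source) for key, value in node.items()}
    if isinstance(node, str):
        match = PLACEHOLDER.match(node.strip())
        if match:
            value = source
            for part in match.group(1).split("."):
                value = value[part]
            return value
    return node


def convert(obj, path):
    """Apply one conversion path to an object of the path's 'from' version."""
    group = obj["apiVersion"].split("/")[0]
    return {
        "apiVersion": f"{group}/{path['to']}",
        "kind": obj["kind"],
        "metadata": obj["metadata"],
        "spec": render(path["spec"], obj),
    }


if __name__ == "__main__":
    path = {
        "from": "v1alpha1",
        "to": "v1",
        "spec": {"image": "{{ .spec.image }}", "seo": {"enabled": False}},
    }
    old = {
        "apiVersion": "demo.orkestra.io/v1alpha1",
        "kind": "Website",
        "metadata": {"name": "blog"},
        "spec": {"image": "nginx:1.27"},
    }
    print(convert(old, path))
```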

Production results: 62 conversions, 0 failures, sub-millisecond average latency.

No separate webhook deployment

Conversion runs on Orkestra’s own HTTPS server — the same server that serves /validate and /mutate. No separate conversion webhook binary. No separate TLS certificate. No separate deployment.

See the Katalog Schema for the full conversion field reference.

Validation and Mutation

What is the difference between validation and mutation?

Validation evaluates rules against a CR and either blocks it (action: deny) or surfaces an advisory (action: warn).

Mutation applies defaults and normalisations to a CR before it is stored. Fields declared with default: are set only when absent. Fields declared with override: are always set.

Both run at two points:

Admission time — when ENABLE_ADMISSION_WEBHOOK=true, synchronously during kubectl apply
Reconcile time — every reconcile cycle, regardless of webhook configuration

Declare once, enforced at both points.
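
The semantics described above (default only when absent, override always, deny versus warn) can be sketched in Python as follows. The validation rule shape follows the `field`, `operator`, `message`, and `action` keys shown elsewhere in this FAQ. The mutation rule shape and all function names are assumptions made for illustration.

```python
def get_path(obj, path):
    """Return (found, value) for a dotted path such as 'metadata.labels.team'."""
    for part in path.split("."):
        if not isinstance(obj, dict) or part not in obj:
            return False, None
        obj = obj[part]
    return True, obj


def set_path(obj, path, value):
    """Set a dotted path, creating intermediate dicts as needed."""
    *parents, leaf = path.split(".")
    for part in parents:
        obj = obj.setdefault(part, {})
    obj[leaf] = value


def mutate(cr, rules):
    """Apply 'default' only when the field is absent; apply 'override' always."""
    for rule in rules:
        found, _ = get_path(cr, rule["field"])
        if "override" in rule:
            set_path(cr, rule["field"], rule["override"])
        elif "default" in rule and not found:
            set_path(cr, rule["field"], rule["default"])
    return cr


def validate(cr, rules):
    """Evaluate 'exists' rules; return (denials, warnings) as message lists."""
    denials, warnings = [], []
    for rule in rules:
        found, _ = get_path(cr, rule["field"])
        if rule["operator"] == "exists" and not found:
            target = denials if rule["action"] == "deny" else warnings
            target.append(rule["message"])
    return denials, warnings


if __name__ == "__main__":
    cr = {"metadata": {"labels": {}}, "spec": {"replicas": 5}}
    mutate(cr, [{"field": "spec.replicas", "default": 1}])  # kept at 5
    rule = {"field": "metadata.labels.team", "operator": "exists",
            "message": "team label required", "action": "warn"}
    print(validate(cr, [rule]))
```

Running the same two functions from both the admission handler and the reconcile loop is what lets a single declaration be enforced at both points.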

Roll out rules safely

Deploy new validation rules with action: warn first. Observe controller_admission_validation_violations_total in Prometheus to understand which CRs would be affected. When you are confident, change to action: deny. The Katalog change takes effect on the next Orkestra restart.

Does `ENABLE_ADMISSION_WEBHOOK=true` block the API server if Orkestra is down?

No, by design. The webhook configuration uses failurePolicy: Ignore by default. If Orkestra is unreachable when the API server calls /validate or /mutate, the operation is allowed through. Any violations admitted during the outage are caught at reconcile time once Orkestra is running again.

# To change to blocking behaviour (requires high-availability Orkestra deployment):
# Set in Helm values:
webhooks:
  failurePolicy: Fail    # default: Ignore

Before setting Fail

failurePolicy: Fail means Orkestra’s availability directly gates all CR deployments. Set it only with multiple Orkestra replicas, a PodDisruptionBudget, and confidence that your admission rules are correct. Start with Ignore.

Operations

How do I debug a CRD in production?

Use the Control Center — it gives you a full view of all CRDs, worker pools, queue depth, reconcile metrics, and dependency health without any additional tooling.

For quick terminal diagnostics, the runtime exposes HTTP endpoints:

# CRD health — 200 OK or 503 degraded
curl localhost:8080/katalog/website/health | jq

# Full CRD detail — stats, queue depth, active warnings
curl localhost:8080/katalog/website | jq

# All managed CRDs
curl localhost:8080/katalog | jq

# Prometheus metrics
curl localhost:8080/metrics | grep website

Port-forwarding in-cluster

When Orkestra runs in a cluster, port-forward before hitting the endpoints:

kubectl port-forward svc/orkestra 8080:8080 -n orkestra-system

The most common issues:

Symptom	Likely cause
`/health` returns 503	CRD degraded — check reconcile error rate in `/katalog/{crd}`
Resource not created	`when:` condition not met — check CR fields vs condition
Webhook rejection	Validation rule firing — read the error message in `kubectl apply` output
Stuck in terminating	`onDelete` Job blocked — check Job status in the CR’s namespace
Old field values	Reconciler not running — check if CRD is enabled and healthy

Is Orkestra safe for production?

Yes. Orkestra is designed for and demonstrated in production.

Leader election — only one instance actively reconciles; followers maintain warm caches for instant failover
safeReconcile — panics in any reconciler are caught; other CRDs are unaffected
Per-CRD failure domains — a degraded CRD does not affect others
Graceful shutdown — in-flight reconciles complete before the process exits
Conversion in production — 62 conversions, 0 failures, sub-millisecond latency
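
The idea behind safeReconcile and per-CRD failure domains can be sketched in Python, with exceptions standing in for Go panics. The `safe_reconcile` function below and its arguments are hypothetical and show the pattern only, not Orkestra's code.

```python
def safe_reconcile(crd, reconcile, item, failures):
    """Run one reconcile; turn any exception into a per-CRD failure record."""
    try:
        reconcile(item)
        return True
    except Exception as exc:  # broad on purpose: a reconciler bug must not escape
        failures.setdefault(crd, []).append(f"{item}: {exc}")
        return False


def broken_reconciler(item):
    """Stand-in for a reconciler with a bug."""
    raise RuntimeError("boom")


if __name__ == "__main__":
    failures = {}
    processed = []
    safe_reconcile("website", broken_reconciler, "default/blog", failures)
    safe_reconcile("postgres", processed.append, "default/db", failures)
    print(failures)   # only the website CRD records a failure
    print(processed)  # the postgres reconcile still ran
```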

Failover time

The leader lease lasts 15 seconds, so a crashed leader’s lease takes at most 15 seconds to expire. Allowing for lease acquisition, a follower on a healthy node with a warm cache starts reconciling within 16–17 seconds of a leader crash. During this window, CRs are not modified; they are queued and processed when the new leader starts.

See Trust and Failure Model for every failure mode, what it means, and how Orkestra handles it.

What RBAC permissions does Orkestra need?

Orkestra needs a ClusterRole with:

rules:
  # Watch and manage every CRD it is configured to handle
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Leader election
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update"]

  # Emit Kubernetes events
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]

  # Webhook configuration (when ENABLE_ADMISSION_WEBHOOK=true)
  - apiGroups: ["admissionregistration.k8s.io"]
    resources:
      - validatingwebhookconfigurations
      - mutatingwebhookconfigurations
    verbs: ["get", "create", "update", "patch"]

The ["*"] rule is broad. Scope it to specific API groups using restrictedNamespaces and targeted ClusterRole rules when running in security-sensitive environments.

The Helm chart generates the correct ClusterRole automatically based on the Katalog entries and enabled features.

Ecosystem

How does Orkestra compare to kro?

kro (Kube Resource Orchestrator) was introduced by AWS in 2024 and is now developed jointly by AWS, Google, and Microsoft. It lets you declare ResourceGraphDefinitions that compose Kubernetes resources declaratively. The core insight, that operator behavior should be a declaration, is the same insight Orkestra is built on.

The differences are significant:

	kro	Orkestra
Per-CRD isolation	No — shared reconcile context	Yes — dedicated informer, queue, workers
Multi-version CRDs	No	Yes — declarative conversion paths
Registry/distribution	No	Yes — OCI artifacts, Artifact Hub
Admission webhooks	No	Yes — validation and mutation
Health API	No	Yes — per-CRD endpoints and Prometheus
Observability	No	Yes — Control Center, per-CRD health endpoints, Prometheus
Hooks for external logic	No	Yes — typed and dynamic Go hooks

kro is a composability layer. Orkestra is a runtime. The fact that three major cloud providers independently arrived at the same insight validates the direction. Orkestra is the complete version of what they were reaching for.

Can Orkestra manage third-party CRDs?

Yes — any CRD that Kubernetes accepts, Orkestra can watch and reconcile. No fork, no reverse engineering, no changes to the CRD definition needed.

- name: prometheus
  apiTypes:
    group: monitoring.coreos.com
    version: v1
    kind: Prometheus
    plural: prometheuses
  operatorBox:
    default: true
    onCreate:
      # governance, companion resources, defaults

This is how governance patterns work — you apply Orkestra’s validation and mutation model to CRDs you did not write and cannot modify.

What is the path to Kubernetes core?

See Declarative Operators: A New Model for Kubernetes Extensibility for the full argument and roadmap.

The short version: Orkestra is building toward CNCF Sandbox, then a Kubernetes Enhancement Proposal, then alpha/beta/GA integration into kube-controller-manager. The target timeline is five years. The prerequisite is production adoption at multiple organisations, with metrics.

The Katalog and Komposer becoming native Kubernetes kinds — kubectl get katalogs — is the end state. At that point, every cluster ships with a meta-controller that understands declarative operator definitions. Platform teams write Katalogs. Kubernetes manages them.
