ADR-0001: Per-vhost cert split for the CCAT step-ca endpoint
Status. Accepted, 2026-05-05.
Supersedes. None. Captures a decision previously embedded in commits, lessons-learned, and ad-hoc operator knowledge.
Context
The CCAT step-ca endpoint must be reachable by clients across two
populations: machines and operators on Uni Köln subnets, and external
partners (REUNA, partner workstations off the uni network). step-cli
clients pin trust to the CCAT root via step ca bootstrap --fingerprint … and then refuse to talk to a TLS endpoint whose chain doesn’t lead
to that root.
We worked through three approaches before landing on the current one, each ruled out by a hard constraint:
1. Direct exposure of step-ca on :9000 to all clients. Uni Köln IT firewalls drop :9000 between subnets even within the university network, so cross-subnet uni clients (let alone external partners) cannot reach it at all. :9000 is reachable only from input-b’s own /24.
2. Let’s Encrypt on the ca.ccat.uni-koeln.de vhost via nginx-proxy. Clean from a deployment standpoint (acme-companion handles every other vhost this way), but it would require bootstrapping clients against an LE-issued cert. step-cli’s pinned trust model treats this as a chain mismatch: there’s no clean way to ask clients to trust both a CCAT root and a public CA for the same hostname.
3. Bridge the two roots client-side (append the system CA bundle to ~/.step/certs/root_ca.crt). We tried this; it works for step ssh login (OIDC flow) but fails for the JWK-flow commands like step ssh certificate, which read root_ca.crt for an internal cert-chain code path and reject multi-PEM input. It is also ergonomically miserable: every laptop needs the hack reapplied after each step ca bootstrap --force. The full ping-pong is in docs/source/ceremony/lessons-learned-cutover-2026-05-04.md.
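A laptop that still carries the bridged bundle from approach 3 can be spotted by counting the PEM certificate blocks in its root_ca.crt. The following is only a minimal sketch: the function names are hypothetical and not part of step-cli, and it assumes that a freshly bootstrapped root_ca.crt holds a single certificate, so more than one block merely suggests the hack is in place.

```python
import re

# Matches one PEM-encoded certificate block, including its armor lines.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def count_pem_certs(text: str) -> int:
    """Return the number of PEM certificate blocks found in text."""
    return len(PEM_CERT.findall(text))


def looks_bridged(text: str) -> bool:
    """Heuristic: more than one block hints at an appended CA bundle."""
    return count_pem_certs(text) > 1
```

To check a real machine, pass in the contents of ~/.step/certs/root_ca.crt, for example via pathlib.Path.read_text().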
Decision
step-ca lives behind nginx-proxy with a per-vhost cert split:
- The ca.ccat.uni-koeln.de vhost serves a CCAT-rooted cert (issued by step-ca itself via the prod-services JWK provisioner, written to /opt/proxy/certs/ca.ccat.uni-koeln.de.{crt,key}). acme-companion is opted out for this vhost.
- Every other vhost served by nginx-proxy keeps Let’s Encrypt via acme-companion.
- step-ca’s own port :9000 remains open on input-b but is firewalled to Uni Köln /16 (defaulting to the ca_allowed_source_cidrs group var) and is used only by the same-host issuance/renewal scripts that run on input-b. Cross-subnet clients always use :443.
- Access to ca.ccat.uni-koeln.de:443 is policy-enforced in-repo at proxy/data/vhost.d/ca.ccat.uni-koeln.de: an explicit allowlist of partner CIDRs followed by deny all. Adding a partner is a PR plus an nginx -s reload.
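The allowlist semantics can be modelled in a few lines. This sketch assumes nginx's usual access-rule behaviour, where rules are checked in order, the first match decides, and a request that matches no rule is allowed. The CIDRs are RFC 5737 documentation ranges standing in for partner networks, not real CCAT partners, and the names RULES and is_allowed are hypothetical.

```python
import ipaddress

# Hypothetical stand-ins for the partner allowlist, in file order.
RULES = [
    ("allow", "192.0.2.0/24"),
    ("allow", "198.51.100.0/24"),
    ("deny", "all"),
]


def is_allowed(client_ip: str, rules=RULES) -> bool:
    """Evaluate allow/deny rules in order; the first matching rule wins."""
    addr = ipaddress.ip_address(client_ip)
    for action, target in rules:
        if target == "all" or addr in ipaddress.ip_network(target):
            return action == "allow"
    # Assumption: with no matching rule nginx lets the request through,
    # which is why the trailing "deny all" matters.
    return True
```

With the trailing deny all in place, is_allowed("203.0.113.5") is False. Drop that last rule and the same address is admitted.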
Consequences
Accepted, positive:
- All cross-subnet clients use a single bootstrap path: step ca bootstrap --ca-url https://ca.ccat.uni-koeln.de. No port, no client-side trust hacks, no per-client variance.
- Adding a partner is a PR plus a proxy reload. There is no per-partner cert ceremony.
- The policy enforcement point (vhost.d allowlist) is in git, code-reviewed, and audit-trailable through commit history.
Accepted, costs:
- The ca.ccat.uni-koeln.de vhost cert needs its own renewal lifecycle: step-ca-vhost-renew.timer (every 12 h) calling step-ca/renew-vhost-cert.sh, plus an emergency re-issue path via step-ca/issue-vhost-cert.sh. acme-companion does not manage this cert; if anyone re-enables it for this vhost the ACME challenge will overwrite the CCAT cert.
- :9000 is unusable cross-subnet by design. Operators outside Uni Köln cannot issue or renew the vhost cert directly without first reaching input-b (which they have to do for the rest of CCAT operations anyway).
- The vhost.d allowlist is the only access control on the public CA endpoint. If it’s misconfigured (e.g. lost during a proxy config refactor), the CA endpoint becomes world-reachable. The default config explicitly fails closed (deny all after the allow lines) to mitigate this.
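Because the allowlist is the only access control, the fail-closed property could also be checked mechanically, for instance in CI. The sketch below assumes the vhost.d file consists of plain allow/deny directives. The function name fails_closed is hypothetical, and no such check is known to exist in the repository.

```python
import re

# Matches "allow <target>;" or "deny <target>;" at the start of a line.
DIRECTIVE = re.compile(r"^\s*(allow|deny)\s+([^;\s]+)\s*;", re.MULTILINE)


def fails_closed(config_text: str) -> bool:
    """Return True only if the last allow/deny directive is 'deny all'."""
    # Drop comments so a commented-out "deny all;" does not count.
    uncommented = re.sub(r"#.*", "", config_text)
    directives = DIRECTIVE.findall(uncommented)
    return bool(directives) and directives[-1] == ("deny", "all")
```

A CI job could read proxy/data/vhost.d/ca.ccat.uni-koeln.de, pass its contents to this function, and fail the build when it returns False. An empty file also fails the check.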
Cross-references
CCAT Certificate Authority — Architecture and Design — current-state explanation of the trust posture, including why the CA vhost opts out of LE.
CA day-to-day operations — operator how-to for the vhost cert lifecycle (timer, inspection, adding a partner subnet).
CA rotation and disaster recovery — runbook for vhost cert routine renewal and emergency re-issue.
Client setup — SSH with step-ca certificates — partner-facing bootstrap procedure including the off-network tunnel option.
Lessons learned — Phase 2 HSM cutover 2026-05-04 — full retrospective of the attempts that ruled out (1)–(3) above.