CCAT CA — Offline Root Ceremony Playbook

Read this before doing anything. Every step is in order. Do not skip, do not reorder, do not improvise. If anything in §§0–3 fails, stop the ceremony and re-plan. After §3 you are committed to the run.

Two people present. Witness reads each command aloud before the operator runs it. Operator paste-confirms output line-by-line.

Reference docs live in docs/ on this same USB:

  • docs/ceremony-live-usb-setup.md — threat model + hygiene rationale

  • docs/COMMISSIONING-TODO.md — overall Phase 2 checklist (you are somewhere inside its “Offline root ceremony” section)

  • docs/ca-architecture.md — design context for the CA this playbook is creating (two tiers, two HSMs, lifetimes, GitHub-team gate)

  • docs/ca-provisioner-set.md — reference tables (provisioner set, Ansible role tags, lifetime flags)

Replace every <ROOT-USER-PIN>, <ROOT-SO-PIN>, <INT-USER-PIN>, <INT-SO-PIN> with the value from your paper PIN sheet. Never paste PIN values into shell history that gets exported. This whole session runs in RAM and dies when you power off — that is the protection. Do not save shell history to the export USB.


§0 — Before you sit down

Confirm physically present:

  • Two Nitrokey HSM 2 dongles, HSM #1 and HSM #2 (label them now, before plugging in, with masking tape).

  • Boot USB (“CCAT CA BOOT — YYYY-MM-DD”), Ubuntu LTS Live, no persistence — verified before today via SHA256SUMS + GPG.

  • This supplies USB (“CCAT CA SUPPLIES — YYYY-MM-DD”).

  • Empty export USB (“CCAT CA EXPORT — YYYY-MM-DD”) — this is the third USB that will carry public artefacts off the laptop.

  • Paper PIN sheet with four PINs filled in, by hand.

  • Pen + envelope for the fingerprint paper.

  • Ceremony laptop with internal SSD that you have decided ahead of time will not be mounted during this session.

  • Witness present.

If any of the above is missing, stop. Reschedule.


§1 — Boot, air-gap, sanity-check

Before pressing power:

  • Ethernet cable physically unplugged from the laptop.

  • Wi-Fi disabled in BIOS/firmware (or Wi-Fi card removed).

  • Bluetooth disabled in BIOS/firmware.

  • HSM dongles not yet plugged in.

Boot from the boot USB. At the GRUB menu pick “Try Ubuntu”, not “Install Ubuntu”. When the desktop is up, open a terminal.

Belt-and-suspenders network kill:

nmcli radio all off
ip link show | grep 'state UP'

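The grep must print nothing. Every physical interface should be down, and lo (loopback) normally reports state UNKNOWN, so it does not match either. If any interface other than lo shows state UP, stop the ceremony.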
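The same check can be made to fail loudly instead of relying on an operator noticing a grep line. A minimal sketch, assuming the one-line-per-interface output of `ip -o link show` from iproute2; the function name `up_links` is hypothetical and not part of the ceremony tooling.

```shell
# up_links: read `ip -o link show` output on stdin and print every
# interface, other than lo, whose operational state is UP.
up_links() {
    awk '{
        name = $2
        sub(/:$/, "", name)     # "enp0s3:" -> "enp0s3"
        sub(/@.*/, "", name)    # drop an "@if3" style suffix
        for (i = 3; i < NF; i++)
            if ($i == "state" && $(i + 1) == "UP" && name != "lo")
                print name
    }'
}

# Usage on the ceremony laptop:
#   ip -o link show | up_links
# Any printed name is a live interface: stop the ceremony.
```

An empty result is the pass condition; a non-empty one names the interface to investigate.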

Confirm the internal disk is not mounted:

findmnt | grep -E '/(media|mnt)/.+/' || echo "no removable mounts yet"
mount | grep -E ' on /[^ ]+ type (ext|btrfs|xfs)' | grep -v '^/dev/loop'

The second line should print nothing. If it shows your laptop’s internal disk mounted, do not click into it from the Files app and do not proceed until you have unmounted it.


§2 — Plug in supplies USB and verify the manifest

Plug in only this supplies USB (still no HSMs).

ls /media/ubuntu/

Note the mount-point that just appeared (e.g. /media/ubuntu/CCATSUPP). Set a shell variable for convenience:

export S=/media/ubuntu/CCATSUPP   # adjust to actual mount-point
ls $S
cd $S
sha256sum -c MANIFEST.sha256

Every line of the sha256sum -c output must end with OK. If any line says FAILED, the supplies USB has been tampered with between preparation and now. Stop the ceremony, do not proceed, re-prepare a new supplies USB on a clean machine.
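Reading every OK line by eye is the primary control; the exit status of sha256sum can back it up. A minimal sketch, assuming GNU coreutils sha256sum (as shipped on Ubuntu); the function name `verify_manifest` is hypothetical.

```shell
# verify_manifest DIR: check every entry of DIR/MANIFEST.sha256.
# --quiet prints only the entries that fail; --strict also rejects
# malformed manifest lines. Returns non-zero on any problem.
verify_manifest() {
    ( cd "$1" && sha256sum --check --quiet --strict MANIFEST.sha256 )
}

# Usage:
#   verify_manifest "$S" || echo "MANIFEST FAILED: stop the ceremony"
```

Note that sha256sum only checks files that are listed in the manifest. A file that was added to the USB afterwards is not reported, so an `ls` comparison against the manifest is still worth a look.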

If all OK, also eyeball:

cat $S/VERSIONS.txt

The step-cli and step-kms-plugin versions printed there are the versions you are about to install. They should match what is committed in the system-integration git repo for this ceremony. If a witness has a phone, confirm via the public release tag URLs.


§3 — Install software from the supplies USB

sudo dpkg -i $S/debs/*.deb
sudo dpkg -i $S/step/*.deb
sudo systemctl start pcscd

Sanity-check the toolchain:

step --version
step-kms-plugin --version
opensc-tool --version
pkcs11-tool --version

All four should print versions without errors. If any fail, stop.
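The version checks above cover four tools, but §5 and §8 also call sc-hsm-tool. A small presence check over every binary the playbook uses can catch a missing package before an HSM is touched. This is a sketch only; `require_tools` is a hypothetical helper name.

```shell
# require_tools NAME...: print each command that is not on PATH and
# return non-zero if at least one is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "MISSING: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   require_tools step step-kms-plugin opensc-tool pkcs11-tool sc-hsm-tool \
#       || echo "toolchain incomplete: stop"
```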


§4 — Verify each HSM serial against its masking-tape label

The two dongles were physically labeled “HSM #1” and “HSM #2” before the ceremony (§0). This step confirms the serial reported by pkcs11-tool --list-slots matches the labeled stick — and from §5 onwards every step works with a single stick plugged in at a time, so slot indices (which re-enumerate on every plug-in) are not recorded. Only the serial, which is the stable, non-secret device identifier, is written down.

Plug in HSM #1 (the one masking-tape-labeled “HSM #1”). Nothing else should be plugged in.

lsusb | grep -i nitrokey
pkcs11-tool --list-slots

Exactly one slot must appear: an uninitialised Nitrokey HSM. Read the serial off the screen. Operator writes it on the HSM #1 — root row of the paper sheet. Witness verbally confirms the spoken serial matches the masking-tape label and the written paper.

Unplug HSM #1.

Plug in HSM #2 (masking-tape-labeled “HSM #2”).

pkcs11-tool --list-slots

Again exactly one slot. Read the serial. Operator writes it on the HSM #2 — intermediate row. Witness confirms.

Unplug HSM #2.

The paper sheet now has both serials. They go in the safe with HSM #1 and the PIN sheet at the end of the ceremony.

Stick-swap gate — read once, apply at every gate marker below

From §5 onwards every HSM-touching step is targeted at exactly one stick (root or intermediate). sc-hsm-tool --initialize, pkcs11-tool without explicit slot filtering, and step’s PKCS#11 KMS URI all bind to whichever device pcscd enumerated first when multiple are present. Single-stick presence is therefore the only reliable way to address the right HSM — token labels do not yet exist before §§5–6, and even afterwards a wrong stick can swallow a command intended for the other one.

Whenever you see a gate marker

**Gate: only HSM #N (Role) plugged in.**

apply this procedure before continuing:

  1. Unplug whichever HSM is currently in (one or both, if you somehow have two in).

  2. Plug in only HSM #N.

  3. Capture the serial of the now-plugged HSM into a shell variable that subsequent step / step-kms-plugin commands will reference:

    SERIAL=$(pkcs11-tool --module /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so --list-token-slots | awk '/serial num/{print $NF; exit}')
    echo "$SERIAL"
    
  4. Operator reads the printed $SERIAL aloud — serials are non-secret, they are printed on the Nitrokey case; only PINs, SO-PINs, and the root fingerprint are silent-only. Witness compares the spoken serial against the HSM #N row on the paper sheet, character by character.

  5. Both verbally confirm “HSM #N, serial matches” before the next command runs. The $SERIAL variable now addresses the currently plugged HSM in every URI below.

If $SERIAL comes back empty, two slots appear, or the serial does not match the paper, stop the ceremony.
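The awk one-liner in step 3 exits after the first serial, so by itself it cannot notice a second stick. The three stop conditions can be sketched as a filter over the pkcs11-tool output. This is an illustration under assumptions: `gate_serial` is a hypothetical name, and the expected serial is typed by the operator from the paper sheet.

```shell
# gate_serial EXPECTED: read `pkcs11-tool --list-token-slots` output on
# stdin, require exactly one distinct serial, and require it to equal
# EXPECTED (the value on the paper sheet). Prints the serial on success.
gate_serial() {
    serials=$(awk '/serial num/ {print $NF}' | sort -u)
    count=$(printf '%s\n' "$serials" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "GATE FAIL: expected 1 HSM serial, found $count" >&2
        return 1
    fi
    if [ "$serials" != "$1" ]; then
        echo "GATE FAIL: serial $serials does not match paper ($1)" >&2
        return 1
    fi
    printf '%s\n' "$serials"
}

# Usage (DENK0107520 stands in for the serial on the paper sheet):
#   SERIAL=$(pkcs11-tool --module /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
#       --list-token-slots | gate_serial DENK0107520) || echo "stop the ceremony"
```

The spoken confirmation between operator and witness still applies; the function only removes the chance of silently capturing the first of two serials.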

Gate: only HSM #1 (root) plugged in. Apply gate procedure.


Phase A — HSM #1 (root) operations

The next three steps (§§5–7) all run with HSM #1 plugged in. Do not unplug between them. Set up a working directory in RAM that all ceremony output files will land in:

mkdir -p ~/ceremony && cd ~/ceremony

PKCS#11 URI shape for step-cli (verified 2026-04-29).

The commands in §§7, 10, 11, 12 use this URI structure:

  • --kms "pkcs11:module-path=...;serial=$SERIAL?pin-value=$PIN" — module path + serial of the currently-plugged HSM (captured at the gate) + the user PIN inline.

  • --key "pkcs11:id=01" — minimal object selector. Adding object= or token= to --key made step’s getPublicKey URI matcher reject the lookup (“uri not matching”) in our run.

Why serial= instead of token=?

The opensc-pkcs11.so driver presents one PKCS#11 token per PIN type and renames it: when you init with --label "ccat-root", the URI-layer token name becomes ccat-root (UserPIN), which has to be percent-encoded (ccat-root%20%28UserPIN%29) before step accepts it. serial=$SERIAL (e.g. serial=DENK0107520) is simply cleaner: it is stable across re-inits, needs no encoding gymnastics, and is already on the paper sheet. The token form still works if you prefer it; both identifiers point at the same physical card.

Why pin-value and not pin-source? URI-embedded pin-source=/path/to/file did not resolve reliably in this run. pin-value=$PIN (with $PIN captured via read -rs PIN so the literal value never lands in shell history) was the form that consistently let step log into the token.
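The two-part shape described above can be written down once and reused, which avoids retyping the long module path in every command. A minimal sketch: `kms_uri` and `key_uri` are hypothetical helper names, and the PIN is assumed to contain only letters and digits (a PIN with `;`, `?`, `&` or `%` would need percent-encoding first).

```shell
# OpenSC module path used throughout this playbook.
MODULE=/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so

# kms_uri SERIAL PIN: print the --kms value
# (module path + serial of the plugged-in HSM + inline user PIN).
kms_uri() {
    printf 'pkcs11:module-path=%s;serial=%s?pin-value=%s\n' "$MODULE" "$1" "$2"
}

# key_uri ID: print the minimal --key selector, e.g. pkcs11:id=01.
key_uri() {
    printf 'pkcs11:id=%s\n' "$1"
}

# Usage inside a step command:
#   --kms "$(kms_uri "$SERIAL" "$PIN")" --key "$(key_uri 01)"
```

As with the literal form, the PIN is expanded by the shell at run time and does not appear in shell history.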

§5 — Initialise HSM #1 as the root

sc-hsm-tool --initialize --label "ccat-root" \
    --so-pin <ROOT-SO-PIN> --pin <ROOT-USER-PIN>

Expect a “Token initialized” success line. If it errors, stop and do not continue. A wrongly-initialised HSM is recoverable, but you do not want to compose ceremony state under uncertainty.


§6 — Generate the root key on HSM #1

pkcs11-tool --module /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
    --token-label ccat-root --login --pin <ROOT-USER-PIN> \
    --keypairgen --key-type EC:secp384r1 \
    --label "ccat-root-key" --id 01

Expect “Key pair generated” output. The private key is now on HSM #1 and will never leave it.


§7 — Create the self-signed root certificate

The --kms URI tells step to use the HSM key by reference; the certificate is public output, the private key stays inside HSM #1.

Working URI shape (verified 2026-04-29 ceremony). After extensive trial and error against step-cli + the OpenSC virtual-token model, the following two-part URI structure is what step’s PKCS#11 driver actually accepts on Ubuntu 24.04 Live:

  • --kms carries module-path + token (or serial) + pin-value (in-line, not pin-source — the latter did not resolve reliably in our run).

  • --key is minimal — just pkcs11:id=01. Adding object= or token= to --key made step’s getPublicKey URI matcher fail.

The token form token=ccat-root%20%28UserPIN%29 (URL-encoded ccat-root (UserPIN)) works. So does serial=$SERIAL, using the value captured at the gate; that variant is more robust to label changes and is the one the command below uses.

Read the root user PIN into a shell variable so it’s not in your shell history (read -rs reads silently; the URI gets the PIN via shell expansion at run time):

read -rs PIN
step certificate create \
    --profile root-ca \
    --not-after 240960h \
    --no-password --insecure \
    --kms "pkcs11:module-path=/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so;serial=$SERIAL?pin-value=$PIN" \
    --key "pkcs11:id=01" \
    "CCAT Observatory Root CA" \
    root_ca.crt
unset PIN

--no-password --insecure suppresses step’s reflex prompt for an output-file password. Since we’re using an existing HSM key, no private-key file is written to disk; the cert itself is public PEM that doesn’t need a passphrase. --insecure is just step’s required acknowledgement that the operator intentionally chose --no-password.

step certificate create takes 3 positionals <subject> <crt-file> <key-file>, but the <key-file> slot is for writing a freshly-generated key. To use an existing HSM key, set --key <pkcs11:URI> and omit the third positional. --kms carries the token + PIN context; --key selects the object by id.

240960h ≈ 27.5 years. The lifetime should outlive the next planned rotation; rotation is an event, not a schedule.
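The hour values passed to --not-after are easy to mistype, so it can help to convert them back to years before running the command. A small sketch, assuming 365-day years (8760 h); `hours_to_years` is a hypothetical helper name.

```shell
# hours_to_years HOURS: convert a --not-after duration given in hours
# to years, using 8760 h (365 days) per year, one decimal place.
hours_to_years() {
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 8760 }'
}

# Usage:
#   hours_to_years 240960   # root lifetime used in §7
#   hours_to_years 87600    # intermediate lifetime used in §12
```

With these inputs the root lifetime comes out at 27.5 and the intermediate at 10.0.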

Witness verifies the file appeared:

ls -la root_ca.crt
step certificate inspect root_ca.crt --short

Expect Subject: CN=CCAT Observatory Root CA, Issuer: same, validity window ~27y from today.


Phase B — HSM #2 (intermediate) operations

Gate: only HSM #2 (intermediate) plugged in. Apply gate procedure (§4 sidebar). The next four steps (§§8–11) all run against HSM #2. Do not unplug between them.

§8 — Initialise HSM #2 as the intermediate

sc-hsm-tool --initialize --label "ccat-intermediate" \
    --so-pin <INT-SO-PIN> --pin <INT-USER-PIN>

Same expectation as §5. Witness confirms the success message.


§9 — Generate the intermediate key on HSM #2

pkcs11-tool --module /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
    --token-label ccat-intermediate --login --pin <INT-USER-PIN> \
    --keypairgen --key-type EC:secp384r1 \
    --label "ccat-intermediate-key" --id 01

§10 — Create the intermediate CSR

Read the intermediate user PIN (a different secret from the root user PIN used in §7):

read -rs PIN
step certificate create \
    --csr \
    --no-password --insecure \
    --kms "pkcs11:module-path=/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so;serial=$SERIAL?pin-value=$PIN" \
    --key "pkcs11:id=01" \
    "CCAT Observatory Intermediate CA" \
    intermediate.csr

--csr switches the output from a self-signed certificate to a CSR. Same --kms + --key pattern as §7, just pointing at HSM #2 and the intermediate key generated in §9. Token name is ccat-intermediate (UserPIN) (URL-encoded). Keep $PIN in the shell variable for §11 below — it uses the same intermediate user PIN; do not unset PIN until after §11 is done.

step certificate inspect intermediate.csr --short

Expect a CSR with subject CN=CCAT Observatory Intermediate CA and an EC P-384 public key.


§11 — Generate SSH user/host CA keys on HSM #2

step-ca uses separate keys for SSH certificates (distinct from the X.509 chain). Generate both on HSM #2 so step-ca can sign SSH certs at runtime:

pkcs11-tool --module /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
    --token-label ccat-intermediate --login --pin <INT-USER-PIN> \
    --keypairgen --key-type EC:secp384r1 \
    --label "ccat-ssh-user-ca-key" --id 02

pkcs11-tool --module /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
    --token-label ccat-intermediate --login --pin <INT-USER-PIN> \
    --keypairgen --key-type EC:secp384r1 \
    --label "ccat-ssh-host-ca-key" --id 03

Export the public parts in OpenSSH format:

(reuses the $PIN from §10 — the same intermediate user PIN.)

step-kms-plugin key outputs the public key as a PEM block by default (SubjectPublicKeyInfo, the format that ssh-keygen calls PKCS8). We convert it to the OpenSSH public-key format via step crypto key format --ssh, which is part of step-cli (already on the supplies USB):

step-kms-plugin key "pkcs11:module-path=/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so;serial=$SERIAL;id=02?pin-value=$PIN" > /tmp/.ssh_user_ca.pem
step crypto key format --ssh /tmp/.ssh_user_ca.pem > ssh_user_ca.pub
rm /tmp/.ssh_user_ca.pem

step-kms-plugin key "pkcs11:module-path=/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so;serial=$SERIAL;id=03?pin-value=$PIN" > /tmp/.ssh_host_ca.pem
step crypto key format --ssh /tmp/.ssh_host_ca.pem > ssh_host_ca.pub
rm /tmp/.ssh_host_ca.pem

Sanity-check format:

head -c 30 ssh_user_ca.pub
head -c 30 ssh_host_ca.pub

Both should start with ecdsa-sha2-nistp384 .
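The eyeball check can be backed by a test on the key-type field, which is the first word of an OpenSSH public key line. A minimal sketch; `ssh_pub_type` and `is_p384_pub` are hypothetical helper names.

```shell
# ssh_pub_type FILE: print the key-type field (first word of the
# first line) of an OpenSSH public key file.
ssh_pub_type() {
    awk 'NR == 1 { print $1 }' "$1"
}

# is_p384_pub FILE: succeed only if FILE holds an ECDSA P-384 key line.
is_p384_pub() {
    [ "$(ssh_pub_type "$1")" = "ecdsa-sha2-nistp384" ]
}

# Usage:
#   for f in ssh_user_ca.pub ssh_host_ca.pub; do
#       is_p384_pub "$f" || echo "BAD FORMAT: $f"
#   done
```

A file that still contains the PEM block (for example because the conversion step was skipped) fails this test, since its first word is the PEM header rather than the key type.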

Fallback if step crypto key format --ssh is missing in this step-cli version: replace each conversion line with the ssh-keygen equivalent, run once per key, e.g. ssh-keygen -i -m PKCS8 -f /tmp/.ssh_user_ca.pem > ssh_user_ca.pub (and likewise for ssh_host_ca). Same result, using standard OpenSSH instead of step-cli.

If step-kms-plugin key --help shows --format ssh (or --ssh) in newer releases, you can collapse each conversion to a single command — but the two-step PEM→ssh path above works on every version and matches the rest of the ceremony’s belt-and-suspenders style.

After both .pub files are written, clear the intermediate PIN before the swap to HSM #1:

unset PIN

Phase C — back to HSM #1 to cross-sign

Gate: only HSM #1 (root) plugged in. Apply gate procedure. This is the final HSM swap. §12 below uses HSM #1 to sign the intermediate CSR; §13 (fingerprint) needs no HSM.

§12 — Sign the intermediate CSR with the root

Read the root user PIN back into a shell variable (the intermediate PIN was unset at the end of §11):

read -rs PIN
step certificate sign \
    --profile intermediate-ca \
    --not-after 87600h \
    --kms "pkcs11:module-path=/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so;serial=$SERIAL?pin-value=$PIN" \
    intermediate.csr \
    root_ca.crt \
    "pkcs11:id=01" \
    > intermediate_ca.crt
unset PIN

(step certificate sign writes the cert to stdout — no output-file password prompt, so --no-password --insecure aren’t needed here.)

For step certificate sign the three positionals are <csr> <issuing-ca-cert> <issuing-ca-key> — so root_ca.crt is the signing certificate (used to derive issuer info), not an output, and the third positional is the root-key URI on HSM #1. We use the same minimal pkcs11:id=01 form that worked in §7.

Argument order matters. This step-cli version requires all flags before all positionals. With positionals first the parser rejected the cert positional with scheme is missing (it was trying to apply URI parsing to it because a --kms argument followed). Putting --kms and the other flags before intermediate.csr resolves that. A bare relative path root_ca.crt is correct — do not add a file:// prefix; that errors out with scheme not expected.

87600h = 10 years.

Verify the chain:

step certificate verify intermediate_ca.crt --roots root_ca.crt
step certificate inspect intermediate_ca.crt --short

The verify command should print nothing (success). Inspect should show issuer = root subject.


Phase D — paper finalisation (no HSM)

§13 — Record the root fingerprint on paper

This is the canonical value every CCAT client will check at step ca bootstrap. If this paper is lost, recovery is painful.

step certificate fingerprint root_ca.crt

Operator transcribes the fingerprint from the screen onto the paper twice, on two different sheets, side by side. Witness visually verifies each character on the screen against each character on the paper, on both sheets.

Do not read the fingerprint aloud: same audio-leak hygiene as PINs (ambient mics in phones / watches / voice assistants). The fingerprint itself becomes public information once clients deploy, but visual verification against the screen is strictly stronger than aural confirmation (which only validates what the operator reads out, not what is actually on screen vs paper). Keeping a uniform “all values are checked silently” rule also prevents the operator developing speak-aloud habits that would leak the PIN/SO-PIN values on the same sheet. The serial check rule (spoken aloud) is the deliberate exception, scoped to non-secret device IDs.

Both papers go in the envelope along with HSM #1 and the PIN sheet.

Also save it to the export USB later (§14) as a tamper-evident cross-check.


§14 — Copy public artefacts to the export USB

Plug in the export USB (“CCAT CA EXPORT”). Do not unplug the supplies USB yet — the manifest from §2 is your audit trail.

EXPORT=/media/ubuntu/CCATEXPORT   # adjust to the actual mount-point
DEST="$EXPORT/ceremony-$(date +%Y-%m-%d)"   # compute the dated folder once
mkdir -p "$DEST"
cd ~/ceremony

# Public artefacts only; the private keys never leave the HSMs.
cp root_ca.crt intermediate_ca.crt ssh_user_ca.pub ssh_host_ca.pub "$DEST"/
step certificate fingerprint root_ca.crt > "$DEST/FINGERPRINT.txt"

ls -la "$DEST"/
cat "$DEST/FINGERPRINT.txt"   # leave on screen for the witness check
sync
sudo umount "$EXPORT"         # desktop-automounted media need root to unmount

Five files must be on the export USB:

  • root_ca.crt

  • intermediate_ca.crt

  • ssh_user_ca.pub

  • ssh_host_ca.pub

  • FINGERPRINT.txt

There is no private file to copy. The private keys are inside the HSMs. If you find yourself reaching for an intermediate_key.pem, stop — something has gone catastrophically wrong.

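Witness compares the FINGERPRINT.txt content against the paper from §13, character by character. Do this before the umount above, while the file can still be printed with cat. Equal? Good. Not equal? Stop the ceremony.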
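Before the export USB is unmounted, the copies can also be compared byte for byte against the originals in ~/ceremony. A minimal sketch using cmp; `verify_export` is a hypothetical helper name and the dated folder in the usage line is a placeholder.

```shell
# verify_export SRC DEST FILE...: byte-compare each FILE in SRC against
# its copy in DEST; report every mismatch or missing copy and return
# non-zero if there is at least one.
verify_export() {
    src=$1
    dest=$2
    shift 2
    rc=0
    for f in "$@"; do
        if ! cmp -s "$src/$f" "$dest/$f"; then
            echo "MISMATCH or missing: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage (run while the export USB is still mounted):
#   verify_export ~/ceremony "$EXPORT/ceremony-YYYY-MM-DD" \
#       root_ca.crt intermediate_ca.crt ssh_user_ca.pub ssh_host_ca.pub
```

This checks the copy operation only. The character-by-character comparison of the fingerprint against paper remains the witness's job.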


§15 — Unplug HSMs, label, power off

Unplug HSM #1 first. Apply a permanent label sticker: CCAT root DO NOT INSTALL YYYY-MM-DD. Place it in the envelope with the root PIN paper and the fingerprint paper.

Unplug HSM #2. Label: CCAT intermediate input-b YYYY-MM-DD. Place in a small sealed bag for transport to the server room.

Unplug the supplies USB. Do not plug it into a networked machine afterwards — the manifest in §2 was the tamper-detection control; that seal can only be checked again before another ceremony, not after casual reuse.

Power off the laptop normally (Power → Shut Down, not Suspend). After it’s off, leave it for ~30 seconds before removing the boot USB, to be sure nothing is still flushing to it.


§16 — Hand-off

The ceremony is done. The post-ceremony cutover happens elsewhere on the network. The next operator picks up at docs/COMMISSIONING-TODO.md § “HSM cutover on input-b” and the executable docs/source/ceremony/cutover-playbook.md. They will need:

  • HSM #2 (intermediate, in its labelled bag)

  • The intermediate user PIN (memorised by, or shared with, the cutover operator — not the root PINs, which stay in the safe)

  • The export USB

  • The fingerprint paper, for cross-checking the committed root_ca.crt in ansible/roles/ca_trust/files/ and the step ca bootstrap --fingerprint <…> value the test cohort will run

The boot USB and the supplies USB are now ceremony-archive media. Put them in the safe with HSM #1, separately from the root PINs.


If something goes wrong mid-ceremony

  • Before §5 (HSM init): stop, power off, retry on a different day. Nothing has been written to HSMs yet.

  • Between §5 and §12: the HSMs hold partial state. You can re-run sc-hsm-tool --initialize on an HSM (this wipes it) and start over from that stick's initialisation step (§5 for HSM #1, §8 for HSM #2), but only do this with the witness in agreement, and write down what happened. If you wipe HSM #1 you must also discard root_ca.crt from §7 if it was already created; if you wipe HSM #2 you must discard intermediate.csr and ssh_*_ca.pub from §§10–11.

  • After §12 (root has signed the intermediate): if the signed output is bad, do not retry by re-initialising HSM #1 unless you also agree the root cert from §7 is to be discarded. The whole ceremony is atomic; you either commit to the root cert from §7 or you wipe both HSMs and rerun from §5 with new PINs.

  • At any point: if you suspect the ceremony has been observed (someone walks in, an unexpected light blinks, a USB device is inserted that you did not authorise) — stop, power off, treat the HSMs as compromised, replan with fresh PINs and fresh HSMs.
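The discard rule for a wiped HSM (second bullet above) can be sketched as a helper so that stale outputs are not carried into the rerun by accident. This is an illustration only: `discard_after_wipe` is a hypothetical name, and it removes exactly the files named in that bullet, nothing more.

```shell
# discard_after_wipe ROLE [DIR]: delete the ceremony outputs that
# depended on the wiped HSM. ROLE is "root" (HSM #1) or "intermediate"
# (HSM #2); DIR defaults to ~/ceremony.
discard_after_wipe() {
    dir=${2:-"$HOME/ceremony"}
    case $1 in
        root)
            rm -f "$dir/root_ca.crt"
            ;;
        intermediate)
            rm -f "$dir/intermediate.csr" \
                  "$dir/ssh_user_ca.pub" "$dir/ssh_host_ca.pub"
            ;;
        *)
            echo "usage: discard_after_wipe root|intermediate [dir]" >&2
            return 2
            ;;
    esac
}

# Usage, after the witness has agreed to the wipe:
#   discard_after_wipe intermediate
```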