Design DigiCert — the PKI that issues, renews, and revokes X.509 certificates at a billion TLS connections a day. What a certificate actually is, where it lives (Keychain, Windows Store, JVM, Kubernetes Secrets), how the TLS handshake uses it, how revocation works (CRL, OCSP, CRLite), and how a Kubernetes-native CA — cert-manager, SPIFFE/SPIRE, Istio mTLS — replaces the manual ceremony at cloud scale. The schema, the issuance pipeline, the YAML, and the data architecture behind the padlock every browser shows.
| Question | 60-second answer |
|---|---|
| Why certificates exist | Establish trust between strangers — prove a server is who it claims to be, encrypt everything in transit. |
| Who issues them | Certificate Authorities. Root CAs are air-gapped; intermediates do the actual signing at scale. |
| How trust works | Chain: Root → Intermediate → Leaf. Your OS ships ~150 trusted roots. A leaf is trusted if it traces back to one of them. |
| How certificates fail | Expiry, compromised private key, CA misissuance, or CA distrust (DigiNotar 2011, Symantec 2018). |
| Modern solution | ACME + cert-manager: zero-touch issuance and renewal every 60–90 days, no human in the loop. |
A lock icon looks like a feature. It is, in fact, a chain of cryptographic signatures, a trust hierarchy burned into every operating system worldwide, a real-time revocation check, and a key-agreement protocol — all resolved in under 50 milliseconds before a single byte of HTTP is exchanged.
"Design DigiCert's certificate issuance and lifecycle platform. Walk me through what an X.509 certificate actually is, where it lives on a client machine, how TLS uses it, how you handle revocation, and how you'd build the issuance pipeline that can issue a million certs a day with sub-300ms latency, zero downtime, and full audit compliance. Then tell me how this changes in a Kubernetes environment."
The question catches most candidates off-guard because PKI sits at an uncomfortable intersection: it's security infrastructure, but the interesting engineering is data modeling and distributed systems. A weak answer draws a box labeled "Certificate Authority" and calls it done. A strong answer names the four forces that make this hard, then shows how every later decision exists to survive one of them.
There are exactly four forces, and they are not the usual suspects:
In scope: certificate issuance (DV, OV, EV), the chain of trust, TLS handshake mechanics, CRL/OCSP/CRLite revocation, domain control validation (DCV), Certificate Transparency, and the Kubernetes-native CA (cert-manager, SPIFFE/SPIRE, Istio mTLS). Out of scope (modeled-for, not designed in detail): email signing (S/MIME), code signing, document signing, IoT device enrollment internals, and HSM hardware procurement. Two tensions I name up front: (1) issuance must be fast (sub-300ms for DV-auto) but also auditable forever (7-year retention per CA/B Forum); (2) revocation must be real-time but also work offline (CRLite).
Envelope math, volunteered:
| Quantity | Estimate | Consequence |
|---|---|---|
| TLS connections / day (global) | ~8–10B | OCSP responders need Anycast + Redis cache; can't be DB-hit per request |
| Certs issued / day (DigiCert) | ~3–5M | At 47-day lifetime: 6× renewal frequency = 18–30M/day by 2029 |
| OCSP response TTL | 24–48 h | Pre-sign and cache; revocation latency ≤ 24 h from CA/B Forum baseline |
| CRL size (intermediate) | <10 MB | Must be partitioned; browsers don't download unbounded CRLs |
| CT log submission latency | 50–200 ms | 2 independent logs in parallel; retry on failure; budget this in issuance SLA |
| Issuance latency target (DV-auto) | <300 ms | DCV via http-01 or dns-01 is the bottleneck; not the signing itself |
| Audit retention | 7 years | Append-only event log (Kafka → Iceberg / S3); WebTrust / ETSI audit access |
The number unlike the others: 8–10 billion OCSP checks per day. That is not a database problem. It is a CDN/Anycast problem — pre-signed responses, Redis-cached, refreshed before TTL expires. Every other number follows from resolving that one deliberately.
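To make the envelope math concrete, here is a minimal sketch that converts the daily volumes from the table into per-second rates. The `peak_factor` of 3 is an assumption for illustration, not a figure from the table.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(per_day: float) -> float:
    """Average requests per second for a given daily volume."""
    return per_day / SECONDS_PER_DAY


def peak_per_second(per_day: float, peak_factor: float = 3.0) -> float:
    """Peak rate, assuming peak traffic is peak_factor times the average (assumed ratio)."""
    return per_second(per_day) * peak_factor


if __name__ == "__main__":
    for label, low, high in [("OCSP checks", 8e9, 10e9), ("Issuance", 3e6, 5e6)]:
        print(f"{label}: {per_second(low):,.0f} to {per_second(high):,.0f} per second on average")
```

On average that is roughly 93,000 to 116,000 OCSP lookups per second against only about 35 to 58 issuances per second, which is why the read path gets the cache and the write path does not need one.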
The accessible primer — HTTP vs HTTPS, padlock states, chain of trust, TLS handshake, and failure stories — lives in the article: Designing DigiCert →
A certificate is a signed data structure that binds a public key to an identity. The diagram below shows every field your browser validates. See the article for the plain-language walkthrough.
Quick reference: SAN (not CN) is canonical for hostnames. Extended Key Usage restricts misuse. OCSP URL (in AIA) is where clients check revocation. CT SCT List embeds 2+ Signed Certificate Timestamps — Chrome requires them. Signature covers all fields above; verify against the Issuer's public key.
| Field | Example value | Why it matters |
|---|---|---|
| Serial Number | A3:0F:81:C2:… (20 bytes) | CA-unique; key for CRL and OCSP revocation lookups |
| Subject / SAN | DNS:example.com, DNS:*.example.com | What the cert is valid for; CNs are deprecated |
| Not Before / Not After | 90-day window | Validity window; worthless outside it |
| Subject Public Key | ECDSA P-256 public key bytes | Server proves possession during TLS handshake |
| Extended Key Usage | TLS Web Server Authentication (OID 1.3.6.1.5.5.7.3.1) | Restricts cert use; prevents misuse |
| OCSP URL (AIA) | http://ocsp.digicert.com | Where clients check revocation in real time |
| CT SCT List | 2+ Signed Certificate Timestamps | Proof cert was CT-logged; Chrome requires this |
| Signature | ~72 bytes (ECDSA P-256, DER-encoded) | Produced by the Intermediate CA's private key over all fields above; verified against the Issuer's public key |
That's all HTTPS is: HTTP (the language websites speak) wrapped inside TLS (the sealed envelope). The certificate is the proof that the envelope was sealed by the real sender — not someone pretending to be your friend.
When macOS shows ⊗ This root certificate is not trusted, it means: your computer doesn't recognise the organisation that signed this certificate as a trustworthy authority. Think of it like a passport stamped by a country your government doesn't recognise. The passport might be real — but nobody here will accept it.
The certificate is past its notAfter date, or the CA published a revocation notice. Even if the CA is trusted, this specific cert is no longer valid.
HTTP sends data in plain text across the internet, like writing your credit card number on a postcard. Any router, ISP, or coffee shop Wi-Fi between you and the website can see it. HTTPS encrypts everything using TLS, so even if someone intercepts the traffic, they see random bytes. The certificate establishes the encryption keys AND verifies you're talking to the real site, not an impersonator. Companies that still run HTTP for anything involving user data are exposing their customers and violating most data privacy regulations (GDPR, CCPA, PCI-DSS).
TLS 1.3 reduced the handshake to one round-trip. Here is every message exchanged and what the client verifies before accepting the cert. See the article for the plain-language walkthrough.
CLIENT SERVER
────── ──────
ClientHello ──────────────────────────────────►
- TLS version: 1.3
- key_share: client's ephemeral X25519 pub key
- supported cipher suites
◄─────────────────────────── ServerHello
- selected cipher: TLS_AES_256_GCM_SHA384
- key_share: server's X25519 pub key
← BOTH SIDES DERIVE session keys now →
(all subsequent messages are encrypted)
◄─────────────────────────── EncryptedExtensions
◄─────────────────────────── Certificate (leaf + intermediates)
◄─────────────────────────── CertificateVerify (server signs handshake hash
with its private key — proves key possession)
◄─────────────────────────── Finished (HMAC over entire handshake)
CLIENT verifies the cert:
1. Chain: leaf → intermediate → trusted root in local trust store
2. SANs include the hostname being connected to
3. Not Before ≤ now ≤ Not After
4. CertificateVerify signature validates against cert's public key
5. OCSP staple (or online OCSP call) → status = GOOD
6. CT SCTs present (Chrome requires 2+ from independent logs)
Finished ─────────────────────────────────────► (confirms receipt)
═══════════════════════════════════════════════ Application Data (HTTP/2, encrypted)
Total added RTTs: 1 (0-RTT resumption possible on reconnect)
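The client-side checks above can be observed from Python's standard `ssl` module. This is a sketch, not a browser-grade verifier: the default context covers chain building, hostname matching against SANs, and the validity window (steps 1 to 4), but it does not check OCSP or CT SCTs (steps 5 and 6). The helper names are illustrative.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Context that verifies the chain and hostname and refuses anything below TLS 1.3."""
    ctx = ssl.create_default_context()  # system trust store, CERT_REQUIRED, check_hostname=True
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def peer_cert_summary(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, let the ssl module run its checks, and return a few validated fields."""
    ctx = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname sets SNI and is the name matched against the SAN list.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "tls_version": tls.version(),
                "cipher": tls.cipher()[0],
                "not_after": cert["notAfter"],
                "sans": [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"],
            }
```

Calling `peer_cert_summary("example.com")` needs network access; a failed chain, hostname, or expiry check surfaces as `ssl.SSLCertVerificationError` before any application data is sent.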
The interesting design is the OCSP cache as a separate hot read-path table and the serial number generation constraint. Every table below serves a distinct compliance or operational requirement.
-- Certificate Authorities (roots + intermediates)
CREATE TABLE certificate_authority (
ca_id BIGSERIAL PRIMARY KEY,
ca_type TEXT NOT NULL CHECK (ca_type IN ('root','intermediate','issuing')),
parent_ca_id BIGINT REFERENCES certificate_authority(ca_id),
common_name TEXT NOT NULL,
subject_dn TEXT NOT NULL,
public_key_sha256 BYTEA NOT NULL UNIQUE, -- key fingerprint
cert_pem TEXT NOT NULL,
valid_from TIMESTAMPTZ NOT NULL,
valid_until TIMESTAMPTZ NOT NULL,
hsm_slot_id TEXT, -- HSM partition reference (never the key itself)
is_active BOOLEAN NOT NULL DEFAULT TRUE
);
-- Every issued certificate
CREATE TABLE certificate (
cert_id BIGSERIAL PRIMARY KEY,
serial_number BYTEA NOT NULL, -- ≤ 20 octets, ≥ 64 bits of CSPRNG output (RFC 5280 / CA/B Forum Ballot 164)
issuing_ca_id BIGINT NOT NULL REFERENCES certificate_authority(ca_id),
subject_cn TEXT NOT NULL,
san_list TEXT[] NOT NULL, -- DNS names, IPs — indexed via GIN
cert_type TEXT NOT NULL CHECK (cert_type IN ('dv','ov','ev','code_sign','s_mime','client')),
public_key_algo TEXT NOT NULL, -- RSA, EC, Ed25519
public_key_sha256 BYTEA NOT NULL,
cert_pem TEXT NOT NULL,
valid_from TIMESTAMPTZ NOT NULL,
valid_until TIMESTAMPTZ NOT NULL,
issued_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
status TEXT NOT NULL DEFAULT 'active'
CHECK (status IN ('active','revoked','expired','hold')),
customer_id BIGINT NOT NULL,
order_id BIGINT NOT NULL,
ct_log_ids TEXT[] NOT NULL DEFAULT '{}', -- SCT log IDs (must have 2+)
UNIQUE (issuing_ca_id, serial_number)
);
-- Revocation (append-only; never delete)
CREATE TABLE revocation (
revocation_id BIGSERIAL PRIMARY KEY,
cert_id BIGINT NOT NULL REFERENCES certificate(cert_id),
revoked_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
reason_code INT NOT NULL, -- RFC 5280 CRLReason (1=keyCompromise, 4=superseded, etc.)
reason_text TEXT,
requested_by TEXT NOT NULL, -- customer / admin / automated
crl_published_at TIMESTAMPTZ -- null until included in next CRL
);
-- Domain Control Validation events
CREATE TABLE dcv_event (
dcv_id BIGSERIAL PRIMARY KEY,
domain TEXT NOT NULL,
method TEXT NOT NULL CHECK (method IN ('http-01','dns-01','tls-alpn-01','email')),
challenge_token TEXT NOT NULL,
challenge_value TEXT NOT NULL,
validated_at TIMESTAMPTZ,
expires_at TIMESTAMPTZ NOT NULL, -- reuse window (was 825d → now 90d)
cert_id BIGINT REFERENCES certificate(cert_id),
ip_logged INET -- requester IP for audit trail
);
-- OCSP response cache — the hot read path (8B req/day)
-- Populated at issuance and on revocation; refreshed before next_update
CREATE TABLE ocsp_response_cache (
cert_id BIGINT PRIMARY KEY REFERENCES certificate(cert_id),
ca_id BIGINT NOT NULL REFERENCES certificate_authority(ca_id),
this_update TIMESTAMPTZ NOT NULL,
next_update TIMESTAMPTZ NOT NULL, -- cache TTL
ocsp_status TEXT NOT NULL CHECK (ocsp_status IN ('good','revoked','unknown')),
signed_response BYTEA NOT NULL -- DER-encoded OCSPResponse; serve directly
);
-- Indexes
CREATE INDEX idx_cert_san ON certificate USING GIN (san_list);
-- (issuing_ca_id, serial_number) needs no separate index: the UNIQUE constraint on certificate already creates one
CREATE INDEX idx_cert_expiring ON certificate (valid_until) WHERE status = 'active';
CREATE INDEX idx_cert_customer ON certificate (customer_id, status, valid_until);
CREATE INDEX idx_rev_crl ON revocation (crl_published_at) WHERE crl_published_at IS NULL;
CREATE INDEX idx_ocsp_refresh ON ocsp_response_cache (next_update);
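Serial number design: never use a sequence. The CA/B Forum Baseline Requirements (Ballot 164) require at least 64 bits of CSPRNG output in every serial, and RFC 5280 caps the serial at 20 octets and requires it to be a positive integer. Use os.urandom(20) and clear the top bit of the first byte so the DER-encoded INTEGER stays positive (159 random bits), then store it as BYTEA. The UNIQUE(issuing_ca_id, serial_number) constraint catches the astronomically rare collision and lets the application retry. Sequential serials would leak issuance volume to anyone watching CT logs.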
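A minimal sketch of that serial generator follows. The function name is illustrative; the only requirements it encodes are the ones named above (CSPRNG output, positive, at most 20 octets).

```python
import os


def new_serial_number() -> bytes:
    """Return a 20-octet serial that is random, positive, and non-zero."""
    while True:
        raw = bytearray(os.urandom(20))
        raw[0] &= 0x7F  # clear the sign bit so the DER INTEGER is positive and fits in 20 octets
        if any(raw):    # a serial of zero is not allowed; retry in that (practically impossible) case
            return bytes(raw)
```

The application would insert the result into `certificate.serial_number` and, on a unique-constraint violation, simply call the function again.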
Every certificate starts as a PKCS#10 CSR that the applicant generates and signs with their own private key. The CA validates it, signs the certificate with an HSM-held key, and returns it. For DV-auto: under 300 ms end-to-end.
Applicant (customer / cert-manager operator)
│
│ 1. Generate key pair locally — private key NEVER leaves applicant
│ 2. Create CSR: openssl req -new -key key.pem -out csr.pem
│ 3. POST /certificate { csr_pem, order_id, dcv_method }
│
▼
DigiCert RA — Registration Authority (validates domain control)
│
├─ DV (Domain Validation — automated):
│ http-01: GET /.well-known/acme-challenge/{token}
│ → must return "{token}.{account_thumbprint}"
│ dns-01: TXT _acme-challenge.example.com
│ = base64url(sha256("{token}.{account_thumbprint}"))
│ tls-alpn-01: TLS handshake on :443 with ALPN "acme-tls/1"; challenge cert carries a critical acmeIdentifier extension (RFC 8737)
│
├─ OV/EV: org checks (WHOIS, phone call, Dun & Bradstreet / GLEIF / LEI)
│
├─ Policy linting:
│ - Key size ≥ RSA 2048 or EC P-256+; SHA-256+ only; SAN required
│ - Validity ≤ 90 days; no wildcards on EV; EKU must match cert type
│ - Run pkilint + zlint — any WARN/ERR blocks issuance
│
├─ CT pre-certificate submission (in parallel to 2+ independent logs):
│ POST https://ct.googleapis.com/logs/argon2025h2/ct/v1/add-pre-chain
│ POST https://ct.cloudflare.com/logs/nimbus2025/ct/v1/add-pre-chain
│ ← receive Signed Certificate Timestamps (SCTs, 104 bytes each)
│
└─ HSM sign (PKCS#11 call; private key never leaves the HSM):
Build final TBSCertificate with SCTs embedded
Sign with Intermediate CA private key
Return: { cert_pem, chain_pem }
Write to DB:
INSERT certificate(..., ct_log_ids, status='active')
INSERT ocsp_response_cache(cert_id, status='good', next_update=NOW()+48h, signed_response=...)
INSERT dcv_event(validated_at=NOW(), ...)
Publish to Kafka topic cert.issued → audit log → S3/Iceberg
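The http-01 and dns-01 values in the DV step are both derived from the ACME key authorization (RFC 8555) and the JWK thumbprint (RFC 7638). The sketch below computes them with the standard library only; the helper names are illustrative and only EC and RSA account keys are handled.

```python
import base64
import hashlib
import json

# Members that RFC 7638 includes in the thumbprint, per key type.
_THUMBPRINT_MEMBERS = {"EC": ("crv", "kty", "x", "y"), "RSA": ("e", "kty", "n")}


def b64url(data: bytes) -> str:
    """Base64url without padding, as ACME uses everywhere."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 thumbprint over the required members, sorted, with no whitespace."""
    members = {name: jwk[name] for name in _THUMBPRINT_MEMBERS[jwk["kty"]]}
    canonical = json.dumps(members, sort_keys=True, separators=(",", ":"))
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, account_jwk: dict) -> str:
    """Body the http-01 challenge URL must return: token.thumbprint."""
    return f"{token}.{jwk_thumbprint(account_jwk)}"


def dns01_txt_value(token: str, account_jwk: dict) -> str:
    """Value of the _acme-challenge TXT record for dns-01."""
    digest = hashlib.sha256(key_authorization(token, account_jwk).encode("ascii")).digest()
    return b64url(digest)
```

The RA side recomputes the same strings from the account key it has on file and compares them with what it fetched over HTTP or DNS.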
Once a CA confirms a key compromise, it must revoke within 24 hours (CA/B Forum Baseline Requirements §4.9.1.1); how quickly clients learn about it depends on the mechanism. Three mechanisms, each with different trade-offs:
| Mechanism | How it works | Latency | Privacy | Offline? |
|---|---|---|---|---|
| CRL | CA publishes a signed list of revoked serials as a file. Browsers download periodically (24–48 h). Partitioned to stay <10 MB per intermediate. | Up to 48 h | ✓ (no per-lookup call) | ✓ (file cached) |
| OCSP | Client makes an HTTP request per cert: "Is serial X revoked?" CA responds with a signed status. Stapling moves this onto the server. | <75 ms (w/ staple: 0 ms) | ✗ (CA sees every connection) / ✓ with stapling | ✗ (requires connectivity) |
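| CRLite / CRLSets | CRLite (Firefox) compiles the revocation status of every certificate in the CT logs into a Bloom filter cascade (~2 MB); CRLSets (Chrome) ships a smaller curated list of revocations. Both are pushed to browsers through their update channels. Zero OCSP latency, zero privacy leak. | 0 ms lookup (freshness depends on the last push) | ✓ | ✓ |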
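The idea behind the CRLite row can be sketched with a single Bloom filter. This is a simplification: a single filter can return false positives, which is why CRLite layers several filters into a cascade. The class below is illustrative, not CRLite's actual format.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with k hash positions per item; no false negatives."""

    def __init__(self, size_bits: int, num_hashes: int) -> None:
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k positions by prefixing the item with a counter before hashing.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

A client holding the filter answers "is this serial revoked?" locally with `serial in revoked_filter`, so there is no network call and the CA never sees which sites the user visits.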
In Kubernetes, three tools compose a complete PKI: cert-manager (ACME/DigiCert lifecycle), SPIFFE/SPIRE (cryptographic workload identity), and Istio/Linkerd (transparent mTLS between every pod — no application code changes required).
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: digicert-acme
spec:
  acme:
    server: https://acme.digicert.com/v2/acme/directory
    email: ops@example.com
    privateKeySecretRef:
      name: digicert-acme-account-key
    solvers:
      - dns01:
          route53:
            region: us-east-1
            accessKeyIDSecretRef:
              name: route53-creds
              key: access-key-id
            secretAccessKeySecretRef:
              name: route53-creds
              key: secret-access-key
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com-tls
  namespace: default
spec:
  secretName: example-com-tls-secret
  issuerRef:
    name: digicert-acme
    kind: ClusterIssuer
  dnsNames:
    - example.com
    - "*.example.com"
  duration: 2160h # 90 days
  renewBefore: 720h # renew 30 days before expiry
  privateKey:
    algorithm: ECDSA
    size: 256 # P-256
# Registration: bind k8s service account → SPIFFE ID
# spiffe://cluster.local/ns/payments/sa/checkout-api
# selector: k8s:ns:payments + k8s:sa:checkout-api
# ttl: 3600 (1-hour SVIDs, auto-rotated by SPIRE agent)
# Each pod fetches its SVID from the SPIRE Workload API socket at /run/spire/sockets/agent.sock
# Istio or app reads the SVID and uses it for mTLS
# AuthorizationPolicy: only checkout-api may call fraud-scorer
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: fraud-scorer-allow
  namespace: payments
spec:
  selector:
    matchLabels:
      app: fraud-scorer
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/payments/sa/checkout-api"
cert-manager writes a renewed cert into the Kubernetes Secret, but running pods don't reload automatically. Three failure modes: env var mount (never updates — requires pod restart); volume mount (kubelet syncs in 60 s but app must re-read; most TLS libraries don't); cached TLS context (Go's tls.Config / Java's SSLContext loaded at startup). Fix: use Go's tls.Config.GetCertificate callback, or let Istio/Envoy handle TLS termination (xDS reloads with zero downtime).
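The same fix applies outside Go: re-read the mounted files when they change instead of loading them once at startup. The sketch below is one way to do that in Python; `CertReloader` and its `load` callback are hypothetical names, and the loader is injected so the class does not assume any particular TLS library.

```python
import os
from typing import Callable, Optional, Tuple


class CertReloader:
    """Calls load(cert_path, key_path) whenever either file changes on disk."""

    def __init__(self, cert_path: str, key_path: str, load: Callable[[str, str], None]) -> None:
        self.cert_path = cert_path
        self.key_path = key_path
        self._load = load
        self._seen: Optional[Tuple[int, int, int, int]] = None
        self.reload_if_changed()  # initial load

    def _fingerprint(self) -> Tuple[int, int, int, int]:
        # os.stat follows symlinks, so a kubelet symlink swap on a Secret volume is detected.
        cert, key = os.stat(self.cert_path), os.stat(self.key_path)
        return (cert.st_mtime_ns, cert.st_size, key.st_mtime_ns, key.st_size)

    def reload_if_changed(self) -> bool:
        current = self._fingerprint()
        if current == self._seen:
            return False
        self._load(self.cert_path, self.key_path)
        self._seen = current
        return True
```

With the standard library, the loader could be `ssl.SSLContext.load_cert_chain` on the server's context, called from a periodic timer; my understanding is that handshakes started after the reload use the new certificate, but verify that against your server framework before relying on it.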
Four simultaneous system-design problems: low-latency issuance, globally-distributed revocation, 7-year audit retention, and Kubernetes automation.
| Problem | Decision | Trade-off |
|---|---|---|
| OCSP at 8B req/day | BGP Anycast + pre-signed Redis cache; DB is never in the hot path | Cache TTL vs revocation freshness — 24 h is the CA/B Forum floor |
| Root CA protection | Air-gapped HSM; only the intermediate CA is online; Root signs intermediates in quarterly ceremonies | Speed of intermediate rotation vs security; quarterly is the industry norm |
| CT log submission latency | Submit to 3 logs in parallel, require 2 successes, timeout 250 ms; maintain 5 log relationships | Latency budget vs log diversity; more logs = more resilience at marginal cost |
| DCV reuse | Cache validated domains for 90 days (down from 825); reuse avoids re-challenge on renewal | Shorter reuse window = more DCV overhead; shorter lifetime cert = more renewals |
| Audit trail | Every issuance event → Kafka topic (immutable) → S3 Iceberg table (7-year retention) | Storage cost is trivial vs compliance cost of gaps; append-only is non-negotiable |
| Serial number | 20 bytes from os.urandom(), top bit cleared; UNIQUE constraint catches the vanishingly unlikely collision | No sequential leak to CT log observers; meets the 64-bit entropy requirement (CA/B Forum Ballot 164) |
| 47-day cert lifetime (2029) | ACME becomes mandatory in practice; issuance pipeline must scale 6× without human review | Revocation matters less (47-day exposure window); automation matters more |
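The CT row in the table above (submit in parallel, require 2 successes, 250 ms timeout) is a quorum pattern. Here is a minimal sketch with the log clients passed in as plain callables; `collect_scts` and the submitter functions are hypothetical stand-ins for real CT log clients.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from typing import Callable, List, Sequence


def collect_scts(
    submitters: Sequence[Callable[[bytes], bytes]],
    precert: bytes,
    required: int = 2,
    timeout_s: float = 0.25,
) -> List[bytes]:
    """Submit to every log at once; return as soon as `required` SCTs arrive."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(submitters)))
    futures = [pool.submit(submit, precert) for submit in submitters]
    scts: List[bytes] = []
    try:
        for future in as_completed(futures, timeout=timeout_s):
            try:
                scts.append(future.result())
            except Exception:
                continue  # one log being down is expected; the others may still answer
            if len(scts) >= required:
                break
    except FuturesTimeout:
        pass  # budget exhausted; fall through to the quorum check
    finally:
        pool.shutdown(wait=False)  # do not block issuance on slow logs
    if len(scts) < required:
        raise RuntimeError(f"only {len(scts)} of {required} required SCTs within {timeout_s}s")
    return scts
```

Slow submissions keep running in the background after the function returns; a production version would also need to bound and cancel those, which this sketch leaves out.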
These questions separate the staff-level answer from the senior one — curl behavior, REST API flows, lifecycle triggers, and how certs accumulate inside a real machine.
When I run curl https://example.com, what cert-related work does curl actually do?
curl opens TCP, initiates TLS, validates the chain against /etc/ssl/certs/ca-certificates.crt (Linux) or the system Keychain (macOS) — not the browser's store. Use curl -v to see every handshake message; curl --cacert custom.crt to override; curl -k to skip (never in production).
How do you use curl to inspect a certificate without a browser?
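curl -vI https://example.com prints the certificate's subject, issuer, and validity dates as part of the verbose handshake output. For every field, openssl s_client -connect example.com:443 -showcerts </dev/null | openssl x509 -noout -text dumps the full certificate. In Kubernetes: kubectl get secret tls-cert -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text.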
curl returns SSL certificate problem: unable to get local issuer certificate. Walk me through diagnosis.
Most often the server is not sending the intermediate. Run openssl s_client -connect host:443 -showcerts; if only one cert comes back, fix nginx: ssl_certificate must point to a leaf + intermediate bundle. Root cause: server misconfiguration at deployment time.
What does curl --resolve example.com:443:1.2.3.4 https://example.com do, and when is it useful?
--resolve overrides DNS but sends the correct SNI hostname — test a new cert before DNS cutover without affecting live traffic. Use to verify a Kubernetes Ingress picked up the renewal.
How does the ACME REST API flow work, end to end? What are the actual HTTP calls?
ACME (RFC 8555) is JSON-over-HTTPS: Directory → New Account → New Order → Authorization (get challenge) → Provision challenge → Challenge Ready → Poll until valid → Finalize (post CSR) → Download PEM chain. Every POST is JWS-signed; replay protection via server-issued Replay-Nonce.
DigiCert also has a non-ACME REST API (CertCentral). How does that differ from ACME?
CertCentral (POST /services/v2/order/certificate/ssl_plus) supports OV/EV with human approval workflows, multi-year subscriptions, and webhook callbacks — things ACME can't do. Use ACME for automation, CertCentral API for enterprise OV/EV or DigiCert MPKI for IoT.
How does DigiCert's REST API handle idempotency? What happens if my client crashes mid-order and retries?
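ACME (RFC 8555) does not define idempotency keys, but servers commonly key pending orders by account + identifier set, so re-posting the same identifiers returns the existing order rather than a duplicate; confirm this against the CA's ACME documentation. For CertCentral, store the order ID on the first successful POST and use it on retries.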
DigiCert doesn't "push" certs to you — so what triggers a new certificate being issued?
DigiCert is always pull, never push. Four triggers: manual (CertCentral UI), ACME client cron (certbot/acme.sh checks expiry), cert-manager controller (notAfter - renewBefore < now), or CI/CD pipeline webhook. DigiCert has no visibility into your cert's expiry unless you use its managed renewal service.
How does cert-manager decide it's time to renew? Walk through the controller logic.
cert-manager reconciles every Certificate CRD: if notAfter - renewBefore < now, it creates a CertificateRequest (CSR wrapper), triggering the Issuer. The controller also watches the Secret — missing or expiring certs trigger immediately. The grace period ensures the new cert is written before the old one expires.
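The renewal rule reduces to one comparison. A minimal sketch of it, using the 90-day duration and 720h renewBefore from the Certificate manifest; the function names are illustrative and this is not cert-manager's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def renewal_time(not_after: datetime, renew_before: timedelta) -> datetime:
    """Moment at which renewal should start: notAfter minus renewBefore."""
    return not_after - renew_before


def needs_renewal(not_after: datetime, renew_before: timedelta, now: Optional[datetime] = None) -> bool:
    """True once the current time has reached the renewal time."""
    now = now or datetime.now(timezone.utc)
    return renewal_time(not_after, renew_before) <= now
```

For a cert issued on day 0 with a 2160h duration and a 720h renewBefore, the function flips to True on day 60, leaving 30 days for retries before expiry.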
What happens if renewal fails and nobody notices? How do you design for this?
Alert on tls_cert_days_remaining < 14 (critical) and < 30 (warning) + Certificate.Ready=False for >1 h + external blackbox probe from outside the cluster. Runbook: if ACME fails, check kubectl describe certificate, verify DNS solver permissions, manually re-trigger by deleting the Secret.
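The two thresholds above map directly onto a severity function. This sketch is what a probe or exporter could compute before emitting the metric; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def days_remaining(not_after: datetime, now: datetime) -> float:
    """Days until expiry; negative once the cert has expired."""
    return (not_after - now).total_seconds() / 86400


def cert_alert_level(days_left: float) -> str:
    """Map remaining lifetime to the thresholds above: under 14 critical, under 30 warning."""
    if days_left < 14:
        return "critical"
    if days_left < 30:
        return "warning"
    return "ok"
```

Running this from a probe outside the cluster matters as much as the thresholds: it still fires when the cert-manager controller itself is the component that is down.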
How does a certificate "get onto" a user's computer? Nobody installs DigiCert manually.
Root CAs are bundled with the OS — Apple ~170, Microsoft 200+, Mozilla has its own program. When you connect, the server sends the leaf + intermediate chain; your machine verifies against a root it already trusts. The leaf is never stored permanently. What accumulates: intermediate CA cache (Chrome/Firefox), HSTS preload entries.
If a root CA is added to my macOS Keychain, what exactly changes? And how does software know to use it?
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain ca.crt writes a trust record used by Safari, curl (SecureTransport), and native apps — but not Firefox, Node.js, or Java (each bundles its own store). In enterprise, MDM (Jamf, Intune) pushes root CAs to all managed devices.
How do certs "sink" — accumulate and go stale on a machine over time?
Stale intermediate cache (Chrome/Firefox), Keychain bloat (old self-signed + corporate CAs), JVM cacerts growth (never pruned after rotation), Docker base images with expired roots baked in. Fix: treat CA bundles as versioned artifacts in the build pipeline; prune expired entries on a schedule.
A root CA is removed from the Apple/Mozilla root program. What happens to machines that already have it?
Removal is distributed via OS/browser updates — machines that update stop trusting certs from that root; unpatched machines still trust it. Operators: monitor which CA issued your certs; have a runbook to re-issue from a different CA within 24 hours. Never be single-CA dependent for critical services.
What is certificate pinning, and why is it almost always a mistake?
Pinning hard-codes the expected public key hash — if the cert renews without updating the pin, all users are locked out. Google removed Chrome's HPKP in 2018 after catastrophic incidents; CT solves the rogue-CA problem more gracefully. If you must pin: pin the root or intermediate, never the leaf.
What is SNI and why does it matter for multi-tenant hosting?
SNI carries the hostname in plaintext in the ClientHello, enabling a single IP to serve thousands of certs. It leaks the hostname to ISPs and firewalls. Encrypted Client Hello (ECH) is the in-progress IETF fix; Cloudflare and Firefox support it experimentally.
Your company uses a TLS inspection proxy (MITM). How does that work cryptographically, and what breaks?
A TLS inspection proxy terminates TLS and re-establishes a new connection using a dynamically-generated cert from the corporate CA (pushed via MDM). What breaks: certificate pinning (key mismatch), CT verification (dynamic cert was never CT-logged), and client mTLS. It is a deliberate enterprise MITM trading user privacy for DLP visibility.
What is a wildcard certificate? When should you use one, and what are its security limits?
Wildcard (*.example.com) covers single-label subdomains only: not the apex (example.com) and not deeper names (a.b.example.com). One key compromise exposes every subdomain the cert covers. Use per-service certs for sensitive endpoints (payment API, auth server); wildcard only for services you trust and control equally.
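The single-label rule is easy to get wrong in hand-rolled validators, so here is a minimal sketch of it. It handles only a full leftmost-label wildcard and is not a complete hostname-verification implementation; the function name is illustrative.

```python
def san_matches(pattern: str, hostname: str) -> bool:
    """True if a SAN dNSName (optionally starting with '*.') covers the hostname."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        return pattern == hostname
    suffix = pattern[1:]  # "*.example.com" -> ".example.com"
    if not hostname.endswith(suffix):
        return False  # also rejects the apex, which lacks the leading dot
    leftmost = hostname[: -len(suffix)]
    return leftmost != "" and "." not in leftmost  # exactly one extra label
```

`san_matches("*.example.com", "api.example.com")` is True, while both `example.com` and `a.b.example.com` are rejected, which is why the manifest earlier lists the apex and the wildcard as separate dnsNames.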
What is DANE and why hasn't it replaced CA-signed certs?
DANE publishes a cert fingerprint in a DNSSEC-secured TLSA record — no CA needed. Browser support is near-zero (Chrome/Safari never shipped it; Firefox removed it). DNSSEC adoption is too low and CT + short lifetimes solve the same rogue-CA problem without requiring it.
DigiCert had a misissuance that required revoking 83,000 certificates in 24 hours. How do you design the operator-side response?
Inventory via curl "https://crt.sh/?q=%25.example.com&output=json". If using cert-manager: delete affected Secrets to trigger immediate re-issuance from a secondary ClusterIssuer (have Let's Encrypt fallback pre-configured). Verify OCSP flips to revoked within 24 h. The 24-hour window is too short to design the response under pressure — have the runbook before the incident.
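Once the crt.sh JSON is downloaded, the inventory step is a filter and a de-duplication. The sketch below assumes the response is a list of objects with `issuer_name`, `name_value`, `serial_number`, and `not_after` fields; that matches my understanding of crt.sh output but should be checked against a live response. The function name is illustrative.

```python
import json
from datetime import datetime
from typing import Dict, List


def affected_certs(crtsh_json: str, issuer_substring: str, now: datetime) -> List[dict]:
    """Unexpired certs from the given issuer, de-duplicated by serial number."""
    by_serial: Dict[str, dict] = {}
    for entry in json.loads(crtsh_json):
        if issuer_substring.lower() not in entry["issuer_name"].lower():
            continue
        not_after = datetime.fromisoformat(entry["not_after"])
        if not_after <= now:
            continue  # already expired; nothing to replace
        # Precertificate and final certificate share a serial, so keep one row per serial.
        by_serial[entry["serial_number"]] = {
            "serial_number": entry["serial_number"],
            "names": entry["name_value"].split("\n"),
            "not_after": not_after,
        }
    return sorted(by_serial.values(), key=lambda cert: cert["not_after"])
```

The output is the worklist for the runbook: each row names the hosts whose Secrets need re-issuance from the secondary issuer.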
What is the difference between DV, OV, and EV certificates? Does EV still matter?
DV proves domain control only (automated, minutes). OV additionally verifies legal org via WHOIS/phone. EV is the strongest vetting — Chrome/Safari removed the green bar in 2019, making EV visually indistinguishable from DV. EV is now primarily a compliance checkbox (PCI-DSS, financial regulators); DV is sufficient for most public sites.
A developer checks a private key into git. What do you do?
Treat as confirmed compromise. Revoke immediately (CA/B Forum requires <24 h), rotate the key, purge git history (git filter-repo / BFG), audit server access logs for the exposure window. Post-mortem: add pre-commit hooks (truffleHog), enable GitHub secret scanning, migrate all key material to Vault or AWS Secrets Manager.
CLM (Certificate Lifecycle Management) is the generic industry term for the practice of tracking, issuing, renewing, and revoking certificates across an organization — knowing where every cert is, who owns it, and when it expires. Without CLM, enterprises discover expired certs when production goes down at 2 AM.
TLM (Trust Lifecycle Manager) is DigiCert's commercial product name for their CLM platform — it's their branded answer to the question "how does a Fortune 500 manage 50,000 certificates across 200 systems?" TLM adds discovery (scan your network for certs you didn't know existed), policy enforcement (alert if someone installs a self-signed cert), automation hooks (integrate with Venafi, ServiceNow, HashiCorp Vault), and executive dashboards showing cert risk posture.
| | CLM (generic practice) | TLM (DigiCert product) |
|---|---|---|
| What it is | The discipline of managing cert inventory and lifecycle | DigiCert's SaaS platform implementing CLM |
| Key functions | Issuance, renewal, revocation, expiry alerting | All of CLM + network discovery, policy engine, ITSM integrations, compliance reporting |
| Who uses it | Any org with >50 certs should practice CLM | Enterprises with 1,000s of certs, complex PKI, regulatory requirements |
| Alternatives | Venafi, AppViewX, Sectigo SCM, HashiCorp Vault + cert-manager | Competing enterprise CLM platforms (Venafi TLS Protect, AppViewX CERT+) |
| CA | Price | Best for | Key facts |
|---|---|---|---|
| DigiCert | $$$ | Enterprise, OV/EV, IoT, code signing, MPKI | Largest commercial CA by revenue; acquired Symantec's CA business (2017), QuoVadis, Verizon Business; TLM platform for enterprise CLM; global Anycast OCSP |
| Let's Encrypt | Free | Web servers, startups, automation, Kubernetes | Non-profit (ISRG); DV only; 90-day certs; ACME-first; issues ~4M certs/day; backed by EFF, Mozilla, Cisco; no OV/EV; no wildcard until 2018; the default for cert-manager |
| Google Trust Services (GTS) | Free (via Google products) | Google Cloud, Firebase, Google Workspace | Google's own CA; used internally for google.com and all GCP-managed certs; GCP Certificate Manager issues from GTS automatically; not available as a standalone commercial CA for external use |
| Sectigo (was Comodo CA) | $–$$ | SMB, cheap DV/OV/EV, resellers | Largest CA by volume (mostly via resellers); rebranded from Comodo CA in 2018 after security incidents; competitive pricing; large reseller network; SCM for CLM |
| SSL.com | $–$$ | Budget DV/OV/EV, document signing, S/MIME | US-based; competitive pricing; strong S/MIME and document signing portfolio; ACME support; good for orgs needing OV/EV at lower cost than DigiCert |
| Entrust | $$$ | Government, financial services, identity | Strong in public sector and banking; PKI-as-a-service; nCipher HSM integration; competes directly with DigiCert in enterprise |
| GlobalSign | $$ | Mid-market, IoT, MPKI | Subsidiary of GMO Internet; strong in IoT device identity; Atlas managed PKI platform; good ACME support |
When you buy a domain on GoDaddy or Squarespace and HTTPS "just works," someone made a CA partnership decision on your behalf. Here's how it works:
| Option | Cost | How | Limitations |
|---|---|---|---|
| Let's Encrypt | Free | ACME client (certbot, acme.sh, cert-manager). Runs on any server. Renews automatically every 60–90 days. | DV only. Rate limits (50 certs/domain/week). No phone support. No EV. No wildcard on http-01 (dns-01 required for wildcard). |
| Cloudflare Free | Free | Point your DNS to Cloudflare. Universal SSL is automatic. Zero config. | Cert is Cloudflare's — your origin can use a self-signed cert for the Cloudflare-to-origin leg. You don't control the leaf cert directly. |
| AWS ACM | Free | Provision via console or Terraform. Auto-renews. Attached to ALB/CloudFront. | Not exportable. Only works with AWS services. Private key never accessible. |
| GCP-managed certs | Free | One annotation on a Kubernetes Ingress or GCP Load Balancer resource. Google handles issuance and renewal via GTS. | GCP-only. DV only. No custom CA. |
| Sectigo / SSL.com DV | $7–15/yr | Buy via reseller (Namecheap, SSLs.com). Good if you need a single DV cert with a commercial CA name for client trust. | Manual renewal unless you use their ACME endpoint. OV costs more. |
| Self-signed (internal only) | Free | openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes | Red X in browsers. Only usable for internal tools, local dev, or mTLS between services you control with a custom trust root pushed to all clients. |
Canvas (Instructure) is the LMS used by millions of students and teachers. In a real-world breach scenario, here's where certificates are both the attack surface and the defense:
mydistrict.instructure.com. If an attacker can get a TLS certificate for that subdomain, even fraudulently, they can stand up a convincing phishing site that shows a valid padlock, harvest student login credentials, and access gradebooks, 504 plans, and IEP documents. Before Certificate Transparency, this was possible without detection. A rogue CA could issue a cert for mydistrict.instructure.com and nobody would know.
*.instructure.com appears in public logs within minutes. Instructure (and DigiCert's monitoring service) watches CT logs for their domains. An unauthorized cert triggers an immediate alert, with a revocation request filed within the hour. The attack window shrinks from "months before discovery" (DigiNotar 2011) to "minutes." Google's Safe Browsing also ingests CT alerts and can push warnings to Chrome users visiting a phishing site using a recently-logged suspicious cert.
mkcert is a tool that creates a locally-trusted CA and issues certs for localhost and *.test domains; the cert is trusted only on the machine where mkcert installed its CA root. This avoids the red X in local dev without adding self-signed exceptions to the system trust store, and the locally-trusted CA is never exposed to the internet.
| Provider | Service | CA used | Export key? | Best for |
|---|---|---|---|---|
| AWS | ACM (Certificate Manager) | Amazon Trust Services (own CA) | No — HSM-backed | ALB, CloudFront, API Gateway, Elastic Beanstalk; free; auto-renews |
| AWS | AWS Private CA (formerly ACM Private CA) | Your own CA hosted in AWS | Yes (for private certs) | Internal mTLS, IoT, on-prem hybrid; ~$400/CA/month |
| GCP | Certificate Manager | Google Trust Services | No | GCP Load Balancers, GKE Ingress; free; annotation-driven |
| GCP | Certificate Authority Service | Your own CA on GCP | Yes | Enterprise PKI, device identity, mTLS; pay-per-cert |
| Azure | App Service Managed Cert | DigiCert (via Microsoft partnership) | No | App Service apps; free for apex + subdomains |
| Azure | Key Vault Certificates | DigiCert or GlobalSign (configurable) | Yes (if policy allows) | Enterprise cert storage, rotation, policy; integrates with AKS |
| Cloudflare | Universal SSL | Cloudflare CA / Let's Encrypt / DigiCert | No (edge cert) | Any site behind Cloudflare; free; automatic |
SaaS applications like Salesforce use certificates at three distinct layers, each with different ownership and lifecycle:
Salesforce-owned domains (login.salesforce.com, yourorg.my.salesforce.com): Salesforce issues and manages these certs (DigiCert or GTS). They renew automatically. If they expire, it's Salesforce's outage, not yours. Zero customer action required.

Customer-owned custom domains: if you run a custom domain such as crm.example.com pointing to Salesforce, you own and upload that cert. Salesforce's My Domain feature accepts a PEM cert + private key from DigiCert, Sectigo, or any CA. You are responsible for renewal. This is the #1 source of Salesforce cert expiry outages in enterprise: a domain team uploads a 1-year cert, the Salesforce admin who set it up leaves, and 12 months later CRM goes dark.

A typical modern cloud-native stack uses certificates at every layer simultaneously, and they're all managed differently:
The operational challenge: these six layers each have different renewal schedules, different owners, and different failure modes. A mature org has a cert inventory (from CLM/TLM or a Prometheus exporter) that surfaces all of them in one dashboard — so the 2 AM alert is "cert expires in 13 days" not "site is down."
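The inventory idea can be sketched in a few lines of Python. This is a minimal illustration rather than a real CLM/TLM integration: the `CertRecord` type, the 30-day and 14-day thresholds, and the sample hostnames and owners are all assumptions made up for the example. In practice the records would come from whatever inventory source the organization already has.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CertRecord:
    """One row of a hypothetical cert inventory."""
    name: str            # hostname, Secret name, or other identifier
    owner: str           # team that should receive the alert
    not_after: datetime  # certificate expiry, in UTC


def classify(cert, now, warn_days=30, page_days=14):
    """Return 'expired', 'page', 'warn', or 'ok' for one certificate."""
    remaining = cert.not_after - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= timedelta(days=page_days):
        return "page"
    if remaining <= timedelta(days=warn_days):
        return "warn"
    return "ok"


def expiry_report(inventory, now):
    """Sort by soonest expiry and attach days remaining plus a severity."""
    rows = sorted(inventory, key=lambda c: c.not_after)
    return [(c.name, c.owner, (c.not_after - now).days, classify(c, now))
            for c in rows]


# Illustrative data only; a fixed "now" keeps the example deterministic.
NOW = datetime(2025, 1, 1, tzinfo=timezone.utc)
INVENTORY = [
    CertRecord("crm.example.com", "salesforce-admins", NOW + timedelta(days=13)),
    CertRecord("api.example.com", "platform", NOW + timedelta(days=60)),
    CertRecord("mesh-intermediate", "security", NOW + timedelta(days=25)),
    CertRecord("legacy.example.com", "unknown", NOW - timedelta(days=2)),
]
REPORT = expiry_report(INVENTORY, NOW)

for row in REPORT:
    print(row)
```

Sorting by expiry rather than by layer or owner is deliberate: the dashboard's job is to put the next failure at the top regardless of which team or platform the cert belongs to.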
| Dimension | Weak answer | Strong answer |
|---|---|---|
| What is a cert? | "A file that proves identity" | X.509 v3: serial, SAN, EKU, OCSP URL, CT SCTs, CA signature — and walks the chain |
| Trust stores | Vague about "the browser's cert store" | Names macOS Keychain vs JVM cacerts vs NSS; explains why "works in Chrome not curl" |
| TLS handshake | "Client connects and a key is exchanged" | Describes 1-RTT TLS 1.3, what client verifies (chain, SAN, OCSP, CT), OCSP stapling |
| Revocation | "The cert is revoked" | Differentiates CRL/OCSP/CRLite; names the soft-fail problem; explains OCSP stapling and CRLite Bloom filter |
| Schema | Single certs table | CA table, cert table, revocation (append-only), dcv_event, ocsp_response_cache (hot path separated); serial number = CSPRNG not sequence |
| Kubernetes | "Use Let's Encrypt" | cert-manager + ClusterIssuer, SPIFFE/SPIRE for workload identity, Istio mTLS + AuthorizationPolicy, hot-reload solution |
| 47-day implication | Doesn't mention it | 6× issuance volume; ACME mandatory; revocation less critical; monitoring dashboards required |
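The "serial number = CSPRNG not sequence" point from the Schema row can be made concrete with a short sketch. It assumes 128 bits of randomness per serial, which is a choice made for this example; the constraints it enforces are that the serial is positive, carries at least 64 bits of CSPRNG output (the CA/Browser Forum baseline), and fits in 20 octets when DER-encoded (the RFC 5280 limit). The function names are hypothetical.

```python
import secrets

MIN_ENTROPY_BITS = 64    # CA/Browser Forum Baseline Requirements minimum
MAX_SERIAL_OCTETS = 20   # RFC 5280 upper bound on the encoded serial
MAX_RANDOM_BITS = 159    # largest positive value that still fits in 20 octets


def new_serial(bits=128):
    """Draw a positive serial number from the OS CSPRNG."""
    if not MIN_ENTROPY_BITS <= bits <= MAX_RANDOM_BITS:
        raise ValueError("bits must be between 64 and 159")
    while True:
        serial = secrets.randbits(bits)
        if serial > 0:  # zero is not a valid serial; redraw
            return serial


def serial_to_der_octets(serial):
    """Encode a positive serial as the content octets of a DER INTEGER.

    DER integers are signed, so a leading 0x00 is needed whenever the
    top bit of the first byte would otherwise be set.
    """
    if serial <= 0:
        raise ValueError("serial must be positive")
    length = serial.bit_length() // 8 + 1
    return serial.to_bytes(length, "big")


serial = new_serial()
encoded = serial_to_der_octets(serial)
print(hex(serial), len(encoded))
```

A database sequence would let anyone who sees two certs estimate issuance volume and predict upcoming serials. Uniqueness still has to be enforced separately, for example with a unique index on (issuer, serial), since randomness alone makes collisions unlikely but not impossible.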