Docs: acme_server lifetime constraint is incomplete — must account for the intermediate's renewal_window_ratio #546

Description

@73VW

Page

acme_server directive (and the pki / intermediate_lifetime and renewal_window_ratio options it interacts with).

What the docs currently say

For the acme_server lifetime option:

lifetime (Default: 12h) is a duration which specifies the validity period for issued certificates. This value must be less than the lifetime of the intermediate certificate used for signing. It is not recommended to change this unless absolutely necessary.

The problem

The documented constraint — lifetime < intermediate_lifetime — is necessary but not sufficient, and it is misleading precisely in the situation where a reader consults it (i.e. when raising lifetime).

Caddy renews the CA's intermediate when a fraction of its lifetime remains, governed by the PKI renewal_window_ratio (default 0.2). Until that rotation happens, the old intermediate is still the active signer, so a leaf issued just before rotation is signed by an intermediate with only about renewal_window_ratio × intermediate_lifetime of validity left. If a leaf's lifetime exceeds that remaining window, the leaf will outlive the intermediate that signed it.

When that happens, the served chain becomes invalid the moment the old intermediate expires — even though the leaf itself is still valid — and stays broken until the leaf is renewed and re-chained onto the fresh intermediate.

So the real constraint is:

lifetime  <  renewal_window_ratio(PKI) × intermediate_lifetime

Concrete failing example (all values otherwise documented as valid)

  • acme_server lifetime = 72h
  • intermediate_lifetime = 168h (the 7-day default)
  • PKI renewal_window_ratio = 0.2 (default)

Check against the documented rule: 72h < 168h ✓ — looks fine.

Check against the actual rule: 0.2 × 168h = 33.6h, and 72h > 33.6h ✗ — broken.
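To make the two checks above easy to reproduce, here is a minimal Python sketch. The function names are ours (hypothetical, not part of Caddy), and the ratio-aware rule is the one proposed in this issue, not something Caddy enforces as far as we know.

```python
from datetime import timedelta


def documented_rule_ok(lifetime: timedelta, intermediate_lifetime: timedelta) -> bool:
    """The rule as currently documented: lifetime < intermediate_lifetime."""
    return lifetime < intermediate_lifetime


def ratio_aware_rule_ok(
    lifetime: timedelta,
    intermediate_lifetime: timedelta,
    renewal_window_ratio: float,
) -> bool:
    """The rule proposed in this issue:
    lifetime < renewal_window_ratio * intermediate_lifetime."""
    return lifetime < renewal_window_ratio * intermediate_lifetime


# Values from the failing example above.
lifetime = timedelta(hours=72)
intermediate_lifetime = timedelta(hours=168)
renewal_window_ratio = 0.2

print(documented_rule_ok(lifetime, intermediate_lifetime))  # True
print(ratio_aware_rule_ok(lifetime, intermediate_lifetime, renewal_window_ratio))  # False
```

With the 12h default lifetime both checks pass, which matches the behaviour we saw before raising lifetime.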

Concretely: set the intermediate's birth at t = 0, so it expires at t = 168h and is rotated at t = intermediate_lifetime − renewal_window_ratio × intermediate_lifetime = 134.4h (the last instant the old intermediate is the active signer). A leaf (re)issued at t expires at t + lifetime, and outlives its signer when t + 72 > 168, i.e. t > 96h. Combined with t ≤ 134.4h, the issuance danger window is:

intermediate_lifetime − lifetime   <   t   ≤   intermediate_lifetime − renewal_window_ratio × intermediate_lifetime
            96h                     <   t   ≤                       134.4h

Its width is exactly lifetime − renewal_window_ratio × intermediate_lifetime = 72h − 33.6h = 38.4h — i.e. the amount by which the safety inequality is violated (when the inequality holds, this width is negative and the window does not exist, so the chain never breaks).
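The window arithmetic can be sketched as a small helper. This assumes the simplified model used in this issue (intermediate born at t = 0, rotated exactly when renewal_window_ratio of its lifetime remains); danger_window is a hypothetical name, and all values are in hours.

```python
from typing import Optional, Tuple


def danger_window(
    lifetime_h: float,
    intermediate_lifetime_h: float,
    renewal_window_ratio: float,
) -> Optional[Tuple[float, float]]:
    """Return (start, end) in hours since the intermediate's birth such that a
    leaf issued at start < t <= end outlives its signer, or None if no such
    window exists."""
    # Leaves issued after this point expire later than the intermediate.
    start = max(0.0, intermediate_lifetime_h - lifetime_h)
    # Last instant the old intermediate is still the active signer.
    end = intermediate_lifetime_h * (1 - renewal_window_ratio)
    if end <= start:
        return None
    return start, end


window = danger_window(72, 168, 0.2)
if window is not None:
    start, end = window
    print(f"danger window: {start:.1f}h < t <= {end:.1f}h, width {end - start:.1f}h")
    # prints: danger window: 96.0h < t <= 134.4h, width 38.4h

print(danger_window(12, 168, 0.2))  # None: the 12h default is safe
```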

Note this counts renewals, not just first issuance: a leaf whose renewal falls inside the window is re-chained onto the soon-to-expire intermediate and orphaned, even if it was originally issued outside it. Because leaves renew continuously across a fleet, a (re)issuance lands in this window on essentially every intermediate rotation, so the outage recurs.
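A rough simulation illustrates why a fleet hits the window so reliably. It assumes each client renews every 48h (two thirds of the 72h lifetime, which is our assumption about client behaviour, not something the docs state) and that first issuances are spread one hour apart. The helper name is hypothetical and only a single intermediate cycle is modelled.

```python
def hits_danger_window(
    first_issue_h: float,
    renew_every_h: float,
    lifetime_h: float,
    intermediate_lifetime_h: float,
    renewal_window_ratio: float,
) -> bool:
    """True if any (re)issuance of this leaf lands in the danger window of an
    intermediate born at t = 0."""
    start = intermediate_lifetime_h - lifetime_h
    end = intermediate_lifetime_h * (1 - renewal_window_ratio)
    t = first_issue_h
    # Walk through issuance times while the old intermediate is still signing.
    while t <= end:
        if t > start:
            return True
        t += renew_every_h
    return False


# 48 leaves whose first issuance is staggered by one hour each.
offsets = range(48)
orphaned = [o for o in offsets if hits_danger_window(o, 48, 72, 168, 0.2)]
print(f"{len(orphaned)} of {len(offsets)} leaves get orphaned this cycle")
```

Under these assumptions most of the staggered leaves land in the window within a single cycle, so with more than a handful of clients at least one broken chain per rotation is close to certain.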

How we hit this in practice (realistic user journey)

We run an internal staging network where two ingress reverse proxies act as the internal CA (Caddy pki app) and expose an ACME endpoint. Backend hosts (reachable over internal dynamic DNS) run Caddy as ACME clients and obtain leaf certs from that CA; the reverse proxies talk to the backends over HTTPS and verify the upstream certificate.

We had to set up one machine with Nginx + acme.sh for a specific reason. The acme.sh cron job won't renew on a window tighter than ~24h, so with the 12h default lifetime the cert expired before the cron job renewed it.

We followed the docs:

  1. With the 12h default lifetime, the cert expired before the cron renewed it — a legitimate reason to raise lifetime ("absolutely necessary" was genuinely met).
  2. We raised lifetime to 72h as a safety margin.
  3. We checked the documented constraint: 72h < 168h ✓.
  4. The "not recommended to change this unless absolutely necessary" note correctly drew our attention to lifetime being the sensitive knob — but nothing indicated that changing lifetime requires also revisiting intermediate_lifetime through the renewal ratio.

Result: a guaranteed outage on every ~7-day intermediate rotation. Leaf certs were still valid, the CA was renewing its intermediate correctly on disk — the only fault was the undocumented coupling between lifetime, intermediate_lifetime, and the PKI renewal_window_ratio.

This is not a Caddy bug; the behaviour is internally consistent. It is purely a documentation gap, and the current wording actively reinforces the mistake.

Suggested doc change

Replace the bare < intermediate_lifetime constraint with the ratio-aware one, and add the failing example. Proposed wording for the lifetime description:

  • lifetime (Default: 12h) is a duration which specifies the validity period for issued certificates.

    ⚠️ This value must be less than renewal_window_ratio × intermediate_lifetime, using the renewal_window_ratio of the intermediate certificate used for signing.

    It is not recommended to change this unless absolutely necessary. If you do raise it, raise intermediate_lifetime accordingly so the inequality still holds.

    The intermediate is rotated while renewal_window_ratio × intermediate_lifetime of its validity remains, and a leaf issued (or renewed) just before that rotation inherits the old intermediate as its signer. If the leaf's lifetime exceeds the intermediate's remaining validity at that point, the leaf outlives its signing intermediate: the served chain becomes invalid as soon as that intermediate expires — even though the leaf itself is still valid — until the leaf is renewed.
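For readers who do need a longer leaf lifetime, the proposed inequality can be turned around to give a lower bound on intermediate_lifetime. A minimal sketch (the helper name is hypothetical, values in hours):

```python
def min_intermediate_lifetime_h(lifetime_h: float, renewal_window_ratio: float) -> float:
    """Lower bound (exclusive) on intermediate_lifetime, in hours, such that
    lifetime < renewal_window_ratio * intermediate_lifetime holds."""
    if not 0 < renewal_window_ratio < 1:
        raise ValueError("renewal_window_ratio must be strictly between 0 and 1")
    return lifetime_h / renewal_window_ratio


bound = min_intermediate_lifetime_h(72, 0.2)
print(f"intermediate_lifetime must exceed {bound:.0f}h ({bound / 24:.0f} days)")
# prints: intermediate_lifetime must exceed 360h (15 days)
```

So in our case a 72h leaf lifetime with the default ratio of 0.2 needs an intermediate_lifetime of more than 15 days, rather than the 7-day default.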

A cross-reference from the intermediate_lifetime and renewal_window_ratio options back to this constraint would help too.

Environment

  • Caddy v2.11.4 (pki app as internal CA + acme_server), ACME clients on separate hosts.
  • Not version-specific — the relationship between the three options is the same across recent 2.x.
