Merged
29 changes: 29 additions & 0 deletions README.md
@@ -40,6 +40,35 @@ pdflatex htmltrust.tex

The compiled PDF will be output as `paper/htmltrust.pdf`.

## Known Issue: Runtime DOM Mutation Breaks Verification

HTMLTrust signs the **static HTML** that leaves the publishing pipeline. Browser verifiers, however, read the **live DOM**: the state of the page after every script on the page has finished running. If anything inside a `<signed-section>` is mutated between page load and verification, the verifier's recomputed `content-hash` will not match the signed one, and the signature will be reported as invalid even though it is cryptographically correct.
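
The mismatch can be reproduced in a few lines. The following is a minimal sketch, not the HTMLTrust reference implementation: `canonicalize` is a placeholder that only collapses whitespace (the real canonicalization is defined by the spec), the standard Base64 alphabet is an assumption, and the function and variable names are hypothetical. Only the `sha256:` prefix and the unpadded Base64 encoding are taken from the paper.

```python
import base64
import hashlib
import re


def canonicalize(html: str) -> str:
    """Placeholder canonicalization (assumption): collapse whitespace runs."""
    return re.sub(r"\s+", " ", html).strip()


def content_hash(html: str) -> str:
    """Return 'sha256:' plus the unpadded Base64 digest of the canonicalized content."""
    digest = hashlib.sha256(canonicalize(html).encode("utf-8")).digest()
    return "sha256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# What the signer hashed at publish time.
static_html = "<pre><code>pdflatex htmltrust.tex</code></pre>"

# What a browser verifier reads after a theme script has injected a button.
live_dom_html = (
    '<pre><button class="copy-button">Copy</button>'
    "<code>pdflatex htmltrust.tex</code></pre>"
)

signed_hash = content_hash(static_html)
recomputed_hash = content_hash(live_dom_html)
hashes_match = signed_hash == recomputed_hash
print(hashes_match)  # False
```

The signature over `signed_hash` is still valid; the verifier simply never sees the bytes that were signed.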

### Concrete cases we have observed

- **Hugo Blox docs theme** injects a `<button class="copy-button">Copy</button>` into every `<pre>` block at runtime. When a signed page contains a code block, the verifier sees an extra `Copy` token inside the signed region that the signer never saw.
- Any **client-side syntax-highlighting** library (Prism, highlight.js) that rewrites a code block's inner HTML at load time has the same effect.
- **Analytics, lazy-loading, or social-share injection** libraries that add nodes inside content containers will break verification if they touch a signed region.

### Mitigations available today

| Mitigation | Trade-off |
|---|---|
| Ensure no client-side script writes into `<signed-section>` descendants | Simplest, but constrains theme/framework choice |
| Pre-render any decoration server-side (e.g., emit the "Copy" button into the static HTML so the signer hashes it) | Works, but every page-template change requires a re-bake |
| Move runtime-injected decoration **outside** the `<signed-section>` (sibling, not child) | Often the cleanest fix when you control the script |
| Verify against the `outerHTML` of a pristine fetch of the document (no scripts run) instead of the live DOM's `element.innerHTML` | Requires a verifier-side change; doesn't help current extensions |
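
The last mitigation in the table can be sketched as follows. This is an illustration under several assumptions: the required attributes are assumed to sit directly on the `<signed-section>` element, canonicalization is omitted (the raw inner markup is hashed), the regex does not handle nested signed sections, and only the `content-hash` comparison is shown, not signature verification. All function names are hypothetical, and the "fetched" document is simulated with a string.

```python
import base64
import hashlib
import re

# Naive matcher for one signed section (assumption: no nesting).
SIGNED_SECTION = re.compile(
    r"<signed-section\b(?P<attrs>[^>]*)>(?P<inner>.*?)</signed-section>",
    re.DOTALL,
)
CONTENT_HASH_ATTR = re.compile(r'content-hash="(?P<value>[^"]*)"')


def content_hash(inner_html: str) -> str:
    """'sha256:' plus unpadded Base64 digest; canonicalization omitted in this sketch."""
    digest = hashlib.sha256(inner_html.encode("utf-8")).digest()
    return "sha256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def check_content_hashes(document_html: str) -> list[bool]:
    """Compare each section's declared content-hash with a recomputed one."""
    results = []
    for match in SIGNED_SECTION.finditer(document_html):
        declared = CONTENT_HASH_ATTR.search(match.group("attrs"))
        recomputed = content_hash(match.group("inner"))
        results.append(declared is not None and declared.group("value") == recomputed)
    return results


inner = "<pre><code>pdflatex htmltrust.tex</code></pre>"

# The document as served, before any script has run.
pristine_document = (
    '<article><signed-section keyid="example-key" algorithm="ed25519" '
    f'signature="PLACEHOLDER" content-hash="{content_hash(inner)}">'
    f"{inner}</signed-section></article>"
)

# The same document serialized from the live DOM after a script injected a button.
live_document = pristine_document.replace(
    "<pre>", '<pre><button class="copy-button">Copy</button>'
)

pristine_results = check_content_hashes(pristine_document)
live_results = check_content_hashes(live_document)
```

Here `pristine_results` reports a match and `live_results` does not, which is why this approach needs the verifier to re-fetch the document rather than read what the browser currently displays.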

### Open spec question

This is a real, general challenge for any content-signing protocol that targets browser-side verification. The spec needs to give implementations explicit guidance, likely a combination of:

1. **Stage 1 canonicalization** SHOULD define a "skip-on-mutation-marker" mechanism (e.g., `data-htmltrust-ignore="true"` on a subtree) so themes can mark decoration that must be excluded from the hash.
2. **Authoring guidance** SHOULD warn against injecting nodes inside signed regions at runtime.
3. **Verifier guidance** MAY recommend fetching the original document and verifying against that, treating DOM-state verification as a separate, optional capability.
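
To make the first option concrete, here is one way a canonicalizer could drop marked subtrees before hashing. It is a sketch only: the attribute name `data-htmltrust-ignore` comes from the proposal above, but the class and function names are hypothetical, comments are silently dropped, and nothing here reflects a decided spec behaviour.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}
IGNORE_ATTR = "data-htmltrust-ignore"


class IgnoreMarkedSubtrees(HTMLParser):
    """Re-emit markup, skipping any subtree whose root carries the ignore marker."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=False)
        self.parts: list[str] = []
        self.skip_depth = 0  # > 0 while inside an ignored subtree

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_ELEMENTS:
                self.skip_depth += 1
            return
        if dict(attrs).get(IGNORE_ATTR) == "true":
            if tag not in VOID_ELEMENTS:
                self.skip_depth = 1
            return
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: emit or drop as a single unit.
        if self.skip_depth or dict(attrs).get(IGNORE_ATTR) == "true":
            return
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_ELEMENTS:
                self.skip_depth -= 1
            return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&#{name};")


def strip_ignored(html: str) -> str:
    """Return the markup with every marked subtree removed."""
    parser = IgnoreMarkedSubtrees()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


static_html = "<pre><code>pdflatex htmltrust.tex</code></pre>"
live_html = (
    '<pre><button class="copy-button" data-htmltrust-ignore="true">Copy</button>'
    "<code>pdflatex htmltrust.tex</code></pre>"
)
stripped_live_html = strip_ignored(live_html)
```

With the marker in place, `stripped_live_html` equals `static_html`, so both would hash identically. One design consequence worth weighing: anything inside an ignored subtree is unsigned by construction, so a verifier would presumably have to treat it as untrusted content displayed within a signed region.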

This is tracked as an active open design question (see also [open design questions on the implementation page](https://www.htmltrust.org/implementation/#open-design-questions)). Community input is welcome.

## Companion Repositories

| Repository | Description |
2 changes: 1 addition & 1 deletion paper/htmltrust.tex
@@ -59,7 +59,7 @@ \subsection{Signed HTML Blocks}

\paragraph{Required attributes.} \texttt{keyid} (identifies the signer, resolvable per \S2.2), \texttt{signature} (the cryptographic signature, encoded per the hash encoding rules below), \texttt{content-hash} (hash of the canonicalized content, prefixed with the hash algorithm, e.g. \texttt{sha256:\ldots}), and \texttt{algorithm} (the signature algorithm, e.g. \texttt{ed25519}, \texttt{ecdsa}, or \texttt{rsa}).

\paragraph{Hash and signature encoding (open feedback).} Hashes and signatures in this revision are encoded as unpadded Base64, which is shorter than hexadecimal by roughly one-third. We invite community feedback on whether hexadecimal (widespread in tooling such as git and TLS), Base32 (case-insensitive and easier to transcribe by hand), or another encoding would be preferable for broader ecosystem alignment.\footnote{Or Ecoji, anyone? A 32-byte SHA-256 digest encodes to 26 emoji characters via the Ecoji base-1024 alphabet, producing a delightfully unreadable \texttt{content-hash="sha256:<26 emoji from the Ecoji alphabet>"}. It is, alas, \textit{longer} in wire bytes because each emoji occupies four UTF-8 bytes, so we have not adopted it. But we thought you should know it exists.}
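
\paragraph{Encoding lengths (worked example).} As a rough check of the sizes quoted above, and not as normative text, consider a 32-byte SHA-256 digest with unpadded encodings. The Ecoji figure reuses the 26 emoji and four bytes per emoji stated in the footnote.
\[
\begin{array}{ll}
\mbox{Hexadecimal:} & 2 \times 32 = 64 \mbox{ characters} \\
\mbox{Base64, unpadded:} & \lceil 4 \times 32 / 3 \rceil = 43 \mbox{ characters} \\
\mbox{Base32, unpadded:} & \lceil 8 \times 32 / 5 \rceil = 52 \mbox{ characters} \\
\mbox{Ecoji:} & 26 \times 4 = 104 \mbox{ UTF-8 bytes}
\end{array}
\]
Base64 therefore saves $1 - 43/64 \approx 0.33$ of the hexadecimal length, which is the ``roughly one-third'' figure above.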

\paragraph{Canonical signing payload.} The signature is computed over a deterministic binding string:
\begin{lstlisting}[basicstyle=\ttfamily\footnotesize]