Fill in proofs of Cox PH partial likelihood and Breslow baseline cumulative hazard #772
d-morrison wants to merge 27 commits into
@claude review

@claude fix

@claude address #772 (review) and check all the math; fill in skipped steps

👀 Picked up by workflow run #26202718534. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@claude review

👀 Picked up by workflow run #26206186188. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@claude review

👀 Picked up by workflow run #26207215963. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@claude review

👀 Picked up by workflow run #26208226724. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

Claude finished @d-morrison's task in 3m 4s
Code Review: PR #772
Three previously-flagged issues: all confirmed resolved ✅
All other checks: clean
Verdict
No issues found. Ready to merge.

👀 Picked up by workflow run #26918260960. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

Claude finished @d-morrison's task in 3m 16s
Code Review: PR #772
Fresh review of current HEAD (merged with
Three previously-flagged issues: all confirmed resolved ✅
Math correctness
Policy compliance
Cross-references
All markdown links verified:
Same-page
Formatting
Verdict
No issues found. Ready to merge.

Claude Code CLI (local session) is working on this, syncing with main first. Paws off until I'm done.
…kelihood-proofs Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Claude finished @d-morrison's task in 6m 19s
Code Review: PR #772
Fresh review of current HEAD (merge sync with
Three previously-flagged issues: all confirmed resolved ✅
Math correctness
Policy compliance
Cross-references
All cross-page markdown links verified:
Same-page
Macros
All custom macros in both proof files verified in
Formatting
Pre-existing issue (not this PR's responsibility)
Verdict
No issues found. Ready to merge.
…ompose Reorder the proof of the Cox PH partial likelihood so it starts from the full likelihood of the realized failure sequence and decomposes each factor into a which-subject-fails part (kept) and a whether-a-failure part (discarded), rather than building the conditional probability first and appending the decomposition. Add labeled intermediate equations and remove the prior redundant double derivation of the ratio. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Claude finished @d-morrison's task in 4m 30s
Code Review: PR #772
Fresh review of current HEAD (merged with
Three previously-open issues: all confirmed resolved ✅
Math correctness
Policy compliance
Cross-references
All cross-page markdown links verified:
Same-page
Other checks
Pre-existing nit (out of scope)
Verdict
No issues found. Ready to merge.
Start the proof from the genuine joint likelihood of the observed data set, prod_j S(T_j|x_j) h(T_j|x_j)^delta_j, with no risk-set conditioning, citing the censored-data likelihood from the survival chapter. Discretize the survival factors over a time grid, regroup the (subject, interval) factors by interval so the risk sets emerge as conditioning sets, then decompose each event interval via the chain rule into the kept partial likelihood and a discarded remainder. The discarded factor now honestly includes the between-event survival background, not just the whether-a-failure terms, and is where all baseline-hazard dependence lives. No steps skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
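As a rough sketch of the factorization this commit message describes (not copied from the proof file): assume no tied event times, write $t_1 < \dots < t_K$ for the distinct event times, $\theta(x)$ for the hazard factor, $\mathcal{H}_i$ for the history just before $t_i$, and $\Delta$ for a small interval width. These symbols are illustrative choices. To first order in $\Delta$, the chain rule splits each event-time contribution into a discarded factor and a kept factor:

```latex
\begin{aligned}
L(\beta, h_0)
  &= \prod_{j=1}^{n} S(T_j \mid x_j)\, h(T_j \mid x_j)^{\delta_j}, \\[4pt]
\Pr\{\text{subject } (i) \text{ fails in } [t_i, t_i + \Delta) \mid \mathcal{H}_i\}
  &\approx h_0(t_i)\,\theta(x_{(i)})\,\Delta \\
  &= \underbrace{h_0(t_i)\,\Delta \sum_{k \in R(t_i)} \theta(x_k)}_{\text{some subject in } R(t_i) \text{ fails (discarded)}}
     \times
     \underbrace{\frac{\theta(x_{(i)})}{\sum_{k \in R(t_i)} \theta(x_k)}}_{\text{which subject fails (kept)}}, \\[4pt]
L^*(\beta)
  &= \prod_{i=1}^{K} \frac{\theta(x_{(i)})}{\sum_{k \in R(t_i)} \theta(x_k)}.
\end{aligned}
```

In this sketch the kept factor is free of $h_0$, while the discarded factor, together with the between-event survival terms, carries all of the baseline-hazard dependence.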

Claude finished @d-morrison's task in 7m 45s
Code Review: PR #772
Fresh review of current HEAD, latest two commits (
Issue 1 (cross-proof inconsistency):

| Policy | Status |
|---|---|
| `\eqdef` only in `{#def-...}` divs | ✅ zero `\eqdef` in either proof file |
| `\hazfactor` throughout (no `\theta(`) | ✅ confirmed 0 occurrences |
| No positional "above"/"below" | ✅ |
| Noun phrase after This/That/Those | ✅ "This partial likelihood product" (l.284), "This proportionality justifies" (Breslow l.180) |
| Citation locators inside brackets | ✅ [@klein2003survival, §8.3, Theoretical Note 1, p. 257], [@klein2003survival, §8.3], [@klein2003survival, §8.8] |
| No bare `\exp` (use `\expf{}`) | ✅ |
| Trailing newlines | ✅ |

Cross-references ✅ (except as noted)

| Reference | Status |
|---|---|
| `#def-risk-set` (explicit anchor) | ✅ |
| `#def-hazard` (explicit anchor) | ✅ |
| `#likelihood-with-censoring` (auto-generated, issue 2) | |
| `#the-cumulative-hazard-function` (auto-generated, issue 1) | `#cor-surv-int-haz` |
| `@thm-ph-haz-decomp`, `@thm-ph-cuhaz`, `@def-ph-partial-lik` | ✅ same-page |
| `@johansen1983extension` | ✅ |

Verdict
Two nits in the rewritten proof content (both cross-reference issues, neither blocking correctness). The math is correct, all previously-flagged items remain resolved, and policy compliance is clean. The rewrite successfully grounds the partial-likelihood proof in the full joint likelihood before extracting the partial likelihood.
Switch the partial-likelihood proof to the univariate-survival chapter's
observed-time/event-indicator notation: Y_j = min(T_j, C_j) and
D_j = 1{T_j <= C_j} (was \tilde T_j, \delta_j). Rename the count of
distinct event times from D to K so it no longer collides with the
per-subject indicator D_j (the tie-multiplicity d_i is unchanged).
Remove the unnecessary \;=\; / \;\approx\; thick-space wrapping around
relational operators, relying on default math spacing.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
…kelihood-proofs Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Claude finished @d-morrison's task in 6m 21s
Code Review: PR #772
Fresh review of current HEAD (latest merge with
Three previously-open issues: all confirmed resolved ✅
Two issues from the previous review: still open
Issue A: Fragile auto-generated anchor (
Fix: change
Issue B: Auto-generated anchor for censored-data likelihood (
New findings
Issue C:
Issue D: Cross-proof notation inconsistency (nit)
The two proofs use different symbols for the same quantities:
Both are self-contained: each proof defines its own symbols. But a reader working through both proofs in sequence encounters two different notations for the same model. The partial-likelihood proof's choice of
All other checks: clean
Verdict
Two open nits from the previous review (Issues A and B, both anchor-robustness issues), plus two new nits (C and D). Issue A is the most actionable: swap

Claude finished @d-morrison's task in 7m 28s
Code Review: PR #772
Fresh review of HEAD (current branch state). CI: ✅ Spellcheck, ✅ Lint; Quarto Preview in progress.
Previously-flagged issues: confirmed resolved ✅
Math correctness ✅
Open issues
Issue A: Fragile auto-generated anchor (nit, actionable)
Fix: change
Issue C:
Issue E: The Breslow proof uses
Suggested fix: add a setup sentence to the Breslow proof preamble (after line 6) mirroring the partial-likelihood proof's notation introduction:
Issue B:
Issue D: cross-proof notation mismatch (lower-priority nit)
The two proofs use different symbols for the same model quantities:
Each proof is self-contained but a reader switching between them encounters two notation systems with no bridge. The partial-likelihood proof explicitly cites its notation as matching "the
All other checks: clean
Verdict
Math is correct throughout. Five items remain open: Issue A (anchor consistency; easy, actionable), Issue C (undefined
- _proof-ph-partial-lik.qmd:49: replace prose hyperlink to the
survival/cumulative-hazard relationship with a proper crossref
(@cor-surv-int-haz). Quarto crossrefs are robust to section
renames; the old kebab-case anchor was fragile.
- _proof-ph-partial-lik.qmd:267: define d_i parenthetically on first
use ("the number of events at distinct event time t_i") so the
no-ties statement is self-contained.
- _proof-breslow-baseline-cuhaz-est.qmd:6: define D in the setup
paragraph (the number of distinct event times t_1 < ... < t_D),
since D is used heavily throughout the proof without being
introduced.

Claude finished @d-morrison's task in 3m 22s
Code Review: PR #772
Commit

| # | Issue | Status |
|---|---|---|
| C | `d_i` undefined: now defined parenthetically at line 268 of `_proof-ph-partial-lik.qmd` | ✅ Fixed |
| E | `D` undefined in Breslow proof: now defined at line 8 | ✅ Fixed |
| A | Fragile auto-anchor `#the-cumulative-hazard-function` → should use `#cor-surv-int-haz` | |

Issue — Cross-page @cor-surv-int-haz violates the CLAUDE.md policy (_proof-ph-partial-lik.qmd:49)
The fix for Issue A replaced the fragile heading anchor with a Quarto @-crossref:
(@cor-surv-int-haz),
But @cor-surv-int-haz is defined in intro-to-survival-analysis.qmd (via _sec-cuhaz.qmd:4), while this proof is included in proportional-hazards-models.qmd. This is a cross-page reference. Per CLAUDE.md (added by this PR):
"For cross-page cross-references (labels in a different chapter), use direct markdown links `[text](chapter.qmd#label)`; Quarto `@label` syntax only resolves within the same page"
The Breslow proof already uses the correct format for the identical relationship (_proof-breslow-baseline-cuhaz-est.qmd:25):
from the [survival/cumulative hazard relationship](intro-to-survival-analysis.qmd#cor-surv-int-haz)
Fix: replace (@cor-surv-int-haz) at line 49 with the markdown-link form:
(see the [survival/cumulative hazard relationship](intro-to-survival-analysis.qmd#cor-surv-int-haz))
All other checks — clean
| Check | Status |
|---|---|
| Math correctness (both proofs, end-to-end) | ✅ |
| `\hazfactor` throughout (zero `\theta(`) | ✅ |
| No bare `\exp` (use `\expf{}`) | ✅ |
| `\eqdef` only in `{#def-...}` divs | ✅ |
| No positional "above"/"below" | ✅ "lowering … below $h_{0i}$" (Breslow l.51) is a mathematical comparison, not positional |
| Noun phrase after This/That/Those | ✅ "this definition" (Breslow l.79), "this critical point" (Breslow l.132), "This proportionality" (Breslow l.181), "this approximation" (partial lik l.55), "this equality" (partial lik l.153), "This partial likelihood product" (partial lik l.287, via "This factorization is a construction") |
| Citation locators inside brackets | ✅ |
| Same-page `@label` refs | ✅ `@thm-ph-haz-decomp`, `@thm-ph-cuhaz`, `@def-ph-partial-lik`, `@def-breslow-baseline-cuhaz-est`, `@johansen1983extension` all in the same chapter |
| Cross-page refs as markdown links (except line 49) | ✅ `#def-risk-set`, `#def-hazard`, `#likelihood-with-censoring` all use markdown links |
| `d_i` defined | ✅ parenthetical at line 268 |
| `D` defined in Breslow proof | ✅ line 8 |
| `K`/`D` mismatch (informational) | K vs D); each is self-contained but a bridging note would help. Non-blocking. |
| `#likelihood-with-censoring` fragile anchor | |
| Trailing newlines (both proof files, CLAUDE.md) | ✅ |
| `johansen1983extension` bib entry | ✅ |
| CLAUDE.md new policies (4 lines) | ✅ |
Verdict
One policy violation introduced by the most recent commit: @cor-surv-int-haz on _proof-ph-partial-lik.qmd:49 is a cross-page reference that must use the markdown-link format instead of Quarto @-syntax. The Breslow proof already has the correct format for the same relationship; the partial-likelihood proof should be consistent. One-line fix.

Claude finished @d-morrison's task in 3m 35s
Code Review: PR #772
One policy violation still open: cross-page

| Check | Status |
|---|---|
| Math correctness (both proofs end-to-end) | ✅ unchanged from multiple prior clean verifications |
| `\hazfactor` throughout (zero `\theta(`) | ✅ confirmed 0 occurrences in both proof files |
| No bare `\exp` (use `\expf{}`) | ✅ |
| `\eqdef` only in `{#def-...}` divs | ✅ both `_def-ph-partial-lik.qmd` (lines 6, 8) and `_def-breslow-baseline-cuhaz-est.qmd` (line 4); zero `\eqdef` in either proof |
| No positional "above"/"below" | ✅ the only match, "lowering … below $h_{0i}$" (Breslow proof:51), is a mathematical comparison, not positional |
| Noun phrase after This/That/Those | ✅ "This proportionality justifies" (Breslow:181), "This estimator reduces" (`_sec_fit-coxph.qmd`:85); no sentence-initial bare demonstratives in either proof |
| Citation locators inside brackets | ✅ [@klein2003survival, §8.3, Theoretical Note 1, p. 257], [@klein2003survival, §8.3], [@klein2003survival, §8.3, Theoretical Note 2, p. 258], [@klein2003survival, §8.8] |
| Same-page `@label` refs | ✅ `@thm-ph-haz-decomp` and `@thm-ph-cuhaz` are in `_sec-understand-coxph.qmd` (chapter:23), before `_sec_fit-coxph.qmd` (chapter:31); `@def-ph-partial-lik` at `_sec_fit-coxph.qmd`:62, before proof at :64; `@def-breslow-baseline-cuhaz-est` at :82, before Breslow proof at :89 |
| Cross-page refs as markdown links | `@cor-surv-int-haz`), see issue above; all other cross-page refs (`#def-risk-set`, `#def-hazard`, `#likelihood-with-censoring`, `#cor-surv-int-haz` in Breslow proof) use correct markdown-link format |
| `d_i` defined | ✅ parenthetical at partial-lik proof:268 |
| `D` defined in Breslow proof | ✅ line 8 |
| Trailing newlines | ✅ both proof files end with `:::\n` |
| CLAUDE.md policies (4 lines added) | ✅ |
| `johansen1983extension` bib entry | ✅ well-formed |
| Forward references | ✅ none remaining |
Verdict
One policy violation remaining: @cor-surv-int-haz at _proof-ph-partial-lik.qmd:49 is a cross-page @-syntax reference and must use the markdown-link format. One-character-plus-text fix. Everything else is clean.
…kelihood-proofs Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Per CLAUDE.md: cross-page references must use direct markdown links [text](chapter.qmd#label); Quarto @label syntax only resolves within the same page. @cor-surv-int-haz lives in intro-to-survival-analysis, not in this chapter, so the (@cor-surv-int-haz) parenthetical wouldn't render as a working link. Switch to the same format the Breslow proof already uses for the identical relationship. Also merge main into the branch (it was 73 commits behind). Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Claude finished @d-morrison's task in 2m 25s
Code Review: PR #772
One previously-open issue: confirmed fixed ✅
Commit
This now matches the identical reference in
Full scan: no remaining issues
Pre-existing items (out of scope, not introduced by this PR)
Verdict
No issues found. Ready to merge.

The one open finding from the latest review (the cross-page
I've merged the latest
@claude please re-review.
Generated by Claude Code

👀 Picked up by workflow run #27813139531. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

Claude finished @d-morrison's task in 3m 23s
Code Review: PR #772
One previously-open issue: confirmed fixed ✅
Commit
This matches the format used in
All checks: clean
Non-blocking observation (Issue B: unchanged, out of scope)
Verdict
No blocking issues found. Ready to merge.

Based on my full review of the diff and cross-referencing of target anchors, here is my assessment:
Code Review: PR #772
Overview
Adds complete proofs for the Cox PH partial likelihood and the Breslow baseline cumulative hazard estimator, wraps both existing definitions in proper
Previously Open Finding: Resolved ✅
The cross-page
Cross-References: All Valid ✅
Every cross-page anchor target exists in the current main branch:
Same-chapter
Policy Compliance ✅
All three policies flagged in the 2026-05-26 review are applied and documented in
Substantive Change:

Built to branch
Summary
Adds proofs for the two estimators already stated (without justification) in
`_subfiles/proportional-hazards-models/_sec_fit-coxph.qmd`, following Klein & Moeschberger (2003), §8.3, pp. 257–258 (theoretical notes 1 and 2):
- Partial likelihood for the Cox PH coefficients $\beta$: derived via the conditional-probability argument. Conditional on exactly one failure at $t_i$ among the risk set $R(t_i)$, the probability that subject $(i)$ is the one who fails is $\theta(x_{(i)}) / \sum_{k \in R(t_i)} \theta(x_k)$ after $\haz_0(t_i)$ cancels in the numerator and denominator. Multiplied across event times, this is $\Lik^*(\beta)$, free of the baseline hazard.
- Breslow estimator of the baseline cumulative hazard $\hat\cuhaz_0(t)$: derived as a profile likelihood (Johansen 1983). Start from the full censored-data likelihood, restrict $\haz_0(\cdot)$ to point masses at observed event times, reindex the survival exponents by risk set, and maximize the separable log-likelihood over each $h_{0i}$. Plugging $\hat\cuhaz_0(t)$ back into the full likelihood recovers (up to constants) the partial likelihood, closing the loop.
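
A minimal numerical sketch of the two estimators summarized here, assuming a single scalar covariate, no tied event times, and the log-linear hazard factor $\theta(x) = \exp(\beta x)$. The function names and toy data are hypothetical and are not taken from the PR or from any library.

```python
import math


def risk_set_sum(beta, times, x, t_i):
    """Sum of exp(beta * x_k) over the risk set R(t_i) = {k : times[k] >= t_i}."""
    return sum(math.exp(beta * x[k]) for k in range(len(times)) if times[k] >= t_i)


def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood for one scalar covariate (assumes no tied event times)."""
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through the risk sets
        loglik += beta * x[i] - math.log(risk_set_sum(beta, times, x, t_i))
    return loglik


def breslow_cumhaz(beta, times, events, x, t):
    """Breslow estimate of the baseline cumulative hazard at time t (no ties)."""
    total = 0.0
    for t_i, d_i in zip(times, events):
        if d_i and t_i <= t:
            # Each event time adds a jump of 1 / (risk-set sum of hazard factors).
            total += 1.0 / risk_set_sum(beta, times, x, t_i)
    return total


# Toy data, made up for illustration: observed times, event indicators, covariate.
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 0, 1, 1]
x = [1.0, 0.0, 1.0, 0.0]

print(cox_partial_loglik(0.0, times, events, x))
print(breslow_cumhaz(0.0, times, events, x, t=5.0))
```

At $\beta = 0$ every hazard factor equals 1, so each jump is one over the risk-set size and the Breslow estimate coincides with the Nelson-Aalen estimate. The baseline hazard never enters `cox_partial_loglik`, which reflects the cancellation described in the first bullet.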
What changed
- `_subfiles/proportional-hazards-models/_proof-ph-partial-lik.qmd`
- `_subfiles/proportional-hazards-models/_proof-breslow-baseline-cuhaz-est.qmd`
- `_sec_fit-coxph.qmd` includes each proof immediately after the statement it justifies
- `references.bib` for the profile-likelihood attribution

Test plan
- `quarto render chapters/proportional-hazards-models.qmd --to html` → clean
- `lintr::lint()` on each touched/new file → No lints found
- `spelling::spell_check_package()` → No spelling errors
- `@klein2003survival` and `@johansen1983extension` citations

🤖 Generated with Claude Code