27 commits
f43135c
Fill in proofs of Cox PH partial likelihood and Breslow baseline cumu…
d-morrison May 27, 2026
b4a29b1
Show the full-likelihood factorization the partial likelihood discards
d-morrison May 27, 2026
afc8e6a
Drop "used above" back-reference in partial-likelihood factorization
d-morrison May 27, 2026
842ea45
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
d-morrison May 27, 2026
f5bcf16
Add explicit nouns after mid-sentence "this" in the Cox proofs
d-morrison May 27, 2026
bc2ca75
Add noun after "Apply this" in partial-likelihood proof
d-morrison May 27, 2026
9fa0381
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
d-morrison May 27, 2026
5267fd6
add algebraic connection between partial likelihood and full likelihood
claude[bot] May 27, 2026
87c70bb
Merge branch 'main' into feat/coxph-partial-likelihood-proofs
d-morrison May 29, 2026
03edd64
Merge main into feat/coxph-partial-likelihood-proofs
claude Jun 3, 2026
1d73616
Merge main into feat/coxph-partial-likelihood-proofs
claude Jun 3, 2026
c8d337c
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
d-morrison Jun 3, 2026
8c26461
fix(coxph proofs): address review findings on PR #772
d-morrison Jun 3, 2026
8ed1566
fix(coxph proofs): remove forward pointer to Breslow estimator def
d-morrison Jun 3, 2026
72b134e
Merge main into feat/coxph-partial-likelihood-proofs
claude Jun 5, 2026
e7063d2
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
d-morrison Jun 5, 2026
ec982c8
Restructure partial-likelihood proof: full likelihood first, then dec…
d-morrison Jun 5, 2026
e9f6718
Derive partial likelihood from the full right-censored joint likelihood
d-morrison Jun 5, 2026
7f953c2
Use Y_j/D_j survival notation; drop redundant spacing around relations
d-morrison Jun 5, 2026
4f29c04
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
d-morrison Jun 5, 2026
f1d6eae
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
claude Jun 6, 2026
c686bea
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
claude Jun 9, 2026
017bed2
Address three review nits on Cox PH proofs
claude Jun 9, 2026
9c33908
Merge branch 'main' into feat/coxph-partial-likelihood-proofs
d-morrison Jun 16, 2026
bd7024a
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
d-morrison Jun 18, 2026
b845a74
Fix cross-page @cor-surv-int-haz to markdown link (rme#772)
d-morrison Jun 18, 2026
2cceb9c
Merge remote-tracking branch 'origin/main' into feat/coxph-partial-li…
claude Jun 19, 2026
4 changes: 4 additions & 0 deletions CLAUDE.md
@@ -49,6 +49,7 @@ Before committing any `.qmd`, `.R`, or config file change:
### Math Notation
- Use custom macros from `latex-macros/macros.qmd` instead of raw LaTeX
- Key macros: `\E{Y|X=x}`, `\ba`/`\ea`, `\tp{v}`, `\b`, `\g`, `\a`, `\devn(...)`, `\erf{...}`
- Use `\eqdef` instead of `=` for the defining equation in any `{#def-...}` div
- Include every intermediate step in derivations — do not skip steps
- Color coding: `\red{...}` for focal/extra terms, `\blue{...}` for shared terms
- Ratios vs. factors:
@@ -66,6 +67,9 @@ Before committing any `.qmd`, `.R`, or config file change:
- Factual claims must have a specific citation
- Variable definitions in exercises: use bullet points/table with symbol, meaning, and dataset column
- After every definition or concept, include a concrete example — preferably numerical — to illustrate the abstract idea; use a `{#exm-...}` div
- Never use "above" or "below" to refer to content — cross-reference with `@label` syntax instead
- For cross-page cross-references (labels in a different chapter), use direct markdown links `[text](chapter.qmd#label)` — Quarto `@label` syntax only resolves within the same page
- Always add a noun phrase after "This", "That", and "Those" to clarify the referent (e.g., "This estimator", not "This")

### Pull Requests
- Remove existing review requests immediately when starting work on a PR
@@ -1,2 +1,7 @@
$$\hat \cuhaz_0(t) =
\sum_{t_i < t} \frac{d_i}{\sum_{k\in R(t_i)} \hazfactor(x_k)}$$
:::{#def-breslow-baseline-cuhaz-est}
#### Breslow estimator of the baseline cumulative hazard

$$\hat \cuhaz_0(t) \eqdef
\sum_{t_i \le t} \frac{d_i}{\sum_{k\in R(t_i)} \hazfactor(\vx_k)}$$

:::
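
As a rough numerical illustration of @def-breslow-baseline-cuhaz-est, the following Python sketch evaluates the estimator on a made-up four-subject dataset. The function name `breslow_cumhaz`, the data, and the hazard-factor values `theta` (standing in for $\hazfactor(\vx_k)$) are all hypothetical, chosen only for illustration.

```python
def breslow_cumhaz(t, times, events, theta):
    """Breslow estimate of the baseline cumulative hazard at time t.

    times  : observed times (event or censoring), one per subject
    events : 1 if the subject had the event, 0 if censored
    theta  : hazard factors, assumed already evaluated at each subject's covariates
    """
    total = 0.0
    # Distinct event times in increasing order.
    event_times = sorted({ti for ti, d in zip(times, events) if d == 1})
    for ti in event_times:
        if ti > t:  # the sum runs over event times t_i <= t
            break
        # d_i: number of events observed exactly at t_i.
        d_i = sum(1 for tj, d in zip(times, events) if d == 1 and tj == ti)
        # Risk set R(t_i): subjects whose observed time is at least t_i.
        risk_sum = sum(th for tj, th in zip(times, theta) if tj >= ti)
        total += d_i / risk_sum
    return total


# Hypothetical toy data: the second subject is censored at time 3.
times = [2, 3, 5, 7]
events = [1, 0, 1, 1]
theta = [1.0, 2.0, 0.5, 1.5]

for t in (1, 2, 5, 7):
    print(t, breslow_cumhaz(t, times, events, theta))
```

With these values the increments are $1/5$ at $t = 2$, $1/2$ at $t = 5$, and $1/1.5$ at $t = 7$, so the estimate at $t = 5$ is $0.7$. Because the sum runs over $t_i \le t$, the event at $t = 2$ already counts when the estimator is evaluated at $t = 2$.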
11 changes: 7 additions & 4 deletions _subfiles/proportional-hazards-models/_def-ph-partial-lik.qmd
@@ -1,10 +1,13 @@
{{< include latex-macros/macros.qmd >}}
:::{#def-ph-partial-lik}
#### Cox PH partial likelihood

$$
\ba
\Lik^*_i &= \frac{\hazfactor(\vx_i)}{\sum_{k \in R(t_i)} \hazfactor(\vx_k)}
\Lik^*_i(\b) &\eqdef \frac{\hazfactor(\vx_{(i)})}{\sum_{k \in R(t_i)} \hazfactor(\vx_k)}
\\
\Lik^* &=
\prod_{\set{i:\ d_i = 1}} \Lik^*_i
\Lik^*(\b) &\eqdef
\prod_{\set{i:\ d_i = 1}} \Lik^*_i(\b)
\ea
$$

:::
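
A small Python sketch may help make @def-ph-partial-lik concrete. It assumes a single scalar covariate, no tied event times, and the usual Cox form `exp(beta * x)` for the hazard factor (the diff itself does not spell out $\hazfactor$); the function name and the toy data are hypothetical.

```python
import math


def partial_likelihood(beta, times, events, x):
    """Cox partial likelihood for a scalar covariate, assuming no tied event times."""
    # Assumed form of the hazard factor: exp(beta * x).
    theta = [math.exp(beta * xi) for xi in x]
    lik = 1.0
    for ti, di, th_i in zip(times, events, theta):
        if di == 1:  # only events contribute a factor
            # Sum of hazard factors over the risk set R(t_i).
            risk_sum = sum(th for tj, th in zip(times, theta) if tj >= ti)
            lik *= th_i / risk_sum
    return lik


# Hypothetical toy data: the second subject is censored at time 3.
times = [2, 3, 5, 7]
events = [1, 0, 1, 1]
x = [0.0, 1.0, 1.0, 0.0]

print(partial_likelihood(0.0, times, events, x))
print(partial_likelihood(0.5, times, events, x))
```

At `beta = 0` every hazard factor equals 1, so each event contributes one over the size of its risk set and the product is $\frac{1}{4}\cdot\frac{1}{2}\cdot\frac{1}{1} = 0.125$.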
@@ -0,0 +1,184 @@
::: proof

Adapted from [@klein2003survival, §8.3, Theoretical Note 2, p. 258];
the original profile-likelihood argument is due to @johansen1983extension.

Assume, as in the partial-likelihood proof, that there are no tied event times,
so each ordered event time $t_i$ corresponds to exactly one event.
Let $D$ denote the number of distinct event times $t_1 < \cdots < t_D$.

The full censored-data likelihood for the proportional-hazards model is

$$
\Lik\sb{\b,\,\haz_0(\cdot)}
= \prod_{j = 1}^{n}
\haz(\tilde{T}_j \mid \vx_j)^{\delta_j}\,
\surv(\tilde{T}_j \mid \vx_j),
$$

where $\delta_j$ is the event indicator and $\tilde{T}_j$ the observed time
for subject $j$.
Substituting
$\haz(t \mid \vx) = \haz_0(t)\,\hazfactor(\vx)$ (by @thm-ph-haz-decomp)
and
$\surv(t \mid \vx) = \expf{-\cuhaz_0(t)\,\hazfactor(\vx)}$
(by @thm-ph-cuhaz, using $\surv(t) = \expf{-\cuhaz(t)}$ from the [survival/cumulative hazard relationship](intro-to-survival-analysis.qmd#cor-surv-int-haz))
gives

$$
\Lik\sb{\b,\,\haz_0(\cdot)}
= \prod_{j = 1}^{n}
\sb{\haz_0(\tilde{T}_j)\,\hazfactor(\vx_j)}^{\delta_j}\,
\expf{-\cuhaz_0(\tilde{T}_j)\,\hazfactor(\vx_j)}.
$$

Fix $\b$ and maximize over $\haz_0(\cdot)$.
Subjects with $\delta_j = 0$ contribute only the survival term;
for such subjects, $\haz_0(\tilde{T}_j)^{\delta_j} = 1$ regardless of $\haz_0$.
Adding mass to $\haz_0$ at a non-event time $t$ increases $\cuhaz_0(\tilde{T}_j)$
for every subject with $\tilde{T}_j \ge t$, penalizing their survival terms,
without any compensating gain in a hazard-density factor.
Next, for each event subject $j$ with $\delta_j = 1$, treat
$\haz_0$ as a discrete hazard measure, so that $\haz_0(t_i)$ equals the
point-mass weight placed at $t_i$.
Consider any allocation of a fixed total mass $h_{0i}$ to a neighborhood of $t_i$.
The cumulative hazard $\cuhaz_0(\tilde{T}_j)$ depends only on that total, so the survival
penalty $\expf{-\cuhaz_0(\tilde T_j)\hazfactor(\vx_j)}$ is unchanged by the allocation.
The hazard-density factor $\haz_0(t_i)^{\delta_j}$, in contrast, is maximized when all of that mass is
concentrated as a single point at $t_i$ (so $\haz_0(t_i) = h_{0i}$):
spreading the same total across a wider interval reduces the
weight at $t_i$ below $h_{0i}$
while leaving the survival penalty fixed.
The likelihood is therefore maximized by a hazard
that places point masses only at the observed event times:

$$
\haz_0(t) = \begin{cases}
h_{0i}, & t = t_i \text{ for some } i \in \set{1, \dots, D},\\
0, & \text{otherwise},
\end{cases}
$$

with $\cuhaz_0(\tilde{T}_j) = \sum_{i\ :\ t_i \le \tilde{T}_j} h_{0i}$.
Expanding the survival exponent and swapping the order of summation:

$$
\ba
\sum_j \cuhaz_0(\tilde{T}_j)\,\hazfactor(\vx_j)
&= \sum_j \hazfactor(\vx_j) \sum_{i:\ t_i \le \tilde{T}_j} h_{0i}
\\
&= \sum_i h_{0i} \sum_{j:\ \tilde{T}_j \ge t_i} \hazfactor(\vx_j)
\\
&= \sum_i h_{0i} \sum_{j \in R(t_i)} \hazfactor(\vx_j),
\ea
$$

where the second equality swaps the order of summation,
and the third uses $R(t_i) = \set{j : \tilde{T}_j \ge t_i}$
(see the [risk set definition](intro-to-survival-analysis.qmd#def-risk-set)).
This definition uses the "at risk *at* $t_i$" convention with the
$\ge$ boundary: a subject censored exactly at $t_i$ is in $R(t_i)$,
and $h_{0i}$ is included in that subject's $\cuhaz_0(\tilde{T}_j)$.
Write $S_i = \sum_{j \in R(t_i)} \hazfactor(\vx_j)$ for the
risk-set-weighted hazard-multiplier sum at $t_i$.
Reindexing the hazard-density product: subjects with $\delta_j = 0$
contribute a factor of $\haz_0(\tilde{T}_j)^0 = 1$ (no contribution),
and for each subject with $\delta_j = 1$ their event occurred at some
$t_i$, so $\haz_0(\tilde{T}_j) = h_{0i}$:

$$
\ba
\prod_{j=1}^n \sb{\haz_0(\tilde{T}_j)\,\hazfactor(\vx_j)}^{\delta_j}
&= \prod_{j:\,\delta_j=1} \haz_0(\tilde{T}_j)\,\hazfactor(\vx_j)
&& \text{(terms with } \delta_j = 0 \text{ equal } 1\text{, dropped)}
\\
&= \prod_{i=1}^D h_{0i}\,\hazfactor(\vx_{(i)}).
&& \text{(}\haz_0(\tilde{T}_j) = h_{0i} \text{ for event } j \text{ at event time } t_i\text{)}
\ea
$$

Factoring $\expf{-\sum_i h_{0i}\,S_i} = \prod_i \expf{-h_{0i}\,S_i}$
and combining factor-by-factor with the hazard-density product:

$$
\ba
\Lik\sb{\b,\,h_{01},\dots,h_{0D}}
&= \underbrace{\prod_{i=1}^{D} h_{0i}\,\hazfactor(\vx_{(i)})}_{\text{hazard-density}}
\cdot \underbrace{\prod_{i=1}^{D} \expf{-h_{0i}\,S_i}}_{\text{survival}}
\\
&= \prod_{i = 1}^{D}
h_{0i}\;\hazfactor(\vx_{(i)})\;
\expf{-h_{0i}\,S_i}.
\ea
$$
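
The regrouping from a product over subjects to a product over event times can be sanity-checked numerically. The Python sketch here uses hypothetical toy data and arbitrary point masses `h` (standing in for $h_{01},\dots,h_{0D}$), assumes no tied event times, and compares the two forms of the likelihood.

```python
import math

# Hypothetical toy data: the second subject is censored at time 3.
times = [2, 3, 5, 7]
events = [1, 0, 1, 1]
theta = [1.0, 2.0, 0.5, 1.5]  # hazard factors, assumed already evaluated
h = {2: 0.3, 5: 0.1, 7: 0.8}  # arbitrary point masses at the event times


def cumhaz(t):
    """Baseline cumulative hazard: total point mass at event times t_i <= t."""
    return sum(mass for ti, mass in h.items() if ti <= t)


# Per-subject form: hazard term (events only) times survival term.
per_subject = 1.0
for tj, dj, th in zip(times, events, theta):
    hazard_term = h[tj] * th if dj == 1 else 1.0
    per_subject *= hazard_term * math.exp(-cumhaz(tj) * th)

# Per-event form: h_i * theta_(i) * exp(-h_i * S_i) at each event time.
per_event = 1.0
for ti, di, th in zip(times, events, theta):
    if di == 1:
        S_i = sum(th2 for t2, th2 in zip(times, theta) if t2 >= ti)
        per_event *= h[ti] * th * math.exp(-h[ti] * S_i)

print(per_subject, per_event)
```

The two printed values should agree up to floating-point rounding; in both forms the total exponent for these numbers is $-2.9$.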

The log-likelihood is separable in the $h_{0i}$ (each summand $\log h_{0i} - h_{0i}\,S_i$ involves only $h_{0i}$):

$$
\loglik\sb{\b,\,h_{01},\dots,h_{0D}}
= \sum_{i = 1}^{D} \log\hazfactor(\vx_{(i)})
+ \sum_{i = 1}^{D}
\cb{\log h_{0i} - h_{0i}\,S_i}.
$$

Differentiating with respect to $h_{0i}$ and setting the derivative to zero:

$$
\frac{1}{h_{0i}} \;-\; S_i \;=\; 0
\quad\Longrightarrow\quad
\hat h_{0i} = \frac{1}{S_i}.
$$

The second derivative $-1/h_{0i}^2 < 0$ confirms this critical point is a maximum.
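
As an informal check of this maximization step, the following Python sketch evaluates the summand $\log h_{0i} - h_{0i}\,S_i$ on a grid for one hypothetical value of $S_i$ and confirms that the grid maximizer coincides with $1/S_i$. The value `S = 2.0` and the grid are illustrative assumptions only.

```python
import math


def summand(h, S):
    """One term of the log-likelihood that depends on a single point mass h."""
    return math.log(h) - h * S


S = 2.0          # hypothetical risk-set sum S_i
h_hat = 1.0 / S  # closed-form maximizer from the first-order condition

# Grid of candidate masses 0.001, 0.002, ..., 3.000 (includes 0.5 exactly).
grid = [k / 1000 for k in range(1, 3001)]
best = max(grid, key=lambda h: summand(h, S))

print(h_hat, best)
```

A grid search is of course no substitute for the derivative argument; it only illustrates that the summand peaks at $1/S_i$ and falls off on either side.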

Summing the $\hat h_{0i}$ over event times $t_i \le t$ gives the **Breslow estimator**
of the baseline cumulative hazard:

$$
\hat \cuhaz_0(t)
= \sum_{t_i \le t} \hat h_{0i}
= \sum_{t_i \le t} \frac{1}{S_i}
\quad\text{where } S_i = \sum_{j \in R(t_i)} \hazfactor(\vx_j).
$$

With tied event times, the numerator generalizes to $d_i$,
the number of events at $t_i$
(see [@klein2003survival, §8.8] for the tie-handling adjustments).

Substituting $\hat h_{0i}$ back into the likelihood gives the profile likelihood for $\b$:

$$
\ba
\Lik\sb{\b,\,\hat h_{01},\dots,\hat h_{0D}}
&= \prod_{i = 1}^{D}
\hat h_{0i}\;\hazfactor(\vx_{(i)})\;
\expf{-\hat h_{0i}\,S_i}
\\
&= \prod_{i = 1}^{D}
\frac{\hazfactor(\vx_{(i)})}{S_i}\;
\expf{-S_i / S_i}
&& \text{(substituting } \hat h_{0i} = 1/S_i\text{)}
\\
&= \prod_{i = 1}^{D}
\frac{\hazfactor(\vx_{(i)})}{S_i}\;
e^{-1}
&& \text{(since } S_i/S_i = 1\text{)}
\\
&= e^{-D}
\cdot \prod_{i = 1}^{D} \frac{\hazfactor(\vx_{(i)})}{S_i}.
\ea
$$

Under the no-ties assumption each event time $t_i$ has $d_i = 1$, so
$\prod_{i=1}^D \frac{\hazfactor(\vx_{(i)})}{S_i}$
equals the partial likelihood $\Lik^*(\b)$ of @def-ph-partial-lik.
Hence

$$\Lik\sb{\b,\,\hat h_{01},\dots,\hat h_{0D}} = e^{-D}\,\Lik^*(\b).$$

The factor $e^{-D}$ does not depend on $\b$,
so the profile likelihood is proportional to the partial likelihood $\Lik^*(\b)$.
This proportionality justifies treating the partial likelihood as a profile likelihood
for $\b$, with $\haz_0(\cdot)$ concentrated out.

:::
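
The final identity of the proof can be checked numerically. The Python sketch here is only an illustration under stated assumptions: a scalar covariate, no tied event times, the usual Cox hazard factor `exp(beta * x)`, and hypothetical toy data and function names. It plugs $\hat h_{0i} = 1/S_i$ into the per-subject full likelihood and compares the result with $e^{-D}$ times the partial likelihood.

```python
import math

# Hypothetical toy data: the second subject is censored at time 3.
times = [2, 3, 5, 7]
events = [1, 0, 1, 1]
x = [0.0, 1.0, 1.0, 0.0]
D = sum(events)  # number of events (no ties assumed)


def risk_sum(ti, theta):
    """S_i: sum of hazard factors over subjects with observed time >= t_i."""
    return sum(th for tj, th in zip(times, theta) if tj >= ti)


def partial_likelihood(beta):
    theta = [math.exp(beta * xi) for xi in x]  # assumed hazard-factor form
    lik = 1.0
    for ti, di, th in zip(times, events, theta):
        if di == 1:
            lik *= th / risk_sum(ti, theta)
    return lik


def profile_likelihood(beta):
    """Full likelihood with the point masses set to h_hat_i = 1 / S_i."""
    theta = [math.exp(beta * xi) for xi in x]
    h_hat = {ti: 1.0 / risk_sum(ti, theta)
             for ti, di in zip(times, events) if di == 1}
    lik = 1.0
    for tj, dj, th in zip(times, events, theta):
        # Cumulative baseline hazard at the subject's observed time.
        H = sum(mass for ti, mass in h_hat.items() if ti <= tj)
        hazard_term = h_hat[tj] * th if dj == 1 else 1.0
        lik *= hazard_term * math.exp(-H * th)
    return lik


for beta in (-1.0, 0.0, 0.5, 2.0):
    ratio = profile_likelihood(beta) / partial_likelihood(beta)
    print(beta, ratio, math.exp(-D))
```

For every `beta` tried, the ratio should match `exp(-D)` up to rounding, which is the sense in which the profile likelihood is proportional to the partial likelihood.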