diff --git a/_subfiles/Linear-models-overview/_sec_linreg_diag_residuals.qmd b/_subfiles/Linear-models-overview/_sec_linreg_diag_residuals.qmd index 4d7189c25..edbc9c6aa 100644 --- a/_subfiles/Linear-models-overview/_sec_linreg_diag_residuals.qmd +++ b/_subfiles/Linear-models-overview/_sec_linreg_diag_residuals.qmd @@ -123,6 +123,8 @@ $$\hat{\vY} = H\vY$$ ::: :::{#thm-resid-unbiased} +#### Mean and variance of residuals + For an ordinary least squares linear model with fitted values $\hat y_i = \dprodf{\vx_i}{\vb}$ (and fitted-value vector $\hat{\vY}$), diff --git a/_subfiles/intro-MLEs/_sec-loglik.qmd b/_subfiles/intro-MLEs/_sec-loglik.qmd index 0af0660bb..d62659578 100644 --- a/_subfiles/intro-MLEs/_sec-loglik.qmd +++ b/_subfiles/intro-MLEs/_sec-loglik.qmd @@ -8,6 +8,7 @@ It is typically easier to work with the log of the likelihood function: --- :::{#thm-mle-use-log} +#### Maximize the log-likelihood instead of the likelihood The likelihood and log-likelihood have the same maximizer: diff --git a/_subfiles/intro-MLEs/_sec_likelihood.qmd b/_subfiles/intro-MLEs/_sec_likelihood.qmd index 33e8b2892..32e6d07dc 100644 --- a/_subfiles/intro-MLEs/_sec_likelihood.qmd +++ b/_subfiles/intro-MLEs/_sec_likelihood.qmd @@ -103,6 +103,7 @@ $$\Lik_i(\theta) = \P(X_i=x_i)$$ --- :::{#thm-ds-lik-obs-lik} +#### Dataset likelihood as a product of observation likelihoods For $\iid$ data $\vx \eqdef \x1n$, the likelihood of the dataset is equal to the product of the observation-specific likelihood factors: diff --git a/_subfiles/intro-to-survival-analysis/_sec-cuhaz.qmd b/_subfiles/intro-to-survival-analysis/_sec-cuhaz.qmd index 509ae57e1..850cd405c 100644 --- a/_subfiles/intro-to-survival-analysis/_sec-cuhaz.qmd +++ b/_subfiles/intro-to-survival-analysis/_sec-cuhaz.qmd @@ -2,6 +2,8 @@ Since $\haz(t) = \deriv{t}\cb{-\log{\surv(t)}}$ (see @thm-h-logS), we also have: :::{#cor-surv-int-haz} +#### Survival function from the cumulative hazard + $$\surv(t) = \exp{-\int_{u=0}^t \haz(u)du}$${#eq-surv-int-haz} ::: diff --git a/_subfiles/intro-to-survival-analysis/_sec-exp-dist.qmd b/_subfiles/intro-to-survival-analysis/_sec-exp-dist.qmd index af2705234..5432ba908 100644 --- a/_subfiles/intro-to-survival-analysis/_sec-exp-dist.qmd +++ b/_subfiles/intro-to-survival-analysis/_sec-exp-dist.qmd @@ -75,6 +75,8 @@ $$ --- :::{#thm-mle-exp} +#### MLE of the exponential rate parameter + Let $T=\sum t_i$ and $U=\sum u_j$. Then: $$ diff --git a/_subfiles/intro-to-survival-analysis/_sec-inv-survf.qmd b/_subfiles/intro-to-survival-analysis/_sec-inv-survf.qmd index c5e24b6da..4f7030bc7 100644 --- a/_subfiles/intro-to-survival-analysis/_sec-inv-survf.qmd +++ b/_subfiles/intro-to-survival-analysis/_sec-inv-survf.qmd @@ -167,6 +167,7 @@ qexp(p = 0.5, rate = 2) {{< slidebreak >}} :::{#thm-inv-surv-is-quantile} +#### Inverse survival function is the quantile function The inverse survival function equals the $(1-p)$th [population quantile](probability.qmd#def-quantile-function) diff --git a/_subfiles/intro-to-survival-analysis/_sec-survf.qmd b/_subfiles/intro-to-survival-analysis/_sec-survf.qmd index 0f19ed4f4..e934f8b71 100644 --- a/_subfiles/intro-to-survival-analysis/_sec-survf.qmd +++ b/_subfiles/intro-to-survival-analysis/_sec-survf.qmd @@ -27,6 +27,7 @@ $$\surv(t) \eqdef \Pr(T > t)$$ --- :::{#thm-survival-expressions-1} +#### Equivalent expressions for the survival function $$ \begin{aligned} @@ -109,6 +110,7 @@ ggplot() + --- :::{#thm-surv-fn-as-mean-status} +#### Survival function as expected survival status If $A_t$ represents survival status at time $t$, with $A_t = 1$ denoting alive at time $t$ and $A_t = 0$ denoting deceased at time $t$, then: @@ -119,6 +121,7 @@ $$\surv(t) = \P(A_t=1) = \E{A_t}$$ --- :::{#thm-surv-and-mean} +#### Mean as the integral of the survival function If $T$ is a nonnegative random variable, then: diff --git a/_subfiles/logistic-regression/_sec-d_odds-d_logodds.qmd b/_subfiles/logistic-regression/_sec-d_odds-d_logodds.qmd index 0ab133c5e..45198741a 100644 --- a/_subfiles/logistic-regression/_sec-d_odds-d_logodds.qmd +++ b/_subfiles/logistic-regression/_sec-d_odds-d_logodds.qmd @@ -1,4 +1,6 @@ :::{#lem-deriv-invodds} +#### Derivative of odds w.r.t. log-odds + $$\derivf{\odds}{\logodds} = \odds$$ ::: @@ -26,6 +28,8 @@ $$ :::{#thm-d_odds-d_logodds} +#### Derivative of odds in terms of probability + $$\derivf{\omega}{\eta} = \frac{\pi}{1-\pi}$${#eq-d_omega-d_eta} ::: diff --git a/_subfiles/logistic-regression/_sec_OR-ratio-ratio.qmd b/_subfiles/logistic-regression/_sec_OR-ratio-ratio.qmd index 4293b445d..1ae597ede 100644 --- a/_subfiles/logistic-regression/_sec_OR-ratio-ratio.qmd +++ b/_subfiles/logistic-regression/_sec_OR-ratio-ratio.qmd @@ -5,6 +5,8 @@ so odds ratios are ratios of ratios: ::: :::{#thm-or-ratio-ratio} +#### Odds ratio as a ratio of ratios + $$ \ba \ratio(\odds_1, \odds_2) diff --git a/_subfiles/logistic-regression/_sec_OR_logistic.qmd b/_subfiles/logistic-regression/_sec_OR_logistic.qmd index 24dd4f313..30b077988 100644 --- a/_subfiles/logistic-regression/_sec_OR_logistic.qmd +++ b/_subfiles/logistic-regression/_sec_OR_logistic.qmd @@ -197,6 +197,8 @@ $$ :::{#thm-logistic-OR} +#### Odds ratio from difference in covariate patterns + The odds ratio comparing covariate patterns $\vx$ and $\vxs$ is: {{< include _subfiles/logistic-regression/_eq_OR_delta.qmd >}} @@ -211,6 +213,8 @@ By @sol-simplify-logistic-OR. :::{#cor-log-or} +#### Log odds ratio equals the difference in log-odds + $$\logf {\ror(\vx,\vxs)} = \difflogodds$$ ::: diff --git a/_subfiles/logistic-regression/_sec_d-pi_d-eta.qmd b/_subfiles/logistic-regression/_sec_d-pi_d-eta.qmd index 580457d07..1e705d18d 100644 --- a/_subfiles/logistic-regression/_sec_d-pi_d-eta.qmd +++ b/_subfiles/logistic-regression/_sec_d-pi_d-eta.qmd @@ -1,5 +1,7 @@ :::{#thm-d_prob-d_logodds} +#### Derivative of probability w.r.t. log-odds + $$\derivf{\prob}{\logodds} = \pi (1-\pi)$$ ::: @@ -39,6 +41,8 @@ $$ :::{#cor-d_pi-d_eta-var} +#### Derivative of probability w.r.t. linear predictor as a variance + If $\pi = \Pr(Y=1| \vX=\vx)$, then: $$\derivf{\pi}{\eta} = \Varf{Y|X=x}$$ diff --git a/_subfiles/logistic-regression/_sec_derive_logistic_loglik.qmd b/_subfiles/logistic-regression/_sec_derive_logistic_loglik.qmd index 201c96830..17f594da4 100644 --- a/_subfiles/logistic-regression/_sec_derive_logistic_loglik.qmd +++ b/_subfiles/logistic-regression/_sec_derive_logistic_loglik.qmd @@ -41,6 +41,8 @@ $$ :::{#lem-logistic-loglik-component} +#### Per-observation log-likelihood component + $$\ell_i(\pi_i) = y_i \eta_i - \logf{1+\odds_i}$$ ::: diff --git a/_subfiles/logistic-regression/_sec_expit.qmd b/_subfiles/logistic-regression/_sec_expit.qmd index 333865654..467841016 100644 --- a/_subfiles/logistic-regression/_sec_expit.qmd +++ b/_subfiles/logistic-regression/_sec_expit.qmd @@ -4,6 +4,8 @@ :::{#thm-prob-from-logodds} +#### Probability as a function of log-odds + ::: notes If $\prob$ is the probability of an event $A$, $\odds$ is the corresponding odds of $A$, @@ -61,6 +63,7 @@ Details left to the reader. --- :::{#thm-expit-prob-logodds} +#### Probability via the expit function If $\prob$ is the probability of an event $A$, $\odds$ is the corresponding odds of $A$, and $\logodds$ is the corresponding log-odds of $A$, diff --git a/_subfiles/logistic-regression/_sec_invodds.qmd b/_subfiles/logistic-regression/_sec_invodds.qmd index 79ebaeb0b..f4ece4df3 100644 --- a/_subfiles/logistic-regression/_sec_invodds.qmd +++ b/_subfiles/logistic-regression/_sec_invodds.qmd @@ -52,6 +52,8 @@ $$ :::{#thm-odds-to-prob} +#### Probability as a function of odds + If $\pi$ is the probability of an event and $\omega$ is the corresponding odds of that event, then: @@ -86,6 +88,8 @@ can be called the **inverse-odds function**. :::{#cor-invodds-pi} +#### Probability via the inverse-odds function + $$\prob = \invoddsf{\odds}$$ ::: @@ -100,6 +104,8 @@ By @def-inv-odds and @thm-odds-to-prob. :::{#cor-invodds-odds-inv} +#### Inverse-odds function inverts the odds function + $$\invoddsf{\odds} = \oddsinvf{\odds}$$ ::: @@ -252,6 +258,7 @@ $$ --- :::{#cor-inverse-odds-nonevent} +#### One plus odds in terms of non-event probability $$1+\odds = \frac{1}{1-\prob}$$ ::: diff --git a/_subfiles/logistic-regression/_sec_logistic_score_fn.qmd b/_subfiles/logistic-regression/_sec_logistic_score_fn.qmd index d75309c04..ff0e26fd4 100644 --- a/_subfiles/logistic-regression/_sec_logistic_score_fn.qmd +++ b/_subfiles/logistic-regression/_sec_logistic_score_fn.qmd @@ -4,6 +4,8 @@ As usual, by independence, we have: :::{#lem-score-logistic} +#### Score function decomposes over observations + $$ \ba \brown{\vec{\llik'}(\vb)} @@ -22,6 +24,8 @@ we can apply the [vector chain rule](math-prereqs.qmd#thm-chain-vec): :::{#lem-logistic-score-comp} +#### Chain rule applied to the score component + $$ \ba \magenta{\vec{\llik_i'}(\vb)} @@ -38,6 +42,8 @@ $$ :::{#lem-d_logodds-d_vb} +#### Derivative of log-odds with respect to coefficients + By [the derivative of a linear combination](math-prereqs.qmd#thm-deriv-lincom): $$ @@ -90,6 +96,7 @@ $$ :::{#thm-logistic-score-comp} +#### Score component for one observation $$\magenta{\llik_i'(\vb)} = \magenta{\vx_i \err_i}$${#eq-score-comp} ::: @@ -106,6 +113,8 @@ we have: :::{#thm-logistic-score-fn} +#### Logistic-model score function + $$ \ba \brown{\vec{\llik'}(\vb)} &= \sumin \magenta{\llik_i'(\vb)}\\ diff --git a/_subfiles/logistic-regression/_sec_logistic_slope_mean.qmd b/_subfiles/logistic-regression/_sec_logistic_slope_mean.qmd index 7c5f26281..38f7fdf53 100644 --- a/_subfiles/logistic-regression/_sec_logistic_slope_mean.qmd +++ b/_subfiles/logistic-regression/_sec_logistic_slope_mean.qmd @@ -2,6 +2,8 @@ :::{#lem-d_logodds-d_x} +#### Derivative of log-odds w.r.t. predictor + By [the derivative of a linear combination](math-prereqs.qmd#thm-deriv-lincom): $$ diff --git a/_subfiles/logistic-regression/_sec_logit.qmd b/_subfiles/logistic-regression/_sec_logit.qmd index 11ba85288..b720bfc00 100644 --- a/_subfiles/logistic-regression/_sec_logit.qmd +++ b/_subfiles/logistic-regression/_sec_logit.qmd @@ -15,6 +15,8 @@ $$\logodds \eqdef \logf{\omega}$${#eq-def-logodds} :::{#thm-logodds-pi} +#### Log-odds as a function of probability + If $\prob$ is the probability of an event $A$, $\odds$ is the corresponding odds of $A$, and $\eta$ is the corresponding log-odds of $A$, @@ -81,6 +83,7 @@ Apply @def-logit-fn and then @def-odds (details left to the reader). --- :::{#cor-logodds-logit} +#### Log-odds via the logit function If $\prob$ is the probability of an event $A$ and $\logodds$ is the corresponding log-odds of $A$, then: diff --git a/_subfiles/logistic-regression/_sec_odds_fn.qmd b/_subfiles/logistic-regression/_sec_odds_fn.qmd index e4913051a..b5e0fcd38 100644 --- a/_subfiles/logistic-regression/_sec_odds_fn.qmd +++ b/_subfiles/logistic-regression/_sec_odds_fn.qmd @@ -22,6 +22,7 @@ $$ --- :::{#thm-prob-to-odds} +#### Odds as a function of probability If $\prob$ is the probability of an event $A$ and $\odds$ is the corresponding odds of $A$, then: @@ -64,6 +65,7 @@ which is easier to remember and manipulate: ::: :::{#cor-oddsf-to-odds} +#### Odds via the odds function If $\prob$ is the probability of an outcome $A$ and $\odds$ is the corresponding odds of $A$, then: diff --git a/_subfiles/logistic-regression/_sec_odds_of_rare_events.qmd b/_subfiles/logistic-regression/_sec_odds_of_rare_events.qmd index f69f49e97..71e44bde0 100644 --- a/_subfiles/logistic-regression/_sec_odds_of_rare_events.qmd +++ b/_subfiles/logistic-regression/_sec_odds_of_rare_events.qmd @@ -50,6 +50,8 @@ $$ :::{#thm-odds-minus-probs} +#### Difference between odds and probability + Let $\odds = \frac{\pi}{1-\pi}$. Then: $$\odds - \pi = \frac{\pi^2}{1-\pi}$$ diff --git a/_subfiles/logistic-regression/_sec_overview_bernoulli_models.qmd b/_subfiles/logistic-regression/_sec_overview_bernoulli_models.qmd index d89ee30fe..9588af8ee 100644 --- a/_subfiles/logistic-regression/_sec_overview_bernoulli_models.qmd +++ b/_subfiles/logistic-regression/_sec_overview_bernoulli_models.qmd @@ -11,6 +11,8 @@ What is logistic regression? :::{#sol-def-logistic-regression} :::{#def-logistic-regression} +#### Logistic regression model + **Logistic regression** is a framework for modeling [binary](data.qmd#def-binary) outcomes, conditional on one or more *predictors* (a.k.a. *covariates*). ::: diff --git a/_subfiles/logistic-regression/_thm-d_odds_d_beta.qmd b/_subfiles/logistic-regression/_thm-d_odds_d_beta.qmd index a9feb13f5..c1f396e39 100644 --- a/_subfiles/logistic-regression/_thm-d_odds_d_beta.qmd +++ b/_subfiles/logistic-regression/_thm-d_odds_d_beta.qmd @@ -1,4 +1,5 @@ :::{#thm-d_odds_d_beta} +#### Gradient of odds w.r.t. coefficients ::: notes To derive $\derivf{\odds}{\vb}$, @@ -19,6 +20,8 @@ $$ :::{#cor-d_odds_d_beta} +#### Gradient of odds w.r.t. coefficients in terms of probability + $$ \ba \derivf{\odds}{\vb} diff --git a/_subfiles/logistic-regression/_thm-d_pi_d_beta.qmd b/_subfiles/logistic-regression/_thm-d_pi_d_beta.qmd index 6b741e07d..8ec56cc7e 100644 --- a/_subfiles/logistic-regression/_thm-d_pi_d_beta.qmd +++ b/_subfiles/logistic-regression/_thm-d_pi_d_beta.qmd @@ -2,7 +2,9 @@ :::{#thm-d_pi_d_beta} -Using +#### Gradient of fitted probability w.r.t. coefficients + +Using @lem-d_logodds-d_vb and @thm-d_prob-d_logodds: diff --git a/_subfiles/logistic-regression/_thm_odds-from-logodds.qmd b/_subfiles/logistic-regression/_thm_odds-from-logodds.qmd index 4edd4612d..cb07dcf1d 100644 --- a/_subfiles/logistic-regression/_thm_odds-from-logodds.qmd +++ b/_subfiles/logistic-regression/_thm_odds-from-logodds.qmd @@ -1,4 +1,5 @@ :::{#lem-odds-from-logodds} +#### Odds from log-odds ::: notes If $\odds$ is the odds of an event $A$ diff --git a/_subfiles/logistic-regression/_thms-deriv-odds.qmd b/_subfiles/logistic-regression/_thms-deriv-odds.qmd index 609035c73..4920650a3 100644 --- a/_subfiles/logistic-regression/_thms-deriv-odds.qmd +++ b/_subfiles/logistic-regression/_thms-deriv-odds.qmd @@ -30,6 +30,8 @@ $$ :::{#cor-deriv-odds} +#### Derivative of odds function in terms of odds + $$\derivf{\odds}{\prob} = \sqf{1+\odds}$$ ::: diff --git a/_subfiles/misc/_cor-deriv-expit.qmd b/_subfiles/misc/_cor-deriv-expit.qmd index c8b34efff..d66c684dc 100644 --- a/_subfiles/misc/_cor-deriv-expit.qmd +++ b/_subfiles/misc/_cor-deriv-expit.qmd @@ -1,3 +1,5 @@ :::{#cor-deriv-expit} +#### Derivative of expit + $$\dexpitf{\logodds} = (\expitf{\logodds}) (1 - \expitf{\logodds})$$ ::: diff --git a/_subfiles/misc/_cor-deriv-invodds.qmd b/_subfiles/misc/_cor-deriv-invodds.qmd index 088d18a2e..e628dda1c 100644 --- a/_subfiles/misc/_cor-deriv-invodds.qmd +++ b/_subfiles/misc/_cor-deriv-invodds.qmd @@ -1,5 +1,7 @@ :::{#cor-deriv-invodds} +#### Derivative of inverse-odds function + $$\doddsinvf{\odds} = \sqf{1 - \invoddsf{\odds}}$$ ::: diff --git a/_subfiles/misc/_cor_prob-nonevent.qmd b/_subfiles/misc/_cor_prob-nonevent.qmd index 42bdbd7bd..ca64e7198 100644 --- a/_subfiles/misc/_cor_prob-nonevent.qmd +++ b/_subfiles/misc/_cor_prob-nonevent.qmd @@ -1,5 +1,7 @@ :::{#cor-inverse-odds-nonevent2} +#### Probability of a non-event from the odds + If $\prob$ is the probability of event $A$ and $\odds$ is the corresponding odds of event $A$, then the probability that $A$ does not occur is: diff --git a/_subfiles/misc/_lem-one-minus-expit.qmd b/_subfiles/misc/_lem-one-minus-expit.qmd index 727192c81..d5b604fdf 100644 --- a/_subfiles/misc/_lem-one-minus-expit.qmd +++ b/_subfiles/misc/_lem-one-minus-expit.qmd @@ -1,4 +1,6 @@ :::{#lem-one-minus-expit} +#### One minus expit + $$1-\expitf{\logodds} = \inv{1+\exp{\logodds}}$$ ::: diff --git a/_subfiles/proportional-hazards-models/_cor-hazard-ratio-vs-baseline.qmd b/_subfiles/proportional-hazards-models/_cor-hazard-ratio-vs-baseline.qmd index 1526b8836..4839ead39 100644 --- a/_subfiles/proportional-hazards-models/_cor-hazard-ratio-vs-baseline.qmd +++ b/_subfiles/proportional-hazards-models/_cor-hazard-ratio-vs-baseline.qmd @@ -1,5 +1,7 @@ :::{#cor-hazard-ratio-vs-baseline} +#### Hazard factor from difference of log-hazard from baseline + $$\hazfactor(t|\vx)= \expf{\diffloghaz(t|\vx)}$$ ::: diff --git a/_subfiles/proportional-hazards-models/_def-ph-model.qmd b/_subfiles/proportional-hazards-models/_def-ph-model.qmd index 788af007a..ad389238a 100644 --- a/_subfiles/proportional-hazards-models/_def-ph-model.qmd +++ b/_subfiles/proportional-hazards-models/_def-ph-model.qmd @@ -27,6 +27,8 @@ Equivalently: :::{#lem-ph-lincomp} +#### Log-hazard as baseline plus a linear combination + In a proportional hazards model (that is, if @eq-ph-diffloghaz holds): $$ diff --git a/_subfiles/proportional-hazards-models/_sec-surv-conditional-hazards.qmd b/_subfiles/proportional-hazards-models/_sec-surv-conditional-hazards.qmd index 5cd03c915..903f8dba0 100644 --- a/_subfiles/proportional-hazards-models/_sec-surv-conditional-hazards.qmd +++ b/_subfiles/proportional-hazards-models/_sec-surv-conditional-hazards.qmd @@ -92,6 +92,8 @@ here $\loghaz(t|\vx)$ depends on **both** $t$ **and** $\vx$. {{< slidebreak >}} :::{#thm-haz-from-loghaz} +#### Hazard from log-hazard + $$ \ba \haz(t|\vx) &= \expf{\loghaz(t|\vx)} @@ -132,6 +134,8 @@ $$ :::{#cor-diffloghaz-log-HR} +#### Difference of log-hazard from baseline equals log of the hazard factor + $$\diffloghaz(t|\vx) = \logf{\hazfactor(t| \vx)}$$ ::: diff --git a/_subfiles/proportional-hazards-models/_sec-understand-coxph.qmd b/_subfiles/proportional-hazards-models/_sec-understand-coxph.qmd index 06d6c2a04..bb6b31eeb 100644 --- a/_subfiles/proportional-hazards-models/_sec-understand-coxph.qmd +++ b/_subfiles/proportional-hazards-models/_sec-understand-coxph.qmd @@ -22,7 +22,9 @@ we will indicate this dependence by extending our notation for hazard: :::{#lem-diffloghaz-ph} -If $\loghaz(t|\vx) = \loghaz_0(t) + \reglincomb$, then: +#### Difference of log-hazards between two covariate patterns + +If $\loghaz(t|\vx) = \loghaz_0(t) + \reglincomb$, then: $$ \ba @@ -37,6 +39,8 @@ $$ :::{#thm-hazard-ratio-ph} +#### Hazard ratio under proportional hazards + If $\loghaz(t|\vx) = \loghaz_0(t) + \reglincomb$, then: $$ @@ -90,6 +94,8 @@ $$\hr(t| \vx : \vxs) = \hr(\vx : \vxs)$$ {{< slidebreak >}} :::{#lem-ph-diffloghaz-0} +#### Difference of log-hazard from baseline + $$\diffloghaz(t|\vx)= \reglincomb$${#eq-diffloghaz-0-ph} ::: @@ -97,7 +103,9 @@ $$\diffloghaz(t|\vx)= \reglincomb$${#eq-diffloghaz-0-ph} :::{#thm-hazard-ratio-vs-baseline-ph} -If $\loghaz(t|\vx) = \loghaz_0(t) + \reglincomb$, then: +#### Hazard ratio versus baseline under proportional hazards + +If $\loghaz(t|\vx) = \loghaz_0(t) + \reglincomb$, then: $$\hazfactor(t|\vx) = \expf{\reglincomb}$$ @@ -123,6 +131,8 @@ $$ {{< slidebreak >}} :::{#thm-ph-haz-decomp} +#### Proportional-hazards decomposition of the hazard + $$\haz(t|\vx) = \haz_0(t)\hazfactor(\vx)$$ ::: @@ -134,6 +144,8 @@ Also: :::{#thm-ph-also} +#### Equivalent forms of the proportional-hazards model + $$ \ba \hazfactor(\vx) &= \expf{\diffloghaz(\vx)} @@ -212,6 +224,8 @@ As we saw above, Cox's proportional hazards model has this property, with $\hr(\ :::{#thm-haz-ratio-notations} +#### Relating the hazard-ratio and hazard-factor notations + ::: notes We are using two similar notations, $\hr(\vx,\vxs)$ and $\hazfactor(\vx)$. @@ -246,6 +260,8 @@ $$ Hence on the log scale, we have: :::{#thm-diff-loghaz-lincom} +#### Difference of log-hazards is a linear combination + $$ \ba \logf{\frac{\haz(t|\vx)}{\haz(t|\vxs)}} diff --git a/chapters/algebra.qmd b/chapters/algebra.qmd index 98afa35ec..77c27b726 100644 --- a/chapters/algebra.qmd +++ b/chapters/algebra.qmd @@ -36,6 +36,7 @@ If $a = b$, then for any function $f(x)$, $f(a) = f(b)$ ### Inequalities :::{#thm-add-ineq} +#### Adding to both sides of an inequality If $a -b$ --- :::{#thm-mult-ineq} +#### Multiplying both sides of an inequality by a nonnegative number If $a < b$ and $c \geq 0$, then $ca < cb$. ::: @@ -59,6 +61,7 @@ If $a < b$ and $c \geq 0$, then $ca < cb$. --- :::{#thm-negative-one} +#### Negation is multiplication by $-1$ $$-a = (-1)*a$$ diff --git a/chapters/basic-statistical-methods.qmd b/chapters/basic-statistical-methods.qmd index 3476ad07a..5eea73314 100644 --- a/chapters/basic-statistical-methods.qmd +++ b/chapters/basic-statistical-methods.qmd @@ -56,6 +56,7 @@ See @vittinghoff2e, §3.2. ### Sample mean ::: {#def-sample-mean} +#### Sample mean The **sample mean** of $n$ observations $x_1, \ldots, x_n$ is: @@ -65,6 +66,7 @@ $$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$ ### Sample variance ::: {#def-sample-variance} +#### Sample variance The **sample variance** is: @@ -78,6 +80,7 @@ of the population variance $\sigma^2$. ### Sample standard deviation ::: {#def-sample-sd} +#### Sample standard deviation The **sample standard deviation** is $s = \sqrt{s^2}$. It is expressed in the same units as the original data, @@ -87,6 +90,7 @@ making it more interpretable than the variance. ### Sample median ::: {#def-sample-median} +#### Sample median The **sample median** is the middle value when observations are sorted in ascending order. @@ -102,6 +106,7 @@ The median is more robust to outliers than the mean. ### Interquartile range ::: {#def-IQR} +#### Interquartile range The **interquartile range (IQR)** is the difference between the 75th percentile (the third quartile, $Q_3$) and the 25th percentile @@ -117,6 +122,7 @@ Like the median, the IQR is robust to outliers. ### Sample proportion ::: {#def-sample-proportion} +#### Sample proportion For a binary outcome, the **sample proportion** of "successes" (coded as 1) is: @@ -268,6 +274,7 @@ See @vittinghoff2e, §3.3. ### Null hypothesis ::: {#def-null-hypothesis} +#### Null hypothesis The **null hypothesis** $H_0$ is a specific claim about the population parameter(s) that we test against the data. @@ -280,6 +287,7 @@ $$H_0: \mu_1 = \mu_2$$ ### Alternative hypothesis ::: {#def-alternative-hypothesis} +#### Alternative hypothesis The **alternative hypothesis** $H_1$ (or $H_A$) is the claim we are trying to find evidence for. @@ -293,6 +301,7 @@ $$H_1: \mu_1 \neq \mu_2$$ ### Definition ::: {#def-two-sample-t-test} +#### Two-sample t-test The **two-sample t-test** (Welch's t-test) tests whether the means of two independent groups are equal. @@ -342,6 +351,7 @@ t.test(glucose_HT, glucose_placebo) ### Definition ::: {#def-one-sample-t-test} +#### One-sample t-test The **one-sample t-test** tests whether the mean of a single population equals a specified null value $\mu_0$: @@ -360,6 +370,7 @@ Under $H_0$, $t \sim t_{n-1}$ (a t-distribution with $n-1$ degrees of freedom). ### Definition ::: {#def-paired-t-test} +#### Paired t-test The **paired t-test** compares two related measurements (e.g., pre- and post-treatment values from the same subjects). @@ -402,6 +413,7 @@ to compare means across $k \geq 2$ groups. ## Definition ::: {#def-one-way-anova} +#### One-way ANOVA In a **one-way ANOVA**, we test: @@ -447,6 +459,7 @@ See @vittinghoff2e, §3.5. ### Definition ::: {#def-contingency-table} +#### Contingency table A **contingency table** (cross-tabulation) displays the joint frequencies of two categorical variables. @@ -490,6 +503,7 @@ hers |> ### Definition ::: {#def-chi-square-test} +#### Chi-square test The **Pearson chi-square test** tests whether two categorical variables are independent. For a $2 \times 2$ table, @@ -520,6 +534,7 @@ chisq.test(hers$exercise, hers$HT) ### Definition ::: {#def-fishers-exact} +#### Fisher's exact test **Fisher's exact test** computes the exact probability of observing a $2 \times 2$ table at least as extreme as the observed table, @@ -552,6 +567,7 @@ See @vittinghoff2e, §3.6. ### Definition ::: {#def-pearson-r} +#### Pearson correlation coefficient The **Pearson correlation coefficient** measures the strength and direction of the linear association between two continuous variables $X$ and $Y$: @@ -579,6 +595,7 @@ cor.test(hers$BMI, hers$glucose, method = "pearson") ### Definition ::: {#def-spearman-r} +#### Spearman rank correlation The **Spearman rank correlation** $r_S$ is the Pearson correlation computed on the *ranks* of the observations. @@ -605,6 +622,7 @@ See @vittinghoff2e, §3.6 and [Linear Models Overview](Linear-models-overview.qm ### Definition ::: {#def-slr} +#### Simple linear regression A **simple linear regression** model relates a continuous outcome $Y$ to a single predictor $X$: @@ -657,6 +675,7 @@ for each 1 kg/m² increase in BMI. ### Definition ::: {#def-r-squared} +#### Coefficient of determination ($R^2$) The **coefficient of determination** $R^2$ measures the proportion of the total variance in $Y$ that is explained by the linear regression on $X$: diff --git a/chapters/negbinom.qmd b/chapters/negbinom.qmd index 60468a442..ae82eafb5 100644 --- a/chapters/negbinom.qmd +++ b/chapters/negbinom.qmd @@ -26,6 +26,7 @@ which brings us back to the Poisson distribution. --- :::{#thm-nb} +#### Mean and variance of the negative binomial distribution If $Y \sim \NegBin(\mu, \rho)$, then: - $\Expp[Y] = \mu$ diff --git a/chapters/parametric-survival-models.qmd b/chapters/parametric-survival-models.qmd index be9107d53..1a6447f6f 100644 --- a/chapters/parametric-survival-models.qmd +++ b/chapters/parametric-survival-models.qmd @@ -124,6 +124,7 @@ ggplot() + ### Properties of Weibull hazard functions :::{#thm-weibull-props} +#### Properties of Weibull hazard functions If $T$ has a Weibull distribution, then: diff --git a/chapters/poisson.qmd b/chapters/poisson.qmd index 40dce34b9..4eb8f7e21 100644 --- a/chapters/poisson.qmd +++ b/chapters/poisson.qmd @@ -292,6 +292,7 @@ Start from definition of event rate and use algebra to solve for $\mu$. --- ::: {#thm-non-exposed} +#### No exposure means no expected events When the exposure magnitude is 0, there is no opportunity for events to occur: $$\Expp[Y|T=0] = 0$$ @@ -314,6 +315,7 @@ In other words, this model assumes that if there is no exposure, there can't be ::: :::{#thm-exposure-log-scale} +#### Exposure is additive on the log scale If $\mu = \lambda\cdot t$, then: @@ -332,8 +334,9 @@ that term is called an **offset**. --- :::{#thm-sum-pois} +#### Sum of independent Poisson random variables -If $X$ and $Y$ are independent Poisson random variables with means +If $X$ and $Y$ are independent Poisson random variables with means $\mu_X$ and $\mu_Y$, their sum, $Z=X+Y$, is also a Poisson random variable, with mean $\mu_Z = \mu_X + \mu_Y$. diff --git a/chapters/probability.qmd b/chapters/probability.qmd index 1d5b21c37..058ed327b 100644 --- a/chapters/probability.qmd +++ b/chapters/probability.qmd @@ -71,6 +71,7 @@ and underpins the @thm-total-prob for countable partitions. --- :::{#thm-prob-subset} +#### Probability of a subset's intersection If $A$ and $B$ are statistical events and $A\subseteq B$, then $\Pr(A \cap B) = \Pr(A)$. ::: @@ -83,6 +84,7 @@ Left to the reader for now. --- :::{#thm-total-prob-1} +#### An event and its complement sum to 1 $$\Pr(A) + \Pr(\neg A) = 1$$ ::: @@ -95,6 +97,7 @@ By properties 2 and 3 of @def-probability. --- :::{#cor-p-neg0} +#### Complement rule $$\Pr(\neg A) = 1 - \Pr(A)$$ ::: @@ -107,6 +110,7 @@ By @thm-total-prob-1 and algebra. --- :::{#cor-p-neg} +#### Complement rule in probability ($\pi$) notation If the probability of an outcome $A$ is $\Pr(A)=\pi$, then the probability that $A$ does not occur is: @@ -1431,6 +1435,7 @@ $$\Cov{X,Y} \eqdef \Expf{(X - \E X)(Y - \E Y)}$$ --- :::{#thm-alt-cov} +#### Alternative formula for covariance $$\Cov{X,Y}= \E{XY} - \E{X} \E{Y}$$ ::: @@ -1643,6 +1648,7 @@ Left to the reader... --- :::{#cor-var-lincom2} +#### Variance of a sum of two random variables For any two random variables $X$ and $Y$ and scalars $a$ and $b$: