Reconcile differences in glucose outcomes for continuous rather than binary #75

@davebridges

Gap 3: Audit and fix the "BMI (Non-Linear)" row in Table 4 — label, units, and contrast direction all appear wrong; also reconcile with absence-of-evidence critique on glucose additivity

Labels: reviewer-response bug table analysis priority-high

Summary

Two separate reviewers raised overlapping concerns about the glucose-additivity claim, through different lenses:

  • Reviewer A (original Gap 3): Table 3 reports no Cushing's × obesity interaction for fasting glucose (p=0.51, dichotomous), while Table 4's "BMI (Non-Linear)" row reports a significant interaction. Framed as a dichotomization-vs-continuous methodology gap.
  • Reviewer B (Gap 6, folded in): The paper declares additivity on a non-significant interaction test (p=0.51) without a formal equivalence framework. Absence of evidence is not evidence of absence. The 95% CI (-4.45 to 8.98 mg/dL) is wide enough to include clinically meaningful synergy.

Both reviewers are flagging the same underlying interpretive problem — the glucose additivity claim is stated more strongly than the statistics support — but via different mechanisms. On inspection of the actual code, the Table 4 "BMI (Non-Linear)" row that triggered Reviewer A's concern also has three technical problems that together explain most of the apparent discrepancy.

Three problems with the "BMI (Non-Linear)" row

  1. Label is wrong or misleading — the row is populated by bmi.linear.models, which uses lm(... Cushings * BMI) (linear). A separate spline function (bmi_obese_vs_lean_contrast_spline using ns(BMI, df = 4)) exists but may not be what's feeding the table.
  2. Units are per-BMI-unit, not the total lean-to-obese contrast the Methods paragraph describes — so the row is not on a comparable scale to every other row in Table 4.
  3. Contrast direction looks reversed: method = list("Obese – Lean" = c(1, -1)) applied to emtrends(specs = ~ Cushings, ...) evaluates Cushings=0 − Cushings=1, not obese−lean. The label does not match what the contrast computes.

Evidence the row is on a different scale / sign convention

Scaling the reported per-unit estimates by the approximate lean-to-obese BMI gap (~15 BMI units) gives values that don't match the dichotomous row, and signs flip for six of seven outcomes (all except glucose):

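| Outcome | Dichotomous interaction (Table 3B) | BMI (Non-Linear) reported | Scaled × ~15 BMI units |
|---------|------------------------------------|---------------------------|------------------------|
| Glucose | +2.26                              | +0.35                     | +5.3                   |
| HbA1c   | +0.47                              | −0.04                     | −0.6                   |
| ALT     | +38.26                             | −3.36                     | −50                    |
| AST     | +40.02                             | −4.41                     | −66                    |
| MAP     | −6.58                              | +0.47                     | +7                     |
| SBP     | −8.89                              | +0.62                     | +9.3                   |
| DBP     | −5.48                              | +0.40                     | +6                     |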

If these were both the same interaction quantity on comparable scales, the signs should agree. The fact that they flip for every outcome except glucose (HbA1c included) is the smoking gun.
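The scaling and sign comparison can be reproduced in a few lines of R. This is a minimal sketch that re-enters the values from the table above; the 15-unit BMI gap is the approximation quoted in the text, and the column names are illustrative only.

```r
# Values copied from the comparison table in this issue.
tab <- data.frame(
  outcome     = c("Glucose", "HbA1c", "ALT", "AST", "MAP", "SBP", "DBP"),
  dichotomous = c(2.26, 0.47, 38.26, 40.02, -6.58, -8.89, -5.48),
  per_unit    = c(0.35, -0.04, -3.36, -4.41, 0.47, 0.62, 0.40)
)

bmi_gap <- 15  # approximate lean-to-obese BMI gap (assumption from the text)

# Put the per-BMI-unit estimate on the lean-to-obese scale,
# then flag rows whose sign disagrees with the dichotomous interaction.
tab$scaled    <- tab$per_unit * bmi_gap
tab$sign_flip <- sign(tab$scaled) != sign(tab$dichotomous)

print(tab)
```

Rows with `sign_flip == TRUE` are the ones where the scaled continuous estimate and the dichotomous interaction point in opposite directions.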

Root cause hypotheses

Two things are probably happening together:

  1. Sign flip from contrast coding. emtrends(model, specs = ~ Cushings, var = "BMI", at = list(BMI = c(lean, obese))) returns rows ordered by Cushings factor level (0, 1). The contrast c(1, -1) named "Obese – Lean" therefore computes slope_Cushings=0 − slope_Cushings=1 — the opposite sign of the dichotomous interaction (Cushings=1 − Cushings=0). The at = list(BMI = ...) argument has no effect for a linear model because the slope is constant, so this is really just the Cushings:BMI coefficient with a sign flip.
  2. Different model structure. The linear-BMI model is value ~ Gender + RaceEth + Cushings * BMI — no Obesity indicator. The dichotomous model uses Cushings * Obesity (threshold). These aren't just different parameterizations of the same quantity; they answer different questions. Presenting them in adjacent rows without flagging that is confusing regardless of any sign bug.
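Hypothesis 1 can be checked on simulated data. The sketch below assumes `Cushings` is a factor with levels 0 and 1 and fits a linear interaction model to a toy dataset (all names and values are hypothetical, not the study data); it then applies the same `c(1, -1)` contrast to `emtrends()` output and compares the result with the model's `Cushings1:BMI` coefficient.

```r
library(emmeans)

set.seed(1)
n   <- 200
toy <- data.frame(
  Cushings = factor(rep(c(0, 1), each = n / 2)),
  BMI      = runif(n, 20, 45)
)
# Toy outcome with a known positive Cushings x BMI slope difference.
toy$value <- 90 + 0.3 * toy$BMI +
  0.4 * (toy$Cushings == "1") * toy$BMI + rnorm(n, sd = 5)

fit <- lm(value ~ Cushings * BMI, data = toy)

# Same pattern as the suspect code: BMI slopes by Cushings level,
# rows ordered by factor level (0, then 1).
trends  <- emtrends(fit, specs = ~ Cushings, var = "BMI",
                    at = list(BMI = c(25, 40)))
flipped <- contrast(trends, method = list("Obese - Lean" = c(1, -1)))

est_flipped <- summary(flipped)$estimate
coef_int    <- unname(coef(fit)["Cushings1:BMI"])

# The labelled contrast is slope(C=0) - slope(C=1), i.e. minus the coefficient.
c(contrast = est_flipped, coefficient = coef_int)
```

If the two printed numbers are equal in magnitude and opposite in sign, the row is the per-unit `Cushings:BMI` coefficient with its sign reversed, regardless of the values passed in `at`.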

On Reviewer B's equivalence-testing request

Legitimate part: the paper states additivity from p=0.51 without acknowledging the CI is wide enough to include clinically meaningful synergy. Adding the CI explicitly to the prose and softening the claim is the right fix.
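To make the width of that interval concrete, here is a rough sketch using only the numbers reported in this issue. It back-calculates an approximate standard error under a normal approximation (an assumption; the model's CI is presumably t-based) and checks the interval against ±5 mg/dL, which is only the reviewer's illustrative margin and not an established threshold.

```r
# Reported interaction for fasting glucose (mg/dL), from this issue.
est   <- 2.26
ci_lo <- -4.45
ci_hi <- 8.98

# Approximate SE implied by the 95% CI (normal approximation, an assumption).
se_approx <- (ci_hi - ci_lo) / (2 * qnorm(0.975))

# Illustrative margin only: the reviewer's example value.
margin <- 5
ci_within_margin <- ci_lo > -margin && ci_hi < margin

round(se_approx, 2)
ci_within_margin
```

The upper bound sits well above the example margin, which is the numerical reason the additivity wording needs softening.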

Overstated part: formal TOST equivalence testing for a secondary interpretive framing in an observational EHR study is methodological theater. TOST comes from pharmaceutical bioequivalence, where regulators set the tolerance. There is no established "smallest clinically important interaction" for fasting glucose — the reviewer invented ±5 mg/dL as an example. Manufacturing that number, applying TOST, and reporting the result adds ritual without information.

What actually answers the reviewer's concern: the marginal-effects plot from the fixed spline model (below). If the Cushing's − control glucose difference is visually flat across BMI 20–50 with a narrow confidence band, the equivalence argument is made graphically. Additivity implies a constant difference across BMI, not a zero one, so flatness is what matters rather than proximity to zero. If it isn't flat, TOST wasn't going to save the claim anyway.

Skip formal TOST. Skip Reviewer B's Option 2 (100 random re-matches) — with the same specification and ratio, propensity matching is approximately deterministic up to tie-breaking, so this produces noise dressed as robustness.

What needs to change

A. Audit the code (Reviewer A)

  • Confirm which model's output is feeding the BMI (Non-Linear) row in the final table (linear vs spline). Grep for where the row is assembled and trace back to the function call.
  • If it is the linear function: rename the row. If it is the spline function: verify the contrast-vector arithmetic produces the expected sign by hand on a toy dataset.
  • Regardless of which: verify the sign convention matches the rest of Table 4 (Cushings effect in obese − Cushings effect in lean).
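For the by-hand check on a toy dataset, one option is to skip emmeans entirely and difference `predict()` output, so the sign convention is explicit. This is a sketch on simulated data with a known positive synergy (all names, BMI values, and effect sizes are hypothetical); `Cushings` is assumed to be a factor with levels 0 and 1.

```r
library(splines)

set.seed(42)
n   <- 400
toy <- data.frame(
  Cushings = factor(rep(c(0, 1), times = n / 2)),
  BMI      = runif(n, 20, 50)
)
is_cushings <- toy$Cushings == "1"

# Known truth: the Cushings effect grows by 0.5 per BMI unit,
# so the 25 -> 35 interaction contrast should be about +5.
toy$value <- 90 + 0.2 * toy$BMI + 3 * is_cushings +
  0.5 * is_cushings * (toy$BMI - 20) + rnorm(n, sd = 2)

fit <- lm(value ~ Cushings * ns(BMI, df = 4), data = toy)

grid      <- expand.grid(Cushings = factor(c(0, 1)), BMI = c(25, 35))
grid$pred <- predict(fit, newdata = grid)

cushings_effect_at <- function(bmi) {
  grid$pred[grid$Cushings == "1" & grid$BMI == bmi] -
    grid$pred[grid$Cushings == "0" & grid$BMI == bmi]
}

# (Cushings effect at obese BMI) - (Cushings effect at lean BMI)
interaction_by_hand <- cushings_effect_at(35) - cushings_effect_at(25)
interaction_by_hand
```

Any emmeans contrast vector used for the table should return the same number, with the same sign, when run on this toy fit.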

B. Fix and restructure the row (Reviewer A)

Replace the single ambiguous "BMI (Non-Linear)" row with two rows that are clearly labeled and on the same scale as the rest of the table (native outcome units, lean-mean-BMI → obese-mean-BMI contrast, Cushings effect direction):

  1. BMI as continuous (linear) — interaction contrasted across the lean-to-obese BMI range, not per unit.
  2. BMI as continuous (natural spline, 4 df) — same contrast, using the existing ns(BMI, df = 4) model.

C. R code to produce directly comparable contrasts

library(emmeans)
library(splines)
library(dplyr)

# Helper: Cushings × BMI interaction contrasted across
# the lean-mean-BMI to obese-mean-BMI range, on the native outcome scale,
# matching the sign convention used elsewhere in Table 4
# (Cushings effect at obese BMI) − (Cushings effect at lean BMI).
bmi_interaction_contrast <- function(model, data) {

  means_bmi <- data |>
    filter(!is.na(BMI)) |>
    mutate(obese = BMI >= 30) |>
    group_by(obese) |>
    summarise(mean_bmi = mean(BMI), .groups = "drop")

  bmi_lean  <- means_bmi$mean_bmi[!means_bmi$obese]
  bmi_obese <- means_bmi$mean_bmi[ means_bmi$obese]

  rg <- ref_grid(
    model,
    data = data,
    at = list(BMI = c(bmi_lean, bmi_obese)),
    cov.reduce = FALSE
  )

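  # emmeans varies the first factor fastest, so row order in emm is:
  # C=0/lean, C=1/lean, C=0/obese, C=1/obese
  # Interaction = (C=1,obese − C=0,obese) − (C=1,lean − C=0,lean),
  # i.e. weights (+1, −1, −1, +1). The vector is symmetric, so it gives
  # the same contrast if the two factors are listed in the other order.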
  emm <- emmeans(rg, ~ Cushings * BMI)

  contr <- contrast(
    emm,
    method = list(
      "Cushings × BMI (obese − lean)" = c( 1, -1, -1,  1)
    ),
    infer = TRUE
  )

  s <- as.data.frame(summary(contr))
  tibble(
    estimate = s$estimate,
    ci_lo    = s$lower.CL,
    ci_hi    = s$upper.CL,
    p        = s$p.value
  )
}

# Apply to both linear and spline versions for each outcome
outcomes <- list(
  Glucose = list(lin = lm.glucose.bmi.linear,
                 spl = lm.glucose.bmi.non.linear,
                 dat = glucose.data),
  HbA1c   = list(lin = lm.hba1c.bmi.linear,
                 spl = lm.hba1c.bmi.non.linear,
                 dat = hba1c.data),
  ALT     = list(lin = lm.alt.bmi.linear,
                 spl = lm.alt.bmi.non.linear,
                 dat = alt.data),
  AST     = list(lin = lm.ast.bmi.linear,
                 spl = lm.ast.bmi.non.linear,
                 dat = ast.data),
  MAP     = list(lin = lm.map.bmi.linear,
                 spl = lm.map.bmi.non.linear,
                 dat = bp.data),
  SBP     = list(lin = lm.sbp.bmi.linear,
                 spl = lm.sbp.bmi.non.linear,
                 dat = bp.data),
  DBP     = list(lin = lm.dbp.bmi.linear,
                 spl = lm.dbp.bmi.non.linear,
                 dat = bp.data)
)

bmi_continuous_rows <- purrr::imap_dfr(outcomes, \(m, name) {
  bind_rows(
    bmi_interaction_contrast(m$lin, m$dat) |> mutate(outcome = name, model = "Linear BMI"),
    bmi_interaction_contrast(m$spl, m$dat) |> mutate(outcome = name, model = "Spline BMI (ns df=4)")
  )
})

Both new rows should now be (a) in native outcome units, (b) on the same lean-to-obese scale as the dichotomous row, and (c) signed so that positive = synergy (larger Cushings effect at higher BMI).

D. Supplementary figure — marginal effects across BMI (addresses Reviewer B directly)

Produce a marginal-effects plot showing the predicted Cushing's − control difference across BMI 20–50 for each outcome using the spline model. For glucose specifically, this is the single most effective answer to the "absence of evidence" critique: if the curve is visually flat with a narrow confidence band, the equivalence argument is made graphically without needing TOST. Under additivity the difference is constant across BMI but need not be zero, so judge the slope of the curve rather than its distance from the dashed zero line.

library(marginaleffects)
library(ggplot2)

plot_cushings_by_bmi <- function(model, data, outcome_name, y_lab) {
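  # NOTE: covariate names and levels passed to datagrid() must match the
  # fitted model. The formula quoted earlier in this issue uses Gender and
  # RaceEth, so rename GenderCode / RaceEthnicity below if the models differ.
  contrasts_by_bmi <- avg_comparisons(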
    model,
    variables = "Cushings",
    newdata   = datagrid(BMI = seq(20, 50, by = 0.5),
                         GenderCode = "F",
                         RaceEthnicity = "White"),
    by = "BMI"
  )

  ggplot(contrasts_by_bmi, aes(x = BMI, y = estimate)) +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.25) +
    geom_line() +
    geom_hline(yintercept = 0, linetype = "dashed") +
    geom_vline(xintercept = 30, linetype = "dotted") +
    labs(
      title = outcome_name,
      x = "BMI (kg/m²)",
      y = y_lab
    ) +
    theme_minimal()
}

E. Sanity check before publishing

Once the new rows are computed, confirm:

  • Signs agree with the dichotomous row for every outcome. If they still disagree after this fix, that's a real finding (threshold vs gradient effect) and belongs in the Discussion. If they agree, most of Gap 3 dissolves.
  • Estimates are within an order of magnitude of the dichotomous row.
  • The linear and spline rows agree with each other reasonably well. If the spline result is much larger than the linear one, there is genuine nonlinearity to discuss.
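These checks are easy to script once the rows exist. The helper below is a hypothetical sketch (the function name and the example inputs are made up for illustration and are not results from the study); it reports sign agreement and the log10 ratio of magnitudes between a continuous-BMI row and the dichotomous row.

```r
# Hypothetical helper: compare a continuous-BMI row with the dichotomous row.
compare_rows <- function(dichotomous, continuous) {
  data.frame(
    signs_agree = sign(dichotomous) == sign(continuous),
    # |log10 ratio| < 1 means the two estimates are within one order of magnitude
    log10_ratio = log10(abs(continuous) / abs(dichotomous))
  )
}

# Illustrative inputs only, not study estimates.
res <- compare_rows(dichotomous = c(2, 40), continuous = c(5, -50))
res
```

The same helper can be called twice per outcome, once for the linear row and once for the spline row.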

F. Manuscript text

For the glucose "absence of evidence" critique (Reviewer B), regardless of audit outcome:

Results (Dysglycemia section) — add the CI:

"The obesity-by-Cushing's interaction for fasting glucose was 2.26 mg/dL (95% CI −4.45 to 8.98), consistent with additivity but with a confidence interval that does not exclude modest synergistic effects."

Discussion — soften the claim:

Replace "Glucose homeostasis appears to be explained well by an additive model in our primary analysis where obesity and Cushing's disease have non-synergistic effects."

with: "Glucose homeostasis is consistent with additivity in our primary analysis, though the confidence interval (−4.45 to 8.98 mg/dL) does not formally exclude small interactions."

Limitations — acknowledge:

"We did not conduct formal equivalence testing; a non-significant interaction test cannot by itself establish additivity, and our data cannot exclude interactions smaller than the detectable effect size given our sample."

For the audit outcome (Reviewer A), scenario-dependent:

  • Scenario 1 — corrected continuous rows agree with dichotomous. Add:

    "Modeling BMI continuously (both linear and with natural cubic splines) produced interaction estimates consistent in direction and magnitude with the primary dichotomous analysis (Table 4)."

  • Scenario 2 — continuous rows agree in direction but differ in magnitude. Add:

    "Continuous-BMI models produced interaction estimates in the same direction but of somewhat different magnitude than the dichotomous analysis (Table 4), reflecting the difference between a threshold specification and a gradient specification."

  • Scenario 3 — continuous rows disagree in direction for some outcomes. Report honestly:

    "For [outcome], the dichotomous interaction was positive (synergy at BMI ≥30) while the continuous-BMI interaction was negative, consistent with a threshold rather than gradient effect: liver enzyme elevations in Cushing's disease may appear primarily once patients cross the obesity threshold, rather than scaling linearly with BMI."

Acceptance criteria

  • Source of the current BMI (Non-Linear) row identified in the code and documented in this issue.
  • Row replaced with two rows (Linear BMI, Spline BMI (ns df=4)) whose contrast is on the lean-to-obese scale, in native outcome units, with the same sign convention as the other rows.
  • Sanity-check table comparing dichotomous, linear-BMI, and spline-BMI interactions for all seven outcomes included in the supplement.
  • Supplementary figure: Cushings − control difference across BMI for each outcome from the spline model.
  • Glucose CI reported in prose and additivity claim softened in both Results and Discussion (addresses Reviewer B).
  • Limitations paragraph acknowledges that a non-significant interaction test cannot establish additivity.
  • Manuscript text updated to reflect whichever audit scenario (1, 2, or 3) the corrected analysis supports.
  • Skip formal TOST equivalence testing (Reviewer B Option 1 in its formal form) — the CI-framed softening and marginal-effects plot are sufficient.
  • Skip 100-random-seed re-matching (Reviewer B Option 2) — approximately deterministic up to tie-breaking.

Notes

  • The two reviewer critiques collapse onto the same response: fix the table, report the CI, and show the marginal-effects plot. The only Reviewer-B-specific text addition is the limitations paragraph about equivalence testing.
  • The sign-flip pattern in the table above is a good reminder that contrast labels in emmeans don't audit themselves. Worth adding a unit test that compares the linear-model contrast to the known Cushings:BMI coefficient.
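One way such a unit test could look, as a sketch on simulated data (names, BMI values, and effect sizes are hypothetical; `Cushings` is assumed to be a 0/1 factor): for a linear model, the lean-to-obese interaction contrast must equal the `Cushings1:BMI` coefficient multiplied by the BMI gap, with the same sign.

```r
library(emmeans)

set.seed(7)
toy <- data.frame(
  Cushings = factor(rep(c(0, 1), each = 100)),
  BMI      = runif(200, 20, 45)
)
toy$value <- 90 + 0.3 * toy$BMI +
  0.4 * (toy$Cushings == "1") * toy$BMI + rnorm(200, sd = 5)

fit <- lm(value ~ Cushings * BMI, data = toy)

bmi_lean  <- 25  # hypothetical lean-group mean BMI
bmi_obese <- 40  # hypothetical obese-group mean BMI

# Grid rows: C=0/lean, C=1/lean, C=0/obese, C=1/obese (first factor fastest).
emm   <- emmeans(fit, ~ Cushings * BMI,
                 at = list(BMI = c(bmi_lean, bmi_obese)))
contr <- contrast(emm, method = list("Cushings x BMI" = c(1, -1, -1, 1)))

est      <- summary(contr)$estimate
expected <- unname(coef(fit)["Cushings1:BMI"]) * (bmi_obese - bmi_lean)

# Fails loudly if the contrast is on the wrong scale or has the wrong sign.
stopifnot(isTRUE(all.equal(est, expected, tolerance = 1e-8)))
```

A check like this would have caught both the per-unit scale and the reversed sign in the current row.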

References

  1. Source critiquing BMI dichotomization (Reviewer A).
  2. Continuous vs binary BMI in metabolic outcome interaction studies (Reviewer A).
  3. Altman & Bland "absence of evidence is not evidence of absence" (Reviewer B, ref 1 in their critique).
  4. Schuirmann TOST original paper (Reviewer B, refs 2–4) — cite only if a specific reviewer demands formal equivalence testing; otherwise omit.

(Locate full cites before response letter.)
