Sensory: cross-validated discriminator for the observational relate step #433

Merged: kgdunn merged 9 commits into main from claude/sensory-toolkit-priority-qwo6vv on Jun 30, 2026

kgdunn (Owner) commented Jun 30, 2026

What and why

The observational relate step reported only marginal Pearson correlations
(BH-corrected) between each sensory attribute and each measured descriptor. In
that view a genuine driver and a proxy that rides on it look identical, because
within one dataset they correlate with the attribute almost equally, and a chance
correlation in a small product set can also pass the BH threshold. The
descriptive-panel tutorial promised that telling
these apart "is covered separately" using out-of-sample evidence. This PR builds
that discriminator.

A limitation is stated plainly throughout, because it drives the design and the
narrative: same-data cross-validation cannot separate two near-collinear
descriptors (one carries the same information as the other, so it predicts just
as well out of sample). What a cross-validated discriminator can do is gate on
whether an attribute is predictable at all, demote correlations that do not
survive cross-validation, and report collinear descriptors as one inseparable
cluster rather than ranking within it. Separating a genuine correlate from a
collinear proxy needs an external dataset, a designed experiment, or mechanistic
knowledge.
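
The limitation can be illustrated with a minimal sketch. This is not the PR's code: it assumes NumPy, uses a one-predictor least-squares line as a stand-in for a PLS fit, and `loo_q2` plus the synthetic `driver`, `proxy`, and `noise` variables are hypothetical names invented for the illustration. Q-squared is computed here as 1 - PRESS / total sum of squares.

```python
import numpy as np


def loo_q2(x, y):
    """Leave-one-out Q-squared for a one-predictor least-squares line."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i  # hold out observation i
        slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    total = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / total


rng = np.random.default_rng(0)
n = 20
driver = rng.normal(size=n)                  # the genuine cause of y
proxy = driver + 0.05 * rng.normal(size=n)   # near-collinear with the driver
noise = rng.normal(size=n)                   # unrelated descriptor
y = 2.0 * driver + 0.5 * rng.normal(size=n)

q2 = {
    "driver": loo_q2(driver, y),
    "proxy": loo_q2(proxy, y),
    "noise": loo_q2(noise, y),
}
for name, value in q2.items():
    print(f"{name}: Q2 = {value:.3f}")
```

Under these assumptions the driver and the proxy score almost the same out of sample, while the unrelated descriptor scores far lower. Cross-validation can gate out the noise column, but it gives no basis for preferring the driver over the proxy.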

What changed

  • New PLS diagnostics in multivariate/_diagnostics.py, bound as PLS
    convenience methods and exported from multivariate.methods:
    • target_projection(): the single latent component along a response's
      regression vector (Kvalheim and Karstang, 1989).
    • selectivity_ratio(): per-feature explained-to-residual variance ratio on
      that component (Rajalahti et al., 2009).
  • Sensory discriminator (discriminate_observational, surfaced as
    result.relate["discriminator"] and through the sensory_analyze_descriptive
    tool): a per-attribute leave-one-out Q-squared gate, a selectivity ratio per
    descriptor with a max-statistic (Westfall-Young) permutation test, and
    collinear-cluster grouping. When the descriptor block is near-collinear
    (singular), the PLS fit steps the component count down. On by default;
    existing fast tests opt out with discriminator=False.
  • Tests: unit tests for the new diagnostics (including error paths), a
    direct discriminator test, and an extended end-to-end example that asserts the
    Q-squared gate, the brix collinear cluster kept as one inseparable group
    (Trap B), genuine lone drivers kept, and a chance correlate
    (lab_humidity vs Liking) flagged by the marginal test but demoted by the
    discriminator (Trap A).
  • Tutorial: a "Telling genuine drivers from proxies" section and a Step 5 in
    the worked example, reframed around what observational cross-validation can and
    cannot do.
  • Version bumped to 1.50.0 with CITATION.cff and CHANGELOG.md in sync.
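
One way to picture the collinear-cluster grouping listed above is a union-find pass over the absolute correlation matrix. This is a sketch under stated assumptions, not the implementation in this PR: the 0.95 threshold, the function name `collinear_clusters`, and the `brix_proxy` and `acidity` columns are all hypothetical; only the idea of reporting collinear descriptors under one shared cluster id comes from the description.

```python
import numpy as np


def collinear_clusters(X, names, threshold=0.95):
    """Assign one cluster id to descriptors linked by |correlation| >= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if corr[i, j] >= threshold:
                parent[find(j)] = find(i)  # merge the two groups

    roots = {}
    cluster_id = {}
    for i, name in enumerate(names):
        # Number clusters in order of first appearance.
        cluster_id[name] = roots.setdefault(find(i), len(roots))
    return cluster_id


rng = np.random.default_rng(2)
n = 30
brix = rng.normal(12.0, 1.5, size=n)
brix_proxy = 0.9 * brix + rng.normal(0.0, 0.05, size=n)  # rides on brix
acidity = rng.normal(3.5, 0.3, size=n)                   # independent

X = np.column_stack([brix, brix_proxy, acidity])
cluster_of = collinear_clusters(X, ["brix", "brix_proxy", "acidity"])
print(cluster_of)
```

Descriptors that share a cluster id would then be reported together rather than ranked against each other.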

Verification

ruff check . and mypy src/process_improve clean. Full multivariate suite
(167 passed, 1 skipped) and sensory suites (40 passed) green locally.

claude added 2 commits June 30, 2026 03:41
Two predictive-importance diagnostics next to vip(): target_projection()
forms the single latent component along a response's regression vector
(Kvalheim and Karstang 1989), and selectivity_ratio() reports each
feature's explained-to-residual variance ratio on that component
(Rajalahti et al. 2009). Both are bound as PLS convenience methods and
exported from multivariate.methods. The selectivity ratio ranks
predictive relevance but, by design, gives near-identical scores to two
collinear features, so it does not break ties within a collinear group.

Synthetic driver/proxy/noise block: SR is high for the predictive
direction (driver and its collinear proxy), low for noise, and gives
near-identical scores to the two collinear features (the limitation the
discriminator narrative rests on). TP scores correlate with the
response. Adds multi-response, error-path, and LDPE real-data checks.
codecov Bot commented Jun 30, 2026

Codecov Report

❌ Patch coverage is 89.01734% with 19 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/process_improve/sensory/analysis.py | 88.88% | 7 Missing and 4 partials ⚠️ |
| src/process_improve/multivariate/_diagnostics.py | 88.05% | 5 Missing and 3 partials ⚠️ |

claude added 7 commits June 30, 2026 03:54
discriminate_observational adds out-of-sample evidence on top of the
marginal associations: a per-attribute cross-validated Q-squared gate
(reusing PLS.select_n_components), a selectivity ratio per descriptor on
the target-projected predictive direction with a permutation p-value
(Benjamini-Hochberg corrected per attribute), and a collinear-cluster id
so proxies that ride on a driver are reported as one inseparable group.
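
The Benjamini-Hochberg correction mentioned in this commit can be sketched as follows. This is the standard step-up adjustment written in NumPy for illustration; the function name `benjamini_hochberg` is a hypothetical helper, not a function from this repository.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```

Applied per attribute, the adjustment runs over the descriptors of one attribute at a time, so the number of tests `m` is the number of descriptors.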

The permutation loop is skipped for attributes the Q-squared gate finds
unpredictable. Surfaced as a 'discriminator' key on the relate output and
threaded through analyze_descriptive. Existing fast tests opt out via
discriminator=False.

Adds discriminator / n_permutations / random_state options to the
sensory_analyze_descriptive tool input and mentions the discriminator
output in the tool description. Passes a DataFrame to select_n_components
and indexes the selectivity ratio by position so mypy is satisfied.

…minator

The per-descriptor permutation now controls multiplicity with a
Westfall-Young max-statistic null per attribute, so a single genuine
driver is detectable without the resolution loss of a per-test BH floor.
The Q-squared gate uses leave-one-out cross-validation (cheap and robust
to near-collinear blocks), and PLS fits step the component count down on
a singular block.
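
A max-statistic permutation null of the kind described here can be sketched like this. The sketch makes several assumptions: it uses the absolute Pearson correlation as the per-descriptor statistic instead of the selectivity ratio, the function name and its `n_permutations` / `random_state` defaults are hypothetical, and the data are synthetic. Each observed statistic is compared against the distribution of the largest statistic across all descriptors under permutation of the response, which is what controls multiplicity.

```python
import numpy as np


def max_stat_permutation_pvalues(X, y, n_permutations=999, random_state=0):
    """Single-step max-statistic adjusted p-values, one per column of X."""
    rng = np.random.default_rng(random_state)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise descriptors

    def abs_corr(target):
        ts = (target - target.mean()) / target.std()
        return np.abs(Xs.T @ ts) / len(ts)      # |Pearson r| per descriptor

    observed = abs_corr(y)
    # Null: the largest statistic over all descriptors for each permuted response.
    null_max = np.array(
        [abs_corr(rng.permutation(y)).max() for _ in range(n_permutations)]
    )
    exceed = (null_max[:, None] >= observed[None, :]).sum(axis=0)
    return (1 + exceed) / (n_permutations + 1)  # add-one correction


rng = np.random.default_rng(3)
n = 30
X = rng.normal(size=(n, 5))
y = 1.5 * X[:, 0] + rng.normal(size=n)          # only column 0 drives y

p_adj = max_stat_permutation_pvalues(X, y)
print(np.round(p_adj, 3))
```

Because the reference distribution is the maximum over descriptors, a single strong driver can still reach the smallest attainable p-value, 1 / (n_permutations + 1), however many other descriptors are tested alongside it.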

The end-to-end example gains three measurement-condition nuisances; the
test now asserts the Q-squared gate, the brix collinear cluster kept as
one inseparable group (Trap B), genuine lone drivers kept, and a chance
correlate (lab_humidity vs Liking) flagged by the marginal test but
demoted by the discriminator (Trap A).

Adds a 'Telling genuine drivers from proxies' section explaining the
cross-validated Q-squared gate, the selectivity ratio on the
target-projected direction, and collinear clustering, and a Step 5 in the
worked example. States plainly that observational cross-validation demotes
coincidences and groups proxies with their driver but cannot rank
descriptors within a collinear cluster: that needs an external dataset, a
designed experiment, or mechanistic knowledge.

Version 1.50.0 (new feature: PLS target projection / selectivity ratio and
the cross-validated sensory discriminator), with CITATION and CHANGELOG in
sync. Adds unit tests for the diagnostic error paths (response selection,
unfitted models, the small-sample F-critical) and for the discriminator's
Q-squared gate and collinear clustering.

…torial

Add an 'under the hood' note pointing at target_projection and
selectivity_ratio with their one-line formulas, plus a References block
(Kvalheim and Karstang 1989; Rajalahti et al. 2009), so the worked example
is self-contained.

Each @tool_spec description now carries a Returns section enumerating the
JSON keys it produces, so an importing agent knows the output contract
without inspecting a response first. The analyze tool spells out the full
discriminator structure (per_attribute, descriptors, clusters) and notes
that descriptors sharing a cluster_id cannot be told apart.
@kgdunn kgdunn merged commit 6251178 into main Jun 30, 2026
14 checks passed
@kgdunn kgdunn deleted the claude/sensory-toolkit-priority-qwo6vv branch June 30, 2026 06:21
2 participants