WIP: Support polychoric correlation method in do_cor #1517

Open
kei51e wants to merge 3 commits into master from fix/issue-30488

Conversation

@kei51e kei51e commented Jun 3, 2026

Collaborator

Description

Add a polychoric correlation method to do_cor() for ordinal variables (e.g. survey scales 1-5), resolving #30488.

  • do_cor(..., method = "polychoric") computes the polychoric correlation matrix and its standard errors in one call via polycor::hetcor() (ML estimation). Every column is coerced to an ordered factor so that polychoric (not Pearson or polyserial) correlation is used for every pair, and a two-sided z-test p-value is derived from z = rho / SE. Non-estimable pairs (constant or near-perfect columns) come back as NA and are treated as not significant in the UI.
  • Values of use that hetcor() does not support ("everything", "all.obs", "na.or.complete") are mapped to "pairwise.complete.obs", so polychoric does not error where pearson/spearman/kendall would succeed. "complete.obs" is passed through unchanged.
  • Added polycor to Imports.
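The p-value derivation described in the first bullet can be sketched in base R. This is a minimal illustration, not the code in R/stats_wrapper.R: the helper name z_test_p_values and the toy matrices are hypothetical, and setting the diagonal to NA is an assumption made here for readability.

```r
# Hypothetical helper: two-sided z-test p-values from a correlation matrix
# (rho) and a matching matrix of standard errors (se).
z_test_p_values <- function(rho, se) {
  stopifnot(identical(dim(rho), dim(se)))
  z <- rho / se
  # A zero or missing SE gives a non-finite ratio; report those cells as NA.
  z[!is.finite(z)] <- NA_real_
  p <- 2 * stats::pnorm(-abs(z))
  # Assumption: a variable's correlation with itself is not tested.
  diag(p) <- NA_real_
  p
}

# Toy inputs: one strong pair, one weak pair, one non-estimable pair (NA).
rho <- matrix(c(1.00, 0.70, 0.05,
                0.70, 1.00,   NA,
                0.05,   NA, 1.00), nrow = 3, byrow = TRUE)
se  <- matrix(c(0.00, 0.04, 0.08,
                0.04, 0.00,   NA,
                0.08,   NA, 0.00), nrow = 3, byrow = TRUE)

p <- z_test_p_values(rho, se)
```

With these inputs the strong pair gets a very small p-value, the weak pair is not significant at the 0.05 level, and the non-estimable pair stays NA, which matches the "treated as not significant" behaviour described above.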

Tests cover:

  • Value correctness: the strongly correlated pair recovers the latent correlation of about 0.7 (within bounds, not inflated); the independent pair is near zero and not statistically significant.
  • Grouped (Repeat By) data: each group gets its own correlation (group A positive, group B negative).
  • NA handling: missing responses are dropped pairwise via pairwise.complete.obs; every pair is still estimated.
  • Constant columns, complex (multibyte/special-char) column names, and the use mapping.

Checklist

Make sure you have performed the following items before submitting this pull request.
If not, please describe the reason.

  • Add test cases for this fix/enhancement
  • Pass devtools::check()
  • Pass devtools::test() -- tests/testthat/test_stats_wrapper.R passes (931 assertions, 0 failures)
  • Test installing from github
  • Tested with Exploratory

🤖 Generated with Claude Code

kei51e and others added 3 commits June 2, 2026 22:31
Add method = "polychoric" to do_cor (Analytics Correlation). Since
stats::cor()/cor.test() do not support it, compute the correlation
matrix and its standard-error matrix in a single polycor::hetcor(
ML = TRUE, std.err = TRUE) call, then derive a two-sided z-test p value
(z = rho / SE). Columns are coerced to ordered factors so polychoric is
used for every pair; non-estimable pairs (constant or near-perfect
columns) come back as NA, which the UI treats as not-significant.
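
The coercion step mentioned above can be sketched in base R as follows. This is an illustrative sketch only; the data frame survey and its columns are hypothetical, and the actual implementation in do_cor_internal may differ.

```r
# Hypothetical survey responses on a 1-5 scale, stored as plain integers.
survey <- data.frame(q1 = c(1L, 3L, 5L, 2L, 4L),
                     q2 = c(2L, 2L, 4L, 1L, 5L))

# Convert every column to an ordered factor. For numeric input, the levels
# follow the numeric order of the observed values.
survey_ord <- survey
survey_ord[] <- lapply(survey, function(x) factor(x, ordered = TRUE))
```

Because every column is then ordinal, a function that picks the correlation type per pair from the column classes would see only ordinal-ordinal pairs.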

- R/stats_wrapper.R: polychoric branch in do_cor_internal; the existing
  pearson/spearman path is unchanged
- DESCRIPTION: add polycor to Imports
- tests: polychoric core behaviour, constant-column safety, and complex
  column names (spaces / multibyte / symbols)

The tam-side UI/config lives in the paired change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ests

polycor::hetcor() only accepts "complete.obs" and "pairwise.complete.obs",
so do_cor(method="polychoric") errored when `use` was "everything",
"all.obs", or "na.or.complete" -- values the other methods accept via
cor()/cor.test(). Map those to "pairwise.complete.obs".
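
The mapping can be sketched as a small base R helper. The function name map_use_for_hetcor is hypothetical and the real code in R/stats_wrapper.R may be structured differently.

```r
# Hypothetical helper: translate a stats::cor()-style `use` value into one
# of the two values the commit message says hetcor() accepts.
map_use_for_hetcor <- function(use) {
  unsupported <- c("everything", "all.obs", "na.or.complete")
  if (use %in% unsupported) "pairwise.complete.obs" else use
}

mapped <- vapply(
  c("everything", "all.obs", "na.or.complete",
    "complete.obs", "pairwise.complete.obs"),
  map_use_for_hetcor,
  character(1)
)
```

Only the three unsupported values are rewritten; "complete.obs" and "pairwise.complete.obs" pass through as given.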

Add tests for the use mapping, grouped (Repeat By) data, and NA values via
pairwise complete obs, and tighten the value-correctness assertions so the
strong pair recovers the latent ~0.7 (not inflated) and the independent pair
is near-zero and not significant.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@kei51e kei51e requested a review from Copilot June 3, 2026 06:36
@kei51e kei51e changed the title from "feat(#30488): support polychoric correlation method in do_cor" to "Support polychoric correlation method in do_cor" Jun 3, 2026
@kei51e kei51e closed this Jun 3, 2026
@kei51e kei51e reopened this Jun 3, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR adds support for computing polychoric correlations via do_cor(..., method = "polychoric"), intended for ordinal variables (e.g., Likert scales). The implementation integrates polycor::hetcor() into the existing correlation workflow so that both correlation estimates and standard errors (and derived z-test p-values) are produced in the same shape as existing do_cor() outputs.

Changes:

  • Add a "polychoric" branch in do_cor_internal() using polycor::hetcor() and derive z-statistics / two-sided p-values from rho / SE.
  • Update do_cor/do_cor.kv_ roxygen docs to include "polychoric" as a supported method.
  • Add polycor to package Imports and add test coverage for correctness, grouping, NA handling, constant columns, complex names, and use mapping.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| R/stats_wrapper.R | Implements the new "polychoric" method path in do_cor_internal() and updates method documentation. |
| tests/testthat/test_stats_wrapper.R | Adds targeted tests for polychoric correlations across multiple scenarios (correctness, NA handling, grouped data, constants, naming, use). |
| DESCRIPTION | Bumps version/date and adds polycor to Imports. |

@kei51e kei51e closed this Jun 3, 2026
@kei51e kei51e reopened this Jun 3, 2026
@kei51e kei51e changed the title from "Support polychoric correlation method in do_cor" to "WIP: Support polychoric correlation method in do_cor" Jun 7, 2026