WIP: Support polychoric correlation method in do_cor#1517
Open
kei51e wants to merge 3 commits into
Open
Conversation
Add method = "polychoric" to do_cor (Analytics Correlation). Since stats::cor()/cor.test() do not support it, compute the correlation matrix and its standard-error matrix in a single polycor::hetcor( ML = TRUE, std.err = TRUE) call, then derive a two-sided z-test p value (z = rho / SE). Columns are coerced to ordered factors so polychoric is used for every pair; non-estimable pairs (constant or near-perfect columns) come back as NA, which the UI treats as not-significant. - R/stats_wrapper.R: polychoric branch in do_cor_internal; the existing pearson/spearman path is unchanged - DESCRIPTION: add polycor to Imports - tests: polychoric core behaviour, constant-column safety, and complex column names (spaces / multibyte / symbols) The tam-side UI/config lives in the paired change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ests polycor::hetcor() only accepts "complete.obs" and "pairwise.complete.obs", so do_cor(method="polychoric") errored when `use` was "everything", "all.obs", or "na.or.complete" -- values the other methods accept via cor()/cor.test(). Map those to "pairwise.complete.obs". Add tests for the use mapping, grouped (Repeat By) data, and NA values via pairwise complete obs, and tighten the value-correctness assertions so the strong pair recovers the latent ~0.7 (not inflated) and the independent pair is near-zero and not significant. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR adds support for computing polychoric correlations via do_cor(..., method = "polychoric"), intended for ordinal variables (e.g., Likert scales). The implementation integrates polycor::hetcor() into the existing correlation workflow so that both correlation estimates and standard errors (and derived z-test p-values) are produced in the same shape as existing do_cor() outputs.
Changes:
- Add a
"polychoric"branch indo_cor_internal()usingpolycor::hetcor()and derive z-statistics / two-sided p-values fromrho / SE. - Update
do_cor/do_cor.kv_roxygen docs to include"polychoric"as a supported method. - Add
polycorto packageImportsand add test coverage for correctness, grouping, NA handling, constant columns, complex names, andusemapping.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
R/stats_wrapper.R |
Implements the new "polychoric" method path in do_cor_internal() and updates method documentation. |
tests/testthat/test_stats_wrapper.R |
Adds targeted tests for polychoric correlations across multiple scenarios (correctness, NA handling, grouped data, constants, naming, use). |
DESCRIPTION |
Bumps version/date and adds polycor to Imports. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Add a
polychoriccorrelation method todo_cor()for ordinal variables (e.g. survey scales 1-5), resolving #30488.do_cor(..., method = "polychoric")computes the polychoric correlation matrix and standard errors in one call viapolycor::hetcor()(ML estimation). Every column is coerced to an ordered factor so polychoric (not Pearson/polyserial) is used for every pair, and a two-sided z-test P value is derived fromrho / SE. Non-estimable pairs (constant or near-perfect columns) come back as NA and are treated as not-significant on the UI.usevalues thathetcor()does not support ("everything","all.obs","na.or.complete") are mapped to"pairwise.complete.obs", so polychoric does not error where pearson/spearman/kendall would succeed."complete.obs"is passed through.polycorto Imports.Tests cover:
pairwise.complete.obs; every pair is still estimated.usemapping.Checklist
Make sure you have performed the following items before submitting this pull request.
If not, please describe the reason.
tests/testthat/test_stats_wrapper.Rpasses (931 assertions, 0 failures)🤖 Generated with Claude Code