
Feature/155 add ghq case scores 0 12 #223

Closed
andrewbaxter439 wants to merge 8 commits into develop from
feature/155-add-ghq-case-scores-0-12

Conversation

@andrewbaxter439
Collaborator

What

  • Recalculates GHQ12 caseness scores (dhm_ghq) as 0-12 (in line with UKHLS scghq2_dv)
  • Adds new regression terms for dhm and dhm_ghq to keep these in alignment
  • Updates initial population files, training population and expected statistics

Why

  • 0-12 scale is more informative
  • Can be converted to binary psychological distress measures post run
  • Still graphs isPsychologicallyDistressed as a cutoff value (>=4) from this prediction
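The post-run conversion from the 0-12 score to a binary distress flag can be sketched as below. This is a minimal illustration, not the SimPaths implementation: the class name `GhqCaseness` and the choice to return `null` for a missing score are assumptions; only the `>=4` cutoff and the 0-12 range come from the description above.

```java
// Hypothetical helper for illustration only; not taken from Person.java.
final class GhqCaseness {
    // Cutoff stated in the PR description: a score of 4 or more counts as distressed.
    static final double DISTRESS_CUTOFF = 4.0;

    // Converts a 0-12 caseness score to a binary flag.
    // Returns null for a missing score so that missingness is not silently
    // recoded as "not distressed" (an assumption about the desired behaviour).
    static Boolean isPsychologicallyDistressed(Double dhmGhq) {
        if (dhmGhq == null) {
            return null;
        }
        if (dhmGhq < 0.0 || dhmGhq > 12.0) {
            throw new IllegalArgumentException("GHQ-12 caseness score must be in [0, 12]: " + dhmGhq);
        }
        return dhmGhq >= DISTRESS_CUTOFF;
    }
}
```

Keeping the cutoff in one named constant means the binary measure can be redefined after a run without touching the stored 0-12 scores.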

@andrewbaxter439 andrewbaxter439 requested a review from Copilot August 6, 2025 15:06
Contributor

Copilot AI left a comment


Pull Request Overview

This PR implements a feature to recalculate GHQ12 caseness scores (dhm_ghq) from a binary 0/1 scale to a continuous 0-12 scale, aligning with UKHLS standards. This change allows for more informative psychological distress measurement while maintaining backward compatibility through binary cutoff conversion.

  • Converts dhm_ghq from Boolean to Double type throughout the codebase
  • Updates regression models from Logit to Linear for GHQ case predictions
  • Maintains backward compatibility by using >=4 cutoff for binary psychological distress indicator

Reviewed Changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| Person.java | Changes dhm_ghq field from Boolean to Double, updates related methods and constraints |
| Parameters.java | Converts GHQ case regression models from BinomialRegression to LinearRegression |
| RegressionName.java | Updates regression types from Logit to Linear for GHQ case models |
| ManagerRegressions.java | Moves GHQ case regressions from BinomialRegression to LinearRegression handler |
| 02_create_UKHLS_variables.do | Updates data processing to use 0-12 GHQ scale instead of binary caseness |
| Statistics files | Expected test output updates reflecting the model changes |

@andrewbaxter439
Collaborator Author

Problems that likely need addressing with this:

  1. Initial populations not giving realistic values (imputing missing?)

Lots of '2's are added to the initial population at the 'replacing missing values' stage. UKHLS data:
[image: og]
With imputed values added:
[image: impute]

Solutions:

  • impute all as 0s (mostly young age missing)
  • different regression - ordered logistic regression?
  2. Predictions on a linear scale don't predict realistic values

First year population (not updated, same distribution as initial population above):
[image]

After 5 years of simulation:

[image]

The problem seems to be that, with linear estimation, the coefficients can never add up to 12, nor do they predict the expected frequency of 0s. So it perhaps needs to be replaced by another estimation method?
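One way to check the "can never add up to 12" concern is to compute the range of values a linear predictor can reach over the allowed covariate ranges. The sketch below is illustrative only: the class name `LinearRangeCheck` and every coefficient and covariate bound in the usage note are made up, not taken from the estimated SimPaths regressions.

```java
// Hypothetical helper for illustration only; not part of SimPaths or JAS-mine.
final class LinearRangeCheck {
    // Returns {lowest, highest} value of intercept + sum(beta[i] * x[i])
    // when each covariate x[i] is restricted to the interval [min[i], max[i]].
    static double[] attainableRange(double intercept, double[] beta, double[] min, double[] max) {
        double lo = intercept;
        double hi = intercept;
        for (int i = 0; i < beta.length; i++) {
            double atMin = beta[i] * min[i];
            double atMax = beta[i] * max[i];
            // A negative coefficient reaches its lowest contribution at max[i].
            lo += Math.min(atMin, atMax);
            hi += Math.max(atMin, atMax);
        }
        return new double[] {lo, hi};
    }
}
```

For example, with a made-up intercept of 1.5, coefficients {0.5, -0.25, 1.0} and covariate ranges [0, 1], [0, 4] and [0, 1], the attainable predictions run from 0.5 to 3.0, so neither a score of 0 nor a score of 12 can be predicted without an error term.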

@andrewbaxter439
Collaborator Author

Update: a zero-inflated negative binomial model seems best for predicting these values

[image]

This would have to be implemented in two places:

  • imputation of missing data in 02_create_UKHLS_variables.do
  • A new ZeroInflatedNegativeBinomialRegression model in JAS-mine-core

Both seem possible, but we would need to figure out the best approach for introducing this to SimPaths. Zinfnb models take two sets of coefficients: one to determine the probability of a 0 and one to determine the probabilities of the 'counts'. Implementing this may therefore need two sheets per regression, with two stages of estimation. One possibility is to create a NegativeBinomialRegression class on its own and then either a) combine the binomial and negative binomial implementations to create zero-inflation in a new JAS-mine object wrapper, or b) explicitly run both in turn as separate regressions in SimPaths.
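The two-stage structure could look roughly like the sketch below. It is not a JAS-mine API: the class `ZeroInflatedNegBin` and all method and parameter names are hypothetical. It also assumes a logit link for the zero stage, a log link for the count stage, a dispersion parameter `alpha` (variance = mu + alpha * mu^2), and truncation of the count distribution at the maximum score of 12.

```java
// Hypothetical sketch of zero-inflated negative binomial prediction.
// Not an existing JAS-mine class; all names here are assumptions.
final class ZeroInflatedNegBin {
    // Logistic function: maps the zero-stage linear predictor to a probability.
    static double logistic(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Dot product of coefficients and covariates (use x[0] = 1.0 for an intercept).
    static double linearPredictor(double[] coefficients, double[] x) {
        double sum = 0.0;
        for (int i = 0; i < coefficients.length; i++) {
            sum += coefficients[i] * x[i];
        }
        return sum;
    }

    // Probability of each score 0..maxScore for one individual.
    // zeroCoef drives the probability of a structural zero,
    // countCoef drives the mean of the negative binomial counts.
    static double[] pmf(double[] zeroCoef, double[] countCoef, double[] x, double alpha, int maxScore) {
        double pZero = logistic(linearPredictor(zeroCoef, x));
        double mu = Math.exp(linearPredictor(countCoef, x));
        double r = 1.0 / alpha; // negative binomial "size" parameter

        // Negative binomial probabilities via the recurrence
        // P(k+1) = P(k) * (k + r) / (k + 1) * mu / (r + mu), starting from P(0).
        double[] nb = new double[maxScore + 1];
        double term = Math.pow(r / (r + mu), r);
        double total = 0.0;
        for (int k = 0; k <= maxScore; k++) {
            nb[k] = term;
            total += term;
            term *= (k + r) / (k + 1.0) * (mu / (r + mu));
        }

        // Truncate the count part at maxScore (assumption), then mix in the structural zeros.
        double[] p = new double[maxScore + 1];
        for (int k = 0; k <= maxScore; k++) {
            p[k] = (1.0 - pZero) * nb[k] / total;
        }
        p[0] += pZero;
        return p;
    }

    // Inverse-CDF draw: u is a uniform random number in [0, 1).
    static int sample(double[] pmf, double u) {
        double cumulative = 0.0;
        for (int k = 0; k < pmf.length; k++) {
            cumulative += pmf[k];
            if (u < cumulative) {
                return k;
            }
        }
        return pmf.length - 1; // guards against floating-point shortfall
    }
}
```

A caller would pass one coefficient set per stage, which maps onto the "two sheets per regression" idea, and draw a score with something like `ZeroInflatedNegBin.sample(p, random.nextDouble())`. Whether this lives in a single JAS-mine wrapper (option a) or as two separate regressions in SimPaths (option b) only changes where `pZero` and `mu` are computed.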

@andrewbaxter439 andrewbaxter439 force-pushed the feature/155-add-ghq-case-scores-0-12 branch from 29868a6 to 59c65b6 Compare September 24, 2025 09:56
justin-ven added a commit to justin-ven/SimPaths that referenced this pull request Nov 17, 2025
@andrewbaxter439 andrewbaxter439 force-pushed the feature/155-add-ghq-case-scores-0-12 branch from 1315114 to 030f300 Compare November 17, 2025 12:33
@justin-ven
Contributor

Merged with #294

@justin-ven justin-ven closed this Nov 25, 2025
@andrewbaxter439 andrewbaxter439 deleted the feature/155-add-ghq-case-scores-0-12 branch January 21, 2026 15:47