Add do file for estimating financial distress parameters #315
dkopasker
left a comment
Just two comments: one about duplication of a variable, and one about compliance with the variable naming policy (which perhaps happened while you were on leave).
dimxwt dhhwt jbhrs jshrs j2hrs jbstat les_c3 les_c4 lessp_c3 lessp_c4 lesdf_c4 ydses_c5 month scghq2_dv ydisp ///
ypnbihs_dv yptciihs_dv yplgrs_dv ynbcpdf_dv ypncp ypnoab swv sedex ssscp sprfm sedag stm dagsp lhw l1_lhw pno ppno hgbioad1 hgbioad2 der adultchildflag ///
econ_benefits econ_benefits_nonuc econ_benefits_uc ///
fihhmnnet1_dv ieqmoecd_dv ///
The equivalence scale is generated later in the do file (see gen moecd_eq = . //Modified OECD equivalence scale). For consistency, it may be best to use the generated variable.
That's good to know. Currently, ieqmoecd_dv is used to calculate equivalised income in input\InitialPopulations\compile\RegressionEstimates\variable_update.do, which is used by most of the estimation files.
It sounds like a sensible idea to replace this with moecd_eq, but this would affect all estimation scripts and ideally would require all of the scripts to be rerun (i.e. the parameters reestimated). @dav-sonn, @dariaple or others in Essex might have a view on this.
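To gauge how much a switch would matter before rerunning everything, a quick comparison along the following lines might help. This is only a sketch: it assumes equivalised income is household net income divided by the scale, the data are made-up toy values rather than the pooled UKHLS file, and yeq_survey, yeq_generated and scale_mismatch are hypothetical names used purely for illustration.

```stata
* Toy data only: the values below are made up and stand in for the pooled UKHLS file.
clear
input double(fihhmnnet1_dv ieqmoecd_dv moecd_eq)
2000 1.5 1.5
3000 2.1 2.0
1800 1.0 1.0
end

* Equivalised income using the survey-supplied scale (hypothetical variable name)
gen double yeq_survey = fihhmnnet1_dv / ieqmoecd_dv

* Equivalised income using the scale generated in the do file (hypothetical variable name)
gen double yeq_generated = fihhmnnet1_dv / moecd_eq

* Flag observations where the two definitions disagree by more than a small tolerance
gen byte scale_mismatch = abs(yeq_survey - yeq_generated) > 1e-6 ///
    if !missing(yeq_survey, yeq_generated)
tabulate scale_mismatch, missing
```

If the two scales agree on the real data, the replacement would leave the estimates unchanged and rerunning the scripts would be a formality. If they differ, the tabulation gives a first idea of how many observations are affected.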
* HM1_L: GHQ12 score 0-36 of all working-age adults - baseline effects *
**********************************************************************

logit financial_distress ///
Does the variable named "financial_distress" comply with the new variable naming structure and does it need to?
Good question. I understand that @dav-sonn's open PR #313 will rename it to "yFinDstrssFlag" in the initial population CSV files. That said, the regression estimation scripts all use the pooled dataset ukhls_pooled_all_obs_09.dta, which is created before this renaming takes place, so all the variables here (exp_emp, lhw_c5, and so on) will continue to have their "old" names.
I think this is a question that goes beyond this PR and should probably be resolved separately (although @dav-sonn might have a view?).
Yeah, it looks like #313 will add the code to change the name in the CSV file after the initial population has been created, independently of the regressions. So this change should be compatible with the current state and should sync up nicely with the refactoring!
As Matteo pointed out below, we haven't refactored the regressors in the processes' estimation scripts. Refactored variables are those in the initial populations, the output CSV files, and the Person, BenefitUnit, and Household classes.
I hope this clarifies!
**********************************************************************
* HM1_L: GHQ12 score 0-36 of all working-age adults - baseline effects *
Update heading, along lines of:
-* HM1_L: GHQ12 score 0-36 of all working-age adults - baseline effects *
+* Financial Distress (binary) - estimated log odds of experiencing financial distress *
For the other processes we have agreed not to refactor the estimation
scripts, but only refactor the final names...
…On Tue, 20 Jan 2026 at 15:19, Andrew Baxter ***@***.***> wrote:
***@***.**** commented on this pull request.
In input/InitialPopulations/compile/RegressionEstimates/reg_financial_distress.do:
> +*******************************************************************
+
+use "$dir_ukhls_data/ukhls_pooled_all_obs_09.dta", clear
+do "$dir_do/variable_update"
+
+
+
+* Sample selection
+drop if dag < 16
+
+
+xtset idperson swv
+
+
+**********************************************************************
+* HM1_L: GHQ12 score 0-36 of all working-age adults - baseline effects *
Update heading, along lines of:
⬇️ Suggested change
-* HM1_L: GHQ12 score 0-36 of all working-age adults - baseline effects *
+* Financial Distress (binary) - estimated log odds of experiencing financial distress *
What
This PR:
reg_financial_distress.xlsx) with the output from this script based on the current initial population (this is slightly different from what was there before)
And, unrelatedly (but necessary for the above):
fihhmnnet1_dv and ieqmoecd_dv are used in input\InitialPopulations\compile\RegressionEstimates\variable_update.do, but weren't being kept in the intermediate data files used (ukhls_pooled_all_obs_09.dta) - now they are
Why
We want the code used to estimate model parameters to be stored in this repo. This wasn't the case for the financial distress process.
Validation