updated initial populations & estimates #214

Closed
dariaple wants to merge 3 commits into simpaths:develop from dariaple:develop

@dariaple
Contributor

@dariaple dariaple commented Jul 19, 2025

Summary of updates to initial populations:

  • Added the six-category ethnicity variable dot01 to initial populations, where missing values are partially imputed using the partner's ethnicity (assuming partners have the same ethnicity).
  • Introduced the four-category variable dot, which is used in regression estimates.
  • Added a new disability dummy dlltsd01, which captures both self-reported disability and receipt of disability benefits. This dummy is used in regression estimates instead of dlltsd.
  • Added the variable careHoursProvidedWeekly (capped), as used in the new labour supply estimates. For consistency, lhw (hours of work) is now capped at a maximum of 126 hours per week.

The initial populations data are saved here: …Box\CeMPA shared area\ESPON - OVERLAP_countries\UK\initial_populations (see version of July 2025)
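
For readers who want the data rules above in concrete form, here is a minimal Python sketch. It is an illustration only, not the code used to build the initial populations: the function names are hypothetical, "captures both" is read as a logical OR of the two disability sources, and the cap applied to careHoursProvidedWeekly is not stated in this summary, so only the 126-hour cap on lhw is hard-coded.

```python
from typing import Optional

# Cap on lhw (weekly hours of work) stated in the summary above.
MAX_WEEKLY_WORK_HOURS = 126


def impute_ethnicity(own: Optional[str], partner: Optional[str]) -> Optional[str]:
    """Fill a missing ethnicity with the partner's value.

    Assumes partners share the same ethnicity. If both values are missing,
    the result stays missing (hence only a partial imputation).
    """
    return own if own is not None else partner


def disability_dummy(self_reported: bool, receives_benefit: bool) -> int:
    """dlltsd01-style dummy (assumed OR rule): 1 if either source flags disability."""
    return int(self_reported or receives_benefit)


def cap_hours(hours: float, cap: float = MAX_WEEKLY_WORK_HOURS) -> float:
    """Cap weekly hours at the given maximum (default: the lhw cap)."""
    return min(hours, cap)
```

For example, `cap_hours(140)` returns 126, and `impute_ethnicity(None, None)` stays `None`, which is why a separate "missing" category is still needed in dot01.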

Summary of updates to estimates:

  • Added year-specific dummies (y2020, y2021) to all processes to capture the impact of the pandemic.
  • Included the four-category variable dot in all processes (note that an extended ethnicity variable dot01 with standard five categories and “missing” kept as a separate category is retained in initial populations).
  • Replaced dhe with its physical (dhe_pcs) and mental (dhe_mcs) health components wherever health was included.
  • Replaced disability dummy dlltsd with dlltsd01, which captures both self-reported disability and receipt of disability benefits.
  • Weights: all processes now use dimxwt (individual cross-sectional weight).
  • Automated the production of Excel files with estimates.
  • Process-specific updates are detailed in the Excel files with estimates.
  • Added an Excel file summarising the structure of covariates used in the estimates.
  • Updated the labour supply estimates to the version that accounts for hours of care provided per week.
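
As a small illustration of the year-specific dummies mentioned in the list above, the following Python sketch builds y2020 and y2021 from an observation year. The helper name is hypothetical and this is not the estimation code itself.

```python
from typing import Dict

# Years that receive their own dummy, as named in the summary (y2020, y2021).
PANDEMIC_YEARS = (2020, 2021)


def year_dummies(year: int) -> Dict[str, int]:
    """Return the pandemic year dummies for one observation year."""
    return {f"y{y}": int(year == y) for y in PANDEMIC_YEARS}
```

Observations from any other year get zeros for both dummies, so those years act as the reference category.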

@dariaple dariaple requested a review from pbronka July 19, 2025 23:03
@dav-sonn
Collaborator

dav-sonn commented Aug 6, 2025

@dariaple, we changed the code to accommodate the new estimates. However, two problems in the input files still need fixing:

  • In all input > reg_ files, sheet names must include the country prefix (UK_).
  • In the reg_fertility file, there is a typo in column B: an "E" is missing in "COEFFICIENT".

Would you be able to take a look?
Thanks in advance!
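
A check along these lines could catch both problems before the files are loaded. This is only a sketch under stated assumptions: the workbook is represented as a plain mapping from sheet name to its header row (reading the Excel file itself is left out), the function name is hypothetical, and it assumes every sheet is expected to have "COEFFICIENT" as the column B header, which the comment above only confirms for reg_fertility.

```python
from typing import Dict, List

COUNTRY_PREFIX = "UK_"
EXPECTED_COLUMN_B = "COEFFICIENT"  # assumed to apply to every sheet


def check_sheets(sheets: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed.

    `sheets` maps each sheet name to its header row (column A first).
    """
    problems = []
    for name, header in sheets.items():
        if not name.startswith(COUNTRY_PREFIX):
            problems.append(f"sheet '{name}' is missing the prefix '{COUNTRY_PREFIX}'")
        # Column B is the second entry of the header row.
        column_b = header[1] if len(header) > 1 else None
        if column_b != EXPECTED_COLUMN_B:
            problems.append(f"sheet '{name}' has column B header {column_b!r}")
    return problems
```

Calling `check_sheets` on each reg_ workbook and failing when the returned list is non-empty would surface a missing prefix or a misspelled header early.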

@dav-sonn dav-sonn mentioned this pull request Aug 6, 2025
Changed the reg_ files by renaming sheets to include the country prefix (UK_) and fixed a typo in the reg_fertility file.
Also added spousal health characteristics to the initial populations.
@justin-ven
Contributor

Redundant pull request as advised by email from Daria 17/11/2025

@justin-ven justin-ven closed this Nov 17, 2025

3 participants