@MaxGhenis (Collaborator) commented Oct 5, 2025

Summary

This PR removes all random number generation from policyengine-uk. All stochastic variables are now generated in policyengine-uk-data and read from the dataset. The country package is now a purely deterministic rules engine.

⚠️ MERGE ORDER: The companion policyengine-uk-data PR #203 must be merged FIRST, followed by this PR.

Changes

Removed

  • All take-up rate parameters (moved to policyengine-uk-data)
  • All random seed/draw variables for take-up decisions
  • Random seed variables for other stochastic processes

Simplified

All would_claim variables now use dataset values with deterministic fallbacks:

  • would_claim_child_benefit (default: True)
  • child_benefit_opts_out (default: False)
  • would_claim_pc (default: True)
  • would_claim_uc (default: True)
  • would_claim_tfc (default: True)
  • would_claim_extended_childcare (default: True)
  • would_claim_universal_childcare (default: True)
  • would_claim_targeted_childcare (default: True)

Other stochastic variables simplified to dataset-only:

  • household_owns_tv (default: True)
  • would_evade_tv_licence_fee (default: False)
  • main_residential_property_purchased_is_first_home (default: False)
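
The dataset-with-fallback behaviour listed above can be sketched roughly as follows. This is an illustration only, not the policyengine-uk implementation: `read_input`, `DEFAULTS`, and the dict-based dataset are hypothetical names, and only the default values are taken from the lists above.

```python
# Hypothetical sketch: read a stochastic variable from the dataset,
# falling back to a fixed default when the dataset does not provide it.

# Defaults copied from the PR description; the mapping itself is illustrative.
DEFAULTS = {
    "would_claim_child_benefit": True,
    "child_benefit_opts_out": False,
    "would_claim_uc": True,
    "household_owns_tv": True,
    "would_evade_tv_licence_fee": False,
}


def read_input(dataset, name, n_entities):
    """Return the dataset column if present, else the deterministic default."""
    if name in dataset:
        return list(dataset[name])
    return [DEFAULTS[name]] * n_entities


# A microsimulation dataset that carries a pre-generated take-up flag...
dataset = {"would_claim_uc": [True, False, True]}
uc = read_input(dataset, "would_claim_uc", 3)

# ...and a variable the dataset omits, as in the single-household calculator.
tv = read_input(dataset, "household_owns_tv", 3)
```

No random number generator appears anywhere in this path: the result depends only on the dataset contents and the fixed defaults.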

Added

  • would_claim_marriage_allowance variable (default: True)
  • higher_earner_tie_break variable (default: 0.0), used for tie-breaking in is_higher_earner
  • attends_private_school_random_draw variable (default: 1.0), used for the income-conditional draw in attends_private_school

Preserved formulas (fully deterministic)

These variables keep their formulas but with NO random() calls:

  • attends_private_school (uses random draw from dataset)
  • is_disabled_for_benefits (deterministic rule: disabled if receives qualifying benefits)
  • is_higher_earner (uses random draw from dataset for tie-breaking)
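
A rough sketch of how a formula can stay deterministic while consuming a draw stored in the dataset. The function names, the `draw < probability` comparison, and the tie-breaking rule are assumptions for illustration; the actual formulas in policyengine-uk may differ.

```python
# Hypothetical sketch: formulas that consume pre-generated draws.
# No call to random() is made here; the draws arrive as ordinary inputs.


def attends_private_school(draws, probabilities):
    """Attend if the stored draw falls below the income-conditional probability.

    Assumed rule. With the default draw of 1.0 and probabilities of at
    most 1, nobody attends.
    """
    return [d < p for d, p in zip(draws, probabilities)]


def higher_earner_flags(incomes, tie_breaks):
    """Flag the single higher earner in a unit, breaking income ties by draw.

    Assumed rule: compare (income, tie_break) pairs; the first maximum wins
    if both income and tie-break are equal.
    """
    keys = list(zip(incomes, tie_breaks))
    winner = max(range(len(keys)), key=keys.__getitem__)
    return [i == winner for i in range(len(keys))]


school = attends_private_school([0.05, 0.5, 1.0], [0.1, 0.1, 0.1])
tied = higher_earner_flags([30_000, 30_000], [0.2, 0.7])
untied = higher_earner_flags([30_000, 20_000], [0.1, 0.9])
```

Given the same dataset, these functions always return the same result, which is the property the PR description calls "fully deterministic".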

Test updates

Updated expected fiscal impacts in reforms_config.yaml to reflect that stochastic values are now read from the dataset rather than generated in the country package.

Trade-offs

IMPORTANT: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.

To adjust take-up rates for analysis, the microdata must be regenerated with updated parameter values in policyengine-uk-data.
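
As an illustration of what regenerating the microdata involves, here is a minimal sketch of a data-side generation step. It is not the policyengine-uk-data code: `generate_take_up`, the seed handling, and the example rates 0.6 and 0.8 are all hypothetical.

```python
import random

# Hypothetical sketch of the data-package side: take-up flags are drawn once
# from a rate and a seed, then stored in the dataset as plain booleans.


def generate_take_up(n, take_up_rate, seed):
    """Draw n uniform values and mark a unit as claiming if draw < rate."""
    rng = random.Random(seed)
    return [rng.random() < take_up_rate for _ in range(n)]


# Changing the rate means regenerating the stored column.
baseline = generate_take_up(1000, 0.6, seed=0)
regenerated = generate_take_up(1000, 0.8, seed=0)
```

Because both columns reuse the same seed in this sketch, every unit that claims at the lower rate also claims at the higher one, so the comparison between the two datasets isolates the effect of the rate change.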

Test Plan

  • Package imports successfully
  • All existing tests pass
  • Microsimulations produce correct results
  • Policy calculator (non-microsim) still works with deterministic defaults

Related PRs

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

This change moves all randomness generation from policyengine-uk to
policyengine-uk-data, following the pattern established in policyengine-us.

Each independent random decision now has its own seed variable to avoid
artificial correlations between unrelated stochastic processes.

Changes:
- Add 11 new seed variables (4 person-level, 4 benunit-level, 3 household-level):
  - is_disabled_for_benefits_seed
  - marriage_allowance_take_up_seed
  - is_higher_earner_seed
  - attends_private_school_seed
  - child_benefit_take_up_seed
  - child_benefit_opts_out_seed
  - pension_credit_take_up_seed
  - universal_credit_take_up_seed
  - first_home_purchase_seed
  - household_owns_tv_seed
  - tv_licence_evasion_seed

- Update all variables using random() to use their specific seed variable

This ensures reproducible simulations and allows the dataset to control
all stochastic elements of the model.
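
One way to picture why separate seeds avoid artificial correlations is the sketch below. It is an assumption-laden illustration, not the mechanism used in this commit: `stream_for`, `draws`, and the hash-based seed derivation are hypothetical, and only the seed variable names come from the list above.

```python
import hashlib
import random

# Hypothetical sketch: give each stochastic decision its own reproducible
# stream, so that unrelated decisions do not share the same draws.


def stream_for(name, base_seed=0):
    """Derive an independent, reproducible generator from a variable name."""
    digest = hashlib.sha256(f"{base_seed}:{name}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def draws(name, n, base_seed=0):
    """Return n uniform draws from the stream belonging to `name`."""
    rng = stream_for(name, base_seed)
    return [rng.random() for _ in range(n)]


cb = draws("child_benefit_take_up_seed", 5)
tv = draws("tv_licence_evasion_seed", 5)
```

If both decisions instead reused one shared draw, a household with a low draw would be pushed towards claiming Child Benefit and evading the TV licence at the same time, which is the kind of spurious correlation the per-variable seeds are meant to prevent.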

Related: policyengine-uk-data PR (must be merged first)
@MaxGhenis MaxGhenis force-pushed the migrate-random-to-data branch 10 times, most recently from 598937b to d4b0b58 Compare October 5, 2025 17:31
… from dataset

This change removes all random number generation from policyengine-uk. All
stochastic variables are now generated in policyengine-uk-data and read from
the dataset. The country package is now a purely deterministic rules engine.

## Key Changes

- Remove all take-up rate parameters (moved to policyengine-uk-data)
- Remove all random seed/draw variables for take-up decisions
- Simplify would_claim variables to dataset-only (no formula)
- Keep formulas with fallback to random() for policy calculator (non-microsim)
- Add would_claim_marriage_allowance variable
- Add random draw variables for tie-breaking and conditional probabilities

### Variables now sourced from dataset:
**Take-up decisions (boolean):**
- would_claim_child_benefit
- child_benefit_opts_out
- would_claim_pc
- would_claim_uc
- would_claim_marriage_allowance
- would_claim_tfc
- would_claim_extended_childcare
- would_claim_universal_childcare
- would_claim_targeted_childcare

**Other stochastic variables:**
- household_owns_tv
- would_evade_tv_licence_fee
- main_residential_property_purchased_is_first_home

**Random draws (for formulas):**
- is_higher_earner_random_draw (tie-breaking)
- attends_private_school_random_draw (income-conditional)

### Formulas preserved for policy calculator:
- attends_private_school (complex income percentile logic)
- is_disabled_for_benefits (conditional on qualifying benefits)
- is_higher_earner (uses random draw for tie-breaking)

These fall back to random() when not in a microsimulation context.

## Trade-offs

**IMPORTANT**: Take-up rates can no longer be adjusted via policy reforms.
They are fixed in the microdata. This is an acceptable trade-off for the
cleaner architecture. To adjust take-up rates, regenerate the microdata.

Related: policyengine-uk-data PR #[TBD] (must be merged FIRST)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@MaxGhenis
Collaborator Author

Superseded by #1439, a fresh PR opened after resolving conflicts.

@MaxGhenis MaxGhenis closed this Dec 3, 2025
3 participants