180 changes: 180 additions & 0 deletions docs/book/validation/student-loan-repayments.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,180 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Student loan repayment validation\n",
"\n",
"This notebook compares PolicyEngine UK's calculated student loan repayments against reported repayments from the Family Resources Survey (FRS) microdata. Understanding the alignment between modelled and reported values helps assess model accuracy and identify areas for improvement."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Background\n",
"\n",
"Student loan repayments in the UK are calculated as a percentage of income above a threshold, varying by loan plan:\n",
"\n",
"- **Plan 1** (pre-2012 England/Wales; Northern Ireland): 9% of income above £24,990 (2024-25)\n",
"- **Plan 2** (post-2012 England/Wales): 9% of income above £27,295 (2024-25)\n",
"- **Plan 4** (Scotland): 9% of income above £31,395 (2024-25)\n",
"- **Plan 5** (England post-2023): 9% of income above £25,000 (2024-25)\n",
"- **Postgraduate**: 6% of income above £21,000 (2024-25)\n",
"\n",
"The FRS captures reported student loan repayments, while PolicyEngine calculates repayments based on income and loan plan type."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from policyengine_uk import Microsimulation\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"sim = Microsimulation()\n",
"year = 2025"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get student loan data\n",
"reported = sim.calculate(\"student_loan_repayments\", year).values\n",
"modelled = sim.calculate(\"student_loan_repayment\", year).values\n",
"plan = sim.calculate(\"student_loan_plan\", year).values\n",
"income = sim.calculate(\"adjusted_net_income\", year).values\n",
"weight = sim.calculate(\"person_weight\", year).values"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Student loan plan distribution\n",
"\n",
"First, let's examine the distribution of student loan plans in the weighted population:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Plan distribution (weighted)\n",
"plan_names = {0: \"None\", 1: \"Plan 1\", 2: \"Plan 2\", 3: \"Postgraduate\", 4: \"Plan 4\", 5: \"Plan 5\"}\n",
"for plan_id, name in plan_names.items():\n",
"    count = weight[plan == plan_id].sum() / 1e6\n",
"    print(f\"{name}: {count:.2f}m people\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Aggregate comparison\n",
"\n",
"Comparing total reported vs modelled repayments:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"total_reported = (reported * weight).sum() / 1e9\n",
"total_modelled = (modelled * weight).sum() / 1e9\n",
"\n",
"print(f\"Total reported repayments: £{total_reported:.2f}bn\")\n",
"print(f\"Total modelled repayments: £{total_modelled:.2f}bn\")\n",
"print(f\"Ratio (modelled/reported): {total_modelled/total_reported:.2f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Individual-level alignment\n",
"\n",
"For people who report making student loan repayments, how well do our calculations align?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Filter to people with reported repayments > 0\n",
"has_reported = reported > 0\n",
"\n",
"if has_reported.sum() > 0:\n",
"    # Correlation (unweighted, across sample records)\n",
"    correlation = np.corrcoef(reported[has_reported], modelled[has_reported])[0, 1]\n",
"    print(f\"Correlation (people with reported > 0): {correlation:.3f}\")\n",
"\n",
"    # Match rate (unweighted share of reporters with a positive modelled value)\n",
"    both_positive = (reported > 0) & (modelled > 0)\n",
"    match_rate = both_positive.sum() / has_reported.sum() * 100\n",
"    print(f\"People with both reported & modelled > 0: {match_rate:.1f}% of reporters\")\n",
"\n",
"    # Mean values (unweighted)\n",
"    print(f\"\\nMean reported (reporters): £{reported[has_reported].mean():,.0f}\")\n",
"    print(f\"Mean modelled (reporters): £{modelled[has_reported].mean():,.0f}\")\n",
"    print(f\"Mean income (reporters): £{income[has_reported].mean():,.0f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Analysis of discrepancies\n",
"\n",
"Several factors may explain the relatively low individual-level correlation between reported and modelled repayments:\n",
"\n",
"1. **Timing differences**: Reported repayments reflect actual payments made during the tax year, which may include voluntary overpayments or vary based on pay frequency and employment changes.\n",
"\n",
"2. **Employment variation**: Someone may have had periods below or above the repayment threshold during the year, while our model assumes constant annual income.\n",
"\n",
"3. **Multiple loan plans**: Some individuals may have both Plan 1 and Plan 2 loans, complicating the calculation.\n",
"\n",
"4. **Study status**: Current students may have different repayment patterns not fully captured in the model.\n",
"\n",
"5. **Plan misclassification**: The loan plan imputation in the microdata may not perfectly match individuals' actual loan types.\n",
"\n",
"Despite individual-level variation, the aggregate totals are reasonably aligned, suggesting the model captures the overall scale of student loan repayments in the UK economy."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"PolicyEngine UK's student loan repayment model produces aggregate totals within a reasonable range of reported values. The individual-level correlation is lower than for income tax calculations, reflecting the complexity of student loan timing and the limitations of annual income-based calculations. For microsimulation purposes, the model provides a reasonable approximation of student loan repayment flows, while users should be aware of these limitations when analysing individual-level impacts."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.10.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
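The individual-level statistics in the notebook are computed on unweighted sample records, while the aggregate totals use `person_weight`. A minimal sketch of a weighted Pearson correlation is shown here, assuming NumPy's `np.cov` with its `aweights` argument. `weighted_correlation` is a hypothetical helper, not part of `policyengine_uk`, and the arrays are made-up numbers rather than FRS data.

```python
import numpy as np


def weighted_correlation(x, y, w):
    """Pearson correlation of x and y using observation weights w."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    # With aweights, np.cov returns the weighted 2x2 covariance matrix.
    cov = np.cov(x, y, aweights=w)
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])


# Synthetic illustration only (hypothetical values, not survey data).
reported = np.array([500.0, 1200.0, 300.0, 900.0])
modelled = np.array([450.0, 1000.0, 600.0, 950.0])
weight = np.array([1000.0, 2500.0, 800.0, 1500.0])

print(f"Unweighted: {np.corrcoef(reported, modelled)[0, 1]:.3f}")
print(f"Weighted:   {weighted_correlation(reported, modelled, weight):.3f}")
```

With equal weights the helper reduces to `np.corrcoef`, which makes it easy to check against the notebook's existing figure.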
@@ -0,0 +1,12 @@
description: Student loan interest rates
label: Interest rates
documentation: |
Interest rates applied to student loan balances vary by plan type.
Plan 2 has income-contingent rates that range from RPI to RPI + 3%.
Plans 1, 4 and 5 each apply a single rate that does not vary with income.

Note: These parameters document the policy rules. Interest accrual
is not currently modelled in the microsimulation as it requires
tracking loan balances over time.

Reference: https://www.gov.uk/repaying-your-student-loan/interest
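To illustrate why interest accrual needs balance tracking, here is a rough sketch of rolling a balance forward year by year. It assumes a single annual interest step followed by the year's repayments, which is a simplification and not how balances are actually serviced. `project_balance` and all input values are hypothetical.

```python
def project_balance(balance, annual_rate, annual_repayment, years):
    """Return the balance path over `years` simplified annual steps."""
    path = [balance]
    for _ in range(years):
        # Assumption: add a full year's interest, then subtract repayments.
        balance = max(balance * (1 + annual_rate) - annual_repayment, 0.0)
        path.append(balance)
    return path


# Hypothetical borrower: £40,000 balance, 3.2% rate, £700 repaid per year.
path = project_balance(40_000.0, 0.032, 700.0, years=3)
for year, value in enumerate(path):
    print(f"Year {year}: £{value:,.2f}")
```

Each year's balance depends on the previous year's, so a single-period microsimulation cannot compute it without carrying state between periods.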
@@ -0,0 +1,22 @@
description: Plan 1 student loan interest rate
metadata:
label: Plan 1 interest rate
unit: /1
period: year
reference:
- title: Education (Student Loans) (Repayment) Regulations 2009, Regulation 21
href: https://www.legislation.gov.uk/uksi/2009/470/regulation/21
- title: GOV.UK - Student loan interest rates
href: https://www.gov.uk/repaying-your-student-loan/interest
documentation: |
Interest rate for Plan 1 student loans (pre-2012).
Per Regulation 21, set at the lower of RPI or Bank of England base rate + 1%.
In practice, the rate equals RPI whenever RPI is below the base rate + 1%.

values:
2020-09-01: 0.012 # 1.2%
2021-09-01: 0.013 # 1.3%
2022-09-01: 0.051 # 5.1%
2023-09-01: 0.071 # 7.1%
2024-09-01: 0.072 # 7.2%
2025-09-01: 0.032 # 3.2%
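The "lower of" rule described in the documentation can be sketched as a one-line function. `plan_1_interest_rate` is a hypothetical helper and the RPI and base-rate inputs are illustrative values, not figures taken from this parameter file.

```python
def plan_1_interest_rate(rpi, bank_base_rate, margin=0.01):
    """Lower of RPI and the Bank of England base rate plus a 1-point margin."""
    return min(rpi, bank_base_rate + margin)


# Illustrative inputs only.
print(f"{plan_1_interest_rate(0.032, 0.04):.2%}")    # RPI binds: 3.20%
print(f"{plan_1_interest_rate(0.09, 0.0175):.2%}")   # base rate + 1% binds: 2.75%
```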
@@ -0,0 +1,27 @@
description: Plan 2 maximum additional interest rate
metadata:
label: Plan 2 additional rate
unit: /1
period: year
reference:
- title: Education (Student Loans) (Repayment) (Amendment) (No. 2) Regulations 2012, Regulation 21A(10)
href: https://www.legislation.gov.uk/uksi/2012/1309/regulation/10/made
- title: GOV.UK - Student loan interest rates
href: https://www.gov.uk/repaying-your-student-loan/interest
documentation: |
Maximum additional interest rate for Plan 2 student loans.
Added to the base rate for higher earners.

Per Regulation 21A(10), for income between the thresholds the additional
interest rate, in percentage points, is calculated as:
3 × (I - L) / (H - L)
where I = borrower's income, L = lower threshold, H = higher threshold.

The actual additional rate is tapered between the lower and upper
income thresholds, from 0% at the lower threshold to this maximum
at the upper threshold.

For income at or above the upper threshold, the total rate is:
base_rate + additional_rate (i.e., RPI + 3%)

values:
2012-09-01: 0.03 # 3%
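The taper can be sketched as follows. `plan_2_interest_rate` is a hypothetical helper, not a `policyengine_uk` variable. The example inputs are the 2025-09-01 values from the Plan 2 base rate and threshold parameter files in this diff. Clipping the share to the range 0 to 1 is an assumption that matches the documented behaviour at and beyond the two thresholds.

```python
def plan_2_interest_rate(income, base_rate, lower, upper, max_additional=0.03):
    """Base rate plus an additional rate tapered between two thresholds."""
    # Share of the way from the lower (L) to the upper (H) threshold,
    # clipped so it is 0 at or below L and 1 at or above H.
    share = (income - lower) / (upper - lower)
    share = min(max(share, 0.0), 1.0)
    return base_rate + max_additional * share


BASE, LOWER, UPPER = 0.032, 28_470, 51_245  # 2025-09-01 parameter values

for income in (20_000, 39_857.5, 60_000):
    rate = plan_2_interest_rate(income, BASE, LOWER, UPPER)
    print(f"£{income:,.0f}: {rate:.1%}")
```

At the midpoint between the thresholds (£39,857.50) the additional rate is half the maximum, so the total is 3.2% + 1.5% = 4.7%.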
@@ -0,0 +1,25 @@
description: Plan 2 base interest rate (RPI)
metadata:
label: Plan 2 base rate
unit: /1
period: year
reference:
- title: Education (Student Loans) (Repayment) (Amendment) (No. 2) Regulations 2012, Regulation 21A
href: https://www.legislation.gov.uk/uksi/2012/1309/regulation/10/made
- title: GOV.UK - Student loan interest rates
href: https://www.gov.uk/repaying-your-student-loan/interest
documentation: |
Base interest rate for Plan 2 student loans (RPI).
This is the minimum rate applied to all Plan 2 borrowers.
Higher earners pay this plus an additional rate up to 3%.

Per Regulation 21A, the "standard interest rate" is the prevailing
market rate as determined by the Secretary of State (set to RPI).

values:
2020-09-01: 0.026 # 2.6%
2021-09-01: 0.012 # 1.2%
2022-09-01: 0.051 # 5.1%
2023-09-01: 0.071 # 7.1%
2024-09-01: 0.072 # 7.2%
2025-09-01: 0.032 # 3.2%
@@ -0,0 +1,16 @@
description: Plan 2 student loan interest rates
label: Plan 2 interest rates
documentation: |
Plan 2 student loan interest rates are income-contingent after graduation.

While studying: RPI + 3%
After graduating:
- Income at or below lower threshold: RPI only
- Income between lower and upper threshold: RPI + tapered rate (0% to 3%)
- Income at or above upper threshold: RPI + 3%

This progressive structure means higher earners pay more interest,
though the actual repayment amount still depends on income, not the
loan balance or interest rate.

Reference: https://www.gov.uk/repaying-your-student-loan/interest
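The point that repayments depend on income, not on the balance or the interest rate, can be made concrete with a short sketch. `annual_repayment` is a hypothetical helper, not the model's implementation. The 9% rate and the £27,295 Plan 2 threshold for 2024-25 are taken from the validation notebook's background section.

```python
def annual_repayment(income, threshold, rate=0.09):
    """Repayment due: a fixed share of income above the plan threshold."""
    return rate * max(income - threshold, 0.0)


# Neither the loan balance nor the interest rate appears in the calculation.
print(f"£{annual_repayment(35_000, 27_295):,.2f}")  # £693.45
print(f"£{annual_repayment(25_000, 27_295):,.2f}")  # £0.00
```

The function ignores the outstanding balance, so it would overstate the repayment in the year a loan is paid off.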
@@ -0,0 +1,25 @@
description: Plan 2 lower income threshold for interest rate tapering
metadata:
label: Plan 2 interest lower threshold
unit: currency-GBP
period: year
reference:
- title: Education (Student Loans) (Repayment) (Amendment) (No. 2) Regulations 2012, Regulation 21A(10)
href: https://www.legislation.gov.uk/uksi/2012/1309/regulation/10/made
- title: GOV.UK - Student loan interest rates
href: https://www.gov.uk/repaying-your-student-loan/interest
documentation: |
Lower income threshold for Plan 2 interest rate tapering (L in the formula).
At or below this income, borrowers pay the base rate (RPI) only.
Above this, the additional rate begins to taper in.

Per Regulation 21A(10), this is the "lower interest threshold".
Note: This is the same as the Plan 2 repayment threshold.

values:
2020-09-01: 26575
2021-09-01: 27295
2022-09-01: 27295
2023-09-01: 27295
2024-09-01: 27295
2025-09-01: 28470
@@ -0,0 +1,24 @@
description: Plan 2 upper income threshold for interest rate tapering
metadata:
label: Plan 2 interest upper threshold
unit: currency-GBP
period: year
reference:
- title: Education (Student Loans) (Repayment) (Amendment) (No. 2) Regulations 2012, Regulation 21A(10)
href: https://www.legislation.gov.uk/uksi/2012/1309/regulation/10/made
- title: GOV.UK - Student loan interest rates
href: https://www.gov.uk/repaying-your-student-loan/interest
documentation: |
Upper income threshold for Plan 2 interest rate tapering (H in the formula).
At or above this income, borrowers pay the maximum rate (RPI + 3%).
Between the lower and upper thresholds, the rate is tapered.

Per Regulation 21A(10), this is the "higher interest threshold".

values:
2020-09-01: 47835
2021-09-01: 49130
2022-09-01: 49130
2023-09-01: 49130
2024-09-01: 49130
2025-09-01: 51245
@@ -0,0 +1,22 @@
description: Plan 4 student loan interest rate
metadata:
label: Plan 4 interest rate
unit: /1
period: year
reference:
- title: Education (Student Loans) (Repayment) (Scotland) Regulations 2009, Regulation 20
href: https://www.legislation.gov.uk/ssi/2009/168/regulation/20
- title: GOV.UK - Student loan interest rates
href: https://www.gov.uk/repaying-your-student-loan/interest
documentation: |
Interest rate for Plan 4 student loans (Scotland).
Per Regulation 20, set at the lower of RPI or Bank of England base rate + 1%.
Same rate as Plan 1.

values:
2020-09-01: 0.012 # 1.2%
2021-09-01: 0.013 # 1.3%
2022-09-01: 0.051 # 5.1%
2023-09-01: 0.071 # 7.1%
2024-09-01: 0.072 # 7.2%
2025-09-01: 0.032 # 3.2%
@@ -0,0 +1,22 @@
description: Plan 5 student loan interest rate
metadata:
label: Plan 5 interest rate
unit: /1
period: year
reference:
- title: Education (Student Loans) (Repayment) (Amendment) Regulations 2023
href: https://www.legislation.gov.uk/uksi/2023/207/made
- title: GOV.UK - Student loan interest rates
href: https://www.gov.uk/repaying-your-student-loan/interest
documentation: |
Interest rate for Plan 5 student loans (from September 2023).
Set at RPI only - no additional percentage based on income.
This is simpler than Plan 2 which has income-contingent rates.

The removal of the real interest rate for Plan 5 was announced in
the February 2022 Higher Education Policy Statement.

values:
2023-09-01: 0.071 # 7.1%
2024-09-01: 0.072 # 7.2%
2025-09-01: 0.032 # 3.2%