Production-grade adaptive testing engine powered by Item Response Theory (IRT). Dynamically selects optimal questions based on real-time student ability estimation, delivering precise assessments in fewer questions than traditional fixed-length tests.
Traditional tests give every student the same questions. Adaptive tests are smarter — they adjust in real time:
- Start with a medium-difficulty question
- Student answers correctly → next question is harder
- Student answers incorrectly → next question is easier
- Converge on the student's true ability in fewer questions
This is the same approach used by the GRE, GMAT, and many standardized assessments. The underlying math is Item Response Theory (IRT) — a psychometric framework that models the relationship between student ability and question difficulty.
```
Student answers → Update ability estimate → Select optimal next question → Repeat
       ↑                                                                      │
       └──────────────────────────────────────────────────────────────────────┘
```
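The cycle above can be sketched in a few lines of Python. This is an illustrative skeleton only, not the engine's actual code; `select_question`, `estimate_theta`, and `answer_fn` are hypothetical stand-ins for the Fisher-information selector, the MLE routine, and the student's response.

```python
def run_adaptive_test(pool, select_question, estimate_theta, answer_fn,
                      max_questions=20, stopping_se=0.3):
    """Skeleton of the adaptive cycle: select, ask, re-estimate, repeat."""
    responses = []
    remaining = list(pool)
    theta, se = 0.0, float("inf")  # start at average ability, unknown precision
    while remaining and len(responses) < max_questions and se > stopping_se:
        question = select_question(theta, remaining)  # most informative item
        remaining.remove(question)
        responses.append((question, answer_fn(question)))
        theta, se = estimate_theta(responses)  # updated ability + standard error
    return theta, se, responses
```

The loop terminates on any of the three stopping conditions: pool exhaustion, question cap, or sufficient precision.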
- 2-Parameter Logistic (2PL) IRT model — models both difficulty and discrimination
- Maximum Likelihood Estimation (MLE) — precise ability estimation with Bayesian regularization
- Fisher Information question selection — picks the most informative question at each step
- Adaptive stopping rules — ends when measurement precision is sufficient
- Real-time REST API — create sessions, submit answers, get next question
- Simulation endpoint — validate algorithm behavior with known true abilities
- Security hardened — input validation, session limits, TTL eviction, overflow protection
- Health monitoring — `/health` endpoint for orchestration and load balancers
```bash
# Clone and setup
git clone https://github.com/woodstocksoftware/adaptive-question-selector.git
cd adaptive-question-selector
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Start the server
python -m uvicorn src.server:app --reload --port 8002
```

The API is now running at http://localhost:8002. Interactive docs at http://localhost:8002/docs.
```bash
curl -X POST http://localhost:8002/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "question_pool": [
      {"id": "q1", "difficulty": -2.0, "discrimination": 1.2, "content": "What is 2+2?"},
      {"id": "q2", "difficulty": -1.0, "discrimination": 1.0, "content": "Solve: 3x = 12"},
      {"id": "q3", "difficulty": 0.0, "discrimination": 1.5, "content": "Factor: x² - 4"},
      {"id": "q4", "difficulty": 1.0, "discrimination": 0.8, "content": "Derivative of sin(x)"},
      {"id": "q5", "difficulty": 2.0, "discrimination": 1.3, "content": "Evaluate: ∫ e^x dx"}
    ],
    "selection_method": "max_info",
    "max_questions": 10,
    "stopping_se": 0.4
  }'
```

The response includes the first selected question and the initial ability estimate (θ = 0.0).
```bash
curl -X POST http://localhost:8002/sessions/{session_id}/answer \
  -H "Content-Type: application/json" \
  -d '{"question_id": "q3", "correct": true}'
```

Each answer returns:
- Updated ability estimate (θ) with standard error
- The next optimal question (or session completion)
- 95% confidence interval on ability
When the session completes (SE threshold met or max questions reached), the response includes a full summary:
```json
{
  "theta": 0.85,
  "standard_error": 0.38,
  "confidence_interval": [-0.105, 1.805],
  "percentile": 80.2,
  "performance_level": "Proficient",
  "questions_answered": 7,
  "correct": 5,
  "accuracy": 71.4
}
```

Test the algorithm against a known true ability:
```bash
curl "http://localhost:8002/simulate?true_theta=1.5&num_questions=20&pool_size=100"
```

Returns step-by-step convergence history showing how the estimate approaches the true value.
The probability of a correct response is:

```
P(θ) = 1 / (1 + e^(−a(θ − b)))
```
Where:
- θ (theta) — student ability, typically in [-3, +3]
- b — item difficulty, same scale as θ
- a — item discrimination, how well the item differentiates ability levels (0.1 to 3.0)
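As a minimal sketch of the formula above (illustrative, not the project's source code):

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL model: probability that a student of ability theta answers
    an item of difficulty b and discrimination a correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

When θ = b the probability is exactly 0.5; a larger discrimination a makes the curve steeper around b.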
The information a question provides about ability:

```
I(θ) = a² · P(θ) · Q(θ)
```
Where Q(θ) = 1 - P(θ). Information is maximized when P(θ) = 0.5 — when the question difficulty matches the student's ability. Higher discrimination (a) yields more information.
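A sketch of the information function under the 2PL model (illustrative, not the engine's code):

```python
import math

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability theta: a^2 * P * Q."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)
```

Information peaks at θ = b, where P(θ) = 0.5 and I(θ) = a²/4.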
Student ability is estimated by finding the θ that maximizes the log-likelihood:

```
ℓ(θ) = Σᵢ [ xᵢ ln Pᵢ(θ) + (1 − xᵢ) ln Qᵢ(θ) ] − θ² / (2σ²)
```

where xᵢ ∈ {0, 1} is the correctness of response i.
A weak N(0, σ²) prior provides regularization, preventing extreme estimates with sparse data. Standard error is derived from the inverse of total Fisher information.
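One way to implement this step is Newton-Raphson on the penalized log-likelihood. The sketch below is an assumption-laden illustration, not the project's actual routine: the prior variance `sigma**2`, iteration cap, and tolerance are invented values, and the prior's curvature is folded into the information used for the standard error.

```python
import math

def estimate_theta(responses, sigma=2.0, max_iter=50, tol=1e-8):
    """MAP estimate of ability under the 2PL model with a weak N(0, sigma^2)
    prior. responses: list of (a, b, correct) tuples.
    Returns (theta, standard_error)."""
    theta = 0.0
    for _ in range(max_iter):
        grad = -theta / sigma**2      # gradient of the log-prior
        hess = -1.0 / sigma**2        # curvature of the log-prior
        for a, b, correct in responses:
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            grad += a * ((1.0 if correct else 0.0) - p)
            hess -= a * a * p * (1.0 - p)   # minus the item's Fisher information
        step = grad / hess
        theta -= step
        if abs(step) < tol:
            break
    # SE from the inverse of total information (likelihood + prior terms)
    total_info = 1.0 / sigma**2
    for a, b, _ in responses:
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        total_info += a * a * p * (1.0 - p)
    return theta, 1.0 / math.sqrt(total_info)
```

The prior keeps the estimate finite even for all-correct or all-incorrect response patterns, where unregularized MLE diverges to ±∞.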
| Method | Path | Status | Description |
|---|---|---|---|
| GET | `/health` | 200 | Health check and session count |
| POST | `/sessions` | 201 | Create adaptive test session |
| GET | `/sessions/{id}` | 200 | Get session status and ability estimate |
| POST | `/sessions/{id}/answer` | 200 | Submit answer, receive next question |
| DELETE | `/sessions/{id}` | 200 | Delete session |
| POST | `/estimate` | 200 | Standalone ability estimation |
| GET | `/simulate` | 200 | Simulate adaptive test |
Returns server status, version, and active session count. Use for health checks and monitoring.
Create a new adaptive testing session. Question pool is capped at 1,000 items; duplicate IDs are rejected.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| `question_pool` | `QuestionCreate[]` | required | Questions with IRT parameters (max 1,000) |
| `selection_method` | `"max_info"` \| `"target_50"` | `"max_info"` | Selection strategy |
| `max_questions` | `int` | `20` | Maximum questions to administer (1-100) |
| `stopping_se` | `float` | `0.3` | Stop when SE falls below this (0.1-1.0) |
Question parameters:
| Field | Type | Range | Description |
|---|---|---|---|
| `id` | `string` | — | Unique identifier (auto-generated if omitted) |
| `difficulty` | `float` | [-3, 3] | Item difficulty (b parameter) |
| `discrimination` | `float` | [0.1, 3] | Item discrimination (a parameter) |
| `content` | `string` | — | Question text |
| `topic_id` | `string` | — | Optional topic grouping |
Submit an answer and receive the next question.
Request body:
| Field | Type | Description |
|---|---|---|
| `question_id` | `string` | ID of the question being answered |
| `correct` | `bool` | Whether the answer was correct |
Standalone ability estimation from a batch of responses. Each response is validated via Pydantic.
Request body: Array of ResponseInput objects:
```json
[
  {"difficulty": -1.0, "discrimination": 1.0, "correct": true},
  {"difficulty": 0.5, "discrimination": 1.2, "correct": false},
  {"difficulty": -0.5, "discrimination": 0.8, "correct": true}
]
```

| Field | Type | Range | Default | Description |
|---|---|---|---|---|
| `difficulty` | `float` | [-3, 3] | required | Item difficulty (b) |
| `discrimination` | `float` | [0.1, 3] | `1.0` | Item discrimination (a) |
| `correct` | `bool` | — | required | Whether the response was correct |
| Parameter | Type | Range | Default | Description |
|---|---|---|---|---|
| `true_theta` | `float` | [-4, 4] | `0.0` | Simulated student ability |
| `num_questions` | `int` | [1, 200] | `20` | Questions to administer |
| `pool_size` | `int` | [1, 1000] | `100` | Random question pool size |
| Method | Strategy | Best For |
|---|---|---|
| `max_info` | Maximize Fisher information at current θ | Fastest convergence, most precise |
| `target_50` | Select the question closest to 50% success probability | Balanced student experience |
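The two strategies can be sketched as follows. This is an illustration, not the engine's implementation; the dict field names mirror the question schema used in the API examples, and `p_correct` is a plain 2PL probability.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def select_max_info(theta, pool):
    """max_info: item with the largest Fisher information a^2 * P * Q."""
    def info(q):
        p = p_correct(theta, q["discrimination"], q["difficulty"])
        return q["discrimination"] ** 2 * p * (1.0 - p)
    return max(pool, key=info)

def select_target_50(theta, pool):
    """target_50: item whose success probability is closest to 50%."""
    return min(pool, key=lambda q: abs(
        p_correct(theta, q["discrimination"], q["difficulty"]) - 0.5))
```

When an item sits exactly at the current θ, the two strategies usually agree; they diverge when a distant item has much higher discrimination.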
| θ Range | Level | Percentile |
|---|---|---|
| ≥ 1.5 | Advanced | ~93rd+ |
| 0.5 to 1.5 | Proficient | ~69th - 93rd |
| -0.5 to 0.5 | Basic | ~31st - 69th |
| -1.5 to -0.5 | Below Basic | ~7th - 31st |
| < -1.5 | Needs Support | Below ~7th |
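Assuming abilities are standard-normally distributed in the reference population (the convention behind the percentiles in the table above), θ converts to a percentile via the normal CDF. A sketch:

```python
import math

def theta_to_percentile(theta: float) -> float:
    """Percentile rank of theta under a standard normal ability
    distribution: Phi(theta) * 100, computed via the error function."""
    return 50.0 * (1.0 + math.erf(theta / math.sqrt(2.0)))
```

For example, θ = 0.85 maps to roughly the 80th percentile, consistent with the session-summary example earlier.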
The API is hardened for production use:
- Input validation — all endpoints use Pydantic models with enforced parameter ranges
- Session limits — max 10,000 concurrent sessions with 1-hour TTL eviction
- Session IDs — 128-bit entropy via `secrets.token_hex(16)`
- DoS protection — question pool capped at 1,000; simulation params bounded
- Overflow guards — exponent clamping in probability calculation; `math.isfinite()` on MLE output
- Error handling — global exception handler prevents stack trace leakage
- CORS — wildcard origins without credentials (safe default)
- Duplicate rejection — duplicate question IDs in a pool return 400
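The exponent-clamping guard mentioned above can be sketched like this (the clamp value of 30 is illustrative; the project's actual bound may differ):

```python
import math

def safe_p_correct(theta: float, a: float, b: float, clamp: float = 30.0) -> float:
    """2PL probability with the exponent clamped so math.exp can never
    overflow, even for extreme theta or discrimination values."""
    z = max(-clamp, min(clamp, a * (theta - b)))
    return 1.0 / (1.0 + math.exp(-z))
```

At |z| = 30 the probability already differs from 0 or 1 by less than 1e-13, so clamping does not measurably change the estimates.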
```bash
# Run all tests
python -m pytest tests/ -v

# With coverage
python -m pytest tests/ -v --cov=src --cov-report=term-missing
```

72 tests with 99% coverage:
- IRT probability calculations and overflow protection
- MLE ability estimation (normal, all-correct, all-incorrect, empty)
- Fisher information computation
- Question selection (max_info and target_50 methods)
- API session lifecycle (create → answer → complete → delete)
- Stopping rules (SE threshold, max questions, pool exhaustion)
- Input validation (parameter ranges, duplicate IDs, invalid methods)
- Session security (capacity limits, TTL eviction)
- Standalone endpoints (/estimate validation, /simulate bounds)
- Baker, F. B., & Kim, S.-H. (2004). Item Response Theory: Parameter Estimation Techniques. Marcel Dekker.
- Lord, F. M. (1980). Applications of Item Response Theory to Practical Testing Problems. Lawrence Erlbaum.
- van der Linden, W. J., & Glas, C. A. W. (2010). Elements of Adaptive Testing. Springer.
- Weiss, D. J. (1982). Improving measurement quality and efficiency with adaptive testing. Applied Psychological Measurement, 6(4), 473-492.
- Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of Item Response Theory. Sage.
| Component | Description |
|---|---|
| Adaptive Question Selector | IRT-based adaptive testing (this repo) |
| Question Bank MCP | Question management |
| Student Progress Tracker | Performance analytics |
| Simple Quiz Engine | Real-time quizzes |
| Learning Curriculum Builder | Curriculum design |
| Real-Time Event Pipeline | Event routing |
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/improvement`)
- Ensure tests pass (`python -m pytest tests/ -v`)
- Ensure linting passes (`ruff check src/ tests/`)
- Submit a pull request
MIT
Built by Jim Williams | GitHub