Domain
Education / EdTech. An AI tutor for students that needs to remember a learner's profile across a long tutoring session:
- Subject performance (struggling with calculus derivatives, strong at statistics)
- Learning style (prefers visual examples, needs step-by-step breakdowns)
- Session goals (preparing for exam on March 15)
- Past mistakes (confused about chain rule at T=20, corrected at T=35)
Why This Matters for LLM Memory Evaluation
BENCHMARK_FACTS in simulator/facts.py covers only a personal assistant scenario (name, city, occupation). An EdTech scenario also tests:
- Hierarchical facts: subject → topic → subtopic
- Evolving understanding: student's mastery level changes over the session
- Multi-update facts: the same concept can be "not understood" → "partial" → "mastered"
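To make the hierarchical and multi-update cases concrete, here is a minimal sketch. `MasteryTimeline` and its fields are hypothetical names, not part of the existing simulator, and the turn numbers are illustrative (only T=20 and T=35 come from the example above).

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class MasteryTimeline:
    """Hypothetical multi-update fact keyed by a subject/topic/subtopic path."""

    path: tuple[str, ...]  # e.g. ("calculus", "derivatives", "chain rule")
    # (turn, value) pairs, kept sorted by turn
    updates: list[tuple[int, str]] = field(default_factory=list)

    def value_at(self, turn: int) -> str | None:
        """Return the latest value set at or before `turn`, or None if unset."""
        turns = [t for t, _ in self.updates]
        idx = bisect_right(turns, turn)
        return self.updates[idx - 1][1] if idx else None


chain_rule = MasteryTimeline(
    path=("calculus", "derivatives", "chain rule"),
    updates=[(20, "not understood"), (28, "partial"), (35, "mastered")],
)
```

A probe at any turn can then be scored against `chain_rule.value_at(turn)`, so a model that still reports "not understood" after T=35 is counted as stale.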
Implementation
# simulator/edtech_facts.py
# Assumption: Fact is defined in simulator/facts.py next to BENCHMARK_FACTS.
from simulator.facts import Fact

EDTECH_FACTS = [
    Fact("student_name", "Priya Nair", injected_at=0),
    Fact("subject", "calculus", injected_at=1),
    # The weak topic changes once the chain rule confusion is corrected at T=35.
    Fact("weak_topic", "chain rule", injected_at=2,
         updated_at=35, updated_value="integration by parts"),
    Fact("exam_date", "March 15", injected_at=3),
    Fact("learning_style", "visual learner", injected_at=4),
    Fact("grade_target", "A", injected_at=5),
    Fact("last_score", "72%", injected_at=7,
         updated_at=60, updated_value="84%"),
    Fact("preferred_pace", "slow with examples", injected_at=9),
]
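For reference, this is one way the expected answer at a given turn could be derived from `injected_at`, `updated_at`, and `updated_value`. The `Fact` dataclass below is a stand-in written for this sketch; the real class in the simulator may have a different signature, and `expected_value` is a hypothetical helper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Stand-in for the simulator's Fact type (assumed field names)."""

    key: str
    value: str
    injected_at: int
    updated_at: int | None = None
    updated_value: str | None = None


def expected_value(fact: Fact, turn: int) -> str | None:
    """Return what the tutor should recall for `fact` at `turn`."""
    if turn < fact.injected_at:
        return None  # fact has not been mentioned yet
    if fact.updated_at is not None and turn >= fact.updated_at:
        return fact.updated_value  # the update supersedes the original value
    return fact.value


last_score = Fact("last_score", "72%", injected_at=7,
                  updated_at=60, updated_value="84%")
```

Under these assumptions, a recall probe at turn 59 should expect "72%" and a probe at turn 60 or later should expect "84%".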
Acceptance Criteria
- simulator/edtech_facts.py with EDTECH_FACTS list
- python main.py --scenario edtech flag (or just use --facts edtech)
- README.md results table with EdTech scenario numbers
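A rough sketch of how the `--scenario` flag might be wired up with `argparse`. The existing structure of main.py is not shown here, so `build_parser`, the `"personal"` choice name, and the `SCENARIO_FACT_LISTS` mapping are all assumptions made for illustration.

```python
import argparse

# Hypothetical mapping from scenario name to the fact list it would load.
SCENARIO_FACT_LISTS = {
    "personal": "BENCHMARK_FACTS",  # simulator/facts.py
    "edtech": "EDTECH_FACTS",       # simulator/edtech_facts.py
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing only the scenario selector."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument(
        "--scenario",
        choices=sorted(SCENARIO_FACT_LISTS),
        default="personal",
        help="Which fact set to run the benchmark against.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Using {SCENARIO_FACT_LISTS[args.scenario]}")
```

Restricting `choices` to the known scenarios means a typo such as `--scenario edtch` fails at parse time instead of silently falling back to the default fact set.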