lkilefner/llm-quality-evaluation-examples

LLM Quality Evaluation Examples for K–12

Created by Lauren Kilefner (lkilefner)

Hi there! I’m Lauren! I'm a former K–12 math and robotics teacher who believes that educators are the magic, and that AI should support great teaching, not replace it. This repo shows how I evaluate AI-generated responses for clarity, accuracy, tone, developmental appropriateness, and usefulness for real students and teachers.

This is the same lens I use in the classroom:

  • Does the explanation make sense to the learner?
  • Is the tone supportive and confidence-building?
  • Are the steps clear and free of unnecessary complexity?
  • Will this help a student move forward?

What this repo includes

| Folder | What’s Inside | Why It Matters |
| --- | --- | --- |
| /prompts | Sample K–12 academic + feedback prompts with ground truth exemplar answers | Demonstrates how I define correctness and clarity |
| /rubrics | A simple scoring rubric for evaluating AI responses | Shows I can evaluate consistently and transparently |
| /datasets | Small JSON + CSV datasets used for scoring practice | Reflects readiness for LLM testing workflows |
| /dashboards | Notes on how I would visualize and track evaluation results | Shows how I think about patterns & quality over time |

Example Prompts (Quick Links)

My Evaluation Priorities

When reviewing LLM outputs, I look for:

  1. Accuracy — Is the content correct?
  2. Alignment — Does the response follow the prompt and grade level?
  3. Clarity & Cognitive Load — Is the language simple and structured?
  4. Tone — Does the response encourage and support the learner?
  5. Actionability — Can a student or teacher use this to learn or improve?

In other words:
The explanation should feel like something I’d proudly hand to my own students.
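
As a rough illustration, the five priorities above could be recorded as a small scoring structure. This is only a sketch, not the rubric stored in /rubrics: the `RubricScore` class, the 1 to 4 scale, and the equal weighting of criteria are all assumptions made for the example.

```python
from dataclasses import dataclass

# The five criteria mirror the priorities listed above. The 1-4 scale and
# equal weighting are assumptions for this sketch, not the repo's rubric.
CRITERIA = ("accuracy", "alignment", "clarity", "tone", "actionability")
MIN_SCORE, MAX_SCORE = 1, 4


@dataclass
class RubricScore:
    """Hypothetical record of one reviewer's scores for one AI response."""

    accuracy: int
    alignment: int
    clarity: int
    tone: int
    actionability: int

    def __post_init__(self):
        # Reject scores outside the assumed scale so bad entries surface early.
        for name in CRITERIA:
            value = getattr(self, name)
            if not MIN_SCORE <= value <= MAX_SCORE:
                raise ValueError(
                    f"{name} must be between {MIN_SCORE} and {MAX_SCORE}, got {value}"
                )

    def average(self) -> float:
        # Unweighted mean across the five criteria.
        return sum(getattr(self, name) for name in CRITERIA) / len(CRITERIA)

    def weakest(self) -> list[str]:
        # Criteria tied for the lowest score: where feedback should focus first.
        lowest = min(getattr(self, name) for name in CRITERIA)
        return [name for name in CRITERIA if getattr(self, name) == lowest]


score = RubricScore(accuracy=4, alignment=3, clarity=2, tone=4, actionability=3)
print(score.average())
print(score.weakest())
```

Reporting the weakest criteria alongside the average keeps a response that is accurate but hard to follow from hiding behind a decent overall number.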

Why I Created This

LLMs are incredibly powerful, but great teaching is personal, kind, and thoughtful. The work here reflects how I help ensure that AI explanations are not just correct, but also human, supportive, and age-appropriate.

About Me

  • Former accelerated math teacher & robotics instructor
  • Curriculum designer & instructional leader
  • Passionate about making learning feel possible for every student
  • Excited about how AI can reduce teacher overload and expand student support

If you'd like to connect:
LinkedIn: www.linkedin.com/in/laurenkilefner

Thanks for being here and for caring about thoughtful, responsible AI use in education!

About

K–12 LLM evaluation examples using teacher-centered ground truths, rubrics, and tone-aware response review.
