Welcome to the home for everything you need to design, build, and sustain high-quality evaluations for AI systems. This repo powers a Mintlify site focused on practical guidance for teams who want to measure model quality with clarity and rigor. Inside, you'll find:
- Engaging guides that walk through the mindset, principles, and workflows behind reliable evaluations.
- Hands-on playbooks packed with facilitation tips, stakeholder prompts, and automation recipes.
- Templates and rubrics you can copy-paste into your own tooling to jumpstart experimentation (a small illustrative sketch follows this list).
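To give a flavor of what those templates look like, here is a minimal, hypothetical grading rubric; the criteria names and the 1-5 scale are placeholders for illustration, not an excerpt from the actual templates:

```markdown
| Criterion    | 1 (poor)                | 3 (acceptable)              | 5 (excellent)                   |
|--------------|-------------------------|-----------------------------|---------------------------------|
| Accuracy     | Contains factual errors | Mostly correct, minor slips | Fully correct and verifiable    |
| Completeness | Misses key requirements | Covers most requirements    | Addresses every requirement     |
| Clarity      | Hard to follow          | Understandable with effort  | Clear, well-structured response |
```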
The docs run on Mintlify. Install dependencies and start the local dev server, then visit http://localhost:3000 to preview changes:

```bash
npm install
npm run dev
```
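Mintlify sites are configured through a JSON file at the repo root that defines the site name and navigation, and new pages generally need to be registered there before they show up in the sidebar. The snippet below is only an illustrative sketch: the exact filename (for example mint.json or docs.json) and schema depend on the Mintlify version this repo uses, and the group and page names are hypothetical.

```json
{
  "name": "Evaluation Docs",
  "navigation": [
    {
      "group": "Guides",
      "pages": ["guides/getting-started", "guides/designing-rubrics"]
    }
  ]
}
```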
To contribute:
- Create a new branch for your update.
- Make your edits using MDX components when they add clarity or interactivity (see the example after this list).
- Run the local server to verify the formatting and interactive elements.
- Open a pull request with a summary of the changes and any new assets you've added.
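For example, Mintlify provides callout components such as `<Note>` and `<Tip>` that can be dropped straight into an MDX page. The snippet below is a minimal sketch assuming those standard components are available in this repo; the wording is placeholder text:

```mdx
<Note>
  Evaluation rubrics work best when every score level is paired with a concrete example.
</Note>

<Tip>
  Run a small pilot batch before scaling a new rubric to the full dataset.
</Tip>
```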
Whether you're refining prompts, collecting human preference data, or monitoring regressions, these docs should equip you with the patterns to build trustworthy evaluation loops. If you spot a gap or have a new tactic to share, contributions are encouraged!