Releases: VectorInstitute/eval-agents
eval-agents v0.3.0
What's Changed
- Report Generation: Adding notebooks to better explain the usage by @lotif in #61
- Add integration test before workspace start by @amrit110 in #62
- Report Generation: Adding unit tests by @lotif in #64
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #65
- Add notebooks for AML investigation use case by @fcogidi in #63
- Report Generation: Adding a system diagram by @lotif in #66
- Add knowledgeqa agent with refactored code by @amrit110 in #67
- Fix notebooks, agent code by @amrit110 in #68
- Update and fix config by @amrit110 in #69
- Bump up to 0.3.0 by @amrit110 in #70
Full Changelog: v0.2.1...v0.3.0
eval-agents v0.2.1
Full Changelog: v0.2.0...v0.2.1
aieng-eval-agents v0.2.0
What's Changed
- Bump astral-sh/setup-uv from 7.1.2 to 7.1.3 by @dependabot[bot] in #1
- Bump actions/checkout from 5.0.0 to 6.0.0 by @dependabot[bot] in #3
- Bump astral-sh/setup-uv from 7.1.3 to 7.1.4 by @dependabot[bot] in #2
- Bump astral-sh/setup-uv from 7.1.4 to 7.1.5 by @dependabot[bot] in #6
- Bump actions/setup-python from 6.0.0 to 6.1.0 by @dependabot[bot] in #5
- Bump actions/checkout from 6.0.0 to 6.0.1 by @dependabot[bot] in #4
- Bump astral-sh/setup-uv from 7.1.5 to 7.1.6 by @dependabot[bot] in #7
- Refactor package structure and update dependencies by @fcogidi in #9
- Bump astral-sh/setup-uv from 7.1.6 to 7.2.0 by @dependabot[bot] in #10
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #11
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #12
- Bump actions/setup-python from 6.1.0 to 6.2.0 by @dependabot[bot] in #14
- Bump actions/checkout from 6.0.1 to 6.0.2 by @dependabot[bot] in #15
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #16
- First version of the Report Generation Agent by @lotif in #13
- Add initial working implementation using search grounding by @amrit110 in #17
- Feature/knowledge agent by @amrit110 in #18
- Bump astral-sh/setup-uv from 7.2.0 to 7.2.1 by @dependabot[bot] in #25
- Remove unused weaviate and vertex config, add evaluator model env variable by @amrit110 in #23
- First Langfuse evaluations by @lotif in #19
- Move grounding tool to reusable parent tools dir by @amrit110 in #24
- Add code coverage reporting to unit test workflow by @amrit110 in #26
- Report Generation Agent: Adding Trajectory Evaluation by @lotif in #27
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #29
- Refactoring the Report Generation Agent demo UI and evaluation script by @lotif in #30
- Add AML investigation agent by @fcogidi in #21
- Extract langfuse tracing setup for google-adk to more general location by @fcogidi in #31
- Improve search tool to extract resolved urls by @amrit110 in #28
- Refactor Langfuse dataset upload by @fcogidi in #32
- Rename modules to knowledge_qa by @amrit110 in #33
- Add dataset upload script by @amrit110 in #34
- Add web fetch tool by @amrit110 in #36
- Add evaluation harness by @fcogidi in #37
- Bump astral-sh/setup-uv from 7.2.1 to 7.3.0 by @dependabot[bot] in #40
- Rename fields in `CaseRecord` for consistency with langfuse evaluators by @fcogidi in #38
- Report Generation: Switching from OpenAI SDK to Google ADK by @lotif in #35
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #41
- Upgrading cryptography and pillow to address security issues by @lotif in #45
- Report Generation Agent: switching to SQLAlchemy by @lotif in #42
- Add LLM judge evaluator factory by @fcogidi in #39
- Add separate DB manager by @amrit110 in #50
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #51
- Add trace groundedness evaluator by @fcogidi in #44
- Remove legacy evaluation module by @amrit110 in #53
- Add evaluation script for knowledgeqa by @amrit110 in #43
- Report Generation: Adding basic online evaluation scores by @lotif in #46
- Report generation: Adding user feedback support in the form of thumbs up and thumbs down button in the UI by @lotif in #54
- Refactor AML investigation agent by @fcogidi in #48
- Add deterministic graders for AML investigation agent by @fcogidi in #49
- Add tools for reading and grepping csv/xls files by @amrit110 in #52
- Improve search tool code structure by @amrit110 in #55
- Add agent planner and event extraction by @amrit110 in #56
- Remove gradio file for knowledge_qa implementation by @amrit110 in #57
- Add small fix to grader and evaluate scripts by @amrit110 in #58
- Add additional tests for processing fns by @amrit110 in #59
- Add AML agent evaluation script by @fcogidi in #60
New Contributors
- @dependabot[bot] made their first contribution in #1
- @fcogidi made their first contribution in #9
- @pre-commit-ci[bot] made their first contribution in #11
- @lotif made their first contribution in #13
- @amrit110 made their first contribution in #17
Full Changelog: https://github.com/VectorInstitute/eval-agents/commits/v0.2.0