Releases · VectorInstitute/eval-agents

What's Changed

Bump astral-sh/setup-uv from 7.1.2 to 7.1.3 by @dependabot[bot] in #1
Bump actions/checkout from 5.0.0 to 6.0.0 by @dependabot[bot] in #3
Bump astral-sh/setup-uv from 7.1.3 to 7.1.4 by @dependabot[bot] in #2
Bump astral-sh/setup-uv from 7.1.4 to 7.1.5 by @dependabot[bot] in #6
Bump actions/setup-python from 6.0.0 to 6.1.0 by @dependabot[bot] in #5
Bump actions/checkout from 6.0.0 to 6.0.1 by @dependabot[bot] in #4
Bump astral-sh/setup-uv from 7.1.5 to 7.1.6 by @dependabot[bot] in #7
Refactor package structure and update dependencies by @fcogidi in #9
Bump astral-sh/setup-uv from 7.1.6 to 7.2.0 by @dependabot[bot] in #10
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #11
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #12
Bump actions/setup-python from 6.1.0 to 6.2.0 by @dependabot[bot] in #14
Bump actions/checkout from 6.0.1 to 6.0.2 by @dependabot[bot] in #15
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #16
First version of the Report Generation Agent by @lotif in #13
Add initial working implementation using search grounding by @amrit110 in #17
Feature/knowledge agent by @amrit110 in #18
Bump astral-sh/setup-uv from 7.2.0 to 7.2.1 by @dependabot[bot] in #25
Remove unused weaviate and vertex config, add evaluator model env variable by @amrit110 in #23
First Langfuse evaluations by @lotif in #19
Move grounding tool to reusable parent tools dir by @amrit110 in #24
Add code coverage reporting to unit test workflow by @amrit110 in #26
Report Generation Agent: Adding Trajectory Evaluation by @lotif in #27
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #29
Refactoring the Report Generation Agent demo UI and evaluation script by @lotif in #30
Add AML investigation agent by @fcogidi in #21
Extract langfuse tracing setup for google-adk to more general location by @fcogidi in #31
Improve search tool to extract resolved urls by @amrit110 in #28
Refactor Langfuse dataset upload by @fcogidi in #32
Rename modules to knowledge_qa by @amrit110 in #33
Add dataset upload script by @amrit110 in #34
Add web fetch tool by @amrit110 in #36
Add evaluation harness by @fcogidi in #37
Bump astral-sh/setup-uv from 7.2.1 to 7.3.0 by @dependabot[bot] in #40
Rename fields in CaseRecord for consistency with langfuse evaluators by @fcogidi in #38
Report Generation: Switching from OpenAI SDK to Google ADK by @lotif in #35
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #41
Upgrading cryptography and pillow to address security issues by @lotif in #45
Report Generation Agent: switching to SQLAlchemy by @lotif in #42
Add LLM judge evaluator factory by @fcogidi in #39
Add separate DB manager by @amrit110 in #50
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in #51
Add trace groundedness evaluator by @fcogidi in #44
Remove legacy evaluation module by @amrit110 in #53
Add evaluation script for knowledgeqa by @amrit110 in #43
Report Generation: Adding basic online evaluation scores by @lotif in #46
Report generation: Adding user feedback support in the form of thumbs up and thumbs down button in the UI by @lotif in #54
Refactor AML investigation agent by @fcogidi in #48
Add deterministic graders for AML investigation agent by @fcogidi in #49
Add tools for reading and grepping csv/xls files by @amrit110 in #52
Improve search tool code structure by @amrit110 in #55
Add agent planner and event extraction by @amrit110 in #56
Remove gradio file for knowledge_qa implementation by @amrit110 in #57
Add small fix to grader and evaluate scripts by @amrit110 in #58
Add additional tests for processing fns by @amrit110 in #59
Add AML agent evaluation script by @fcogidi in #60

New Contributors

@dependabot[bot] made their first contribution in #1
@fcogidi made their first contribution in #9
@pre-commit-ci[bot] made their first contribution in #11
@lotif made their first contribution in #13
@amrit110 made their first contribution in #17

Full Changelog: https://github.com/VectorInstitute/eval-agents/commits/v0.2.0

Releases: VectorInstitute/eval-agents

eval-agents v0.3.0
eval-agents v0.2.1
aieng-eval-agents v0.2.0