Skip to content

Remove extra validation checks and add some tests for detect and evaluate #59

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 25, 2025

Conversation

alexlyzhov
Copy link
Contributor

@alexlyzhov alexlyzhov commented Apr 10, 2025

  • Remove field validation (let's do it on SDK backend instead of here?)
  • Remove the requirement for the "context" field to be present (need for IA2 to work w/o placeholder context)
  • Add tests for detect and evaluate

@alexlyzhov alexlyzhov requested a review from pjoshi30 April 10, 2025 01:48
assert hasattr(response, 'status') or hasattr(response, 'hallucination')
assert any("deprecated" in str(warn.message).lower() for warn in w)

def test_analyze_eval_missing_header_error(self):
Copy link
Contributor Author

@alexlyzhov alexlyzhov Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is the only one failing in the test suite. It's testing AnalyzeEval (the obsolete interface we still support). The SDK backend returns response 'task_definition must be a string' https://github.com/aimonlabs/aimon-sdk-backend/blob/d5aa4c8cb31b7ff3a79676dad74547c92195bc03/aimon-sdk-backend/service.py#L175. Not sure why this happens, task_definition is absent from query altogether and not needed, so the backend must not have checked if it's a string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes lets remove AnalyzeEval since its been deprecated for a while

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

"evaluation_id": am_eval.id,
"evaluation_run_id": eval_run.id,
}
if "prompt" in record and record["prompt"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when async is True, we still need to handle the validation but we can do that in the backend service

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like we have this handled in the backend 👍

@pjoshi30
Copy link
Contributor

@alexlyzhov lets bump the version in the setup.py

Copy link
Contributor

@pjoshi30 pjoshi30 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -1,4 +1,4 @@
# File generated from our OpenAPI spec by Stainless. See CONTRIBUTING.md for details.

__title__ = "aimon"
__version__ = "0.9.2"
__version__ = "0.9.3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets do 0.10.0 since we are including the new IA detector

setup.py Outdated
@@ -8,7 +8,7 @@
name='aimon',
python_requires='>3.8.0',
packages=find_packages(),
version="0.9.2",
version="0.9.3",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.10.0

@pjoshi30 pjoshi30 merged commit 2b8f7a3 into main Apr 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants