add support for llama2 and claude via bedrock #54
Haven't been able to get the Bedrock bits working locally, but I did my Mistral experiment based on this, and it seems to be working. So... LGTM!
Python usage note... I don't think it's for the same PR, and there are serious downsides too (like where do you set default values), but some use of `*args` and `**kwargs` could potentially DRY up `ai_label_student_work`, and make more visible the cases where args/kwargs are changed or overridden before being passed into the model, e.g.:
```python
def ai_label_student_work(self, *args, **kwargs):
    # assuming the configured model name is available as self.llm_model
    if self.llm_model.startswith("gpt"):
        return self.openai_label_student_work(*args, **kwargs)
```
Obviously this has some downsides too, since you suddenly can't see what `ai_label_student_work` takes params-wise, but I've seen this pattern help a lot in scientific and data-sci code where there are lots of layers and lots of params, and adding or removing a param means piping it around everywhere. May or may not be useful, YMMV.
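
Expanding on that, here is a hypothetical sketch of how the `*args`/`**kwargs` dispatch could still keep default values in one place. Every name in it (class, methods, `DEFAULTS`, parameters) is illustrative, not taken from this codebase:

```python
class AiRubricLabeler:
    """Illustrative sketch only: dispatch by model-name prefix."""

    # Shared defaults live in one spot instead of in each backend signature.
    DEFAULTS = {"temperature": 0.0, "num_responses": 1}

    def __init__(self, llm_model):
        self.llm_model = llm_model

    def ai_label_student_work(self, *args, **kwargs):
        # Merge defaults once, so overrides stay visible at this
        # single dispatch point before reaching any model backend.
        merged = {**self.DEFAULTS, **kwargs}
        if self.llm_model.startswith("gpt"):
            return self.openai_label_student_work(*args, **merged)
        return self.bedrock_label_student_work(*args, **merged)

    def openai_label_student_work(self, code, temperature, num_responses):
        return f"openai: {code!r} t={temperature} n={num_responses}"

    def bedrock_label_student_work(self, code, temperature, num_responses):
        return f"bedrock: {code!r} t={temperature} n={num_responses}"


labeler = AiRubricLabeler("anthropic.claude-v2")
# temperature is overridden here; num_responses falls back to the default.
print(labeler.ai_label_student_work("student code here", temperature=0.2))
```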
can you please try running
Exciting! How are the Mistral results looking so far?
I agree that this is a problem, and ideas for pythonic solutions are always welcome as I am still very new to Python. Thank you for flagging.
did an accuracy regression test run before merging to confirm no regression on
Follows #53
Adds support for the following models via Bedrock (a minimal invocation sketch follows the list):
- anthropic.claude-v2
- meta.llama2-13b-chat-v1
- meta.llama2-70b-chat-v1
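
For context, a minimal sketch of calling one of these models through the Bedrock runtime API with boto3. This is not code from this PR: the region, prompt, and parameter values are illustrative, and only the Claude v2 request shape is shown (the Llama 2 models take a different body, e.g. `max_gen_len` instead of `max_tokens_to_sample`):

```python
import json

import boto3

# Inference goes through the "bedrock-runtime" client, not "bedrock".
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Claude v2 expects the Human/Assistant prompt format.
body = json.dumps({
    "prompt": "\n\nHuman: Summarize this student program.\n\nAssistant:",
    "max_tokens_to_sample": 300,
    "temperature": 0.0,
})

response = client.invoke_model(
    modelId="anthropic.claude-v2",
    body=body,
    contentType="application/json",
    accept="application/json",
)

# The response body is a stream of JSON; Claude returns "completion".
result = json.loads(response["body"].read())
print(result["completion"])
```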
This enables rubric tester to evaluate the following experiments in `s3://cdo-ai/teaching_assistant/experiments/`:
- ai-rubrics-json-llama2
- ai-rubrics-json-reason-llama2
- ai-rubrics-json-reason-claude
Cost warning: we pay cash (not AWS credits) for the use of these models. A complete test run with Claude costs about $4. See the README updates in #53.