-
Notifications
You must be signed in to change notification settings - Fork 0
feat: init v1 client format #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary of ChangesHello @abhishekg999, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces the initial version of the Judgeval SDK V1 API, which includes a new Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces the v1 API for the Judgeval Java SDK, centered around the new JudgmentClient. The changes include adding the new client, corresponding factories and builders, a quick start example for v1, and deprecating the old API classes with helpful migration guides. The new API design using a central client and factories is a good improvement.
My review focuses on improving correctness, robustness, and developer experience. I've identified a potential runtime error in CustomScorer, a memory leak risk in PromptScorerFactory's cache, and several areas for improvement in the new example code and API class implementations. I've also found that the migration examples in the Javadocs for deprecated classes are incorrect and will cause compilation errors, which I've suggested fixes for.
| @Override | ||
| public ScorerConfig getScorerConfig() { | ||
| throw new UnsupportedOperationException("CustomScorer does not use ScorerConfig"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overriding getScorerConfig() to throw an UnsupportedOperationException will cause a runtime crash when this scorer is used with asyncEvaluate or asyncTraceEvaluate, as the tracer mechanism relies on this method to build the evaluation payload. The default implementation from the parent APIScorer class should be sufficient. Please remove this override.
| private final String apiKey; | ||
| private final String organizationId; | ||
| private final boolean isTrace; | ||
| private static final Map<CacheKey, com.judgmentlabs.judgeval.internal.api.models.PromptScorer> cache = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a static ConcurrentHashMap for caching without any eviction policy can lead to a memory leak in long-running applications or in environments where many different scorers are fetched over time. Consider using a cache with an eviction policy (e.g., size-based or time-based eviction) like Guava's CacheBuilder or Caffeine. Alternatively, if the factory's lifecycle is managed, make the cache an instance field rather than static.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its expected that the total number of Scorers is finite and if loaded, will be used multiple times. I.E This is more of a utility to avoid the behavior of the USER loading pulling all scorers at once into a new class, then importing from there.
.../src/main/java/com/judgmentlabs/judgeval/scorers/api_scorers/custom_scorer/CustomScorer.java
Show resolved
Hide resolved
.../src/main/java/com/judgmentlabs/judgeval/scorers/api_scorers/prompt_scorer/PromptScorer.java
Show resolved
Hide resolved
...-java/src/main/java/com/judgmentlabs/judgeval/scorers/api_scorers/AnswerRelevancyScorer.java
Show resolved
Hide resolved
judgeval-java/src/main/java/com/judgmentlabs/judgeval/v1/evaluation/EvaluationFactory.java
Outdated
Show resolved
Hide resolved
| return new CustomScorerFactory(); | ||
| } | ||
|
|
||
| public BuiltInScorersFactory builtIn() { | ||
| return new BuiltInScorersFactory(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The customScorer() and builtIn() methods create new factory instances on every call. Since these factories (CustomScorerFactory, BuiltInScorersFactory) are stateless, they can be instantiated once and stored as final fields in ScorersFactory to avoid unnecessary object creation.
private final CustomScorerFactory customScorerFactory = new CustomScorerFactory();
private final BuiltInScorersFactory builtInScorersFactory = new BuiltInScorersFactory();
public CustomScorerFactory customScorer() {
return customScorerFactory;
}
public BuiltInScorersFactory builtIn() {
return builtInScorersFactory;
}| try { | ||
| Thread.sleep(10000); | ||
| } catch (InterruptedException e) { | ||
| e.printStackTrace(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Thread.sleep() to wait for asynchronous operations to complete is unreliable. The tracer might not have finished sending all data within the 10-second window, or the sleep might be unnecessarily long. Use the Tracer's built-in forceFlush() method to ensure all buffered spans are sent before the application exits.
// Wait for async traces to be exported before exiting.
tracer.forceFlush(10000);| private final String judgmentApiKey; | ||
| private final String organizationId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| try { | ||
| PromptScorer traceScorer = client.scorers() | ||
| .tracePromptScorer() | ||
| .create() | ||
| .name("trace-quality-scorer") | ||
| .prompt("Does the entire trace show high quality reasoning?") | ||
| .threshold(0.8) | ||
| .build(); | ||
| System.out.println(" Created TracePromptScorer: " + traceScorer.getName()); | ||
| } catch (Exception e) { | ||
| System.out.println(" TracePromptScorer creation demo"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This try-catch block is unnecessary. Creating a scorer using the builder pattern (create()...build()) is a local operation that does not perform I/O and is not expected to throw an exception. This block appears to be a copy-paste from the example for fetching an existing scorer, which does perform a network request. Removing the try-catch will make the code cleaner.
PromptScorer traceScorer = client.scorers()
.tracePromptScorer()
.create()
.name("trace-quality-scorer")
.prompt("Does the entire trace show high quality reasoning?")
.threshold(0.8)
.build();
System.out.println(" Created TracePromptScorer: " + traceScorer.getName());
adivate2021
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm just some minor comments
judgeval-java/src/main/java/com/judgmentlabs/judgeval/v1/data/APIScorerType.java
Outdated
Show resolved
Hide resolved
judgeval-java/src/main/java/com/judgmentlabs/judgeval/v1/scorers/built_in/DerailmentScorer.java
Outdated
Show resolved
Hide resolved
judgeval-java/src/main/java/com/judgmentlabs/judgeval/v1/evaluation/Evaluation.java
Show resolved
Hide resolved
judgeval-java/src/main/java/com/judgmentlabs/judgeval/v1/evaluation/Evaluation.java
Show resolved
Hide resolved
...va/src/main/java/com/judgmentlabs/judgeval/v1/scorers/prompt_scorer/PromptScorerFactory.java
Show resolved
Hide resolved
judgeval-java/src/main/java/com/judgmentlabs/judgeval/v1/tracer/BaseTracer.java
Outdated
Show resolved
Hide resolved
adivate2021
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
No description provided.