@abhishekg999
Collaborator

No description provided.

@gemini-code-assist

Summary of Changes

Hello @abhishekg999, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the initial version of the Judgeval SDK V1 API, which includes a new JudgmentClient for interacting with the Judgeval service. It also deprecates the existing scorer classes and Tracer, guiding users to migrate to the new V1 API. The changes include version updates, new class implementations, and an example demonstrating the new API's usage.

Highlights

  • Version Update: The pull request updates the judgeval-java library to version 0.3.0 in both the core library and the example project.
  • V1 Implementation: Introduces the initial implementation of the V1 API for the Judgeval SDK, providing a new way to interact with the service.
  • Deprecation of Existing Scorers and Tracer: Marks existing scorer classes (AnswerCorrectnessScorer, AnswerRelevancyScorer, DerailmentScorer, FaithfulnessScorer, InstructionAdherenceScorer, CustomScorer, PromptScorer, TracePromptScorer) and the Tracer class as deprecated, recommending the use of the new V1 JudgmentClient instead.
  • Example Usage: Adds a new example, V1QuickStart.java, demonstrating the usage of the V1 API, including initializing the client, creating tracers, and using different types of scorers.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the v1 API for the Judgeval Java SDK, centered around the new JudgmentClient. The changes include adding the new client, corresponding factories and builders, a quick start example for v1, and deprecating the old API classes with helpful migration guides. The new API design using a central client and factories is a good improvement.

My review focuses on improving correctness, robustness, and developer experience. I've identified a potential runtime error in CustomScorer, a memory leak risk in PromptScorerFactory's cache, and several areas for improvement in the new example code and API class implementations. I've also found that the migration examples in the Javadocs for deprecated classes are incorrect and will cause compilation errors, which I've suggested fixes for.
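
As a rough illustration of the deprecation pattern this pull request relies on, a deprecated class can carry both the `@Deprecated` annotation and a Javadoc `@deprecated` tag that points at its replacement. The names `LegacyScorer` and `NewClient` below are hypothetical stand-ins and not the SDK's actual types; this is a sketch of the convention, not the code under review.

```java
/** Hypothetical replacement entry point; stands in for a newer client API. */
class NewClient {
    /** Returns a scorer name; a placeholder for whatever a real client would build. */
    String scorerName() {
        return "faithfulness";
    }
}

/**
 * Hypothetical legacy class kept for backward compatibility.
 *
 * @deprecated Use {@link NewClient#scorerName()} instead.
 */
@Deprecated
class LegacyScorer {
    // Delegate to the replacement so old and new callers see the same behavior.
    private final NewClient delegate = new NewClient();

    String name() {
        return delegate.scorerName();
    }
}
```

Keeping the Javadoc migration snippet in sync with the replacement's real signature matters, because readers tend to copy it verbatim.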

Comment on lines +15 to +18
    @Override
    public ScorerConfig getScorerConfig() {
        throw new UnsupportedOperationException("CustomScorer does not use ScorerConfig");
    }

critical

Overriding getScorerConfig() to throw an UnsupportedOperationException will cause a runtime crash when this scorer is used with asyncEvaluate or asyncTraceEvaluate, as the tracer mechanism relies on this method to build the evaluation payload. The default implementation from the parent APIScorer class should be sufficient. Please remove this override.
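
To illustrate the failure mode, here is a minimal sketch using hypothetical stand-in types (`BaseScorer`, `ThrowingScorer`, and `PayloadBuilder` are assumptions, not the SDK's classes). Any code path that reads the config to build a payload fails at runtime once a subclass overrides the getter to throw.

```java
import java.util.Map;

/** Hypothetical base class; the default getter returns a usable config. */
class BaseScorer {
    Map<String, Object> getScorerConfig() {
        return Map.of("name", "base", "threshold", 0.5);
    }
}

/** Subclass that overrides the getter to throw, mirroring the pattern flagged above. */
class ThrowingScorer extends BaseScorer {
    @Override
    Map<String, Object> getScorerConfig() {
        throw new UnsupportedOperationException("no config");
    }
}

/** Hypothetical caller that needs the config to assemble an evaluation payload. */
class PayloadBuilder {
    static String buildPayload(BaseScorer scorer) {
        // Propagates the exception if the scorer's getter throws.
        return "payload:" + scorer.getScorerConfig().get("name");
    }
}
```

Calling `PayloadBuilder.buildPayload(new ThrowingScorer())` throws `UnsupportedOperationException`, while the same call with a `BaseScorer` succeeds.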

private final String apiKey;
private final String organizationId;
private final boolean isTrace;
private static final Map<CacheKey, com.judgmentlabs.judgeval.internal.api.models.PromptScorer> cache = new ConcurrentHashMap<>();

high

Using a static ConcurrentHashMap for caching without any eviction policy can lead to a memory leak in long-running applications or in environments where many different scorers are fetched over time. Consider using a cache with an eviction policy (e.g., size-based or time-based eviction) like Guava's CacheBuilder or Caffeine. Alternatively, if the factory's lifecycle is managed, make the cache an instance field rather than static.
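
One dependency-free way to bound such a cache is an access-ordered `LinkedHashMap` that evicts its eldest entry. This is only a sketch: `BoundedCache` and its capacity parameter are assumptions, not part of the factory under review, and Guava or Caffeine would be the more complete options.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache sketch: evicts the least recently accessed entry past maxEntries. */
final class BoundedCache<K, V> {
    private final Map<K, V> map;

    BoundedCache(int maxEntries) {
        // accessOrder=true keeps iteration order least-recently-accessed first.
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
        // LinkedHashMap is not thread-safe; the wrapper serializes concurrent callers.
        this.map = Collections.synchronizedMap(lru);
    }

    V get(K key) { return map.get(key); }

    void put(K key, V value) { map.put(key, value); }

    int size() { return map.size(); }
}
```

The trade-off is coarse locking on every access, which is usually acceptable for a small cache of rarely changing entries.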

Collaborator Author

It's expected that the total number of scorers is finite and that each one, once loaded, will be used multiple times. I.e., this is more of a utility to spare the user from loading all scorers at once into a new class and then importing from there.

Comment on lines 28 to 32
        return new CustomScorerFactory();
    }

    public BuiltInScorersFactory builtIn() {
        return new BuiltInScorersFactory();

medium

The customScorer() and builtIn() methods create new factory instances on every call. Since these factories (CustomScorerFactory, BuiltInScorersFactory) are stateless, they can be instantiated once and stored as final fields in ScorersFactory to avoid unnecessary object creation.

    private final CustomScorerFactory customScorerFactory = new CustomScorerFactory();
    private final BuiltInScorersFactory builtInScorersFactory = new BuiltInScorersFactory();

    public CustomScorerFactory customScorer() {
        return customScorerFactory;
    }

    public BuiltInScorersFactory builtIn() {
        return builtInScorersFactory;
    }

Comment on lines 127 to 131
        try {
            Thread.sleep(10000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

medium

Using Thread.sleep() to wait for asynchronous operations to complete is unreliable. The tracer might not have finished sending all data within the 10-second window, or the sleep might be unnecessarily long. Use the Tracer's built-in forceFlush() method to ensure all buffered spans are sent before the application exits.

        // Wait for async traces to be exported before exiting.
        tracer.forceFlush(10000);
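
The general idea, independent of any tracer API, is to wait on the work itself with a timeout rather than sleep for a fixed interval. The sketch below uses a plain `ExecutorService` as a stand-in for an asynchronous exporter; `FlushDemo` and `exportAndWait` are hypothetical names. It also restores the interrupt flag instead of only printing the stack trace.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class FlushDemo {
    /** Submits the given number of tasks, then blocks until they finish or the timeout elapses. */
    static int exportAndWait(int tasks, long timeoutMillis) {
        ExecutorService exporter = Executors.newSingleThreadExecutor();
        AtomicInteger exported = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            exporter.execute(exported::incrementAndGet); // stand-in for sending one span
        }
        exporter.shutdown(); // stop accepting work; already queued tasks still run
        try {
            // Returns as soon as the queue drains, or after the timeout at the latest.
            if (!exporter.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                exporter.shutdownNow();
            }
        } catch (InterruptedException e) {
            exporter.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        }
        return exported.get();
    }
}
```

Unlike a fixed sleep, this returns early when the work completes quickly and gives an explicit signal when the timeout is hit.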

Comment on lines 14 to 15
private final String judgmentApiKey;
private final String organizationId;

medium

The judgmentApiKey and organizationId fields are initialized in the constructor but are never used. They should be removed to avoid confusion and unnecessary code. The corresponding assignments in the constructor and fields in the builder should also be removed.

Comment on lines 66 to 77
            try {
                PromptScorer traceScorer = client.scorers()
                        .tracePromptScorer()
                        .create()
                        .name("trace-quality-scorer")
                        .prompt("Does the entire trace show high quality reasoning?")
                        .threshold(0.8)
                        .build();
                System.out.println(" Created TracePromptScorer: " + traceScorer.getName());
            } catch (Exception e) {
                System.out.println(" TracePromptScorer creation demo");
            }

medium

This try-catch block is unnecessary. Creating a scorer using the builder pattern (create()...build()) is a local operation that does not perform I/O and is not expected to throw an exception. This block appears to be a copy-paste from the example for fetching an existing scorer, which does perform a network request. Removing the try-catch will make the code cleaner.

            PromptScorer traceScorer = client.scorers()
                    .tracePromptScorer()
                    .create()
                    .name("trace-quality-scorer")
                    .prompt("Does the entire trace show high quality reasoning?")
                    .threshold(0.8)
                    .build();
            System.out.println("   Created TracePromptScorer: " + traceScorer.getName());

@abhishekg999 changed the title from "feat: init v1" to "feat: init v1 client format" on Nov 6, 2025
Contributor

@adivate2021 adivate2021 left a comment

lgtm just some minor comments

Contributor

@adivate2021 adivate2021 left a comment

lgtm

@abhishekg999 abhishekg999 merged commit f727451 into main Nov 11, 2025
2 checks passed
@abhishekg999 abhishekg999 deleted the ahh/0.3.0 branch November 11, 2025 22:53

3 participants