Skip to content

Conversation

@robfrank
Copy link
Collaborator

No description provided.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @robfrank, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where sequential index creation operations would fail with a NeedRetryException if background compaction was running. The fix implements a synchronous blocking mechanism, causing index creation to wait for ongoing asynchronous tasks to complete, thereby improving the reliability and predictability of index management without requiring manual retry logic from users.

Highlights

  • Sequential Index Creation Reliability: Addressed NeedRetryException that occurred when creating multiple indexes sequentially on large datasets, especially when background LSMTree compaction was active.
  • Synchronous Blocking for Index Creation: Modified TypeIndexBuilder.java and ManualIndexBuilder.java to wait for asynchronous tasks (like compaction) to complete before creating a new index, instead of throwing an exception.
  • Enhanced Test Coverage: Introduced Issue2702SequentialIndexCreationTest.java to comprehensively test sequential index creation and the new waiting behavior under compaction scenarios.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • 2702-sequential-index-creation-fix.md
  • engine/src/main/java/com/arcadedb/index/fulltext/LSMTreeFullTextIndex.java
    • Added Comparator import.
    • Added Javadoc comments to several methods (LSMTreeFullTextIndexFactoryHandler, get(Object[] keys, int limit), parseQueryTerms, put, remove, remove(Object[] keys, Identifiable rid), analyzeText).
    • Refactored searchMoreLikeThis method for cleaner code, removing redundant null/empty checks and simplifying score accumulation and sorting.
  • engine/src/main/java/com/arcadedb/schema/ManualIndexBuilder.java
    • Changed behavior to wait for async completion instead of throwing NeedRetryException.
    • Added explanatory comment.
  • engine/src/main/java/com/arcadedb/schema/TypeIndexBuilder.java
    • Changed behavior to wait for async completion instead of throwing NeedRetryException.
    • Added explanatory comment.
  • engine/src/test/java/com/arcadedb/index/Issue2702SequentialIndexCreationTest.java
    • Added new comprehensive test for sequential index creation and async compaction waiting.
Activity
  • No specific review comments or discussions have been recorded yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@claude
Copy link

claude bot commented Feb 11, 2026

Pull Request Review

I've reviewed PR #3408 and have the following feedback:

Overview

This PR addresses Issue #2702 by changing the behavior when creating indexes while async processing (LSMTree compaction) is running. Instead of immediately throwing NeedRetryException, the code now waits for async tasks to complete.


🟢 Positives

1. Code Quality

  • ✅ Clean, focused changes with clear intent
  • ✅ Excellent inline comments explaining the rationale
  • ✅ Consistent implementation across both TypeIndexBuilder and ManualIndexBuilder
  • ✅ Follows existing code style (single-statement if blocks without braces)

2. Test Coverage

  • ✅ Comprehensive test class with two distinct test scenarios
  • ✅ Tests cover the actual issue (sequential index creation on large datasets)
  • ✅ Tests verify the waiting behavior explicitly
  • ✅ Good use of assertThat() syntax as per project guidelines
  • ✅ Proper cleanup of GlobalConfiguration changes in finally blocks

3. Documentation

  • ✅ Extensive markdown documentation (2702-sequential-index-creation-fix.md) explaining the problem, solution, and trade-offs
  • ✅ Clear comments in code explaining the behavior change

🟡 Concerns & Suggestions

1. Thread Safety & Deadlock Risk ⚠️

Issue: Calling database.async().waitCompletion() while holding transaction locks could potentially lead to deadlock scenarios.

Location:

  • TypeIndexBuilder.java:86-89
  • ManualIndexBuilder.java:47-50

Concern: If the async compaction task needs to acquire any locks that the current thread holds, this could deadlock. The code path should be analyzed to ensure:

  • The async tasks don't need schema locks
  • The current transaction context won't block async workers

Recommendation: Add a timeout to waitCompletion() calls to prevent indefinite blocking:

if (database.isAsyncProcessing()) {
  // Wait with timeout to prevent indefinite blocking
  database.async().waitCompletion(timeout);
}

2. LSMTreeFullTextIndex.java Changes

Issue: This file contains significant refactoring unrelated to the main fix:

  • Removed AtomicInteger usage in searchMoreLikeThis (line 695: Map<RID, Integer> instead of Map<RID, AtomicInteger>)
  • Simplified control flow (removed many braces)
  • Added JavaDoc comments
  • Changed sorting implementation (line 715: using Comparator.comparingInt().reversed())

Concern: These changes, while improvements, are not part of the sequential index creation fix and make the PR harder to review. They could also introduce subtle bugs.

Recommendation:

  • Per CLAUDE.md: "don't add Claude as author" - The extensive JavaDoc additions suggest AI-generated documentation
  • 🔄 Consider splitting LSMTreeFullTextIndex changes into a separate PR for proper review
  • 🧪 Ensure the searchMoreLikeThis refactoring has adequate test coverage

3. Unrelated Documentation File

Issue: The file 2702-sequential-index-creation-fix.md in the root directory appears to be design documentation.

Question: Should this be:

  • Moved to a docs/ directory?
  • Converted to a GitHub issue comment instead?
  • Removed before merging (documentation belongs in code comments and commit messages)?

Per CLAUDE.md: "NEVER proactively create documentation files (*.md)" - This markdown file should likely not be in the repository.

4. Missing Validation

Question: What happens if database.async().waitCompletion() throws an exception? Should there be error handling?

if (database.isAsyncProcessing()) {
  try {
    database.async().waitCompletion();
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new NeedRetryException("Interrupted while waiting for async tasks", e);
  }
}

5. Performance Considerations

Question: The documentation states this can take "30-60+ seconds" for compaction. This means:

  • Index creation DDL operations can block for minutes
  • Applications may experience timeouts if not configured properly
  • Users might perceive the system as hung

Recommendation: Consider adding:

  • Logging when waiting starts: LogManager.instance().log(..., FINE, "Waiting for async compaction...")
  • Progress indicators if possible
  • Documentation in user-facing docs about this behavior

🔍 Testing Recommendations

Run These Tests Before Merging:

# Test the new functionality
mvn test -Dtest=Issue2702SequentialIndexCreationTest

# Regression tests for index creation
mvn test -Dtest="*IndexBuilder*,*IndexCompaction*,LSMTreeIndexTest,TypeLSMTreeIndexTest"

# Full-text index tests (due to LSMTreeFullTextIndex changes)
mvn test -Dtest="*FullText*,*MoreLikeThis*"

🎯 Recommendations Summary

Must Address:

  1. ❌ Remove or relocate the `2702-sequential-index-creation-fix.md" file (violates CLAUDE.md guidelines)
  2. 🔄 Split LSMTreeFullTextIndex refactoring into a separate PR or justify its inclusion
  3. 🧪 Run full test suite, especially full-text search tests

Should Consider:

  1. Add timeout to waitCompletion() calls
  2. Add logging when blocking on async tasks
  3. Add exception handling for interrupted waits
  4. Verify no deadlock scenarios exist

Nice to Have:

  1. Document the blocking behavior in user-facing documentation
  2. Consider adding metrics/monitoring for wait times

✅ Conclusion

The core fix for Issue #2702 is sound and well-implemented. The approach is simple, predictable, and aligns with standard database behavior. However, the PR contains unrelated changes that should be addressed before merging.

Verdict: Approve with changes requested for:

  • Removing the markdown doc file
  • Clarifying/separating LSMTreeFullTextIndex changes
  • Adding timeout/logging for production robustness

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses an issue where creating multiple indexes sequentially could fail with a NeedRetryException. The fix, implemented in TypeIndexBuilder and ManualIndexBuilder, is to wait for any ongoing asynchronous tasks to complete instead of throwing an exception. This change improves the developer experience by removing the need for manual retry logic.

A new test class, Issue2702SequentialIndexCreationTest, has been added to verify the fix under conditions that would previously trigger the error, which is great.

Additionally, the LSMTreeFullTextIndex class has been refactored for better readability and maintainability, including the addition of Javadocs and the use of more modern Java constructs.

My review includes one suggestion regarding coding style in LSMTreeFullTextIndex.java to ensure long-term maintainability.

@mergify
Copy link
Contributor

mergify bot commented Feb 11, 2026

🧪 CI Insights

Here's what we observed from your CI run for 2422c46.

🟢 All jobs passed!

But CI Insights is watching 👀

@codacy-production
Copy link

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
-11.59% 84.62%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (ef9cb7c) 83379 60226 72.23%
Head commit (2422c46) 112953 (+29574) 68495 (+8269) 60.64% (-11.59%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#3408) 26 22 84.62%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

@codecov
Copy link

codecov bot commented Feb 11, 2026

Codecov Report

❌ Patch coverage is 65.38462% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.57%. Comparing base (ef9cb7c) to head (2422c46).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
.../arcadedb/index/fulltext/LSMTreeFullTextIndex.java 70.83% 2 Missing and 5 partials ⚠️
...n/java/com/arcadedb/schema/ManualIndexBuilder.java 0.00% 1 Missing ⚠️
...ain/java/com/arcadedb/schema/TypeIndexBuilder.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3408      +/-   ##
==========================================
+ Coverage   63.47%   63.57%   +0.09%     
==========================================
  Files        1250     1251       +1     
  Lines       83379    83395      +16     
  Branches    16986    16991       +5     
==========================================
+ Hits        52928    53018      +90     
+ Misses      23179    23092      -87     
- Partials     7272     7285      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.


// Wait for any running async tasks (e.g., compaction) to complete before creating new index
// This prevents NeedRetryException when creating multiple indexes sequentially on large datasets
if (database.isAsyncProcessing())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be

while (database.isAsyncProcessing())          
  database.async().waitCompletion();

In case of concurrency?

@@ -0,0 +1,163 @@
# Issue #2702: NeedRetryException when creating indexes sequentially on large datasets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this file in the codebase


// Wait for any running async tasks (e.g., compaction) to complete before creating new index
// This prevents NeedRetryException when creating multiple indexes sequentially on large datasets
if (database.isAsyncProcessing())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be

while (database.isAsyncProcessing())          
  database.async().waitCompletion();

In case of concurrency?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants