Skip to content

OAK-12010 Simplified index management (without optimizer)#2689

Open
thomasmueller wants to merge 15 commits intotrunkfrom
OAK-12010-subset
Open

OAK-12010 Simplified index management (without optimizer)#2689
thomasmueller wants to merge 15 commits intotrunkfrom
OAK-12010-subset

Conversation

@thomasmueller
Copy link
Member

@thomasmueller thomasmueller commented Jan 19, 2026

This is a subset of the (much) larger draft PR #2652

This is just support for the "diff.index"; without the optimizer.

Copy link
Contributor

@bhabegger bhabegger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean work 👍

I like the test coverage and the comments among the rest !

I don't see anything blocking and just have a few comments/suggestions.

if (!diffContent.exists()) {
continue;
}
PropertyState lastMod = diffContent.getProperty("jcr:lastModified");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not a constant definition for this type of "well-known" entries ? (same for :lastProcessed, jcr:data, /oak:index/, etc.)

In my opinion this helps a lot to understand, identify in one go what are the 'system' properties, paths, etc., and know where they are used. But also have them documented in the code base as javadoc.

(Not a blocker, just a curiosity 😉)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There probably is. I agree we are not consistently using these constants, so there is a risk of typos.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used constants now where available, and did add new ones where I think it is useful (which -- granted -- is subjective).

* @param store the node store
* @param indexDefinitions the /oak:index node
*/
public static void applyDiffIndexChanges(NodeStore store, NodeBuilder indexDefinitions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ease readability maybe we could split this method in :

  • parsing entries
  • diffing entries

and delegate the orchestration to a caller method ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now split it into two parts that I think make sense.

JsonObject repositoryDefinitions = RootIndexesListService.getRootIndexDefinitions(indexDefinitions);
LOG.debug("Index list {}", repositoryDefinitions.toString());
try {
DiffIndexMerger.instance().merge(newImageLuceneDefinitions, repositoryDefinitions, store);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge also seams to do something with diff.index and diff.index.optimizer couldn't we call the mergeDiff directly instead (without a dependency to the NodeStore)?

Copy link
Member Author

@thomasmueller thomasmueller Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to re-use the "merge" feature in another project (when processing diffs that are sent via script, so come from external sources). So it's better to keep DiffIndexMerger#merge


// merge
JsonObject merged = null;
if (indexDiff == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, maybe this premerge null checking logic could be moved down into the processMerge itself ? (The "nothing to do" case could be handled by a null or Optional merged result)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some code executed in this case (latestCustomized == null...) so that would require changing processMerge, which I would avoid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        if (indexDiff == null) {
            // no diff definition: use to the OOTB index
            if (latestCustomized == null) {
                log("Only a product index found, nothing to do");
                return false;
            }
            merged = latestProductIndex;
        } else {
            merged = processMerge(latestProductIndex, indexDiff);
        }

Could be rewritten:

           if(indexDiff == null && latestCustomized == null) {
                log("Only a product index found, nothing to do");
                return false;
           }
           merged = processMerge(latestProductIndex, indexDiff);

with processMerge returning latestProductIndex if indexDiff is null.

we could even let it return null if both are null and exit fast if merged is null with the log.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it would move part of the logic to processMerge, which I would like to avoid. Also, the comment is lost.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's fine then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's fine then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that's fine then.

*
* "null" entries are not supported.
*/
public class JsonNodeBuilder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Builder in the name could be misleading as, unless I missed something. this class doesn't seem to follow the "Builder" pattern, and seams more like a Helper class (but then there are different style tastes out their).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right... It's a converter from "Json" to "NodeBuilder". It's confusing.

What about JsonNodeWriter, or JsonToNodeConverter?

Copy link
Contributor

@bhabegger bhabegger Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer Converter among the two as Writer is quite connotated (PrintWriter, StringWriter, etc.). However, both Writer and Converter still suggests some sort of instantiation and don't highlight that this class is a utility class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JsonNodeConversionUtil ?
JsonNodeConversionHelper ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, it's not really a converter... the nodes are persisted in Oak, and "converter" (for me) would imply that the nodes are just converted and the result is returned.

It's mostly about creating, and possibly updating and deleting, nodes... So what about:

  • JsonNodeSmith (recommended by a tool; sounds right to me)
  • JsonNodeCreator (because it's mostly about creating new nodes... but not purely)
  • JsonNodeWriter, for the case it's updating / deleting (currently, I think it's not, but possibly in the future)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then. Writer could pass, because it does I/O even though it's not in the usual sense. JsonNodeSaver ? JsonNodeStorer ? JsonNodeManager ? JsonNodeUpdater ?

I'm not really convinced by any... But maybe it triggers need ideas ? Otherwise just go with your preference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then. Writer could pass, because it does I/O even though it's not in the usual sense. JsonNodeSaver ? JsonNodeStorer ? JsonNodeManager ? JsonNodeUpdater ?

I'm not really convinced by any... But maybe it triggers need ideas ? Otherwise just go with your preference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so I use JsonNodeUpdater.

import java.util.ArrayList;

import org.apache.felix.inventory.Format;
import org.apache.jackrabbit.oak.commons.json.JsonObject;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing that OAK has it's own JSON processing library for historical reasons ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

+ " } }");

repositoryDefinitions = RootIndexesListService.getRootIndexDefinitions(store, "lucene");
assertSameJson("{\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame Java 11 is a bit too early for text blocks :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree (btw. text blocks are not "raw" and still need escaping).

But I think it's readable as is.

public void createDummy() {
// when enabling "deleteCreatesDummyIndex", then a dummy index is created
// (that indexes /dummy, which doesn't exist)
String merged = new DiffIndexMerger(new String[0], true, true, false).processMerge(JsonObject.fromJson("{}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which true is it ?

Suggested change
String merged = new DiffIndexMerger(new String[0], true, true, false).processMerge(JsonObject.fromJson("{}"
boolean deleteCreatesDummyIndex = true;
String merged = new DiffIndexMerger(new String[0], deleteCreatesDummyIndex, true, false).processMerge(JsonObject.fromJson("{}"

Would make it explicit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, "list of boolean flags" is not user friendly... thinking about an alternative.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added setters (there no real harm to not have the fields final in this case). Sure we could also add a Builder. Or a Factory, or AdapterFactoryBuilder... :-)

thomasmueller and others added 2 commits February 2, 2026 16:18
…/diff/DiffIndex.java

Co-authored-by: Benjamin Habegger <benjamin@habegger.tech>
…/diff/DiffIndex.java

Co-authored-by: Benjamin Habegger <benjamin@habegger.tech>
Copy link
Contributor

@amit-jain amit-jain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me!!
A few minor suggestions added.

* (input)
* @return whether a new version of an index was added
*/
public boolean processMerge(String indexName, JsonObject indexDiff, JsonObject newImageLuceneDefinitions, JsonObject combined) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes recommend breaking this method, example (from comments)

  • findLatestVersions
  • performMerge
  • isChanged
  • addMetadata

thomasmueller and others added 8 commits February 6, 2026 15:25
…/diff/DiffIndex.java

Co-authored-by: Amit Jain <amitj@apache.org>
…/diff/DiffIndexMerger.java

Co-authored-by: Amit Jain <amitj@apache.org>
…/diff/DiffIndex.java

Co-authored-by: Amit Jain <amitj@apache.org>
…/diff/JsonNodeBuilder.java

Co-authored-by: Benjamin Habegger <benjamin@habegger.tech>
…/diff/DiffIndex.java

Co-authored-by: Amit Jain <amitj@apache.org>
@sonarqubecloud
Copy link

sonarqubecloud bot commented Feb 6, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants