Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge common APIs for Dictionary #6176

Merged
merged 2 commits into from
Oct 23, 2020

Conversation

Jackie-Jiang
Copy link
Contributor

Description

Motivation:
Currently the APIs for Dictionary is split in 3 places: Dictionary, BaseImmutableDictionary, BaseMutableDictionary. In order to use them, we need to cast the dictionary first, which is hard to manage and can potentially cause casting error.
E.g. #6174 is caused by casting an immutable dictionary to BaseMutableDictionary.
We should move the common read APIs to the root Dictionary interface to avoid the casting, and let all types of dictionary support these APIs.

Merge the following common APIs from BaseImmutableDictionary and BaseMutableDictionary to Dictionary:

  • insertionIndexOf
  • getDictIdsInRange
  • compare
  • getMinVal
  • getMaxVal
  • getSortedValues

@Jackie-Jiang Jackie-Jiang requested a review from chenboat October 22, 2020 19:10

@Override
public Object getSortedValues() {
throw new UnsupportedOperationException();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this unsupported? since it returns true for isSorted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want a default impl, then perhaps just remove it, so the subclass must implement it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the note: This method is for the stats collection phase when sealing the consuming segment, so not required for regular immutable dictionary within the immutable segment.

@@ -81,6 +80,31 @@ public int indexOf(String stringValue) {
return (index >= 0) ? index : NULL_VALUE_INDEX;
}

@Override
public IntSet getDictIdsInRange(String lower, String upper, boolean includeLower, boolean includeUpper) {
throw new UnsupportedOperationException();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can this be supported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is for the unsorted dictionary. Added the note

*/
int indexOf(String stringValue);

/**
* Returns the insertion index of the string representation of the value in the dictionary. This method follows the
* same behavior as in {@link Arrays#binarySearch(Object[], Object)}. All sorted dictionary should support this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: dictionary-> dictionaries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

IntSet getDictIdsInRange(String lower, String upper, boolean includeLower, boolean includeUpper);

/**
* Returns the comparison result of value for dictId 1 and dictId 2, i.e. {@code value1.compareTo(value2)}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it string value comparison?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, actual value. Updated the javadoc

int compare(int dictId1, int dictId2);

/**
* Returns the minimum value in the dictionary. Note that for type BYTES, {@code ByteArray} will be returned.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if the dictionary is empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. It is undefined, and should not be called when the dictionary is empty. Updated the javadoc.

@@ -56,12 +56,12 @@ protected FilterBlock getNextBlock() {

int firstRangeId;
int lastRangeId;
if (_rangePredicateEvaluator instanceof OfflineDictionaryBasedRangePredicateEvaluator) {
if (_rangePredicateEvaluator instanceof SortedDictionaryBasedRangePredicateEvaluator) {
Copy link
Contributor

@chenboat chenboat Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change related to this API refactoring? I did not see any change to OfflineDictionaryBasedRangePredicateEvaluator in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We should determine the evaluator by whether the dictionary is sorted, instead of the instance of the dictionary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean I did not see any change to OfflineDictionaryBasedRangePredicateEvaluator nor SortedDictionaryBasedRangePredicateEvaluator in this PR. Is this an existing bug for range predicate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that is under the RangeIndexBasedFilterOperator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Got it. You made changes in RangePredicateEvaluatorFactory.java.

@deemoliu
Copy link
Contributor

Hello @Jackie-Jiang Can we include this fix in Apache Pinot (incubating) 0.6.0?

cc'ed @yupeng9 @chenboat

@Jackie-Jiang
Copy link
Contributor Author

@deemoliu I think we will cut the release on 10/30. @jackjlli can you please confirm?

@jackjlli
Copy link
Member

Yes, the new release 0.6.0 will be cut on 10/30.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants