Skip to content

Timestamp type index#8343

Merged
xiangfu0 merged 3 commits intoapache:masterfrom
xiangfu0:timestamp-type-index
Apr 11, 2022
Merged

Timestamp type index#8343
xiangfu0 merged 3 commits intoapache:masterfrom
xiangfu0:timestamp-type-index

Conversation

@xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Mar 12, 2022

Description

Adding Timestamp index for Pinot data type TIMESTAMP.

Timestamp stores value as millisecond epoch long value. Typical workload for timestamp is filtering on a time range and group by with different time granularities(days/month/etc).

The current implementation requires the query executor to extract values, apply the function then do filter/groupBy, no leverage on dictionary or index.

This PR introduces the Timestamp Index. Users can configure the most useful granularities for a Timestamp data type column.

  1. Pinot will pre-generate one column per time granularity with forward index and range index. The naming convention is $${ts_column_name}$${ts_granularity}, e.g. Timestamp column ts with granularities DAY, MONTH will have two extra columns generated: $ts$DAY and $ts$MONTH.
  2. Query overwrite for predicate and selection/groupby:
    2.1 GROUPBY: functions like dateTrunc('DAY', ts) will be translated to use the underly column $ts$DAY to fetch data.
    2.2 PREDICATE: range index is auto-built for all granularity columns.

Example query usage:

select count(*), datetrunc('WEEK', ts) as tsWeek from airlineStats WHERE tsWeek > fromDateTime('2014-01-16', 'yyyy-MM-dd') group by tsWeek  limit 10

Some preliminary benchmark shows the query perf over 2.7 billion records improved from 45 secs to 4.2 secs

select dateTrunc('YEAR', event_time) as y, dateTrunc('MONTH', event_time) as m,  sum(pull_request_commits) from githubEvents group by y, m limit 1000  Option(timeoutMs=3000000)

image
vs.
image

Release Notes

Introduces the Timestamp Index. Users can configure the most useful granularities for a Timestamp data type column.

  1. Pinot will pre-generate one column per time granularity with forward index and range index. The naming convention is $${ts_column_name}$${ts_granularity}, e.g. Timestamp column ts with granularities DAY, MONTH will have two extra columns generated: $ts$DAY and $ts$MONTH.
  2. Query overwrite for predicate and selection/groupby:
    2.1 SELECTION/GROUPBY: functions like dateTrunc('DAY', ts) will be translated to use the underly column $ts$DAY to fetch data.
    2.2 PREDICATE: range index is auto-built for all granularity columns.

Documentation

@xiangfu0 xiangfu0 marked this pull request as draft March 12, 2022 00:38
@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch 6 times, most recently from cb8f7d1 to b197bfe Compare March 12, 2022 07:44
@codecov-commenter
Copy link

codecov-commenter commented Mar 12, 2022

Codecov Report

Merging #8343 (a613e80) into master (b2e2a7e) will increase coverage by 0.08%.
The diff coverage is 53.08%.

@@             Coverage Diff              @@
##             master    #8343      +/-   ##
============================================
+ Coverage     70.57%   70.66%   +0.08%     
- Complexity     4283     4285       +2     
============================================
  Files          1672     1674       +2     
  Lines         87483    87640     +157     
  Branches      13241    13273      +32     
============================================
+ Hits          61745    61931     +186     
+ Misses        21454    21398      -56     
- Partials       4284     4311      +27     
Flag Coverage Δ
integration1 27.12% <5.55%> (+0.10%) ⬆️
integration2 26.11% <3.70%> (+0.03%) ⬆️
unittests1 66.99% <66.37%> (-0.01%) ⬇️
unittests2 14.08% <0.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...inot/common/function/scalar/DateTimeFunctions.java 95.23% <ø> (ø)
...he/pinot/segment/local/utils/TableConfigUtils.java 65.13% <0.00%> (-0.29%) ⬇️
...local/segment/index/loader/IndexLoadingConfig.java 66.00% <12.50%> (-4.58%) ⬇️
...roker/requesthandler/BaseBrokerRequestHandler.java 50.63% <15.21%> (-1.43%) ⬇️
...egment/spi/index/metadata/SegmentMetadataImpl.java 73.59% <25.00%> (-0.70%) ⬇️
...local/recordtransformer/ExpressionTransformer.java 87.71% <30.00%> (-12.29%) ⬇️
...pache/pinot/common/config/provider/TableCache.java 76.07% <66.66%> (-0.12%) ⬇️
...ot/spi/config/table/TimestampIndexGranularity.java 81.25% <81.25%> (ø)
...ot/segment/spi/creator/SegmentGeneratorConfig.java 82.49% <87.09%> (+0.40%) ⬆️
...indexsegment/immutable/ImmutableSegmentLoader.java 85.18% <100.00%> (+0.42%) ⬆️
... and 34 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b2e2a7e...a613e80. Read the comment docs.

@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch from b197bfe to b96fdbd Compare March 12, 2022 09:55
@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch 2 times, most recently from b85926d to 80ecfe8 Compare March 12, 2022 11:44
@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch from 80ecfe8 to 6b76294 Compare March 12, 2022 20:29
@kishoreg
Copy link
Member

Please add description

@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch from 0caf992 to 760e539 Compare March 12, 2022 23:58
@xiangfu0
Copy link
Contributor Author

Please add description

Will do. This is still WIP, also need to add unit & integration tests.

@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch 11 times, most recently from 963231b to de009b2 Compare March 16, 2022 22:59
@richardstartin
Copy link
Member

Can you explain if there is any dependency on distinct hash code values? I recall seeing hash codes being used as map keys but can’t find it.

@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch from 8740247 to 531cf6e Compare March 30, 2022 18:35
@xiangfu0
Copy link
Contributor Author

Can you explain if there is any dependency on distinct hash code values? I recall seeing hash codes being used as map keys but can’t find it.

Got suggestions from @Jackie-Jiang to change this logic to use the hint expressions to prevent the hash collision.
See: #8421

@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch 4 times, most recently from 3d077c2 to 5f57202 Compare March 31, 2022 08:30
@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch from 5f57202 to cdabfa3 Compare March 31, 2022 08:48
Copy link
Member

@richardstartin richardstartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This achieves a couple of goals

  • use an index for time range queries in common time units
  • have a dictionary for time values in common time units, which accelerates group by expressions
  • bypasses the time conversions which are slow and have to use a library, so can't be accelerated further

This could be extended to provide inventory for any affine function ax + b for given parameter values for a and b, or any monotonic function. So we could also do this for cents and dollars or celsius and Fahrenheit, assuming we have functions to represent these conversions, or even for logarithms.

I think this is the simplest possible implementation (just add a column for each unit) but I think it's wasteful.

  • time range parameters could just be translated into the base unit. e.g. to query a day range, convert the first day's start and the last day's end into milliseconds and query the range index in millisecond units. Parameter translation works for any affine or monotonic function, whereas the materialisation only works for a given scale a and offset b. That is, you don't need to store anything extra to make this work, we just need an affine function concept. If an affine function f is applied to a field x, and x has an index, that index also indexes f(query parameters).
  • this approach can never work for arbitrary arithmetic expressions like ax + b, because we need a physical data structure for each pair (a, b).

@richardstartin
Copy link
Member

Thinking about it more, the time transforms are indeed costly so storing the values makes sense for this use case, which can be extended to many time conversion functions. I don't think this approach generalises to arithmetic, where the transforms tend to vectorize anyway so aren't as slow as time conversion.

@xiangfu0 xiangfu0 force-pushed the timestamp-type-index branch 2 times, most recently from b5f7ace to 69e50bb Compare April 1, 2022 23:02
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my understanding, the current implementation does not handle the case of adding timestamp index on the fly. The index addition logic is not implemented in the BaseDefaultColumnHandler

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some validation to the TableConfigUtils.validate()? The timestamp index can only be applied to columns with millis granularity (TIMESTAMP data type, or DATE_TIME field with EPOCH:MILLISECONDS, or it will give wrong result.

@xiangfu0
Copy link
Contributor Author

xiangfu0 commented Apr 8, 2022

Can we add some validation to the TableConfigUtils.validate()? The timestamp index can only be applied to columns with millis granularity (TIMESTAMP data type, or DATE_TIME field with EPOCH:MILLISECONDS, or it will give wrong result.

make sense.
Adding the validate logic that this index only supports for TIMESTAMP type for now.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature release-notes Referenced by PRs that need attention when compiling the next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants