@JeffreySmith

No description provided.

abhishekrb19 and others added 30 commits December 5, 2023 22:30
It wasn't checking the column name, so it would return a domain regardless
of the input column. This means that null filters on data sources with range
partitioning would lead to excessive pruning of segments, and therefore
missing results.
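
To illustrate why the missing column-name check leads to over-pruning, here is a minimal Python sketch. It is a simplified model, not Druid's Java code: `NullFilterSketch`, `domain_for`, and `can_prune` are hypothetical names introduced for this example only.

```python
from typing import FrozenSet, Optional


class NullFilterSketch:
    """Hypothetical, simplified model of a null filter on one column."""

    def __init__(self, column: str) -> None:
        self.column = column

    def domain_for(self, dimension: str) -> Optional[FrozenSet[Optional[str]]]:
        # Without this check the filter would report a domain for every
        # dimension, including the range-partitioning dimension.
        if dimension != self.column:
            return None  # no constraint on this dimension
        return frozenset({None})  # only the null value matches


def can_prune(segment_values: FrozenSet[Optional[str]],
              domain: Optional[FrozenSet[Optional[str]]]) -> bool:
    # A segment may be skipped only when a known domain excludes all its values.
    return domain is not None and segment_values.isdisjoint(domain)
```

With the check in place, a null filter on `country` says nothing about a segment partitioned on `city`, so that segment is kept and still scanned.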
…er (apache#15496)

Description
With CentralizedDatasourceSchema (apache#14989) feature enabled, metadata for appended segments was not being refreshed. This caused numRows to be 0 for the new segments and would probably cause the datasource schema to not include columns from the new segments.

Analysis
The problem turned out to be in the new QuerySegmentWalker implementation in the Coordinator. It first finds the segment to be queried in the Coordinator timeline. Then it creates a new timeline of the segments present in the timeline.
The problem was that it was looking up a complete partition set in the new timeline. Since the appended segments by themselves do not make a complete partition set, no SegmentMetadataQuery was executed.
…s Github Actions (apache#15506)

Add test logs zipping and archival steps for failures in Static Checks Github Actions
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* add Assert function to verify in the DataGeneratorTest

* remove unused log in DataGeneratorTest

* add comment for DataGeneratorTest
Fixes a potential NPE that could occur while folding in the HllSketchAggregator. If the sketch is null, Druid could return a null HllSketchHolder object. Adding a null check here guards against that.

Resolves a null pointer exception in HllSketchAggregatorFactory
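
As a rough illustration of a null-safe fold, the Python sketch below treats a missing sketch as contributing nothing. It is an assumption about the general shape of such a check, not the actual change: `fold` is a hypothetical function and a `frozenset` stands in for a real HLL sketch.

```python
from typing import FrozenSet, Optional

Sketch = FrozenSet[str]  # hypothetical stand-in for a real HLL sketch


def fold(lhs: Optional[Sketch], rhs: Optional[Sketch]) -> Optional[Sketch]:
    # Null checks first: a missing side contributes nothing to the union.
    if lhs is None:
        return rhs
    if rhs is None:
        return lhs
    return lhs | rhs
```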
…alars for equality (apache#15503)

* fix array presenting columns to not match single element arrays to scalars for equality
* update docs to clarify usage model of mixed type columns
…pache#15517)

* Add better error messages for using OVERWRITE with INSERT statements
* Fixing NPE with virtual expression with unnest

* Fixing a comment
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
…eptions in Router (apache#15526)

A query with lookups in a FilteredAggregator fails with this exception on the Router:

Cannot construct instance of `org.apache.druid.query.aggregation.FilteredAggregatorFactory`, problem: Lookup [campaigns_lookup[campaignId][is_sold][autodsp]] not found at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 1, column: 913] (through reference chain: org.apache.druid.query.groupby.GroupByQuery["aggregations"]->java.util.ArrayList[1])
The problem is that the constructor of FilteredAggregatorFactory validates that the lookup exists, in the statement dimFilter.toFilter().
This fails on the Router, which is to be expected, because the Router isn't assigned any lookups.
The fix is to initialise the filter object lazily rather than eagerly in the constructor.
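
The lazy-initialisation idea can be sketched in a few lines of Python. This is an illustration of the pattern only, not the Druid implementation: `LazyFilteredAggregator`, `get_filter`, and `failing_builder` are hypothetical names.

```python
from typing import Callable, Optional


class LazyFilteredAggregator:
    """Hypothetical stand-in for an aggregator factory that wraps a filter."""

    def __init__(self, build_filter: Callable[[], object]) -> None:
        # Store only the recipe; do not resolve lookups at construction time.
        self._build_filter = build_filter
        self._filter: Optional[object] = None

    def get_filter(self) -> object:
        # Build the filter on first use and cache it for later calls.
        if self._filter is None:
            self._filter = self._build_filter()
        return self._filter


def failing_builder() -> object:
    # Simulates a node (such as the Router) that has no lookups loaded.
    raise LookupError("Lookup [campaigns_lookup] not found")
```

Constructing `LazyFilteredAggregator(failing_builder)` succeeds, so a node that only deserializes and forwards the query never triggers the lookup validation. The error surfaces only where `get_filter()` is actually called.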
* Add release notes template

* Update spellcheck
* Add initial draft of MarkDanglingTombstonesAsUnused duty.

* Use overshadowed segments instead of all used segments.

* Add unit test for MarkDanglingSegmentsAsUnused duty.

* Add mock call

* Simplify code.

* Docs

* shorter lines formatting

* metric doc

* More tests, refactor and fix up some logic.

* update javadocs; other review comments.

* Make numCorePartitions as 0 in the TombstoneShardSpec.

* fix up test

* Add tombstone core partition tests

* Update docs/design/coordinator.md

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

* review comment

* Minor cleanup

* Only consider tombstones with 0 core partitions

* Need to register the test shard type to make jackson happy

* test comments

* checkstyle

* fixup misc typos in comments

* Update logic to use overshadowed segments

* minor cleanup

* Rename duty to eternity tombstone instead of dangling. Add test for full eternity tombstone.

* Address review feedback.

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
* Fix empty logs and status messages for mmless ingestion

* Add tests
### Description

This PR adds an API for retrieving unused segments for a particular datasource. The API supports pagination through the `limit` and `lastSegmentId` parameters. The resulting unused segments are returned according to the optional `sortOrder`, `ASC` or `DESC`, with respect to the matching segments' `id`, `start time`, and `end time`. If `sortOrder` is not specified, they are not returned in any guaranteed order.

`GET /druid/coordinator/v1/datasources/{dataSourceName}/unusedSegments?interval={interval}&limit={limit}&lastSegmentId={lastSegmentId}&sortOrder={sortOrder}`

Returns a list of unused segments for a datasource in the cluster contained within an optionally specified interval.
Optional parameters for limit and lastSegmentId can be given as well, to limit results and enable paginated results.
The results may be sorted in either ASC or DESC order by specifying the sortOrder parameter.

- `dataSourceName`: the name of the datasource.
- `interval`: the specific interval to search for unused segments.
- `limit`: the maximum number of unused segments to return information about. This property helps to support pagination.
- `lastSegmentId`: the last segment id from which to search for results. All segments returned are > this segment lexicographically if `sortOrder` is null or ASC, or < this segment lexicographically if `sortOrder` is DESC.
- `sortOrder`: specifies the order in which to return the matching segments by start time, end time. A null value indicates that order does not matter.
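
A client could page through this endpoint roughly as in the Python sketch below. The path and parameter names come from the description above. Everything else is an assumption made for illustration: the helper names, the injected `fetch_page` callable, and the idea that each page is a list of segment id strings (the real response shape is not specified here).

```python
from typing import Callable, Iterator, List, Optional
from urllib.parse import quote, urlencode


def unused_segments_url(base: str, datasource: str,
                        interval: Optional[str] = None,
                        limit: Optional[int] = None,
                        last_segment_id: Optional[str] = None,
                        sort_order: Optional[str] = None) -> str:
    # Include only the optional parameters that were actually supplied.
    params = {}
    if interval is not None:
        params["interval"] = interval
    if limit is not None:
        params["limit"] = limit
    if last_segment_id is not None:
        params["lastSegmentId"] = last_segment_id
    if sort_order is not None:
        params["sortOrder"] = sort_order
    path = (f"{base}/druid/coordinator/v1/datasources/"
            f"{quote(datasource, safe='')}/unusedSegments")
    return f"{path}?{urlencode(params)}" if params else path


def iter_unused_segment_ids(
        fetch_page: Callable[[int, Optional[str]], List[str]],
        limit: int) -> Iterator[str]:
    # fetch_page(limit, last_segment_id) is a hypothetical callable that
    # performs the request and returns one page of segment ids (assumed shape).
    last_id: Optional[str] = None
    while True:
        page = fetch_page(limit, last_id)
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing left to fetch
        last_id = page[-1]  # resume after the last id seen
```

Passing the last id of each page as `lastSegmentId` relies on the ordering described above, so in this sketch `sortOrder` should be left unset or set to `ASC`.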

This PR has:

- [x] been self-reviewed.
   - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [x] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.
The website pom was removed as part of
apache#14411 so we no longer need to
reference it as a module and the profile can be removed.

Dependabot is currently failing trying to look for this module, so
removing it should also fix that.
basapuram-kumar and others added 29 commits September 4, 2025 20:14
…on-wrap (#20)

* ODP-2759: Refactor Druid to use ambari-python-wrap for all scripts

* ODP-2759: Refactor Druid to use ambari-python-wrap for sh

* ODP-3545 : Fix python shebangs to fix build failure

---------

Co-authored-by: Prabhjyot Singh <prabhjyot@acceldata.io>
Co-authored-by: harshith gandhe <harshith@acceldata.io>
(cherry picked from commit 714dfb1)
….parquet:parquet-avro from 1.13.0 to 1.15.2 in /extensions-core/parquet-extensions (apache#18131) (#40)

* Bump org.apache.parquet:parquet-avro

Bumps [org.apache.parquet:parquet-avro](https://github.com/apache/parquet-mr) from 1.15.1 to 1.15.2.
- [Release notes](https://github.com/apache/parquet-mr/releases)
- [Changelog](https://github.com/apache/parquet-java/blob/master/CHANGES.md)
- [Commits](apache/parquet-java@apache-parquet-1.15.1...apache-parquet-1.15.2)

---
updated-dependencies:
- dependency-name: org.apache.parquet:parquet-avro
  dependency-version: 1.15.2
  dependency-type: direct:production
...



* remove second parquet.version

* fix

* bump.licenses.yaml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zoltan Haindrich <kirk@rxd.hu>
OSV-4768 | CVE-2024-36114 : bumping io.airlift_aircompressor to 0.27
…version to 2.16.1 (#47)

* OSV-4763,OSV-4762 | CVE-2025-52999,PRISMA-2023-0067: Bumping jackson version to 2.16.1

* OSV-4763,OSV-4762 | CVE-2025-52999,PRISMA-2023-0067: Bumping jackson version to 2.16.1

---------

Co-authored-by: basapuram-kumar <bsprmkumar@gmail.com>
…46)

* Bump com.google.protobuf:protobuf-java from 3.24.0 to 3.25.5

Bumps [com.google.protobuf:protobuf-java](https://github.com/protocolbuffers/protobuf) from 3.24.0 to 3.25.5.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/protobuf_release.bzl)
- [Commits](protocolbuffers/protobuf@v3.24.0...v3.25.5)

---
updated-dependencies:
- dependency-name: com.google.protobuf:protobuf-java
  dependency-type: direct:production
...



* Updated the license

* Updated licenses.yaml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Shivam Garg <shigarg@visa.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* OSV-4822 | CVE-2023-52428: com.nimbusds_nimbus-jose-jwt - 9.48

---------

Co-authored-by: basapuram-kumar <bsprmkumar@gmail.com>
…#51)

* OSV-3069|PRISMA-2023-0067: Bumping jackson version to 2.15.2

* ODP-3069-Addendum | PRISMA-2023-0067: moving jackson to 2.12.7

---------

Co-authored-by: basapuram-kumar <bsprmkumar@gmail.com>
* Bump jackson to 2.14.1 and fabric8 to 7.2.0 (apache#18013)

(cherry picked from commit 691ea3c)

* Fix typo

* Removal of quidem-ut

---------

Co-authored-by: Lucas Capistrant <capistrant@users.noreply.github.com>