Skip to content

Certain queries produce QueryExecutionErrors #17701

@edg956

Description

@edg956

Hi.

While fixing our integration with PinotDb I have found what seem to be some bugs in the query compilation or execution process.

For context, we (OpenMetadata) have a stack of metrics we usually compute for our users' tables as part of our data profiling feature. I reckon non-OpenMetadata users might find themselves writing similar queries for their data observability needs.

I'll detail them separately in the order I have discovered them:

Comparing with floats in some nested functions throws a NumberFormatException

One of those metrics is actually a histogram, which for all other DBs we compute with a query roughly like this:

select
	sum(case when (age < 35.0) then 1 else 0 end) as age_group_1,
	sum(case when (age >= 35.0 and age < 45.0) then 1 else 0 end) age_group_2,
	sum(case when (age >= 45.0) then 1 else 0 end) age_group_3
from data_source;

Which returns the following error:

Error Code: 200

QueryExecutionError:
Received error query execution result block: {200=QueryExecutionError:
org.apache.pinot.spi.exception.BadQueryRequestException: java.lang.NumberFormatException: For input string: "35.0"
	at org.apache.pinot.core.operator.filter.predicate.PredicateEvaluatorProvider.getPredicateEvaluator(PredicateEvaluatorProvider.java:94)
	at org.apache.pinot.core.operator.filter.predicate.PredicateEvaluatorProvider.getPredicateEvaluator(PredicateEvaluatorProvider.java:100)
	at org.apache.pinot.core.plan.FilterPlanNode.constructPhysicalOperator(FilterPlanNode.java:310)
	at org.apache.pinot.core.plan.FilterPlanNode.run(FilterPlanNode.java:93)
...
Caused by: java.lang.NumberFormatException: For input string: "35.0"
	at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
	at java.base/java.lang.Integer.parseInt(Integer.java:668)
	at java.base/java.lang.Integer.parseInt(Integer.java:786)
	at org.apache.pinot.segment.local.segment.index.readers.IntDictionary.insertionIndexOf(IntDictionary.java:44)}
org.apache.pinot.query.service.dispatch.QueryDispatcher.runReducer(QueryDispatcher.java:306)
org.apache.pinot.query.service.dispatch.QueryDispatcher.submitAndReduce(QueryDispatcher.java:96)
org.apache.pinot.broker.requesthandler.MultiStageBrokerRequestHandler.handleRequest(MultiStageBrokerRequestHandler.java:219)
org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.handleRequest(BaseBrokerRequestHandler.java:133)

This is solved when changing the boundaries to integers.

I initially thought it was an issue with comparing different types but, on top of discovering that it also happens on float<->float comparisons, I discovered another issue:

Nested casting can also break query execution

I attempted to cast the column to float as a fix (before realizing float<->float comparisons also fail), and got the following:

select
	sum(case when (cast(age as float) < 35.0) then 1 else 0 end) as age_group_1,
	sum(case when (cast(age as float) >= 35.0 and cast(age as float) < 45.0) then 1 else 0 end) age_group_2,
	sum(case when (cast(age as float) >= 45.0) then 1 else 0 end) age_group_3
from data_source;
ProcessingException(errorCode:150, message:SQLParsingError:
java.lang.Exception: Unable to find table for this query
	at org.apache.pinot.controller.api.resources.PinotQueryResource.getMultiStageQueryResponse(PinotQueryResource.java:222)
	at org.apache.pinot.controller.api.resources.PinotQueryResource.executeSqlQuery(PinotQueryResource.java:179)
	at org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:127)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
Caused by: java.lang.RuntimeException: Error composing query plan for: select
	sum(case when (cast(age as float) < 35.0) then 1 else 0 end) as age_group_1,
	sum(case when (cast(age as float) >= 35.0 and cast(age as float) < 45.0) then 1 else 0 end) age_group_2,
	sum(case when (cast(age as float) >= 45.0) then 1 else 0 end) age_group_3
from data_source;
...
Caused by: java.lang.IllegalArgumentException: Table does not exist: 'data_source'
	at org.apache.pinot.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:145)
	at org.apache.pinot.query.catalog.PinotCatalog.getTable(PinotCatalog.java:78)
	at org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:128)
	at org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:283))
Finally, I discovered the source of these errors: SUM breaks the whole thing

If I changed the query to:

select
	case when (cast(age as float) < 35.0) then 1 else 0 end as age_group_1,
	case when (cast(age as float) >= 35.0 and cast(age as float) < 45.0) then 1 else 0 end age_group_2,
	case when (cast(age as float) >= 45.0) then 1 else 0 end age_group_3
from data_source;

And got 1's and 0's where I expected. So there's must be something fishy going about SUM.

For the record I am using PinotDB 1.2.0 in a simple docker compose deployment.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions