Skip to content

[multistage] Subqueries return different results depending on where clause #12949

@RalfJL

Description

@RalfJL

Currently I am investigating if Pinot can help us to calculate inbytes and outbytes from Radius accounting data.
The setup is:
sysbench (produces artificial accounting data) -> freeradius -> kafka -> pinot
The randomness of the artificial data is low so we end up with several "sessions" having the same Acct-Unique-Session-ID.
e.g. query:

select a.acctuniquesessionid, a.acctstatustype, a.eventtime, a.eventtimestamp from radius_json a
where a.acctuniquesessionid = 'da0f86138ec36b2364889f048bf2ac82'
order by a.eventtime

returns:

acctuniquesessionid	acctstatustype	eventtime	eventtimestamp
da0f86138ec36b2364889f048bf2ac82	Start	1712050245468	Apr 2 2024 09:30:45 UTC
da0f86138ec36b2364889f048bf2ac82	Stop	1712050293427	Apr 2 2024 09:31:33 UTC
da0f86138ec36b2364889f048bf2ac82	Start	1712759564017	Apr 10 2024 14:32:43 UTC
da0f86138ec36b2364889f048bf2ac82	Interim-Update	1712759589307	Apr 10 2024 14:33:09 UTC
da0f86138ec36b2364889f048bf2ac82	Interim-Update	1712759612449	Apr 10 2024 14:33:32 UTC
da0f86138ec36b2364889f048bf2ac82	Interim-Update	1712759616370	Apr 10 2024 14:33:36 UTC
da0f86138ec36b2364889f048bf2ac82	Stop	1712759668126	Apr 10 2024 14:34:27 UTC

Obviously there are two different sessions with the same acctuniquesessionid in the data.
In real life we can not guarantee that Session ID's are always unique, so this might also happen in real life data.

The solution is to filter out all 'Start' - 'Stop' combinations where there is a 'Start' or a 'Stop' in between.
And here comes the problem:
query:

select a.acctuniquesessionid, (a.eventtime - b.eventtime)/1000 as Zeit, a.acctstatustype, b.acctstatustype, a.eventtime, b.eventtime 
 ,( select count(c.eventtime) from radius_json c where b.acctuniquesessionid = c.acctuniquesessionid 
				 and a.acctuniquesessionid = c.acctuniquesessionid
				and a.eventtime > 0 and b.eventtime > 0
				and (c.acctstatustype = 'Stop' or c.acctstatustype = 'Start' ) 
 	 			and c.eventtime between b.eventtime+1 and a.eventtime-1
				 ) 

from radius_json a
join radius_json b
on b.acctuniquesessionid = a.acctuniquesessionid 
   and a.acctstatustype = 'Stop' and b.acctstatustype = 'Start' -- and a._3gppimsi <> 'null'
   and a.eventtime > b.eventtime
where  
	not exists ( select 1 from radius_json c where b.acctuniquesessionid = c.acctuniquesessionid and a.acctuniquesessionid = c.acctuniquesessionid
				and a.eventtime > 0 and b.eventtime > 0
				and (c.acctstatustype = 'Stop' or c.acctstatustype = 'Start' ) 
 	 			and c.eventtime between b.eventtime+1 and a.eventtime-1
				 ) 
-- and  			a.acctuniquesessionid = 'fac07ae7853810946d87e868e463af2c'
-- and 				a.acctuniquesessionid = 'ac6baa744130522c1eb1eec161114d1b'
and		   a.acctuniquesessionid = 'da0f86138ec36b2364889f048bf2ac82'
order by Zeit desc

Returns:

acctuniquesessionid	Zeit	acctstatustype	acctstatustype	eventtime	eventtime	EXPR$4
da0f86138ec36b2364889f048bf2ac82	104	Stop	Start	1712759668126	1712759564017	0
da0f86138ec36b2364889f048bf2ac82	47	Stop	Start	1712050293427	1712050245468	0

The two sessions as expected.

Removing the filter a.acctuniquesessionid = 'da0f86138ec36b2364889f048bf2ac82' shows a very different result
query:

select a.acctuniquesessionid, (a.eventtime - b.eventtime)/1000 as Zeit, a.acctstatustype, b.acctstatustype, a.eventtime, b.eventtime 
 ,( select count(c.eventtime) from radius_json c where b.acctuniquesessionid = c.acctuniquesessionid 
				 and a.acctuniquesessionid = c.acctuniquesessionid
				and a.eventtime > 0 and b.eventtime > 0
				and (c.acctstatustype = 'Stop' or c.acctstatustype = 'Start' ) 
 	 			and c.eventtime between b.eventtime+1 and a.eventtime-1
				 ) 

from radius_json a
join radius_json b
on b.acctuniquesessionid = a.acctuniquesessionid 
   and a.acctstatustype = 'Stop' and b.acctstatustype = 'Start' -- and a._3gppimsi <> 'null'
   and a.eventtime > b.eventtime
where  
	not exists ( select 1 from radius_json c where b.acctuniquesessionid = c.acctuniquesessionid and a.acctuniquesessionid = c.acctuniquesessionid
				and a.eventtime > 0 and b.eventtime > 0
				and (c.acctstatustype = 'Stop' or c.acctstatustype = 'Start' ) 
 	 			and c.eventtime between b.eventtime+1 and a.eventtime-1
				 ) 
-- and  			a.acctuniquesessionid = 'fac07ae7853810946d87e868e463af2c'
-- and 				a.acctuniquesessionid = 'ac6baa744130522c1eb1eec161114d1b'
-- and		   a.acctuniquesessionid = 'da0f86138ec36b2364889f048bf2ac82'
order by Zeit desc

result (shortened by thousands of other sessions):

da0f86138ec36b2364889f048bf2ac82	709422	Stop	Start	1712759668126	1712050245468	0

So here obviously both subqueries return wrong results. There is a 'Start' and a 'Stop' between timestamp 1712759668126 and 1712050245468 but it is not detected anymore when the filter for the acctuniquessionid is removed

Am I missing here something or am I hitting a bug in multistage engine?

I am using Pinot 1.1.0 as docker compose image (single instance of all server processes)
The query plan for both queries is the same beside the Logical Filter
first query:
LogicalFilter(condition=[AND(=($14, _UTF-8'da0f86138ec36b2364889f048bf2ac82'), =($12, _UTF-8'Stop'))])
second query:
LogicalFilter(condition=[=($12, _UTF-8'Stop')])

PLEASE NOTE: Some filters are redundant like "b.acctuniquesessionid = c.acctuniquesessionid and a.acctuniquesessionid = c.acctuniquesessionid" and are a test if the result changes. But it doesn't

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions