Since Apache Spark 3.4 (SPARK-28330), the physical plan fully represents OFFSET:
- GlobalLimitExec and TakeOrderedAndProjectExec carry an explicit offset field.
- When both LIMIT and OFFSET are present, Spark stores limit + offset as the raw limit value.
Auron currently ignores the offset field entirely.
As a result, the query:
SELECT * FROM t LIMIT 10 OFFSET 5
returns the first 15 rows instead of rows 6–15, producing incorrect results.
We should fully support the LIMIT … OFFSET … semantics.
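
To make the expected behaviour concrete, here is a minimal Rust sketch of the semantics over an in-memory slice. It is not Auron's actual operator code: the function names `apply_limit_offset` and `apply_limit_ignoring_offset` are hypothetical, and it assumes, as described above, that the limit value received from Spark already equals limit + offset.

```rust
/// Hypothetical helper (not Auron's API): applies LIMIT/OFFSET to a row set.
/// Assumes `raw_limit` is limit + offset, and `offset` is the number of
/// leading rows to skip.
fn apply_limit_offset<T: Clone>(rows: &[T], raw_limit: usize, offset: usize) -> Vec<T> {
    // Number of rows to emit once the offset rows are skipped.
    // saturating_sub avoids underflow if offset exceeds raw_limit.
    let take = raw_limit.saturating_sub(offset);
    rows.iter().skip(offset).take(take).cloned().collect()
}

/// The behaviour reported in this issue: the offset is ignored, so the
/// first `raw_limit` rows are returned unchanged.
fn apply_limit_ignoring_offset<T: Clone>(rows: &[T], raw_limit: usize) -> Vec<T> {
    rows.iter().take(raw_limit).cloned().collect()
}
```

For `LIMIT 10 OFFSET 5` the raw limit is 15 and the offset is 5, so over rows numbered 1 to 20 the first function yields rows 6 to 15, while the second yields rows 1 to 15. A real operator would apply the same skip-then-take logic across streamed batches rather than a single slice.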