2775: chore: Remove many references to COMET_EXPR_ALLOW_INCOMPATIBLE (#21)
base: main
Conversation
Walkthrough: This pull request introduces a new dynamic per-expression incompatibility configuration mechanism and updates the codebase to use it. A new method, `CometConf.getExprAllowIncompatConfigKey`, produces a per-expression configuration key that replaces the global `COMET_EXPR_ALLOW_INCOMPATIBLE` flag.
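As a rough sketch of the per-expression key mechanism the walkthrough describes (the key format and the nested `Cast` stand-in are assumptions for illustration; the real helper lives in `CometConf`):

```scala
object IncompatKeySketch {
  // Hypothetical sketch: derive a per-expression "allow incompatible" config key
  // from an expression class's simple name, mirroring the shape of
  // CometConf.getExprAllowIncompatConfigKey. The key format is an assumption.
  def getExprAllowIncompatConfigKey(exprClass: Class[_]): String =
    s"spark.comet.expression.${exprClass.getSimpleName}.allowIncompatible"

  // Stand-in for org.apache.spark.sql.catalyst.expressions.Cast.
  final class Cast

  def main(args: Array[String]): Unit =
    println(getExprAllowIncompatConfigKey(classOf[Cast]))
}
```

Deriving the key from the class keeps test configuration scoped to one expression instead of a global on/off switch.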
    sql(s"create table $table(c9 int, c4 int) using parquet")
    sql(s"insert into $table values(0, 0), (66, null), (null, 70), (null, null)")
    val query = s"SELECT chr(c9), chr(c4) FROM $table"
    checkSparkAnswerAndOperator(query)
Bug: Comet: Missing compatibility config breaks native test.
The COMET_EXPR_ALLOW_INCOMPATIBLE configuration was removed from the Chr with null character test, but the test still calls checkSparkAnswerAndOperator(), which verifies that the query is natively accelerated by Comet. If chr() remains incompatible with Spark, removing this config without corresponding code changes to mark the expression as compatible will cause the test to fail, as checkSparkAnswerAndOperator() expects native acceleration to succeed.
value:annoying; category:bug; feedback:The Bugbot AI reviewer is not correct! I wasn't able to find any proof about this finding and all the tests pass without any issues.
PR Review: Remove references to `COMET_EXPR_ALLOW_INCOMPATIBLE`
Review completed. 1 suggestion posted.
| `spark.comet.explain.rules` | When this setting is enabled, Comet will log all plan transformations performed in physical optimizer rules. Default: false | false |
| `spark.comet.explainFallback.enabled` | When this setting is enabled, Comet will provide logging explaining the reason(s) why a query stage cannot be executed natively. Set this to false to reduce the amount of logging. | false |
| `spark.comet.logFallbackReasons.enabled` | When this setting is enabled, Comet will log warnings for all fallback reasons. Can be overridden by environment variable `ENABLE_COMET_LOG_FALLBACK_REASONS`. | false |
| `spark.comet.logFallbackReasons.enabled` | When this setting is enabled, Comet will log warnings for all fallback reasons. Can be overridden by environment variable `ENABLE_COMET_LOG_FALLBACK_REASONS`. | true |
The docs set the default to true, but CometConf.COMET_LOG_FALLBACK_REASONS is defined with createWithEnvVarOrDefault(..., false), so the code default is false (unless the env var is set). Consider aligning the documentation with the actual default or updating the default in code.
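The mismatch stems from how an env-var-backed default resolves. As a rough sketch of that resolution order (the helper below is illustrative and only mirrors the shape of `createWithEnvVarOrDefault`; it is not Comet's implementation):

```scala
object EnvVarDefaultSketch {
  // Illustrative only: resolve a boolean config default that an environment
  // variable may override. The env map is a parameter so behavior is deterministic.
  def resolveDefault(env: Map[String, String], envVar: String, default: Boolean): Boolean =
    env.get(envVar).map(_.toBoolean).getOrElse(default)

  def main(args: Array[String]): Unit = {
    // Env var unset: the code default (false) wins, which is the reviewer's point.
    println(resolveDefault(Map.empty, "ENABLE_COMET_LOG_FALLBACK_REASONS", default = false))
    // Env var set: it overrides the compiled-in default.
    println(resolveDefault(
      Map("ENABLE_COMET_LOG_FALLBACK_REASONS" -> "true"),
      "ENABLE_COMET_LOG_FALLBACK_REASONS",
      default = false))
  }
}
```

Because the env var only overrides at runtime, the documented default should match the compiled-in value (`false`), not the overridden one.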
value:useful; category:documentation; feedback:The Augment AI reviewer is correct. The default value in the documentation has been changed but the actual value in the implementation is still the old one. Prevents inconsistency in the logic and its documentation
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
spark/src/test/scala/org/apache/comet/CometCastSuite.scala (1)
660-669: Replace global COMET_EXPR_ALLOW_INCOMPATIBLE with Cast-scoped key in partial-support tests

These three test methods still use the global configuration flag. Update them to use the Cast-specific key:
    test("cast StringType to FloatType (partial support)") {
      withSQLConf(
    -   CometConf.COMET_EXPR_ALLOW_INCOMPATIBLE.key -> "true",
    +   CometConf.getExprAllowIncompatConfigKey(classOf[Cast]) -> "true",
        SQLConf.ANSI_ENABLED.key -> "false") {

Apply the same change at lines 678 and 695.
Also applies to: 676-685, 693-703
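For context, the `withSQLConf` pattern used in these tests scopes an override to a block and restores the previous value afterwards. A simplified, self-contained stand-in (the helper and the key string are assumptions for illustration, not Spark's or Comet's actual code):

```scala
object WithConfSketch {
  // Simplified stand-in for Spark's withSQLConf test helper: set the given keys,
  // run the body, then restore the previous values even if the body throws.
  private val conf = scala.collection.mutable.Map[String, String]()

  def current(key: String): Option[String] = conf.get(key)

  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val saved = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally saved.foreach {
      case (k, Some(v)) => conf(k) = v
      case (k, None)    => conf.remove(k)
    }
  }

  def main(args: Array[String]): Unit = {
    // Key name is an assumption for illustration; inside the block the override
    // is visible, outside the block it is restored (here: absent again).
    val key = "spark.comet.expression.Cast.allowIncompatible"
    withConf(key -> "true") { println(current(key)) } // Some(true)
    println(current(key)) // None
  }
}
```

Scoping the override this way is what makes the Cast-specific key safe to flip per test without leaking into later tests.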
🧹 Nitpick comments (3)
docs/source/user-guide/latest/configs.md (1)
84-84: Grammar: Add a subject to the second sentence in the description.

The description text reads "Can be overridden by environment variable..." but lacks a subject pronoun. Consider revising to "This can be overridden..." or "It can be overridden..." for grammatical completeness.
    -| `spark.comet.logFallbackReasons.enabled` | When this setting is enabled, Comet will log warnings for all fallback reasons. Can be overridden by environment variable `ENABLE_COMET_LOG_FALLBACK_REASONS`. | true |
    +| `spark.comet.logFallbackReasons.enabled` | When this setting is enabled, Comet will log warnings for all fallback reasons. This can be overridden by environment variable `ENABLE_COMET_LOG_FALLBACK_REASONS`. | true |

spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala (2)
188-193: Verify if these changes are intentional.

These lines are marked as changed but don't appear to relate to the PR objective of replacing `COMET_EXPR_ALLOW_INCOMPATIBLE` references. If these are formatting-only changes, consider separating them into a dedicated formatting commit for clearer history.
468-468: Minor incidental formatting changes noted.

These lines show formatting adjustments to `withSQLConf` blocks that don't relate to the PR's main objective. While not problematic, consider using a consistent auto-formatter across the codebase to minimize such incidental changes in focused PRs.

Also applies to: 1430-1430, 1981-1981, 2085-2085, 2781-2781
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- docs/source/user-guide/latest/configs.md (1 hunks)
- spark/src/test/scala/org/apache/comet/CometArrayExpressionSuite.scala (16 hunks)
- spark/src/test/scala/org/apache/comet/CometCastSuite.scala (5 hunks)
- spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala (12 hunks)
- spark/src/test/scala/org/apache/comet/CometStringExpressionSuite.scala (1 hunks)
- spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala (1 hunks)
- spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala (2 hunks)
- spark/src/test/scala/org/apache/comet/exec/CometWindowExecSuite.scala (0 hunks)
💤 Files with no reviewable changes (1)
- spark/src/test/scala/org/apache/comet/exec/CometWindowExecSuite.scala
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: martin-augment
Repo: martin-augment/datafusion-comet PR: 17
File: docs/source/contributor-guide/adding_a_new_operator.md:349-354
Timestamp: 2025-11-11T15:01:48.203Z
Learning: For Apache DataFusion Comet debugging documentation, the correct configuration keys are `spark.comet.explain.format=verbose` for verbose explain plans and `spark.comet.logFallbackReasons.enabled=true` for logging fallback reasons (not `spark.comet.explain.verbose` or `spark.comet.logFallbackReasons` without `.enabled`).
📚 Learning: 2025-11-11T15:01:48.203Z
Learnt from: martin-augment
Repo: martin-augment/datafusion-comet PR: 17
File: docs/source/contributor-guide/adding_a_new_operator.md:349-354
Timestamp: 2025-11-11T15:01:48.203Z
Learning: For Apache DataFusion Comet debugging documentation, the correct configuration keys are `spark.comet.explain.format=verbose` for verbose explain plans and `spark.comet.logFallbackReasons.enabled=true` for logging fallback reasons (not `spark.comet.explain.verbose` or `spark.comet.logFallbackReasons` without `.enabled`).
Applied to files:
- docs/source/user-guide/latest/configs.md
- spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala
- spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala
📚 Learning: 2025-11-11T20:44:05.014Z
Learnt from: martin-augment
Repo: martin-augment/datafusion-comet PR: 19
File: native/spark-expr/src/array_funcs/array_insert.rs:221-238
Timestamp: 2025-11-11T20:44:05.014Z
Learning: In Rust code using Apache Arrow arrays, always check `is_null(index)` before calling `value(index)` on `PrimitiveArray` types (such as `Int32Array`, `Int64Array`, etc.), because `value()` does not check for nulls and returns arbitrary values for null slots. This applies to functions in `native/spark-expr/src` that process Arrow arrays.
Applied to files:
spark/src/test/scala/org/apache/comet/CometArrayExpressionSuite.scala
🧬 Code graph analysis (5)
spark/src/test/scala/org/apache/comet/CometStringExpressionSuite.scala (2)
spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala (3)
- checkSparkAnswerAndOperator (189-193)
- checkSparkAnswerAndOperator (199-206)
- checkSparkAnswerAndOperator (214-223)

spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala (1)
- test (48-55)
spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala (1)
common/src/main/scala/org/apache/comet/CometConf.scala (1)
- CometConf (46-767)
spark/src/test/scala/org/apache/comet/CometCastSuite.scala (1)
common/src/main/scala/org/apache/comet/CometConf.scala (3)
- CometConf (46-767)
- getExprAllowIncompatConfigKey (744-746)
- getExprAllowIncompatConfigKey (748-750)
spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala (1)
common/src/main/scala/org/apache/comet/CometConf.scala (3)
- CometConf (46-767)
- getExprAllowIncompatConfigKey (744-746)
- getExprAllowIncompatConfigKey (748-750)
spark/src/test/scala/org/apache/comet/CometArrayExpressionSuite.scala (3)
spark/src/test/scala/org/apache/spark/sql/CometTestBase.scala (8)
- spark (429-429)
- checkSparkAnswerAndFallbackReasons (275-279)
- checkSparkAnswerAndFallbackReasons (282-301)
- makeParquetFileAllPrimitiveTypes (583-585)
- makeParquetFileAllPrimitiveTypes (650-739)
- checkSparkAnswerAndOperator (189-193)
- checkSparkAnswerAndOperator (199-206)
- checkSparkAnswerAndOperator (214-223)

spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala (1)
- isSpark40Plus (228-230)

common/src/main/scala/org/apache/comet/CometConf.scala (3)
- CometConf (46-767)
- getExprAllowIncompatConfigKey (744-746)
- getExprAllowIncompatConfigKey (748-750)
🪛 LanguageTool
docs/source/user-guide/latest/configs.md
[style] ~84-~84: To form a complete sentence, be sure to include a subject.
Context: ... log warnings for all fallback reasons. Can be overridden by environment variable `...
(MISSING_IT_THERE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: claude-review
🔇 Additional comments (31)
spark/src/test/scala/org/apache/comet/CometStringExpressionSuite.scala (2)
175-181: Chr test now runs without global incompat flag — LGTM

Behavior remains explicit and self-contained.

186-191: Chr(null/0) compatibility coverage — LGTM

Covers Spark’s chr(0)/null semantics while asserting native execution.
spark/src/test/scala/org/apache/comet/CometCastSuite.scala (5)
436-439: Switch to per‑expression key for Float→Decimal — LGTM

Uses CometConf.getExprAllowIncompatConfigKey(classOf[Cast]) as intended.
496-499: Switch to per‑expression key for Double→Decimal — LGTM
784-791: Timestamp cast (ignored test) toggles per‑expr key — LGTM
835-839: Timestamp subset test uses per‑expr key with timezone — LGTM
1268-1274: ANSI branch: per‑expr incompat key enabled — LGTM

Keeps exception parity checks intact.
spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala (2)
531-531: Cleanup of extraneous expr flag (shuffle mode retained) — LGTM
1440-1484: Typed-literals partition spec with timezone only — LGTM

Removing the global incompat flag here reduces blast radius; test intent preserved.
spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala (1)
931-943: Decimal(38,2) overflow test no longer needs global flag — LGTM

Inputs exercise boundary/overflow; asserting two native aggregates matches the expected plan shape.
spark/src/test/scala/org/apache/comet/CometArrayExpressionSuite.scala (16)
26-26: Add explicit array expression imports — LGTM

Clarifies per‑expr key mapping by class.
139-169: array_append: dynamic key (ArrayInsert vs ArrayAppend by Spark version) — LGTM
173-199: array_prepend uses ArrayInsert key (Spark 3.5+ mapping) — LGTM
202-222: ArrayInsert happy‑path coverage under per‑expr key — LGTM
227-241: ArrayInsert unsupported args: assert fallback reasons — LGTM

Note: reason strings can drift across versions; keep substring checks stable.
357-383: array_distinct under per‑expr key — LGTM
386-404: array_union under per‑expr key — LGTM
457-474: array_intersect under per‑expr key — LGTM
477-496: array_join under per‑expr key — LGTM
499-518: arrays_overlap under per‑expr key — LGTM
523-539: array_compact: broadened coverage; Spark 4 guard — LGTM
542-561: array_except basics under per‑expr key — LGTM
577-595: array_except all‑types path (native reader) uses per‑expr key — LGTM
615-616: Convert-from-Parquet path toggled — LGTM

Keeps conversion tests isolated from native scan.
634-659: array_repeat: per‑expr key + explain fallback enabled — LGTM
728-740: array literals: enable explain fallback for clearer diagnostics — LGTM

spark/src/test/scala/org/apache/comet/CometExpressionSuite.scala (5)
31-31: LGTM! Import addition supports per-expression configuration.

The `FromUnixTime` import is correctly added to support the dynamic configuration key usage at line 1749.
618-618: LGTM! Cast configuration correctly uses per-expression keys.

The replacement of the global `COMET_EXPR_ALLOW_INCOMPATIBLE.key` with `CometConf.getExprAllowIncompatConfigKey(classOf[Cast])` correctly narrows the incompatibility configuration to Cast-specific operations in these timestamp casting tests.

Also applies to: 640-640, 662-662
704-704: LGTM! TruncDate configuration correctly uses per-expression key.

The use of `classOf[TruncDate]` appropriately targets the date truncation functionality being tested.
752-752: LGTM! Timestamp truncation tests correctly configure relevant expressions.

Line 752 appropriately enables Cast for timestamp_ntz operations, while lines 794-795 correctly configure both Cast and TruncTimestamp for a test that uses both casting and timestamp truncation operations.
Also applies to: 794-795
1749-1749: LGTM! FromUnixTime configuration correctly uses per-expression key.

The use of `classOf[FromUnixTime]` appropriately targets the unix timestamp conversion functionality being tested, consistent with the import addition at line 31.
value:useful; category:documentation; feedback:The Claude AI reviewer is correct. The default value in the documentation has been changed but the actual value in the implementation is still the old one. Prevents inconsistency between the logic and its documentation.

value:good-to-have; category:documentation; feedback:The CodeRabbit AI reviewer is correct that the grammar in the documentation could be improved for this config key.

value:good-to-have; category:bug; feedback:The CodeRabbit AI reviewer is correct that the config key could be narrowed to a more specific one that is responsible only for the casting-related operations.