Allow custom OptimizerHints #2216
…efix

Two orthogonal improvements to optimizer hint parsing:

1. `Option<OptimizerHint>` -> `Vec<OptimizerHint>`: the old Option silently dropped all but the first hint-style comment. Vec preserves all hint comments the parser encounters, letting consumers decide which to use. This is backwards compatible: `optimizer_hint: None` becomes `optimizer_hints: vec![]`, and `optimizer_hint.unwrap()` becomes `optimizer_hints[0]`.
2. Generic prefix extraction: the `/*+...*/` pattern is an established convention. Various systems extend it with `/*prefix+...*/` where the prefix is opaque alphanumeric text before `+`. Rather than adding a new dialect flag or struct for each system, the parser now captures any `[a-zA-Z0-9]*` run before `+` as a `prefix` field. Standard hints have `prefix: ""`. No new dialect surface -- same `supports_comment_optimizer_hint()` gate.

This makes `OptimizerHint` a generic extension point: downstream consumers can define their own prefixed hint conventions and filter hints by prefix, without requiring any changes to the parser or dialect configuration.
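To illustrate the consumer side of this description, here is a minimal sketch of filtering hints by prefix. The `OptimizerHint` struct below is a local stand-in with only the two fields mentioned above (`prefix` and `text`), not the crate's actual definition, and the helper names `hints_with_prefix` and `first_standard_hint` are hypothetical.

```rust
/// Stand-in for the hint node described above (assumption: only these two
/// fields matter here; the real type in the crate may carry more).
#[derive(Debug, Clone, PartialEq)]
pub struct OptimizerHint {
    /// Alphanumeric text before `+`; empty for a standard `/*+ ... */` hint.
    pub prefix: String,
    /// Everything after the `+`.
    pub text: String,
}

/// Hypothetical helper: returns the hints whose prefix equals `wanted`,
/// preserving the order in which the parser encountered them.
pub fn hints_with_prefix<'a>(
    hints: &'a [OptimizerHint],
    wanted: &str,
) -> Vec<&'a OptimizerHint> {
    hints.iter().filter(|h| h.prefix == wanted).collect()
}

/// Hypothetical migration helper: the first standard (unprefixed) hint,
/// which is roughly what a consumer of the old `Option` field would want.
pub fn first_standard_hint(hints: &[OptimizerHint]) -> Option<&OptimizerHint> {
    hints.iter().find(|h| h.prefix.is_empty())
}
```

A consumer that only understands its own convention would call `hints_with_prefix(&select_hints, "abc")` and ignore the rest; one that only cares about standard hints can keep using an `Option` via `first_standard_hint`.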
hello @altmannmarcelo,
i quite like your generalisation if it makes the feature useful for dialects supporting multiple hints! 👍 so i'm in for the change.
however, i wish you would introduce dialect flags to guide the parser to 1) allow the prefixes, and 2) allow accepting multiple hints (the AST can nicely present that with your suggestion of Vec<OptimizerHint>.)
yes, you wrote that downstream programs can do the validation themselves. and indeed they can. but, when you start writing your 3rd sql processor based on sqlparser, it becomes tiresome to repeat those validations in all of them (or to start maintaining a separate crate for these validations.) the great thing about sqlparser is that it has the concept of "dialects" and provides a common AST for all of them, yet is able to distinguish between the dialects' "idiosyncrasies." (having said that, i'm no authority and don't have a say in how sqlparser-rs wants to evolve.)
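For concreteness, the two dialect flags requested in this comment could look roughly like the sketch below. Everything here is hypothetical: the trait `HintDialect`, the method names `supports_optimizer_hint_prefix` and `supports_multiple_optimizer_hints`, and the helper `accepts_hint` are invented for illustration and are not taken from the crate or from this PR.

```rust
/// Hypothetical dialect surface for the two flags discussed above.
/// Both methods default to `false`, i.e. the strict behaviour.
pub trait HintDialect {
    /// Whether `/*prefix+ ... */` hints are accepted in addition to `/*+ ... */`.
    fn supports_optimizer_hint_prefix(&self) -> bool {
        false
    }
    /// Whether more than one hint comment may be collected per statement.
    fn supports_multiple_optimizer_hints(&self) -> bool {
        false
    }
}

/// A dialect that keeps the defaults: one unprefixed hint at most.
pub struct StrictDialect;
impl HintDialect for StrictDialect {}

/// A dialect that opts in to both extensions.
pub struct PermissiveDialect;
impl HintDialect for PermissiveDialect {
    fn supports_optimizer_hint_prefix(&self) -> bool {
        true
    }
    fn supports_multiple_optimizer_hints(&self) -> bool {
        true
    }
}

/// Decides whether a parser should record a hint with the given prefix,
/// given how many hints it has already collected for the statement.
pub fn accepts_hint(dialect: &dyn HintDialect, prefix: &str, already_collected: usize) -> bool {
    let prefix_ok = prefix.is_empty() || dialect.supports_optimizer_hint_prefix();
    let count_ok = already_collected == 0 || dialect.supports_multiple_optimizer_hints();
    prefix_ok && count_ok
}
```

With a gate like this, the validation lives in one place instead of being repeated in every downstream processor, which is the concern raised above.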
            Some((before_plus.to_string(), text.to_string()))
        } else {
            None
        }
i think using str::split_once would make this shorter and leaner (possibly slightly more efficient), e.g. https://gist.github.com/rust-play/146d81960095525d6384f34d84ac7419
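As a rough idea of what a `str::split_once` version might look like (this is a sketch written for this thread, not the content of the linked gist), the function below takes the text between `/*` and `*/` and returns the prefix and hint text. The name `split_hint` and the choice to return borrowed slices are assumptions.

```rust
/// Splits the body of a block comment (the text between `/*` and `*/`)
/// into `(prefix, hint_text)` when it has the shape `[a-zA-Z0-9]*+...`.
/// Returns `None` when there is no `+`, or when anything other than
/// ASCII letters and digits precedes the first `+`.
pub fn split_hint(comment_body: &str) -> Option<(&str, &str)> {
    // `split_once` splits at the first `+` and drops the `+` itself.
    let (prefix, text) = comment_body.split_once('+')?;
    prefix
        .chars()
        .all(|c| c.is_ascii_alphanumeric()) // vacuously true for an empty prefix
        .then_some((prefix, text))
}
```

Returning `&str` slices leaves it to the caller to decide when to allocate with `to_string()`.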
    assert_eq!(select.optimizer_hints.len(), 2);
    assert_eq!(select.optimizer_hints[0].text, "one two three");
    assert_eq!(select.optimizer_hints[0].prefix, "");
    assert_eq!(select.optimizer_hints[1].text, "not a hint!");
well, this test was to assert that "not a hint!" is in fact "not an optimizer hint!" :)