
Commit 187f3c1

beliefer authored and gengliangwang committed
[SPARK-28083][SQL] Support LIKE ... ESCAPE syntax
## What changes were proposed in this pull request?

The `LIKE predicate ... ESCAPE clause` syntax is ANSI SQL. For example:

```
select 'abcSpark_13sd' LIKE '%Spark\\_%'; //true
select 'abcSpark_13sd' LIKE '%Spark/_%'; //false
select 'abcSpark_13sd' LIKE '%Spark"_%'; //false
select 'abcSpark_13sd' LIKE '%Spark/_%' ESCAPE '/'; //true
select 'abcSpark_13sd' LIKE '%Spark"_%' ESCAPE '"'; //true
select 'abcSpark%13sd' LIKE '%Spark\\%%'; //true
select 'abcSpark%13sd' LIKE '%Spark/%%'; //false
select 'abcSpark%13sd' LIKE '%Spark"%%'; //false
select 'abcSpark%13sd' LIKE '%Spark/%%' ESCAPE '/'; //true
select 'abcSpark%13sd' LIKE '%Spark"%%' ESCAPE '"'; //true
select 'abcSpark\\13sd' LIKE '%Spark\\\\_%'; //true
select 'abcSpark/13sd' LIKE '%Spark//_%'; //false
select 'abcSpark"13sd' LIKE '%Spark""_%'; //false
select 'abcSpark/13sd' LIKE '%Spark//_%' ESCAPE '/'; //true
select 'abcSpark"13sd' LIKE '%Spark""_%' ESCAPE '"'; //true
```

But Spark SQL only supports the bare `LIKE predicate`. Note: if the input string or the pattern string is null, the result is null too.

Several mainstream databases support this syntax:

**PostgreSQL:** https://www.postgresql.org/docs/11/functions-matching.html
**Vertica:** https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Predicates/LIKE-predicate.htm?zoom_highlight=like%20escape
**MySQL:** https://dev.mysql.com/doc/refman/5.6/en/string-comparison-functions.html
**Oracle:**
https://docs.oracle.com/en/database/oracle/oracle-database/19/jjdbc/JDBC-reference-information.html#GUID-5D371A5B-D7F6-42EB-8C0D-D317F3C53708
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Pattern-matching-Conditions.html#GUID-0779657B-06A8-441F-90C5-044B47862A0A

## How was this patch tested?

Existing UTs and new UTs.
This PR was merged into my production environment, which ran the SQL above:

```
spark-sql> select 'abcSpark_13sd' LIKE '%Spark\\_%';
true
Time taken: 0.119 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark/_%';
false
Time taken: 0.103 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark"_%';
false
Time taken: 0.096 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark/_%' ESCAPE '/';
true
Time taken: 0.096 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark"_%' ESCAPE '"';
true
Time taken: 0.092 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark\\%%';
true
Time taken: 0.109 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark/%%';
false
Time taken: 0.1 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark"%%';
false
Time taken: 0.081 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark/%%' ESCAPE '/';
true
Time taken: 0.095 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark"%%' ESCAPE '"';
true
Time taken: 0.113 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark\\13sd' LIKE '%Spark\\\\_%';
true
Time taken: 0.078 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark/13sd' LIKE '%Spark//_%';
false
Time taken: 0.067 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark"13sd' LIKE '%Spark""_%';
false
Time taken: 0.084 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark/13sd' LIKE '%Spark//_%' ESCAPE '/';
true
Time taken: 0.091 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark"13sd' LIKE '%Spark""_%' ESCAPE '"';
true
Time taken: 0.091 seconds, Fetched 1 row(s)
```

I created a table with the following schema:

```
spark-sql> desc formatted gja_test;
key                     string  NULL
value                   string  NULL
other                   string  NULL

# Detailed Table Information
Database                test
Table                   gja_test
Owner                   test
Created Time            Wed Apr 10 11:06:15 CST 2019
Last Access             Thu Jan 01 08:00:00 CST 1970
Created By              Spark 2.4.1-SNAPSHOT
Type                    MANAGED
Provider                hive
Table Properties        [transient_lastDdlTime=1563443838]
Statistics              26 bytes
Location                hdfs://namenode.xxx:9000/home/test/hive/warehouse/test.db/gja_test
Serde Library           org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat             org.apache.hadoop.mapred.TextInputFormat
OutputFormat            org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Storage Properties      [field.delim= , serialization.format= ]
Partition Provider      Catalog
Time taken: 0.642 seconds, Fetched 21 row(s)
```

Table `gja_test` contains three rows of data:

```
spark-sql> select * from gja_test;
a       A       ao
b       B       bo
"__     """__   "
Time taken: 0.665 seconds, Fetched 3 row(s)
```

Finally, I tested this feature:

```
spark-sql> select * from gja_test where key like value escape '"';
"__     """__   "
Time taken: 0.687 seconds, Fetched 1 row(s)
```

Closes #25001 from beliefer/ansi-sql-like.

Lead-authored-by: gengjiaan <gengjiaan@360.cn>
Co-authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
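The truth table above can be reproduced outside Spark with a small sketch: a LIKE pattern translates to a Java regular expression in which `_` becomes `.`, `%` becomes `.*`, and the escape character followed by `_`, `%`, or itself yields a quoted literal. Everything below, including the names `toRegex` and `likeWithEscape`, is an illustrative assumption, not the Spark implementation:

```java
import java.util.regex.Pattern;

public class LikeEscapeSketch {
    // Translate a SQL LIKE pattern to a Java regex, honoring escapeChar.
    static String toRegex(String pattern, char escapeChar) {
        StringBuilder out = new StringBuilder("(?s)"); // DOTALL so '_' and '%' cross newlines
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == escapeChar && i + 1 < pattern.length()) {
                char next = pattern.charAt(++i);
                if (next == '_' || next == '%' || next == escapeChar) {
                    out.append(Pattern.quote(String.valueOf(next))); // escaped wildcard is literal
                } else {
                    throw new IllegalArgumentException(
                        "the escape character is not allowed to precede '" + next + "'");
                }
            } else if (c == escapeChar) {
                throw new IllegalArgumentException(
                    "it is not allowed to end with the escape character");
            } else if (c == '_') {
                out.append('.');
            } else if (c == '%') {
                out.append(".*");
            } else {
                out.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return out.toString();
    }

    static boolean likeWithEscape(String input, String pattern, char escapeChar) {
        return Pattern.compile(toRegex(pattern, escapeChar)).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Mirrors rows from the description: '/' escapes '_' so it matches literally.
        System.out.println(likeWithEscape("abcSpark_13sd", "%Spark/_%", '/'));  // true
        System.out.println(likeWithEscape("abcSpark%13sd", "%Spark\"%%", '"')); // true
        // With the default escape '\', the '/' is just a literal character.
        System.out.println(likeWithEscape("abcSpark_13sd", "%Spark/_%", '\\')); // false
    }
}
```

The sketch also makes the null note concrete only for non-null inputs; null propagation is handled by Spark's expression evaluation, not the regex.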
1 parent b86d4bb commit 187f3c1

File tree

12 files changed: +167 -28 lines

docs/sql-keywords.md (1 addition & 0 deletions)

```diff
@@ -104,6 +104,7 @@ Below is a list of all the keywords in Spark SQL.
 <tr><td>DROP</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
 <tr><td>ELSE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
 <tr><td>END</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
+<tr><td>ESCAPE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
 <tr><td>ESCAPED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
 <tr><td>EXCEPT</td><td>reserved</td><td>strict-non-reserved</td><td>reserved</td></tr>
 <tr><td>EXCHANGE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
```

sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 (4 additions & 1 deletion)

```diff
@@ -724,7 +724,8 @@ predicate
     : NOT? kind=BETWEEN lower=valueExpression AND upper=valueExpression
     | NOT? kind=IN '(' expression (',' expression)* ')'
     | NOT? kind=IN '(' query ')'
-    | NOT? kind=(RLIKE | LIKE) pattern=valueExpression
+    | NOT? kind=RLIKE pattern=valueExpression
+    | NOT? kind=LIKE pattern=valueExpression (ESCAPE escapeChar=STRING)?
     | IS NOT? kind=NULL
     | IS NOT? kind=(TRUE | FALSE | UNKNOWN)
     | IS NOT? kind=DISTINCT FROM right=valueExpression
@@ -1265,6 +1266,7 @@ nonReserved
     | DROP
     | ELSE
     | END
+    | ESCAPE
     | ESCAPED
     | EXCHANGE
     | EXISTS
@@ -1525,6 +1527,7 @@ DISTRIBUTE: 'DISTRIBUTE';
 DROP: 'DROP';
 ELSE: 'ELSE';
 END: 'END';
+ESCAPE: 'ESCAPE';
 ESCAPED: 'ESCAPED';
 EXCEPT: 'EXCEPT';
 EXCHANGE: 'EXCHANGE';
```

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala (2 additions & 1 deletion)

```diff
@@ -98,7 +98,8 @@ package object dsl {
       case _ => In(expr, list)
     }
 
-    def like(other: Expression): Expression = Like(expr, other)
+    def like(other: Expression, escapeChar: Char = '\\'): Expression =
+      Like(expr, other, escapeChar)
     def rlike(other: Expression): Expression = RLike(expr, other)
     def contains(other: Expression): Expression = Contains(expr, other)
     def startsWith(other: Expression): Expression = StartsWith(expr, other)
```

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala (23 additions & 10 deletions)

```diff
@@ -70,8 +70,8 @@ abstract class StringRegexExpression extends BinaryExpression
  * Simple RegEx pattern matching function
  */
 @ExpressionDescription(
-  usage = "str _FUNC_ pattern - Returns true if str matches pattern, " +
-    "null if any arguments are null, false otherwise.",
+  usage = "str _FUNC_ pattern[ ESCAPE escape] - Returns true if str matches `pattern` with " +
+    "`escape`, null if any arguments are null, false otherwise.",
   arguments = """
     Arguments:
       * str - a string expression
@@ -83,16 +83,15 @@ abstract class StringRegexExpression extends BinaryExpression
           % matches zero or more characters in the input (similar to .* in posix regular
           expressions)
 
-          The escape character is '\'. If an escape character precedes a special symbol or another
-          escape character, the following character is matched literally. It is invalid to escape
-          any other character.
-
           Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order
           to match "\abc", the pattern should be "\\abc".
 
           When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it fallbacks
           to Spark 1.6 behavior regarding string literal parsing. For example, if the config is
           enabled, the pattern to match "\abc" should be "\abc".
+      * escape - an character added since Spark 3.0. The default escape character is the '\'.
+          If an escape character precedes a special symbol or another escape character, the
+          following character is matched literally. It is invalid to escape any other character.
   """,
   examples = """
     Examples:
@@ -104,19 +103,25 @@ abstract class StringRegexExpression extends BinaryExpression
       spark.sql.parser.escapedStringLiterals	false
       > SELECT '%SystemDrive%\\Users\\John' _FUNC_ '\%SystemDrive\%\\\\Users%';
       true
+      > SELECT '%SystemDrive%/Users/John' _FUNC_ '/%SystemDrive/%//Users%' ESCAPE '/';
+      true
   """,
   note = """
     Use RLIKE to match with standard regular expressions.
   """,
   since = "1.0.0")
 // scalastyle:on line.contains.tab
-case class Like(left: Expression, right: Expression) extends StringRegexExpression {
+case class Like(left: Expression, right: Expression, escapeChar: Char = '\\')
+  extends StringRegexExpression {
 
-  override def escape(v: String): String = StringUtils.escapeLikeRegex(v)
+  override def escape(v: String): String = StringUtils.escapeLikeRegex(v, escapeChar)
 
   override def matches(regex: Pattern, str: String): Boolean = regex.matcher(str).matches()
 
-  override def toString: String = s"$left LIKE $right"
+  override def toString: String = escapeChar match {
+    case '\\' => s"$left LIKE $right"
+    case c => s"$left LIKE $right ESCAPE '$c'"
+  }
 
   override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
     val patternClass = classOf[Pattern].getName
@@ -149,10 +154,18 @@ case class Like(left: Expression, right: Expression) extends StringRegexExpressi
     } else {
       val pattern = ctx.freshName("pattern")
       val rightStr = ctx.freshName("rightStr")
+      // We need double escape to avoid org.codehaus.commons.compiler.CompileException.
+      // '\\' will cause exception 'Single quote must be backslash-escaped in character literal'.
+      // '\"' will cause exception 'Line break in literal not allowed'.
+      val newEscapeChar = if (escapeChar == '\"' || escapeChar == '\\') {
+        s"""\\\\\\$escapeChar"""
+      } else {
+        escapeChar
+      }
       nullSafeCodeGen(ctx, ev, (eval1, eval2) => {
         s"""
           String $rightStr = $eval2.toString();
-          $patternClass $pattern = $patternClass.compile($escapeFunc($rightStr));
+          $patternClass $pattern = $patternClass.compile($escapeFunc($rightStr, '$newEscapeChar'));
          ${ev.value} = $pattern.matcher($eval1.toString()).matches();
         """
       })
```
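The `newEscapeChar` handling above exists because the escape character is spliced into generated Java source inside a character literal; `\` and `"` must themselves be escaped in that source text or the generated class fails to compile. A standalone sketch of the underlying idea (a hypothetical helper, not Spark's codegen):

```java
public class CharLiteralSketch {
    // Render a char as the source text of a Java character literal.
    // '\' and '"' need a backslash inside the literal; other chars appear as-is.
    static String toCharLiteral(char c) {
        if (c == '\\' || c == '\"') {
            return "'\\" + c + "'";
        }
        return "'" + c + "'";
    }

    public static void main(String[] args) {
        System.out.println(toCharLiteral('/'));   // '/'
        System.out.println(toCharLiteral('\\'));  // '\\'
        System.out.println(toCharLiteral('\"'));  // '\"'
    }
}
```

In Spark's case the escaping is doubled again because the literal passes through a Scala interpolated string before Janino compiles it; the sketch shows only the final char-literal requirement.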

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala (2 additions & 3 deletions)

```diff
@@ -484,7 +484,7 @@ object LikeSimplification extends Rule[LogicalPlan] {
   private val equalTo = "([^_%]*)".r
 
   def apply(plan: LogicalPlan): LogicalPlan = plan transformAllExpressions {
-    case Like(input, Literal(pattern, StringType)) =>
+    case Like(input, Literal(pattern, StringType), escapeChar) =>
       if (pattern == null) {
         // If pattern is null, return null value directly, since "col like null" == null.
         Literal(null, BooleanType)
@@ -503,8 +503,7 @@ object LikeSimplification extends Rule[LogicalPlan] {
           Contains(input, Literal(infix))
         case equalTo(str) =>
           EqualTo(input, Literal(str))
-        case _ =>
-          Like(input, Literal.create(pattern, StringType))
+        case _ => Like(input, Literal.create(pattern, StringType), escapeChar)
       }
   }
 }
```
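For context, `LikeSimplification` rewrites common wildcard shapes into cheaper string predicates (`StartsWith`, `EndsWith`, `Contains`, `EqualTo`) and only falls back to `Like`, which now carries `escapeChar` through. A rough sketch of the shape classification; only the `equalTo` regex appears verbatim in the diff, the other three are illustrative assumptions:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LikeSimplificationSketch {
    // Shapes over patterns whose only wildcard is a leading/trailing '%'.
    static final Pattern STARTS_WITH = Pattern.compile("([^_%]+)%");
    static final Pattern ENDS_WITH = Pattern.compile("%([^_%]+)");
    static final Pattern CONTAINS = Pattern.compile("%([^_%]+)%");
    static final Pattern EQUAL_TO = Pattern.compile("([^_%]*)"); // as in the diff

    static String classify(String likePattern) {
        Matcher m;
        if ((m = STARTS_WITH.matcher(likePattern)).matches()) return "StartsWith(" + m.group(1) + ")";
        if ((m = ENDS_WITH.matcher(likePattern)).matches()) return "EndsWith(" + m.group(1) + ")";
        if ((m = CONTAINS.matcher(likePattern)).matches()) return "Contains(" + m.group(1) + ")";
        if ((m = EQUAL_TO.matcher(likePattern)).matches()) return "EqualTo(" + m.group(1) + ")";
        return "Like(...)"; // fallback keeps the original Like, now carrying escapeChar
    }

    public static void main(String[] args) {
        System.out.println(classify("abc%"));  // StartsWith(abc)
        System.out.println(classify("%abc"));  // EndsWith(abc)
        System.out.println(classify("%abc%")); // Contains(abc)
        System.out.println(classify("abc"));   // EqualTo(abc)
        System.out.println(classify("a_b%"));  // Like(...)
    }
}
```

Note the simplified shapes contain no escape sequences, so threading `escapeChar` through only the fallback case preserves behavior for these patterns.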

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala (8 additions & 1 deletion)

```diff
@@ -1386,7 +1386,14 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
       case SqlBaseParser.IN =>
         invertIfNotDefined(In(e, ctx.expression.asScala.map(expression)))
       case SqlBaseParser.LIKE =>
-        invertIfNotDefined(Like(e, expression(ctx.pattern)))
+        val escapeChar = Option(ctx.escapeChar).map(string).map { str =>
+          if (str.length != 1) {
+            throw new ParseException("Invalid escape string." +
+              "Escape string must contains only one character.", ctx)
+          }
+          str.charAt(0)
+        }.getOrElse('\\')
+        invertIfNotDefined(Like(e, expression(ctx.pattern), escapeChar))
       case SqlBaseParser.RLIKE =>
         invertIfNotDefined(RLike(e, expression(ctx.pattern)))
       case SqlBaseParser.NULL if ctx.NOT != null =>
```
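The parser path above rejects any ESCAPE operand that is not exactly one character, at parse time rather than at evaluation. A standalone sketch of that check (hypothetical helper and exception type, not Spark's `AstBuilder`):

```java
public class EscapeCharSketch {
    // Return the single escape character, or the default '\' when none is given.
    static char resolveEscapeChar(String escapeStr) {
        if (escapeStr == null) {
            return '\\'; // default, matching the pre-existing LIKE behavior
        }
        if (escapeStr.length() != 1) {
            // Message text mirrors the ParseException thrown in the diff above.
            throw new IllegalArgumentException(
                "Invalid escape string. Escape string must contains only one character.");
        }
        return escapeStr.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(resolveEscapeChar("/"));
        System.out.println(resolveEscapeChar(null));
    }
}
```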

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala (6 additions & 4 deletions)

```diff
@@ -39,9 +39,10 @@ object StringUtils extends Logging {
    * throw an [[AnalysisException]].
    *
    * @param pattern the SQL pattern to convert
+   * @param escapeStr the escape string contains one character.
    * @return the equivalent Java regular expression of the pattern
    */
-  def escapeLikeRegex(pattern: String): String = {
+  def escapeLikeRegex(pattern: String, escapeChar: Char): String = {
     val in = pattern.toIterator
     val out = new StringBuilder()
 
@@ -50,13 +51,14 @@ object StringUtils extends Logging {
 
     while (in.hasNext) {
       in.next match {
-        case '\\' if in.hasNext =>
+        case c1 if c1 == escapeChar && in.hasNext =>
           val c = in.next
           c match {
-            case '_' | '%' | '\\' => out ++= Pattern.quote(Character.toString(c))
+            case '_' | '%' => out ++= Pattern.quote(Character.toString(c))
+            case c if c == escapeChar => out ++= Pattern.quote(Character.toString(c))
             case _ => fail(s"the escape character is not allowed to precede '$c'")
           }
-        case '\\' => fail("it is not allowed to end with the escape character")
+        case c if c == escapeChar => fail("it is not allowed to end with the escape character")
         case '_' => out ++= "."
         case '%' => out ++= ".*"
        case c => out ++= Pattern.quote(Character.toString(c))
```

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala (78 additions & 0 deletions)

```diff
@@ -118,6 +118,84 @@ class RegexpExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     checkLiteralRow("""%SystemDrive%\Users\John""" like _, """\%SystemDrive\%\\Users%""", true)
   }
 
+  Seq('/', '#', '\"').foreach { escapeChar =>
+    test(s"LIKE Pattern ESCAPE '$escapeChar'") {
+      // null handling
+      checkLiteralRow(Literal.create(null, StringType).like(_, escapeChar), "a", null)
+      checkEvaluation(
+        Literal.create("a", StringType).like(Literal.create(null, StringType), escapeChar), null)
+      checkEvaluation(
+        Literal.create(null, StringType).like(Literal.create(null, StringType), escapeChar), null)
+      checkEvaluation(Literal.create("a", StringType).like(
+        NonFoldableLiteral.create("a", StringType), escapeChar), true)
+      checkEvaluation(Literal.create("a", StringType).like(
+        NonFoldableLiteral.create(null, StringType), escapeChar), null)
+      checkEvaluation(Literal.create(null, StringType).like(
+        NonFoldableLiteral.create("a", StringType), escapeChar), null)
+      checkEvaluation(Literal.create(null, StringType).like(
+        NonFoldableLiteral.create(null, StringType), escapeChar), null)
+
+      // simple patterns
+      checkLiteralRow("abdef" like(_, escapeChar), "abdef", true)
+      checkLiteralRow("a_%b" like(_, escapeChar), s"a${escapeChar}__b", true)
+      checkLiteralRow("addb" like(_, escapeChar), "a_%b", true)
+      checkLiteralRow("addb" like(_, escapeChar), s"a${escapeChar}__b", false)
+      checkLiteralRow("addb" like(_, escapeChar), s"a%$escapeChar%b", false)
+      checkLiteralRow("a_%b" like(_, escapeChar), s"a%$escapeChar%b", true)
+      checkLiteralRow("addb" like(_, escapeChar), "a%", true)
+      checkLiteralRow("addb" like(_, escapeChar), "**", false)
+      checkLiteralRow("abc" like(_, escapeChar), "a%", true)
+      checkLiteralRow("abc" like(_, escapeChar), "b%", false)
+      checkLiteralRow("abc" like(_, escapeChar), "bc%", false)
+      checkLiteralRow("a\nb" like(_, escapeChar), "a_b", true)
+      checkLiteralRow("ab" like(_, escapeChar), "a%b", true)
+      checkLiteralRow("a\nb" like(_, escapeChar), "a%b", true)
+
+      // empty input
+      checkLiteralRow("" like(_, escapeChar), "", true)
+      checkLiteralRow("a" like(_, escapeChar), "", false)
+      checkLiteralRow("" like(_, escapeChar), "a", false)
+
+      // SI-17647 double-escaping backslash
+      checkLiteralRow(s"""$escapeChar$escapeChar$escapeChar$escapeChar""" like(_, escapeChar),
+        s"""%$escapeChar$escapeChar%""", true)
+      checkLiteralRow("""%%""" like(_, escapeChar), """%%""", true)
+      checkLiteralRow(s"""${escapeChar}__""" like(_, escapeChar),
+        s"""$escapeChar$escapeChar${escapeChar}__""", true)
+      checkLiteralRow(s"""$escapeChar$escapeChar${escapeChar}__""" like(_, escapeChar),
+        s"""%$escapeChar$escapeChar%$escapeChar%""", false)
+      checkLiteralRow(s"""_$escapeChar$escapeChar$escapeChar%""" like(_, escapeChar),
+        s"""%$escapeChar${escapeChar}""", false)
+
+      // unicode
+      // scalastyle:off nonascii
+      checkLiteralRow("a\u20ACa" like(_, escapeChar), "_\u20AC_", true)
+      checkLiteralRow("a€a" like(_, escapeChar), "_€_", true)
+      checkLiteralRow("a€a" like(_, escapeChar), "_\u20AC_", true)
+      checkLiteralRow("a\u20ACa" like(_, escapeChar), "_€_", true)
+      // scalastyle:on nonascii
+
+      // invalid escaping
+      val invalidEscape = intercept[AnalysisException] {
+        evaluateWithoutCodegen("""a""" like(s"""${escapeChar}a""", escapeChar))
+      }
+      assert(invalidEscape.getMessage.contains("pattern"))
+      val endEscape = intercept[AnalysisException] {
+        evaluateWithoutCodegen("""a""" like(s"""a$escapeChar""", escapeChar))
+      }
+      assert(endEscape.getMessage.contains("pattern"))
+
+      // case
+      checkLiteralRow("A" like(_, escapeChar), "a%", false)
+      checkLiteralRow("a" like(_, escapeChar), "A%", false)
+      checkLiteralRow("AaA" like(_, escapeChar), "_a_", true)
+
+      // example
+      checkLiteralRow(s"""%SystemDrive%${escapeChar}Users${escapeChar}John""" like(_, escapeChar),
+        s"""$escapeChar%SystemDrive$escapeChar%$escapeChar${escapeChar}Users%""", true)
+    }
+  }
+
   test("RLIKE Regular Expression") {
     checkLiteralRow(Literal.create(null, StringType) rlike _, "abdef", null)
     checkEvaluation("abdef" rlike Literal.create(null, StringType), null)
```

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala (12 additions & 0 deletions)

```diff
@@ -186,6 +186,18 @@ class ExpressionParserSuite extends AnalysisTest {
     assertEqual("a not regexp 'pattern%'", !('a rlike "pattern%"))
   }
 
+  test("like escape expressions") {
+    val message = "Escape string must contains only one character."
+    assertEqual("a like 'pattern%' escape '#'", 'a.like("pattern%", '#'))
+    assertEqual("a like 'pattern%' escape '\"'", 'a.like("pattern%", '\"'))
+    intercept("a like 'pattern%' escape '##'", message)
+    intercept("a like 'pattern%' escape ''", message)
+    assertEqual("a not like 'pattern%' escape '#'", !('a.like("pattern%", '#')))
+    assertEqual("a not like 'pattern%' escape '\"'", !('a.like("pattern%", '\"')))
+    intercept("a not like 'pattern%' escape '\"/'", message)
+    intercept("a not like 'pattern%' escape ''", message)
+  }
+
   test("like expressions with ESCAPED_STRING_LITERALS = true") {
     val conf = new SQLConf()
     conf.setConfString(SQLConf.ESCAPED_STRING_LITERALS.key, "true")
```

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala (2 additions & 0 deletions)

```diff
@@ -367,6 +367,7 @@ class TableIdentifierParserSuite extends SparkFunSuite with SQLHelper {
     "drop",
     "else",
     "end",
+    "escape",
     "escaped",
     "except",
     "exchange",
@@ -581,6 +582,7 @@ class TableIdentifierParserSuite extends SparkFunSuite with SQLHelper {
     "distinct",
     "else",
     "end",
+    "escape",
     "except",
     "false",
     "fetch",
```

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala (28 additions & 7 deletions)

```diff
@@ -23,13 +23,34 @@ import org.apache.spark.sql.catalyst.util.StringUtils._
 class StringUtilsSuite extends SparkFunSuite {
 
   test("escapeLikeRegex") {
-    assert(escapeLikeRegex("abdef") === "(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E")
-    assert(escapeLikeRegex("a\\__b") === "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E")
-    assert(escapeLikeRegex("a_%b") === "(?s)\\Qa\\E..*\\Qb\\E")
-    assert(escapeLikeRegex("a%\\%b") === "(?s)\\Qa\\E.*\\Q%\\E\\Qb\\E")
-    assert(escapeLikeRegex("a%") === "(?s)\\Qa\\E.*")
-    assert(escapeLikeRegex("**") === "(?s)\\Q*\\E\\Q*\\E")
-    assert(escapeLikeRegex("a_b") === "(?s)\\Qa\\E.\\Qb\\E")
+    val expectedEscapedStrOne = "(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E"
+    val expectedEscapedStrTwo = "(?s)\\Qa\\E\\Q_\\E.\\Qb\\E"
+    val expectedEscapedStrThree = "(?s)\\Qa\\E..*\\Qb\\E"
+    val expectedEscapedStrFour = "(?s)\\Qa\\E.*\\Q%\\E\\Qb\\E"
+    val expectedEscapedStrFive = "(?s)\\Qa\\E.*"
+    val expectedEscapedStrSix = "(?s)\\Q*\\E\\Q*\\E"
+    val expectedEscapedStrSeven = "(?s)\\Qa\\E.\\Qb\\E"
+    assert(escapeLikeRegex("abdef", '\\') === expectedEscapedStrOne)
+    assert(escapeLikeRegex("abdef", '/') === expectedEscapedStrOne)
+    assert(escapeLikeRegex("abdef", '\"') === expectedEscapedStrOne)
+    assert(escapeLikeRegex("a\\__b", '\\') === expectedEscapedStrTwo)
+    assert(escapeLikeRegex("a/__b", '/') === expectedEscapedStrTwo)
+    assert(escapeLikeRegex("a\"__b", '\"') === expectedEscapedStrTwo)
+    assert(escapeLikeRegex("a_%b", '\\') === expectedEscapedStrThree)
+    assert(escapeLikeRegex("a_%b", '/') === expectedEscapedStrThree)
+    assert(escapeLikeRegex("a_%b", '\"') === expectedEscapedStrThree)
+    assert(escapeLikeRegex("a%\\%b", '\\') === expectedEscapedStrFour)
+    assert(escapeLikeRegex("a%/%b", '/') === expectedEscapedStrFour)
+    assert(escapeLikeRegex("a%\"%b", '\"') === expectedEscapedStrFour)
+    assert(escapeLikeRegex("a%", '\\') === expectedEscapedStrFive)
+    assert(escapeLikeRegex("a%", '/') === expectedEscapedStrFive)
+    assert(escapeLikeRegex("a%", '\"') === expectedEscapedStrFive)
+    assert(escapeLikeRegex("**", '\\') === expectedEscapedStrSix)
+    assert(escapeLikeRegex("**", '/') === expectedEscapedStrSix)
+    assert(escapeLikeRegex("**", '\"') === expectedEscapedStrSix)
+    assert(escapeLikeRegex("a_b", '\\') === expectedEscapedStrSeven)
+    assert(escapeLikeRegex("a_b", '/') === expectedEscapedStrSeven)
+    assert(escapeLikeRegex("a_b", '\"') === expectedEscapedStrSeven)
   }
 
   test("filter pattern") {
```
test("filter pattern") {

sql/core/src/main/scala/org/apache/spark/sql/dynamicpruning/PartitionPruning.scala (1 addition & 1 deletion)

```diff
@@ -159,7 +159,7 @@ object PartitionPruning extends Rule[LogicalPlan] with PredicateHelper {
     case Not(expr) => isLikelySelective(expr)
     case And(l, r) => isLikelySelective(l) || isLikelySelective(r)
     case Or(l, r) => isLikelySelective(l) && isLikelySelective(r)
-    case Like(_, _) => true
+    case Like(_, _, _) => true
     case _: BinaryComparison => true
     case _: In | _: InSet => true
    case _: StringPredicate => true
```

0 commit comments