Closed
Description
Describe the bug
The current initcap implementation uses DataFusion's initcap, which does not match Spark's semantics.
Spark's initcap function semantics (per its description):
Returns `str` with the first letter of each word in uppercase.
All other letters are in lowercase. Words are delimited by white space.
DataFusion's initcap function semantics (per its description):
Capitalizes the first character in each word in the input string.
Words are delimited by non-alphanumeric characters.
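To make the difference concrete, here is a minimal Python sketch of the two word-boundary rules described above. It is not the Spark or DataFusion code: the function names `initcap_whitespace` and `initcap_non_alnum` are hypothetical, and using `str.isspace()` as the definition of "white space" is an assumption (Spark's actual delimiter set may be narrower).

```python
def initcap_whitespace(s: str) -> str:
    """Spark-style rule as described above: only white space starts a new word.

    Assumption: "white space" is approximated with str.isspace().
    """
    out = []
    start_of_word = True
    for ch in s:
        # Uppercase the first character of a word, lowercase everything else.
        out.append(ch.upper() if start_of_word else ch.lower())
        start_of_word = ch.isspace()
    return "".join(out)


def initcap_non_alnum(s: str) -> str:
    """DataFusion-style rule as described above: any non-alphanumeric
    character starts a new word."""
    out = []
    start_of_word = True
    for ch in s:
        out.append(ch.upper() if start_of_word else ch.lower())
        start_of_word = not ch.isalnum()
    return "".join(out)


if __name__ == "__main__":
    for txt in ["Hello_world", "robert rose-smith", "foo.bar/baz"]:
        print(txt, "|", initcap_whitespace(txt), "|", initcap_non_alnum(txt))
```

Under the first rule, `_`, `-`, `.` and `/` are ordinary characters inside a word, so the letter that follows them is lowercased. Under the second rule each of them ends a word, so the following letter is capitalized.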
To Reproduce
Run the following SQL statements:
CREATE TABLE tbl(id INT, txt STRING) USING parquet;
INSERT INTO tbl VALUES
(1, 'Hello_world'),
(2, 'robert rose-smith'),
(3, 'foo.bar/baz')
;
select id, initcap(txt) from tbl;
Expected behavior
1,Hello_world
2,Robert Rose-smith
3,Foo.bar/baz
Actual behavior
1,Hello_World
2,Robert Rose-Smith
3,Foo.Bar/Baz
Screenshots
Additional context