Commit 056f546

chore: generate basic spark function tests (#16409)
* basic spark functions tests
* basic spark functions tests
* fix script
* fix script
* update readme
* update slt files for spark to have an extra header for context
* undo
* Update datafusion/sqllogictest/test_files/spark/README.md

Co-authored-by: Oleks V <comphead@users.noreply.github.com>

* Update datafusion/sqllogictest/test_files/spark/README.md

Co-authored-by: Oleks V <comphead@users.noreply.github.com>

* prettier

---------

Co-authored-by: Oleks V <comphead@users.noreply.github.com>
1 parent 11fc52d commit 056f546

214 files changed, +6851 -0 lines changed

datafusion/sqllogictest/test_files/spark/README.md

Lines changed: 10 additions & 0 deletions
@@ -21,6 +21,16 @@

 This directory contains test files for the `spark` test suite.

+## RoadMap
+
+Implementing the `datafusion-spark` compatible functions project is still a work in progress.
+Many of the tests in this directory are commented out and are waiting for help with implementation.
+
+For more information please see:
+
+- [The `datafusion-spark` Epic](https://github.com/apache/datafusion/issues/15914)
+- [Spark Test Generation Script](https://github.com/apache/datafusion/pull/16409#issuecomment-2972618052)
+
 ## Testing Guide

 When testing Spark functions:
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT array(1, 2, 3);
+## PySpark 3.5.5 Result: {'array(1, 2, 3)': [1, 2, 3], 'typeof(array(1, 2, 3))': 'array<int>', 'typeof(1)': 'int', 'typeof(2)': 'int', 'typeof(3)': 'int'}
+#query
+#SELECT array(1::int, 2::int, 3::int);
+
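The `## PySpark 3.5.5 Result:` header in these files happens to be a valid Python dict literal, so the expected value and types can be read back mechanically when filling in a commented-out test. The following is a minimal sketch, not part of the commit: the helper name `parse_result_comment` is hypothetical, and it assumes every header keeps the `## PySpark <version> Result: {...}` shape seen in this diff.

```python
import ast

# Assumption: the header always contains the literal marker "Result:"
# followed by a Python dict literal, as in the files in this diff.
RESULT_MARKER = "Result:"


def parse_result_comment(line: str) -> dict:
    """Parse a '## PySpark ... Result: {...}' header line into a dict."""
    head, marker, payload = line.partition(RESULT_MARKER)
    if not marker or not head.startswith("## PySpark"):
        raise ValueError("not a PySpark result header: " + line)
    # literal_eval only accepts Python literals, so no code is executed.
    return ast.literal_eval(payload.strip())


line = (
    "## PySpark 3.5.5 Result: {'array(1, 2, 3)': [1, 2, 3], "
    "'typeof(array(1, 2, 3))': 'array<int>', 'typeof(1)': 'int', "
    "'typeof(2)': 'int', 'typeof(3)': 'int'}"
)
expected = parse_result_comment(line)
print(expected["array(1, 2, 3)"])
print(expected["typeof(array(1, 2, 3))"])
```

The first key holds the value Spark returned for the original query; the `typeof(...)` keys record the result type and the argument types that the `::int` casts in the commented-out query mirror.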
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT array_repeat('123', 2);
+## PySpark 3.5.5 Result: {'array_repeat(123, 2)': ['123', '123'], 'typeof(array_repeat(123, 2))': 'array<string>', 'typeof(123)': 'string', 'typeof(2)': 'int'}
+#query
+#SELECT array_repeat('123'::string, 2::int);
+
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT sequence(1, 5);
+## PySpark 3.5.5 Result: {'sequence(1, 5)': [1, 2, 3, 4, 5], 'typeof(sequence(1, 5))': 'array<int>', 'typeof(1)': 'int', 'typeof(5)': 'int'}
+#query
+#SELECT sequence(1::int, 5::int);
+
+## Original Query: SELECT sequence(5, 1);
+## PySpark 3.5.5 Result: {'sequence(5, 1)': [5, 4, 3, 2, 1], 'typeof(sequence(5, 1))': 'array<int>', 'typeof(5)': 'int', 'typeof(1)': 'int'}
+#query
+#SELECT sequence(5::int, 1::int);
+
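The two recorded `sequence` results show that both bounds are inclusive and that the direction follows the order of the arguments. As an illustration only, the two-argument integer case can be sketched in plain Python; `spark_sequence` is a hypothetical helper name, and the implicit step of +1 or -1 is an assumption inferred from the two results above rather than from the Spark source.

```python
def spark_sequence(start: int, stop: int) -> list[int]:
    """Sketch of two-argument integer sequence(): inclusive at both ends."""
    # Assumption: the implicit step is +1 when start <= stop, otherwise -1.
    step = 1 if start <= stop else -1
    # range() excludes its end point, so move it one step further out.
    return list(range(start, stop + step, step))


print(spark_sequence(1, 5))  # matches the PySpark result [1, 2, 3, 4, 5]
print(spark_sequence(5, 1))  # matches the PySpark result [5, 4, 3, 2, 1]
```

A native implementation would also need to handle the explicit step argument and non-integer types, which this sketch leaves out.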
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT bit_count(0);
+## PySpark 3.5.5 Result: {'bit_count(0)': 0, 'typeof(bit_count(0))': 'int', 'typeof(0)': 'int'}
+#query
+#SELECT bit_count(0::int);
+
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT bit_get(11, 0);
+## PySpark 3.5.5 Result: {'bit_get(11, 0)': 1, 'typeof(bit_get(11, 0))': 'tinyint', 'typeof(11)': 'int', 'typeof(0)': 'int'}
+#query
+#SELECT bit_get(11::int, 0::int);
+
+## Original Query: SELECT bit_get(11, 2);
+## PySpark 3.5.5 Result: {'bit_get(11, 2)': 0, 'typeof(bit_get(11, 2))': 'tinyint', 'typeof(11)': 'int', 'typeof(2)': 'int'}
+#query
+#SELECT bit_get(11::int, 2::int);
+
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT getbit(11, 0);
+## PySpark 3.5.5 Result: {'getbit(11, 0)': 1, 'typeof(getbit(11, 0))': 'tinyint', 'typeof(11)': 'int', 'typeof(0)': 'int'}
+#query
+#SELECT getbit(11::int, 0::int);
+
+## Original Query: SELECT getbit(11, 2);
+## PySpark 3.5.5 Result: {'getbit(11, 2)': 0, 'typeof(getbit(11, 2))': 'tinyint', 'typeof(11)': 'int', 'typeof(2)': 'int'}
+#query
+#SELECT getbit(11::int, 2::int);
+
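The `bit_get` and `getbit` results recorded above are identical for the same inputs: 11 is `0b1011`, so bit 0 is 1 and bit 2 is 0, counting from the least significant bit. A minimal Python sketch of that arithmetic follows; the function name mirrors the SQL name for readability only, and rejecting negative positions is an assumption rather than documented behavior from this commit.

```python
def bit_get(value: int, pos: int) -> int:
    """Return the bit of `value` at position `pos`, counted from the right."""
    if pos < 0:
        # Assumption: negative positions are treated as invalid input.
        raise ValueError("bit position must be non-negative")
    # Shift the wanted bit into the lowest position, then mask it out.
    return (value >> pos) & 1


print(bin(11))         # 0b1011
print(bit_get(11, 0))  # 1, as in the recorded PySpark result
print(bit_get(11, 2))  # 0, as in the recorded PySpark result
```

The recorded result type is `tinyint`, so a native implementation would return an 8-bit integer rather than the input's `int` type.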
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT shiftright(4, 1);
+## PySpark 3.5.5 Result: {'shiftright(4, 1)': 2, 'typeof(shiftright(4, 1))': 'int', 'typeof(4)': 'int', 'typeof(1)': 'int'}
+#query
+#SELECT shiftright(4::int, 1::int);
+
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT shiftrightunsigned(4, 1);
+## PySpark 3.5.5 Result: {'shiftrightunsigned(4, 1)': 2, 'typeof(shiftrightunsigned(4, 1))': 'int', 'typeof(4)': 'int', 'typeof(1)': 'int'}
+#query
+#SELECT shiftrightunsigned(4::int, 1::int);
+
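The recorded cases for `shiftright` and `shiftrightunsigned` both use a positive input, where the two functions agree (`4 >> 1` is 2). They only diverge for negative inputs, which these generated tests do not cover. The sketch below illustrates the difference for 32-bit values under the assumption that the functions behave like Java's `>>` and `>>>` on `int`; that assumption comes from Spark running on the JVM, not from this commit. It also assumes a shift amount between 0 and 31, and `to_int32` is a hypothetical helper.

```python
def to_int32(x: int) -> int:
    """Wrap a Python int into the signed 32-bit range."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def shiftright(value: int, n: int) -> int:
    # Arithmetic shift: the sign bit is copied into the vacated bits.
    return to_int32(value) >> n


def shiftrightunsigned(value: int, n: int) -> int:
    # Logical shift: reinterpret as unsigned 32-bit, so zeros fill from the left.
    return to_int32((value & 0xFFFFFFFF) >> n)


print(shiftright(4, 1), shiftrightunsigned(4, 1))    # both 2, as recorded above
print(shiftright(-4, 1), shiftrightunsigned(-4, 1))  # -2 versus 2147483646
```

A negative-input case along these lines may be worth adding by hand when the tests are enabled, since the generated cases alone cannot tell the two functions apart.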
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This file was originally created by a porting script from:
+# https://github.com/lakehq/sail/tree/43b6ed8221de5c4c4adbedbb267ae1351158b43c/crates/sail-spark-connect/tests/gold_data/function
+# This file is part of the implementation of the datafusion-spark function library.
+# For more information, please see:
+# https://github.com/apache/datafusion/issues/15914
+
+## Original Query: SELECT concat('Spark', 'SQL');
+## PySpark 3.5.5 Result: {'concat(Spark, SQL)': 'SparkSQL', 'typeof(concat(Spark, SQL))': 'string', 'typeof(Spark)': 'string', 'typeof(SQL)': 'string'}
+#query
+#SELECT concat('Spark'::string, 'SQL'::string);
+
