<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Contributing to Apache DataFusion Comet

We welcome contributions to Comet in many areas, and encourage new contributors to get involved.

Here are some areas where you can help:

- Testing Comet with existing Spark jobs and reporting issues for any bugs or performance issues
- Contributing code to support Spark expressions, operators, and data types that are not currently supported
- Reviewing pull requests and helping to test new features for correctness and performance
- Improving documentation

## Finding issues to work on

We maintain a list of good first issues in GitHub [here](https://github.com/apache/datafusion-comet/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).

## Reporting issues

We use [GitHub issues](https://github.com/apache/datafusion-comet/issues) for bug reports and feature requests.

## Asking for Help

The Comet project uses the same Slack and Discord channels as the main Apache DataFusion project. See details at
[Apache DataFusion Communications]. There are dedicated Comet channels in both Slack and Discord.

## Regular public meetings

The Comet contributors hold regular video calls where new and current contributors are welcome to ask questions and
coordinate on issues that they are working on.

See the [Apache DataFusion Comet community meeting] Google document for more information.

<!-- docs/source/contributor-guide/debugging.md -->

# Comet Debugging Guide

This HOWTO describes how to debug JVM code and Native code concurrently. The guide assumes you have:

1. Intellij as the Java IDE
2. CLion as the Native IDE. For Rust code, the CLion Rust language plugin is required. Note that the
   Intellij Rust plugin is not sufficient.
3. CLion/LLDB as the native debugger. CLion ships with a bundled LLDB and the Rust community has
   its own packaging of LLDB (`lldb-rust`). Both provide a better display of Rust symbols than plain
   LLDB or the LLDB that is bundled with XCode. We will use the LLDB packaged with CLion for this guide.
4. We will use a Comet _unit_ test as the canonical use case.

_Caveat: The steps here have only been tested with JDK 11_ on Mac (M1)

1. Add a Debug Configuration for the unit test

1. In the Debug Configuration for that unit test add `-Xint` as a JVM parameter. This option is
   undocumented _magic_. Without this, the LLDB debugger hits an EXC_BAD_ACCESS (or EXC_BAD_INSTRUCTION) from
   which one cannot recover.

1. Add a println to the unit test to print the PID of the JVM process. (`jps` can also be used, but this is less error-prone if you have multiple JVM processes running.) On JDK 8:

   ```scala
   println("Waiting for Debugger: PID - " + ManagementFactory.getRuntimeMXBean().getName())
   ```

   This will print something like: `PID@your_machine_name`.

   For JDK 9 and newer:

   ```scala
   println("Waiting for Debugger: PID - " + ProcessHandle.current.pid)
   ```

   ==> Note the PID

1. Debug-run the test in Intellij and wait for the breakpoint to be hit

By default, Comet outputs the exception details specific to Comet.

```scala
scala> spark.sql("my_failing_query").show(false)
...
```

There is a verbose exception option that leverages DataFusion [backtraces](https://arrow.apache.org/datafusion/user-guide/example-usage.html#enable-backtraces).
This option appends the native DataFusion stacktrace to the original error message.
To enable this option with Comet, include the `backtrace` feature in [Cargo.toml](https://github.com/apache/arrow-datafusion-comet/blob/main/core/Cargo.toml) for the DataFusion dependencies.
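For example, the dependency entry might look like the following sketch (the version shown is a placeholder; keep whatever version the project already pins in `core/Cargo.toml`):

```toml
# Hypothetical illustration: add the `backtrace` feature to the existing
# DataFusion dependency entry, keeping the project's pinned version.
[dependencies]
datafusion = { version = "36.0.0", features = ["backtrace"] }
```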
@@ -151,6 +157,8 @@ at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126)
151
157
(reduced)
152
158
153
159
```
160
+
154
161
Note:
162
+
155
163
- The backtrace coverage in DataFusion is still improving. So there is a chance the error still not covered, if so feel free to file a [ticket](https://github.com/apache/arrow-datafusion/issues)
156
164
- The backtrace evaluation comes with performance cost and intended mostly for debugging purposes
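The mechanism behind the `backtrace` feature can be sketched with plain `std::backtrace` (this is an illustration only, not DataFusion's actual error type; `internal_error` is a hypothetical helper):

```rust
use std::backtrace::Backtrace;

// Illustration only: with the feature enabled, a backtrace is captured
// when the error is constructed and appended to the message that the
// JVM side eventually sees.
fn internal_error(msg: &str) -> String {
    // force_capture() collects a trace even when RUST_BACKTRACE is unset.
    let bt = Backtrace::force_capture();
    format!("{msg}\n\nbacktrace:\n{bt}")
}

fn main() {
    let err = internal_error("Internal error: CAST not supported");
    // The first line is the original message; the rest is the trace.
    println!("{}", err.lines().next().unwrap());
}
```

This also shows why the feature has a cost: every error construction walks the stack, which is why it is recommended only for debugging.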

<!-- docs/source/contributor-guide/development.md -->

A few common commands are specified in the project's `Makefile`:

- `make clean`: clean up the workspace
- `bin/comet-spark-shell -d . -o spark/target/` run Comet spark shell for V1 datasources
- `bin/comet-spark-shell -d . -o spark/target/ --conf spark.sql.sources.useV1SourceList=""` run Comet spark shell for V2 datasources

## Development Environment

Comet is a multi-language project with native code written in Rust and JVM code written in Java and Scala.
For Rust code, the CLion IDE is recommended. For JVM code, IntelliJ IDEA is recommended.

Before opening the project in an IDE, make sure to run `make` first to generate the necessary files for the IDEs. Currently, it's mostly about
generating protobuf message classes for the JVM side. It's only required to run `make` once after cloning the repo.

### IntelliJ IDEA

First make sure to install the Scala plugin in IntelliJ IDEA.
After that, you can open the project in IntelliJ IDEA. The IDE should automatically detect the project structure and import it as a Maven project.

### CLion

First make sure to install the Rust plugin in CLion, or use the dedicated Rust IDE: RustRover.
After that, you can open the project in CLion. The IDE should automatically detect the project structure and import it as a Cargo project.

### Running Tests in IDEA

Like other Maven projects, you can run tests in IntelliJ IDEA by right-clicking on the test class or test method and selecting "Run" or "Debug".
However, if the test is related to the native side, make sure to run `make core` or `cd core && cargo build` before running the tests in IDEA.

## Benchmark

To run TPC-H or TPC-DS micro benchmarks, please follow the instructions
in the respective source code, e.g., `CometTPCHQueryBenchmark`.

## Debugging

Comet is a multi-language project with native code written in Rust and JVM code written in Java and Scala.
It is possible to debug both native and JVM code concurrently as described in the [DEBUGGING guide](debugging)

## Submitting a Pull Request

Comet uses `cargo fmt`, [Scalafix](https://github.com/scalacenter/scalafix) and [Spotless](https://github.com/diffplug/spotless/tree/main/plugin-maven) to
automatically format the code. Before submitting a pull request, you can simply run `make format` to format the code.