Commit d29cf97

docs: Developer documentation for Rust (#34)
1 parent c5db2a0 commit d29cf97

4 files changed: 198 additions and 3 deletions
Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ version = "0.1.0-beta.2"
 authors = ["Kyle Barron <kylebarron2@gmail.com>"]
 edition = "2024"
 license = "MIT OR Apache-2.0"
-repository = "https://github.com/datafusion-contrib/datafusion-geo"
+repository = "https://github.com/datafusion-contrib/geodatafusion"
 rust-version = "1.85"
 categories = ["science::geo"]

DEVELOP.md

Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
# Development Guide

This guide explains how to contribute to `geodatafusion`, a spatial extension for Apache DataFusion.

## Project Structure

This is a Rust workspace with multiple crates:

- `rust/geodatafusion` - Core library with spatial User-Defined Functions (UDFs)
  - Internally, each _provider_ of functions is organized in submodules:
    - `native/` - Operations implemented natively, without other dependencies such as `geo`
    - `geo/` - Operations implemented using the `geo` crate
    - `geohash/` - GeoHash encoding/decoding, using the `geohash` crate
- `rust/geodatafusion-flatgeobuf` - FlatGeobuf format support
- `rust/geodatafusion-geoparquet` - GeoParquet format support
- `rust/geodatafusion-geojson` - GeoJSON format support
- `python/` - Python bindings (separate workspace)

## Prerequisites

### Rust Development

- Rust: the minimum supported Rust version (MSRV) is set by `rust-version` in `Cargo.toml` (`1.85` as of this commit). You can update Rust using [rustup](https://rustup.rs/): `rustup update stable`.

## Getting Started

### Clone the Repository

```bash
git clone https://github.com/datafusion-contrib/geodatafusion.git
cd geodatafusion
```

### Build the Project

```bash
# Build all Rust crates
cargo build

# Build with all features
cargo build --all-features
```

## Development Workflow

### Running Tests

```bash
# Run all Rust tests
cargo test --all-features

# Run tests for a specific crate
cargo test -p geodatafusion
```

### Code Formatting

We use `rustfmt`:

```bash
cargo +nightly-2025-05-14 fmt -- --unstable-features \
  --config imports_granularity=Module,group_imports=StdExternalCrate
```

Formatting uses a pinned nightly toolchain because the `imports_granularity` and `group_imports` options are unstable `rustfmt` features.
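
To make the effect of those two options concrete, here is a minimal sketch of how a module's import block might look after formatting. It follows how rustfmt documents `group_imports=StdExternalCrate` (standard library first, then external crates, then `crate`/`self`/`super`) and `imports_granularity=Module` (one `use` statement per module). The `describe` function and its input are hypothetical and exist only so that the imports are used.

```rust
// Group 1: `std`, `core`, and `alloc`. With `imports_granularity=Module`,
// items from the same module share one `use` statement, while items from
// different modules get separate statements.
use std::collections::{BTreeMap, HashMap};
use std::fmt::Write;

// Group 2 would hold external crates and group 3 the `crate`, `self`, and
// `super` imports, each separated by a blank line. Both are left out so the
// sketch compiles without dependencies.

/// Hypothetical helper: formats per-type counts in sorted order.
pub fn describe(counts: &HashMap<String, usize>) -> String {
    let sorted: BTreeMap<&String, &usize> = counts.iter().collect();
    let mut out = String::new();
    for (name, count) in sorted {
        // Writing into a `String` does not fail, so the result is discarded.
        let _ = write!(out, "{name}={count};");
    }
    out
}
```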

### Linting

Run `clippy` on all crates:

```bash
cargo clippy --all-features --tests -- -D warnings
```

### Documentation

Build and view the documentation:

```bash
cargo doc --all-features --open
```

## Contributing

### Adding New Functions

We follow the [PostGIS API](https://postgis.net/docs/reference.html) as closely as possible. When implementing a new function:

1. **Check the README** - See if the function is listed in the function table
2. **Find similar implementations** - Look at existing functions in the same category
3. **Implement the function** - Follow existing patterns in `rust/geodatafusion/src/udf/`
4. **Add tests** - Include unit tests and integration tests
5. **Update documentation** - Add doc comments and update the README checkboxes
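
As a rough illustration of step 3, much of a function's logic can often be thought of as a per-row kernel that maps each input geometry to one output value and passes nulls through. The sketch below mimics `ST_X` on a plain slice. `Point` and `st_x_kernel` are hypothetical names, and this is not the crate's actual UDF plumbing, which works with DataFusion and Arrow arrays; follow the existing implementations for the real patterns.

```rust
/// Hypothetical stand-in for a point geometry. The real crate operates on
/// Arrow-backed geometry arrays rather than a struct like this.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Per-row kernel in the spirit of `ST_X`: one output per input row,
/// with null rows (`None`) propagated as null outputs.
pub fn st_x_kernel(rows: &[Option<Point>]) -> Vec<Option<f64>> {
    rows.iter()
        .map(|row| row.as_ref().map(|point| point.x))
        .collect()
}
```

Keeping null handling inside the kernel means the output always has the same number of rows as the input, which is what a columnar engine expects from a scalar function.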

#### Function Categories

Functions are organized by category in `rust/geodatafusion/src/udf/`:

- `native/constructors/` - Geometry constructors (ST_MakePoint, etc.)
- `native/accessors/` - Geometry accessors (ST_X, ST_Y, etc.)
- `native/io/` - Input/output (WKT, WKB)
- `native/bounding_box/` - Bounding box functions
- `geo/measurement/` - Measurement functions (ST_Area, ST_Distance)
- `geo/processing/` - Processing functions (ST_Buffer, ST_Simplify)
- `geo/relationships/` - Spatial relationships (ST_Intersects, etc.)
- `geo/validation/` - Validation functions (ST_IsValid)
- `geohash/` - GeoHash functions

### Code Style

- Use meaningful variable and function names
- Add doc comments for public APIs
- Follow Rust naming conventions (snake_case for functions, PascalCase for types)
- Keep functions focused and single-purpose
- Prefer explicit error handling over panics
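
As one way to read the last guideline, the sketch below returns a `Result` with a descriptive error instead of calling `unwrap()` on user-supplied input. `CoordError` and `parse_coord` are hypothetical names made up for this example; they are not part of the crate.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; not a type from the crate.
#[derive(Debug, PartialEq)]
pub enum CoordError {
    /// The input did not contain exactly two whitespace-separated fields.
    WrongArity { found: usize },
    /// A field could not be parsed as a floating-point number.
    NotANumber(String),
}

impl fmt::Display for CoordError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoordError::WrongArity { found } => {
                write!(f, "expected 2 coordinates, found {found}")
            }
            CoordError::NotANumber(field) => write!(f, "not a number: {field:?}"),
        }
    }
}

impl std::error::Error for CoordError {}

/// Parses "x y" into a coordinate pair, reporting bad input as `Err`
/// rather than panicking.
pub fn parse_coord(text: &str) -> Result<(f64, f64), CoordError> {
    let fields: Vec<&str> = text.split_whitespace().collect();
    if fields.len() != 2 {
        return Err(CoordError::WrongArity { found: fields.len() });
    }
    let parse = |field: &str| {
        field
            .parse::<f64>()
            .map_err(|_| CoordError::NotANumber(field.to_string()))
    };
    // Indexing is safe here because the length was checked above.
    Ok((parse(fields[0])?, parse(fields[1])?))
}
```

Callers can then decide how to surface the failure, for example by converting it into a query error, instead of the whole process aborting on one malformed row.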

### Testing Guidelines

- Test both valid and invalid inputs
- Test edge cases (empty geometries, null values, etc.)
- Use descriptive test names
- Add SQL integration tests when appropriate
- Test against PostGIS behavior when possible

Example test structure:

```rust
#[test]
fn test_st_area_polygon() {
    // Test case description
    let input = /* ... */;
    let expected = /* ... */;
    let result = st_area(input);
    assert_eq!(result, expected);
}
```
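
For a version of that structure that compiles on its own, the sketch below tests a small stand-in function against a valid input, an empty input, and a degenerate input. `ring_area` is a hypothetical helper using the shoelace formula; it is not the crate's `ST_Area` implementation and only serves to show the test layout.

```rust
/// Hypothetical helper: unsigned area of a simple polygon ring computed with
/// the shoelace formula. The ring may be open or closed; a ring with fewer
/// than three vertices has area 0.0.
pub fn ring_area(ring: &[(f64, f64)]) -> f64 {
    if ring.len() < 3 {
        return 0.0;
    }
    let mut twice_area = 0.0;
    for i in 0..ring.len() {
        let (x0, y0) = ring[i];
        // Wrap around so the last vertex connects back to the first.
        let (x1, y1) = ring[(i + 1) % ring.len()];
        twice_area += x0 * y1 - x1 * y0;
    }
    twice_area.abs() / 2.0
}

#[test]
fn ring_area_of_closed_unit_square_is_one() {
    let square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)];
    assert_eq!(ring_area(&square), 1.0);
}

#[test]
fn ring_area_of_empty_ring_is_zero() {
    assert_eq!(ring_area(&[]), 0.0);
}

#[test]
fn ring_area_of_collinear_points_is_zero() {
    let line = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)];
    assert_eq!(ring_area(&line), 0.0);
}
```

Each test name states the input and the expected outcome, so a failure in CI points at the broken case without opening the test body.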

## Continuous Integration

Our CI pipeline runs on every pull request and includes:

1. **Formatting** - Checks code formatting with `rustfmt`
2. **Linting** - Runs `clippy` with all features
3. **Tests** - Runs test suite with all features
4. **Documentation** - Ensures docs build without warnings
5. **Python CI** - Tests Python bindings
6. **Conventional Commits** - Validates commit message format

Make sure all checks pass before requesting review.

## Commit Messages

We use [Conventional Commits](https://www.conventionalcommits.org/):

```
<type>(<scope>): <description>

[optional body]

[optional footer]
```

Types:

- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `test`: Adding or updating tests
- `refactor`: Code refactoring
- `chore`: Maintenance tasks
- `ci`: CI/CD changes

Examples:

```
feat(geodatafusion): Add ST_Buffer implementation
fix(geodatafusion-flatgeobuf): Handle multipoint parsing edge case
docs: Update README with ST_Area examples
```
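
To check a header locally before pushing, the format can be sketched as a small validator. This is a hypothetical helper, not the check that CI runs. It assumes the scope is optional (as in the `docs:` example above), accepts only the types listed in this guide, and ignores other parts of the Conventional Commits specification such as the `!` breaking-change marker.

```rust
/// Commit types listed in this guide.
const ALLOWED_TYPES: [&str; 7] = ["feat", "fix", "docs", "test", "refactor", "chore", "ci"];

/// Checks a commit header against `<type>(<scope>): <description>`,
/// where the `(<scope>)` part is optional.
pub fn is_valid_header(header: &str) -> bool {
    // Split at the first ": " into the type/scope prefix and the description.
    let Some((prefix, description)) = header.split_once(": ") else {
        return false;
    };
    if description.trim().is_empty() {
        return false;
    }
    // Separate an optional, non-empty "(scope)" suffix from the type.
    let commit_type = match prefix.split_once('(') {
        Some((ty, rest)) => match rest.strip_suffix(')') {
            Some(scope) if !scope.is_empty() => ty,
            _ => return false,
        },
        None => prefix,
    };
    ALLOWED_TYPES.contains(&commit_type)
}
```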

## Getting Help

- **Issues**: Open an issue on [GitHub](https://github.com/datafusion-contrib/geodatafusion/issues)
- **Discussions**: Use GitHub Discussions for questions
- **Documentation**: Check the [README](README.md) and [PostGIS docs](https://postgis.net/docs/)

## Additional Resources

- [Apache DataFusion](https://datafusion.apache.org/)
- [PostGIS Reference](https://postgis.net/docs/reference.html)
- [GeoArrow Specification](https://geoarrow.org/)
- [GeoRust ecosystem](https://github.com/georust)

## License

This project is dual-licensed under MIT OR Apache-2.0. By contributing, you agree to license your contributions under the same terms.

python/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ edition = "2024"
 authors = ["Kyle Barron <kylebarron2@gmail.com>"]
 description = "Python bindings for geodatafusion, a geospatial extension to DataFusion"
 readme = "README.md"
-repository = "https://github.com/datafusion-contrib/datafusion-geo"
+repository = "https://github.com/datafusion-contrib/geodatafusion"
 license = "MIT OR Apache-2.0"
 keywords = ["python", "arrow", "geospatial"]
 categories = ["wasm", "science::geo"]

python/README.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ pip install geodatafusion

 ## Usage

-To use, register the User-Defined Functions (UDFs) provided by `geodatafusion` on your `SessionContext`. The easiest way to do this is via `geodatafusion.register_all`. The [top-level Rust README](https://github.com/datafusion-contrib/datafusion-geo) contains a tracker of the UDFs currently implemented.
+To use, register the User-Defined Functions (UDFs) provided by `geodatafusion` on your `SessionContext`. The easiest way to do this is via `geodatafusion.register_all`. The [top-level Rust README](https://github.com/datafusion-contrib/geodatafusion) contains a tracker of the UDFs currently implemented.

 ```py
 from datafusion import SessionContext
