| 1 | +# Development Guide |
| 2 | + |
| 3 | +This guide explains how to contribute to `geodatafusion`, a spatial extension for Apache DataFusion. |
| 4 | + |
| 5 | +## Project Structure |
| 6 | + |
| 7 | +This is a Rust workspace with multiple crates: |
| 8 | + |
| 9 | +- `rust/geodatafusion` - Core library with spatial User-Defined Functions (UDFs) |
| 10 | + - Internally, each _provider_ of functions is organized in submodules: |
| 11 | + - `native/` - Operations that are natively implemented, without the use of other dependencies like `geo` |
| 12 | + - `geo/` - Operations implemented using the `geo` crate |
| 13 | + - `geohash/` - GeoHash encoding/decoding, using the `geohash` crate |
| 14 | +- `rust/geodatafusion-flatgeobuf` - FlatGeobuf format support |
| 15 | +- `rust/geodatafusion-geoparquet` - GeoParquet format support |
| 16 | +- `rust/geodatafusion-geojson` - GeoJSON format support |
| 17 | +- `python/` - Python bindings (separate workspace) |
| 18 | + |
| 19 | +## Prerequisites |
| 20 | + |
| 21 | +### Rust Development |
| 22 | + |
| 23 | +- A Rust toolchain. The minimum supported Rust version (MSRV) is set by the `rust-version` field in `Cargo.toml`. You can update Rust using [rustup](https://rustup.rs/): `rustup update stable`. |
| 24 | + |
| 25 | +## Getting Started |
| 26 | + |
| 27 | +### Clone the Repository |
| 28 | + |
| 29 | +```bash |
| 30 | +git clone https://github.com/datafusion-contrib/geodatafusion.git |
| 31 | +cd geodatafusion |
| 32 | +``` |
| 33 | + |
| 34 | +### Build the Project |
| 35 | + |
| 36 | +```bash |
| 37 | +# Build all Rust crates |
| 38 | +cargo build |
| 39 | + |
| 40 | +# Build with all features |
| 41 | +cargo build --all-features |
| 42 | +``` |
| 43 | + |
| 44 | +## Development Workflow |
| 45 | + |
| 46 | +### Running Tests |
| 47 | + |
| 48 | +```bash |
| 49 | +# Run all Rust tests |
| 50 | +cargo test --all-features |
| 51 | + |
| 52 | +# Run tests for a specific crate |
| 53 | +cargo test -p geodatafusion |
| 54 | +``` |
| 55 | + |
| 56 | +### Code Formatting |
| 57 | + |
| 58 | +We use `rustfmt`: |
| 59 | + |
| 60 | +```bash |
| 61 | +cargo +nightly-2025-05-14 fmt -- --unstable-features \ |
| 62 | + --config imports_granularity=Module,group_imports=StdExternalCrate |
| 63 | +``` |
| 64 | + |
| 65 | +We format with a pinned nightly toolchain because the import-ordering options used above (`imports_granularity` and `group_imports`) are unstable in `rustfmt`. |
| 66 | + |
| 67 | +### Linting |
| 68 | + |
| 69 | +Run `clippy` on all crates: |
| 70 | + |
| 71 | +```bash |
| 72 | +cargo clippy --all-features --tests -- -D warnings |
| 73 | +``` |
| 74 | + |
| 75 | +### Documentation |
| 76 | + |
| 77 | +Build and view the documentation: |
| 78 | + |
| 79 | +```bash |
| 80 | +cargo doc --all-features --open |
| 81 | +``` |
| 82 | + |
| 83 | +## Contributing |
| 84 | + |
| 85 | +### Adding New Functions |
| 86 | + |
| 87 | +We follow the [PostGIS API](https://postgis.net/docs/reference.html) as closely as possible. When implementing a new function: |
| 88 | + |
| 89 | +1. **Check the README** - See if the function is listed in the function table |
| 90 | +2. **Find similar implementations** - Look at existing functions in the same category |
| 91 | +3. **Implement the function** - Follow existing patterns in `rust/geodatafusion/src/udf/` |
| 92 | +4. **Add tests** - Include unit tests and integration tests |
| 93 | +5. **Update documentation** - Add doc comments and update the README checkboxes |
| 94 | + |
| 95 | +#### Function Categories |
| 96 | + |
| 97 | +Functions are organized by category in `rust/geodatafusion/src/udf/`: |
| 98 | + |
| 99 | +- `native/constructors/` - Geometry constructors (ST_MakePoint, etc.) |
| 100 | +- `native/accessors/` - Geometry accessors (ST_X, ST_Y, etc.) |
| 101 | +- `native/io/` - Input/output (WKT, WKB) |
| 102 | +- `native/bounding_box/` - Bounding box functions |
| 103 | +- `geo/measurement/` - Measurement functions (ST_Area, ST_Distance) |
| 104 | +- `geo/processing/` - Processing functions (ST_Buffer, ST_Simplify) |
| 105 | +- `geo/relationships/` - Spatial relationships (ST_Intersects, etc.) |
| 106 | +- `geo/validation/` - Validation functions (ST_IsValid) |
| 107 | +- `geohash/` - GeoHash functions |
| 108 | + |
| 109 | +### Code Style |
| 110 | + |
| 111 | +- Use meaningful variable and function names |
| 112 | +- Add doc comments for public APIs |
| 113 | +- Follow Rust naming conventions (snake_case for functions, PascalCase for types) |
| 114 | +- Keep functions focused and single-purpose |
| 115 | +- Prefer explicit error handling over panics |
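
As an illustration of the last point, here is a minimal sketch of returning a `Result` instead of panicking. The `CoordError` type and `parse_coord` function are hypothetical names made up for this example; they are not part of `geodatafusion`.

```rust
use std::fmt;

/// Hypothetical error type for illustration; not part of geodatafusion.
#[derive(Debug, PartialEq)]
enum CoordError {
    Empty,
    Invalid(String),
}

impl fmt::Display for CoordError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoordError::Empty => write!(f, "empty coordinate string"),
            CoordError::Invalid(s) => write!(f, "invalid coordinate: {s}"),
        }
    }
}

impl std::error::Error for CoordError {}

/// Parses a single coordinate, reporting bad input as an error
/// rather than calling `unwrap()` and panicking.
fn parse_coord(input: &str) -> Result<f64, CoordError> {
    let trimmed = input.trim();
    if trimmed.is_empty() {
        return Err(CoordError::Empty);
    }
    trimmed
        .parse::<f64>()
        .map_err(|_| CoordError::Invalid(trimmed.to_string()))
}
```

Callers can then propagate the error with `?` or match on the variant, which keeps a single malformed value from aborting a whole query.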
| 116 | + |
| 117 | +### Testing Guidelines |
| 118 | + |
| 119 | +- Test both valid and invalid inputs |
| 120 | +- Test edge cases (empty geometries, null values, etc.) |
| 121 | +- Use descriptive test names |
| 122 | +- Add SQL integration tests when appropriate |
| 123 | +- Test against PostGIS behavior when possible |
| 124 | + |
| 125 | +Example test structure: |
| 126 | + |
| 127 | +```rust |
| 128 | +#[test] |
| 129 | +fn test_st_area_polygon() { |
| 130 | + // Test case description |
| 131 | + let input = /* ... */; |
| 132 | + let expected = /* ... */; |
| 133 | + let result = st_area(input); |
| 134 | + assert_eq!(result, expected); |
| 135 | +} |
| 136 | +``` |
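
For a fully self-contained variant of the template above, the sketch below tests a valid input and two edge cases. The `ring_area` helper is a hypothetical stand-in written for this example (a plain shoelace formula over coordinate pairs); it is not the crate's `ST_Area` implementation, and real tests would call the actual UDF instead.

```rust
/// Hypothetical stand-in for a function under test (shoelace formula).
/// Returns `None` for rings with fewer than three vertices.
fn ring_area(ring: &[(f64, f64)]) -> Option<f64> {
    if ring.len() < 3 {
        return None;
    }
    let mut twice_area = 0.0;
    for i in 0..ring.len() {
        let (x1, y1) = ring[i];
        // Wrap around so the last vertex connects back to the first.
        let (x2, y2) = ring[(i + 1) % ring.len()];
        twice_area += x1 * y2 - x2 * y1;
    }
    Some(twice_area.abs() / 2.0)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn unit_square_has_area_one() {
        let square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)];
        assert_eq!(ring_area(&square), Some(1.0));
    }

    #[test]
    fn empty_ring_returns_none() {
        assert_eq!(ring_area(&[]), None);
    }

    #[test]
    fn two_point_ring_returns_none() {
        assert_eq!(ring_area(&[(0.0, 0.0), (1.0, 1.0)]), None);
    }
}
```

Each test name states the input and the expected outcome, so a failure in CI can be read without opening the test body.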
| 137 | + |
| 138 | +## Continuous Integration |
| 139 | + |
| 140 | +Our CI pipeline runs on every pull request and includes: |
| 141 | + |
| 142 | +1. **Formatting** - Checks code formatting with `rustfmt` |
| 143 | +2. **Linting** - Runs `clippy` with all features |
| 144 | +3. **Tests** - Runs test suite with all features |
| 145 | +4. **Documentation** - Ensures docs build without warnings |
| 146 | +5. **Python CI** - Tests Python bindings |
| 147 | +6. **Conventional Commits** - Validates commit message format |
| 148 | + |
| 149 | +Make sure all checks pass before requesting review. |
| 150 | + |
| 151 | +## Commit Messages |
| 152 | + |
| 153 | +We use [Conventional Commits](https://www.conventionalcommits.org/): |
| 154 | + |
| 155 | +``` |
| 156 | +<type>(<scope>): <description> |
| 157 | + |
| 158 | +[optional body] |
| 159 | + |
| 160 | +[optional footer] |
| 161 | +``` |
| 162 | + |
| 163 | +Types: |
| 164 | + |
| 165 | +- `feat`: New feature |
| 166 | +- `fix`: Bug fix |
| 167 | +- `docs`: Documentation changes |
| 168 | +- `test`: Adding or updating tests |
| 169 | +- `refactor`: Code refactoring |
| 170 | +- `chore`: Maintenance tasks |
| 171 | +- `ci`: CI/CD changes |
| 172 | + |
| 173 | +Examples: |
| 174 | +``` |
| 175 | +feat(geodatafusion): Add ST_Buffer implementation |
| 176 | +fix(geodatafusion-flatgeobuf): Handle multipoint parsing edge case |
| 177 | +docs: Update README with ST_Area examples |
| 178 | +``` |
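
To check a commit header locally before pushing, something like the sketch below may help. It is only an approximation of the format described above: `is_conventional_header` is a hypothetical helper, it accepts only the types listed in this guide, it treats the scope as optional (as in the `docs:` example), and it ignores other parts of the Conventional Commits specification such as the `!` breaking-change marker. The check that CI actually runs may be stricter or more lenient.

```rust
/// Commit types listed in this guide.
const TYPES: [&str; 7] = ["feat", "fix", "docs", "test", "refactor", "chore", "ci"];

/// Returns true if `header` looks like `<type>(<scope>): <description>`
/// or `<type>: <description>`, using only the types listed above.
fn is_conventional_header(header: &str) -> bool {
    // The type (and optional scope) is separated from the description by ": ".
    let Some((prefix, description)) = header.split_once(": ") else {
        return false;
    };
    if description.trim().is_empty() {
        return false;
    }
    // Split an optional "(scope)" suffix off the type.
    let commit_type = match prefix.split_once('(') {
        Some((ty, rest)) => match rest.strip_suffix(')') {
            // The scope must be non-empty and closed by a final ')'.
            Some(scope) if !scope.is_empty() => ty,
            _ => return false,
        },
        None => prefix,
    };
    TYPES.contains(&commit_type)
}
```

For example, `is_conventional_header("feat(geodatafusion): Add ST_Buffer implementation")` returns `true`, while a header with no `type:` prefix, such as `"Update README"`, returns `false`.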
| 179 | + |
| 180 | +## Getting Help |
| 181 | + |
| 182 | +- **Issues**: Open an issue on [GitHub](https://github.com/datafusion-contrib/geodatafusion/issues) |
| 183 | +- **Discussions**: Use GitHub Discussions for questions |
| 184 | +- **Documentation**: Check the [README](README.md) and [PostGIS docs](https://postgis.net/docs/) |
| 185 | + |
| 186 | +## Additional Resources |
| 187 | + |
| 188 | +- [Apache DataFusion](https://datafusion.apache.org/) |
| 189 | +- [PostGIS Reference](https://postgis.net/docs/reference.html) |
| 190 | +- [GeoArrow Specification](https://geoarrow.org/) |
| 191 | +- [GeoRust ecosystem](https://github.com/georust) |
| 192 | + |
| 193 | +## License |
| 194 | + |
| 195 | +This project is dual-licensed under MIT OR Apache-2.0. By contributing, you agree to license your contributions under the same terms. |