A blazing fast, memory-safe alternative to NumPy + Pandas, written in Rust
RustFrames is the foundational data & array library for Rust, providing high-performance, memory-safe alternatives to NumPy and Pandas. Built from the ground up in Rust, it delivers blazing-fast numerical computing with zero-cost abstractions and fearless concurrency.
- Memory Safety → No segfaults, buffer overflows, or undefined behavior
- Performance First → SIMD acceleration, multithreading, and GPU support
- Seamless Interop → Native Apache Arrow, Parquet, CSV, and NumPy compatibility
- Unified API → One library for both tabular DataFrames and n-dimensional Arrays
- Zero-Cost Abstractions → High-level APIs with no performance overhead
- Rust Ecosystem → Perfect integration with the broader Rust data ecosystem
- N-dimensional arrays with efficient memory layout
- SIMD-accelerated operations for maximum performance
- Broadcasting support for NumPy-style operations
- Linear algebra operations (matrix multiplication, decompositions)
- GPU acceleration support (CUDA/ROCm)
- Type safety with compile-time shape checking where possible
- High-performance DataFrames and Series
- Advanced operations: GroupBy, window functions, joins
- Apache Arrow backend for zero-copy interoperability
- Multiple I/O formats: CSV, Parquet, JSON, Arrow IPC
- Memory-efficient columnar storage
- Lazy evaluation for complex query optimization
- SIMD acceleration on x86_64 and ARM64
- Parallel processing with Rayon
- GPU kernels for CUDA and ROCm
- Memory mapping for large datasets
- Vectorized string operations
Add RustFrames to your Cargo.toml
:
rustframes = "1.0"
# Optional features
rustframes = {version = "1.0", features = ["gpu", "arrow", "parquet"] }
Feature | Description | Default |
---|---|---|
gpu |
Enable CUDA/ROCm GPU acceleration | ❌ |
arrow |
Apache Arrow integration | ✅ |
parquet |
Parquet file format support | ✅ |
simd |
SIMD acceleration | ✅ |
rayon |
Parallel processing | ✅ |
use rustframes::array::Array;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create arrays
let a = Array::from_vec(vec![1.0, 2.0, 3.0, 4.0], (2, 2))?;
let b = Array::from_vec(vec![5.0, 6.0, 7.0, 8.0], (2, 2))?;
// Element-wise operations with broadcasting
let result = &a + &b;
println!("Addition:\n{}", result);
// Linear algebra
let product = a.dot(&b)?;
println!("Matrix multiplication:\n{}", product);
// SIMD-accelerated operations
let scaled = a.scale(2.0);
println!("Scaled:\n{}", scaled);
Ok(())
}
use rustframes::dataframe::DataFrame;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Load data from CSV
let mut df = DataFrame::from_csv("sales_data.csv")?;
// Basic operations
println!("Shape: {:?}", df.shape());
println!("First 5 rows:\n{}", df.head(5));
// Filtering and selection
let filtered = df
.filter("sales > 1000")?
.select(&["product", "sales", "region"])?;
// GroupBy operations
let summary = df
.group_by(&["region"])?
.agg(&[("sales", "sum"), ("profit", "mean")])?;
println!("Regional summary:\n{}", summary);
// Window functions
let with_rolling = df
.window("sales")?
.rolling(30)?
.mean()?;
// Save results
summary.to_parquet("regional_summary.parquet")?;
Ok(())
}
use rustframes::array::Array;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Large matrix operations on GPU
let a = Array::random((1000, 1000)).to_gpu()?;
let b = Array::random((1000, 1000)).to_gpu()?;
// GPU-accelerated matrix multiplication
let result = a.gpu_matmul(&b)?;
let cpu_result = result.to_cpu()?;
println!("GPU computation completed: {}", cpu_result.shape());
Ok(())
}
use rustframes::dataframe::DataFrame;
use arrow::array::Int32Array;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create DataFrame from Arrow arrays
let arrow_array = Int32Array::from(vec![1, 2, 3, 4, 5]);
let df = DataFrame::from_arrow_array("numbers", arrow_array)?;
// Export to Arrow for interop with other tools
let arrow_table = df.to_arrow_table()?;
Ok(())
}
Complete API documentation is available at docs.rs/rustframes.
rustframes::array
- N-dimensional arrays and linear algebrarustframes::dataframe
- DataFrames, Series, and tabular operationsrustframes::io
- File I/O for CSV, Parquet, Arrow formatsrustframes::gpu
- GPU acceleration utilitiesrustframes::simd
- SIMD optimization helpers
RustFrames integrates seamlessly with the Rust data ecosystem:
- Polars - Use RustFrames arrays in Polars expressions
- DataFusion - Query RustFrames DataFrames with SQL
- Candle - Deep learning with RustFrames tensors
- PyO3 - Expose RustFrames to Python
- Wasm-pack - Run RustFrames in the browser
We welcome contributions! Please see our Contributing Guide for details.
git clone https://github.com/ryan-tobin/rustframes.git
cd rustframes
cargo test --all-features
cargo bench
cargo bench --features "gpu,simd"
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.