feat: add some more examples in vortex crate
#5019
Merged
connortsui20
merged 7 commits into
develop
from
claude/add-rust-vortex-examples-011CUKbvknTvq9ymU8u4hYC6
Oct 22, 2025
Codecov Report: All modified and coverable lines are covered by tests.
871b05b to 2441515
This commit adds 6 comprehensive examples to the vortex crate that demonstrate how to use Vortex to build data systems. These examples fill a gap in the documentation by providing practical, runnable code for common use cases.

Examples added:

1. **basic_arrays.rs** - Demonstrates creating different array types:
   - Primitive arrays (integers, floats)
   - String and binary arrays
   - Boolean arrays
   - Struct arrays (columnar records)
   - List arrays (nested data)
   - Chunked arrays (partitioned data)
   - Nullable arrays with validity masks

2. **compression_showcase.rs** - Shows Vortex's compression capabilities:
   - Sequential data compression (timestamps, IDs)
   - Repetitive data compression (RLE opportunities)
   - String data compression (dictionary encoding)
   - Floating-point data compression (ALP/PCO)
   - Sparse data compression
   - Structured data compression with per-column strategies

3. **file_io_filtering.rs** - File I/O and predicate pushdown:
   - Writing data to .vortex files with compression
   - Reading entire files
   - Reading with simple filters (age > 30)
   - Reading with complex filters (age > 25 AND age < 40)
   - Demonstrates efficient predicate pushdown

4. **analytics_pipeline.rs** - Complete e-commerce analytics pipeline:
   - Generating synthetic transaction data
   - Writing compressed data to disk
   - Computing total revenue with aggregations
   - Grouping by category
   - Filtering high-value transactions
   - Reading filtered data from disk

5. **stream_processing.rs** - Streaming sensor data processing:
   - Generating data in chunks (simulating real-time ingestion)
   - Processing chunks individually for memory efficiency
   - Writing chunked data to files
   - Streaming reads from files
   - Streaming with filters

6. **arrow_interop.rs** - Apache Arrow interoperability:
   - Converting Vortex arrays to Arrow
   - Converting Arrow arrays to Vortex
   - Round-trip conversion with data integrity verification
   - Working with Arrow RecordBatch

All examples include:
- Comprehensive documentation comments
- Clear explanations of what each section does
- Practical use cases that developers can adapt
- Proper error handling
- Run instructions

These examples help users understand how to:
- Build custom data pipelines
- Leverage Vortex's compression
- Integrate with Arrow ecosystem
- Process large datasets efficiently
- Work with different data types and structures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
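For readers unfamiliar with the "RLE opportunities" mentioned under compression_showcase.rs, here is a minimal, standard-library-only sketch of run-length encoding. It is not Vortex's implementation and uses no Vortex API; `rle_encode` and `rle_decode` are hypothetical names chosen for illustration.

```rust
/// Run-length encode a slice into (value, run_length) pairs.
/// Illustrative only: this is NOT how Vortex implements RLE.
pub fn rle_encode<T: PartialEq + Clone>(values: &[T]) -> Vec<(T, usize)> {
    let mut runs: Vec<(T, usize)> = Vec::new();
    for value in values {
        // Extend the current run if the value repeats.
        if let Some((last, count)) = runs.last_mut() {
            if *last == *value {
                *count += 1;
                continue;
            }
        }
        // Otherwise start a new run of length 1.
        runs.push((value.clone(), 1));
    }
    runs
}

/// Expand (value, run_length) pairs back into the original sequence.
pub fn rle_decode<T: Clone>(runs: &[(T, usize)]) -> Vec<T> {
    runs.iter()
        .flat_map(|(value, count)| std::iter::repeat(value.clone()).take(*count))
        .collect()
}
```

Repetitive columns such as status codes or category labels collapse into a handful of runs, which is why data of that shape is a natural candidate for this kind of encoding.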
Replace the generic stream processing example with a real-world, production-ready example: a tracing subscriber that writes structured logs to Vortex files.

This new example demonstrates three critical capabilities:

1. **Nested Data Structures**: Shows how Vortex handles complex nested data like tracing events with fields, spans, and metadata. Events are stored as structured records with:
   - Timestamps
   - Log levels (DEBUG, INFO, WARN, ERROR)
   - Targets (module paths)
   - Messages
   - Nested field data (serialized as JSON, demonstrating nested compression)

2. **Pluggability**: Demonstrates Vortex's integration into existing Rust ecosystems by implementing `tracing_subscriber::Layer<S>`. This is a real integration pattern that developers can use in production applications.

3. **Async Writing & Batching**: Implements a non-blocking async architecture with:
   - Unbounded channel for event collection
   - Background task for batch writing
   - Configurable batch sizes
   - Graceful shutdown with flush
   - Multiple file rotation

Key implementation details:
- Custom `VortexLayer` that implements `tracing_subscriber::Layer`
- `FieldVisitor` to extract structured data from tracing events
- `WriterHandle` managing async batch writes
- Compression via `CompactCompressor`
- Demonstrates real-world data volumes and compression ratios

The example simulates HTTP request handling with database queries, validation, and error conditions, generating ~30+ trace events that get batched and written to compressed Vortex files.

This is a much more compelling example than generic sensor data, showing developers exactly how they could use Vortex in their own logging/observability pipelines.

Dependencies added:
- tracing
- tracing-subscriber
- serde_json (for field serialization)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
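The channel-plus-background-writer pattern described in this commit message can be sketched with the standard library alone. This is an assumption-laden illustration, not the code from the PR: it swaps the async task for a plain thread, keeps batches in memory instead of writing Vortex files, and `LogEvent` and `BatchWriter` are hypothetical names (the PR's own types are `VortexLayer` and `WriterHandle`).

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for one tracing event.
#[derive(Debug, Clone, PartialEq)]
pub struct LogEvent {
    pub level: &'static str,
    pub message: String,
}

/// Hypothetical handle owning the channel sender and the writer thread.
pub struct BatchWriter {
    sender: Option<Sender<LogEvent>>,
    worker: Option<JoinHandle<Vec<Vec<LogEvent>>>>,
}

impl BatchWriter {
    /// Spawn a background thread that groups events into batches of `batch_size`.
    pub fn spawn(batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        let (sender, receiver) = mpsc::channel::<LogEvent>();
        let worker = thread::spawn(move || {
            let mut written = Vec::new(); // stands in for files on disk
            let mut pending = Vec::with_capacity(batch_size);
            // `recv` returns Err once every sender is dropped, ending the loop.
            while let Ok(event) = receiver.recv() {
                pending.push(event);
                if pending.len() == batch_size {
                    written.push(std::mem::take(&mut pending));
                }
            }
            // Graceful shutdown: flush the final partial batch.
            if !pending.is_empty() {
                written.push(pending);
            }
            written
        });
        Self { sender: Some(sender), worker: Some(worker) }
    }

    /// Hand an event to the writer; the unbounded channel never blocks the caller.
    pub fn record(&self, event: LogEvent) {
        if let Some(sender) = &self.sender {
            // A send error only means the worker already exited; ignore it here.
            let _ = sender.send(event);
        }
    }

    /// Close the channel, wait for the worker, and return the batches it collected.
    pub fn shutdown(mut self) -> Vec<Vec<LogEvent>> {
        drop(self.sender.take());
        self.worker
            .take()
            .expect("worker already joined")
            .join()
            .expect("writer thread panicked")
    }
}
```

Dropping the sender is what signals shutdown: the receive loop ends, the partial batch is flushed, and `shutdown` returns only after the worker has finished. An unbounded channel keeps the logging call site non-blocking at the cost of unbounded memory growth if the writer falls behind.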
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
b31c18a to be65a41
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
AdamGS
approved these changes
Oct 21, 2025
Two examples.

- `tracing` Subscriber that writes to a sequence of Vortex files using the Compact compressor

Claude-tested, Duffy-approved.