@a10y
Contributor

@a10y a10y commented Oct 21, 2025

Two examples.

  1. A fairly trivial showcase of compression performance
  2. A more involved example: building a tracing Subscriber that writes to a sequence of Vortex files using the Compact compressor

Claude-tested, Duffy-approved.

@a10y a10y marked this pull request as draft October 21, 2025 13:10
@codecov

codecov bot commented Oct 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.31%. Comparing base (036d44a) to head (58b8c09).
⚠️ Report is 21 commits behind head on develop.

@a10y a10y force-pushed the claude/add-rust-vortex-examples-011CUKbvknTvq9ymU8u4hYC6 branch 2 times, most recently from 871b05b to 2441515 October 21, 2025 13:35
@a10y a10y added the documentation (Improvements or additions to documentation) and changelog/feature (A new feature) labels Oct 21, 2025
claude and others added 5 commits October 21, 2025 10:25
This commit adds 6 comprehensive examples to the vortex crate that demonstrate
how to use Vortex to build data systems. These examples fill a gap in the
documentation by providing practical, runnable code for common use cases.

Examples added:

1. **basic_arrays.rs** - Demonstrates creating different array types:
   - Primitive arrays (integers, floats)
   - String and binary arrays
   - Boolean arrays
   - Struct arrays (columnar records)
   - List arrays (nested data)
   - Chunked arrays (partitioned data)
   - Nullable arrays with validity masks
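
As a rough illustration of the last bullet, a nullable column can be modeled as a dense value buffer plus one validity flag per row. This is a standard-library sketch of the concept only: `NullableI32` and its methods are hypothetical names, not Vortex types, and a real implementation would typically pack the flags into a bitmap.

```rust
/// Hypothetical nullable column (illustration only, not a Vortex type):
/// a dense value buffer plus one validity flag per row.
struct NullableI32 {
    values: Vec<i32>,
    validity: Vec<bool>,
}

impl NullableI32 {
    /// Build the two buffers from optional values; null slots store a placeholder 0.
    fn from_options(items: &[Option<i32>]) -> Self {
        Self {
            values: items.iter().map(|v| v.unwrap_or(0)).collect(),
            validity: items.iter().map(|v| v.is_some()).collect(),
        }
    }

    /// Return the value at `i`, or `None` if the row is null or out of bounds.
    fn get(&self, i: usize) -> Option<i32> {
        if *self.validity.get(i)? {
            Some(self.values[i])
        } else {
            None
        }
    }

    /// Count rows whose validity flag is false.
    fn null_count(&self) -> usize {
        self.validity.iter().filter(|ok| !**ok).count()
    }

    /// Sum only the valid rows, widening to i64 to avoid overflow.
    fn sum(&self) -> i64 {
        self.values
            .iter()
            .zip(&self.validity)
            .filter(|(_, ok)| **ok)
            .map(|(v, _)| i64::from(*v))
            .sum()
    }
}

fn main() {
    let col = NullableI32::from_options(&[Some(1), None, Some(3)]);
    println!("nulls = {}, sum = {}", col.null_count(), col.sum());
}
```

Keeping validity separate from the values lets aggregations skip nulls without branching on an `Option` per element.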

2. **compression_showcase.rs** - Shows Vortex's compression capabilities:
   - Sequential data compression (timestamps, IDs)
   - Repetitive data compression (RLE opportunities)
   - String data compression (dictionary encoding)
   - Floating-point data compression (ALP/PCO)
   - Sparse data compression
   - Structured data compression with per-column strategies
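
To make two of these ideas concrete, here is a minimal standard-library sketch of run-length encoding (for repetitive data) and delta encoding (for sequential data such as timestamps). The function names are hypothetical and this is not how Vortex's encodings are implemented; it only shows why such data shrinks.

```rust
/// Run-length encode: collapse each run of equal values into (value, run_length).
fn rle_encode<T: PartialEq + Copy>(values: &[T]) -> Vec<(T, usize)> {
    let mut runs: Vec<(T, usize)> = Vec::new();
    for &v in values {
        if let Some((last, len)) = runs.last_mut() {
            if *last == v {
                *len += 1;
                continue;
            }
        }
        runs.push((v, 1));
    }
    runs
}

/// Delta encode: store the first value, then each difference from its predecessor.
/// Wrapping arithmetic keeps the round trip exact even at the i64 extremes.
fn delta_encode(values: &[i64]) -> Vec<i64> {
    let mut prev = 0i64;
    values
        .iter()
        .map(|&v| {
            let delta = v.wrapping_sub(prev);
            prev = v;
            delta
        })
        .collect()
}

/// Invert `delta_encode` by accumulating the differences.
fn delta_decode(deltas: &[i64]) -> Vec<i64> {
    let mut acc = 0i64;
    deltas
        .iter()
        .map(|&d| {
            acc = acc.wrapping_add(d);
            acc
        })
        .collect()
}

fn main() {
    let statuses = [200, 200, 200, 404, 404, 200];
    println!("runs = {:?}", rle_encode(&statuses));

    let timestamps = [1000, 1001, 1002, 1005];
    let deltas = delta_encode(&timestamps);
    println!("deltas = {:?}", deltas);
    assert_eq!(delta_decode(&deltas), timestamps);
}
```

After delta encoding, most values are small integers, which a later bit-packing step can store in far fewer bits than the original 64.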

3. **file_io_filtering.rs** - File I/O and predicate pushdown:
   - Writing data to .vortex files with compression
   - Reading entire files
   - Reading with simple filters (age > 30)
   - Reading with complex filters (age > 25 AND age < 40)
   - Demonstrates efficient predicate pushdown
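
One common way predicate pushdown saves work is by consulting per-chunk min/max statistics before decoding a chunk. The sketch below illustrates that idea with the standard library only; `Chunk` and `scan_between` are hypothetical names, and whether the example file uses exactly this mechanism is an assumption.

```rust
/// Hypothetical chunk of an `age` column with precomputed min/max statistics.
struct Chunk {
    min: u32,
    max: u32,
    values: Vec<u32>,
}

impl Chunk {
    fn new(values: Vec<u32>) -> Self {
        // For an empty chunk min > max, so every range predicate prunes it.
        let min = values.iter().copied().min().unwrap_or(u32::MAX);
        let max = values.iter().copied().max().unwrap_or(u32::MIN);
        Self { min, max, values }
    }
}

/// Evaluate `age > lo AND age < hi`.
/// Returns the matching values and the number of chunks that had to be scanned.
fn scan_between(chunks: &[Chunk], lo: u32, hi: u32) -> (Vec<u32>, usize) {
    let mut matches = Vec::new();
    let mut scanned = 0;
    for chunk in chunks {
        // Pushdown: skip the chunk when its statistics rule out any match.
        if chunk.max <= lo || chunk.min >= hi {
            continue;
        }
        scanned += 1;
        matches.extend(chunk.values.iter().copied().filter(|&a| a > lo && a < hi));
    }
    (matches, scanned)
}

fn main() {
    let chunks = vec![
        Chunk::new(vec![18, 22, 25]),
        Chunk::new(vec![26, 31, 39]),
        Chunk::new(vec![41, 55, 60]),
    ];
    let (matches, scanned) = scan_between(&chunks, 25, 40);
    println!("matches = {:?}, chunks scanned = {} of {}", matches, scanned, chunks.len());
}
```

With the sample data, only the middle chunk overlaps the range (25, 40), so the other two are never touched.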

4. **analytics_pipeline.rs** - Complete e-commerce analytics pipeline:
   - Generating synthetic transaction data
   - Writing compressed data to disk
   - Computing total revenue with aggregations
   - Grouping by category
   - Filtering high-value transactions
   - Reading filtered data from disk

5. **stream_processing.rs** - Streaming sensor data processing:
   - Generating data in chunks (simulating real-time ingestion)
   - Processing chunks individually for memory efficiency
   - Writing chunked data to files
   - Streaming reads from files
   - Streaming with filters

6. **arrow_interop.rs** - Apache Arrow interoperability:
   - Converting Vortex arrays to Arrow
   - Converting Arrow arrays to Vortex
   - Round-trip conversion with data integrity verification
   - Working with Arrow RecordBatch

All examples include:
- Comprehensive documentation comments
- Clear explanations of what each section does
- Practical use cases that developers can adapt
- Proper error handling
- Run instructions

These examples help users understand how to:
- Build custom data pipelines
- Leverage Vortex's compression
- Integrate with the Arrow ecosystem
- Process large datasets efficiently
- Work with different data types and structures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Replace the generic stream processing example with a real-world,
production-ready example: a tracing subscriber that writes structured
logs to Vortex files.

This new example demonstrates three critical capabilities:

1. **Nested Data Structures**: Shows how Vortex handles complex nested
   data like tracing events with fields, spans, and metadata. Events
   are stored as structured records with:
   - Timestamps
   - Log levels (DEBUG, INFO, WARN, ERROR)
   - Targets (module paths)
   - Messages
   - Nested field data (serialized as JSON, demonstrating nested compression)

2. **Pluggability**: Demonstrates Vortex's integration into existing
   Rust ecosystems by implementing `tracing_subscriber::Layer<S>`.
   This is a real integration pattern that developers can use in
   production applications.

3. **Async Writing & Batching**: Implements a non-blocking async
   architecture with:
   - Unbounded channel for event collection
   - Background task for batch writing
   - Configurable batch sizes
   - Graceful shutdown with flush
   - Multiple file rotation
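
The batching pattern described above can be sketched with the standard library alone. This is a simplified stand-in, not the example's code: it assumes a plain OS thread instead of an async task, uses `std::sync::mpsc::channel` as the unbounded channel, and collects each full batch into a `Vec` where the real example would write one Vortex file. `BatchWriter` is a hypothetical name.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical batching writer. Each inner Vec stands in for one rotated output file.
struct BatchWriter {
    tx: Sender<String>,
    worker: JoinHandle<Vec<Vec<String>>>,
}

impl BatchWriter {
    /// Spawn the background worker with a configurable batch size (minimum 1).
    fn spawn(batch_size: usize) -> Self {
        let batch_size = batch_size.max(1);
        // `mpsc::channel` is unbounded: `send` never blocks the producer.
        let (tx, rx) = mpsc::channel::<String>();
        let worker = thread::spawn(move || {
            let mut files: Vec<Vec<String>> = Vec::new();
            let mut batch: Vec<String> = Vec::with_capacity(batch_size);
            // The loop ends once every sender has been dropped.
            for event in rx {
                batch.push(event);
                if batch.len() >= batch_size {
                    // "Rotate": close the current batch and start a new one.
                    files.push(std::mem::take(&mut batch));
                }
            }
            // Graceful shutdown: flush the partial batch.
            if !batch.is_empty() {
                files.push(batch);
            }
            files
        });
        Self { tx, worker }
    }

    /// Enqueue one event; a send error only means the worker has already stopped.
    fn send(&self, event: impl Into<String>) {
        let _ = self.tx.send(event.into());
    }

    /// Close the channel, wait for the worker, and return everything it "wrote".
    fn shutdown(self) -> Vec<Vec<String>> {
        let BatchWriter { tx, worker } = self;
        drop(tx);
        worker.join().expect("writer thread panicked")
    }
}

fn main() {
    let writer = BatchWriter::spawn(3);
    for i in 0..7 {
        writer.send(format!("event-{i}"));
    }
    let files = writer.shutdown();
    for (n, batch) in files.iter().enumerate() {
        println!("file {n}: {} events", batch.len());
    }
}
```

Dropping the sender is what signals shutdown here, so the final partial batch is flushed without a separate control message.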

Key implementation details:
- Custom `VortexLayer` that implements `tracing_subscriber::Layer`
- `FieldVisitor` to extract structured data from tracing events
- `WriterHandle` managing async batch writes
- Compression via `CompactCompressor`
- Demonstrates real-world data volumes and compression ratios

The example simulates HTTP request handling with database queries,
validation, and error conditions, generating roughly 30 or more trace
events that get batched and written to compressed Vortex files.

This is a much more compelling example than generic sensor data,
showing developers exactly how they could use Vortex in their own
logging/observability pipelines.

Dependencies added:
- tracing
- tracing-subscriber
- serde_json (for field serialization)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y a10y force-pushed the claude/add-rust-vortex-examples-011CUKbvknTvq9ymU8u4hYC6 branch from b31c18a to be65a41 October 21, 2025 14:26
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y a10y changed the title from "feat: add some fun examples in vortex crate" to "feat: add some more examples in vortex crate" Oct 21, 2025
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y a10y marked this pull request as ready for review October 21, 2025 14:42
@a10y a10y requested review from AdamGS and lwwmanning October 21, 2025 14:42
@connortsui20 connortsui20 merged commit 6b04887 into develop Oct 22, 2025
40 checks passed
@connortsui20 connortsui20 deleted the claude/add-rust-vortex-examples-011CUKbvknTvq9ymU8u4hYC6 branch October 22, 2025 19:27
Labels

changelog/feature A new feature documentation Improvements or additions to documentation

5 participants