A general-purpose, temporal-first entity mastering and conflict-resolution engine written in Rust.
Unirust provides precise temporal modeling, entity resolution, and conflict detection with strong guarantees about temporal correctness and auditability. It's designed to handle complex entity mastering scenarios where data comes from multiple sources with different perspectives and temporal validity periods.
Traditional entity resolution algorithms typically ignore temporal information, which leads to incorrect merges and lost historical context. When data from multiple sources arrives at different times or has overlapping validity periods, these approaches fail to maintain temporal consistency.
Customer Data Management
- Merge CRM, e-commerce, and support records while preserving interaction timelines
- Resolve conflicts when customers change contact information over time
- Maintain accurate customer journey mapping across touchpoints
Financial Services
- Consolidate trading accounts and positions from multiple systems with different update frequencies
- Track entity relationships and ownership changes for compliance and risk management
- Detect suspicious patterns through temporal entity evolution analysis
Identity and Access Management
- Merge user identities across systems while maintaining access history
- Resolve conflicts when user attributes change at different times
- Provide complete audit trails for security governance
Healthcare Systems
- Merge patient records from different hospitals while preserving medical history integrity
- Resolve conflicts when patient information is updated asynchronously
- Ensure temporal consistency for critical medical decisions
Master Data Management
- Create golden records that preserve temporal context and source attribution
- Handle data quality issues from asynchronous updates across systems
- Maintain lineage and provenance for regulatory compliance
- Temporal Model: Precise interval-based time modeling with Allen's interval relations
- High-Performance Entity Resolution: Optimized O(n) blocking algorithm with parallel processing
- Conflict Detection: Direct and indirect conflict detection with detailed reporting
- Knowledge Graph Export: JSONL, DOT, PNG, and SVG export formats
- Perspective Support: Multi-perspective data handling with configurable weights
- Audit Trail: Complete traceability of merges and conflicts
- Scalable Architecture: Designed to handle millions of entities with streaming and blocking
- Parallel Processing: Adaptive parallelization for loosely coupled entities
- Rust 1.70+
- Graphviz (optional, for visualization with the dot command)
- Git (for cloning and development)
git clone https://github.com/unirust/unirust.git
cd unirust
cargo build

Or add to your Cargo.toml:
[dependencies]
unirust = "0.1.0"

cargo run --example basic_example

This will demonstrate:
- Creating an ontology with identity keys and constraints
- Adding records from multiple perspectives
- Building clusters through optimized entity resolution
- Detecting conflicts
- Exporting knowledge graphs in multiple formats
cargo bench --bench entity_benchmark

This will run performance benchmarks with different entity counts (1000, 5000) and overlap probabilities (1%, 10%, 30%).
- temporal: Interval arithmetic and temporal relations
- model: Core data structures (Record, Descriptor, etc.)
- ontology: Identity keys, strong identifiers, and constraints
- dsu: Union-Find with temporal guards
- linker: High-performance entity resolution with parallel processing and blocking
- conflicts: Conflict detection and reporting
- graph: Knowledge graph export
- store: Record storage and indexing
- utils: Visualization and export utilities
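
To make the interval arithmetic and temporal relations concrete, here is a minimal, self-contained sketch of classifying two intervals by Allen's interval relations. The `Interval` and `AllenRelation` types, the half-open `[start, end)` convention, and the method names are assumptions for illustration; they are not taken from Unirust's actual `temporal` module.

```rust
use std::cmp::Ordering;

/// Half-open interval [start, end) over integer timestamps.
/// Hypothetical type for illustration; not Unirust's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval {
    pub start: i64,
    pub end: i64,
}

/// The thirteen Allen relations between two non-empty intervals.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllenRelation {
    Before, Meets, Overlaps, Starts, During, Finishes, Equals,
    FinishedBy, Contains, StartedBy, OverlappedBy, MetBy, After,
}

impl Interval {
    /// Returns None for empty or inverted intervals.
    pub fn new(start: i64, end: i64) -> Option<Self> {
        if start < end { Some(Self { start, end }) } else { None }
    }

    /// Classify how `self` relates to `other`.
    pub fn relation_to(&self, other: &Interval) -> AllenRelation {
        match (self.start.cmp(&other.start), self.end.cmp(&other.end)) {
            (Ordering::Equal, Ordering::Equal) => AllenRelation::Equals,
            (Ordering::Equal, Ordering::Less) => AllenRelation::Starts,
            (Ordering::Equal, Ordering::Greater) => AllenRelation::StartedBy,
            (Ordering::Less, Ordering::Equal) => AllenRelation::FinishedBy,
            (Ordering::Greater, Ordering::Equal) => AllenRelation::Finishes,
            (Ordering::Less, Ordering::Greater) => AllenRelation::Contains,
            (Ordering::Greater, Ordering::Less) => AllenRelation::During,
            // self starts and ends earlier: compare self.end with other.start.
            (Ordering::Less, Ordering::Less) => match self.end.cmp(&other.start) {
                Ordering::Less => AllenRelation::Before,
                Ordering::Equal => AllenRelation::Meets,
                Ordering::Greater => AllenRelation::Overlaps,
            },
            // self starts and ends later: compare self.start with other.end.
            (Ordering::Greater, Ordering::Greater) => match self.start.cmp(&other.end) {
                Ordering::Greater => AllenRelation::After,
                Ordering::Equal => AllenRelation::MetBy,
                Ordering::Less => AllenRelation::OverlappedBy,
            },
        }
    }

    /// Shared sub-interval, if the two intervals have any time in common.
    pub fn intersection(&self, other: &Interval) -> Option<Interval> {
        Interval::new(self.start.max(other.start), self.end.min(other.end))
    }
}
```

With half-open intervals, two intervals that "meet" are adjacent but share no instant, so their intersection is empty. That choice keeps back-to-back validity periods from being reported as overlapping.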
- Records: Temporal entities with descriptors and identity information
- Clusters: Groups of records representing the same logical entity
- Identity Keys: Attributes that must match for records to be considered the same entity
- Strong Identifiers: Attributes that prevent merging when they conflict
- Temporal Guards: Validation that merges only occur when temporal constraints are satisfied
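
As a rough illustration of how clusters and temporal guards can interact, the sketch below shows a union-find structure that refuses to merge two records unless their validity intervals overlap. The `TemporalDsu` type, the `union_guarded` method, and the overlap rule itself are assumptions made for this example; Unirust's own `dsu` module may apply different or richer constraints.

```rust
/// Union-find over record indices with a simple temporal guard.
/// Hypothetical sketch; not Unirust's actual `dsu` module.
pub struct TemporalDsu {
    parent: Vec<usize>,
    size: Vec<usize>,
    /// Half-open validity interval (start, end) for each record.
    valid: Vec<(i64, i64)>,
}

impl TemporalDsu {
    /// Every record starts in its own singleton cluster.
    pub fn new(valid: Vec<(i64, i64)>) -> Self {
        let n = valid.len();
        Self { parent: (0..n).collect(), size: vec![1; n], valid }
    }

    /// Find the cluster representative, shortening paths along the way.
    pub fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grandparent = self.parent[self.parent[x]];
            self.parent[x] = grandparent; // path halving
            x = grandparent;
        }
        x
    }

    /// Merge the clusters of `a` and `b` only if the two records'
    /// validity intervals overlap (the assumed guard).
    /// Returns false when the guard rejects the merge.
    pub fn union_guarded(&mut self, a: usize, b: usize) -> bool {
        let (a_start, a_end) = self.valid[a];
        let (b_start, b_end) = self.valid[b];
        if a_start.max(b_start) >= a_end.min(b_end) {
            return false; // no shared time: guard fails, clusters untouched
        }
        let (mut root_a, mut root_b) = (self.find(a), self.find(b));
        if root_a == root_b {
            return true; // already in the same cluster
        }
        // Union by size: attach the smaller tree under the larger one.
        if self.size[root_a] < self.size[root_b] {
            std::mem::swap(&mut root_a, &mut root_b);
        }
        self.parent[root_b] = root_a;
        self.size[root_a] += self.size[root_b];
        true
    }
}
```

Checking the guard before touching the parent array means a rejected merge leaves the structure unchanged, so a failed attempt needs no rollback.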
use unirust::*;
// Create ontology
let mut ontology = Ontology::new();
let name_attr = AttrId(0);
let email_attr = AttrId(1);
let identity_key = IdentityKey::new(vec![name_attr, email_attr], "name_email".to_string());
ontology.add_identity_key(identity_key);
// Create store and add records
let mut store = Store::new();
// ... add records ...
// Build clusters
let clusters = linker::build_clusters(&store, &ontology)?;
// Detect conflicts
let observations = conflicts::detect_conflicts(&store, &clusters, &ontology)?;
// Export knowledge graph
let graph = graph::export_graph(&store, &clusters, &observations, &ontology)?;

The library includes utilities for generating visual representations:
use unirust::utils;
// Export to DOT format
let dot = utils::export_to_dot(&store, &clusters, &observations, &ontology)?;
// Generate PNG/SVG visualizations
utils::generate_graph_visualizations(&store, &clusters, &observations, &ontology, "output")?;

The linker uses several optimization techniques for high performance:
- Blocking Algorithm: Groups records by identity key values so that only records within the same block are compared, reducing the O(n²) all-pairs comparison to roughly O(n) when blocks stay small
- Streaming Processing: Processes edges in streams to maintain O(1) memory usage
- Parallel Processing: Adaptive parallelization for loosely coupled entities with high overlap
- Smart Thresholding: Automatically chooses sequential vs parallel processing based on block size
Benchmark results demonstrate improved performance, particularly for high-overlap scenarios with parallel processing.
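
The blocking idea described above can be sketched in a few lines. This is a simplified illustration, not the linker's implementation: the `Rec` struct, the choice of name and email as the identity key, and the function names `block_by_key` and `candidate_pairs` are all assumptions.

```rust
use std::collections::HashMap;

/// Minimal record with the two attributes used as the identity key.
/// Hypothetical type; Unirust's real Record carries descriptors and intervals.
pub struct Rec {
    pub id: u32,
    pub name: String,
    pub email: String,
}

/// Group record ids by their identity key values in one pass over the input.
pub fn block_by_key(records: &[Rec]) -> HashMap<(String, String), Vec<u32>> {
    let mut blocks: HashMap<(String, String), Vec<u32>> = HashMap::new();
    for r in records {
        blocks
            .entry((r.name.clone(), r.email.clone()))
            .or_default()
            .push(r.id);
    }
    blocks
}

/// Generate candidate pairs only within each block, never across blocks.
/// Pairs are normalized to (smaller id, larger id) and sorted for stable output.
pub fn candidate_pairs(blocks: &HashMap<(String, String), Vec<u32>>) -> Vec<(u32, u32)> {
    let mut pairs = Vec::new();
    for ids in blocks.values() {
        for (i, &a) in ids.iter().enumerate() {
            for &b in &ids[i + 1..] {
                pairs.push((a.min(b), a.max(b)));
            }
        }
    }
    pairs.sort_unstable();
    pairs
}
```

Building the blocks is a single linear pass. Pair generation is still quadratic inside each block, so the overall cost stays close to linear only when no single key value collects a large share of the records.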
See the examples/ directory for complete working examples:
basic_example.rs: Simple entity resolution with conflict detection and visualization
cargo test

cargo doc --open

This project is licensed under the MIT License - see the LICENSE file for details.
We welcome contributions! Please see our Contributing Guidelines for details on how to:
- Report bugs and request features
- Set up a development environment
- Submit pull requests
- Follow our coding standards
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and add tests
- Run the test suite: cargo test
- Commit your changes: git commit -m 'Add amazing feature'
- Push to your branch: git push origin feature/amazing-feature
- Open a Pull Request
- 📖 Documentation (coming soon)
- 🐛 Issue Tracker
- 💬 Discussions (coming soon)
- Inspired by temporal entity resolution research
- Built with the Rust ecosystem and community
- Thanks to all contributors who help improve Unirust
