CarDamm/xem-log

xem-log

A high-performance, asynchronous log ingestion service written in Rust. xem-log receives OTLP/gRPC log streams, parses them with zero-copy techniques, and persists them as partitioned Apache Parquet files in S3-compatible storage.

Architecture Overview

The service acts as a bridge between high-frequency log producers and long-term analytical storage. It minimizes overhead by leveraging the Rust ownership model and the Tokio asynchronous runtime.

  1. Ingestion: Receives logs via gRPC using the OpenTelemetry Protocol (OTLP).
  2. Processing: Implements zero-copy parsing to transform raw protobuf data into structured internal formats without unnecessary allocations.
  3. Buffering: Manages an in-memory buffer that flushes when a configurable time interval elapses or a batch-size threshold is reached.
  4. Storage: Encodes batches into Apache Parquet (columnar format) and uploads them to S3/MinIO for efficient downstream analysis.
  5. Observability: Provides real-time telemetry via a dedicated Prometheus metrics endpoint.
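
The buffering step (3) can be sketched as follows. This is a minimal, standard-library-only illustration, not the project's actual implementation: the `LogBuffer` type, its method names, and the use of `String` records are all assumptions made for the example, and a real service would drive `tick` from a Tokio timer.

```rust
use std::time::{Duration, Instant};

/// Hypothetical in-memory buffer that flushes on batch size or record age.
struct LogBuffer {
    records: Vec<String>,
    max_records: usize,      // batch-size threshold
    max_age: Duration,       // time-interval threshold
    oldest: Option<Instant>, // arrival time of the oldest buffered record
}

impl LogBuffer {
    fn new(max_records: usize, max_age: Duration) -> Self {
        Self { records: Vec::new(), max_records, max_age, oldest: None }
    }

    /// Adds a record; returns the full batch once the size threshold is reached.
    fn push(&mut self, record: String, now: Instant) -> Option<Vec<String>> {
        if self.oldest.is_none() {
            self.oldest = Some(now);
        }
        self.records.push(record);
        if self.records.len() >= self.max_records {
            Some(self.take())
        } else {
            None
        }
    }

    /// Called periodically; returns a batch if the oldest record has waited too long.
    fn tick(&mut self, now: Instant) -> Option<Vec<String>> {
        let expired = self
            .oldest
            .map_or(false, |t| now.duration_since(t) >= self.max_age);
        if expired { Some(self.take()) } else { None }
    }

    /// Hands the buffered records to the caller and resets the buffer.
    fn take(&mut self) -> Vec<String> {
        self.oldest = None;
        std::mem::take(&mut self.records)
    }
}

fn main() {
    let start = Instant::now();
    let mut buf = LogBuffer::new(2, Duration::from_secs(5));

    buf.push("first".to_string(), start);
    if let Some(batch) = buf.push("second".to_string(), start) {
        println!("size-triggered flush of {} records", batch.len());
    }

    buf.push("third".to_string(), start);
    if let Some(batch) = buf.tick(start + Duration::from_secs(5)) {
        println!("time-triggered flush of {} records", batch.len());
    }
}
```

Passing `now` in explicitly keeps the flush logic independent of the clock, which makes both triggers easy to exercise in unit tests.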

Technical Highlights

  • Asynchronous I/O: Fully powered by tokio and tonic for non-blocking network operations.
  • Zero-Copy Design: Uses serde and careful memory management to avoid unnecessary allocations during parsing, keeping throughput high and CPU usage low.
  • Columnar Efficiency: Direct conversion to Parquet keeps stored logs compact through columnar encoding and compression, and lets query engines read only the columns they need.
  • Containerized & Orchestrated: Multi-stage Docker builds ensure a minimal runtime footprint (Debian Slim), with full orchestration via Docker Compose.
  • CI/CD Ready: Integrated GitHub Actions pipeline for automated linting, unit testing, and integration testing with localized infrastructure.
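
To illustrate the zero-copy idea in plain Rust: a parsed record can hold string slices that borrow from the input buffer instead of owning copies. The sketch below is an assumption-laden stand-in, not the project's parser. It uses a made-up `SEVERITY body` line format rather than OTLP protobuf, and `LogRecord`, `parse_line`, and `parse_all` are hypothetical names.

```rust
/// A parsed record whose fields borrow from the input buffer.
/// No `String` is allocated per field.
#[derive(Debug, PartialEq)]
struct LogRecord<'a> {
    severity: &'a str,
    body: &'a str,
}

/// Parses one `SEVERITY<space>body` line without copying any bytes.
/// The line format is illustrative only; the real payload is OTLP protobuf.
fn parse_line(line: &str) -> Option<LogRecord<'_>> {
    let (severity, body) = line.split_once(' ')?;
    Some(LogRecord { severity, body })
}

/// Parses every well-formed line; malformed lines are skipped.
fn parse_all(input: &str) -> Vec<LogRecord<'_>> {
    input.lines().filter_map(parse_line).collect()
}

fn main() {
    let input = String::from("INFO service started\nERROR disk full\n");
    let records = parse_all(&input);

    for r in &records {
        println!("{:>5} | {}", r.severity, r.body);
    }

    // Each field points into `input`, so its address lies inside that buffer.
    let offset = records[1].body.as_ptr() as usize - input.as_ptr() as usize;
    println!("second body starts at byte offset {offset} of the input");
}
```

The lifetime `'a` ties each record to the buffer it was parsed from, so the compiler rejects any attempt to use a record after its buffer is dropped.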

Getting Started

Prerequisites

  • Rust Toolchain (1.75+ recommended)
  • Docker and Docker Compose

Local Deployment

  1. Configure Environment: Set up the required variables.

    cp .env.example .env
  2. Start Infrastructure: Spin up MinIO (S3 Emulator) and Prometheus.

    cd infra
    docker-compose up -d
  3. Launch xem-log: Build and run the ingester.

    docker-compose up --build -d

The service will be available at:

  • gRPC Ingest: localhost:4317
  • Prometheus Metrics: localhost:9091/metrics
  • MinIO Console: localhost:9001 (Credentials: minioadmin / minioadmin)

Configuration

Configuration is managed via environment variables. You can customize the behavior by editing the .env file in the root directory:

Variable               Description                               Default
XEMLOG_GRPC_PORT       Port for the OTLP/gRPC listener           4317
XEMLOG_METRICS_PORT    Port for the Prometheus metrics server    9090
XEMLOG_S3_ENDPOINT     URL for S3-compatible storage             http://host.docker.internal:9000
XEMLOG_S3_BUCKET       Target bucket for Parquet files           xemlog-bucket
BATCH_SIZE_THRESHOLD   Max logs in memory before flushing        1000
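
As a rough sketch of how these variables might be read with their documented defaults, using only the standard library: the `Config` struct, its field names, and `from_lookup` are assumptions for illustration and may not match how xem-log loads its configuration.

```rust
use std::env;
use std::str::FromStr;

/// Hypothetical typed view of the variables in the table above.
#[derive(Debug, PartialEq)]
struct Config {
    grpc_port: u16,
    metrics_port: u16,
    s3_endpoint: String,
    s3_bucket: String,
    batch_size_threshold: usize,
}

impl Config {
    /// Builds a config from any key lookup, falling back to the documented defaults.
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        // Parses a numeric variable, or uses the default when it is unset.
        fn num<T: FromStr>(key: &str, raw: Option<String>, default: T) -> Result<T, String> {
            match raw {
                Some(v) => v
                    .trim()
                    .parse()
                    .map_err(|_| format!("{key}: invalid value {v:?}")),
                None => Ok(default),
            }
        }

        Ok(Self {
            grpc_port: num("XEMLOG_GRPC_PORT", get("XEMLOG_GRPC_PORT"), 4317)?,
            metrics_port: num("XEMLOG_METRICS_PORT", get("XEMLOG_METRICS_PORT"), 9090)?,
            s3_endpoint: get("XEMLOG_S3_ENDPOINT")
                .unwrap_or_else(|| "http://host.docker.internal:9000".to_string()),
            s3_bucket: get("XEMLOG_S3_BUCKET").unwrap_or_else(|| "xemlog-bucket".to_string()),
            batch_size_threshold: num(
                "BATCH_SIZE_THRESHOLD",
                get("BATCH_SIZE_THRESHOLD"),
                1000,
            )?,
        })
    }

    /// Reads the process environment.
    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| env::var(key).ok())
    }
}

fn main() {
    match Config::from_env() {
        Ok(cfg) => println!("{cfg:#?}"),
        Err(e) => eprintln!("config error: {e}"),
    }
}
```

Routing the lookup through a closure means the parsing and default handling can be tested without mutating the process environment.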

Future Roadmap

The project is designed for extensibility. Planned features include:

  • DuckDB Analytical CLI: A lightweight companion tool to perform SQL queries directly on the S3 Parquet files without requiring a full OLAP database.
  • Continuous Delivery (CD) Pipeline: Automated GitHub Actions to build and push optimized images to GitHub Container Registry (GHCR).
  • Write-Ahead Log (WAL): Implementation of a local persistent buffer to ensure zero data loss in the event of an unexpected service interruption.
  • Dynamic Filtering: A DSL (Domain Specific Language) to filter or mask sensitive log data before it reaches the storage layer.
  • S3 Partitioning Strategy: Enhanced path logic to organize files by YYYY/MM/DD/HH for optimized data discovery.
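
For the planned YYYY/MM/DD/HH layout, one possible way to derive an object key from a Unix timestamp (UTC) is sketched below. Since the feature is still on the roadmap, everything here is hypothetical: the function names, the `logs` prefix, and the file name are placeholders, and the date conversion uses the widely known days-to-civil-date algorithm rather than a date crate.

```rust
/// Converts days since 1970-01-01 into a (year, month, day) civil date.
/// Restricted to dates on or after the Unix epoch, so unsigned math suffices.
fn civil_from_days(days: u64) -> (u64, u64, u64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z / 146_097; // 400-year cycles
    let doe = z % 146_097; // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, starting March 1
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + u64::from(month <= 2);
    (year, month, day)
}

/// Builds an object key of the form `prefix/YYYY/MM/DD/HH/file_name`.
fn partition_key(prefix: &str, unix_secs: u64, file_name: &str) -> String {
    let (y, m, d) = civil_from_days(unix_secs / 86_400);
    let hour = (unix_secs % 86_400) / 3_600;
    format!("{prefix}/{y:04}/{m:02}/{d:02}/{hour:02}/{file_name}")
}

fn main() {
    // Placeholder prefix and file name, chosen only for the example.
    let key = partition_key("logs", 1_700_000_000, "batch-0001.parquet");
    println!("{key}");
}
```

Zero-padded path segments sort lexicographically in time order, which lets tools list a time range with a simple prefix scan.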

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for the full text.
