Log Analyzer for YugabyteDB

A modern, maintainable, and efficient log analysis tool for YugabyteDB support bundles. This version follows best practices including proper separation of concerns, comprehensive error handling, type hints, and clean architecture.

πŸš€ Features

  • Support Bundle Analysis: Extract and analyze YugabyteDB support bundles
  • Parquet File Analysis: Process log data stored in Parquet format
  • Pattern Matching: Configurable regex patterns for log message analysis
  • Parallel Processing: Multi-threaded analysis for improved performance
  • Web Interface: Flask-based web server for viewing reports
  • Database Storage: PostgreSQL integration for report persistence
  • Comprehensive Logging: Structured logging with colorized output
  • Type Safety: Full type hints throughout the codebase

πŸ“‹ Requirements

  • Python 3.8+
  • PostgreSQL 12+
  • DuckDB (for Parquet analysis)

πŸ› οΈ Installation

  1. Clone the repository:

    git clone <repository-url>
    cd log_analyzer
  2. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Set up configuration:

    # Copy example configuration files
    cp db_config.json.example db_config.json
    cp server_config.json.example server_config.json
    
    # Edit configuration files with your settings
    nano db_config.json
    nano server_config.json
  5. Set up database:

    # Run the schema.sql file in your PostgreSQL database
    psql -d your_database -f schema.sql

πŸ—οΈ Architecture

The codebase follows a clean architecture pattern with clear separation of concerns:

log_analyzer/
β”œβ”€β”€ config/                 # Configuration management
β”‚   └── settings.py        # Centralized settings
β”œβ”€β”€ models/                # Data models
β”‚   └── log_metadata.py    # Type-safe data structures
β”œβ”€β”€ services/              # Business logic services
β”‚   β”œβ”€β”€ analysis_service.py    # Main analysis orchestration
β”‚   β”œβ”€β”€ database_service.py    # Database operations
β”‚   β”œβ”€β”€ file_processor.py      # File handling
β”‚   β”œβ”€β”€ pattern_matcher.py     # Pattern matching
β”‚   └── parquet_service.py     # Parquet analysis
β”œβ”€β”€ utils/                 # Utilities and helpers
β”‚   β”œβ”€β”€ exceptions.py      # Custom exceptions
β”‚   └── logging_config.py  # Logging configuration
β”œβ”€β”€ webserver/             # Web interface
β”‚   β”œβ”€β”€ app.py             # Flask app
β”‚   β”œβ”€β”€ static/            # Static files
β”‚   └── templates/         # HTML templates
β”œβ”€β”€ lib/                   # Legacy library modules
β”œβ”€β”€ tests/                 # Test suite
β”œβ”€β”€ log_analyzer.py        # Main application
└── requirements.txt       # Dependencies
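
To make the layout above concrete, here is a rough sketch of how the entry point could wire these services together. The module paths match the tree, but the class names, constructors, and method names are assumptions about the internal API, not a documented interface.

# Hypothetical wiring of the services above; all signatures are assumed.
from config.settings import Settings                   # assumed class name
from services.analysis_service import AnalysisService  # assumed class name
from services.database_service import DatabaseService  # assumed class name
from utils.logging_config import setup_logging         # assumed helper

def run(bundle_path: str) -> None:
    setup_logging()
    settings = Settings()                  # reads db_config.json, log_conf.yml, ...
    analyzer = AnalysisService(settings)   # orchestrates extraction and pattern matching
    database = DatabaseService(settings)   # persists the report to PostgreSQL
    report = analyzer.analyze_bundle(bundle_path)
    database.save_report(report)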

πŸš€ Usage

Command Line Interface

Analyze Support Bundle

# Basic analysis
python log_analyzer.py -s support_bundle.tar.gz

# With custom time range
python log_analyzer.py -s support_bundle.tar.gz \
  -t "1231 10:30" -T "1231 23:59"

# With node and log type filters
python log_analyzer.py -s support_bundle.tar.gz \
  -n "n1,n2" --types "pg,ts"

# With custom patterns
python log_analyzer.py -s support_bundle.tar.gz \
  --histogram-mode "error1,error2,error3"

# Parallel processing
python log_analyzer.py -s support_bundle.tar.gz \
  -p 8

Analyze Parquet Files

# Analyze Parquet directory
python log_analyzer.py --parquet_files /path/to/parquet/dir

# With custom patterns
python log_analyzer.py --parquet_files /path/to/parquet/dir \
  --histogram-mode "error1,error2,error3"

# Parallel processing
python log_analyzer.py --parquet_files /path/to/parquet/dir \
  -p 8
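
Parquet analysis is backed by DuckDB (see Requirements). The standalone sketch below shows the general shape of such a query; the column names (log_time, message) and the pattern are placeholders, not the tool's actual schema or queries.

# DuckDB sketch for scanning Parquet logs; column names are assumed.
import duckdb

con = duckdb.connect()  # in-memory database
rows = con.execute(
    """
    SELECT date_trunc('hour', log_time) AS hour, count(*) AS hits
    FROM read_parquet('/path/to/parquet/dir/*.parquet')
    WHERE regexp_matches(message, 'Tablet.*not found')
    GROUP BY hour
    ORDER BY hour
    """
).fetchall()

for hour, hits in rows:
    print(hour, hits)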

Web Interface

  1. Start the web server:

    python webserver/app.py
  2. Access the web interface: Open your browser and navigate to http://localhost:5000
  3. View reports:

    • Browse all reports on the main page
    • Click on any report to view detailed analysis
    • Use the search functionality to find specific reports

## βš™οΈ Configuration

### Database Configuration (`db_config.json`)
```json
{
  "host": "localhost",
  "port": 5432,
  "dbname": "log_analyzer",
  "user": "postgres",
  "password": "your_password"
}

Server Configuration (server_config.json)

{
  "host": "127.0.0.1",
  "port": 5000
}

Log Configuration (log_conf.yml)

universe:
  log_messages:
    - name: "tablet_not_found"
      pattern: "Tablet.*not found"
      solution: "Check tablet distribution and replication"
    - name: "leader_not_ready"
      pattern: "Leader.*not ready"
      solution: "Check leader election and consensus"

pg:
  log_messages:
    - name: "connection_error"
      pattern: "connection.*failed"
      solution: "Check network connectivity and firewall rules"

πŸ§ͺ Testing

Run the test suite:

# Run all tests
pytest

# Run with coverage
pytest --cov=.

# Run specific test file
pytest tests/test_analysis_service.py

πŸ”§ Development

Adding New Features

  1. Create new service:

    # services/new_service.py
    from typing import Any, Dict
    
    from utils.exceptions import AnalysisError
    
    class NewService:
        def __init__(self) -> None:
            pass
        
        def process_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
            # Implementation; raise AnalysisError on unrecoverable failures
            return data
  2. Add tests:

    # tests/test_new_service.py
    import pytest
    from services.new_service import NewService
    
    def test_new_service():
        service = NewService()
        result = service.process_data({"test": "data"})
        assert result is not None

πŸ“Š Performance

This version includes several performance improvements:

  • Parallel Processing: Multi-threaded analysis for large support bundles
  • Efficient File Handling: Streaming file processing to reduce memory usage
  • Database Optimization: Prepared statements and connection pooling
  • Caching: Pattern compilation caching for repeated analysis (see the sketch below)
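
The caching item above can be as simple as memoizing regex compilation so that repeated analyses reuse already-compiled patterns; a minimal sketch using only the standard library:

# Sketch: memoize regex compilation with functools.lru_cache.
import re
from functools import lru_cache

@lru_cache(maxsize=None)
def compiled_pattern(pattern: str) -> re.Pattern:
    """Compile each distinct pattern once and reuse it across files."""
    return re.compile(pattern)

# Repeated calls with the same pattern return the cached object.
assert compiled_pattern("Tablet.*not found") is compiled_pattern("Tablet.*not found")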

πŸ”’ Error Handling

This version includes comprehensive error handling:

  • Custom Exceptions: Domain-specific exception classes (illustrated after this list)
  • Graceful Degradation: Continue processing even if some files fail
  • Detailed Logging: Structured logging with different levels
  • User-Friendly Messages: Clear error messages for end users
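
For example, utils/exceptions.py exposes AnalysisError (imported in the Development section below). A domain-specific hierarchy of that kind typically looks like the sketch here; only AnalysisError is confirmed by this README, the other names are illustrative.

# Illustrative exception hierarchy; only AnalysisError appears elsewhere in this README.
class LogAnalyzerError(Exception):
    """Base class for all log_analyzer errors."""

class AnalysisError(LogAnalyzerError):
    """Raised when a support bundle or Parquet directory cannot be analyzed."""

class DatabaseError(LogAnalyzerError):
    """Raised when persisting or loading a report fails."""

try:
    raise AnalysisError("no matching log files in bundle")
except LogAnalyzerError as exc:
    print(f"analysis failed: {exc}")  # user-friendly message; processing can continue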

πŸ“ˆ Monitoring

The application includes built-in monitoring capabilities:

  • Progress Tracking: Real-time progress bars for long-running operations
  • Performance Metrics: Timing information for different analysis phases (see the sketch after this list)
  • Resource Usage: Memory and CPU usage monitoring
  • Error Tracking: Detailed error logs with stack traces
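
As a rough illustration of the timing metrics above (not the tool's actual instrumentation), per-phase timings can be captured with a small context manager:

# Sketch: time an analysis phase with a context manager (illustrative only).
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)

@contextmanager
def timed_phase(name):
    start = time.perf_counter()
    try:
        yield
    finally:
        logging.info("%s took %.2fs", name, time.perf_counter() - start)

with timed_phase("pattern matching"):
    sum(i * i for i in range(1_000_000))  # stand-in for real work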

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/new-feature
  3. Make your changes following the coding standards
  4. Add tests for new functionality
  5. Run the test suite: pytest
  6. Submit a pull request

Coding Standards

  • Use type hints throughout
  • Follow PEP 8 style guidelines
  • Write comprehensive docstrings
  • Add tests for new functionality
  • Use meaningful variable and function names

πŸ”„ Migration from Original Version

This version maintains backward compatibility with the original:

  1. Same Command Line Interface: All original arguments are supported
  2. Same Output Format: Reports are generated in the same JSON format
  3. Same Web Interface: The web UI remains functionally identical
  4. Configuration Files: Existing configuration files work without changes

Key Improvements

  • Better Error Handling: More informative error messages
  • Improved Performance: Faster processing with parallel execution
  • Enhanced Logging: Better visibility into analysis progress
  • Type Safety: Reduced bugs through static type checking
  • Maintainability: Cleaner code structure for easier maintenance
