feat: Implement StructuredLoggingAssessor #79

@jeremyeder

Attribute Definition

Attribute ID: structured_logging (Attribute #18 - Tier 3)

Definition: Logging in structured format (JSON) with consistent field names and types.

Why It Matters: Structured logs are machine-parseable, so an AI agent can query them directly to diagnose issues, identify patterns, suggest optimizations, and correlate events across distributed systems.

Impact on Agent Behavior:

  • Log query and analysis capabilities
  • Event correlation across services
  • Pattern identification for debugging
  • Data-driven optimization suggestions
  • Anomaly detection

Measurable Criteria:

  • Use structured logging library: structlog (Python), winston (Node.js), zap (Go)
  • Standard fields across all logs:
    • timestamp (ISO 8601 format)
    • level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
    • message (human-readable)
    • context: request_id, user_id, session_id, trace_id
  • Consistent naming convention (snake_case or camelCase, not both)
  • Log levels used appropriately
  • Never log sensitive data: passwords, tokens, credit cards, PII (without anonymization)
  • JSON format for production
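The "never log sensitive data" criterion could be approximated with a keyword scan over log call lines. This is a minimal sketch, not part of AgentReady: the key list and the helper name find_sensitive_log_lines are assumptions, and the scan only sees single-line calls.

```python
import re

# Hypothetical list of field names treated as sensitive; tune per project.
SENSITIVE_KEYS = ("password", "passwd", "secret", "token", "api_key", "credit_card")

# Matches common logging calls such as logger.info(...) or log.error(...).
LOG_CALL = re.compile(
    r"\b(?:logger|log|logging)\.(?:debug|info|warning|error|critical|exception)\s*\(",
    re.IGNORECASE,
)
SENSITIVE = re.compile("|".join(SENSITIVE_KEYS), re.IGNORECASE)


def find_sensitive_log_lines(source: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for log calls that mention a sensitive key."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if LOG_CALL.search(line) and SENSITIVE.search(line):
            hits.append((number, line.strip()))
    return hits


sample = '''
logger.info("user_login", user_id=user_id)
logger.debug("auth attempt", password=password)
print("token refreshed")
'''
print(find_sensitive_log_lines(sample))
```

A keyword match is only a hint: it flags field names, not values, so it will miss secrets logged under innocuous names and may flag harmless fields such as token_count.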

Implementation Requirements

File Location: src/agentready/assessors/code_quality.py

Class Name: StructuredLoggingAssessor

Tier: 3 (Important)

Default Weight: 0.015 (1.5% of total score)

Assessment Logic

Scoring Approach: Detect structured logging library usage and validate log format

Evidence to Check (score components):

  1. Structured logging library present (50%)

    • Python: structlog, python-json-logger
    • JavaScript: winston, pino, bunyan
    • Go: zap, zerolog
    • Check dependencies (package.json, pyproject.toml, go.mod)
  2. Logging configuration (30%)

    • Look for logging config files
    • Check for JSON formatter configuration
    • Verify consistent field names
  3. Code usage patterns (20%)

    • Sample Python/JS files for logging calls
    • Check if context fields included (request_id, etc.)
    • Verify no sensitive data in log statements
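Component 1 (library present) could be checked roughly as follows. This is a sketch under assumptions: the function name find_structured_libs is hypothetical, only PEP 621 style [project] dependencies are read from pyproject.toml, and go.mod is matched by substring on the module paths go.uber.org/zap and github.com/rs/zerolog.

```python
import json
import re
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: third-party backport, same API
    import tomli as tomllib

STRUCTURED_LIBS = {
    "python": {"structlog", "python-json-logger"},
    "javascript": {"winston", "pino", "bunyan"},
    "go": {"go.uber.org/zap", "github.com/rs/zerolog"},
}


def find_structured_libs(repo: Path) -> set[str]:
    """Collect known structured logging libraries declared in dependency files."""
    found: set[str] = set()

    package_json = repo / "package.json"
    if package_json.is_file():
        data = json.loads(package_json.read_text(encoding="utf-8"))
        declared = set(data.get("dependencies", {})) | set(data.get("devDependencies", {}))
        found |= declared & STRUCTURED_LIBS["javascript"]

    pyproject = repo / "pyproject.toml"
    if pyproject.is_file():
        data = tomllib.loads(pyproject.read_text(encoding="utf-8"))
        # PEP 621 stores requirement strings such as "structlog>=24.1".
        for requirement in data.get("project", {}).get("dependencies", []):
            name = re.split(r"[\s<>=!~\[;(]", requirement.strip(), maxsplit=1)[0].lower()
            if name in STRUCTURED_LIBS["python"]:
                found.add(name)

    go_mod = repo / "go.mod"
    if go_mod.is_file():
        text = go_mod.read_text(encoding="utf-8")
        found |= {module for module in STRUCTURED_LIBS["go"] if module in text}

    return found
```

Calling find_structured_libs(Path(".")) on a repository root returns the set of matches; an empty set would map to lib_score = 0. Projects that declare dependencies elsewhere (requirements.txt, Poetry tables) would need extra readers.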

Scoring Logic:

lib_score = 100 if structured_lib_found else 0
config_score = 100 if logging_config_found else 50
usage_score = calculate_usage_quality(sample_log_calls)

total_score = (lib_score * 0.5) + (config_score * 0.3) + (usage_score * 0.2)

status = "pass" if total_score >= 75 else "fail"
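To make the weighting concrete, here is the same formula as a runnable function with a few worked inputs. The function names and the sample usage scores are illustrative assumptions; the weights, the 50-point default for a missing config, and the 75-point threshold come from the logic above.

```python
PASS_THRESHOLD = 75.0


def total_score(structured_lib_found: bool, logging_config_found: bool, usage_score: float) -> float:
    """Weighted score: library 50%, configuration 30%, usage patterns 20%."""
    lib_score = 100.0 if structured_lib_found else 0.0
    # A missing config still earns 50, so it contributes 15 points to the total.
    config_score = 100.0 if logging_config_found else 50.0
    return lib_score * 0.5 + config_score * 0.3 + usage_score * 0.2


def status_for(score: float) -> str:
    return "pass" if score >= PASS_THRESHOLD else "fail"


# Library and config present, usage quality 75: 50 + 30 + 15
strong = total_score(True, True, 75.0)
# Nothing found, usage quality 25: 0 + 15 + 5
weak = total_score(False, False, 25.0)
# Library present, no config, usage quality 40: 50 + 15 + 8
partial = total_score(True, False, 40.0)

for label, score in (("strong", strong), ("weak", weak), ("partial", partial)):
    print(label, round(score, 1), status_for(score))
```

Because a missing config still scores 50, the lowest possible total is 15 rather than 0, and a repository with a library but no config needs a usage score of at least 50 to pass.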

Code Pattern to Follow

Reference: Similar to TypeAnnotationsAssessor for language-specific checks

Pattern:

  1. Check is_applicable() for supported languages
  2. Inspect dependency files for structured logging libraries
  3. Search codebase for logging configuration
  4. Sample log statements to verify structured format
  5. Calculate proportional score
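Step 3 (searching for logging configuration) might look like the sketch below. The function name detect_logging_config and the JSON_HINTS substrings are assumptions chosen for illustration; the file names are the ones listed under Implementation Notes.

```python
from pathlib import Path

# File names taken from the Implementation Notes in this issue.
CONFIG_FILENAMES = ("logging.conf", "logger.py", "winston.config.js")

# Assumed substrings that suggest a JSON formatter is configured:
# python-json-logger's JsonFormatter, structlog's JSONRenderer, winston's format.json.
JSON_HINTS = ("JsonFormatter", "JSONRenderer", "format.json")


def detect_logging_config(repo: Path) -> dict:
    """Find logging config files and check whether any mentions a JSON formatter."""
    config_files = sorted(
        path
        for path in repo.rglob("*")
        if path.is_file() and path.name in CONFIG_FILENAMES
    )
    json_formatter = any(
        hint in path.read_text(encoding="utf-8", errors="ignore")
        for path in config_files
        for hint in JSON_HINTS
    )
    return {
        "config_found": bool(config_files),
        "json_formatter": json_formatter,
        "files": [str(path.relative_to(repo)) for path in config_files],
    }
```

The result's config_found flag would feed config_score, and the files list can go straight into the Finding evidence. A real implementation would probably skip vendored directories such as node_modules or .venv before walking the tree.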

Example Finding Responses

Pass (Score: 95)

Finding(
    attribute=self.attribute,
    status="pass",
    score=95.0,
    measured_value="structlog configured",
    threshold="structured logging library",
    evidence=[
        "structlog found in dependencies",
        "JSON formatter configured in logging.conf",
        "Sampled 15 log calls: all use structured format",
        "Context fields present: request_id, user_id, duration",
    ],
    remediation=None,
    error_message=None,
)

Fail (Score: 20)

Finding(
    attribute=self.attribute,
    status="fail",
    score=20.0,
    measured_value="unstructured logging",
    threshold="structured logging library",
    evidence=[
        "No structured logging library found",
        "Using built-in logging or print statements",
        "Sampled 10 log calls: all unstructured strings",
        "No consistent log format detected",
    ],
    remediation=self._create_remediation(),
    error_message=None,
)

Not Applicable

Finding.not_applicable(
    self.attribute,
    reason="No logging usage detected in codebase"
)

Registration

Add to src/agentready/services/scanner.py in create_all_assessors():

from ..assessors.code_quality import (
    TypeAnnotationsAssessor,
    CyclomaticComplexityAssessor,
    StructuredLoggingAssessor,  # Add this import
)

def create_all_assessors() -> List[BaseAssessor]:
    return [
        # ... existing assessors ...
        StructuredLoggingAssessor(),  # Add this line
    ]

Testing Guidance

Test File: tests/unit/test_assessors_code_quality.py

Test Cases to Add:

  1. test_structured_logging_pass_python: Repository with structlog dependency
  2. test_structured_logging_pass_javascript: Repository with winston dependency
  3. test_structured_logging_fail_no_library: Repository using print/console.log
  4. test_structured_logging_partial_score: Library present but no config
  5. test_structured_logging_not_applicable: No logging detected

Note: AgentReady itself uses Python's built-in logging module, so it will likely score low on this attribute (an opportunity for improvement).

Dependencies

External Tools: None (dependency file parsing only)

Python Standard Library:

  • json for parsing package.json
  • tomllib (Python 3.11+) for parsing pyproject.toml; on older versions the third-party tomli backport provides the same API
  • re for searching log statements in code

Remediation Steps

def _create_remediation(self) -> Remediation:
    return Remediation(
        summary="Implement structured logging for better observability",
        steps=[
            "Install structured logging library (structlog, winston, zap)",
            "Configure JSON formatter for production logs",
            "Define standard log fields (timestamp, level, message, context)",
            "Add context fields: request_id, user_id, trace_id",
            "Replace unstructured log calls with structured logging",
            "Never log sensitive data (passwords, tokens, PII)",
        ],
        tools=["structlog", "winston", "pino", "zap"],
        commands=[
            "# Python - Install structlog",
            "pip install structlog",
            "",
            "# JavaScript - Install winston",
            "npm install winston",
            "",
            "# Go - Install zap",
            "go get -u go.uber.org/zap",
        ],
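        examples=[
            """# Python - Good structured logging
import structlog

# Render JSON explicitly; structlog's default output is a console format.
structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ]
)

logger = structlog.get_logger()

logger.info(
    "user_login_success",
    user_id="user_123",
    request_id="req_abc",
    duration_ms=45,
    ip_address="192.168.1.1"
)

# Example output (key order and timestamp precision may differ):
# {"timestamp": "2025-01-20T10:30:00Z", "level": "info",
#  "event": "user_login_success", "user_id": "user_123",
#  "request_id": "req_abc", "duration_ms": 45,
#  "ip_address": "192.168.1.1"}
""",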
            """# Python - Bad unstructured logging
import logging

logger = logging.getLogger(__name__)

# Unstructured string - hard to parse
logger.info(f"User {user_id} logged in from {ip} in {duration}ms")
""",
            """// JavaScript - Good structured logging
const winston = require('winston');

const logger = winston.createLogger({
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  transports: [new winston.transports.Console()]
});

// Message first, then a metadata object that winston merges into the entry
logger.info('user_login_success', {
  userId: 'user_123',
  requestId: 'req_abc',
  durationMs: 45,
  ipAddress: '192.168.1.1'
});
""",
        ],
        citations=[
            Citation(
                source="structlog",
                title="Structured Logging for Python",
                url="https://www.structlog.org/",
                relevance="structlog library documentation",
            ),
            Citation(
                source="Daily.dev",
                title="12 Logging Best Practices",
                url="https://daily.dev/blog/logging-best-practices",
                relevance="Industry best practices for logging",
            ),
        ],
    )

Implementation Notes

  1. Dependency Detection: Check pyproject.toml, package.json, go.mod for logging libraries
  2. Library Mapping:
    • Python: structlog, python-json-logger
    • JavaScript: winston, pino, bunyan
    • Go: zap, zerolog, logrus
    • Ruby: semantic_logger
  3. Config Detection: Look for logging.conf, logger.py, winston.config.js
  4. Code Sampling: Use grep/ripgrep to find log statements: logger.info, log.Info, etc.
  5. Pattern Matching: Detect structured format (dict/object) vs string concatenation
  6. Edge Cases: Some repos log only through print/console.log (library component scores 0); others have no logging at all (not_applicable)
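Notes 4 and 5 could be combined into a first cut of calculate_usage_quality. The regexes below are heuristic assumptions, not a parser: they look at one line at a time, treat keyword arguments or an object literal as structured, and treat f-strings, .format(), % formatting, string concatenation, and template placeholders as unstructured. Calls that attach fields through helper functions (for example zap's field constructors) fall into the "plain" bucket.

```python
import re
from typing import Optional

# Single-line logging calls such as logger.info(...) or log.Error(...).
LOG_CALL = re.compile(
    r"\b(?:logger|log)\.(?:debug|info|warning|warn|error|critical)\((?P<args>.*)\)",
    re.IGNORECASE,
)
# String interpolation or concatenation inside the call arguments.
UNSTRUCTURED = re.compile(r"""^\s*f["']|\.format\(|["']\s*%\s|["']\s*\+|\+\s*["']|\$\{""")
# Keyword arguments (Python) or an object literal (JavaScript).
STRUCTURED = re.compile(r""",\s*\w+\s*=[^=]|^\s*\{|,\s*\{""")


def classify_log_call(line: str) -> Optional[str]:
    """Return 'structured', 'unstructured', 'plain', or None if not a log call."""
    match = LOG_CALL.search(line)
    if match is None:
        return None
    args = match.group("args")
    if UNSTRUCTURED.search(args):
        return "unstructured"
    if STRUCTURED.search(args):
        return "structured"
    return "plain"


def calculate_usage_quality(sample_log_calls: list[str]) -> float:
    """Percentage of sampled log calls that look structured (0.0 if none sampled)."""
    kinds = [kind for kind in map(classify_log_call, sample_log_calls) if kind is not None]
    if not kinds:
        return 0.0
    return 100.0 * kinds.count("structured") / len(kinds)


samples = [
    'logger.info("user_login_success", user_id="user_123", duration_ms=45)',
    'logger.info(f"User {user_id} logged in from {ip} in {duration}ms")',
    "logger.info('user_login_success', { userId: 'user_123' });",
    'logger.error("failed: " + reason)',
    "x = compute()",
]
print(calculate_usage_quality(samples))
```

Counting "plain" calls in the denominator means a constant message with no fields lowers the score; whether that is desirable is a design choice left open here.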
