luke-pekala/csv-log-processor

Module D — CSV Log Processor

Phase 1 · Python Fundamentals · Weeks 5–6

Reads a server access log CSV, parses each row with full error handling, aggregates stats (requests per endpoint, error rates, slowest routes), and writes a summary report. Malformed rows are written to a separate error log instead of crashing the program.
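The skip-and-record behaviour described above can be sketched as follows. This is a minimal illustration, not the code in processor.py: the column names (`timestamp`, `endpoint`, `status`, `ms`), the sample rows, and the helper `split_rows()` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical column names and rows; the real access.log.csv may differ.
SAMPLE = """timestamp,endpoint,status,ms
2024-03-01T09:00:00,/api/users,200,12.5
2024-03-01T09:00:01,/api/orders,abc,40.0
2024-03-01T09:00:02,/api/users,500,
2024-03-01T09:00:03,/api/login,401,88.1
"""


def split_rows(text):
    """Return (entries, errors): parsed rows and (line_no, reason) pairs."""
    entries, errors = [], []
    reader = csv.DictReader(io.StringIO(text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            entries.append({
                "endpoint": row["endpoint"],
                "status": int(row["status"]),
                "ms": float(row["ms"]),
            })
        except (KeyError, ValueError, TypeError) as e:
            # TypeError covers short rows: DictReader fills missing
            # fields with None, and int(None) / float(None) raise it.
            # The bad row is recorded and the loop carries on.
            errors.append((line_no, repr(e)))
    return entries, errors


entries, errors = split_rows(SAMPLE)
print(len(entries), "parsed,", len(errors), "malformed")
```

Collecting failures in a list rather than re-raising is what lets one pass over the file produce both the summary and the error log.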


Project structure

module-d/
├── processor.py        # main pipeline + CLI entry point
├── parser.py           # parse_row(), validate_row(), ParseError
├── reporter.py         # compute_stats(), write_summary(), write_error_log()
├── data/
│   └── access.log.csv  # 50-row sample log with 4 malformed rows
├── tests/
│   └── test_parser.py
└── README.md

Setup

cd phase1/module-d
# No external packages — stdlib only (csv, pathlib, logging, datetime)

Usage

python processor.py data/access.log.csv
python processor.py data/access.log.csv --out-dir results/

Output files are written to output/ by default, or to the directory given with --out-dir:

  • summary.txt — full analysis report
  • parse_errors.txt — details of every malformed row
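A command line like the one above could be wired up with `argparse` from the standard library. Whether processor.py actually uses `argparse` is an assumption; `build_parser()` and `output_paths()` are hypothetical names, and only the positional log path, the `--out-dir` flag, the `output/` default, and the two file names come from this README.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser mirroring the usage lines in this README."""
    parser = argparse.ArgumentParser(
        description="Summarise a server access log CSV.")
    parser.add_argument("log_file", type=Path,
                        help="path to the access log CSV")
    parser.add_argument("--out-dir", type=Path, default=Path("output"),
                        help="directory for summary.txt and parse_errors.txt")
    return parser


def output_paths(args):
    """Return the (summary, error log) paths inside the output directory."""
    return args.out_dir / "summary.txt", args.out_dir / "parse_errors.txt"


# Passing an explicit argv list keeps the sketch runnable without a shell.
args = build_parser().parse_args(
    ["data/access.log.csv", "--out-dir", "results/"])
summary_path, errors_path = output_paths(args)
print(summary_path)
print(errors_path)
```

Using `type=Path` hands the rest of the pipeline `pathlib.Path` objects directly, which fits the stdlib-only setup.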

Example output

============================================================
 LOG ANALYSIS REPORT
 Generated: 2024-03-01 09:00:00
============================================================

── OVERVIEW ─────────────────────────────────────────────
  Rows parsed successfully : 46
  Rows with parse errors   : 4
  Total rows processed     : 50

  HTTP error rate          : 15.2%
  Avg response time        : 89.4 ms
  Max response time        : 612.3 ms

── STATUS CODES ──────────────────────────────────────────
  200   32  ████████████████████████████████
  201    3  ███
  204    3  ███
  401    1  █
  403    1  █
  404    4  ████
  422    1  █
  500    3  ███

── TOP ENDPOINTS ─────────────────────────────────────────
  /api/users              13  ████████████████████
  /api/products           10  ████████████████
  /api/orders              9  ██████████████
  /api/stats               3  █████
  /api/login               3  █████
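The text bars in a report like this can be produced by scaling each count against the largest one. The sketch below is one possible approach, not the code in reporter.py: `bar_lines()` is a hypothetical helper, and the width of 20 and the ceiling rounding are guesses made to resemble the TOP ENDPOINTS sample.

```python
import math
from collections import Counter


def bar_lines(counts, width=20):
    """Format a non-empty Counter as text bars scaled to its largest count."""
    top = max(counts.values())
    lines = []
    for label, n in counts.most_common():
        # Ceiling rounding guarantees any non-zero count shows a bar.
        bar = "█" * math.ceil(n / top * width)
        lines.append(f"  {label:<22}{n:>4}  {bar}")
    return lines


endpoint_counts = Counter({
    "/api/users": 13,
    "/api/products": 10,
    "/api/orders": 9,
    "/api/stats": 3,
    "/api/login": 3,
})
lines = bar_lines(endpoint_counts)
for line in lines:
    print(line)
```

The STATUS CODES section appears to use one block per request instead, which needs no scaling.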

Run tests

python -m pytest tests/ -v

Concepts used

| Concept | Where |
|---|---|
| pathlib.Path | All file I/O in processor.py and reporter.py |
| csv.DictReader | Reading CSV rows as dicts in read_log() |
| try/except | Catching KeyError and ValueError in parse_row() |
| raise ... from e | Exception chaining for debuggable tracebacks |
| Custom exception | ParseError carries the failing row + reason |
| logging | logging.warning() for skipped rows, logging.info() for progress |
| Context managers | with path.open() as f: guarantees file closure |
| Generator expression | sum(p["ms"] for p in entries) in compute_stats() |
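Several of these concepts fit together in a few lines. The sketch below shows a custom exception that carries the failing row and reason, `raise ... from e` chaining, and the generator-expression average. It is an illustration under assumptions, not the contents of parser.py: the constructor signature, the reason strings, and the `endpoint` and `status` keys are hypothetical, while the `ms` key and the names `ParseError` and `parse_row()` come from this README.

```python
class ParseError(Exception):
    """Raised when a CSV row cannot be converted; keeps the row and reason."""

    def __init__(self, row, reason):
        super().__init__(f"{reason}: {row!r}")
        self.row = row
        self.reason = reason


def parse_row(row):
    """Convert one raw row (dict of strings) into a typed entry."""
    try:
        return {
            "endpoint": row["endpoint"],
            "status": int(row["status"]),
            "ms": float(row["ms"]),
        }
    except KeyError as e:
        # "from e" keeps the original exception as __cause__ in the traceback.
        raise ParseError(row, f"missing column {e}") from e
    except ValueError as e:
        raise ParseError(row, f"bad value ({e})") from e


try:
    parse_row({"endpoint": "/api/users", "status": "OK", "ms": "12.5"})
except ParseError as err:
    caught = err  # keep a reference; "err" is unbound after the except block
    print(err.reason)
    print(type(err.__cause__).__name__)

entries = [
    parse_row({"endpoint": "/api/users", "status": "200", "ms": "10.0"}),
    parse_row({"endpoint": "/api/orders", "status": "201", "ms": "30.0"}),
]
# Generator expression: sums without building an intermediate list.
avg_ms = sum(p["ms"] for p in entries) / len(entries)
print(avg_ms)
```

Because the exception stores `row` and `reason` as attributes, an error-log writer can format them without parsing the message string.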

Skills demonstrated

File I/O · pathlib · csv.DictReader · try/except · Custom exceptions · logging · Context managers


Portfolio entry

CSV Log Processor — Python files + error handling

A CLI tool that processes server access logs, handles malformed rows gracefully with custom exceptions, and writes aggregated reports to disk. Zero crashes on dirty data. Built as Module D of a 26-app Python learning roadmap.

About

A CSV log processing pipeline with custom exceptions, error handling, and structured reporting. Module D of the Python AI Roadmap.
