This repository contains a Python-based ETL pipeline designed to extract and transform travel data from a TRAMS Back Office system and prepare it for secure delivery to a third-party platform. This process enables integrated reporting across multiple data environments.
The core objective of this project is to automate the formatting and validation of several exported datasets so that they meet the third party's import requirements. The pipeline enforces consistent schema adherence, transforms internal data into the external formats, and packages the deliverables into a structured archive.
Each exported file corresponds to a normalized table from the TRAMS Back Office:
| Source Table | Exported File | Description |
|---|---|---|
| Bookings | Transaction.txt | Core transaction-level data (PNRs/bookings) |
| Segments | Segment.txt | Associated segment records |
| Bookings | ExchAndRfnd.txt | Exchange and refund information |
| UDIDS | Enhancement.txt | User-defined enhancements or metadata fields |
| Taxes | Tax.txt | Per-ticket or per-segment tax breakdowns |
The ETL process performs the following:
- Field validation based on length, data type, required/conditional status, and allowed values
- Field transformations including:
  - Standardized date formatting (e.g., `YYYY-MM-DD HH:MM`)
  - Padding numeric/decimal fields (e.g., `0.0000`)
  - Stripping suffixes (e.g., trailing dashes)
- Zip file packaging with a timestamped convention
- Archiving to prevent reprocessing
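
The validation and transformation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `utils/`: the `FieldRule` dataclass and all function names are hypothetical, and conditional-field rules are left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal, InvalidOperation
from typing import Optional


def format_datetime(value: datetime) -> str:
    """Render a datetime as YYYY-MM-DD HH:MM."""
    return value.strftime("%Y-%m-%d %H:%M")


def pad_decimal(value: str, places: int = 4) -> str:
    """Pad a numeric string to a fixed number of decimal places."""
    quantum = Decimal(10) ** -places  # Decimal("0.0001") when places=4
    return str(Decimal(value).quantize(quantum))


def strip_trailing_dashes(value: str) -> str:
    """Remove any run of dashes at the end of a value."""
    return value.rstrip("-")


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical rule describing one column of an export file."""
    name: str
    max_length: int
    required: bool = False
    numeric: bool = False
    allowed: Optional[frozenset] = None


def validate_field(rule: FieldRule, value: str) -> list:
    """Return a list of error messages; an empty list means the value passed."""
    errors = []
    if not value:
        if rule.required:
            errors.append(f"{rule.name}: required value is missing")
        return errors
    if len(value) > rule.max_length:
        errors.append(f"{rule.name}: longer than {rule.max_length} characters")
    if rule.numeric:
        try:
            Decimal(value)
        except InvalidOperation:
            errors.append(f"{rule.name}: not a numeric value")
    if rule.allowed is not None and value not in rule.allowed:
        errors.append(f"{rule.name}: value not in allowed set")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a batch run collect every problem in a row, which fits the `errors/` output folder described below.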
├── archive/ # Archived deliverables (excluded from git)
├── errors/ # Validation error outputs
├── logs/ # Batch execution logs
├── processed/ # Cleaned and validated files staged for zipping
├── templates/ # File definitions or format templates
├── utils/ # Shared validation and formatting utilities
├── main.py # Batch entry point
├── .gitignore
├── README.md
└── SECURITY.md
ℹ️ Note: The inbound/ folder mentioned in some scripts is not present by default.
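
Because `inbound/` is not present by default, a script that expects it could create any missing working folders at startup. A minimal sketch, assuming the folder names from the layout above; the `ensure_dirs` helper and the `WORK_DIRS` tuple are hypothetical and not part of the repository.

```python
from pathlib import Path

# Working folders taken from the layout above, plus inbound/ (assumed location: repo root).
WORK_DIRS = ("archive", "errors", "logs", "processed", "inbound")


def ensure_dirs(root: Path, names=WORK_DIRS) -> list:
    """Create any missing working folders under root and return the ones created."""
    created = []
    for name in names:
        path = root / name
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    return created
```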
## 📂 Sample Output Files

- `AB1_DE_20170117_150500230_Transaction.txt`
- `AB1_DE_20170117_150500430_Segment.txt`
- `AB1_DE_20170117_150501010_Enhancement.txt`
- `AB1_DE_20170117_150501430_Tax.txt`
- `AB1_DE_20170117_150502000_ExchAndRfnd.txt`
## 📂 Zip Archive

- `AB1_DE_20170117_150502001_Import.zip`
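
The naming pattern in these samples can be reproduced with the standard library. The sketch below assumes the nine-digit block is `HHMMSS` followed by milliseconds and that `AB1_DE` is a fixed prefix; both are inferred from the sample names rather than documented. The `stamped_name` and `package` functions are hypothetical.

```python
import zipfile
from datetime import datetime
from pathlib import Path

PREFIX = "AB1_DE"  # copied from the sample names; assumed to be a fixed code


def stamped_name(suffix: str, when: datetime, prefix: str = PREFIX) -> str:
    """Build <prefix>_<YYYYMMDD>_<HHMMSSmmm>_<suffix>."""
    millis = when.microsecond // 1000
    return f"{prefix}_{when:%Y%m%d}_{when:%H%M%S}{millis:03d}_{suffix}"


def package(files: list, out_dir: Path, when: datetime) -> Path:
    """Write the given files into a timestamped Import.zip and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_path = out_dir / stamped_name("Import.zip", when)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store each file at the archive root under its own name.
            zf.write(f, arcname=Path(f).name)
    return zip_path
```

For example, `stamped_name("Transaction.txt", datetime(2017, 1, 17, 15, 5, 0, 230000))` yields `AB1_DE_20170117_150500230_Transaction.txt`, matching the first sample above.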
## 🛠 Technologies
- Python 3.10+
- openpyxl for Excel parsing
- zipfile, shutil, datetime, logging
- Modular utility functions for formatting and validation