https://github.com/Hypercart-Dev-Tools/WP-DB-Toolkit
- No server setup required: Run directly on your machine without spinning up a WordPress/PHP environment or database server.
- Memory efficiency: Python's streaming approach processes multi-gigabyte SQL dumps without loading them entirely into memory, unlike typical PHP scripts or SQL queries.
- Portability: Works on any machine with Python installed—no dependency on WordPress, PHP, or a running database instance.
- Simplicity: Clean CLI interface with straightforward filtering options, easier to understand and modify than complex SQL queries or WordPress hooks.
- Safety: Analyze production dumps locally without touching live databases or importing sensitive data into development servers.
- Speed: Optimized streaming parser processes 16 GB dumps with 81 million rows in 20-30 minutes.
If you care about getting to the right data and not be stuck with trying to setup a bunch of tools and running out of memory, this is the tool for you.
With multiple scripts available to assist you—whether for direct querying or server-based analysis—this toolkit provides a comprehensive solution for WooCommerce database analysis.
- No dependencies other than Python.
- No need to import the database into your local server.
- Fast and efficient for large databases.
- Optimized for WooCommerce data analysis focused on coupons, orders, order items, customers, and products.
- Still want to import the database into your local server? Use the shell script provided.
- Tuned for AI VS Code Agent to quickly setup and run custom reports for you (Agents.md).
- Avoid Local Server Complications: Use the Python script if you want to query a MySQL dump file without importing it into your local development server, which could time out or encounter other complications.
- Performance: The script is capable of querying a 16 GB SQL file with 81 million rows in about 20-30 minutes, making it highly efficient for large datasets.
- Local Server Imports: If you prefer to import the MySQL file into your local server, the provided shell script can help you import the SQL file efficiently and then run reports against it.
This repository provides a robust and memory-efficient solution for parsing large MySQL dump files, specifically targeting WordPress/WooCommerce databases. It includes multiple scripts for extracting WooCommerce data:
- Coupons: Extract low-usage or expired coupons based on configurable filters.
- Orders: Extract order summaries with flexible filtering by status, date range, customer, billing details, and order total.
Both scripts are designed to be reliable and feature-rich compared to ad-hoc solutions.
- Streams through large MySQL dump files without loading the entire file into memory.
- Handles both
wp_postsandwp_postmetatables, with configurable table prefixes.
- Filters WooCommerce coupons (
shop_couponposts) based on:- Usage count: Default is
usage_count <= 2. - Date: Coupons older than a configurable cutoff date (default:
2024-06-01).
- Usage count: Default is
- Supports additional flags for customizing filters, such as
--max-usageand--cutoff-date.
- Filters WooCommerce orders (
shop_orderposts) based on:- Status: Filter by order status (e.g.,
wc-completed,wc-processing). - Date range: Filter by order creation date (
--date-from,--date-to). - Customer: Filter by customer ID or billing email.
- Billing state: Filter by billing state (e.g.,
CAorCalifornia). - Order total: Filter by minimum and/or maximum order amount.
- Status: Filter by order status (e.g.,
- Outputs filtered results to a CSV file for manual review.
- Coupons columns:
coupon_code,usage_count,date_created. - Orders columns:
order_id,status,date_created,order_total,currency,customer_id,billing_email,billing_state,date_paid,date_completed.
- Preview results without writing to a file using the
--previewflag. - Limit the number of previewed rows with
--preview-limit.
- Override default table names and prefixes with flags like
--posts-tableand--prefix. - Designed to be easily extended for additional filters or output formats.
Compared to other ad-hoc solutions, this script offers:
- Reliability: Thorough pre-flight validation ensures the input file and filters are correct before processing.
- Efficiency: Streams data to minimize memory usage, making it suitable for very large dump files.
- Flexibility: Highly configurable filters and output options.
- Ease of Use: Clear CLI interface with helpful flags and options.
- Safety: Avoids committing sensitive data by using a
temp/directory ignored via.gitignore.
To verify parsing and filters before a full run:
python3 wpdbtk-extract-coupons.py temp/do-not-sync-example.sql \
--preview \
--preview-limit 10To extract coupons and save the results to a CSV file:
python3 wpdbtk-extract-coupons.py temp/do-not-sync-binoid-coupons.sql \
temp/binoid-low-usage-coupons.csvTo preview orders with specific filters:
python3 wpdbtk-extract-orders.py temp/do-not-sync-binoid-coupons.sql \
--preview \
--preview-limit 10 \
--status wc-completedExample: Extract all California orders from 2024 with order value $100 or more:
python3 wpdbtk-extract-orders.py temp/do-not-sync-binoid-coupons.sql \
temp/california-orders-2024-min100.csv \
--billing-state CA \
--date-from 2024-01-01 \
--date-to 2024-12-31 \
--min-total 100Example: Extract completed orders for a specific customer:
python3 wpdbtk-extract-orders.py temp/do-not-sync-binoid-coupons.sql \
temp/customer-orders.csv \
--customer-id 123 \
--status wc-completedExample: Extract orders by billing email:
python3 wpdbtk-extract-orders.py temp/do-not-sync-binoid-coupons.sql \
temp/email-orders.csv \
--billing-email customer@example.com- Versioning: Increment the
__version__in each script (wpdbtk-extract-coupons.py,wpdbtk-extract-orders.py, etc.) for meaningful changes. - Changelog: Document changes in
CHANGELOG.mdusing the Keep a Changelog format. - Temporary Files: Use the
temp/directory for large dumps and outputs; it is ignored by Git to prevent accidental commits of sensitive data. - Documentation: Add new documentation under
PROJECT/1-INBOX/and update the status checklist inAGENTS.md. - Scripts: Each WooCommerce entity (coupons, orders, etc.) has its own extraction script with consistent CLI patterns and filtering options.
This project is dual-licensed and is not an OSI-approved open source license:
- Personal Use License (
LICENSE-PERSONAL.md): Free for personal, non-commercial use by a single individual. Modification and free redistribution are allowed with attribution. - Commercial License (
LICENSE-COMMERCIAL.md): Required for any company/agency use or any team of two (2) or more people. Contacthello@hypercart.io.
Required attribution (must be preserved in source and documentation):
- Hypercart Woo DB Analysis Toolkit
- https://github.com/Hypercart-Dev-Tools
See
NOTICEfor the standardized attribution block.
- Extend filtering capabilities to include additional WooCommerce-specific fields.
- Support for alternative output formats (e.g., JSON).
- Improved logging and error handling for edge cases.
Want to catch errors and make your WP Codebase more performant? Check out Hypercart Code Check at https://github.com/Hypercart-Dev-Tools/WP-Code-Check