Comparison - Compressing a high volume of grouped data into a database
This project provides a Python script that compares storing records in a standard SQLite database against storing them in an SQLite database whose records are compressed before insertion. It shows how compressing data before writing can yield significant storage savings and evaluates the trade-offs in write and read performance.
- Data Compression: Uses `zlib` to compress records grouped by date.
- Parallel Processing: Leverages `multiprocessing` to perform compression and decompression in parallel, utilizing multiple CPU cores to speed up the process (see the sketch after this list).
- Performance Comparison: Provides a clear summary table comparing storage size, write time, and read time between the normal and compressed database approaches.
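The grouped, compressed, parallel write path can be sketched roughly as follows. The table schema, record serialization, and function names here are illustrative assumptions, not the script's exact code:

```python
import pickle
import sqlite3
import zlib
from multiprocessing import Pool

def compress_group(item):
    """Compress one date group's records into a single blob (illustrative)."""
    group_date, records = item
    payload = pickle.dumps(records)             # serialize the group's rows
    return group_date, zlib.compress(payload)   # zlib-compress the serialized bytes

def write_compressed(db_path, grouped_records):
    """Write one compressed blob per date group; schema is an assumption."""
    # Compress groups on multiple cores. On platforms that spawn workers,
    # the caller should run this under an `if __name__ == "__main__":` guard.
    with Pool() as pool:
        blobs = pool.map(compress_group, grouped_records.items())

    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS records (group_date TEXT PRIMARY KEY, data BLOB)")
    con.executemany("INSERT OR REPLACE INTO records VALUES (?, ?)", blobs)
    con.commit()
    con.close()
```

Packing one blob per group amortizes the compression overhead across many similar rows, which is where the storage savings in the summary table come from.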
- Ensure you have Python 3 installed.
- Clone the repository.
- Run the script from your terminal: `python database_insert.py`
The script will:
- Clean up any old database files.
- Generate a configurable number of sample records.
- Write the records to a normal SQLite database (`example.db`).
- Group, compress, and write the records to a compressed SQLite database (`example-compressed.db`).
- Read the records from both databases to measure read performance.
- Verify that the data in both databases is identical.
- Print a summary table with the final performance metrics.
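The read and verification steps for the compressed database could look roughly like the following; the table and column names mirror the write sketch above and are assumptions, and `read_normal` is a hypothetical counterpart for the uncompressed database:

```python
import pickle
import sqlite3
import zlib

def read_compressed(db_path):
    """Decompress each group's blob and rebuild the flat list of records (illustrative)."""
    con = sqlite3.connect(db_path)
    rows = con.execute("SELECT group_date, data FROM records").fetchall()
    con.close()

    records = []
    for _group_date, blob in rows:
        records.extend(pickle.loads(zlib.decompress(blob)))  # reverse of the write path
    return records

# Verification can then be a straight comparison of both result sets, e.g.:
# assert sorted(read_normal("example.db")) == sorted(read_compressed("example-compressed.db"))
```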
Example Output:
--- Final Summary ---
Results for 10,000,000 records and 1,000 unique groups.
---------------------------------------------------------------------------------------------
| Metric | Normal | Compressed | Variation |
|---------------------------|-----------------|-----------------|---------------------------|
| Storage Size (KB) | 395184.00 | 103024.00 | 3.84x smaller (-73.93%) |
| Write Time (s) | 14.0467 | 9.3553 | -33.40% |
| Read Time (s) | 14.2523 | 7.1820 | -49.61% |
---------------------------------------------------------------------------------------------
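The Variation column can be reproduced directly from the Normal and Compressed figures; a quick check in Python using the numbers above:

```python
# Storage: size ratio and relative change
normal_kb, compressed_kb = 395184.00, 103024.00
print(f"{normal_kb / compressed_kb:.2f}x smaller ({(compressed_kb - normal_kb) / normal_kb:+.2%})")
# -> 3.84x smaller (-73.93%)

# Timings: relative change of compressed vs. normal
for name, normal, compressed in [("Write Time", 14.0467, 9.3553), ("Read Time", 14.2523, 7.1820)]:
    print(f"{name}: {(compressed - normal) / normal:+.2%}")
# -> Write Time: -33.40%
# -> Read Time: -49.61%
```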