Real world: making backups

Franco Corbelli edited this page Sep 27, 2024 · 6 revisions

Introduction

zpaqfranz is a hybrid data archiving tool. What does that mean? It compresses data into an archive, which can serve as a backup and be restored later if needed. While it does offer advanced backup features (mostly for advanced users), it's closer to a traditional archiver like WinZIP, RAR, or 7zip.

If you need a "true" backup program, you might want to look elsewhere.

Now, let's briefly explore how zpaqfranz works and how it differs from other archivers, using 7zip as a reference since it’s likely familiar to you.

Essentially, zpaqfranz reads a set of files and folders, processes them, and writes them into an archive. This archive is a file with the .zpaq extension, containing all the selected files, just like how you would store documents in a .7z file with 7zip.

Versioning

Here’s where the key difference lies. zpaqfranz can store multiple versions of files within the same .zpaq archive. If you're familiar with ZFS snapshots or TimeMachine on Mac, this concept should sound familiar.

Example:

Let's say you have a document called pippo.doc. You archive it in a compressed file, thebackup.zpaq, creating version 1.

Now, you modify the content of pippo.doc. This new file is different from version 1. When you archive it again using zpaqfranz in the same thebackup.zpaq, it will store both versions:

  • Version 1 (the original)
  • Version 2 (the modified version)

This process can continue with versions 3, 4, 5, and so on.

Summary:

Within a single .zpaq archive, you can maintain multiple versions of the same file. Each time you archive a file again, a new version is stored. You can then extract any specific version later. For example, you might want to restore yesterday's version of pippo.doc today.

This versioning feature is typically not supported by conventional archivers, where only the latest version of a file is stored in the archive.
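To make the idea concrete, here is a minimal sketch in Python of an archive that keeps every version. It is a toy in-memory model, not the real .zpaq format or zpaqfranz's code: the class name `ToyVersionedArchive` and its methods are made up for illustration.

```python
class ToyVersionedArchive:
    """Toy in-memory model of a versioned archive (NOT the real .zpaq format)."""

    def __init__(self):
        # One entry per version: a dict mapping file name -> content at that time.
        self._versions = []

    def add(self, files):
        """Store the given files as a new version and return its number (1-based)."""
        self._versions.append(dict(files))
        return len(self._versions)

    def extract(self, name, version):
        """Return the content of `name` as it was in the given version."""
        return self._versions[version - 1][name]


archive = ToyVersionedArchive()
v1 = archive.add({"pippo.doc": b"first draft"})           # version 1
v2 = archive.add({"pippo.doc": b"first draft, revised"})  # version 2

# Both versions remain available after the second add.
print(archive.extract("pippo.doc", 1))
print(archive.extract("pippo.doc", 2))
```

The point of the sketch is that `add` only ever appends: nothing already stored is overwritten, so any earlier version can still be extracted.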

Repeated Backups

In real-world scenarios, it's common to perform repeated backups at regular intervals. For example, let's say you have a folder C:\dati, and every night, you create a backup on another storage device, such as a USB hard drive, D:\thebackup.zpaq.

For now, let's not worry about the "how"; we'll cover that later. The key idea is that, by scheduling a batch process to run at 3:00 AM every night, you add the current state of your folder to the archive once a day.

Example:

If today you delete pippo.doc from C:\dati and continue running backups, you can still restore both:

  • Version 1 of pippo.doc
  • Version 2 of pippo.doc

Even though the file is deleted from your original folder, it remains intact in the .zpaq backup, along with all its previous versions.
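The deletion scenario can be sketched the same way. This is a toy model under simple assumptions (each backup run records the whole folder as a plain dictionary); the names `snapshots`, `backup` and `last_version_containing` are hypothetical and have nothing to do with zpaqfranz's internals.

```python
# Each backup run appends one snapshot: the folder content at that moment.
snapshots = []


def backup(folder):
    """Record the current folder content as a new version; return its number."""
    snapshots.append(dict(folder))
    return len(snapshots)


def last_version_containing(name):
    """Return the most recent version that still holds `name`, or None."""
    for version in range(len(snapshots), 0, -1):
        if name in snapshots[version - 1]:
            return version
    return None


folder = {"pippo.doc": "v1 text", "pluto.doc": "notes"}
backup(folder)                    # version 1
folder["pippo.doc"] = "v2 text"
backup(folder)                    # version 2
del folder["pippo.doc"]           # the file is deleted from the source folder
backup(folder)                    # version 3 no longer lists pippo.doc

print(last_version_containing("pippo.doc"))
print(snapshots[0]["pippo.doc"])
```

Version 3 simply records that pippo.doc is gone; versions 1 and 2 are untouched, so both older copies can still be restored.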

Key Point

Once data is added to a .zpaq archive, it is never deleted. It remains there forever. So, if at some point (let's say a year ago) there was a file called pluto.doc in your C:\dati folder, and it was included in a backup, you can still restore it even after a year—even if the entire C:\dati folder has been deleted.

This is why zpaqfranz is far better suited to repeated backups than conventional archivers. We should always acknowledge the original creator of zpaq, Dr. Matt Mahoney: the main credit is his.

Another Example: Database Backups

A common scenario is backing up databases. Many websites use RDBMS programs (like MySQL, MariaDB, Postgres, etc.) to manage data. It’s standard practice to create regular backups of this data using various tools (which we won't cover here in detail). These are often called dumps, and they are created at regular intervals (daily, hourly, or every 15 minutes—depending on the situation).

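The frequency of the dumps determines how precisely you can go back in time: with daily dumps, you can restore the database as it was on a given day, but not at an arbitrary hour of that day.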

Timestamping

A common method is to add a timestamp to each dump, essentially creating separate backups marked by the time they were created. For instance, if you're backing up daily, you might have:

  • themygooddump_2024_09_01.sql
  • themygooddump_2024_09_02.sql
  • themygooddump_2024_09_03.sql

To be able to restore the database as it was on September 1, 2024, you must keep that dump file, along with those of the following days. However, as dumps accumulate, you'll eventually need to prune the older ones to free up space. Once you delete the dump from September 1, 2024, the state of the database on that day can no longer be restored.

This practice of deleting older data to save space is quite common.
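As an illustration of this conventional approach, here is a small Python sketch that decides which timestamped dumps fall outside a retention window. The function names and the `keep` parameter are hypothetical; the file names follow the pattern used above.

```python
from datetime import date


def parse_dump_date(filename):
    """Extract the date from a name like 'themygooddump_2024_09_01.sql'."""
    stem = filename.rsplit(".", 1)[0]
    year, month, day = stem.split("_")[-3:]
    return date(int(year), int(month), int(day))


def dumps_to_prune(filenames, keep):
    """Return the dumps that are older than the `keep` most recent ones."""
    ordered = sorted(filenames, key=parse_dump_date)
    return ordered[:-keep] if keep > 0 else ordered


dumps = [
    "themygooddump_2024_09_03.sql",
    "themygooddump_2024_09_01.sql",
    "themygooddump_2024_09_02.sql",
]

# With room for only two dumps, the oldest one has to go.
print(dumps_to_prune(dumps, keep=2))  # ['themygooddump_2024_09_01.sql']
```

Whatever this function returns is lost for good once deleted, which is exactly the trade-off described above.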

The Magic of zpaq

With zpaq, this issue doesn't exist. Data is never deleted—it stays forever. This is the dream for anyone managing backups.

How Does zpaq Achieve This?

It's a complex process, with many pages explaining the details, but for now, trust me (feel free to dive into the documentation later). The short version is that zpaq stores different versions of files using minimal space. Instead of saving the entire file again for each version, it stores only the parts of the data that are not already in the archive (deduplication).

In simple terms: if version 1 of pippo.doc is nearly identical to version 2, zpaq will only store the small changes, so the space used is roughly equivalent to the size of a single copy of pippo.doc.
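Here is a rough Python sketch of why a nearly identical second version costs so little. It is a deliberate simplification and not zpaq's actual algorithm: the fixed `BLOCK_SIZE`, the in-memory `block_store` dictionary and the function names are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this example


def store_version(data, block_store):
    """Split data into blocks and store only the blocks not seen before.

    Returns (recipe, new_bytes): the list of block hashes needed to rebuild
    this version, and how many bytes actually had to be added to the store.
    """
    recipe = []
    new_bytes = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha1(block).hexdigest()
        if digest not in block_store:
            block_store[digest] = block
            new_bytes += len(block)
        recipe.append(digest)
    return recipe, new_bytes


def restore_version(recipe, block_store):
    """Rebuild a version by concatenating its blocks in order."""
    return b"".join(block_store[digest] for digest in recipe)


# Version 1: 102,400 bytes of non-repeating data (25 blocks of 4,096 bytes).
version1 = b"".join(i.to_bytes(4, "big") for i in range(25600))

# Version 2: the same content with a single byte changed in the middle.
changed = bytearray(version1)
changed[50000] ^= 0xFF
version2 = bytes(changed)

block_store = {}
recipe1, new1 = store_version(version1, block_store)
recipe2, new2 = store_version(version2, block_store)

print(new1)  # 102400: everything is new the first time
print(new2)  # 4096: only the one block that changed
```

Both versions can be rebuilt in full with `restore_version`, yet the second one added a single block to the store. Fixed-size blocks only cope with in-place changes; a real archiver has to be smarter than this toy.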

Summary

zpaq stores all data from a specified set of folders inside a single archive. Every time you run the same backup command (essentially triggering an update or "refresh"), the data is saved in the archive without modifying or deleting the older versions.
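As a sketch of what "running the same backup command" could look like from a script, the Python helpers below only build the command lines. They assume the `a` (add) and `x` (extract) commands and the `-until` and `-to` switches that zpaqfranz inherits from zpaq; check the program's built-in help for the exact syntax of your version. The helper names and the `C:\restored` folder are hypothetical.

```python
import subprocess


def build_backup_command(archive, source, exe="zpaqfranz"):
    """Command line that adds the current state of `source` to `archive`."""
    return [exe, "a", archive, source]


def build_restore_command(archive, version, destination, exe="zpaqfranz"):
    """Command line that extracts `archive` as of `version` into `destination`."""
    return [exe, "x", archive, "-until", str(version), "-to", destination]


def run(command):
    """Run the command; raise CalledProcessError on a non-zero exit status."""
    subprocess.run(command, check=True)


# Only print the commands here; call run(...) to actually execute them.
print(build_backup_command(r"D:\thebackup.zpaq", r"C:\dati"))
print(build_restore_command(r"D:\thebackup.zpaq", 1, r"C:\restored"))
```

Running the backup command again tomorrow with the same arguments is what adds the next version to the same archive.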

Daily Backup Scenario

In a typical scenario, where backups are updated daily, you will be able to restore the complete state of your data from any given day, even years in the past. Because data already in the archive is not stored again, each new version takes up little space when little has changed, allowing you to keep thousands of versions without excessive disk usage.

Use Case: Daily Backups

If you're looking to make a single copy of your data, for example to move it onto a USB drive, zpaqfranz is not the best tool. For that, you're better off using 7z, WinRAR, or similar archiving software.

However, if your goal is to create daily (or even hourly) backups of your files, ensuring they're protected from changes, deletions (even accidental), or ransomware, zpaqfranz is the ideal solution. It will easily handle repeated backups while preserving all past versions of your data.
