
Data deduplication engine, supporting optional compression and public key encryption.


dpc/rdedup



rdedup

rdedup is a data deduplication engine and backup software. See the current project status and original use case description wiki pages.

rdedup is generally similar to existing software like duplicacy, restic, attic, duplicity, zbackup, etc., with a skew towards asymmetric encryption and a synchronization-friendly data model. Thanks to Rust and a solid architecture, rdedup is also extremely performant and very reliable (no data-loss bugs ever reported).

rdedup is written in Rust and provides both a command-line tool and a library API (rdedup-lib). The library can be used to embed the core engine into other applications, or to build custom frontends and tools.

Features

  • simple but solid cryptography:
    • libsodium based
    • public-key encryption mode (the only tool like that I'm aware of, and the primary reason rdedup was created)
  • flat-file synchronization friendly (Dropbox/syncthing, rsync, rclone)
  • immutable data-conflict-free data store
  • cloud backends are WIP
  • incremental, scalable garbage collection
  • variety of supported algorithms:
    • chunking: fastcdc, gear, bup
    • hashing: blake2b, sha256
    • compression: zstd, deflate, xz2, bzip2, none
    • encryption: curve25519, none
    • very easy to add new ones
    • check rdedup init --help output for up-to-date list
  • extreme performance and parallelism - see Rust fearless concurrency in rdedup
  • reliability focus (e.g. rdedup uses fsync + rename to avoid data corruption even in case of a hardware crash)
  • built-in time/performance profiler

Strong parts

It's written in Rust, a modern language that is actually really nice to use. Rust makes it easy to write very robust and fast software.

The author is a nice person, welcomes contributions, and helps users. Or at least he's trying... :)

Shortcomings and missing features

rdedup currently does not implement its own backup/restore functionality (its own directory traversal), and because of that it's typically paired with tools such as tar or rdup. Built-in directory traversal could improve the deduplication ratio for workloads with many small, frequently changing files.

Cloud storage integrations are missing. The architecture to support them is mostly implemented, but the actual backends are not.

Installation

If you have cargo installed:

cargo install --locked rdedup

If not, I highly recommend installing rustup (think pip or npm, but for Rust).

If you're interested in running rdedup with maximum possible performance, try:

RUSTFLAGS="-C target-cpu=native" cargo install --locked rdedup

In case of trouble, check rdedup building issues or report a new one (sorry)!

Usage

See rdedup -h for help.

rdedup always operates on a repo, which you provide either as an argument (e.g. --dir <DIR>) or via an environment variable (e.g. RDEDUP_DIR).

Supported commands:

  • rdedup init - create a new repo.
    • rdedup init --help for repository configuration options.
  • rdedup store <name> - store data from standard input under a given name.
  • rdedup load <name> - load data stored under given name and write it to standard output.
  • rdedup rm <name> - remove the given name.
  • rdedup ls - list all stored names.
  • rdedup gc - remove any no longer reachable data.
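The commands above can be strung together into a full repository lifecycle. The following is a minimal sketch rather than an official recipe: the function name `rdedup_cycle` and the stored name are hypothetical, the repo is selected through RDEDUP_DIR, and it assumes RDEDUP_PASSPHRASE is already set so that no step stops to prompt.

```shell
#!/bin/sh
# rdedup_cycle <repo-dir> <name> (hypothetical helper, for illustration only)
# Walks one name through the whole lifecycle: init, store, ls, load, rm, gc.
rdedup_cycle() {
    (
        # The subshell keeps RDEDUP_DIR from leaking into the caller.
        export RDEDUP_DIR="$1"
        name=$2
        rdedup init &&                                   # create the repo
        printf 'hello rdedup\n' | rdedup store "$name" && # store stdin under a name
        rdedup ls &&                                     # list stored names
        rdedup load "$name" &&                           # write the data to stdout
        rdedup rm "$name" &&                             # drop the name
        rdedup gc                                        # reclaim unreachable data
    )
}
```

Calling `rdedup_cycle /tmp/demo-repo greeting` would run each step in order and stop at the first one that fails.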

In combination with rdup, rdedup can be used to store and restore your backup like this:

rdup -x /dev/null "$HOME" | rdedup store home
rdedup load home | rdup-up "$HOME.restored"

rdedup is data agnostic, so formats like tar, cpio and others will work, but to get the benefits of deduplication, the archive should not already be compressed or encrypted.
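As an illustration of the point about uncompressed archives, here is a sketch of the same store/restore round trip using tar instead of rdup. The helper names `backup_dir` and `restore_dir` are hypothetical, and the sketch assumes that a tar binary is available and that RDEDUP_DIR points at an already initialized repo.

```shell
#!/bin/sh
# backup_dir <dir> <name> (hypothetical helper)
# Streams an uncompressed tar archive of <dir> into rdedup.
backup_dir() {
    # No -z/-j/-J here: compressing before rdedup would hide repeated
    # data from the deduplication step.
    tar -C "$1" -cf - . | rdedup store "$2"
}

# restore_dir <name> <dir> (hypothetical helper)
# Unpacks the archive stored under <name> into <dir>.
restore_dir() {
    mkdir -p "$2" && rdedup load "$1" | tar -C "$2" -xf -
}
```

For example, `backup_dir "$HOME" home` followed by `restore_dir home "$HOME.restored"` mirrors the rdup commands above. Compression is left to the algorithm chosen for the repo rather than to tar.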

RDEDUP_PASSPHRASE environment variable

If RDEDUP_PASSPHRASE is defined, it will be used instead of interactively asking the user for a passphrase.
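For unattended runs (cron jobs, scripts), one possible approach is to keep the passphrase in a file and export it only for the duration of a single command. This is a sketch under assumptions: the helper `with_passphrase_file` and the file path in the usage note are hypothetical, and the file is expected to hold the passphrase on one line.

```shell
#!/bin/sh
# with_passphrase_file <file> <command...> (hypothetical helper)
# Runs <command...> with RDEDUP_PASSPHRASE set from the contents of <file>.
with_passphrase_file() {
    pass_file=$1
    shift
    (
        # $(...) strips the trailing newline; the subshell keeps the
        # variable out of the calling shell's environment.
        RDEDUP_PASSPHRASE=$(cat "$pass_file") || exit 1
        export RDEDUP_PASSPHRASE
        "$@"
    )
}
```

Usage might look like `with_passphrase_file "$HOME/.rdedup-pass" rdedup load home > home.tar`, with the passphrase file readable only by its owner.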

License

rdedup is licensed under: MPL-2.0