alan-turing-institute/reprosyn

License: MIT

Reprosyn: synthesising tabular data

Reprosyn is a Python library for generating synthetic data.

Reprosyn aims to wrap existing generators so that they can all be used through the same interface. See how to add generators.
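As a rough illustration of what "the same interface" can mean, here is a minimal sketch of the wrapper pattern in plain Python. Every name in it (`Generator`, `fit`, `sample`, `IndependentMarginals`, `Bootstrap`, `synthesise`) is hypothetical and chosen for this example; it is not Reprosyn's actual API, for which the Documentation is the reference.

```python
import random
from abc import ABC, abstractmethod
from collections import Counter


class Generator(ABC):
    """Hypothetical common interface; NOT Reprosyn's actual base class."""

    @abstractmethod
    def fit(self, rows):
        """Learn from a list of dict records and return self."""

    @abstractmethod
    def sample(self, n):
        """Return n synthetic records as a list of dicts."""


class IndependentMarginals(Generator):
    """Toy generator: draws each column independently from its observed frequencies."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._marginals = {}

    def fit(self, rows):
        columns = list(rows[0].keys())
        self._marginals = {c: Counter(r[c] for r in rows) for c in columns}
        return self

    def sample(self, n):
        out = []
        for _ in range(n):
            row = {}
            for col, counts in self._marginals.items():
                values, weights = zip(*counts.items())
                row[col] = self._rng.choices(values, weights=weights, k=1)[0]
            out.append(row)
        return out


class Bootstrap(Generator):
    """Toy generator: resamples whole input rows with replacement."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._rows = []

    def fit(self, rows):
        self._rows = [dict(r) for r in rows]
        return self

    def sample(self, n):
        return [dict(self._rng.choice(self._rows)) for _ in range(n)]


def synthesise(generator, rows, n):
    """Caller code depends only on the shared interface, not on the concrete method."""
    return generator.fit(rows).sample(n)
```

Because `synthesise` only relies on `fit` and `sample`, swapping one method for another is a one-line change at the call site, which is the convenience a common interface is meant to provide. Both toy generators are for illustration only and offer no privacy guarantees.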

It can be used either as a Python package or as a command-line tool.

The Documentation is the best place to get started.

Installation

You can install it via pip:

pip install git+https://github.com/alan-turing-institute/reprosyn.git

or, if using poetry:

poetry add git+https://github.com/alan-turing-institute/reprosyn.git#main

This will give you the command-line tool rsyn and the Python package reprosyn.
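One way to confirm that both pieces landed in the active environment is a short standard-library check. This is a sketch, not part of Reprosyn: the helper name `check_install` is made up for the example, and it only tests that the `reprosyn` package can be found and that an `rsyn` executable is on the PATH, without importing or running either.

```python
import importlib.util
import shutil


def check_install(package="reprosyn", command="rsyn"):
    """Report whether the package can be found and the CLI is on PATH."""
    return {
        # find_spec returns None when a top-level package is not installed
        "package_importable": importlib.util.find_spec(package) is not None,
        # shutil.which returns None when no such executable is on PATH
        "command_on_path": shutil.which(command) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_install().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If either entry reports "no", the install most likely went into a different environment from the one currently active.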

Reprosyn uses poetry to manage dependencies. See the poetry installation docs for how to install poetry on your system.

Additional dependencies

Some dependencies are optional by default to support cross-platform use. If you are installing on macOS, installing with the --all-extras flag is recommended. See the installation documentation for more information.

For developers

To install locally:

git clone https://github.com/alan-turing-institute/reprosyn
cd reprosyn
poetry install  # installs the package and its dependencies
poetry shell  # opens the environment in a subshell
rsyn --help

Example Usage

The Documentation is the best place to get started.

See also the Examples notebook for examples of using all methods.

Related Projects

Reprosyn is under active development as a companion library for the Toolbox for Adversarial Privacy Auditing, to support research comparing synthetic data generators.

There are lots of other great synthetic data packages, such as:

Note that some of Reprosyn's design is inspired by QUIPP.
