clflushopt/tpchgen-rs

tpchgen-rs

Blazing fast TPC-H benchmark data generator in pure Rust!

Usage

tpchgen-cli is a dbgen-compatible CLI tool that generates the tables of the TPC-H benchmark dataset.

tpchgen is the library that implements the TPC-H data generation logic. Use it to embed data generation natively in Rust programs.

CLI Usage

We tried to keep the tpchgen-cli experience as close to dbgen as possible, so that it can serve as a drop-in replacement.

$ tpchgen-cli -h
TPC-H Data Generator

Usage: tpchgen-cli [OPTIONS] --output-dir <OUTPUT_DIR>

Options:
  -s, --scale-factor <SCALE_FACTOR>  Scale factor to address defaults to 1 [default: 1]
  -o, --output-dir <OUTPUT_DIR>      Output directory for generated files
  -t, --tables <TABLES>              Which tables to generate (default: all) [possible values: nation, region, part, supplier, part-supp, customer, orders, line-item]
  -p, --parts <PARTS>                Number of parts to generate (for parallel generation) [default: 1]
      --part <PART>                  Which part to generate (1-based, only relevant if parts > 1) [default: 1]
  -h, --help                         Print help

For example, to generate a dataset with a scale factor of 1 (1 GB):

$ tpchgen-cli -s 1 --output-dir=/tmp/tpch
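
The --parts and --part options listed in the help output split generation into independent pieces. The following is a minimal sketch of how one invocation per part could be assembled. It only prints the commands (a dry run) so they can be inspected first. The helper name tpch_part_commands and the one-directory-per-part layout are assumptions made for illustration, not something the tool requires.

```shell
#!/bin/sh
# Hypothetical helper (not part of tpchgen-cli): print one command line per part.
# Arguments: scale factor, number of parts, base output directory.
tpch_part_commands() {
  scale_factor=$1
  parts=$2
  out_dir=$3
  part=1
  while [ "$part" -le "$parts" ]; do
    # Assumption: each part gets its own directory so outputs cannot collide.
    # mkdir -p is included in case the CLI does not create the directory itself.
    printf 'mkdir -p %s/part-%s && tpchgen-cli -s %s --output-dir=%s/part-%s --parts %s --part %s\n' \
      "$out_dir" "$part" "$scale_factor" "$out_dir" "$part" "$parts" "$part"
    part=$((part + 1))
  done
}

# Dry run: show the four commands for scale factor 10 split into 4 parts.
tpch_part_commands 10 4 /tmp/tpch
```

Piping the output to sh runs the parts one after another. Dispatching each printed line as a separate background job would run them concurrently. Paths containing spaces would need extra quoting in this sketch.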

Contributing

Pull requests are welcome. For major changes, please open an issue first for discussion. See our contributors guide for more details.

License

The project is licensed under the Apache 2.0 license.

References

  • The TPC-H Specification, see the specification page.
  • The original dbgen implementation: you must submit an official request through the TPC's official website to access the dbgen software.