Make sure we can export in a dir structure that tiq-test can handle #29

Closed
@alexcpsec

MOAR work!

Here is how things look in the tiq-test data directory right now:

aperture-2:data alexcp$ ls
enriched    population  raw
aperture-2:data alexcp$ ls raw
public_inbound  public_outbound
aperture-2:data alexcp$ ls raw/pu
public_inbound/  public_outbound/
aperture-2:data alexcp$ ls raw/public_inbound/
20140615.csv.gz 20140618.csv.gz 20140622.csv.gz 20140625.csv.gz 20140628.csv.gz 20140701.csv.gz 20140704.csv.gz 20140707.csv.gz 20140710.csv.gz 20140713.csv.gz
20140616.csv.gz 20140619.csv.gz 20140623.csv.gz 20140626.csv.gz 20140629.csv.gz 20140702.csv.gz 20140705.csv.gz 20140708.csv.gz 20140711.csv.gz 20140714.csv.gz
20140617.csv.gz 20140620.csv.gz 20140624.csv.gz 20140627.csv.gz 20140630.csv.gz 20140703.csv.gz 20140706.csv.gz 20140709.csv.gz 20140712.csv.gz 20140715.csv.gz

Basically we have the following structure:
data/[DATATYPE]/[DATAGROUP]/[YYYYMMDD].csv.gz, where:

  • DATATYPE should be either raw or enriched. The names refer to the data structure to expect in the CSVs inside (as described in the README). Disregard the population type; it should not be a target for this presentation.
  • DATAGROUP refers to the group name of the combine output (currently the "inbound" and "outbound" separation). The names can be whatever you like; I am using public_inbound and public_outbound for the presentation data.
  • YYYYMMDD is the way dates should be represented in the whole world.

Please note the CSVs are gzipped. The code expects that as well.
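
To make the layout concrete, here is a minimal Python sketch of an exporter that writes into it. This is not existing combine code: the function names `export_path` and `write_export`, and the header and rows passed in, are hypothetical placeholders. The real column layout is whatever the README describes for raw and enriched data.

```python
import csv
import gzip
from datetime import date
from pathlib import Path

# Only these two types are export targets; "population" is deliberately excluded.
VALID_DATATYPES = {"raw", "enriched"}


def export_path(base_dir, datatype, datagroup, day):
    """Build <base_dir>/<DATATYPE>/<DATAGROUP>/<YYYYMMDD>.csv.gz (hypothetical helper)."""
    if datatype not in VALID_DATATYPES:
        raise ValueError(
            f"DATATYPE must be one of {sorted(VALID_DATATYPES)}, got {datatype!r}"
        )
    # date.__format__ accepts strftime codes, so this yields e.g. 20140615.
    return Path(base_dir) / datatype / datagroup / f"{day:%Y%m%d}.csv.gz"


def write_export(base_dir, datatype, datagroup, day, header, rows):
    """Write one day's rows as a gzipped CSV, creating directories as needed."""
    target = export_path(base_dir, datatype, datagroup, day)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Text mode ("wt") lets csv.writer write straight into the gzip stream.
    with gzip.open(target, "wt", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)
    return target
```

For example, `export_path("data", "raw", "public_inbound", date(2014, 6, 15))` gives `data/raw/public_inbound/20140615.csv.gz`, which matches the listing above.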
