Skip to content
Sebastian Göttel edited this page Mar 1, 2025 · 2 revisions

After successfully running the pipeline, the generated JSON and XML files are stored in a structured directory within the subreddits/ folder in the project root.

The base output directory is subreddits/. Within this directory, the pipeline creates a folder named according to the subreddit and the processing mode used:

subreddits/
└── <subreddit>_<mode>/
    ├── <subreddit>_json_<mode>/
    │   ├── 00001/
    │   │   ├── ...JSON files...
    │   ├── 00002/
    │   │   ├── ...JSON files...
    │   └── ...
    └── <subreddit>_xml_<mode>/
        ├── 00001/
        │   ├── ...XML files...
        ├── 00002/
        │   ├── ...XML files...
        └── ...

JSON Files:

  • Grouped Mode: {link_id}_flat.json (e.g., 10ax890_flat.json)
  • No-Group Mode: {link_id}_{comment_id}.json (e.g., 10wugax_jepcf1r.json)

XML Files:

  • Grouped Mode: {link_id}.xml (e.g., 10ax890.xml)
  • No-Group Mode: {link_id}_{comment_id}.xml (e.g., 10wugax_jepcf1r.xml)

Clone this wiki locally