The following folders contain files that are not part of the pipeline itself. The output/ folder holds sample results for testing and debugging and shows what the pipeline's output looks like. The pdfs/ folder holds PDFs that are ready to be ingested.
pipeline/
├── output/
│ ├── extracted_text.json
│ ├── filtered_results.json
│ └── <name>_<page>_res.png
└── pdfs/
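
When inspecting the output/ folder during debugging, it can help to recover the source name and page from each result image. The sketch below is only an illustration of the <name>_<page>_res.png naming pattern shown in the tree. It assumes that <page> is a decimal page number and that <name> may itself contain underscores. The helper name parse_result_image is hypothetical and is not part of the pipeline.

```python
import re
from pathlib import Path

# Assumed pattern: <name>_<page>_res.png, where <page> is a decimal integer.
# <name> is matched greedily, so it may contain underscores itself.
RES_PATTERN = re.compile(r"^(?P<name>.+)_(?P<page>\d+)_res\.png$")


def parse_result_image(filename):
    """Return (name, page) for a result image, or None if the name does not match."""
    match = RES_PATTERN.match(Path(filename).name)
    if match is None:
        return None
    return match.group("name"), int(match.group("page"))


if __name__ == "__main__":
    # Example: list every result image found under pipeline/output/ (if it exists).
    output_dir = Path("pipeline/output")
    if output_dir.is_dir():
        for path in sorted(output_dir.glob("*_res.png")):
            print(path.name, parse_result_image(path.name))
```

Files that do not follow the pattern, such as extracted_text.json, yield None, so the helper can be run safely over the whole folder.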