I'm using SimplePDFViewer to scrape a PDF in an app that has its own logging configured, and I've discovered that doing so writes unexpected log statements to stdout.
I created a slightly larger than minimal example to illustrate the problem. The attached zip contains the Python code
and a test PDF, testPartial.pdf.
To see the problem, extract both files to the same directory and run the program twice, once with the --pdf
flag and once without:
will run without executing the code on lines 23 & 24, so SimplePDFViewer is not run. The expected result will be:
Be patient - Extracting text & strings from testPartial.pdf
PDF scrapping complete!
Generated file test…
Note that I'm using logging instead of print to generate text output to the console.
python --pdf
will run and execute lines 23 & 24, so SimplePDFViewer is run. The unexpected result will be:
Be patient - Extracting text & strings from testPartial.pdf
PDF scrapping complete!
INFO:logTest:PDF scrapping complete!
DEBUG:logTest:doing somethings else…
Generated file test…
INFO:logTest:Generated file test…
You can see that there are 3 extra logging lines, and their format is clearly different from the stream formatter I set up on line 52 or the file formatter I set up on line 64 of
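A possible explanation (an assumption, not verified against the pdfreader source): the extra lines use the LEVEL:name:message layout, which is the default format of Python's logging.basicConfig(). That would fit a handler being attached to the root logger somewhere while SimplePDFViewer runs, after which every record from the logTest logger propagates to it as well. The sketch below reproduces that effect without pdfreader. The in-memory streams, the message-only formatter, and the manually added root handler are stand-ins for illustration, not the code from the attached zip. Also note that a handler created by basicConfig() writes to stderr by default, not stdout.

```python
import io
import logging

# Hypothetical stand-in for the app's logger; the name matches the output above.
logger = logging.getLogger("logTest")
logger.setLevel(logging.DEBUG)
logger.handlers.clear()

# App handler: message-only format, INFO and above (assumed setup).
app_stream = io.StringIO()  # stands in for the console
app_handler = logging.StreamHandler(app_stream)
app_handler.setLevel(logging.INFO)
app_handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(app_handler)

# Simulate some other code attaching a handler to the root logger,
# which is what logging.basicConfig() does with its default format.
root = logging.getLogger()
root_stream = io.StringIO()  # stands in for stderr
root_handler = logging.StreamHandler(root_stream)
root_handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
root.addHandler(root_handler)

logger.info("PDF scrapping complete!")
logger.debug("debug detail")  # hypothetical DEBUG message

# The root handler has no level of its own, so it also emits the DEBUG record.
before = root_stream.getvalue()

# Possible workaround: stop records from reaching the root logger's handlers.
logger.propagate = False
logger.info("logged after propagate = False")
after = root_stream.getvalue()

root.removeHandler(root_handler)  # leave the root logger as we found it
```

If this is what is happening, setting logger.propagate = False on the app's logger (or removing the unexpected handler from the root logger after the SimplePDFViewer call) should suppress the duplicate lines while leaving the app's own handlers untouched.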
Version Information:
- pdfreader: 0.1.11
- python: 3.10.2