Open
Description
I tried passing an image file directly to Zerox as input, but it raised an error:
ERROR:root:Error converting PDF to images: Unable to get page count.
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/kinda/Developer/清韵公司项目/open-source/Revornix/celery-worker/engine/zerox.py", line 55, in <module>
    result = asyncio.run(
             ^^^^^^^^^^^^
  File "/Users/kinda/anaconda3/envs/link/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/kinda/anaconda3/envs/link/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kinda/anaconda3/envs/link/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/kinda/Developer/清韵公司项目/open-source/Revornix/celery-worker/engine/zerox.py", line 28, in analyse_website
    result = await zerox(
             ^^^^^^^^^^^^
  File "/Users/kinda/anaconda3/envs/link/lib/python3.11/site-packages/pyzerox/core/zerox.py", line 149, in zerox
    results = await process_pages_in_batches(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kinda/anaconda3/envs/link/lib/python3.11/site-packages/pyzerox/processor/pdf.py", line 94, in process_pages_in_batches
    tasks = [
            ^
TypeError: 'NoneType' object is not iterable
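The "May not be a PDF file" warning and the missing trailer/xref errors suggest that the file reached the PDF-to-image step without a valid PDF structure, which is what one would expect if a screenshot image is passed as `file_path`. The sketch below is one way to check this before calling `zerox`. It assumes only that a PDF carries the `%PDF-` marker near the start of the file. `looks_like_pdf` is a hypothetical helper written for illustration and is not part of pyzerox.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(path, probe_bytes=1024):
    """Return True if the PDF header marker appears in the first bytes of the file.

    Hypothetical helper for illustration; it only sniffs the header and
    does not validate the rest of the PDF structure.
    """
    with open(path, "rb") as fh:
        head = fh.read(probe_bytes)
    return PDF_MAGIC in head
```

If `looks_like_pdf(temp_shot_img_path)` returns `False`, the input is not a PDF, and the error above is likely caused by the input type rather than by a corrupted file.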
Here is my code:
...
async with async_playwright() as p:
    browser = await p.chromium.launch(headless=True)
    page = await browser.new_page()
    await page.goto(url, wait_until="networkidle")
    html_content = await page.content()
    await page.screenshot(path=temp_shot_img_path, full_page=True)
    await browser.close()
result = await zerox(
    file_path=temp_shot_img_path,
    model="gpt-4o-mini",
    cleanup=True,
)
...