Error when retrieving metadata #2547

Closed
StrangeRanger opened this issue Sep 23, 2022 · 7 comments

StrangeRanger commented Sep 23, 2022

Describe the bug/problem
Sometimes, when I try to retrieve the metadata for a newly or previously uploaded PDF, it produces an error saying it can't retrieve anything. It's not something that happens all the time, so it's difficult to recreate the problem reliably.

To Reproduce
Steps to reproduce the behavior:

  1. Import a PDF into the instance
  2. Attempt to retrieve its metadata
  3. Possibly encounter the metadata retrieval error

Logfile

[2022-09-23 09:59:14,256]  WARN {cps.uploader:229} Cannot extract cover image, using default: NoDecodeDelegateForThisImageFormat `' @ error/constitute.c/ReadImage/572
[2022-09-23 09:59:14,256]  WARN {cps.uploader:230} On Windows this error could be caused by missing ghostscript
[2022-09-23 09:59:39,727]  INFO {scholarly:95} Getting https://scholar.google.com/scholar?hl=en&q=Operating%20Systems_%20Three%20Easy%20Pieces%20Remzi%20H.%20Arpaci-Dusseau&as_vis=0&as_sdt=0,33
[2022-09-23 09:59:42,176]  INFO {scholarly:121} Got a captcha request.
[2022-09-23 09:59:42,179]  INFO {scholarly:351} Neither Chrome nor Firefox/Geckodriver found in PATH
[2022-09-23 09:59:42,179]  INFO {scholarly:162} Exception AttributeError while fetching page: ("'NoneType' object has no attribute 'current_url'",)
[2022-09-23 09:59:42,180]  INFO {scholarly:163} Retrying with a new session.
[2022-09-23 09:59:44,490]  INFO {scholarly:121} Got a captcha request.
[2022-09-23 09:59:44,492]  INFO {scholarly:351} Neither Chrome nor Firefox/Geckodriver found in PATH
[2022-09-23 09:59:44,492]  INFO {scholarly:162} Exception AttributeError while fetching page: ("'NoneType' object has no attribute 'current_url'",)
[2022-09-23 09:59:44,492]  INFO {scholarly:163} Retrying with a new session.
[2022-09-23 09:59:44,495]  WARN {cps.metadata_provider.scholar:56} Cannot Fetch from Google Scholar.
[2022-09-23 09:59:44,497] ERROR {cps:1457} Exception on /metadata/search [POST]
Traceback (most recent call last):
  File "/var/calibre/.local/share/virtualenvs/calibre-web-eKriAZSx/lib/python3.10/site-packages/flask/app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "/var/calibre/.local/share/virtualenvs/calibre-web-eKriAZSx/lib/python3.10/site-packages/flask/app.py", line 1518, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/var/calibre/.local/share/virtualenvs/calibre-web-eKriAZSx/lib/python3.10/site-packages/flask/app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "/var/calibre/.local/share/virtualenvs/calibre-web-eKriAZSx/lib/python3.10/site-packages/flask/app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/var/calibre/.local/share/virtualenvs/calibre-web-eKriAZSx/lib/python3.10/site-packages/flask_login/utils.py", line 303, in decorated_view
    return current_app.ensure_sync(func)(*args, **kwargs)
  File "/var/www/calibre.randomserver.xyz/calibre-web/cps/search_metadata.py", line 141, in metadata_search
    data.extend([asdict(x) for x in future.result() if x])
TypeError: 'NoneType' object is not iterable
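
The last frame of the traceback iterates over `future.result()`, so the `TypeError` is what Python raises when a provider's result is `None` rather than a list. The following is a minimal sketch of that situation, not Calibre-Web's actual code: it assumes the search runs providers through `concurrent.futures`, and `Record`, `working_provider`, `failing_provider`, and `collect` are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import asdict, dataclass


@dataclass
class Record:
    # Hypothetical stand-in for one metadata search result.
    title: str


def working_provider(query):
    # Hypothetical provider that returns a list of results.
    return [Record(title=query)]


def failing_provider(query):
    # Hypothetical provider that gives up (e.g. after a captcha) and returns None.
    return None


def collect(providers, query):
    """Gather results from all providers, treating a None result as empty."""
    data = []
    with ThreadPoolExecutor(max_workers=len(providers)) as executor:
        futures = [executor.submit(p, query) for p in providers]
        for future in as_completed(futures):
            # Without the `or []`, iterating over None raises
            # "TypeError: 'NoneType' object is not iterable".
            data.extend(asdict(x) for x in (future.result() or []) if x)
    return data
```

With this guard, one provider returning `None` no longer aborts the whole search; the results of the remaining providers are still returned.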

Expected behavior
Metadata is retrieved without errors.

Screenshots
Screen Shot 2022-09-23 at 10 09 01 AM

Environment (please complete the following information):

  • OS: Ubuntu 22.04
  • Python version: Pipenv installed - 3.10.x
  • Calibre-Web version: 0.6.19
  • Browser: Brave Version 1.43.93, Chromium: 105.0.5195.127

StrangeRanger changed the title from "Error when importing metadata" to "Error when retrieving metadata" on Sep 23, 2022

arseniov commented Sep 25, 2022

I have the same issue when updating or fetching metadata for EPUB and KEPUB files. I haven't tried other file formats.

My log entries:

ERROR {cps.editbooks:231} 'NoneType' object is not callable

Environment (please complete the following information):

  • OS: Synology DSM 7.1

  • Python version: Python 3.10

  • Calibre-Web version: 0.6.19

  • Browser: Edge 105.0.1343.48

@OzzieIsaacs
Collaborator

It's not something that happens all the time, so it's difficult to recreate the problem reliably.

Google Scholar sometimes asks for a captcha. The library used to query it (scholarly) doesn't handle this, so it throws an exception, but only intermittently. If you update to the newest nightly version, Calibre-Web should no longer crash.
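
One way to keep an intermittently failing provider from crashing a search is to catch its exception and return an empty list. The sketch below illustrates that idea only; it is not the change made in the nightly version, and `safe_search` and `flaky_fetch` are hypothetical names.

```python
import logging
from typing import Callable, List, Optional

log = logging.getLogger("metadata_sketch")


def safe_search(fetch: Callable[[str], Optional[List[dict]]], query: str) -> List[dict]:
    """Call a provider's fetch function; return [] if it raises or returns None."""
    try:
        results = fetch(query)
    except Exception as ex:
        # Broad on purpose: any provider failure is treated as "no results".
        log.warning("Cannot fetch metadata: %s", ex)
        return []
    return results if results is not None else []


def flaky_fetch(query: str) -> List[dict]:
    # Hypothetical provider that mimics the AttributeError seen in the log.
    raise AttributeError("'NoneType' object has no attribute 'current_url'")
```

Because the wrapper always returns a list, the caller can iterate over the result without checking for `None` first.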

arseniov commented Oct 2, 2022

It's not something that happens all the time, so it's difficult to recreate the problem reliably.

Google Scholar sometimes asks for a captcha. The library used to query it (scholarly) doesn't handle this, so it throws an exception, but only intermittently. If you update to the newest nightly version, Calibre-Web should no longer crash.

Thanks. I need to install manually, so which branch is the nightly version? Is it the "1" or "Develop" branch on GitHub?

@OzzieIsaacs
Collaborator

It's the master branch.

@StrangeRanger
Author

@OzzieIsaacs I'm using the master branch and have updated to the latest push, but it doesn't seem to have fixed anything. It still produces a "Search error!!". This time it's happening with the two non-highlighted ones.
[Screenshot: Screen Shot 2022-10-06 at 3 02 18 PM]

jpbaril commented Oct 10, 2022

It works if I select only Amazon and Google as sources.

@OzzieIsaacs
Collaborator

Please update to the newest nightly version; it should be fixed now.
