Many nodes fail to be processed in khan_academy_fr #100

Open

Description

1046 nodes have failed to be processed in https://farm.openzim.org/pipeline/62191f74-ff73-473d-acc3-49af55fb5f8b/debug

I browsed through the errors and found the following patterns (I might have missed some):

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 98, in wrapper
    return func(self, item)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 222, in add_node
    handler(node_id)
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 325, in add_topic_node
    node = self.db.get_node(node_id, with_parents=True, with_children=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 189, in get_node
    "children_count": self.get_node_children_count(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 125, in get_node_children_count
    return self.get_cell(
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 69, in get_cell
    return self.get_row(query, *args, **kwargs)[0]
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 100, in wrapper
    raise RuntimeError(f"Failed to process {kind} node {node_id}") from exc
RuntimeError: Failed to process topic node 232ba2df649f5225b0bf7d16613fc70b

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 98, in wrapper
    return func(self, item)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 222, in add_node
    handler(node_id)
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 327, in add_topic_node
    html = self.jinja2_env.get_template("topic.html").render(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.12/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/templates/topic.html", line 1, in top-level template code
    {% extends "base.html" %}
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/templates/base.html", line 36, in top-level template code
    {% block content %}{% endblock %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/templates/topic.html", line 9, in block 'content'
    {% for child in children %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 114, in get_node_children
    "thumbnail": self.get_thumbnail_name(rowdict["id"]),
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 217, in get_thumbnail_name
    thumbnail = self.get_node_thumbnail(node_id)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 214, in get_node_thumbnail
    return self.get_node_file(node_id, thumbnail=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 198, in get_node_file
    return next(self.get_node_files(node_id, thumbnail=thumbnail))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 203, in get_node_files
    for row in self.get_rows(
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 73, in get_rows
    cursor = conn.execute(query, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.InterfaceError: bad parameter or other API misuse

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 100, in wrapper
    raise RuntimeError(f"Failed to process {kind} node {node_id}") from exc
RuntimeError: Failed to process topic node 02cf7d8d22b4520fb6c8cd1d8e731052

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 98, in wrapper
    return func(self, item)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 222, in add_node
    handler(node_id)
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 327, in add_topic_node
    html = self.jinja2_env.get_template("topic.html").render(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: jinja2.environment.Template.render() argument after ** must be a mapping, not NoneType

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 100, in wrapper
    raise RuntimeError(f"Failed to process {kind} node {node_id}") from exc
RuntimeError: Failed to process topic node bccfcc046a7f5f8ea093a0c27bfa2f66

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 98, in wrapper
    return func(self, item)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 218, in add_node
    thumbnail = self.db.get_node_thumbnail(node_id)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 214, in get_node_thumbnail
    return self.get_node_file(node_id, thumbnail=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 198, in get_node_file
    return next(self.get_node_files(node_id, thumbnail=thumbnail))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/database.py", line 211, in get_node_files
    yield dict(row)
          ^^^^^^^^^
IndexError: tuple index out of range

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 100, in wrapper
    raise RuntimeError(f"Failed to process {kind} node {node_id}") from exc
RuntimeError: Failed to process topic node 4a3ad4543c5b5f1bbb02bc88b82e52c6

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 98, in wrapper
    return func(self, item)
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 220, in add_node
    self.funnel_file(thumbnail["id"], thumbnail["ext"])
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 227, in funnel_file
    url, fname = get_kolibri_url_for(fid, fext)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 85, in get_kolibri_url_for
    remote_dirs = (file_id[0], file_id[1])
                   ~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kolibri2zim/scraper.py", line 100, in wrapper
    raise RuntimeError(f"Failed to process {kind} node {node_id}") from exc
RuntimeError: Failed to process topic node 8ab0e74e98605f698b6ba9f244920f21
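
All five patterns above pass through `database.py` or its results while running inside `concurrent.futures` worker threads, and `sqlite3.InterfaceError: bad parameter or other API misuse` is an error commonly seen when a single `sqlite3` connection is used from several threads at once. Whether that is the actual cause here is not confirmed. As a minimal sketch under that assumption, the following shows one way to give each worker thread its own connection and to handle a missing row explicitly. The `ThreadLocalDB` class, its `default` parameter, and the table used in any query against it are hypothetical and are not taken from kolibri2zim.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor  # used by callers of the wrapper


class ThreadLocalDB:
    """Hypothetical wrapper (not kolibri2zim code): one connection per thread."""

    def __init__(self, path: str):
        self.path = path
        # Each thread sees its own attributes on this object.
        self._local = threading.local()

    def _conn(self) -> sqlite3.Connection:
        # Lazily open a connection the first time a given thread needs one.
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = sqlite3.connect(self.path)
            conn.row_factory = sqlite3.Row
            self._local.conn = conn
        return conn

    def get_row(self, query: str, *args):
        # fetchone() returns None when the query yields no row.
        return self._conn().execute(query, args).fetchone()

    def get_cell(self, query: str, *args, default=None):
        # Return the first column of the first row, or `default` if no row
        # came back, instead of subscripting None.
        row = self.get_row(query, *args)
        return default if row is None else row[0]
```

With a wrapper like this, calls such as `db.get_cell("SELECT COUNT(*) FROM some_table WHERE parent_id = ?", node_id)` can be submitted to a `ThreadPoolExecutor` without two threads ever sharing a cursor or connection. The database must be a file on disk for this to work, since every in-memory SQLite connection gets its own separate database.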

Labels

bug: Something isn't working
