rebuilding index fails #160

Open
@ggthedev

Hi,
I'm a novice user who just discovered this wonderful utility, and I encountered the following error while trying the -r option.
Here is the complete error output:

~ ❯ cppman -r
Indexing 'https://cplusplus.com/reference/' (depth 1)...
Exception in thread Thread-1 (_worker):
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
self.run()
File "/usr/local/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 975, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/cppman/crawler.py", line 248, in _worker
if self.process_document(url, content, depth):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/cppman/main.py", line 247, in process_document
keywords = self._extract_keywords(content)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/cppman/main.py", line 381, in _extract_keywords
soup = BeautifulSoup(text, "lxml")
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/bs4/__init__.py", line 249, in __init__
raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
=== Done https://cplusplus.com/reference/
Indexing 'https://en.cppreference.com/w/cpp' (depth 1)...
Exception in thread Thread-2 (_worker):
Traceback (most recent call last):
File "/usr/local/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
self.run()
File "/usr/local/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/threading.py", line 975, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/cppman/crawler.py", line 248, in _worker
if self.process_document(url, content, depth):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/cppman/main.py", line 247, in process_document
keywords = self._extract_keywords(content)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/cppman/main.py", line 381, in _extract_keywords
soup = BeautifulSoup(text, "lxml")
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/cppman/0.5.6/libexec/lib/python3.11/site-packages/bs4/__init__.py", line 249, in __init__
raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
=== Done https://en.cppreference.com/w/cpp
~ ❯

Could you let me know whether this issue is related to BeautifulSoup? I can infer that some parser library for lxml is missing, but I'm not sure which one.
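
For anyone reproducing this, here is a minimal sketch of how to check the inference above. The "lxml" feature that BeautifulSoup asks for is provided by the third-party lxml package, so the check below only tests whether that package is importable. It assumes the script is run with the same interpreter cppman uses (the libexec environment shown in the traceback). The helper names `module_available` and `choose_bs4_parser` are hypothetical, and the fallback to "html.parser" is only an illustration; the traceback shows cppman requesting "lxml" directly.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


def choose_bs4_parser() -> str:
    """Pick a parser name that could be passed to BeautifulSoup(text, <parser>).

    "lxml" needs the third-party lxml package to be installed;
    "html.parser" ships with the Python standard library.
    Hypothetical helper: cppman itself does not do this.
    """
    return "lxml" if module_available("lxml") else "html.parser"


if __name__ == "__main__":
    print("lxml importable:", module_available("lxml"))
    print("parser that would work:", choose_bs4_parser())
```

If the first line prints False, installing lxml into that same environment (for example with pip install lxml, run through that interpreter) would likely be what the error message is asking for.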
