bpo-42988: Fix security issue in the pydoc server #24337

Closed
33 changes: 33 additions & 0 deletions Lib/pydoc.py
@@ -2544,9 +2544,42 @@ def bltinlink(name):
                 'key = %s' % key, '#ffffff', '#ee77aa', '<br>'.join(results))
             return 'Search Results', contents

+    def validate_source_path(path):
+        with warnings.catch_warnings():
+            warnings.filterwarnings('ignore') # ignore problems during import
+            def onerror(modname):
+                pass
+            for importer, modname, ispkg in pkgutil.walk_packages(onerror=onerror):
+                try:
+                    spec = pkgutil._get_spec(importer, modname)
+                except SyntaxError:
+                    # raised by tests for bad coding cookies or BOM
+                    continue
+                loader = spec.loader
+                if hasattr(loader, 'get_source'):
+                    try:
+                        source = loader.get_source(modname)
+                    except Exception:
+                        continue
+                    if hasattr(loader, 'get_filename'):
+                        sourcepath = loader.get_filename(modname)
+                        if path == sourcepath:
+                            return
+                else:
+                    try:
+                        module = importlib._bootstrap._load(spec)
+                    except ImportError:
+                        continue
+                    sourcepath = getattr(module, '__file__', None)
+                    if path == sourcepath:
+                        return
+            else:
+                raise ValueError('not found {found}')
Hi, I came across this PR by chance. Could you check this error message? It looks unfinished.


     def html_getfile(path):
         """Get and display a source file listing safely."""
         path = urllib.parse.unquote(path)
+        validate_source_path(path)
         with tokenize.open(path) as fp:
             lines = html.escape(fp.read())
         body = '<pre>%s</pre>' % lines
13 changes: 12 additions & 1 deletion Lib/test/test_pydoc.py
@@ -1369,7 +1369,11 @@ def test_url_requests(self):
             ("topics", "Pydoc: Topics"),
             ("keywords", "Pydoc: Keywords"),
             ("pydoc", "Pydoc: module pydoc"),
+            ("test.test_pydoc", "Pydoc: module test.test_pydoc"),
             ("get?key=pydoc", "Pydoc: module pydoc"),
+            ("get?key=test.test_pydoc", "Pydoc: module test.test_pydoc"),
+            ("get?key=html", "Pydoc: package html"),
+            ("get?key=sys", "Pydoc: built-in module sys"),
             ("search?key=pydoc", "Pydoc: Search Results"),
             ("topic?key=def", "Pydoc: KEYWORD def"),
             ("topic?key=STRINGS", "Pydoc: TOPIC STRINGS"),
@@ -1381,11 +1385,18 @@ def test_url_requests(self):
             for url, title in requests:
                 self.call_url_handler(url, title)

-            path = string.__file__
+            # File in restricted walk_packages path.
+            path = __file__
             title = "Pydoc: getfile " + path
             url = "getfile?key=" + path
             self.call_url_handler(url, title)

+            # File outside of restricted walk_packages path.
+            path = pydoc.__file__
+            title = "Pydoc: Error - getfile?key=" + path
+            url = "getfile?key=" + path
+            self.call_url_handler(url, title)
+

 class TestHelper(unittest.TestCase):
     def test_keywords(self):
@@ -0,0 +1,3 @@
+The ``/getfile?key=`` route of the :mod:`pydoc` Web server now checks that
+the argument is the file path of the source of one of the modules. This
+prevents reading arbitrary files.
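
The check described in this entry amounts to an allow-list lookup: a requested path is served only if it matches the source file of a known module. Below is a minimal sketch of that idea, not the patch itself. It assumes that consulting already-imported modules in `sys.modules` is sufficient, whereas the patch walks all packages with `pkgutil.walk_packages`. The helper names `known_source_paths` and `check_source_path` are hypothetical and not part of pydoc.

```python
import os
import sys


def known_source_paths():
    """Collect file paths of modules that are already imported.

    Assumption: sys.modules is a good enough allow-list for this sketch.
    """
    paths = set()
    for module in list(sys.modules.values()):
        filename = getattr(module, "__file__", None)
        if filename:
            # Normalise so that symlinks and relative segments compare equal.
            paths.add(os.path.realpath(filename))
    return paths


def check_source_path(path):
    """Return path if it belongs to a loaded module, else raise ValueError."""
    if os.path.realpath(path) not in known_source_paths():
        # Name the rejected path so the error message is complete.
        raise ValueError(f"not a module source file: {path!r}")
    return path
```

A caller would run `check_source_path(path)` before opening the file, so that a request for a path outside the allow-list fails with `ValueError` instead of returning file contents.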