Skip to content

GH-123599: url2pathname(): handle authority section in file URL #126844

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Apr 10, 2025

Conversation

barneygale
Copy link
Contributor

@barneygale barneygale commented Nov 14, 2024

In urllib.request.url2pathname(), if the authority resolves to the current host, discard it. If an authority is present but resolves somewhere else, then on Windows we return a UNC path (as before), and on other platforms we raise URLError.

Affects pathlib.Path.from_uri() in the same way.

I'm indebted to Eryk Sun for his analysis.

… on POSIX

Adjust `urllib.request.url2pathname()` to parse the URL authority and path
with `urlsplit()` on POSIX. If the authority is empty or resolves to the
current host, it is ignored and the URL path is used as the pathname.
If not, we raise `URLError`.
@barneygale barneygale changed the title GH-126838: url2pathname(): handle non-empty authority section on POSIX GH-123599: url2pathname(): handle non-empty authority section on POSIX Nov 27, 2024
@barneygale barneygale changed the title GH-123599: url2pathname(): handle non-empty authority section on POSIX GH-123599: url2pathname(): handle authority section on POSIX Mar 19, 2025
@barneygale barneygale changed the title GH-123599: url2pathname(): handle authority section on POSIX GH-123599: url2pathname(): handle authority section in file URL Mar 19, 2025
Copy link
Member

@AA-Turner AA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two reviews for the price of one!

barneygale and others added 3 commits March 20, 2025 01:58
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
@barneygale
Copy link
Contributor Author

Quick note on timing for this PR and #125866.

If possible I'd like to land this PR in time for 3.14 beta 1, in ~6 weeks time. It would mean we restrict backwards-incompatible changes to 3.14, with none planned for 3.15. I think that would be better for users who might be affected by the changes - e.g. folks who previously wrapped the urllib.request functions and fixed up faulty results - as they'd need to deal with changes only once.

I'll wait until 3.15 before I look at #125866 so I don't overload 3.14. That change will be 100% backwards-compatible.

Happy to re-think if anyone is unhappy with this plan. Cheers

@barneygale
Copy link
Contributor Author

This PR also solves #90812. Once it lands, I'll open a PR against #90812 that adds tests.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes may be not so innocent:

  • url2pathname() now performs network requests (and hang for a time).
  • url2pathname() now can raise URLError.
  • The result of gethostbyname_ex() and gethostname() is cached. Previously, you could reset this by setting FileHandler.names = None.
  • There may be a difference in handling authorities with port. It is not covered by tests.

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some docs comments.

@barneygale
Copy link
Contributor Author

barneygale commented Mar 29, 2025

Thank you for the reviews!

  • url2pathname() now performs network requests (and hang for a time).

That seems to be needed per RFC 8089 section 3 (ref) and section 2 less explicitly.

And it only comes up if the hostname isn't empty or "localhost"

Even so, perhaps we should put the new behaviour behind an argument like resolve_netloc=False?

  • url2pathname() now can raise URLError.

I think this is preferable to returning a nonsense local path, e.g. url2pathname('//google.com/foo') returns its argument unchanged on POSIX at the moment.

  • The result of gethostbyname_ex() and gethostname() is cached. Previously, you could reset this by setting FileHandler.names = None.

Fixed!

  • There may be a difference in handling authorities with port. It is not covered by tests.

I've added the test cases you suggested.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 👍

@barneygale barneygale requested a review from AA-Turner April 10, 2025 18:34
Copy link
Member

@AA-Turner AA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

A

@barneygale
Copy link
Contributor Author

Thank you both :)

@barneygale barneygale enabled auto-merge (squash) April 10, 2025 19:34
@barneygale barneygale merged commit 66cdb2b into python:main Apr 10, 2025
39 checks passed
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot aarch64 Fedora Stable Refleaks 3.x (tier-2) has failed when building commit 66cdb2b.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/123/builds/926) and take a look at the build logs.
  4. Check if the failure is related to this commit (66cdb2b) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/123/builds/926

Failed tests:

  • test_perf_profiler

Failed subtests:

  • test_python_calls_appear_in_the_stack_if_perf_activated - test.test_perf_profiler.TestPerfProfilerWithDwarf.test_python_calls_appear_in_the_stack_if_perf_activated

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/Lib/test/test_perf_profiler.py", line 364, in test_python_calls_appear_in_the_stack_if_perf_activated
    self.assertIn(f"py::foo:{script}", stdout)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'py::foo:/tmp/test_python_ovn4jxvt/tmp1na5uet8/perftest.py' not found in 'python 4080915 1475437.339487:          1 cycles:Pu: \n\t    ffffb3591ac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 4080915 1475437.339532:          1 cycles:Pu: \n\tffffaa282d37fc78 [unknown] ([unknown])\n\tffffaa282d38049c [unknown] ([unknown])\n\tffffaa282bf215e4 [unknown] ([unknown])\n\t    ffffb3591ac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 4080915 1475437.340098:          1 cycles:Pu: \n\t    ffffb357d830 _dl_map_object_from_fd+0xb50 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb357e133 _dl_map_object+0x1e7 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb35795bf openaux+0x3f (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb3578303 _dl_catch_exception+0x63 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb3579b33 _dl_map_object_deps+0x553 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb358f19f dl_main+0x139f (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb358c5ff _dl_sysdep_start+0x1df (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb358db17 _dl_start_final+0x5ab (inlined)\n\t    ffffb358db17 _dl_start+0x5ab (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb3591ad3 _start+0x13 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 4080915 1475437.340126:        310 cycles:Pu: \n\t    ffffb357d830 _dl_map_object_from_fd+0xb50 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb357e133 _dl_map_object+0x1e7 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb35795bf openaux+0x3f (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb3578303 _dl_catch_exception+0x63 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb3579b33 _dl_map_object_deps+0x553 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb358f19f dl_main+0x139f (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb358c5ff _dl_sysdep_start+0x1df (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb358db17 _dl_start_final+0x5ab (inlined)\n\t    ffffb358db17 _dl_start+0x5ab (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffffb3591ad3 _start+0x13 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 4080915 1475437.341107:        962 cycles:Pu: \n\t          5142ac _mi_options_init+0x20 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51cd5b mi_process_load+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51cdeb _mi_process_init+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffffb33263b7 __libc_start_main@@GLIBC_2.34+0x117 (/usr/lib64/libc.so.6)\n\t          41f22f _start+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\npython 4080915 1475437.341179:      40139 cycles:Pu: \n\t          41ed90 sysconf@plt+0x0 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          50f9a7 _mi_prim_mem_init+0x17 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffffb34c1e97 __new_exitfn_called+0x7 (/usr/lib64/libc.so.6)\n\t    ffffb34c1e97 __new_exitfn_called+0x7 (/usr/lib64/libc.so.6)\n\npython 4080915 1475437.342286:      86790 cycles:Pu: \n\tffffaa282d37f654 [unknown] ([unknown])\n\tffffaa282d380484 [unknown] ([unknown])\n\tffffaa282bf215e4 [unknown] ([unknown])\n\t          538c14 _PyType_InitCache+0x1c (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          67479b init_interpreter+0x9f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          675bb3 _PyInterpreterState_New+0xc7 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66a6bb pycore_create_interpreter+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66b24b pyinit_config+0x6f (/home/buildbot/buildarea/3.x.c
h64.refleak/build/python)\n\t          6a4caf pymain_init+0xff (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4d7f pymain_main+0xf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4e13 Py_BytesMain+0x27 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          41f337 main+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffffb332625b __libc_start_call_main+0x7b (/usr/lib64/libc.so.6)\n\t    ffffb332633b __libc_start_main@@GLIBC_2.34+0x9b (/usr/lib64/libc.so.6)\n\t          41f22f _start+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\npython 4080915 1475437.343799:    1449856 cycles:Pu: \n\t          502490 _Py_INCREF_IncRefTotal+0x0 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          4e8ddf Py_INCREF+0x2b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          4f0a7f _Py_NewRef+0x3f (inlined)\n\t          4f0a7f PyDict_SetItem+0x3f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          536543 add_tp_new_wrapper+0x6b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          5365e7 type_ready_set_new+0x63 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          543d27 type_ready+0x87 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          543fbf init_static_type+0x1a7 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          5440c3 _PyStaticType_InitBuiltin+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          502e93 _PyTypes_InitTypes+0xa7 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66a9eb pycore_init_types+0x1b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66b08f pycore_interp_init+0x10b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66b28b pyinit_config+0xaf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672b77 pyinit_core+0xdb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672c4f Py_InitializeFromConfig+0x97 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4caf pymain_init+0xff (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4d7f pymain_main+0xf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4e13 Py_BytesMain+0x27 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          41f337 main+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffffb332625b __libc_start_call_main+0x7b (/usr/lib64/libc.so.6)\n\t    ffffb332633b __libc_start_main@@GLIBC_2.34+0x9b (/usr/lib64/libc.so.6)\n\t          41f22f _start+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\npython 4080915 1475437.345794:    3462647 cycles:Pu: \n\t          51f774 allocate_from_new_pool+0x24 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51fc7f pymalloc_alloc+0x97 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51fcc3 _PyObject_Malloc+0x2b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          507427 _PyMem_DebugRawAlloc+0xbf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          507477 _PyMem_DebugRawMalloc+0x17 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          508227 _PyMem_DebugMalloc+0


Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/Lib/test/test_perf_profiler.py", line 366, in test_python_calls_appear_in_the_stack_if_perf_activated
    self.assertIn(f"py::baz:{script}", stdout)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'py::baz:/tmp/test_python_md55p2cr/tmpm_v5y19b/perftest.py' not found in 'python 4101386 1476164.623344:          1 cycles:Pu: \n\tffffaa282d37fc78 [unknown] ([unknown])\n\tffffaa282d38049c [unknown] ([unknown])\n\tffffaa282bf215e4 [unknown] ([unknown])\n\t    ffffb80d2ac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 4101386 1476164.623372:          1 cycles:Pu: \n\t    ffffb80d2ac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 4101386 1476164.624094:          1 cycles:Pu: \n\t          50940c _mi_thread_id+0x0 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          519f4f mi_heap_main_init+0x23 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51cd37 mi_process_load+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51cdeb _mi_process_init+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffffb7e663b7 __libc_start_main@@GLIBC_2.34+0x117 (/usr/lib64/libc.so.6)\n\t          41f22f _start+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\npython 4101386 1476164.624114:        405 cycles:Pu: \n\t          50941c _mi_thread_id+0x10 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          519f4f mi_heap_main_init+0x23 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51cd37 mi_process_load+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t

@picnixz

This comment was marked as resolved.

seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025
…RL (python#126844)

In `urllib.request.url2pathname()`, if the authority resolves to the
current host, discard it. If an authority is present but resolves somewhere
else, then on Windows we return a UNC path (as before), and on other
platforms we raise `URLError`.

Affects `pathlib.Path.from_uri()` in the same way.

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants