urllib.parse.urlparse output named tuples description is wrong for Python 3.9 and 3.10 #91708

Closed
dnicolodi opened this issue Apr 19, 2022 · 3 comments
Labels
docs Documentation in the Doc dir

Comments

@dnicolodi

The documentation for urllib.parse.urlparse states that:

The return value is a named tuple, which means that its items can be accessed by index or as named attributes, which are:

Attribute | Index | Value | Value if not present
...
params | 3 | No longer used | always an empty string

However, the documentation does not match the actual behavior:

>>> import urllib.parse
>>> p = urllib.parse.urlparse('http://foo.test/test;param')
>>> p
ParseResult(scheme='http', netloc='foo.test', path='/test', params='param', query='', fragment='')
>>> p[3]
'param'

The returned named tuple has a populated params field, not an empty string.

@dnicolodi dnicolodi added the docs Documentation in the Doc dir label Apr 19, 2022
@domdfcoding
Contributor

The example and table changed to what they are now in gh-29816, but I cannot see any related changes in https://github.com/python/cpython/blob/main/Lib/urllib/parse.py which would make urlparse always return an empty string for params. I think the change to the table in gh-29816 is incorrect.

The example in the docs (urlparse("scheme://netloc/path;parameters?query#fragment")) shows correct output (params='') solely because "scheme" is not a scheme for which urlparse will parse the path params, per the list on line 59:

uses_params = ['', 'ftp', 'hdl', 'prospero', 'http', 'imap',
               'https', 'shttp', 'rtsp', 'rtspu', 'sip', 'sips',
               'mms', 'sftp', 'tel']

urlparse checks whether the scheme is in that list. If it isn't, params is an empty string; if it is, and the URL contains a ';', params is parsed from the URL:

cpython/Lib/urllib/parse.py

Lines 389 to 392 in d7d7e6c

if scheme in uses_params and ';' in url:
    url, params = _splitparams(url)
else:
    params = ''

I find it a little confusing that params is only parsed for known schemes, but it's probably too late now to change it without breakage. The docs would benefit from clarification of when it is and isn't parsed, with an example to demonstrate the parsing.
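To illustrate the scheme dependence described above, here is a minimal sketch that parses the two URLs already quoted in this thread. The helper name params_of is made up for this example, and uses_params is a module-level list in Lib/urllib/parse.py rather than a documented API, so treat its contents as an implementation detail that may change.

```python
from urllib.parse import urlparse, uses_params


def params_of(url):
    """Return the params component that urlparse extracts from url.

    Hypothetical helper, used only for this illustration.
    """
    return urlparse(url).params


# 'http' is listed in uses_params, so the text after ';' in the last
# path segment is split off into the params field.
known = urlparse("http://foo.test/test;param")
print("http" in uses_params, known.path, repr(known.params))

# 'scheme' is not listed, so ';parameters' stays inside path and
# params is the empty string.
unknown = urlparse("scheme://netloc/path;parameters?query#fragment")
print("scheme" in uses_params, unknown.path, repr(unknown.params))
```

The same ';' syntax therefore lands in different fields depending only on the scheme, which is the behavior the table should describe.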

@xmo-odoo

And if params should be de-emphasized, wouldn't it make sense to reorganise the documentation to promote urlsplit instead (despite its worse naming), and mark urlparse, urlunparse, and ParseResult* as deprecated?

Also possibly urlunsplit, under the assumption that most users have stopped using raw tuples, and thus could just call geturl.
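For comparison, a short sketch of what the urlsplit route looks like for the URL from the original report. It only uses urlsplit, urlparse, and the geturl method mentioned above; the variable names are chosen for this example.

```python
from urllib.parse import urlparse, urlsplit

url = "http://foo.test/test;param"

# urlsplit returns a 5-field result with no params field; the ';param'
# text stays in path regardless of the scheme.
split = urlsplit(url)
print(split.path)

# urlparse returns a 6-field result and, for this scheme, moves 'param'
# out of path into params.
parsed = urlparse(url)
print(parsed.path, parsed.params)

# Both result objects can rebuild the URL with geturl(), without calling
# urlunsplit or urlunparse on a raw tuple.
print(split.geturl())
print(parsed.geturl())
```

With urlsplit the path is treated uniformly, so code that does not care about params avoids the scheme-dependent behavior entirely.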

ambv pushed a commit that referenced this issue Oct 7, 2022
Revert params note in urllib.parse.urlparse table
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 7, 2022
…thonGH-96699)

Revert params note in urllib.parse.urlparse table
(cherry picked from commit eed8045)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
ambv pushed a commit that referenced this issue Oct 7, 2022
…H-96699) (#98052)

Revert params note in urllib.parse.urlparse table
(cherry picked from commit eed8045)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
ambv pushed a commit that referenced this issue Oct 7, 2022
…-96699) (#98054)

Revert params note in urllib.parse.urlparse table
(cherry picked from commit eed8045)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
ambv pushed a commit that referenced this issue Oct 7, 2022
…H-96699) (#98053)

Revert params note in urllib.parse.urlparse table
(cherry picked from commit eed8045)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>
@ambv ambv closed this as completed Oct 7, 2022
@ambv
Contributor

ambv commented Oct 7, 2022

Thanks! ✨ 🍰 ✨

carljm added a commit to carljm/cpython that referenced this issue Oct 8, 2022
* main: (38 commits)
  pythongh-92886: make test_ast pass with -O (assertions off) (pythonGH-98058)
  pythongh-92886: make test_coroutines pass with -O (assertions off) (pythonGH-98060)
  pythongh-57179: Add note on symlinks for os.walk (python#94799)
  pythongh-94808: Fix regex on exotic platforms (python#98036)
  pythongh-90085: Remove vestigial -t and -c timeit options (python#94941)
  pythonGH-83901: Improve Signature.bind error message for missing keyword-only params (python#95347)
  pythongh-61105: Add default param, note on using cookiejar subclass (python#95427)
  pythongh-96288: Add a sentence to `os.mkdir`'s docstring. (python#96271)
  pythongh-96073: fix backticks in NEWS entry (pythonGH-98056)
  pythongh-92886: [clinic.py] raise exception on invalid input instead of assertion (pythonGH-98051)
  pythongh-97997: Add col_offset field to tokenizer and use that for AST nodes (python#98000)
  pythonGH-88968: Reject socket that is already used as a transport (python#98010)
  pythongh-96346: Use double caching for re._compile() (python#96347)
  pythongh-91708: Revert params note in urllib.parse.urlparse table (python#96699)
  pythongh-96265: Fix some formatting in faq/design.rst (python#96924)
  pythongh-73196: Add namespace/scope clarification for inheritance section (python#92840)
  pythongh-97646: Change `.js` and `.mjs` files mimetype to conform to RFC 9239 (python#97934)
  pythongh-97923: Always run Ubuntu SSL tests with others in CI (python#97940)
  pythongh-97956: Mention `generate_global_objects.py` in `AC How-To` (python#97957)
  pythongh-96959: Update HTTP links which are redirected to HTTPS (python#98039)
  ...
mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022
pablogsal pushed a commit that referenced this issue Oct 22, 2022
…H-96699) (#98052)

Revert params note in urllib.parse.urlparse table
(cherry picked from commit eed8045)

Co-authored-by: Stanley <46876382+slateny@users.noreply.github.com>