gh-85679: Recommend encoding="utf-8" in tutorial #91778

Merged: 2 commits, May 2, 2022

@methane (Member) commented Apr 21, 2022

Fixes #85679

@methane added the docs, skip news, needs backport to 3.9, and needs backport to 3.10 labels Apr 21, 2022
@methane changed the title from bpo-85679: Use encoding="utf-8" in tutorial to gh-85679: Use encoding="utf-8" in tutorial Apr 21, 2022
If *encoding* is not specified, the default is platform dependent
(see :func:`open`).
But passing ``encoding="utf-8"`` is highly recommended because
UTF-8 is the most commonly used encoding for now.
Member

General rule: don't start a sentence with "and", "but", "so", "or", or "then", and try not to end one with "for now".

Explicitly passing ``encoding='utf-8'`` is recommended if that is what you need as it is the most common text encoding in the world and leaves no room for doubt about your code's intent.

perhaps.

@methane (Member Author), Apr 21, 2022

This is a tutorial, and readers won't know what they need.
I want to teach that UTF-8 is the first choice.

How about this?

``encoding="utf-8"`` is recommended unless you need to use other encoding
because UTF-8 is the de-facto standard nowadays.

Member

How about:
Because UTF-8 is the modern de-facto standard, ``encoding="utf-8"`` is recommended unless you know that you need to use a different encoding.
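
To illustrate the recommendation under discussion, here is a minimal sketch that writes and reads a text file with an explicit ``encoding="utf-8"``. The file name and sample text are hypothetical and chosen only for illustration; the temporary directory keeps the example self-contained.

```python
import os
import tempfile

text = "café ☕"  # sample text containing non-ASCII characters

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")  # hypothetical file name

    # Pass the encoding explicitly so the result does not depend on
    # the platform's default encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Read the text back using the same encoding.
    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()

    # Inspect the raw bytes that were stored on disk.
    with open(path, "rb") as f:
        raw = f.read()
```

With the encoding stated on both sides, the text read back should match what was written, and the bytes on disk are the UTF-8 encoding of the string regardless of platform.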

(see :func:`open`).
But passing ``encoding="utf-8"`` is highly recommended because
UTF-8 is the most commonly used encoding for now.
``'b'`` appended to the mode opens the file in :dfn:`binary mode`:
Member

Appending a ``'b'`` to the mode opens the file in :dfn:`binary mode`. Binary mode data is read and written as ``bytes`` objects without use of a codec.

Member

I'd use "encoding" instead of "codec". I don't see "codec" used anywhere else in this file.

@methane (Member Author), Apr 21, 2022

(I didn't rewrite this paragraph at all. I just reflowed it.)

How about this?

Appending a ``'b'`` to the mode opens the file in :dfn:`binary mode`.
Binary mode data is read and written as :class:`bytes` objects.
You cannot specify *encoding* when opening a file in binary mode.
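
The behaviour described in this proposed wording can be sketched as follows. The file name and byte values are hypothetical; the sketch assumes only the built-in ``open()`` and a temporary directory.

```python
import os
import tempfile

payload = b"\x00\x01caf\xc3\xa9"  # arbitrary bytes, for illustration only

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")  # hypothetical file name

    # Binary mode: data is written and read as bytes objects.
    with open(path, "wb") as f:
        f.write(payload)

    with open(path, "rb") as f:
        data = f.read()

    # Combining binary mode with an encoding is rejected by open().
    try:
        open(path, "rb", encoding="utf-8")
    except ValueError as exc:
        error_message = str(exc)
    else:
        error_message = None
```

Here ``data`` is a ``bytes`` object identical to what was written, and the last call raises ``ValueError`` because an encoding has no meaning in binary mode.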


x = json.load(f)

.. note::
   JSON files must be encoded in UTF-8. Use ``encoding="utf-8"`` when opening
   JSON file as :term:`text file` for both of reading and writing.
Member

as a :term:`text file`. Add the 'a', and the "for both of reading and writing" text can go.
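
As a minimal sketch of what the note asks for, the following writes and reads a JSON file opened as a text file with ``encoding="utf-8"``. The record contents and file name are hypothetical; ``ensure_ascii=False`` is used here only so that non-ASCII characters are actually stored in the file rather than escaped.

```python
import json
import os
import tempfile

record = {"name": "Zoë", "city": "東京"}  # hypothetical sample data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "record.json")  # hypothetical file name

    # Open the JSON file as a text file with an explicit UTF-8 encoding.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, ensure_ascii=False)

    # Use the same encoding when reading it back.
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)
```

Using the same explicit encoding for writing and reading means the round trip does not depend on the platform default.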

@@ -279,11 +279,11 @@ Reading and Writing Files
object: file

:func:`open` returns a :term:`file object`, and is most commonly used with
-two arguments: ``open(filename, mode)``.
+two or three arguments: ``open(filename, mode, encoding=None)``
Member

We like the encoding to be a keyword for readability so I'd word this similar to "two arguments, often with an encoding keyword when using a text mode" rather than including encoding in the number and mentioning two numbers. it feels more clear to me that way.

Member Author

At this point, we haven't described "text mode" yet. It is described below.

How about "two positional arguments and one keyword argument"?
Since binary files are rarer than text files, we can focus on text files in this first open() example.
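
The call shape being discussed, two positional arguments plus one keyword argument, might look like the sketch below. The path is a hypothetical placeholder; this only illustrates the calling convention and is not the wording that was merged.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "workfile")  # hypothetical path for illustration

    # Two positional arguments (filename, mode) plus the encoding keyword.
    with open(filename, "w", encoding="utf-8") as f:
        f.write("first line\n")
        used_encoding = f.encoding  # encoding the file object was opened with

    # mode defaults to "r", so reading needs only the filename and the keyword.
    with open(filename, encoding="utf-8") as f:
        first_line = f.readline()
```

Keeping *encoding* as a keyword makes the intent visible at the call site, which is the readability point raised in the review.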

@bedevere-bot

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@methane (Member Author) commented Apr 25, 2022

I have made the requested changes; please review again

@bedevere-bot

Thanks for making the requested changes!

@gpshead: please review the changes made to this pull request.

@bedevere-bot requested a review from gpshead April 25, 2022 09:13
@methane changed the title from gh-85679: Use encoding="utf-8" in tutorial to gh-85679: Recommend encoding="utf-8" in tutorial May 2, 2022
@methane merged commit 614420d into python:main May 2, 2022
@methane deleted the tutorial-utf8 branch May 2, 2022 08:25
@miss-islington (Contributor)

Thanks @methane for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9, 3.10.
🐍🍒⛏🤖

@bedevere-bot

GH-92133 is a backport of this pull request to the 3.10 branch.

@bedevere-bot

GH-92134 is a backport of this pull request to the 3.9 branch.

@bedevere-bot removed the needs backport to 3.9 label May 2, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 2, 2022

(cherry picked from commit 614420d)

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
miss-islington added a commit that referenced this pull request May 2, 2022
(cherry picked from commit 614420d)

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
miss-islington added a commit that referenced this pull request May 2, 2022
(cherry picked from commit 614420d)

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
hello-adam pushed a commit to hello-adam/cpython that referenced this pull request Jun 2, 2022

(cherry picked from commit 614420d)

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Labels: docs, skip news
Linked issue: Use utf-8 in "Reading and Writing Files" tutorial.
5 participants