
Commit 21f3d15

gh-135676: lexical analysis: Improve section on Numeric literals (GH-134850)
1 parent 343719d commit 21f3d15

3 files changed: +168 -55 lines changed


Doc/reference/datamodel.rst

Lines changed: 2 additions & 0 deletions
@@ -262,6 +262,8 @@ Booleans (:class:`bool`)
262262
a string, the strings ``"False"`` or ``"True"`` are returned, respectively.
263263

264264

265+
.. _datamodel-float:
266+
265267
:class:`numbers.Real` (:class:`float`)
266268
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
267269

Doc/reference/expressions.rst

Lines changed: 1 addition & 2 deletions
@@ -134,8 +134,7 @@ Literals
134134
Python supports string and bytes literals and various numeric literals:
135135

136136
.. productionlist:: python-grammar
137-
literal: `stringliteral` | `bytesliteral`
138-
: | `integer` | `floatnumber` | `imagnumber`
137+
literal: `stringliteral` | `bytesliteral` | `NUMBER`
139138

140139
Evaluation of a literal yields an object of the given type (string, bytes,
141140
integer, floating-point number, complex number) with the given value. The value

Doc/reference/lexical_analysis.rst

Lines changed: 165 additions & 53 deletions
@@ -922,11 +922,20 @@ Numeric literals
922922
floating-point literal, hexadecimal literal
923923
octal literal, binary literal, decimal literal, imaginary literal, complex literal
924924

925-
There are three types of numeric literals: integers, floating-point numbers, and
926-
imaginary numbers. There are no complex literals (complex numbers can be formed
927-
by adding a real number and an imaginary number).
925+
:data:`~token.NUMBER` tokens represent numeric literals, of which there are
926+
three types: integers, floating-point numbers, and imaginary numbers.
928927

929-
Note that numeric literals do not include a sign; a phrase like ``-1`` is
928+
.. grammar-snippet::
929+
:group: python-grammar
930+
931+
NUMBER: `integer` | `floatnumber` | `imagnumber`
932+
933+
The numeric value of a numeric literal is the same as if it were passed as a
934+
string to the :class:`int`, :class:`float` or :class:`complex` class
935+
constructor, respectively.
936+
Note that not all valid inputs for those constructors are also valid literals.
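
As a rough illustration of the two statements above, the following sketch
compares a few literals with the matching constructor calls, using only the
standard ``ast`` module. The sample values are arbitrary choices made for this
example, not part of the reference text.

```python
import ast

# A literal evaluates to the same value as the matching constructor
# applied to the same text.
int_matches = ast.literal_eval("2147483647") == int("2147483647")
float_matches = ast.literal_eval("3.14e-10") == float("3.14e-10")
complex_matches = ast.literal_eval("10j") == complex("10j")

# The reverse does not hold: float() accepts the string "inf", but in
# source code the text ``inf`` is a name, not a numeric literal.
constructor_accepts_inf = float("inf") > 1e308
inf_node = ast.parse("inf", mode="eval").body
inf_is_a_name = isinstance(inf_node, ast.Name)
```

Here ``inf_is_a_name`` is true, so writing ``inf`` in a program looks up a
variable rather than producing a float.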
937+
938+
Numeric literals do not include a sign; a phrase like ``-1`` is
930939
actually an expression composed of the unary operator '``-``' and the literal
931940
``1``.
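
One way to observe this, sketched with the standard ``ast`` and ``tokenize``
modules (the variable names are illustrative only):

```python
import ast
import io
import tokenize

# Parse "-1" as an expression and inspect the resulting tree.
node = ast.parse("-1", mode="eval").body
is_unary_minus = isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub)
operand_value = node.operand.value  # the value of the literal ``1``

# The tokenizer reports the sign and the number as separate tokens.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO("-1\n").readline)
]
sign_and_number = tokens[:2]
```

The first two entries of ``tokens`` are an ``OP`` token for ``-`` followed by a
``NUMBER`` token for ``1``.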
932941

@@ -940,38 +949,67 @@ actually an expression composed of the unary operator '``-``' and the literal
940949
.. _integers:
941950

942951
Integer literals
943-
----------------
952+
^^^^^^^^^^^^^^^^
944953

945-
Integer literals are described by the following lexical definitions:
954+
Integer literals denote whole numbers. For example::
946955

947-
.. productionlist:: python-grammar
948-
integer: `decinteger` | `bininteger` | `octinteger` | `hexinteger`
949-
decinteger: `nonzerodigit` (["_"] `digit`)* | "0"+ (["_"] "0")*
950-
bininteger: "0" ("b" | "B") (["_"] `bindigit`)+
951-
octinteger: "0" ("o" | "O") (["_"] `octdigit`)+
952-
hexinteger: "0" ("x" | "X") (["_"] `hexdigit`)+
953-
nonzerodigit: "1"..."9"
954-
digit: "0"..."9"
955-
bindigit: "0" | "1"
956-
octdigit: "0"..."7"
957-
hexdigit: `digit` | "a"..."f" | "A"..."F"
956+
7
957+
3
958+
2147483647
958959

959960
There is no limit for the length of integer literals apart from what can be
960-
stored in available memory.
961+
stored in available memory::
962+
963+
7922816251426433759354395033679228162514264337593543950336
964+
965+
Underscores can be used to group digits for enhanced readability,
966+
and are ignored for determining the numeric value of the literal.
967+
For example, the following literals are equivalent::
968+
969+
100_000_000_000
970+
100000000000
971+
1_00_00_00_00_000
972+
973+
Underscores can only occur between digits.
974+
For example, ``_123``, ``321_``, and ``123__321`` are *not* valid literals.
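
The underscore rules can be checked with a small sketch. The helper
``is_numeric_literal`` below is hypothetical (it is not a standard-library
function) and is only a rough test: it asks the parser whether the text is a
single numeric constant.

```python
import ast

def is_numeric_literal(text):
    """Roughly check whether ``text`` parses as one numeric literal.

    Hypothetical helper written for this example only.
    """
    try:
        node = ast.parse(text, mode="eval").body
    except SyntaxError:
        return False
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float, complex))
        and not isinstance(node.value, bool)
    )

# All three spellings denote the same integer.
same_value = 100_000_000_000 == 100000000000 == 1_00_00_00_00_000

# ``_123`` does parse, but as a name; the last two are syntax errors.
results = {
    text: is_numeric_literal(text)
    for text in ("1_000", "_123", "321_", "123__321")
}
```

Note that ``_123`` is rejected by the helper for a different reason than the
others: it is a valid identifier, so it never reaches the numeric rules.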
961975

962-
Underscores are ignored for determining the numeric value of the literal. They
963-
can be used to group digits for enhanced readability. One underscore can occur
964-
between digits, and after base specifiers like ``0x``.
976+
Integers can be specified in binary (base 2), octal (base 8), or hexadecimal
977+
(base 16) using the prefixes ``0b``, ``0o`` and ``0x``, respectively.
978+
Hexadecimal digits 10 through 15 are represented by letters ``A``-``F``,
979+
case-insensitive. For example::
965980

966-
Note that leading zeros in a non-zero decimal number are not allowed. This is
967-
for disambiguation with C-style octal literals, which Python used before version
968-
3.0.
981+
0b100110111
982+
0b_1110_0101
983+
0o177
984+
0o377
985+
0xdeadbeef
986+
0xDead_Beef
969987

970-
Some examples of integer literals::
988+
An underscore can follow the base specifier.
989+
For example, ``0x_1f`` is a valid literal, but ``0_x1f`` and ``0x__1f`` are
990+
not.
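
The values of the prefixed literals above can be confirmed directly. In this
sketch, ``compiles`` is a hypothetical helper that reports whether a piece of
text is accepted as an expression.

```python
# Decimal values of the example literals.
binary_value = 0b100110111            # 311
grouped_binary_value = 0b_1110_0101   # 229
octal_values = (0o177, 0o377)         # (127, 255)
hex_spellings_agree = 0xdeadbeef == 0xDead_Beef == 3735928559

def compiles(text):
    """Return True if ``text`` compiles as an expression (hypothetical helper)."""
    try:
        compile(text, "<example>", "eval")
    except SyntaxError:
        return False
    return True

# One underscore may follow the base specifier, but it may not split the
# specifier itself, and it may not be doubled.
underscore_checks = {text: compiles(text) for text in ("0x_1f", "0_x1f", "0x__1f")}
```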
971991

972-
7 2147483647 0o177 0b100110111
973-
3 79228162514264337593543950336 0o377 0xdeadbeef
974-
100_000_000_000 0b_1110_0101
992+
Leading zeros in a non-zero decimal number are not allowed.
993+
For example, ``0123`` is not a valid literal.
994+
This is for disambiguation with C-style octal literals, which Python used
995+
before version 3.0.
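
A short sketch of the leading-zero rule follows. The helper name ``compiles``
is an assumption made for this example; the octal comparison shows what the
same digits mean with an explicit ``0o`` prefix.

```python
def compiles(text):
    """Return True if ``text`` compiles as an expression (hypothetical helper)."""
    try:
        compile(text, "<example>", "eval")
    except SyntaxError:
        return False
    return True

leading_zero_accepted = compiles("0123")   # rejected: non-zero number with a leading zero
all_zeros_accepted = compiles("000")       # accepted: the number itself is zero

# With an explicit prefix the digits are read as octal.
octal_value = 0o123                        # 1*64 + 2*8 + 3

# The int() constructor is more lenient than the literal syntax.
constructor_value = int("0123")
```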
996+
997+
Formally, integer literals are described by the following lexical definitions:
998+
999+
.. grammar-snippet::
1000+
:group: python-grammar
1001+
1002+
integer: `decinteger` | `bininteger` | `octinteger` | `hexinteger` | `zerointeger`
1003+
decinteger: `nonzerodigit` (["_"] `digit`)*
1004+
bininteger: "0" ("b" | "B") (["_"] `bindigit`)+
1005+
octinteger: "0" ("o" | "O") (["_"] `octdigit`)+
1006+
hexinteger: "0" ("x" | "X") (["_"] `hexdigit`)+
1007+
zerointeger: "0"+ (["_"] "0")*
1008+
nonzerodigit: "1"..."9"
1009+
digit: "0"..."9"
1010+
bindigit: "0" | "1"
1011+
octdigit: "0"..."7"
1012+
hexdigit: `digit` | "a"..."f" | "A"..."F"
9751013

9761014
.. versionchanged:: 3.6
9771015
Underscores are now allowed for grouping purposes in literals.
@@ -984,26 +1022,58 @@ Some examples of integer literals::
9841022
.. _floating:
9851023

9861024
Floating-point literals
987-
-----------------------
1025+
^^^^^^^^^^^^^^^^^^^^^^^
9881026

989-
Floating-point literals are described by the following lexical definitions:
1027+
Floating-point (float) literals, such as ``3.14`` or ``1.5``, denote
1028+
:ref:`approximations of real numbers <datamodel-float>`.
9901029

991-
.. productionlist:: python-grammar
992-
floatnumber: `pointfloat` | `exponentfloat`
993-
pointfloat: [`digitpart`] `fraction` | `digitpart` "."
994-
exponentfloat: (`digitpart` | `pointfloat`) `exponent`
995-
digitpart: `digit` (["_"] `digit`)*
996-
fraction: "." `digitpart`
997-
exponent: ("e" | "E") ["+" | "-"] `digitpart`
1030+
They consist of *integer* and *fraction* parts, each composed of decimal digits.
1031+
The parts are separated by a decimal point, ``.``::
1032+
1033+
2.71828
1034+
4.0
1035+
1036+
Unlike in integer literals, leading zeros are allowed in the numeric parts.
1037+
For example, ``077.010`` is legal, and denotes the same number as ``77.01``.
1038+
1039+
As in integer literals, single underscores may occur between digits to help
1040+
readability::
1041+
1042+
96_485.332_123
1043+
3.14_15_93
9981044

999-
Note that the integer and exponent parts are always interpreted using radix 10.
1000-
For example, ``077e010`` is legal, and denotes the same number as ``77e10``. The
1001-
allowed range of floating-point literals is implementation-dependent. As in
1002-
integer literals, underscores are supported for digit grouping.
1045+
Either of these parts, but not both, can be empty. For example::
10031046

1004-
Some examples of floating-point literals::
1047+
10. # (equivalent to 10.0)
1048+
.001 # (equivalent to 0.001)
10051049

1006-
3.14 10. .001 1e100 3.14e-10 0e0 3.14_15_93
1050+
Optionally, the integer and fraction may be followed by an *exponent*:
1051+
the letter ``e`` or ``E``, followed by an optional sign, ``+`` or ``-``,
1052+
and a number in the same format as the integer and fraction parts.
1053+
The ``e`` or ``E`` represents "times ten raised to the power of"::
1054+
1055+
1.0e3 # (represents 1.0×10³, or 1000.0)
1056+
1.166e-5 # (represents 1.166×10⁻⁵, or 0.00001166)
1057+
6.02214076e+23 # (represents 6.02214076×10²³, or 602214076000000000000000.)
1058+
1059+
In floats with only integer and exponent parts, the decimal point may be
1060+
omitted::
1061+
1062+
1e3 # (equivalent to 1.e3 and 1.0e3)
1063+
0e0 # (equivalent to 0.)
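
The exponent forms above can be compared in the same way. The sketch below
assumes nothing beyond built-in floats; the Avogadro-style value is compared
against an integer converted with ``float()`` to avoid writing out a long run
of zeros.

```python
# "e" means "times ten raised to the power of".
thousand = 1.0e3
small = 1.166e-5
large_matches = 6.02214076e+23 == float(602214076 * 10**15)

# The decimal point may be omitted when an exponent is present, and the
# letter may be upper or lower case.
spellings_agree = 1e3 == 1.e3 == 1.0e3 == 1E3

# The result is a float even when no decimal point is written.
zero = 0e0
result_types = (type(1e3), type(zero))
```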
1064+
1065+
Formally, floating-point literals are described by the following
1066+
lexical definitions:
1067+
1068+
.. grammar-snippet::
1069+
:group: python-grammar
1070+
1071+
floatnumber:
1072+
| `digitpart` "." [`digitpart`] [`exponent`]
1073+
| "." `digitpart` [`exponent`]
1074+
| `digitpart` `exponent`
1075+
digitpart: `digit` (["_"] `digit`)*
1076+
exponent: ("e" | "E") ["+" | "-"] `digitpart`
10071077

10081078
.. versionchanged:: 3.6
10091079
Underscores are now allowed for grouping purposes in literals.
@@ -1014,20 +1084,62 @@ Some examples of floating-point literals::
10141084
.. _imaginary:
10151085

10161086
Imaginary literals
1017-
------------------
1087+
^^^^^^^^^^^^^^^^^^
10181088

1019-
Imaginary literals are described by the following lexical definitions:
1089+
Python has :ref:`complex number <typesnumeric>` objects, but no complex
1090+
literals.
1091+
Instead, *imaginary literals* denote complex numbers with a zero
1092+
real part.
10201093

1021-
.. productionlist:: python-grammar
1022-
imagnumber: (`floatnumber` | `digitpart`) ("j" | "J")
1094+
For example, in math, the complex number 3+4.2\ *i* is written
1095+
as the real number 3 added to the imaginary number 4.2\ *i*.
1096+
Python uses a similar syntax, except the imaginary unit is written as ``j``
1097+
rather than *i*::
1098+
1099+
3+4.2j
1100+
1101+
This is an expression composed
1102+
of the :ref:`integer literal <integers>` ``3``,
1103+
the :ref:`operator <operators>` '``+``',
1104+
and the :ref:`imaginary literal <imaginary>` ``4.2j``.
1105+
Since these are three separate tokens, whitespace is allowed between them::
10231106

1024-
An imaginary literal yields a complex number with a real part of 0.0. Complex
1025-
numbers are represented as a pair of floating-point numbers and have the same
1026-
restrictions on their range. To create a complex number with a nonzero real
1027-
part, add a floating-point number to it, e.g., ``(3+4j)``. Some examples of
1028-
imaginary literals::
1107+
3 + 4.2j
10291108

1030-
3.14j 10.j 10j .001j 1e100j 3.14e-10j 3.14_15_93j
1109+
No whitespace is allowed *within* each token.
1110+
In particular, the ``j`` suffix may not be separated from the number
1111+
before it.
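
The token boundaries can be observed with the standard ``tokenize`` module.
In this sketch, ``token_strings`` is a hypothetical helper that keeps only the
number and operator tokens.

```python
import io
import tokenize

def token_strings(source):
    """Return the NUMBER and OP token strings in ``source`` (hypothetical helper)."""
    readline = io.StringIO(source + "\n").readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.NUMBER, tokenize.OP)
    ]

# Whitespace between tokens does not change the token sequence or the value.
compact_tokens = token_strings("3+4.2j")
spaced_tokens = token_strings("3 + 4.2j")
values_agree = 3+4.2j == 3 + 4.2j == complex(3, 4.2)

# Whitespace inside the imaginary literal splits it into a float and a name,
# which is not a valid expression.
try:
    compile("4.2 j", "<example>", "eval")
    split_suffix_accepted = True
except SyntaxError:
    split_suffix_accepted = False
```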
1112+
1113+
The number before the ``j`` has the same syntax as a floating-point literal.
1114+
Thus, the following are valid imaginary literals::
1115+
1116+
4.2j
1117+
3.14j
1118+
10.j
1119+
.001j
1120+
1e100j
1121+
3.14e-10j
1122+
3.14_15_93j
1123+
1124+
Unlike in a floating-point literal, the decimal point can be omitted if the
1125+
imaginary number only has an integer part.
1126+
The number is still evaluated as a floating-point number, not an integer::
1127+
1128+
10j
1129+
0j
1130+
1000000000000000000000000j # equivalent to 1e+24j
1131+
1132+
The ``j`` suffix is case-insensitive.
1133+
That means you can use ``J`` instead::
1134+
1135+
3.14J # equivalent to 3.14j
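
A few properties of imaginary literals, sketched with built-in operations only.
The long literal is built with ``eval`` from a string so that the number of
zeros is explicit.

```python
# An imaginary literal is a complex number with a zero real part.
ten_j = 10j
real_part = ten_j.real
imag_part = ten_j.imag

# Even without a decimal point, the number before ``j`` is stored as a float.
imag_part_type = type(imag_part)

# A long run of digits is therefore rounded like a float would be.
long_literal = eval("1" + "0" * 24 + "j")
long_matches = long_literal == 1e+24j

# The suffix is case-insensitive.
suffix_case_agrees = 3.14J == 3.14j
```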
1136+
1137+
Formally, imaginary literals are described by the following lexical definition:
1138+
1139+
.. grammar-snippet::
1140+
:group: python-grammar
1141+
1142+
imagnumber: (`floatnumber` | `digitpart`) ("j" | "J")
10311143

10321144

10331145
.. _operators:
