Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime library doesn't support valid ISO-8601 alternative for midnight #102450

Closed
TizzySaurus opened this issue Mar 5, 2023 · 18 comments · Fixed by #105856
Closed

datetime library doesn't support valid ISO-8601 alternative for midnight #102450

TizzySaurus opened this issue Mar 5, 2023 · 18 comments · Fixed by #105856
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@TizzySaurus
Copy link
Contributor

TizzySaurus commented Mar 5, 2023

Bug report

According to ISO-8601, a time of 24:00 on a given date is a valid alternative to 00:00 of the following date, however Python does not support this, raising the following error when attempted: ValueError: hour must be in 0..23.

This bug can be seen from multiple scenarios, specifically anything that internally calls the _check_time_fields function, such as the following:

>>> import datetime
>>> datetime.datetime(2022, 1, 2, 24, 0, 0)  # should be equivalent to 2022-01-03 00:00:00
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    datetime.datetime(2022, 3, 4, 24, 0, 0)
ValueError: hour must be in 0..23

The fix for this is relatively simple: have an explicit check within _check_time_fields for the scenario where hour == 24 and minute == 0 and second == 0 and microsecond == 0, or more concisely hour == 24 and not any((minute, second, microsecond)), and in this scenario increase the day by one (adjusting the week/month/year as necessary) and set the hour to 0.

I imagine the check_time_args C function would also have to be updated.

Your environment

  • CPython versions tested on: 3.9.12, 3.10.7, 3.11.2 (presumably applies to all)
  • Operating system and architecture: MacOS Ventura arm64 (presumably applies to all)

Linked PRs

@TizzySaurus TizzySaurus added the type-bug An unexpected behavior, bug, or error label Mar 5, 2023
@terryjreedy
Copy link
Member

From the referenced wikipedia page:

As of ISO 8601-1:2019/Amd 1:2022, midnight may be referred to as "00:00:00", corresponding to the instant at the beginning of a calendar day; or "24:00:00", corresponding to the instant at the end of a calendar day.[1] ISO 8601-1:2019 as originally published removed "24:00" as a representation for the end of day although it was permitted in earlier versions of the standard.

Looking at the datetime commit log, I am pretty sure that 24:00:00 support was not removed after the 2019 version but was never there. It is obviously a nuisance.

While the doc for datetime.datetime explicitly disclaims such support ("0 <= hour < 24"), the doc for datetime.datetime.fromisoformat does not. It says "Return a datetime corresponding to a date_string in any valid ISO 8601 format, with the following exceptions:" and lists 4 exceptions. To me, the bug here is the omission of the pretty clearly intended '240000' exception. The fix would be to add the 5th exception to the doc.

The request to add support would then be a feature addition. I have no opinion, but think it less likely to be accepted than rejected.

@mdickinson
Copy link
Member

See also #54636

@pganssle
Copy link
Member

I think this actually is a bug. The original intention was that fromisoformat was supposed to support parsing any valid ISO-8601 string that can be represented by a datetime.datetime object (plus certain expansions so that all possible outputs of datetime.isoformat() are covered). We decided not to support a few things like fractional minutes and fractional hours because we believe that these formats are so rarely used that if you encounter them in the wild, it's more likely that you have encountered a bug than something real that you want to parse.

Support for 24 as an alternative to midnight doesn't really fit into that category, and can be useful in certain situations, so I'd be happy to see a PR for this.

@TizzySaurus
Copy link
Contributor Author

@pganssle

I think this actually is a bug. ...I'd be happy to see a PR for this.

I agree with you that this is something I feel is a bug and I'd like to see added, even with the points mentioned by @terryjreedy. To that end, I'd be interested in writing the PR for this.

Code wise, is it just the _check_time_fields function and the check_time_args C function that would need to be updated?

I suppose documentation would also need to be updated, but I'm not sure how to do this.

It seems the queries in #54636 were addressed, but just to confirm, I'd expect datetime(2023, 1, 1, 24) to be parsed equal to datetime(2023, 1, 2, 0) (and thus .day would be 2, .hour would be 0, etc. -- the .hour property of a created datetime object will still never be 24), and inclusion of smaller units not equal to zero (such as datetime(2023, 1, 1, 24, 0, 0, 567890)) is invalid and will error.

@bluetech
Copy link

What about datetime.time? There the situation is different - time(0, 0, 0) should not be equal to time(24, 0, 0). And I doubt datetime.time can be extended at this point to support time(24, 0, 0) as a distinct value. So datetime.time.fromisoformat would have to keep rejecting 24:00:00, which would be inconsistent with datetime.datetime.fromisoformat if this proposal is accepted.

@pganssle
Copy link
Member

Code wise, is it just the _check_time_fields function and the check_time_args C function that would need to be updated?

No, we shouldn't make it so that datetime.datetime or datetime.time accept "24" as a valid value, the adjustment should happen in the parsing stage. This is how it's done in dateutil.parser.isoparse.

I think you want to update datetime_fromisoformat in the C code and fromisoformat in the Python code. The equivalent functions in datetime.time just need to be updated so that when you get 24:00 it is rendered as time(0, 0, 0).

You will also want to add test cases to datetimetester.py. In addition to test cases like "2023-05-14T24:00" rendering to datetime(2023, 5, 15), you also want to make sure 2023-05-14T24:01 still raises an exception.

@bluetech

What about datetime.time? There the situation is different - time(0, 0, 0) should not be equal to time(24, 0, 0). And I doubt datetime.time can be extended at this point to support time(24, 0, 0) as a distinct value. So datetime.time.fromisoformat would have to keep rejecting 24:00:00, which would be inconsistent with datetime.datetime.fromisoformat if this proposal is accepted.

This proposal should have no effect on the datetime and time primary constructors. If ISO-8601 says that T24:00 is midnight (and I think it does), then datetime.time.fromisoformat("24:00") should return datetime.time(0, 0, 0).

@bluetech
Copy link

According to the Wikipedia quote above, time 00:00 is beginning midnight, and 24:00 is the next day's midnight.

Just to explain where I ran into it, I work on a system which defines "parts of day" like this (for example):

  • Night 00:00 - 05:00
  • Morning 05:00 - 11:00
  • Noon: 11:00 - 17:00
  • Evening 17:00 - 24:00

This is in a PostgreSQL database using the time type. The ranges are [inclusive, exclusive). If 24:00 becomes 00:00, the Evening range becomes time(17, 0), time(0, 0) which doesn't work.

So the two midnights look like different things to me, I don't think 24:00 should parse to time(0).

@pganssle
Copy link
Member

This is in a PostgreSQL database using the time type. The ranges are [inclusive, exclusive). If 24:00 becomes 00:00, the Evening range becomes time(17, 0), time(0, 0) which doesn't work.

So the two midnights look like different things to me, I don't think 24:00 should parse to time(0).

I mean, the alternative is that it doesn't parse at all and you simply cannot represent the last interval, so it doesn't work either way. At least if the parse works, you have the option to find a different way to represent that interval. time represents a time of day, and there's only one midnight per day. My copy of ISO-8601 says that 24:00 represents the end of the day and there's no suggestion that this is only valid as part of a datetime, so I think we're supposed to parse it as time(0), even though it's true that this is necessarily a lossy conversion.

@TizzySaurus
Copy link
Contributor Author

No, we shouldn't make it so that datetime.datetime or datetime.time accept "24" as a valid value, the adjustment should happen in the parsing stage.

This proposal should have no effect on the datetime and time primary constructors.

Does this mean you're saying that datetime.datetime.fromisoformat("2023-01-02 24:00:00") should be valid, but datetime.datetime(2023, 1, 2, 24, 0, 0) should be invalid? That wasn't strictly what I was going for with this issue, but I guess it is strictly more accurate (since this is an iso thing). Although it does seem odd to have it valid in fromisoformat and the equivalent constructor be invalid.

I was originally envisioning something along the lines of

def the_datetime_primary_constructor(year, month, day, hour, second, microsecond):
    if (hour == 24):
        if (second == microsecond == 0):
            hour = 0
            day += 1  # with the correct handling for "rollover", i.e. adjusting month/year as necessary.
        else:
            raise ValueError("For hour to be 24, second and microsecond must be 0.")
    ...

This means that hours=24 would be valid in the constructor, but .hours of the returned object would still return 0

@pganssle
Copy link
Member

Does this mean you're saying that datetime.datetime.fromisoformat("2023-01-02 24:00:00") should be valid, but datetime.datetime(2023, 1, 2, 24, 0, 0) should be invalid? That wasn't strictly what I was going for with this issue, but I guess it is strictly more accurate (since this is an iso thing). Although it does seem odd to have it valid in fromisoformat and the equivalent constructor be invalid.

Yes, I'm saying precisely this. The main constructor and underlying data model does not have anything to do with ISO 8601. Consider that 2023-W01-1 is also a valid ISO 8601 representation for datetime(2023, 1, 2). If there's a case to be made for accepting "24" for the "hour" component of datetime.datetime, it should be made independent of whether or not there's a corresponding ISO 8601 format.

Also, I will note my bias here: ISO 8601 is an annoying and awful spec. It's proprietary and not even freely available. It's ridiculously convoluted and so expansive that no parser I know of implements the whole spec. Despite the byzantine complexity, it is also vague about things like what the date-time separator can be or how many digits can come after the fraction separator and it has no provision for things like fractional offsets. RFC3339 tries to tame the complexity but is only acceptable for representing absolute times. No one understands the ISO week calendar, either. Everyone thinks they like ISO 8601 because YYYY-MM-DD is a good way to represent dates, and people have converged on that, but if I could I would burn ISO 8601 to the ground and start over (sure we'd probably also get some weird camel of a format designed by committee, but at least if we did it today it would be in the public domain).

Needless to say, the fact that ISO 8601 does some weird thing is not a good argument for importing that particular abstraction. It's probably a good argument for including it in fromisoformat, but that's the extent of it.

@TizzySaurus
Copy link
Contributor Author

... It's probably a good argument for including it in fromisoformat, but that's the extent of it.

Yeah, fair enough. I'll go ahead and work on a PR just for the change to fromisoformat (and relevant tests etc.) then 👍

@TizzySaurus
Copy link
Contributor Author

TizzySaurus commented May 18, 2023

I'm having some issues with "building" changes that I'm making to the python files. I found the contributing guide section on building, and followed that, but my changes to the Lib/_pydatetime.py file aren't being reflected in the outputted python.exe (I'm on MacOS).

Steps I'm doing:

  • Add print("Hello from datetime.datetime.fromisoformat()") above L1855
  • Run ./configure --with-pydebug && make -s -j && ./python.exe
  • Enter the following code into the repl:
>>> import datetime
>>> datetime.datetime.fromisoformat("2022-01-01 00:00:00")

The above steps don't appear to result in the print statement being executed, as I'd expect. I've also tried return cls(2000, 01, 01, 0, 0, 0, 0) to see if outputs are somehow blocked, but that doesn't work either.

According to the note in the dev guide, I shouldn't need to build for changes to python code, but figured I'd try that when it didn't seem to work.

Can I please get some assistance?

@ericvsmith
Copy link
Member

This isn't the best place to get help with this, probably something on discuss.python.org would be better.

But as a hint, the code is probably in Modules/_datetimemodule.c.

@pganssle
Copy link
Member

But as a hint, the code is probably in Modules/_datetimemodule.c.

To be clear, the code is in both, datetime is a PEP 399 module.

@TizzySaurus Feel free to make a draft PR and ping me on it.

@ericvsmith
Copy link
Member

Yes, @pganssle is correct. Thanks for the PEP reference, I couldn't find it quickly.

@TizzySaurus
Copy link
Contributor Author

@pganssle have now got a draft PR: #105856. Would appreciate if you could take a look and confirm I'm heading in the right direction.

@iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 26, 2023
pganssle pushed a commit that referenced this issue Sep 25, 2024
… calls. (#105856)

* Add NEWS.d entry

* Allow ISO-8601 24:00 alternative to midnight on datetime.time.fromisoformat()

* Allow ISO-8601 24:00 alternative to midnight on datetime.datetime.fromisoformat()

* Add NEWS.d entry

* Improve error message when hour is 24 and minute/second/microsecond is not 0

* Add tests for 24:00 fromisoformat

* Remove duplicate call to days_in_month() by storing in variable

* Add Python implementation

* Fix Lint

* Fix differing error msg in datetime.fromisoformat implementations when 24hrs has non-zero time component(s)

* Fix using time components inside tzinfo in Python implementation

* Don't parse tzinfo in C implementation when invalid iso midnight

* Remove duplicated variable in datetime test assertion line

* Add self to acknowledgements

* Remove duplicate NEWS entry

* Linting

* Add missing test case for when wrapping the year makes it invalid (too large)
emilyemorehouse added a commit to lysnikolaou/cpython that referenced this issue Sep 26, 2024
* main: (69 commits)
  Add "annotate" SET_FUNCTION_ATTRIBUTE bit to dis. (python#124566)
  pythongh-124412: Add helpers for converting annotations to source format (python#124551)
  pythongh-119180: Disallow instantiation of ConstEvaluator objects (python#124561)
  For-else deserves its own section in the tutorial (python#123946)
  Add 3.13 as a version option to the crash issue template (python#124560)
  pythongh-123242: Note that type.__annotations__ may not exist (python#124557)
  pythongh-119180: Make FORWARDREF format look at __annotations__ first (python#124479)
  pythonGH-58058: Add quick reference for `ArgumentParser` to argparse docs (pythongh-124227)
  pythongh-41431: Add `datetime.time.strptime()` and `datetime.date.strptime()` (python#120752)
  pythongh-102450: Add ISO-8601 alternative for midnight to `fromisoformat()` calls. (python#105856)
  pythongh-124370: Add "howto" for free-threaded Python (python#124371)
  pythongh-121277: Allow `.. versionadded:: next` in docs (pythonGH-121278)
  pythongh-119400:  make_ssl_certs: update reference test data automatically, pass in expiration dates as parameters python#119400  (pythonGH-119401)
  pythongh-119180: Avoid going through AST and eval() when possible in annotationlib (python#124337)
  pythongh-124448: Update Windows builds to use Tcl/Tk 8.6.15 (pythonGH-124449)
  pythongh-123884 Tee of tee was not producing n independent iterators (pythongh-124490)
  pythongh-124378: Update test_ttk for Tcl/Tk 8.6.15 (pythonGH-124542)
  pythongh-124513: Check args in framelocalsproxy_new() (python#124515)
  pythongh-101100: Add a table of class attributes to the "Custom classes" section of the data model docs (python#124480)
  Doc: Use ``major.minor`` for documentation distribution archive filenames (python#124489)
  ...
@TizzySaurus
Copy link
Contributor Author

@pganssle Now that #105856 has been merged, when can we expect it to be in a Python release? I suppose it'd be in either 3.13 or 3.14?

@ericvsmith
Copy link
Member

It will be in 3.14.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

7 participants