TZ: Issue parsing glibc timezones #12

Closed
d-a-v opened this issue Nov 10, 2020 · 0 comments · Fixed by #14

d-a-v commented Nov 10, 2020

ref: esp8266/Arduino#7699

POSIX timezone strings, as defined in current common Linux distributions running glibc, are sometimes incorrectly parsed by newlib.

Their format starts with ABRVnn[ABRV[nn]][,...].
For example, GMT0BST,... is the London TZ descriptor, with two abbreviations: GMT and BST.
ABRV is a human-readable abbreviation. BST, for instance, means "British Summer Time".

Such abbreviations are not defined for every timezone around the world. The IANA tz theory document (https://data.iana.org/time-zones/theory.html) says:

If there is no common English abbreviation, use numeric offsets like -05 and +0530 that are generated by zic's %z notation.

These numeric offsets are enclosed in <...>. For example, the Sao Paulo TZ string is <-03>3, where <-03> is the abbreviation and 3 is the offset (rather than something like SAOPAULO3, which would also be valid).
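To make the two abbreviation forms concrete, here is a minimal Python sketch that splits the head of a TZ string into its abbreviations and offsets. The regular expressions and the `split_tz_head` name are illustrative assumptions, not code from newlib, glibc, or esp8266/Arduino, and the DST rule part after the comma is deliberately left unparsed.

```python
import re

# One abbreviation: either 3+ letters, or a quoted form such as <-03> or <+0530>.
ABBR = r"(?:[A-Za-z]{3,}|<[A-Za-z0-9+-]+>)"
# One offset: optional sign, hours, optional :mm and :ss.
OFFSET = r"[+-]?\d{1,2}(?::\d{2}){0,2}"
# Head of the string: ABRVnn[ABRV[nn]] followed by a comma or the end.
TZ_HEAD = re.compile(
    "^(" + ABBR + ")(" + OFFSET + ")(?:(" + ABBR + ")(" + OFFSET + ")?)?(?:,|$)"
)

def split_tz_head(tz):
    """Return (std_abbr, std_offset, dst_abbr, dst_offset); missing parts are None."""
    m = TZ_HEAD.match(tz)
    if m is None:
        raise ValueError("unrecognized TZ string: " + repr(tz))
    return m.groups()

print(split_tz_head("GMT0BST,M3.5.0/1,M10.5.0"))
print(split_tz_head("<-03>3"))
```

A parser that only accepts the alphabetic alternative of `ABBR` would reject the second input, which is the kind of failure described in this issue.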

The full path from the official definitions starts at the above repository: zic, the zoneinfo compiler, uses files defining timezones on all continents to build the /usr/share/zoneinfo/* files shipped by most Linux distributions. Those files are in turn parsed by this tool to produce a CSV file used by esp8266/arduino. Quite a large number of the resulting abbreviations are numeric.

The issue is that numeric abbreviations like <-03>3 are incorrectly parsed by newlib.

On the other hand, glibc's TZ parser seems to handle them, even though numeric abbreviations do not seem to follow the POSIX TZ definition.

Abbreviation values are unused in the esp8266/arduino time library anyway. To circumvent the parsing issue, numeric abbreviations are (about to be) converted to a POSIX-compliant random string by a script (in the PR referenced at the top of this message).

d-a-v added a commit to esp8266/Arduino that referenced this issue Nov 10, 2020
* TZ: help newlib parser

Timezones coded with numeric abbreviations <±nn>±nn<±nn>[±nn][,...] are incorrectly parsed
by newlib's TZ parser.
Replacing <±nn> occurrences with UNK allows newlib's TZ parser to correctly interpret all timezones.
Detailed explanation in earlephilhower/newlib-xtensa#12
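
The replacement described in this commit message can be sketched in a few lines of Python. This is not the script from the PR: the regular expression, the `replace_numeric_abbr` name, and the sample input are assumptions made for illustration. Only the UNK placeholder comes from the commit message.

```python
import re

# A numeric abbreviation: a sign and digits enclosed in angle brackets, e.g. <-03>.
NUMERIC_ABBR = re.compile(r"<[+-]\d+>")

def replace_numeric_abbr(tz, placeholder="UNK"):
    """Replace every <±nn> abbreviation with a purely alphabetic placeholder."""
    return NUMERIC_ABBR.sub(placeholder, tz)

print(replace_numeric_abbr("<-03>3"))
print(replace_numeric_abbr("GMT0BST,M3.5.0/1,M10.5.0"))
```

Offsets and DST rules are left untouched, so only the abbreviation text changes. Strings that already use alphabetic abbreviations pass through unchanged.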
earlephilhower added a commit that referenced this issue Nov 11, 2020
Some timezones are now encoded with "<+/-nn>" instead of a text name.
Allow those greater-/less-than in the parser.

Fixes #12
earlephilhower added a commit that referenced this issue Nov 11, 2020
Fixes #12

TZ.h can now contain timezone names of the form <[+-]nn> when there is
no commonly used timezone abbreviation for the setting.  Adjust the
tzset function to handle this case by special-casing the name parsing
when the first character is a '<'.
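
The special case described in this commit message might look roughly like the following Python sketch. It is not the actual newlib change (which is in C, inside tzset). The `parse_tz_name` helper, its error handling, and the choice to return the name without its angle brackets are all assumptions made for illustration.

```python
def parse_tz_name(tz, pos=0):
    """Return (name, next_pos) for the abbreviation starting at tz[pos]."""
    if pos < len(tz) and tz[pos] == "<":
        # Quoted form: take everything up to the matching '>'.
        end = tz.find(">", pos)
        if end == -1:
            raise ValueError("unterminated '<' in TZ name")
        # Assumption: the stored name excludes the angle brackets.
        return tz[pos + 1:end], end + 1
    # Unquoted form: a run of at least three letters.
    end = pos
    while end < len(tz) and tz[end].isalpha():
        end += 1
    if end - pos < 3:
        raise ValueError("TZ name must have at least 3 letters")
    return tz[pos:end], end

print(parse_tz_name("<-03>3"))
print(parse_tz_name("GMT0BST"))
```

Branching on the first character leaves the existing alphabetic path untouched, so strings such as GMT0BST are parsed the same way as before.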