
Is it OK to strip all trailing zeros and dot from numbers? #13

@aportagain


Currently, we strip trailing zeros and the dot:

encoder.FLOAT_REPR = lambda o: format(o, '.4f').rstrip('0').rstrip('.')

so, for example, a Python float usually printed as "1.0" becomes "1" in CF-JSON. I believe this breaks neither our own CF-JSON spec nor the JSON spec in general. JSON has only a single "number" data type (because, in contrast to many other languages, JavaScript provides only a single "number" data type), so it does not differentiate between, for example, integers and floating-point numbers. So in JSON, "1" and "1.0" are the same number... but many JSON parsers try to guess the most appropriate data type for whichever language they're running in, so "1" and "1.0" in a .json file (or an API response, same thing) will often end up as two slightly different things in the software reading it, for example an integer and a float.
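To make this concrete, here is a minimal sketch using Python's standard `json` module. The helper name `float_repr` is just a stand-in for the lambda above, and the sample values are made up for illustration:

```python
import json

# Same formatting rule as the FLOAT_REPR lambda above:
# fixed 4 decimals, then strip trailing zeros, then a trailing dot.
def float_repr(o):
    return format(o, '.4f').rstrip('0').rstrip('.')

print(float_repr(1.0))    # 1
print(float_repr(2.5))    # 2.5
print(float_repr(100.0))  # 100 (zeros left of the dot are not touched)

# Python's json parser picks the type from the textual form of the number.
print(type(json.loads("1")))    # <class 'int'>
print(type(json.loads("1.0")))  # <class 'float'>
```

So a value that was a float on the producer side can come back as an int on a Python consumer, even though the JSON itself is perfectly valid.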

The current CF-JSON spec says two things that are supposed to alleviate this potential ambiguity, but sadly they don't fully resolve it yet: "The number of significant figures used to express numeric data values should be sensible for data efficiency and readability. However the decimal precision of the data itself does not imply its underlying accuracy or precision." and "A type field representing the original data type can also be included". If we were to make the "type" field mandatory rather than optional, we would get rid of this ambiguity, but increase development effort for both producers and consumers. I think it's an option to consider in the next revision of the CF-JSON spec... and part of a more general trade-off: keeping the spec lightweight, and therefore simple and fast to implement and use, versus making it more detailed, explicit, and unambiguous, but also harder to implement and use.
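As a rough sketch of what a mandatory "type" field would buy a consumer: the variable layout and the type names ("float", "integer") below are assumptions made for illustration, not quoted from the CF-JSON spec, and `CASTS` is a hypothetical helper.

```python
import json

# Hypothetical variable: a "type" field next to the data (layout assumed).
variable = json.loads('{"type": "float", "data": [1, 2.5, 3]}')

# Assumed mapping from type names to Python types.
CASTS = {"float": float, "integer": int}

# The consumer casts explicitly instead of trusting the parser's guess.
cast = CASTS[variable["type"]]
values = [cast(v) for v in variable["data"]]
print(values)  # [1.0, 2.5, 3.0]
```

This also shows the extra effort mentioned above: every consumer needs a lookup like `CASTS`, and every producer has to fill in the field correctly.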

Alternatively, we could make life easier for parsers by preserving the first zero after the decimal point (so "1.0" stays "1.0"), at the cost of a slightly larger payload (as usual, I think this is often negligible compared to the size of the datetime strings).
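One possible way to do that, as a sketch only: `float_repr_keep_zero` is a hypothetical name, and it assumes we keep the same 4-decimal formatting as the current lambda.

```python
import json

# Strip trailing zeros, but put one zero back if only the dot is left,
# so whole-number floats keep a fractional part.
def float_repr_keep_zero(o):
    s = format(o, '.4f').rstrip('0')
    return s + '0' if s.endswith('.') else s

print(float_repr_keep_zero(1.0))    # 1.0
print(float_repr_keep_zero(2.5))    # 2.5
print(float_repr_keep_zero(100.0))  # 100.0

# A type-guessing parser now sees a float again.
print(type(json.loads(float_repr_keep_zero(1.0))))  # <class 'float'>
```

The cost is two extra characters per whole-number value compared to the current output.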
