While chasing a Unicode-related bug, I realized that our stored JSON (on GitHub) has ugly escaped Unicode characters, e.g. in this study and this tree collection.
These Unicode characters are handled gracefully in our indexing and web apps, but the escape sequences aren't strictly needed, since we store all JSON as UTF-8. Meanwhile, they're hideous and make it hard to read and search the stored files on GitHub.
- Is this something we want or need to fix?
- Would this fix apply to all document types (studies, tree collections, taxonomic amendments)?
- Are there other clients or use cases that would be broken by this change?
If we want to restore pretty Unicode for data saved in the future, it seems to all boil down to a single call to `json.dump` in peyotl that's used for all JSON docs. If we add `ensure_ascii=False` to this call as shown here, it should save Unicode characters directly (sans escape) in phylesystem.
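To illustrate the difference, here is a minimal sketch using only the standard library `json` module. The helper `dump_doc` and the sample document are hypothetical stand-ins, not peyotl's actual code or real phylesystem data, and the sketch assumes Python 3.

```python
import io
import json


def dump_doc(doc, stream, ensure_ascii=True):
    """Hypothetical stand-in for the shared JSON-writing call (not peyotl's real function)."""
    json.dump(doc, stream, ensure_ascii=ensure_ascii, sort_keys=True, indent=2)


# Made-up sample document containing a non-ASCII character.
doc = {"author": "Müller", "title": "Example study"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped_stream = io.StringIO()
dump_doc(doc, escaped_stream)
escaped_text = escaped_stream.getvalue()

# With ensure_ascii=False the characters are written as-is.
direct_stream = io.StringIO()
dump_doc(doc, direct_stream, ensure_ascii=False)
direct_text = direct_stream.getvalue()

# Both forms parse back to the same data, so JSON-compliant readers see no difference.
same_data = json.loads(escaped_text) == json.loads(direct_text)
```

Since both forms decode to identical data, any client using a compliant JSON parser should be unaffected; the risk would lie with tools that read the files as raw ASCII text. When writing to a real file rather than an in-memory stream, the file would also need to be opened with an explicit UTF-8 encoding (for example `open(path, "w", encoding="utf-8")`).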