Support binary strings, preserve UTF-8 and UTF-16 errors #2314
Open
Maxdamantus wants to merge 8 commits into jqlang:master from Maxdamantus:210520-wtf8b
Commits on Jul 21, 2023
Commit 067e682
Commits on Jul 22, 2023
Binary strings: preserve UTF-8 and UTF-16 errors (6aff473)
The internal string representation is changed from UTF-8 with replacement characters to a modified form of "WTF-8" that is able to distinctly encode UTF-8 errors and UTF-16 errors. This handles UTF-8 errors in raw string inputs and handles UTF-8 and UTF-16 errors in JSON input. UTF-16 errors (using "\uXXXX") and UTF-8 errors (using the original raw bytes) are maintained when emitting JSON. When emitting raw strings, UTF-8 errors are maintained and UTF-16 errors are converted into replacement characters.
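To make the distinction between the two kinds of error concrete, here is a minimal Python sketch. It is not the representation used by this pull request: it relies on Python's built-in `surrogateescape` and `surrogatepass` error handlers as stand-ins, and all variable names are illustrative.

```python
import json

raw = b"ab\xffcd"  # 0xFF never occurs in well-formed UTF-8

# Lossy decoding: the invalid byte becomes U+FFFD and cannot be recovered.
lossy = raw.decode("utf-8", errors="replace")

# Preserving decoding (Python's stand-in): the invalid byte is kept in a
# recoverable form, so re-encoding yields the original bytes.
kept = raw.decode("utf-8", errors="surrogateescape")
restored = kept.encode("utf-8", errors="surrogateescape")

# A UTF-16 error: a lone high surrogate arriving through a JSON escape.
lone = json.loads('"\\ud83d"')

# WTF-8 stores a lone surrogate as an ordinary three-byte sequence.
wtf8_bytes = lone.encode("utf-8", errors="surrogatepass")

# Emitting JSON keeps the error as a \uXXXX escape.
as_json = json.dumps(lone)

# Emitting a raw string turns the UTF-16 error into U+FFFD.
as_raw = "".join(
    "\ufffd" if 0xD800 <= ord(c) <= 0xDFFF else c for c in lone
).encode("utf-8")
```

Note that `surrogateescape` stores invalid bytes as code points U+DC80 to U+DCFF, so it cannot tell an escaped byte apart from a genuine lone low surrogate in that range. The commit message above describes a representation that keeps the two kinds of error distinct.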
Commit 79f0479
Correct UTF-8 and UTF-16 errors during concatenation (7fde46e)
UTF-8 errors and UTF-16 errors that were previously encoded into the ends of strings may now be combined to form correct code points. This is mostly a matter of making string equality behave as expected: without this normalisation, it is possible to produce `jv` strings that convert to UTF-8 or UTF-16 identically but compare unequal, because well-formed code units may or may not be encoded as errors.
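The equality problem described above can be reproduced with Python strings, which also allow error fragments to sit next to each other without being merged. The sketch below is an analogy, not the `jv` implementation; `concat_utf8_aware` and `concat_utf16_aware` are hypothetical helpers, and the UTF-8 helper only handles bytes escaped with `surrogateescape`.

```python
def concat_utf8_aware(a: str, b: str) -> str:
    """Concatenate, letting escaped UTF-8 fragments merge into code points.

    Only supports strings whose errors are surrogateescape-encoded bytes.
    """
    joined = (a + b).encode("utf-8", errors="surrogateescape")
    return joined.decode("utf-8", errors="surrogateescape")


def concat_utf16_aware(a: str, b: str) -> str:
    """Concatenate, merging a trailing high and a leading low surrogate."""
    if a and b:
        hi, lo = ord(a[-1]), ord(b[0])
        if 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF:
            cp = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
            return a[:-1] + chr(cp) + b[1:]
    return a + b


# "é" is 0xC3 0xA9 in UTF-8; split it so each half is an error on its own.
left = b"\xc3".decode("utf-8", errors="surrogateescape")
right = b"\xa9".decode("utf-8", errors="surrogateescape")

naive_utf8 = left + right                      # two error fragments
fixed_utf8 = concat_utf8_aware(left, right)    # the single code point "é"

# U+1F600 is the surrogate pair D83D DE00 in UTF-16.
naive_utf16 = "\ud83d" + "\ude00"              # two lone surrogates
fixed_utf16 = concat_utf16_aware("\ud83d", "\ude00")
```

In both cases the naive result converts to the same bytes as the well-formed string but does not compare equal to it, which is the inconsistency the normalisation removes.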
Commit 2e1b5d2
Preserve UTF-8 and UTF-16 errors in `explode` (f68f25b)
Errors are emitted as negative code points instead of being transformed into replacement characters. `implode` is also updated accordingly so the original string can be reconstructed without data loss.
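A rough Python model of the round trip described above follows. The commit message only says that errors become negative code points; the specific mapping used here (an invalid byte `b` becomes `-b`) is an assumption made for illustration, and this sketch covers UTF-8 errors only. The functions are Python stand-ins named after the jq builtins, not the jq code.

```python
def explode(raw: bytes) -> list[int]:
    """Return code points; invalid UTF-8 bytes become negative numbers.

    Assumed mapping: an invalid byte 0x80..0xFF is emitted as its negation.
    """
    out = []
    for ch in raw.decode("utf-8", errors="surrogateescape"):
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:      # surrogateescape marker for a raw byte
            out.append(-(cp - 0xDC00))  # recover the byte, then negate it
        else:
            out.append(cp)
    return out


def implode(cps: list[int]) -> bytes:
    """Inverse of explode: negative entries are written back as raw bytes."""
    out = bytearray()
    for cp in cps:
        if -0xFF <= cp <= -0x80:
            out.append(-cp)
        else:
            out += chr(cp).encode("utf-8")
    return bytes(out)


data = b"a\xffb"
points = explode(data)       # [97, -255, 98]
rebuilt = implode(points)    # b"a\xffb"
```

Because the error values are negative, they cannot collide with any valid code point, which is what allows `implode` to rebuild the input exactly.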
Remove UTF-8 backtracking workaround (5c2fe32)
This is no longer needed as strings are capable of storing partial UTF-8 sequences.
Commit 911d01a