-
Notifications
You must be signed in to change notification settings - Fork 194
Enable string conversion in EUC-JP. #1296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable string conversion in EUC-JP. #1296
Conversation
@swift-ci Please test |
Sources/FoundationEssentials/String/StringProtocol+Essentials.swift
Outdated
Show resolved
Hide resolved
Sources/FoundationInternationalization/ICU/ICU+StringConverter.swift
Outdated
Show resolved
Hide resolved
Sources/FoundationEssentials/String/StringProtocol+Essentials.swift
Outdated
Show resolved
Hide resolved
7ae3f7f
to
b0a8981
Compare
@swift-ci Please test |
@parkera @jmschonfeld Thank you for reviewing. In response to your feedback, I've made some changes:
|
Please let me request reviews again as per changes above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good to me, but also want @jmschonfeld to chime in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for putting this together!
96de131
to
bc1656d
Compare
Background: EUC-JP is not supported by OSS CoreFoundation, while it is supported by macOS Foundation Framework. See swiftlang#1016 This commit resolves the issue by calling ICU API if necessary.
bc1656d
to
844479e
Compare
@swift-ci Please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - that resolves the build failures, but I have a comment about one issue in the tests that I've discovered as well (this should hopefully be the last thing, thanks for bearing with me)
XCTAssertNil(sushi.data(using: String._Encoding.japaneseEUC)) | ||
XCTAssertEqual( | ||
sushi.data(using: String._Encoding.japaneseEUC, allowLossyConversion: true), | ||
"Sushi?".data(using: .utf8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test unfortunately fails in the FOUNDATION_FRAMEWORK
configuration because Foundation.framework
's implementation of this conversion replaces both UTF-16 scalars of the emoji with ?
rather than replacing the entire grapheme cluster with a single ?
. I suspect that maybe ICU is replacing only with a single ?
- do you happen to know if that's intentional? If so, for now we might need to vary the expected value to be Sushi??
in FOUNDATION_FRAMEWORK
vs. Sushi?
in !FOUNDATION_FRAMEWORK
to account for the discrepancy between Foundation.framework
and ICU (assuming we think this result from ICU is indeed correct and not a bug in how we're calling ICU)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my overlooking again, but I'm not sure if we should implement conversion just like Foundation framework.
I mean the behavior of Foundation framework looks buggy because it doesn't seem to handle surrogate pairs correctly. (🍣 is non-BMP but consists of only one scalar U+1F363
. Does that "buggy" behavior come from the historical reason that NSString
has been represented as a sequence of UTF–16??)
Of course, we can imitate such behavior by replacing UCNV_FROM_U_CALLBACK_SUBSTITUTE
with a custom callback function.
However, let me point out that current swift-foundation
's implementation for .ascii
is code-point-basis, not UTF-16-value-basis:
import Foundation
let sushiEmoji = "🍣"
print(sushiEmoji.data(using: .ascii, allowLossyConversion: true)!.count)
// -> Prints "1"
Options...?
- To keep consistent with current (e.g.)
.ascii
implementation:- Divide the test case for
FOUNDATION_FRAMEWORK
and!FOUNDATION_FRAMEWORK
.
- Divide the test case for
- To make it compatible with Foundation framework's behavior:
- Implement a custom callback for
ucnv_setFromUCallBack
.
- Implement a custom callback for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I suspect this behavior comes from the fact that NSString
is natively stored as UTF-16. For now, let's just divide the test case with a FOUNDATION_FRAMEWORK
/!FOUNDATION_FRAMEWORK
value for this expectation since I think enabling the conversion at all on non-Darwin is a good step even if the behavior is slightly different than Foundation.framework
, and I can come back with a separate change to look into making this behavior the same across both platforms (while ensuring we don't break compatibility with existing clients).
…ersion. In response to: swiftlang#1296 (comment)
@jmschonfeld (I didn't know CI didn't execute test cases with |
@swift-ci Please test |
Thanks for all of your updates - looks great to me now! |
* Enable string conversion in EUC-JP. Background: EUC-JP is not supported by OSS CoreFoundation, while it is supported by macOS Foundation Framework. See swiftlang#1016 This commit resolves the issue by calling ICU API if necessary. * ICU: Omit encodings that should be supported by FoundationEssentials. In response to: swiftlang#1296 (comment) * ICU: Remove unnecessary `nonisolated(unsafe)` from static property. In response to: swiftlang#1296 (comment) * Add comment to `func _icuMakeStringFromBytes_impl`. In response to: swiftlang#1296 (comment) * Delegate string conversion to ICU only when encoding is EUC-JP. In response to: swiftlang#1296 (comment) * Replace dynamic `_icu*` functions only if `!FOUNDATION_FRAMEWORK`. In response to: swiftlang#1296 (comment) * Divide test cases depending on `FOUNDATION_FRAMEWORK` for EUC_JP conversion. In response to: swiftlang#1296 (comment)
Background: EUC-JP is not supported by OSS CoreFoundation, while it is supported by macOS Foundation Framework.
See #1016
This PR resolves the issue by calling ICU API if necessary.