Commit 59c7ea3

Merge pull request #965 from hsivonen/adapter
Enable choice from multiple Unicode back ends
2 parents 9163f30 + 662970f commit 59c7ea3

6 files changed, +152 -196 lines changed
README.md

Lines changed: 4 additions & 0 deletions
@@ -12,3 +12,7 @@ URL library for Rust, based on the [URL Standard](https://url.spec.whatwg.org/).
 [Documentation](https://docs.rs/url)
 
 Please see [UPGRADING.md](https://github.com/servo/rust-url/blob/main/UPGRADING.md) if you are upgrading from a previous version.
+
+## Alternative Unicode back ends
+
+`url` depends on the `idna` crate. By default, `idna` uses [ICU4X](https://github.com/unicode-org/icu4x/) as its Unicode back end. If you wish to opt for different tradeoffs between correctness, run-time performance, binary size, compile time, and MSRV, please see the [README of the latest version of the `idna_adapter` crate](https://docs.rs/crate/idna_adapter/latest) for how to opt into a different Unicode back end.
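
The added README text defers to the `idna_adapter` README for the actual opt-in steps. As a rough sketch of what such an opt-in could look like in a top-level `Cargo.toml`: Cargo resolves a single `1.x` version of `idna_adapter` for the whole build, so an exact pin in the application also fixes the version that `idna` sees. The version number below is an assumption (that `idna_adapter` 1.1.0 is the release backed by unicode-rs) and is not stated anywhere in this commit; check the crate's README for the current mapping before relying on it.

```toml
# Hypothetical application manifest; only the idna_adapter pin matters here.
[dependencies]
url = "2"

# ASSUMPTION: 1.1.0 is the idna_adapter release that uses unicode-rs instead
# of ICU4X. The "=" requirement forces exactly this version, and because
# idna asks for idna_adapter = "1", Cargo unifies both onto the same release.
idna_adapter = "=1.1.0"
```

Leaving the pin out keeps the default ICU4X back end described in the paragraph above.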

idna/Cargo.toml

Lines changed: 4 additions & 5 deletions
@@ -1,14 +1,14 @@
 [package]
 name = "idna"
-version = "1.0.2"
+version = "1.0.3"
 authors = ["The rust-url developers"]
 description = "IDNA (Internationalizing Domain Names in Applications) and Punycode."
 keywords = ["no_std", "web", "http"]
 repository = "https://github.com/servo/rust-url/"
 license = "MIT OR Apache-2.0"
 autotests = false
 edition = "2018"
-rust-version = "1.67"
+rust-version = "1.57" # For panic in const context
 
 [lib]
 doctest = false
@@ -17,7 +17,7 @@ doctest = false
 default = ["std", "compiled_data"]
 std = ["alloc"]
 alloc = []
-compiled_data = ["icu_normalizer/compiled_data", "icu_properties/compiled_data"]
+compiled_data = ["idna_adapter/compiled_data"]
 
 [[test]]
 name = "tests"
@@ -36,10 +36,9 @@ tester = "0.9"
 serde_json = "1.0"
 
 [dependencies]
-icu_normalizer = "1.4.3"
-icu_properties = "1.4.2"
 utf8_iter = "1.0.4"
 smallvec = { version = "1.13.1", features = ["const_generics"]}
+idna_adapter = "1"
 
 [[bench]]
 name = "all"

idna/README.md

Lines changed: 4 additions & 0 deletions
@@ -28,6 +28,10 @@ Apps that need to display host names to the user should use `uts46::Uts46::to_us
 * `std` - Adds `impl std::error::Error for Errors {}` (and implies `alloc`).
 * By default, all of the above are enabled.
 
+## Alternative Unicode back ends
+
+By default, `idna` uses [ICU4X](https://github.com/unicode-org/icu4x/) as its Unicode back end. If you wish to opt for different tradeoffs between correctness, run-time performance, binary size, compile time, and MSRV, please see the [README of the latest version of the `idna_adapter` crate](https://docs.rs/crate/idna_adapter/latest) for how to opt into a different Unicode back end.
+
 ## Breaking changes since 0.5.0
 
 * Stricter IDNA 2008 restrictions are no longer supported. Attempting to enable them panics immediately. UTS 46 allows all the names that IDNA 2008 allows, and when transitional processing is disabled, they resolve the same way. There are additional names that IDNA 2008 disallows but UTS 46 maps to names that IDNA 2008 allows (notably, input is mapped to fold-case output). UTS 46 also allows symbols that were allowed in IDNA 2003 as well as newer symbols that are allowed according to the same principle. (Earlier versions of this crate allowed rejecting such symbols. Rejecting characters that UTS 46 maps to IDNA 2008-permitted characters wasn't supported in earlier versions, either.)

idna/benches/all.rs

Lines changed: 6 additions & 0 deletions
@@ -54,6 +54,11 @@ fn to_ascii_cow_plain(bench: &mut Bencher) {
     bench.iter(|| idna::domain_to_ascii_cow(black_box(encoded), idna::AsciiDenyList::URL));
 }
 
+fn to_ascii_cow_hyphen(bench: &mut Bencher) {
+    let encoded = "hyphenated-example.com".as_bytes();
+    bench.iter(|| idna::domain_to_ascii_cow(black_box(encoded), idna::AsciiDenyList::URL));
+}
+
 fn to_ascii_cow_leading_digit(bench: &mut Bencher) {
     let encoded = "1test.example".as_bytes();
     bench.iter(|| idna::domain_to_ascii_cow(black_box(encoded), idna::AsciiDenyList::URL));
@@ -99,6 +104,7 @@ benchmark_group!(
     to_ascii_simple,
     to_ascii_merged,
     to_ascii_cow_plain,
+    to_ascii_cow_hyphen,
     to_ascii_cow_leading_digit,
     to_ascii_cow_unicode_mixed,
     to_ascii_cow_punycode_mixed,
