Conversation
Instead of having the latest element in enums act as the count for enums, we can use std::mem::variant_count.
In localization initialization, we now check if the system-preffered language starts with a prefix rather than checking for equality. This should now correctly detect languages. This fixes microsoft#74. fix typo
ffda5eb to
3e1c6b9
Compare
|
Thanks! I'll leave this to Leonard to merge, if he has any feedback |
src/bin/edit/localization.rs
Outdated
| let mut lang = LangId::en; | ||
|
|
||
| for l in langs { | ||
| println!("lang: {}", l); |
There was a problem hiding this comment.
There's some leftover debug code.
src/bin/edit/main.rs
Outdated
| // Licensed under the MIT License. | ||
|
|
||
| #![feature(let_chains, linked_list_cursors, os_string_truncate, string_from_utf8_lossy_owned)] | ||
| #![feature(let_chains, linked_list_cursors, os_string_truncate, string_from_utf8_lossy_owned, variant_count)] |
There was a problem hiding this comment.
We also received very rightful criticism that we shouldn't depend on too many nightly features. As such, I think we should back out this particular change.
src/helpers.rs
Outdated
| // If the bytes don't match, it's not a prefix | ||
| if text_byte != prefix_byte { |
There was a problem hiding this comment.
What I meant is that we need a case-insensitive starts_with function. Since the region codes are expected to be ASCII-only we could write an ASCII-only starts_with function without relying on ICU to do the comparison (since ICU may be unavailable).
src/helpers.rs
Outdated
| for i in 0..prefix.len() { | ||
| // Get the ASCII byte from both text and prefix | ||
| let text_byte = text_bytes[i]; | ||
| let prefix_byte = prefix_bytes[i]; |
There was a problem hiding this comment.
I forgot to mention that you can use text_bytes.iter().zip(prefix_bytes.iter()) for such patterns. See here: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.zip
|
I pushed directly in your PR as I unfortunately messed up a bit in the unix code and I felt like it would fit if I used your PR to fix both things at once. I apologize. |
| const LANG_MAP: &[(&str, LangId)] = &[ | ||
| ("en", LangId::en), | ||
| // ---------------- | ||
| ("de", LangId::de), | ||
| ("es", LangId::es), | ||
| ("fr", LangId::fr), | ||
| ("it", LangId::it), | ||
| ("ja", LangId::ja), | ||
| ("ko", LangId::ko), | ||
| ("pt-br", LangId::pt_br), | ||
| ("ru", LangId::ru), | ||
| ("zh-hant", LangId::zh_hant), | ||
| ("zh-tw", LangId::zh_hant), | ||
| ("zh", LangId::zh_hans), | ||
| ]; |
There was a problem hiding this comment.
This trick is neat IMO because it reduces the binary size quite a bit. It also allows the compiler to inline the starts_with_ignore_ascii_case call since there's only 1 call to it now.
| // Since the comparison is ASCII, we don't need to worry about that. | ||
| let s = self.as_bytes(); | ||
| let p = prefix.as_bytes(); | ||
| p.len() <= s.len() && s[..p.len()].eq_ignore_ascii_case(p) |
There was a problem hiding this comment.
Makes the code a lot shorter and easier to read IMO.
I'm not gonna lie here, Copilot told me about eq_ignore_ascii_case. Convenient search engine. 😅
| let mut res = Vec::new_in(arena); | ||
| res.extend(s.as_bytes().iter().map(|&b| if b == b'_' { b'-' } else { b })); | ||
| unsafe { ArenaString::from_utf8_unchecked(res) } |
There was a problem hiding this comment.
I broke this by giving wrong direction over in #104. This code has the benefit of being more compact (no extra string copies).
(Iterators in Rust have size hints so the .extend() call knows exactly how many items it needs to expect and can preallocate accordingly.)
Closes microsoft#74 Co-authored-by: Leonard Hecker <leonard@hecker.io>
In localization initialization, we now check if the system-preferred
language starts with a prefix rather than checking for equality. This
should now correctly detect languages.
Instead of having the latest element in enums act as the count for
enums, we can use std::mem::variant_count.
This also fixes #74