forked from chromium/chromium
Remove CaseInsensitiveCompare from string_util.h
There were a number of callers in net using this for HTTP headers. I think these callers actually just need ASCII case-insensitive comparisons, so these were changed.

The omnibox code used this functor. I added a new omnibox-specific one which does not have the locale issues of the old string_util one, but which still has the UTF-16 and combining accent issues (described in great detail in the comment for this).

The Windows installer code can't depend on ICU so it calls the Win32 function to do case-insensitive comparisons. This should match the system comparison for registry keys better anyway.

I also changed a caller of StartsWith to use this version. I wrote this StartsWith call using ToLower in a previous patch, but it turns out that the lengths of case-mapped strings do change in practice, making the offset computations of the surrounding code incorrect. This new version will be like the old version (will miss some cases of case-insensitive equality) but will handle 0x80-0xFF properly.

BUG=24917
Review URL: https://codereview.chromium.org/1230583014
Cr-Commit-Position: refs/heads/master@{#338624}
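For reference, an ASCII-only case-insensitive comparison of the kind described for the HTTP header callers could look roughly like the sketch below. The helper names `AsciiToLower` and `EqualsIgnoreAsciiCase` are made up for illustration and are not claimed to be the functions this commit actually calls.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: lower-cases only 'A'..'Z'. Every other byte is
// returned unchanged, so the result never depends on the C locale.
inline char AsciiToLower(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Hypothetical helper: true if |a| and |b| are equal when only ASCII
// letters are compared case-insensitively. Bytes >= 0x80 must match exactly.
inline bool EqualsIgnoreAsciiCase(const std::string& a, const std::string& b) {
  return a.size() == b.size() &&
         std::equal(a.begin(), a.end(), b.begin(), [](char x, char y) {
           return AsciiToLower(x) == AsciiToLower(y);
         });
}
```

With this sketch, "Content-Type" and "content-type" compare equal, while two strings that differ only in a non-ASCII byte do not.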
brettw authored and Commit bot committed Jul 14, 2015
1 parent 5b3151c, commit a2027fb
Showing 11 changed files with 92 additions and 45 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
// Copyright 2015 The Chromium Authors. All rights reserved. | ||
// Use of this source code is governed by a BSD-style license that can be | ||
// found in the LICENSE file. | ||
|
||
#ifndef COMPONENTS_OMNIBOX_BROWSER_AUTOCOMPLETE_I18N_H_ | ||
#define COMPONENTS_OMNIBOX_BROWSER_AUTOCOMPLETE_I18N_H_ | ||
|
||
#include "base/strings/string16.h" | ||
#include "third_party/icu/source/common/unicode/uchar.h" | ||
|
||
// Functor for a simple 16-bit Unicode case-insensitive comparison. This is | ||
// designed for the autocomplete system where we would rather get prefix lengths | ||
// correct than handle all possible case sensitivity issues. | ||
// | ||
// Any time this is used the result will be incorrect in some cases that | ||
// certain users will be able to discern. Ideally, this class would be deleted | ||
// and we would do full Unicode case-sensitivity mappings using | ||
// base::i18n::ToLower. However, ToLower can change the lengths of strings, | ||
// making computations of offsets or prefix lengths difficult. Getting all | ||
// edge cases correct will require careful implementation and testing. In the | ||
// meantime, we use this simpler approach. | ||
// | ||
// This comparator will not handle combining accents properly since it compares | ||
// 16-bit values in isolation. If the two strings use the same sequence of | ||
// combining accents (this is the normal case) in both strings, it will work. | ||
// | ||
// Additionally, this comparator does not decode UTF sequences which is why it | ||
// is called "UCS2". UTF-16 surrogates will be compared literally (i.e. "case- | ||
// sensitively"). | ||
// | ||
// There are also a few cases where the lower-case version of a character | ||
// expands to more than one code point that will not be handled properly. Such | ||
// characters will be compared case-sensitively. | ||
struct SimpleCaseInsensitiveCompareUCS2 { | ||
public: | ||
bool operator()(base::char16 x, base::char16 y) const { | ||
return u_tolower(x) == u_tolower(y); | ||
} | ||
}; | ||
|
||
#endif // COMPONENTS_OMNIBOX_BROWSER_AUTOCOMPLETE_I18N_H_ |
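
As a rough illustration of why an element-wise comparator keeps prefix lengths and offsets valid, the sketch below computes a case-insensitive common prefix length in UTF-16 code units. It is not the omnibox code: `SketchToLower` is a made-up stand-in for ICU's `u_tolower` that only covers ASCII and Latin-1 so the example builds without ICU, and `SketchCaseInsensitiveCompare` and `CommonPrefixLength` are likewise hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Assumption: stand-in for ICU's u_tolower, covering only ASCII and Latin-1.
// It maps one code unit to one code unit, so string lengths never change.
inline char16_t SketchToLower(char16_t c) {
  if (c >= u'A' && c <= u'Z')
    return static_cast<char16_t>(c + 0x20);
  // U+00C0..U+00DE are upper-case letters, except U+00D7 (multiplication sign).
  if (c >= 0x00C0 && c <= 0x00DE && c != 0x00D7)
    return static_cast<char16_t>(c + 0x20);
  return c;
}

// Same shape as the functor in the header above, using the stand-in mapping.
struct SketchCaseInsensitiveCompare {
  bool operator()(char16_t x, char16_t y) const {
    return SketchToLower(x) == SketchToLower(y);
  }
};

// Hypothetical helper: length, in UTF-16 code units, of the longest prefix of
// |a| and |b| that matches under the comparator. Because each code unit is
// compared in place, the result is a valid offset into both original strings.
inline std::size_t CommonPrefixLength(const std::u16string& a,
                                      const std::u16string& b) {
  const std::ptrdiff_t n =
      static_cast<std::ptrdiff_t>(std::min(a.size(), b.size()));
  auto diff = std::mismatch(a.begin(), a.begin() + n, b.begin(),
                            SketchCaseInsensitiveCompare());
  return static_cast<std::size_t>(diff.first - a.begin());
}
```

A full-string ToLower followed by a comparison would not give this guarantee, since the lower-cased strings may differ in length from the originals.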