Skip to content

Commit

Permalink
Revert 221380 "Use Finch to compare the performances of CLD1 and..."
Browse files Browse the repository at this point in the history
Broke
http://build.chromium.org/p/chromium.chromiumos/buildstatus?builder=ChromiumOS%20%28daisy%29&number=12452

and maybe
http://build.chromium.org/p/chromium/buildstatus?builder=Linux%20x64&number=55236
http://build.chromium.org/p/chromium.win/buildstatus?builder=Win%20x64%20Builder&number=9603

> Use Finch to compare the performances of CLD1 and CLD2
> 
> Add a compile time constant CLD_VERSION, which indicates the version of CLD. If this is not define, Finch test to compare CLD1 and CLD2 is supposed to be used.
> 
> By this CL, each platform will have the below status:
> 
> Linux:    Use both CLD1 and CLD2 (and use Finch).
> Mac OS X: Use both CLD1 and CLD2 (and use Finch).
> Windows:  Use only CLD1 once because now CLD2 can't be compiled on Windows. After we can have CLD2 compiled on Windows, we will use CLD2 and Finch asap.
> iOS:      Still use only CLD1. (It's because it is hard to use both CLD1 and CLD2 on mobile platform because of the binary size impact.)
> Android:  Still use only CLD1. (The same reason as iOS)
> 
> So some platforms will have two CLD binaries, but this is temporal in the sense that we intend to use Finch only for Dev and Beta channel. Before releasing the stable Chromium version, we decide which version of CLD is adopted, make another CL to use only one CLD, and send a merge request. (Of course, we hope we will be able to adopt CLD2.)
> 
> BUG=240647
> 
> Review URL: https://chromiumcodereview.appspot.com/22867032

TBR=hajimehoshi@chromium.org

Review URL: https://codereview.chromium.org/23766011

git-svn-id: svn://svn.chromium.org/chrome/trunk/src@221391 0039d316-1c4b-4281-b951-d872f2087c98
  • Loading branch information
pneubeck@chromium.org committed Sep 5, 2013
1 parent 10d486e commit 6c2beac
Show file tree
Hide file tree
Showing 13 changed files with 24 additions and 432 deletions.
1 change: 1 addition & 0 deletions build/all.gyp
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@
'../printing/printing.gyp:*',
'../skia/skia.gyp:*',
'../third_party/cacheinvalidation/cacheinvalidation.gyp:*',
'../third_party/cld/cld.gyp:*',
'../third_party/codesighs/codesighs.gyp:*',
'../third_party/ffmpeg/ffmpeg.gyp:*',
'../third_party/iccjpeg/iccjpeg.gyp:*',
Expand Down
16 changes: 0 additions & 16 deletions build/common.gypi
Original file line number Diff line number Diff line change
Expand Up @@ -341,12 +341,6 @@
# print, UI, etc.
'enable_printing%': 1,

# Set the version of CLD.
# 0: Don't specify the version. This option is for the Finch testing.
# 1: Use only CLD1.
# 2: Use only CLD2.
'cld_version%': 0,

# Enable spell checker.
'enable_spellcheck%': 1,

Expand Down Expand Up @@ -516,15 +510,10 @@
'enable_one_click_signin%': 1,
}],

['OS=="win"', {
'cld_version%': 1,
}],

['OS=="android"', {
'enable_automation%': 0,
'enable_extensions%': 0,
'enable_google_now%': 0,
'cld_version%': 1,
'enable_spellcheck%': 0,
'enable_themes%': 0,
'remoting%': 0,
Expand Down Expand Up @@ -574,7 +563,6 @@
'enable_automation%': 0,
'enable_extensions%': 0,
'enable_google_now%': 0,
'cld_version%': 1,
'enable_printing%': 0,
'enable_session_service%': 0,
'enable_themes%': 0,
Expand Down Expand Up @@ -863,7 +851,6 @@
'enable_printing%': '<(enable_printing)',
'enable_spellcheck%': '<(enable_spellcheck)',
'enable_google_now%': '<(enable_google_now)',
'cld_version%': '<(cld_version)',
'enable_captive_portal_detection%': '<(enable_captive_portal_detection)',
'disable_ftp_support%': '<(disable_ftp_support)',
'enable_task_manager%': '<(enable_task_manager)',
Expand Down Expand Up @@ -2285,9 +2272,6 @@
['enable_google_now==1', {
'defines': ['ENABLE_GOOGLE_NOW=1'],
}],
['cld_version!=0', {
'defines': ['CLD_VERSION=<(cld_version)'],
}],
['enable_printing==1', {
'defines': ['ENABLE_FULL_PRINTING=1', 'ENABLE_PRINTING=1'],
}],
Expand Down
11 changes: 1 addition & 10 deletions chrome/chrome_common.gypi
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
# TODO(gregoryd): chrome_resources and chrome_strings could be
# shared with the 64-bit target, but it does not work due to a gyp
# issue.
'../third_party/cld/cld.gyp:cld',
'common_net',
'common_version',
'installer_util',
Expand Down Expand Up @@ -666,16 +667,6 @@
'common/print_messages.h',
]
}],
['cld_version==0 or cld_version==1', {
'dependencies': [
'<(DEPTH)/third_party/cld/cld.gyp:cld',
],
}],
['cld_version==0 or cld_version==2', {
'dependencies': [
'<(DEPTH)/third_party/cld_2/cld_2.gyp:cld_2',
],
}],
],
'target_conditions': [
['OS == "ios"', {
Expand Down
11 changes: 1 addition & 10 deletions chrome/chrome_dll.gypi
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,7 @@
'../crypto/crypto.gyp:crypto',
'../printing/printing.gyp:printing',
'../net/net.gyp:net_resources',
'../third_party/cld/cld.gyp:cld',
'../ui/views/views.gyp:views',
'../webkit/webkit_resources.gyp:webkit_resources',
],
Expand Down Expand Up @@ -232,16 +233,6 @@
'../content/content.gyp:content_app_browser',
],
}],
['cld_version==0 or cld_version==1', {
'dependencies': [
'../third_party/cld/cld.gyp:cld',
],
}],
['cld_version==0 or cld_version==2', {
'dependencies': [
'../third_party/cld_2/cld_2.gyp:cld_2',
],
}],
['OS=="mac" and component!="shared_library"', {
'includes': [ 'chrome_dll_bundle.gypi' ],
}],
Expand Down
2 changes: 2 additions & 0 deletions chrome/chrome_tests.gypi
Original file line number Diff line number Diff line change
Expand Up @@ -1151,6 +1151,7 @@
'../testing/gmock.gyp:gmock',
'../testing/gtest.gyp:gtest',
'../third_party/cacheinvalidation/cacheinvalidation.gyp:cacheinvalidation',
'../third_party/cld/cld.gyp:cld',
'../third_party/icu/icu.gyp:icui18n',
'../third_party/icu/icu.gyp:icuuc',
'../third_party/leveldatabase/leveldatabase.gyp:leveldatabase',
Expand Down Expand Up @@ -2164,6 +2165,7 @@
'../testing/gmock.gyp:gmock',
'../testing/gtest.gyp:gtest',
'../testing/perf/perf_test.gyp:*',
'../third_party/cld/cld.gyp:cld',
'../third_party/icu/icu.gyp:icui18n',
'../third_party/icu/icu.gyp:icuuc',
'../third_party/leveldatabase/leveldatabase.gyp:leveldatabase',
Expand Down
26 changes: 10 additions & 16 deletions chrome/chrome_tests_unit.gypi
Original file line number Diff line number Diff line change
Expand Up @@ -487,6 +487,14 @@
'include_dirs': [
'..',
],
'defines': [
'CLD_WINDOWS',
],
'direct_dependent_settings': {
'defines': [
'CLD_WINDOWS',
],
},
'msvs_settings': {
'VCLinkerTool': {
'conditions': [
Expand Down Expand Up @@ -1932,6 +1940,7 @@
'../skia/ext/skia_utils_mac_unittest.mm',
'../skia/ext/vector_canvas_unittest.cc',
'../testing/gtest_mac_unittest.mm',
'../third_party/cld/encodings/compact_lang_det/compact_lang_det_unittest_small.cc',
'../third_party/zlib/google/zip_reader_unittest.cc',
'../third_party/zlib/google/zip_unittest.cc',
'../tools/json_schema_compiler/test/additional_properties_unittest.cc',
Expand Down Expand Up @@ -1969,6 +1978,7 @@
'../gpu/gpu.gyp:gpu_unittest_utils',
'../media/media.gyp:media_test_support',
'../ppapi/ppapi_internal.gyp:ppapi_unittest_shared',
'../third_party/cld/cld.gyp:cld',
'../third_party/leveldatabase/leveldatabase.gyp:leveldatabase',
'../third_party/libjingle/libjingle.gyp:libjingle',
'../third_party/libphonenumber/libphonenumber.gyp:libphonenumber',
Expand Down Expand Up @@ -2625,22 +2635,6 @@
['exclude', '^browser/extensions/blacklist_unittest.cc'],
],
}],
['cld_version==0 or cld_version==1', {
'defines': [
'CLD_WINDOWS',
],
'direct_dependent_settings': {
'defines': [
'CLD_WINDOWS',
],
},
'sources': [
'../third_party/cld/encodings/compact_lang_det/compact_lang_det_unittest_small.cc',
],
'dependencies': [
'../third_party/cld/cld.gyp:cld',
],
}],
],
'target_conditions': [
['OS == "ios"', {
Expand Down
1 change: 0 additions & 1 deletion chrome/common/DEPS
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,6 @@ include_rules = [
"+third_party/bzip2",
"+third_party/cld",
"+third_party/cld/encodings/compact_lang_det/win",
"+third_party/cld_2/src",
"+third_party/mt19937ar",
"+third_party/npapi",
"+third_party/re2",
Expand Down
100 changes: 8 additions & 92 deletions chrome/common/translate/language_detection_util.cc
Original file line number Diff line number Diff line change
Expand Up @@ -5,23 +5,14 @@
#include "chrome/common/translate/language_detection_util.h"

#include "base/logging.h"
#include "base/metrics/field_trial.h"
#include "base/strings/string_split.h"
#include "base/strings/string_util.h"
#include "base/strings/utf_string_conversions.h"
#include "base/time/time.h"
#include "chrome/common/chrome_constants.h"
#include "chrome/common/translate/translate_common_metrics.h"
#include "chrome/common/translate/translate_util.h"

#if !defined(CLD_VERSION) || CLD_VERSION==1
#include "third_party/cld/encodings/compact_lang_det/compact_lang_det.h"
#include "third_party/cld/encodings/compact_lang_det/win/cld_unicodetext.h"
#endif

#if !defined(CLD_VERSION) || CLD_VERSION==2
#include "third_party/cld_2/src/public/compact_lang_det.h"
#endif

namespace {

Expand Down Expand Up @@ -70,63 +61,18 @@ void ApplyLanguageCodeCorrection(std::string* code) {
TranslateUtil::ToTranslateLanguageSynonym(code);
}

int GetCLDMajorVersion() {
#if !defined(CLD_VERSION)
std::string group_name = base::FieldTrialList::FindFullName("CLD1VsCLD2");
if (group_name == "CLD2")
return 2;
else
return 1;
#else
return CLD_VERSION;
#endif
}

// Returns the ISO 639 language code of the specified |text|, or 'unknown' if it
// failed.
// |is_cld_reliable| will be set as true if CLD says the detection is reliable.
std::string DetermineTextLanguage(const base::string16& text,
bool* is_cld_reliable) {
std::string language = chrome::kUnknownLanguageCode;
int num_languages = 0;
int text_bytes = 0;
bool is_reliable = false;

// Language or CLD2::Language
int cld_language = 0;
bool is_valid_language = false;

switch (GetCLDMajorVersion()) {
#if !defined(CLD_VERSION) || CLD_VERSION==1
case 1: {
int num_languages = 0;
cld_language =
DetectLanguageOfUnicodeText(NULL, text.c_str(), true, &is_reliable,
&num_languages, NULL, &text_bytes);
is_valid_language = cld_language != NUM_LANGUAGES &&
cld_language != UNKNOWN_LANGUAGE &&
cld_language != TG_UNKNOWN_LANGUAGE;
break;
}
#endif
#if !defined(CLD_VERSION) || CLD_VERSION==2
case 2: {
std::string utf8_text(UTF16ToUTF8(text));
CLD2::Language language3[3];
int percent3[3];
cld_language =
CLD2::DetectLanguageSummary(utf8_text.c_str(), utf8_text.size(), true,
language3, percent3,
&text_bytes, &is_reliable);
is_valid_language = cld_language != CLD2::NUM_LANGUAGES &&
cld_language != CLD2::UNKNOWN_LANGUAGE &&
cld_language != CLD2::TG_UNKNOWN_LANGUAGE;
break;
}
#endif
default:
NOTREACHED();
}

Language cld_language =
DetectLanguageOfUnicodeText(NULL, text.c_str(), true, &is_reliable,
&num_languages, NULL, &text_bytes);
if (is_cld_reliable != NULL)
*is_cld_reliable = is_reliable;

Expand All @@ -136,33 +82,15 @@ std::string DetermineTextLanguage(const base::string16& text,
// TODO(toyoshim): CLD provides |is_reliable| flag. But, it just says that
// the determined language code is correct with 50% confidence. Chrome should
// handle the real confidence value to judge.
if (is_reliable && text_bytes >= 100 && is_valid_language) {
if (is_reliable && text_bytes >= 100 && cld_language != NUM_LANGUAGES &&
cld_language != UNKNOWN_LANGUAGE && cld_language != TG_UNKNOWN_LANGUAGE) {
// We should not use LanguageCode_ISO_639_1 because it does not cover all
// the languages CLD can detect. As a result, it'll return the invalid
// language code for tradtional Chinese among others.
// |LanguageCodeWithDialect| will go through ISO 639-1, ISO-639-2 and
// 'other' tables to do the 'right' thing. In addition, it'll return zh-CN
// for Simplified Chinese.
switch (GetCLDMajorVersion()) {
#if !defined(CLD_VERSION) || CLD_VERSION==1
case 1:
language =
LanguageCodeWithDialects(static_cast<Language>(cld_language));
break;
#endif
#if !defined(CLD_VERSION) || CLD_VERSION==2
case 2:
if (cld_language == CLD2::CHINESE) {
language = "zh-CN";
} else {
language =
CLD2::LanguageCode(static_cast<CLD2::Language>(cld_language));
}
break;
#endif
default:
NOTREACHED();
}
language = LanguageCodeWithDialects(cld_language);
}
VLOG(9) << "Detected lang_id: " << language << ", from Text:\n" << text
<< "\n*************************************\n";
Expand Down Expand Up @@ -363,19 +291,7 @@ bool MaybeServerWrongConfiguration(const std::string& page_language,
}

std::string GetCLDVersion() {
switch (GetCLDMajorVersion()) {
#if !defined(CLD_VERSION) || CLD_VERSION==1
case 1:
return CompactLangDet::DetectLanguageVersion();
#endif
#if !defined(CLD_VERSION) || CLD_VERSION==2
case 2:
return CLD2::DetectLanguageVersion();
#endif
default:
NOTREACHED();
}
return "";
return CompactLangDet::DetectLanguageVersion();
}

} // namespace LanguageDetectionUtil
2 changes: 1 addition & 1 deletion third_party/cld/README.chromium
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Name: Compact Language Detection
Short Name: cld
URL: http://src.chromium.org/viewvc/chrome/trunk/src/third_party/cld/
Version: 0
Version: unknown
License: BSD
Security Critical: yes

Expand Down
Loading

0 comments on commit 6c2beac

Please sign in to comment.