build: convert CLDR locale extraction from Gulp to Bazel tool (angular#42230)

Converts the CLDR locale extraction script to a Bazel tool.
This allows us to generate locale files within Bazel, so that
locales don't need to live as sources within the repo. It also
allows us to get rid of the legacy Gulp tooling.

The migration of the Gulp script to a Bazel tool involved the
following things:

  1. Basic conversion of the `extract.js` script to TypeScript.
     This was mostly about adding explicit types, e.g. adding
     `locale: string` or `localeData: CldrStatic`.

  2. Split-up into separate files. Instead of keeping the large
     `extract.js` file, the tool has been split into separate files.
     The logic remains the same, just that code is more readable and
     maintainable.

  3. Introduction of a new `index.ts` file that is the entry-point
     for the Bazel tool. Previously the Gulp tool just generated
     all locale files, the default locale and base currency files
     at once. The new entry-point accepts a mode to be passed as
     first process argument. Based on that argument, either locales
     are generated into a specified directory, or the default locale,
     base currencies or closure file is generated.

     This allows us to generate files with a Bazel genrule where
     we simply run the tool and specify the outputs. Note: It's
     necessary to have multiple modes because files live in separate
     locations, e.g. the default locale lives in `@angular/core`, but the
     rest in `@angular/common`.

  4. Removal of the `cldr-data-downloader` and custom CLDR resolution
     logic. Within Bazel we cannot run a downloader using the network.

     We switch this to something more Bazel idiomatic with better
     caching. For this a new repository rule is introduced that
     downloads the CLDR JSON repository and extracts it. Within
     that rule we determine the supported locales so that they
     can be used to pre-declare outputs (for the locales) within
     the Bazel analysis phase. This allows us to add the generated locale
     files to a `ts_library` (which we want to have for better testing,
     and consistent JS transpilation).

     Note that the removal of `cldr-data-downloader` also requires us to
     add logic for detecting locales without data. The CLDR data
     downloader overwrote the `availableLocales.json` file with a file
     that only lists locales that CLDR provides data for. We use the
     official `availableLocales` file CLDR provides, but filter out
     locales for which no data is available. This is needed until we
     update to CLDR 39 where data is available for all such locales
     listed in `availableLocales.json`.
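
The mode-based entry point described in item 3 can be pictured with the minimal TypeScript sketch below. The mode names and the `runTool` function are hypothetical and not taken from the commit, and the generators are replaced by descriptive strings so the example stays self-contained.

```typescript
// Hypothetical mode names; the commit message does not spell them out.
type Mode = 'locales' | 'base-locale' | 'base-currencies' | 'closure-locale';

const MODES: readonly Mode[] = ['locales', 'base-locale', 'base-currencies', 'closure-locale'];

function isMode(value: string | undefined): value is Mode {
  return value !== undefined && (MODES as readonly string[]).indexOf(value) !== -1;
}

/**
 * Dispatches on the first tool argument. Returns a description of the action
 * instead of generating files, so that the sketch has no dependencies.
 */
function runTool(args: string[]): string {
  const [mode, outputDir] = args;
  if (!isMode(mode)) {
    throw new Error(`Unknown mode: ${mode}`);
  }
  if (mode === 'locales') {
    // Only this mode writes many files, so only it needs an output directory.
    if (outputDir === undefined) {
      throw new Error('No output directory specified.');
    }
    return `write all locale files into ${outputDir}`;
  }
  // The single-file modes would print their generated file to stdout.
  return `print ${mode} file to stdout`;
}
```

In a real tool the call would presumably look like `runTool(process.argv.slice(2))`, with a genrule redirecting stdout into the declared output for the single-file modes.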

PR Close angular#42230
devversion authored and alxhub committed Jul 16, 2021
1 parent f2cd6de commit 7a3a453
Showing 31 changed files with 1,278 additions and 1,294 deletions.
3 changes: 0 additions & 3 deletions .gitignore
@@ -50,8 +50,5 @@ baseline.json
# Ignore .history for the xyz.local-history VSCode extension
.history

# CLDR data
tools/gulp-tasks/cldr/cldr-data/

# Husky
.husky/_
15 changes: 15 additions & 0 deletions WORKSPACE
@@ -47,3 +47,18 @@ web_test_repositories()
load("//dev-infra/bazel/browsers:browser_repositories.bzl", "browser_repositories")

browser_repositories()

load("//packages/common/locales/generate-locales-tool:cldr-data.bzl", "cldr_data_repository")

cldr_data_repository(
name = "cldr_data",
# Since we use the Github archives for CLDR 37, we need to specify a path
# to the available locales. This wouldn't be needed with CLDR 39 as that
# comes with an official JSON archive not containing a version suffix.
available_locales_path = "cldr-core-37.0.0/availableLocales.json",
urls = {
"https://github.com/unicode-cldr/cldr-core/archive/37.0.0.zip": "32b5c49c3874aa342b90412c207b42e7aefb2435295891fb714c34ce58b3c706",
"https://github.com/unicode-cldr/cldr-dates-full/archive/37.0.0.zip": "e1c410dd8ad7d75df4a5393efaf5d28f0d56c0fa126c5d66e171a3f21a988a1e",
"https://github.com/unicode-cldr/cldr-numbers-full/archive/37.0.0.zip": "a921b90cf7f436e63fbdd55880f96e39a203acd9e174b0ceafa20a02c242a12e",
},
)
3 changes: 0 additions & 3 deletions gulpfile.js
@@ -20,6 +20,3 @@ function loadTask(fileName, taskName) {

gulp.task('source-map-test', loadTask('source-map-test'));
gulp.task('changelog:zonejs', loadTask('changelog-zonejs'));
gulp.task('cldr:extract', loadTask('cldr', 'extract'));
gulp.task('cldr:download', loadTask('cldr', 'download'));
gulp.task('cldr:gen-closure-locale', loadTask('cldr', 'closure'));
7 changes: 3 additions & 4 deletions package.json
@@ -167,6 +167,7 @@
"@bazel/buildifier": "^4.0.1",
"@bazel/ibazel": "^0.15.8",
"@octokit/graphql": "^4.6.1",
"@types/cldrjs": "^0.4.22",
"@types/cli-progress": "^3.4.2",
"@types/conventional-commits-parser": "^3.0.1",
"@types/ejs": "^3.0.6",
@@ -176,8 +177,7 @@
"browserstacktunnel-wrapper": "^2.0.4",
"check-side-effects": "0.0.23",
"clang-format": "^1.4.0",
"cldr": "7.0.0",
"cldr-data-downloader": "^0.3.5",
"cldr": "5.7.0",
"cldrjs": "0.5.5",
"cli-progress": "^3.7.0",
"conventional-changelog": "^3.1.24",
@@ -220,6 +220,5 @@
"@babel/template": "7.8.6",
"@babel/traverse": "7.8.6",
"@babel/types": "7.8.6"
},
"cldr-data-coverage": "full"
}
}
17 changes: 17 additions & 0 deletions packages/common/locales/generate-locales-tool/BUILD.bazel
@@ -0,0 +1,17 @@
load("//tools:defaults.bzl", "ts_library")

package(default_visibility = ["//visibility:public"])

ts_library(
name = "generate-locales-tool",
srcs = glob(["*.ts"]),
deps = [
"@npm//@bazel/runfiles",
"@npm//@types/cldrjs",
"@npm//@types/glob",
"@npm//@types/node",
"@npm//cldr",
"@npm//cldrjs",
"@npm//glob",
],
)
@@ -0,0 +1,35 @@
/**
* @license
* Copyright Google LLC All Rights Reserved.
*
* Use of this source code is governed by an MIT-style license that can be
* found in the LICENSE file at https://angular.io/license
*/

/**
* To create smaller locale files, we remove duplicated data.
* To make this work we store the data in arrays, where `undefined` indicates that the
* value is a duplicate of the previous value in the array.
* e.g. consider an array like: [x, y, undefined, z, undefined, undefined]
* The first `undefined` is equivalent to y, the second and third are equivalent to z
* Note that the first value in an array is always defined.
*
* Also since we need to know which data is assumed similar, it is important that we store those
* similar data in arrays to mark the delimitation between values that have different meanings
* (e.g. months and days).
*
* For further size improvements, "undefined" values will be replaced by a constant in the arrays
* as the last step of the file generation (in generateLocale and generateLocaleExtra).
* e.g.: [x, y, undefined, z, undefined, undefined] will be [x, y, u, z, u, u]
*/
export function removeDuplicates(data: unknown[]) {
const dedup = [data[0]];
for (let i = 1; i < data.length; i++) {
if (JSON.stringify(data[i]) !== JSON.stringify(data[i - 1])) {
dedup.push(data[i]);
} else {
dedup.push(undefined);
}
}
return dedup;
}
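
As an illustration of the encoding described in the comment above, the following sketch runs `removeDuplicates` on a small array and restores the original values again. The function is copied (slightly condensed) so the example runs on its own; `expandDuplicates` is a hypothetical inverse that is not part of the commit.

```typescript
// Condensed copy of `removeDuplicates` from the diff above.
function removeDuplicates(data: unknown[]): unknown[] {
  const dedup = [data[0]];
  for (let i = 1; i < data.length; i++) {
    // A value equal to its predecessor is stored as `undefined`.
    dedup.push(JSON.stringify(data[i]) !== JSON.stringify(data[i - 1]) ? data[i] : undefined);
  }
  return dedup;
}

// Hypothetical inverse (not in the commit): expands `undefined` back to the
// closest preceding value.
function expandDuplicates(data: unknown[]): unknown[] {
  const expanded: unknown[] = [];
  for (let i = 0; i < data.length; i++) {
    expanded.push(data[i] === undefined && i > 0 ? expanded[i - 1] : data[i]);
  }
  return expanded;
}

const compact = removeDuplicates(['x', 'y', 'y', 'z', 'z', 'z']);
// compact is ['x', 'y', undefined, 'z', undefined, undefined]
const restored = expandDuplicates(compact);
// restored is ['x', 'y', 'y', 'z', 'z', 'z']
```

The round trip only works because the first element is always defined and because values with different meanings are kept in separate arrays, as the comment points out.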
33 changes: 33 additions & 0 deletions packages/common/locales/generate-locales-tool/bin/BUILD.bazel
@@ -0,0 +1,33 @@
load("@build_bazel_rules_nodejs//:index.bzl", "nodejs_binary")
load("//tools:defaults.bzl", "ts_library")

package(default_visibility = ["//visibility:public"])

BIN_ENTRYPOINTS = [
"get-base-currencies-file",
"get-base-locale-file",
"get-closure-locale-file",
"write-locale-files-to-dist",
]

ts_library(
name = "bin",
srcs = glob(["*.ts"]),
deps = [
"//packages/common/locales/generate-locales-tool",
"@npm//@types/node",
],
)

[nodejs_binary(
name = entrypoint,
data = [
":bin",
"@cldr_data//:all_json",
],
entry_point = ":%s.ts" % entrypoint,
# We need to patch the NodeJS module resolution as this binary runs as
# part of a genrule where the linker does not work as expected.
# See: https://github.com/bazelbuild/rules_nodejs/issues/2600.
templated_args = ["--bazel_patch_module_resolver"],
) for entrypoint in BIN_ENTRYPOINTS]
17 changes: 17 additions & 0 deletions packages/common/locales/generate-locales-tool/bin/base-locale.ts
@@ -0,0 +1,17 @@
/**
* @license
* Copyright Google LLC All Rights Reserved.
*
* Use of this source code is governed by an MIT-style license that can be
* found in the LICENSE file at https://angular.io/license
*/


/**
* Base locale used as foundation for other locales. For example: A base locale allows
* generation of a file containing all currencies with their corresponding symbols. If we
* generate other locales, they can override currency symbols which are different in the base
* locale. This means that we do not need to re-generate all currencies w/ symbols multiple times,
* and allows us to reduce the locale data payload as the base locale is always included.
* */
export const BASE_LOCALE = 'en';
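
The override idea from the comment above can be sketched as follows. This is a simplified illustration, not the commit's generator: `CurrencySymbols`, `currencyOverrides` and the sample values are assumptions made for the example, and the real tool works on CLDR data objects.

```typescript
type CurrencySymbols = {[code: string]: string};

/**
 * Returns only those entries of `locale` whose symbol differs from `base`.
 * Hypothetical helper used for illustration only.
 */
function currencyOverrides(base: CurrencySymbols, locale: CurrencySymbols): CurrencySymbols {
  const overrides: CurrencySymbols = {};
  for (const code of Object.keys(locale)) {
    if (locale[code] !== base[code]) {
      overrides[code] = locale[code];
    }
  }
  return overrides;
}

// Made-up sample data, not real CLDR values.
const baseSymbols: CurrencySymbols = {USD: '$', EUR: '€', JPY: '¥'};
const localeSymbols: CurrencySymbols = {USD: 'US$', EUR: '€', JPY: '¥'};
const overridesForLocale = currencyOverrides(baseSymbols, localeSymbols);
// overridesForLocale is {USD: 'US$'}
```

Because the base locale is always shipped, a locale file only has to carry the entries returned here, which is where the payload saving comes from.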
@@ -0,0 +1,23 @@
/**
* @license
* Copyright Google LLC All Rights Reserved.
*
* Use of this source code is governed by an MIT-style license that can be
* found in the LICENSE file at https://angular.io/license
*/
import {CldrData} from '../cldr-data';
import {generateBaseCurrenciesFile} from '../locale-base-currencies';

import {BASE_LOCALE} from './base-locale';

/** Generates the base currencies file and prints it to the stdout. */
function main() {
const cldrData = new CldrData();
const baseLocaleData = cldrData.getLocaleData(BASE_LOCALE)!;

process.stdout.write(generateBaseCurrenciesFile(baseLocaleData));
}

if (require.main === module) {
main();
}
@@ -0,0 +1,25 @@
/**
* @license
* Copyright Google LLC All Rights Reserved.
*
* Use of this source code is governed by an MIT-style license that can be
* found in the LICENSE file at https://angular.io/license
*/
import {CldrData} from '../cldr-data';
import {generateBaseCurrencies} from '../locale-base-currencies';
import {generateLocale} from '../locale-file';

import {BASE_LOCALE} from './base-locale';

/** Generates the base locale file and prints it to the stdout. */
function main() {
const cldrData = new CldrData();
const baseLocaleData = cldrData.getLocaleData(BASE_LOCALE)!;
const baseCurrencies = generateBaseCurrencies(baseLocaleData);

process.stdout.write(generateLocale(BASE_LOCALE, baseLocaleData, baseCurrencies));
}

if (require.main === module) {
main();
}
@@ -0,0 +1,25 @@
/**
* @license
* Copyright Google LLC All Rights Reserved.
*
* Use of this source code is governed by an MIT-style license that can be
* found in the LICENSE file at https://angular.io/license
*/
import {CldrData} from '../cldr-data';
import {generateClosureLocaleFile} from '../closure-locale-file';
import {generateBaseCurrencies} from '../locale-base-currencies';

import {BASE_LOCALE} from './base-locale';

/** Generates the Google3 closure-locale file and prints it to the stdout. */
function main() {
const cldrData = new CldrData();
const baseLocaleData = cldrData.getLocaleData(BASE_LOCALE)!;
const baseCurrencies = generateBaseCurrencies(baseLocaleData);

process.stdout.write(generateClosureLocaleFile(cldrData, baseCurrencies));
}

if (require.main === module) {
main();
}
@@ -0,0 +1,71 @@
/**
* @license
* Copyright Google LLC All Rights Reserved.
*
* Use of this source code is governed by an MIT-style license that can be
* found in the LICENSE file at https://angular.io/license
*/

import {writeFileSync} from 'fs';
import {join} from 'path';

import {CldrData} from '../cldr-data';
import {generateBaseCurrencies} from '../locale-base-currencies';
import {generateLocaleExtra} from '../locale-extra-file';
import {generateLocale} from '../locale-file';
import {generateLocaleGlobalFile} from '../locale-global-file';

import {BASE_LOCALE} from './base-locale';

/**
* Generates locale files for each available CLDR locale and writes them to the
* specified directory.
*/
function main(outputDir: string) {
const cldrData = new CldrData();
const baseLocaleData = cldrData.getLocaleData(BASE_LOCALE)!;
const baseCurrencies = generateBaseCurrencies(baseLocaleData);
const extraLocaleDir = join(outputDir, 'extra');
const globalLocaleDir = join(outputDir, 'global');

console.info(`Writing locales to: ${outputDir}`);

// Generate locale files for all locales we have data for.
cldrData.availableLocales.forEach((locale: string) => {
const localeData = cldrData.getLocaleData(locale);

// If `cldrjs` is unable to resolve a `bundle` for the current locale, then there is no data
// for this locale, and it should not be generated. This can happen with older versions of
// CLDR where `availableLocales.json` specifies locales for which no data is available
// (even within the `full` tier packages). See:
// http://cldr.unicode.org/development/development-process/design-proposals/json-packaging.
// TODO(devversion): Remove if we update to CLDR v39 where this seems fixed. Note that this
// worked before in the Gulp tooling without such a check because the `cldr-data-downloader`
// overwrote the `availableLocales` to only capture locales with data.
if (localeData && !(localeData.attributes as any).bundle) {
console.info(`Skipping generation of ${locale} as there is no data.`);
return;
}

const localeFile = generateLocale(locale, localeData, baseCurrencies);
const localeExtraFile = generateLocaleExtra(locale, localeData);
const localeGlobalFile = generateLocaleGlobalFile(locale, localeData, baseCurrencies);

writeFileSync(join(outputDir, `${locale}.ts`), localeFile);
writeFileSync(join(extraLocaleDir, `${locale}.ts`), localeExtraFile);
writeFileSync(join(globalLocaleDir, `${locale}.js`), localeGlobalFile);
});
}


if (require.main === module) {
// The first argument is expected to be a path resolving to a directory
// where all locales should be generated into.
const outputDir = process.argv[2];

if (outputDir === undefined) {
throw Error('No output directory specified.');
}

main(outputDir);
}
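
The output layout produced by this script, including the skipping of locales without data, can be summarised with the sketch below. `planLocaleOutputs` and the `hasData` callback are hypothetical; `hasData` stands in for the `bundle` check in the diff, and paths are built with plain string templates rather than `path.join` to keep the example dependency-free.

```typescript
interface LocaleOutputs {
  locale: string;
  localeFile: string;
  extraFile: string;
  globalFile: string;
}

/**
 * Lists the files that would be written for each locale that has data.
 * Locales for which `hasData` returns false are skipped entirely.
 */
function planLocaleOutputs(
    outputDir: string, availableLocales: string[],
    hasData: (locale: string) => boolean): LocaleOutputs[] {
  return availableLocales.filter(hasData).map(locale => ({
    locale,
    localeFile: `${outputDir}/${locale}.ts`,
    extraFile: `${outputDir}/extra/${locale}.ts`,
    globalFile: `${outputDir}/global/${locale}.js`,
  }));
}

// `xx` plays the role of a locale listed as available but shipped without data.
const plannedOutputs = planLocaleOutputs('out', ['en', 'de', 'xx'], locale => locale !== 'xx');
// plannedOutputs contains entries for 'en' and 'de' only
```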
48 changes: 48 additions & 0 deletions packages/common/locales/generate-locales-tool/cldr-data.bzl
@@ -0,0 +1,48 @@
def _cldr_data_repository_impl(ctx):
for url, sha256 in ctx.attr.urls.items():
ctx.report_progress("Downloading CLDR data from: %s" % url)
ctx.download_and_extract(
url = url,
sha256 = sha256,
)

ctx.report_progress("Extracting available locales from: %s" % ctx.attr.available_locales_path)
locales_json = ctx.read(ctx.attr.available_locales_path)
locales = json.decode(locales_json)["availableLocales"]["full"]
ctx.report_progress("Extracted %s locales from CLDR" % len(locales))

ctx.file("index.bzl", content = """
LOCALES=%s
""" % locales)

ctx.file("BUILD.bazel", content = """
filegroup(
name = "all_json",
srcs = glob(["**/*.json"]),
visibility = ["//visibility:public"],
)
""")

"""
Repository rule that downloads CLDR data from the specified repository and generates a
`BUILD.bazel` file that exposes all data files. Additionally, an `index.bzl` file is generated
that exposes a constant for all locales the repository contains data for. This can be used to
generate pre-declared outputs.
"""
cldr_data_repository = repository_rule(
implementation = _cldr_data_repository_impl,
attrs = {
"urls": attr.string_dict(doc = """
Dictionary of URLs that resolve to archives containing CLDR JSON data. These archives
will be downloaded and extracted at the root of the repository. Each URL key maps to
the SHA256 checksum of its archive, for hermetic builds.
""", mandatory = True),
"available_locales_path": attr.string(
doc = """
Relative path to the JSON data file describing all available locales.
This file usually resides within the `cldr-core` package
""",
default = "cldr-core/availableLocales.json",
),
},
)
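
For readers less familiar with Starlark, the locale extraction done by the repository rule can be mirrored in TypeScript as below. The JSON shape follows the `["availableLocales"]["full"]` lookup in the rule; the function names, the sample JSON and the `<locale>.ts` output naming are assumptions made for this sketch.

```typescript
// Shape read by the repository rule: `availableLocales.full` is a list of locale names.
interface AvailableLocalesFile {
  availableLocales: {full: string[]};
}

/** Extracts the `full` locale list from the text of an `availableLocales.json` file. */
function readFullLocales(jsonText: string): string[] {
  const parsed = JSON.parse(jsonText) as AvailableLocalesFile;
  return parsed.availableLocales.full;
}

/** Derives one output file name per locale (hypothetical naming scheme). */
function predeclaredOutputs(locales: string[]): string[] {
  return locales.map(locale => `${locale}.ts`);
}

// Made-up sample input, much shorter than the real CLDR file.
const sampleJson = '{"availableLocales": {"full": ["en", "de", "fr-CA"]}}';
const sampleOutputs = predeclaredOutputs(readFullLocales(sampleJson));
// sampleOutputs is ['en.ts', 'de.ts', 'fr-CA.ts']
```

Knowing the list up front is what lets the outputs be declared before any action runs, which the commit message gives as the reason for computing the locales inside the repository rule.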