Skip to content

Conversation

@srl295
Copy link
Member

@srl295 srl295 commented Sep 10, 2019

Fixes: #19214

Contains tooling, and data for ICU 64

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

Historical note: yes, this is a reversal of nodejs/node-v0.x-archive#6371 (comment) (7 May 2014) for node v0.12. FYI @tjfontaine and @trevnorris !

@srl295 srl295 added wip Issues and PRs that are still a work in progress. i18n-api Issues and PRs related to the i18n implementation. labels Sep 10, 2019
@srl295 srl295 self-assigned this Sep 10, 2019
@nodejs-github-bot nodejs-github-bot added build Issues and PRs related to build files or the CI. doc Issues and PRs related to the documentations. tools Issues and PRs related to the tools directory. labels Sep 10, 2019
@srl295
Copy link
Member Author

srl295 commented Sep 10, 2019

please try it out

@srl295 srl295 requested a review from devsnek September 10, 2019 23:19
@gengjiawen
Copy link
Member

Also tried this, it works :). cc @mathiasbynens

const f1 = new Intl.NumberFormat('fr');
f1.format(12345678901234567890n);

const formatter = new Intl.NumberFormat('en', {
    style: 'unit',
    unit: 'kilobyte',
  });

const t = formatter.format(1.234);
console.log(t);

@gengjiawen
Copy link
Member

Should this labeled as semver-major ? Also maybe add Fix #19214 to description or commit message.

@srl295
Copy link
Member Author

srl295 commented Sep 11, 2019

@srl295 srl295 removed the wip Issues and PRs that are still a work in progress. label Sep 13, 2019
@srl295
Copy link
Member Author

srl295 commented Sep 13, 2019

should it really be semver minor or major? It doesn't break anything…

@srl295 srl295 marked this pull request as ready for review September 13, 2019 00:04
@srl295 srl295 changed the title Full icu default WIP Make full-icu the default Sep 13, 2019
@nodejs-github-bot
Copy link
Collaborator

@srl295
Copy link
Member Author

srl295 commented Sep 13, 2019

One may ask, "what about full-icu?"

I will tell you.

  1. Installing the npm module full-icu with this already-full node is harmless , and results in:

Note: Your node was not compiled with the ‘small-icu’ case, so the ICU data is not customizable through this script. Exitting.

The install returns successfully because postinstall exits with status 0.

  1. running a full-icu node with node --icu-data-dir=node_modules/full-icu …  is also harmless. IF the named directory path list (and it is a list, not a path) contains ICU data or override ICU data, that data is used. If not, it's ignored, because node has enough data to run.

@srl295
Copy link
Member Author

srl295 commented Sep 13, 2019

AIX failed,

virtual memory exhausted: Not enough space

I probably need to upgrade the genccode usage.. I'm guessing that configure --with-intl=full-icu --download-all was already broken.

/home/iojs/build/workspace/node-test-commit-aix/nodes/aix61-ppc64/out/Release/obj/gen/icudt64_dat.c

that's a huge file. Maybe 100M+ 79M of source code (assembly). okay… I can work on improving this here.

@srl295
Copy link
Member Author

srl295 commented Sep 13, 2019

@nodejs/build-infra is there a way we can get more virtual memory for the AIX build?

@srl295
Copy link
Member Author

srl295 commented Sep 13, 2019

Problem: generating a sharedlib or static data (.so/.a/.dll/.lib) from a 25M binary .dat file
This is not a new problem. This situation has been around for 5 years if you build node with --with-intl=full-icu. It's coming to the forefront because we're changing the default.

When I ported the tools, I didn't want to port ICU's dependence on makefiles and generated shell scripts, so I went for a minimal view: always generate an icudt_dat.c file which looks like this:

const struct {
    double bogus;
    uint8_t bytes[27531792];
} icudt64_dat={ 0.0, {
144,0,218,39,20,0,0,0,0,0,2,0,67,109,110,68,
1,0,0,0,3,0,0,0,32,67,111,112,121,114,105,103,
104,116,32,40,67,41,32,50,48,49,54,32,97,110,100,32,…
[zillions of lines omitted]

it's a little harsh on a C compiler. somehow.

Typically how ICU solved this is:

  • (archaic) we used to generate lots of small separate .c files with data and link them together. That format is deprecated.
  • on Windows, we have code that can generate a .OBJ file directly.
  • on many platforms, we generate assembly - an assembler seems to be 'lower overhead' and not use as much memory for such an operation. Even if there are a couple of passes, each pass is more straightforward and uses less temporary memory:
.globl _bar_dat
        .data
        .const
        .balign 16
_bar_dat:
.long 0x27DA0090,0x14,0x020000,0x446E6D43

The challenge is, there are a lot of assembly formats, even if you arbitrarily and unnecessarily restrict the problem area to only GCC. The assembler is more tied to the OS's loader, etc.

Sooo… 

  1. Can we get the memory upped on the builders? that 'kicks the can' down the road, but affects users. As I said, this isn't a new problem.
  2. I can look at using the assembly route— but it probably needs to be a very specific opt-in list for specific environments:
  • aix/gcc
  • linuxone/openpower
  • linux on gcc: If we turn on asm here (will help arm linux platforms, maybe smaller memory footprint etc.)
    Did I miss any problematic ones from the build?

@Trott
Copy link
Member

Trott commented Sep 17, 2019

@nodejs/build-infra is there a way we can get more virtual memory for the AIX build?

/pinging IBMers who are active in Build: @mhdawson @sam-github @gireeshpunathil

@sam-github
Copy link
Contributor

I'm in the process of trying to get an AIX7.2 machine working in CI. I think it has twice as much memory. Redoing the AIX6.1 machines, though, I'm not sure how to approach that. I'm asking around.

To temper expectations, more memory or releases/tests being done on AIX7.2 aren't going to happen very fast (sorry, just being honest, some things are slow).

@srl295
Copy link
Member Author

srl295 commented Sep 19, 2019

OK an update. This seemed to work on mac. PoC. Need to calculate the value of the -a option below from configure.py, and then to add a conditional in gyp to switch in/out the following.

probably "generated-file-suffix" (values: .S, .c, ..?) and '<@(_genccode_opts)' (containing the -a options etc) will work.

Docs on the -a option:

-a or --assembly Create assembly file. (possible values are: gcc, gcc-darwin, gcc-cygwin, gcc-mingw64, sun, sun-x86, xlc, aCC-ia64, aCC-parisc, masm)

so,

  • linux/gcc -> gcc,
  • mac -> gcc-darwin,
  • aix -> xlc? or maybe gcc? not sure.
diff --git a/tools/icu/icu-generic.gyp b/tools/icu/icu-generic.gyp
index b8f0d13836..96930d159c 100644
--- a/tools/icu/icu-generic.gyp
+++ b/tools/icu/icu-generic.gyp
@@ -272,7 +272,7 @@
           'conditions': [
             [ 'icu_small == "false"', {
               # full data - just build the full data file, then we are done.
-              'sources': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)_dat.c' ],
+              'sources': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)_dat.S' ],
               'dependencies': [ 'genccode#host', 'icupkg#host', 'icu_implementation#host', 'icu_uconfig' ],
               'include_dirs': [
                 '<(icu_path)/source/common',
@@ -302,10 +302,11 @@
                 {
                   'action_name': 'icudata',
                   'inputs': [ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major).dat' ],
-                  'outputs':[ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)_dat.c' ],
+                  'outputs':[ '<(SHARED_INTERMEDIATE_DIR)/icudt<(icu_ver_major)_dat.S' ],
                   'action': [ '<(PRODUCT_DIR)/genccode',
                               '-e', 'icudt<(icu_ver_major)',
                               '-d', '<(SHARED_INTERMEDIATE_DIR)',
+                              '-a', 'gcc-darwin',
                               '-f', 'icudt<(icu_ver_major)_dat',
                               '<@(_inputs)' ],
                 },

@nwoltman
Copy link
Contributor

should it really be semver minor or major? It doesn't break anything…

It should be semver major because it can change the result of function calls that rely on ICU. For example, this code:

const fmt = new Intl.NumberFormat('en-CA', {style: 'currency', currency: 'USD'});
console.log(fmt.format(10.32));

currently outputs:

$10.32

but with full-icu, the output will change to:

US$10.32

@Trott Trott closed this Oct 3, 2019
@srl295 srl295 deleted the full-icu-default branch October 3, 2019 23:12
@srl295 srl295 mentioned this pull request Oct 4, 2019
@vsemozhetbyt
Copy link
Contributor

vsemozhetbyt commented Oct 8, 2019

Should our last nightly builds contain full ICU already?

I've tried the last Windows build as of 2019.10.08:

  1. The file size of the node.exe seems unchanged vs 12.11.1 release.
  2. The test from our docs seems failing:
> process.versions
{
  node: '13.0.0-nightly20191008064e111515',
  v8: '7.8.279.14-node.13',
  uv: '1.32.0',
  zlib: '1.2.11',
  brotli: '1.0.7',
  ares: '1.15.0',
  modules: '79',
  nghttp2: '1.39.2',
  napi: '5',
  llhttp: '1.1.4',
  openssl: '1.1.1d',
  cldr: '35.1',
  icu: '64.2',
  tz: '2019a',
  unicode: '12.1'
}
> const january = new Date(9e8);
undefined
> const spanish = new Intl.DateTimeFormat('es', { month: 'long' });
undefined
> spanish.format(january);
'M01'
> spanish.format(january) === 'enero';
false

@srl295
Copy link
Member Author

srl295 commented Oct 8, 2019

Should our last nightly builds contain full ICU already?

I've tried the last Windows build as of 2019.10.08:

  1. The file size of the node.exe seems unchanged vs 12.11.1 release.

Seems too small. Can you send/point me to the build log?

@richardlau
Copy link
Member

I think I can see what the problem is. PR incoming.

@targos
Copy link
Member

targos commented Oct 8, 2019

The problem is that the release build is configured with --with-intl=small-icu

@richardlau
Copy link
Member

PR: #29887

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build Issues and PRs related to build files or the CI. doc Issues and PRs related to the documentations. i18n-api Issues and PRs related to the i18n implementation. notable-change PRs with changes that should be highlighted in changelogs. semver-major PRs that contain breaking changes and should be released in the next major version. tools Issues and PRs related to the tools directory.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Building with full-icu by default