
Invoking LightGBM via JEP behaves differently than invoking it outside of JEP (simple Python script) #205

Closed
ManuelMourato25 opened this issue Oct 9, 2019 · 11 comments

Comments

@ManuelMourato25

Describe the bug
I am trying to call a LightGBM model (https://github.com/microsoft/LightGBM) via JEP, in order to predict the scores of a couple of records.
However, if I execute the Booster.predict command via JEP, it returns a constant score of 0.04742587, for every distinct record passed.
The same does not happen if I invoke LGBM from a Python script.
Any ideas on what the issue might be?
Note: when using JEP to invoke different models, like xgboost, this issue does not happen.

To Reproduce

COMMON STEPS:

Extract the following model inside the zip into a file named m0_test.model:
m0_test.zip

Save the following script as classifier.py, in any folder you wish (replace <PATH_TO_MODEL> accordingly)

import lightgbm as lgb
import numpy as np
import pandas as pd
import random as rd


class Classifier:
    def __init__(self):
        # Necessary file paths
        model_file_name = '<PATH_TO_MODEL>/m0_test.model'

        # Load model.
        self.model = Classifier.get_model(model_file_name)

    def getClassDistribution(self, instances):
        score = self.get_transaction_score(instances)

        print('Score:'+str(score))

        return [np.array([score, 1 - score])]

    def get_transaction_score(self, instances):
        gbm=self.model
        # Get prediction.
        fraud_prediction = gbm.predict(instances, num_iteration=gbm.best_iteration)
        return fraud_prediction.flatten()

    @staticmethod
    def get_model(model_name):
        with open(model_name, 'r') as file:
            data = file.read()
            lgb_model = lgb.Booster(model_str=data)
            return lgb_model

JEP EXECUTION:

  1. Install JEP:
git clone https://github.com/ninia/jep.git
cd jep
echo "Installing JEP..."
git checkout v3.7.1
python setup.py build install
  2. Set the following environment variables (replace the variables inside <> accordingly):
LD_LIBRARY_PATH=<PATH_TO_JEP>/lib/python3.6/site-packages/jep
LD_PRELOAD=<PATH_TO_PYTHON_SO>/lib/libpython3.6m.so
JEP_LOCATION=<PATH_TO_JEP>/lib/python3.6/site-packages/jep
JEP_JAR=${JEP_LOCATION}/jep-3.7.0.jar
TO_PRELOAD=${LD_PRELOAD}
  3. Compile and run the following Java class (replace <PATH_TO_FOLDER_CONTAINING_CLASSIFIER.PY> accordingly):
package com.feedzai.tests;
import jep.Jep;

public class TestLightGBM {

    public static void main(String[] args) {

        try  {

            Jep jep = new Jep();
            jep.eval("import sys");
            jep.eval("from java.lang import System");
            jep.eval("sys.path.append(\"<PATH_TO_FOLDER_CONTAINING_CLASSIFIER.PY>\")");
            jep.eval("from classifier import Classifier");

            jep.eval("import numpy as np");

            jep.eval("record1 = np.array([[ 4.61935575 ,-5.18927169,  2.74834851,  1.0087401 ,  1.95090556, -3.33563201,"
                   + "1.  ,        2.     ,     2.     ,     2.        ]]) ");
            jep.eval("record2 = np.array([[2.30000000e+01, 2.60000000e+02, 1.00000000e+06, 2.29400000e+02,"
                    +"2.30209137e+07 ,1.09000000e+04 ,2.00000000e+00 ,2.00000000e+00,"
                    +"1.00000000e+00, 2.00000000e+00]])");
            jep.eval( "record3 = np.array([[2.20000000e+01, 2.37000000e+02 ,1.00000100e+06 ,3.50400000e+02,"
                   + "1.90109777e+07, 1.00320000e+04, 1.00000000e+00, 3.00000000e+00,"
                 +   "2.00000000e+00, 5.00000000e+00]])");

            jep.eval( "record4 = np.array([[9.80000000e+01, 2.57000000e+02, 1.00000200e+06, 3.33400000e+02,"
                    +"1.41323727e+07, 1.19990000e+04, 0.00000000e+00, 0.00000000e+00,"
                   +" 2.00000000e+00, 3.00000000e+00]])");
            jep.eval( "record5 = np.array([[1.30000000e+01, 3.17000000e+02, 1.00000300e+06, 6.99400000e+02,"
                   + "3.30892917e+07, 1.03560000e+04, 2.00000000e+00, 2.00000000e+00,"
                   +" 1.00000000e+00, 1.00000000e+00]])");


            jep.eval("test = Classifier()");
            jep.eval("test.getClassDistribution(record1)[0].item(0)");
            jep.eval("test.getClassDistribution(record2)[0].item(0)");
            jep.eval("test.getClassDistribution(record3)[0].item(0)");
            jep.eval("test.getClassDistribution(record4)[0].item(0)");
            jep.eval("test.getClassDistribution(record5)[0].item(0)");

        }
        catch(Exception e){
            System.out.println("Error");
            System.out.println(e);

        }
    }



}

Note that the returned score is constant, independent of the record being classified:

Finished loading model, total used 200 iterations
Score:[0.04742587]
0.04742587317756678
Score:[0.04742587]
0.04742587317756678
Score:[0.04742587]
0.04742587317756678
Score:[0.04742587]
0.04742587317756678
Score:[0.04742587]
0.04742587317756678

PYTHON EXECUTION:

  1. Open the Python console.
  2. Type the following code, equivalent to the example above (replace <PATH_TO_FOLDER_CONTAINING_CLASSIFIER.PY> accordingly):
import sys
sys.path.append("<PATH_TO_FOLDER_CONTAINING_CLASSIFIER.PY>") # ex: /opt/folder/
from classifier import Classifier
import numpy as np
record1 = np.array([[ 4.61935575 ,-5.18927169,  2.74834851,  1.0087401 ,  1.95090556, -3.33563201,
   1.  ,        2.     ,     2.     ,     2.        ]])
record2 = np.array([[2.30000000e+01, 2.60000000e+02, 1.00000000e+06, 2.29400000e+02,
  2.30209137e+07 ,1.09000000e+04 ,2.00000000e+00 ,2.00000000e+00,
  1.00000000e+00, 2.00000000e+00]])
record3 = np.array([[2.20000000e+01, 2.37000000e+02 ,1.00000100e+06 ,3.50400000e+02,
  1.90109777e+07, 1.00320000e+04, 1.00000000e+00, 3.00000000e+00,
  2.00000000e+00, 5.00000000e+00]])
record4 = np.array([[9.80000000e+01, 2.57000000e+02, 1.00000200e+06, 3.33400000e+02,
  1.41323727e+07, 1.19990000e+04, 0.00000000e+00, 0.00000000e+00,
  2.00000000e+00, 3.00000000e+00]])
record5 = np.array([[1.30000000e+01, 3.17000000e+02, 1.00000300e+06, 6.99400000e+02,
  3.30892917e+07, 1.03560000e+04, 2.00000000e+00, 2.00000000e+00,
  1.00000000e+00, 1.00000000e+00]])

test = Classifier()
print(test.getClassDistribution(record1)[0].item(0))
print(test.getClassDistribution(record2)[0].item(0))
print(test.getClassDistribution(record3)[0].item(0))
print(test.getClassDistribution(record4)[0].item(0))
print(test.getClassDistribution(record5)[0].item(0))
  3. The instances are now correctly scored:
>>> print(test.getClassDistribution(record1)[0].item(0))
Score:[1.07931287e-09]
1.0793128698291579e-09
>>> print(test.getClassDistribution(record2)[0].item(0))
Score:[1.25450336e-09]
1.2545033620809255e-09
>>> print(test.getClassDistribution(record3)[0].item(0))
Score:[1.1813487e-09]
1.1813487042059903e-09
>>> print(test.getClassDistribution(record4)[0].item(0))
Score:[1.1813487e-09]
1.1813487042058477e-09
>>> print(test.getClassDistribution(record5)[0].item(0))
Score:[9.23272512e-10]
9.232725116775672e-10

Environment (please complete the following information):

  • OS Platform, Distribution, and Version: Ubuntu 18.04
  • Python Distribution and Version: 3.6.9
  • Java Distribution and Version: 1.8.0
  • Jep Version: 3.7.1
  • Python packages used (e.g. numpy, pandas, tensorflow): numpy, lightgbm
@ManuelMourato25
Author

Linked with the following issue: microsoft/LightGBM#2500

@ndjensen
Member

ndjensen commented Oct 9, 2019

Is there a chance you could test with Jep 3.9 and the jep console on the command line and report the results? You can use a virtualenv to avoid disturbing your current install. The jep console is similar to the Python console, so you can reuse the code from the Python Execution section above in the jep console.

@bsteffensmeier
Member

Thanks for including a full repeatable test case. I downloaded your model, copied your code for classifier.py and TestLightGBM.java, and was able to run it successfully. However, I did not see the same output as you from Java: the scores I got from Java/Jep match what you got from Python. I suspect there must be something different in your environment that is causing your problems.

For reference, I am also on Ubuntu 18.04 and my Python is 3.6.7. I first tried jep 3.9.0 with OpenJDK 11, and when that didn't produce the error I tried jep 3.7.1 with OpenJDK 8. I just used pip to install lightgbm and got version 2.3.0. Initially I commented out the pandas import in classifier.py since I didn't have pandas, but even after installing pandas and adding the import back it still works for me.

Since I cannot repeat the problem, I can only speculate on what might be causing it. My best guess is that something on your Java classpath is interfering with the Python imports used by lightgbm. Jep looks at everything on your classpath when processing a Python import, and we have run into issues before where a Java package overrides a Python package with the same name and causes problems in Python. In my test the only two things on my classpath were the jep jar and the directory containing TestLightGBM.class. You didn't mention your exact classpath, so it is possible yours is more complicated; if you trim it down and that fixes the problem, it would help isolate the cause.

@ManuelMourato25
Author

Thank you for your answers.
That is very curious, @bsteffensmeier. In my environment I also have only the jep jar on my classpath.

This was the test I did with Jep version 3.7.1, in the Jep console:

(environment) manuel.mourato@ldaptop-2307:~/gitlab/jpmml-lightgbm$ env | grep CLASS
CLASSPATH=/home/manuel.mourato/.m2/repository/black/ninia/jep/3.7.0/jep-3.7.0.jar
(environment) manuel.mourato@ldaptop-2307:~/gitlab/jpmml-lightgbm$ 
(environment) manuel.mourato@ldaptop-2307:~/gitlab/jpmml-lightgbm$ 
(environment) manuel.mourato@ldaptop-2307:~/gitlab/jpmml-lightgbm$ jep
>>> import sys
>>> sys.path.append("/home/manuel.mourato/script/")
>>> from classifier import Classifier
>>> import numpy as np
>>> record2 = np.array([[2.30000000e+01, 2.60000000e+02, 1.00000000e+06, 2.29400000e+02,
...   2.30209137e+07 ,1.09000000e+04 ,2.00000000e+00 ,2.00000000e+00,
...   1.00000000e+00, 2.00000000e+00]])
... 
>>> test = Classifier()
Finished loading model, total used 200 iterations
>>> print(test.getClassDistribution(record2)[0].item(0))
Score:[0.04742587]
0.04742587317756678

@ManuelMourato25
Author

@bsteffensmeier, just to give you some feedback on this matter: I created a Docker container with Ubuntu 18.04 plus all the libs I listed above, and it **worked as expected inside the Docker container**.
I then packed the environment I was using in Docker with conda-pack (a virtualenv), activated that environment on my local machine, and got the error scenario (constant score).
At first I thought some env variable on my machine was causing the issue, but I tried the same approach on 3 different physical machines and got the same error scenario.
Then I thought it might be the conda-pack I was using, and that maybe it would work if I installed the libs in my local Ubuntu base environment, but I got the same error.

I am really at a loss with this, because I don't know what else to validate.
The only thing I can think of is that, on all the physical machines I tested, I have multiple Python versions installed; but I always specify to JEP which Python version to run, so that shouldn't be the issue.

Any ideas on this? @ndjensen and @bsteffensmeier ?

@bsteffensmeier
Member

I would just try validating all your assumptions about what is loaded. I think the multiple Python versions have potential for problems, but it looks like you have everything configured so that it shouldn't matter. I recommend going through your test in the jep console and validating that every package involved loads from where you think it should, something like this:

>>> import lightgbm as lgb
>>> lgb.__file__
'/home/ben/.local/lib/python3.6/site-packages/lightgbm/__init__.py'
>>> jep.__file__
'/home/ben/.local/lib/python3.6/site-packages/jep/__init__.py'
>>> np.__file__
'/home/ben/.local/lib/python3.6/site-packages/numpy/__init__.py'

You could also check the loaded libraries from another terminal using the process id, e.g. lsof -p 22904 | grep -E "libjep|lightgbm|multiarray". It might be worthwhile to compare every open library reported by lsof with your container and see whether any libraries load differently.
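The package-path checks above can also be scripted, so the same diagnostic can be pasted into both the jep console and a plain Python console and the output diffed. This is a sketch, not part of the original reproduction; the package list is an assumption based on the imports used in this issue.

```python
import importlib
import locale

# Print where each package resolves from, so the working and broken
# environments can be compared line by line.
for name in ("lightgbm", "numpy", "jep"):
    try:
        mod = importlib.import_module(name)
        print(name, "->", getattr(mod, "__file__", "<builtin module>"))
    except ImportError as exc:
        print(name, "-> not importable:", exc)

# The active numeric locale is also worth comparing between environments;
# setlocale with no second argument only queries the current setting.
print("LC_NUMERIC:", locale.setlocale(locale.LC_NUMERIC))
```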

I don't know anything about lightgbm so this might not make sense, but the fact you don't get errors but get the same answer every time makes me think there is something wrong with the model. Is there any chance it got truncated or somehow simplified? If you have any way to get more information about the model then you might try to investigate into it and see if that is different. I'm just thinking maybe the problem is more with the model loading rather than the actual model execution, although I don't know that the distinction gets you any closer to a solution.

@ManuelMourato25
Author

@bsteffensmeier, that last part about the model loading was a correct assumption!
I loaded the model via JEP, then saved that same model to a different file. Then, using plain Python without Jep, I loaded the model saved in the last step (which should be exactly the same as the initial one), and now it output constant scores in the Python execution as well!
Of course, as you say, this only changes the origin of the problem: it is not the model execution but the model loading that is the issue.
And it doesn't explain why it works in Docker but not in the local env.
But thank you very much anyway; it is a starting point at least.

@ManuelMourato25
Author

@bsteffensmeier, I found the problem.
Apparently Jep was interpreting numeric values differently when loading the model, because the following env variables were set to European Portuguese:

export LC_PAPER=pt_PT.UTF-8;
export LC_ADDRESS=pt_PT.UTF-8;
export LC_MONETARY=pt_PT.UTF-8;
export LC_NUMERIC=pt_PT.UTF-8;
export LC_TELEPHONE=pt_PT.UTF-8;
export LC_IDENTIFICATION=pt_PT.UTF-8;
export LANG=pt_PT.UTF-8;
export LC_MEASUREMENT=pt_PT.UTF-8;
export LC_CTYPE=pt_PT.UTF-8;
export LC_TIME=pt_PT.UTF-8;
export LC_NAME=pt_PT.UTF-8;

If I change the default value to C, or US, it works as expected.
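The failure mode can be sketched with Python's locale module. The premise (an assumption, consistent with the commits referenced later in this thread) is that the model parser used locale-sensitive number parsing, and pt_PT uses "," as the decimal separator, so thresholds written as "0.5" no longer round-trip:

```python
import locale

# Under the C locale, "0.5" parses as expected -- this is how the
# model file format is written.
locale.setlocale(locale.LC_NUMERIC, "C")
assert locale.atof("0.5") == 0.5

try:
    locale.setlocale(locale.LC_NUMERIC, "pt_PT.UTF-8")
except locale.Error:
    print("pt_PT.UTF-8 locale not installed; cannot demonstrate the broken case")
else:
    # Locale-aware parsing now expects a comma as the decimal separator,
    # so "0,5" is the string that yields 0.5, and "0.5" is misread.
    print("pt_PT atof('0,5') =", locale.atof("0,5"))
    locale.setlocale(locale.LC_NUMERIC, "C")  # restore
```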

I don't know if this qualifies as a JEP bug, but it certainly seems so.

Thank you for all your help.

@bsteffensmeier
Member

Excellent, thanks for getting back to us with the answer.

It looks like Jep is picking up your locale from your environment while your Python is ignoring it. It feels more correct to respect your environment. We generally try to match Python, though, so if we can figure out why Python is doing what it does, we could consider emulating it or providing some mechanism to override it. I suspect that Python itself is not ignoring the environment but that perhaps something in conda is resetting it. If you try the test with the system Python and the Portuguese locale, does it read the model correctly, or do you get constant results?

I'd also be curious what the locale module reports when it is working correctly. The documentation for that module mentions embedded applications, but it seems to be focused on disabling locale support, and I don't think that would help you.

@bsteffensmeier
Member

I tried it on my system, and even on the default system Python it works correctly when the environment is Portuguese; it turns out Python always uses the default 'C' locale regardless of your environment. Interestingly, if you actually activate a locale in your Python session then lightgbm will break, like this:

Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from classifier import Classifier
>>> import numpy as np
>>> record1 = np.array([[ 4.61935575 ,-5.18927169,  2.74834851,  1.0087401 ,  1.95090556, -3.33563201,
...    1.  ,        2.     ,     2.     ,     2.        ]])
>>> test = Classifier()
Finished loading model, total used 200 iterations
>>> print(test.getClassDistribution(record1)[0].item(0))
Score:[1.07931287e-09]
1.0793128698291579e-09
>>> import locale
>>> locale.resetlocale()
>>> locale.getlocale()
('pt_PT', 'UTF-8')
>>> test = Classifier()
Finished loading model, total used 200 iterations
>>> print(test.getClassDistribution(record1)[0].item(0))
Score:[0.04742587]
0.04742587317756678

This could be considered a bug in lightgbm; I would expect a file format to be portable and to work regardless of a user's locale settings.

You can also use the locale module to fix jep:

>>> from classifier import Classifier
>>> import numpy as np
>>> record1 = np.array([[ 4.61935575 ,-5.18927169,  2.74834851,  1.0087401 ,  1.95090556, -3.33563201,
...    1.  ,        2.     ,     2.     ,     2.        ]])
... 
>>> test = Classifier()
Finished loading model, total used 200 iterations
>>> print(test.getClassDistribution(record1)[0].item(0))
Score:[0.04742587]
0.04742587317756678
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'C')
'C'
>>> test = Classifier()
Finished loading model, total used 200 iterations
>>> print(test.getClassDistribution(record1)[0].item(0))
Score:[1.07931287e-09]
1.0793128698291579e-09

I am still unsure what, if anything, should change within jep. We could set the locale during initialization, but I've noticed slight differences in locale.getlocale() between the default Python and a manually configured jep. I also found PEP 587, which gives us the ability to set the locale with the exact code Python uses. We need to look closer at that PEP in general, since other options may be worth using or exposing to Java users, but I would prefer to wait until we drop support for every Python version before 3.8, so we don't have to figure out how to handle options without PEP 587.

I am hesitant to make any changes to the current behaviour because the locale settings affect the entire application and could have side effects in the JVM or other JNI libraries. It seems rude for an embedded library to be changing process-wide settings. It appears the reason Python isn't automatically making locale changes is that its developers feel that is bad behaviour for a library. From Python issue 24589:

I still think the current Python 3.7 behaviour makes the CPython runtime a badly behaved C library, since we may end up changing the active libc locale even when it isn't the main application, and a change hasn't been explicitly requested by the embedding app.

For now my recommendation would be that any application using jep that is sensitive to the locale should be calling locale.setlocale() which will ensure consistent behaviour in both Python and jep.
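That recommendation can be sketched in a few lines: pin the process locale from Python before any locale-sensitive library touches a model file. The Booster call in the comment is illustrative, matching the classifier.py in this issue:

```python
import locale

# Pin the numeric (and all other) locale categories to "C" so that
# model parsing behaves the same under jep and plain Python.
locale.setlocale(locale.LC_ALL, "C")
assert locale.localeconv()["decimal_point"] == "."

# ...now it is safe to load the model, e.g.:
# lgb.Booster(model_str=data)
```

From Java this could be issued right after creating the interpreter, e.g. jep.eval("import locale; locale.setlocale(locale.LC_ALL, 'C')").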

@AlbertoEAF

AlbertoEAF commented Mar 18, 2020

Hello @ManuelMourato25, @bsteffensmeier, the issue is fixed now: by design, locale no longer impacts the model read/write process in the core implementation.

Can you close this? :)

AlbertoEAF added a commit to feedzai/LightGBM that referenced this issue Sep 23, 2020
When Java is used, the default C++ locale is broken. This is true for
Java providers that use the C API or even Python models that require JEP.

This patch solves that issue by making the model reads/writes insensitive
to such settings.
To achieve it, within the model read/write codebase:
 - C++ streams are imbued with the classic locale
 - Calls to functions that are dependent on the locale are replaced
 - The default locale is not changed!

This approach means:
 - The user's locale is never tampered with, avoiding issues such as
    microsoft#2979 with the previous
    approach microsoft#2891
 - Datasets can still be read according to the user's locale
 - The model file has a single format independent of locale

Changes:
 - Add CommonC namespace which provides faster locale-independent versions of Common's methods
 - Model code makes conversions through CommonC
 - Cleanup unused Common methods
 - Performance improvements. Use fast libraries for locale-agnostic conversion:
   - value->string: https://github.com/fmtlib/fmt
   - string->double: https://github.com/lemire/fast_double_parser (10x
      faster double parsing according to their benchmark)

Bugfixes:
 - microsoft#2500
 - microsoft#2890
 - ninia/jep#205 (as it is related to LGBM as well)
AlbertoEAF added a commit to feedzai/LightGBM that referenced this issue Nov 7, 2020
AlbertoEAF added a commit to feedzai/LightGBM that referenced this issue Nov 9, 2020
AlbertoEAF added a commit to feedzai/LightGBM that referenced this issue Nov 16, 2020
AlbertoEAF added a commit to feedzai/LightGBM that referenced this issue Nov 21, 2020
AlbertoEAF added a commit to feedzai/LightGBM that referenced this issue Nov 24, 2020
guolinke pushed a commit to microsoft/LightGBM that referenced this issue Dec 8, 2020
* Fix LightGBM models locale sensitivity and improve R/W performance.


* Align CommonC namespace

* Add new external_libs/ to python setup

* Try fast_double_parser fix #1

Testing commit e09e5aad828bcb16bea7ed0ed8322e019112fdbe

If it works it should fix more LGBM builds

* CMake: Attempt to link fmt without explicit PUBLIC tag

* Exclude external_libs from linting

* Add exernal_libs to MANIFEST.in

* Set dynamic linking option for fmt.

* linting issues

* Try to fix lint includes

* Try to pass fPIC with static fmt lib

* Try CMake P_I_C option with fmt library

* [R-package] Add CMake support for R and CRAN

* Cleanup CMakeLists

* Try fmt hack to remove stdout

* Switch to header-only mode

* Add PRIVATE argument to target_link_libraries

* use fmt in header-only mode

* Remove CMakeLists comment

* Change OpenMP to PUBLIC linking in Mac

* Update fmt submodule to 7.1.2

* Use fmt in header-only-mode

* Remove fmt from CMakeLists.txt

* Upgrade fast_double_parser to v0.2.0

* Revert "Add PRIVATE argument to target_link_libraries"

This reverts commit 3dd45dd.

* Address James Lamb's comments

* Update R-package/.Rbuildignore

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Upgrade to fast_double_parser v0.3.0 - Solaris support

* Use legacy code only in Solaris

* Fix lint issues

* Fix comment

* Address StrikerRUS's comments (solaris ifdef).

* Change header guards

Co-authored-by: James Lamb <jaylamb20@gmail.com>