ONNX model for OneHotVectorizer produces different result #429

Repro:
```python
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing import OnnxRunner
from nimbusml.feature_extraction.categorical import OneHotVectorizer

# Load the sample dataset and normalize the column names.
infert_df = get_dataset("infert").as_df()
infert_df.columns = [i.replace(': ', '') for i in infert_df.columns]
infert_df.rename(columns={'case': 'Label'}, inplace=True)

# Output of the native transform.
transform = OneHotVectorizer() << 'education_str'
print(transform.fit_transform(infert_df))

# Output of the same transform after exporting to ONNX and running it back.
transform.export_to_onnx("test.onnx", 'com.microsoft.ml')
onnx_runner = OnnxRunner(model_file="test.onnx")
print(onnx_runner.fit_transform(infert_df))
```

Output:
     row_num  education   age  ...  education_str.0-5yrs  education_str.6-11yrs  education_str.12+ yrs
0          1        0.0  26.0  ...                   1.0                    0.0                    0.0
1          2        0.0  42.0  ...                   1.0                    0.0                    0.0
2          3        0.0  39.0  ...                   1.0                    0.0                    0.0
3          4        0.0  34.0  ...                   1.0                    0.0                    0.0
4          5        2.0  35.0  ...                   0.0                    1.0                    0.0
..       ...        ...   ...  ...                   ...                    ...                    ...
243      244        1.0  31.0  ...                   0.0                    0.0                    1.0
244      245        1.0  34.0  ...                   0.0                    0.0                    1.0
245      246        1.0  35.0  ...                   0.0                    0.0                    1.0
246      247        1.0  29.0  ...                   0.0                    0.0                    1.0
247      248        1.0  23.0  ...                   0.0                    0.0                    1.0

[248 rows x 12 columns]
     row_num  education   age  ...  education_str.onnx.0  education_str.onnx.1  education_str.onnx.2
0          1        0.0  26.0  ...                   0.0                   1.0                   0.0
1          2        0.0  42.0  ...                   0.0                   1.0                   0.0
2          3        0.0  39.0  ...                   0.0                   1.0                   0.0
3          4        0.0  34.0  ...                   0.0                   1.0                   0.0
4          5        2.0  35.0  ...                   0.0                   0.0                   1.0
..       ...        ...   ...  ...                   ...                   ...                   ...
243      244        1.0  31.0  ...                   0.0                   0.0                   0.0
244      245        1.0  34.0  ...                   0.0                   0.0                   0.0
245      246        1.0  35.0  ...                   0.0                   0.0                   0.0
246      247        1.0  29.0  ...                   0.0                   0.0                   0.0
247      248        1.0  23.0  ...                   0.0                   0.0                   0.0

[248 rows x 22 columns]
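
Because pandas truncates the printed frames, a direct comparison of the one-hot columns may be easier to read. The following is a minimal sketch using only pandas, not part of nimbusml: `onehot_block` and `mismatched_rows` are hypothetical helper names, and the column prefixes `education_str.` and `education_str.onnx.` are assumed from the output above. The toy frames reuse three of the printed rows (index 0, 4 and 243) so the snippet runs without nimbusml.

```python
import pandas as pd


def onehot_block(df, prefix):
    """Return the columns whose names start with `prefix`, in their existing order."""
    cols = [c for c in df.columns if c.startswith(prefix)]
    return df[cols]


def mismatched_rows(native_df, onnx_df, native_prefix, onnx_prefix):
    """Return the index labels of rows whose one-hot vectors differ by position."""
    native = onehot_block(native_df, native_prefix)
    onnx = onehot_block(onnx_df, onnx_prefix)
    if native.shape != onnx.shape:
        raise ValueError(f"shape mismatch: {native.shape} vs {onnx.shape}")
    # Compare position by position; the column names differ between the two frames.
    differs = (native.to_numpy() != onnx.to_numpy()).any(axis=1)
    return native_df.index[differs]


# Toy frames built from rows 0, 4 and 243 of the output above.
native = pd.DataFrame(
    {
        "education_str.0-5yrs": [1.0, 0.0, 0.0],
        "education_str.6-11yrs": [0.0, 1.0, 0.0],
        "education_str.12+ yrs": [0.0, 0.0, 1.0],
    },
    index=[0, 4, 243],
)
onnx = pd.DataFrame(
    {
        "education_str.onnx.0": [0.0, 0.0, 0.0],
        "education_str.onnx.1": [1.0, 0.0, 0.0],
        "education_str.onnx.2": [0.0, 1.0, 0.0],
    },
    index=[0, 4, 243],
)

print(mismatched_rows(native, onnx, "education_str.", "education_str.onnx.").tolist())
# → [0, 4, 243]
```

With the real data, the same call would take the two frames returned by `transform.fit_transform(infert_df)` and `onnx_runner.fit_transform(infert_df)`. In the rows shown above, every row differs between the two outputs.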
