-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Addresses #4226 . Fixes problem when loading NormalizerTransformer from disk. #4321
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4321 +/- ##
=========================================
Coverage ? 74.66%
=========================================
Files ? 883
Lines ? 155225
Branches ? 16939
=========================================
Hits ? 115905
Misses ? 34574
Partials ? 4746
|
…h verWrittenCur: 0x00010001
// *** New Binary format *** | ||
// - int: number of dimensions | ||
// - ints: as many as dimensions, each one represent the size of each dimension | ||
// - byte: ItemKind of input column (only R4 and R8 are valid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
byte [](start = 23, length = 4)
If you read (and write) this byte first, then in the !isVector
case you can return itemType
without reading anything else. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done this. Please review if it was what you were expecting. Thanks! #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had 2 minor comments. Once those are addressed, the change looks good to me.
Can you please remove “WIP” from the title? #Resolved |
…sformers, neither by this test.
test/Microsoft.ML.OnnxTransformerTest/Microsoft.ML.OnnxTransformerTest.csproj
Show resolved
Hide resolved
@eerhardt / @antoniovs1029: Is this ready for merge? The CodeCoverage is failing; does that matter? |
I believe so.
No. |
|
||
var dimensions = new int[ndimensions]; | ||
for (int i = 0; i < ndimensions; i++) | ||
dimensions[i] = ctx.Reader.ReadInt32(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use ctx.Reader.ReadIntArray() function, its a util extension function that is designed for reading arrays.
var dims = vectorType.Dimensions; | ||
ctx.Writer.Write(dims.Length); | ||
for (int i = 0; i < dims.Length; i++) | ||
ctx.Writer.Write(dims[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use ctx.Writer.WriteIntArray() function, its a util extension function that is designed for writing arrays.
@justinormont @eerhardt It's not ready to be merged I have few comments that need to be addressed but it should be merged EOD. |
…mer from disk. (dotnet#4321) * Fixed problem when saving and loading normalizer transformers that work on multidimensional vectors * Added test of normalizer loaded from disk for a multidimensional vector * Added test for backward compatibility with normalizer transformer with verWrittenCur: 0x00010001 * Added test reproducing the original scenario * Changed visibility of classes made for test
…mer from disk. (dotnet#4321) * Fixed problem when saving and loading normalizer transformers that work on multidimensional vectors * Added test of normalizer loaded from disk for a multidimensional vector * Added test for backward compatibility with normalizer transformer with verWrittenCur: 0x00010001 * Added test reproducing the original scenario * Changed visibility of classes made for test
Addresses and actually solves issue #4226 by fixing the problem with loading a NormalizerTransformer from disk, where the NormalizerTransformer is supposed to work with multidimensional vectors as input.
Although this PR actually fixes that issue, another problem is described in one comment of the issue. This other problem is related to Resnet18, and is independent of the problem here fixed. Solving any of these two problems actually solve this issue, but perhaps the issue shall remain open until both problems are fixed.
The solution
I changed the way NormalizerTransformers are saved and loaded, so that it included the information of the dimensions of the vector it has as input. This changes maintain backward compatibility in that it is still possible to load models that were saved with previous binary format . But notice that the original problem described in the issue will persist for models that were saved before the changes made in this PR... it would be necessary to save the model with the updated save method in order to avoid that specific problem when working with multidimensional vectors.
I added a 3 test cases: one reproducing a very similar problem as in the original issue, another that tests a simplified version of the original problem, and another that checks that backward compatibility is still maintained.