Add file mapping for windows platform. #12183
Do you have a test case?
… for mis-aligned offset
Hi Pranav, I have added the unit test. Please check.
…ition for winml compiling.
@caoting-dotcom:
Please fix the errors/warnings reported by the pipeline ASAP; we are working on the final round of cherry-picks for the release.
Interesting, but I couldn't find any documentation on this. Can it be used to reduce memory usage while loading ONNX models?
* Add file mapping for windows platform.
* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset
* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset
* Update data type to avoid warnings
* Compitable data type to avoid warnings. Update CreatFileMapping2 condition for winml compiling.
* Add type conversion to avoid warnings for X86 release build.

Co-authored-by: Ting Cao <ticao@microsoft.com>
It might need more work. The original design was: if you have a large model, you can split the weights into an external file, then use the GetFileContent function to load the weights, leveraging memory mapping when possible. For example, if multiple processes on the same machine run the same ML model, you may be able to reduce memory usage by keeping only one copy of the model in memory.
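The "memory mapping when possible" idea above can be sketched roughly as follows. This is a minimal POSIX illustration, not the ONNX Runtime code: `FileContent`, `LoadFileContent`, and the `prefer_mmap` flag are hypothetical names introduced here, and the real GetFileContent may be structured differently.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical holder: the bytes plus whatever cleanup they need
// (munmap for a mapping, delete[] for a heap copy).
using Bytes = std::unique_ptr<const char, std::function<void(const char*)>>;

struct FileContent {
  Bytes data;
  size_t size = 0;
  bool mapped = false;  // true when the bytes come from a memory mapping
};

// Hypothetical loader: map the whole file read-only if possible,
// otherwise fall back to reading it into a heap buffer.
FileContent LoadFileContent(const std::string& path, bool prefer_mmap = true) {
  int fd = open(path.c_str(), O_RDONLY);
  if (fd < 0) throw std::runtime_error("open failed: " + path);

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size <= 0) {
    close(fd);
    throw std::runtime_error("cannot stat file, or file is empty: " + path);
  }
  const size_t size = static_cast<size_t>(st.st_size);

  FileContent out;
  out.size = size;

  if (prefer_mmap) {
    // A read-only mapping is backed by the OS page cache, so several
    // processes mapping the same file can share the same physical pages.
    void* p = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p != MAP_FAILED) {
      close(fd);  // the mapping remains valid after the descriptor is closed
      out.mapped = true;
      out.data = Bytes(static_cast<const char*>(p), [size](const char* q) {
        munmap(const_cast<char*>(q), size);
      });
      return out;
    }
  }

  // Fallback: copy the file into private heap memory.
  char* buf = new char[size];
  size_t done = 0;
  while (done < size) {
    ssize_t n = read(fd, buf + done, size - done);
    if (n <= 0) {
      delete[] buf;
      close(fd);
      throw std::runtime_error("read failed: " + path);
    }
    done += static_cast<size_t>(n);
  }
  close(fd);
  out.data = Bytes(buf, [](const char* q) { delete[] q; });
  return out;
}
```

Callers only see a pointer and a size, so the same code path works whether the bytes are mapped or copied; the deleter stored with the pointer takes care of the difference.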
* support optimizer opt for deepspeed 0.5.9
* resolve comments
* resolve comments
* FP16_Optimizer Support for more Deepspeed Versions (#12046)
* fp16_optimizer for more ds versions
* change ds version
* bugfix
* fix bug
* Fix unused function warning for decodeMIDR(). (#12069) Changed from static function defined in header to function declared in header and defined in separate .cc file.
* pin protobuf version to be compatible with onnx (#12132)
Co-authored-by: Ashwini Khade <askhade@microsoft.com@orttrainingdev10.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
* RoiAlign CPU EP add warning for max mode with samples != 1 (#12136)
* RoiAlign add warning about incorrect max summation when sample size not 1
* include coreml_provider_factory.h in macos build instead of coreml_ex… (#12138) include coreml_provider_factory.h in macos build instead of coreml_execution_provider.h
* List 3.10 as supported python version and remove 3.6 (#12141) list 3.10 as supported python version and remove 3.6
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* Use updated symbolic_helper.check_training_mode (#11900)
Co-authored-by: Jingyan Wang, Baiju Meswani
* Fix GH issue 12151 by using inverse perms for updating DQ axis attribute (#12158)
* Fix GH issue 12151. Need to use inverse perms for updating that axis to what is used for transposing the input. This only applies if the DQ node is doing per-axis dequantization.
* fixing positions for beam search gpt2 (#12156)
* fixing positions for beam search gpt2
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
* remove wrong placed libs (#12201)
* Add file mapping for windows platform. (#12183)
* Add file mapping for windows platform.
* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset
* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset
* Update data type to avoid warnings
* Compitable data type to avoid warnings. Update CreatFileMapping2 condition for winml compiling.
* Add type conversion to avoid warnings for X86 release build.
Co-authored-by: Ting Cao <ticao@microsoft.com>
* Fix bug where onnxruntime_USE_NCCL flag would default to ON (#12195) Fix bug where onnxruntime_USE_NCCL flag would default to ON, causing ORT to not build properly. New functionality: flag is ON when training is enabled and NCCL is not disabled. Flag is OFF otherwise

Co-authored-by: zhijxu <zhijxu@microsoft.com>
Co-authored-by: zhijxu <zhijxu>
Co-authored-by: Vincent Wang <wangwchpku@outlook.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com@orttrainingdev10.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Carson Swope <carsonswope@users.noreply.github.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: jingyanwangms <47403504+jingyanwangms@users.noreply.github.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Viswanath Boga <44417868+viboga@users.noreply.github.com>
Co-authored-by: leqiao-1 <61653207+leqiao-1@users.noreply.github.com>
Co-authored-by: caoting-dotcom <71617901+caoting-dotcom@users.noreply.github.com>
Co-authored-by: Ting Cao <ticao@microsoft.com>
Co-authored-by: Sean Murray <59740888+seanmurr1@users.noreply.github.com>
Right, this is where the function gets called, and the purpose is to reduce memory cost. Previously it was implemented only for POSIX; now it is implemented for Win32 as well.
Description: Add the MapFileIntoMemory function to windows/env.cc
Motivation and Context
MapFileIntoMemory maps the file into memory rather than reading it all up front. Parts of the file are loaded only when they are actually accessed.
This can help reduce memory cost. We tested it for Fluency model inference on Win32 and saw a reduction in memory cost.
MapFileIntoMemory was already supported in posix/env.cc, but not on Windows. The implementation in this PR is basically the same as the one in posix/env.cc.
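For readers unfamiliar with the technique, here is a minimal POSIX sketch of mapping a byte range of a file, including the offset alignment that mapping APIs require. It is an illustration under stated assumptions, not the code from posix/env.cc or windows/env.cc: `MappedRegion` is a hypothetical class, and rounding the offset down to a page boundary is one way to handle a misaligned offset (the commits in this PR instead mention reporting an error for it on Windows).

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around a read-only mapping of [offset, offset + length).
class MappedRegion {
 public:
  MappedRegion(const std::string& path, off_t offset, size_t length) {
    if (offset < 0 || length == 0) throw std::invalid_argument("bad offset or length");

    const long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    // mmap needs a page-aligned file offset: round down and remember the slack.
    const off_t slack = offset % page;
    const off_t aligned_offset = offset - slack;
    map_len_ = length + static_cast<size_t>(slack);

    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed: " + std::string(std::strerror(errno)));

    // Refuse ranges past the end of the file; touching such pages would fault.
    struct stat st;
    if (fstat(fd, &st) != 0 || offset + static_cast<off_t>(length) > st.st_size) {
      close(fd);
      throw std::out_of_range("requested range is outside the file");
    }

    // No file data is read here; pages are faulted in on first access.
    void* p = mmap(nullptr, map_len_, PROT_READ, MAP_PRIVATE, fd, aligned_offset);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + std::string(std::strerror(errno)));

    base_ = p;
    data_ = static_cast<const char*>(p) + slack;  // skip the alignment slack
    size_ = length;
  }

  ~MappedRegion() {
    if (base_ != nullptr) munmap(base_, map_len_);
  }

  MappedRegion(const MappedRegion&) = delete;
  MappedRegion& operator=(const MappedRegion&) = delete;

  const char* data() const { return data_; }
  size_t size() const { return size_; }

 private:
  void* base_ = nullptr;        // start of the mapping (page-aligned)
  size_t map_len_ = 0;          // mapped length, including the slack
  const char* data_ = nullptr;  // first byte the caller asked for
  size_t size_ = 0;             // number of bytes the caller asked for
};
```

On Windows the analogous constraint is that the view offset must be a multiple of the system allocation granularity, which is typically larger than the page size.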