Add file mapping for windows platform. #12183

Merged: 7 commits from ticao/winfilemap into master on Jul 18, 2022
Conversation

caoting-dotcom (Contributor)

Description: Add the MapFileIntoMemory function to windows/env.cc.

Motivation and Context
MapFileIntoMemory maps the file into memory rather than loading it eagerly; the file's pages are loaded only when they are actually used.
This can help reduce memory cost. We have tested it for Fluency model inference on Win32 and observed the memory cost reduction.

MapFileIntoMemory is already supported in posix/env.cc, but not on Windows. Our implementation in this PR is essentially the same as the one in posix/env.cc.
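For readers new to the technique, below is a minimal standalone sketch of read-only file mapping using the underlying Win32 primitives (CreateFileW, CreateFileMappingW, MapViewOfFile). It illustrates the general approach a windows/env.cc implementation builds on, not this PR's exact code; the file name is a placeholder.

```cpp
// Minimal sketch of read-only file mapping on Win32. Pages are faulted in
// lazily on first access, so untouched regions of the file cost no RAM.
#include <windows.h>
#include <cstdio>

int main() {
  HANDLE file = CreateFileW(L"model_weights.bin", GENERIC_READ, FILE_SHARE_READ,
                            nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
  if (file == INVALID_HANDLE_VALUE) return 1;

  HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY,
                                      /*max size hi/lo = 0 -> whole file*/ 0, 0, nullptr);
  if (mapping == nullptr) { CloseHandle(file); return 1; }

  // Note: a nonzero view offset must be a multiple of the system allocation
  // granularity (see GetSystemInfo), which is why a misaligned offset needs a
  // clear error message. Length 0 maps the entire file.
  const char* data = static_cast<const char*>(
      MapViewOfFile(mapping, FILE_MAP_READ, /*offset hi*/ 0, /*offset lo*/ 0, 0));
  if (data != nullptr) {
    std::printf("first byte: %d\n", data[0]);  // first access pages the data in
    UnmapViewOfFile(data);
  }
  CloseHandle(mapping);
  CloseHandle(file);
  return 0;
}
```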

@pranavsharma (Contributor)

Do you have a test case?

@caoting-dotcom (Contributor, Author)

> Do you have a test case?

Hi Pranav, I have added the unit test. Please check.

pranavsharma previously approved these changes Jul 15, 2022

@RandySheriffH (Contributor) left a comment

@caoting-dotcom:
Please fix the errors/warnings reported by the pipeline ASAP; we are working on the final round of cherry-picks for the release.

@RandySheriffH merged commit 4d38b84 into master on Jul 18, 2022
@RandySheriffH deleted the ticao/winfilemap branch on Jul 18, 2022 at 16:24
@chausner (Contributor)

Interesting, but I couldn't find any documentation on MapFileIntoMemory. Could you explain how one would leverage this function in practice? I didn't find any code in onnxruntime calling this function, so I assume users are supposed to call it themselves.

Can it be used to reduce the memory usage during loading of ONNX models?

RandySheriffH pushed a commit that referenced this pull request Jul 18, 2022
* Add file mapping for windows platform.

* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset

* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset

* Update data type to avoid warnings

* Compatible data type to avoid warnings. Update CreateFileMapping2 condition for winml compiling.

* Add type conversion to avoid warnings for X86 release build.

Co-authored-by: Ting Cao <ticao@microsoft.com>
@snnn (Member) commented Jul 18, 2022

> Interesting, but I couldn't find any documentation on MapFileIntoMemory. Could you explain how one would leverage this function in practice? I didn't find any code in onnxruntime calling this function, so I assume users are supposed to call it themselves.
>
> Can it be used to reduce the memory usage during loading of ONNX models?

See: https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/framework/tensorprotoutils.cc#L567

It might need more work. The original design was: if you had a large model, you could split the weights into an external file, then use the GetFileContent function to load the weights, leveraging memory mapping when possible. For example, if you have multiple processes running on the same machine with the same ML model, you may be able to reduce memory usage by keeping only one copy of the model in memory.
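To make the intended usage concrete, here is a hedged sketch of the pattern described above: prefer a memory-mapped view of the external-weights file and fall back to reading the bytes when mapping is unavailable. The helper names (TryMapFile, LoadWeights, ReadAllBytes) are illustrative, not onnxruntime's actual API.

```cpp
// Sketch of "map when possible, read otherwise". TryMapFile stands in for a
// platform call such as Env::MapFileIntoMemory; here it is stubbed out.
#include <cstddef>
#include <fstream>
#include <iterator>
#include <vector>

const char* TryMapFile(const char* path, size_t& length) {
  // A real implementation would use MapViewOfFile (Win32) or mmap (POSIX).
  // Returning nullptr simulates a platform without mapping support.
  (void)path; (void)length;
  return nullptr;
}

std::vector<char> ReadAllBytes(const char* path) {
  std::ifstream in(path, std::ios::binary);
  return std::vector<char>(std::istreambuf_iterator<char>(in), {});
}

// Returns a pointer to the weight bytes. When the file is mapped, multiple
// processes loading the same model share one physical copy via the page cache;
// otherwise each process keeps its own private buffer.
const char* LoadWeights(const char* path, std::vector<char>& fallback_storage) {
  size_t length = 0;
  if (const char* mapped = TryMapFile(path, length)) return mapped;
  fallback_storage = ReadAllBytes(path);
  return fallback_storage.data();
}

int main() {
  std::vector<char> storage;
  const char* weights = LoadWeights("weights.bin", storage);
  return weights == nullptr;  // zero exit when something was loaded
}
```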

RandySheriffH added a commit that referenced this pull request Jul 19, 2022
* support optimizer opt for deepspeed 0.5.9

* resolve comments

* resolve comments

* FP16_Optimizer Support for more Deepspeed Versions (#12046)

* fp16_optimizer for more ds versions

* change ds version

* bugfix

* fix bug

* Fix unused function warning for decodeMIDR(). (#12069)

Changed from static function defined in header to function declared in header and defined in separate .cc file.

* pin protobuf version to be compatible with onnx (#12132)

Co-authored-by: Ashwini Khade <askhade@microsoft.com@orttrainingdev10.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>

* RoiAlign CPU EP add warning for max mode with samples != 1 (#12136)

* RoiAlign add warning about incorrect max summation when sample size not 1

* include coreml_provider_factory.h in macos build instead of coreml_ex… (#12138)

include coreml_provider_factory.h in macos build instead of coreml_execution_provider.h

* List 3.10 as supported python version and remove 3.6 (#12141)

list 3.10 as supported python version and remove 3.6

Co-authored-by: Randy Shuai <rashuai@microsoft.com>

* Use updated symbolic_helper.check_training_mode (#11900)

Co-authored-by: Jingyan Wang, Baiju Meswani

* Fix GH issue 12151 by using inverse perms for updating DQ axis attribute (#12158)

* Fix GH issue 12151.

Need to use inverse perms for updating that axis to what is used for transposing the input. This only applies if the DQ node is doing per-axis dequantization.

* fixing positions for beam search gpt2 (#12156)

* fixing positions for beam search gpt2
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>

* remove wrong placed libs (#12201)

* Add file mapping for windows platform. (#12183)

* Add file mapping for windows platform.

* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset

* Add unit test for file mapping for windows. Also add an error message for mis-aligned offset

* Update data type to avoid warnings

* Compatible data type to avoid warnings. Update CreateFileMapping2 condition for winml compiling.

* Add type conversion to avoid warnings for X86 release build.

Co-authored-by: Ting Cao <ticao@microsoft.com>

* Fix bug where onnxruntime_USE_NCCL flag would default to ON (#12195)

Fix bug where onnxruntime_USE_NCCL flag would default to ON, causing ORT to not build properly. New functionality: flag is ON when training is enabled and NCCL is not disabled. Flag is OFF otherwise

Co-authored-by: zhijxu <zhijxu@microsoft.com>
Co-authored-by: zhijxu <zhijxu>
Co-authored-by: Vincent Wang <wangwchpku@outlook.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com@orttrainingdev10.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Carson Swope <carsonswope@users.noreply.github.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: jingyanwangms <47403504+jingyanwangms@users.noreply.github.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Viswanath Boga <44417868+viboga@users.noreply.github.com>
Co-authored-by: leqiao-1 <61653207+leqiao-1@users.noreply.github.com>
Co-authored-by: caoting-dotcom <71617901+caoting-dotcom@users.noreply.github.com>
Co-authored-by: Ting Cao <ticao@microsoft.com>
Co-authored-by: Sean Murray <59740888+seanmurr1@users.noreply.github.com>
@caoting-dotcom (Contributor, Author)

> Interesting, but I couldn't find any documentation on MapFileIntoMemory. Could you explain how one would leverage this function in practice? I didn't find any code in onnxruntime calling this function, so I assume users are supposed to call it themselves.
> Can it be used to reduce the memory usage during loading of ONNX models?

> See: https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/framework/tensorprotoutils.cc#L567
>
> It might need more work. The original design was: if you had a large model, you could split the weights into an external file, then use the GetFileContent function to load the weights, leveraging memory mapping when possible. For example, if you have multiple processes running on the same machine with the same ML model, you may be able to reduce memory usage by keeping only one copy of the model in memory.

Right, this is where the function gets called, and the purpose is to reduce memory cost. It was previously available only for POSIX; now it is implemented for Win32 as well.
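For completeness, a hedged sketch of calling the new Windows implementation through the Env interface. This assumes the MapFileIntoMemory declaration in core/platform/env.h around the time of this PR (path, offset, length, and a MappedMemoryPtr out-parameter); verify the exact types against your onnxruntime source tree, and note that weights.bin is a placeholder.

```cpp
// Sketch only: signature assumed from core/platform/env.h at the time of this PR.
#include "core/platform/env.h"

#include <cstdio>

int main() {
  onnxruntime::Env& env = onnxruntime::Env::Default();
  onnxruntime::MappedMemoryPtr mapped;  // unmaps the view when it goes out of scope

  // The offset must satisfy the platform's alignment rules (allocation
  // granularity on Windows); a misaligned offset triggers the error message
  // added in this PR. ORTCHAR_T is wchar_t on Windows, hence the L"" literal.
  auto status = env.MapFileIntoMemory(L"weights.bin", /*offset*/ 0,
                                      /*length*/ 4096, mapped);
  if (!status.IsOK()) {
    std::printf("mapping failed: %s\n", status.ErrorMessage().c_str());
    return 1;
  }
  std::printf("first mapped byte: %d\n", mapped[0]);
  return 0;
}
```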
