
Enable PyTorch Models to Share Weights #4123

Merged 9 commits into main on Apr 1, 2022

Conversation

@dyastremsky (Contributor) commented Mar 28, 2022

This PR creates tests for an associated PyTorch backend change, which allows a Triton user to enable multiple instances of a model on the same device to share weights. This is turned off by default and can be enabled via a model config parameter, "ENABLE_WEIGHT_SHARING."
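For reference, a minimal config.pbtxt sketch of how the parameter could be set (the model name, batch size, and instance count below are illustrative, not taken from this PR):

```
name: "resnet50_libtorch"   # illustrative model name
backend: "pytorch"
max_batch_size: 8
instance_group [
  {
    # Two instances on the same GPU; with weight sharing enabled
    # they load a single copy of the model weights.
    count: 2
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]
parameters: {
  key: "ENABLE_WEIGHT_SHARING"
  value: {
    string_value: "true"
  }
}
```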

Enabling weight sharing can reduce the memory used for model loading and inference. It should not be used with models that maintain state, since the shared weights would be reused across instances.

Related backend code and documentation change: triton-inference-server/pytorch_backend#54

Review comments on qa/L0_libtorch_shared_weights/test.sh (resolved)
CoderHam previously approved these changes Mar 30, 2022
@dyastremsky dyastremsky merged commit b30700f into main Apr 1, 2022
@dyastremsky dyastremsky deleted the dyas-share-weights branch April 1, 2022 16:34
@joaopcm1996

Does weight sharing mean that multiple instances of a model will not truly run in parallel in their separate CUDA streams?

@dyastremsky (Contributor, Author)

@joaopcm1996 It should not have an impact. Since the models are already trained, we're just sharing the constant (read-only) values.
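To illustrate the point with a minimal PyTorch sketch (illustrative code, not part of the backend or this PR): two "instances" can share a single module's parameters and still launch forward passes on separate CUDA streams, because inference only reads the weights.

```python
import torch

# One copy of the weights, shared by both logical instances.
model = torch.jit.script(torch.nn.Linear(16, 4)).eval()

def run_instance(stream, x):
    # Each instance uses its own CUDA stream; the forward pass only
    # reads the shared parameters, so neither instance mutates them.
    with torch.cuda.stream(stream), torch.no_grad():
        return model(x)

if torch.cuda.is_available():
    model = model.cuda()
    s1, s2 = torch.cuda.Stream(), torch.cuda.Stream()
    x1 = torch.randn(8, 16, device="cuda")
    x2 = torch.randn(8, 16, device="cuda")
    y1, y2 = run_instance(s1, x1), run_instance(s2, x2)
    torch.cuda.synchronize()
```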
