The current CUDA graph wrapper creates a static input and a static output for each call to the model.
In the decoder, this can create a large number of tensors; we may want to limit these allocations and recycle the tensors where possible.
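
One possible way to recycle the static tensors is a small pool keyed by shape and dtype, so that repeated decoder calls with the same signature reuse an existing buffer instead of allocating a new one. The sketch below is only an illustration of that idea, not the wrapper's actual code: `StaticBufferPool`, `acquire`, `release`, and `fake_allocate` are hypothetical names, and the allocator is injected so the example runs without a GPU (in practice it could be something like `torch.empty(shape, dtype=dtype, device="cuda")`).

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Hashable, List, Tuple

Key = Tuple[Tuple[int, ...], Hashable]


class StaticBufferPool:
    """Hypothetical pool that recycles static buffers by (shape, dtype)."""

    def __init__(self, allocate: Callable[[Tuple[int, ...], Hashable], Any]):
        self._allocate = allocate  # assumed allocator, e.g. a torch.empty wrapper
        self._free: Dict[Key, List[Any]] = defaultdict(list)
        self.allocations = 0  # number of buffers actually created

    def acquire(self, shape, dtype):
        """Return a free buffer with this signature, allocating only if none exists."""
        key = (tuple(shape), dtype)
        free = self._free[key]
        if free:
            return free.pop()
        self.allocations += 1
        return self._allocate(key[0], dtype)

    def release(self, buffer, shape, dtype):
        """Hand a buffer back so a later call with the same signature can reuse it."""
        self._free[(tuple(shape), dtype)].append(buffer)


def fake_allocate(shape, dtype):
    # Stand-in for a real device allocation: one byte per element.
    size = 1
    for dim in shape:
        size *= dim
    return bytearray(size)


pool = StaticBufferPool(fake_allocate)
for _ in range(4):  # four simulated decoder steps with the same signature
    buf = pool.acquire((1, 8), "float16")
    pool.release(buf, (1, 8), "float16")
print(pool.allocations)  # 1
```

A captured CUDA graph replays on fixed memory addresses, so a scheme like this is presumably only safe if a buffer is released after its contents have been copied out and if graphs sharing a buffer never run concurrently.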