407ff0f#diff-26611f6be759237464a03bb1328cbc16555888836b3504dc3703e2e25d2a3ca3
@ShadenSmith This commit prohibits the use of tuples when the tensor model parallel size is greater than 1 and pipeline parallelism is in use. Can you explain why tuples cannot be used? I think this constraint is too strong. With it in place, I can't do anything with this module.