v1.1.9
What's Changed
- Refactor attention mask and bias handling for efficiency by @LoserCheems in #177
- [BUG FIX] SM80 NaN in bias.grad when both mask and bias are enabled by @LoserCheems in #179
Full Changelog: v1.1.8...v1.1.9