
Commit 9a7e4f8

Some mixed precision edits.

1 parent 8279c0b

1 file changed

book/mixed-precision.md

Lines changed: 3 additions & 13 deletions
@@ -1,20 +1,10 @@
# Mixed Precision

-**TLDR**: the `torch.cuda.amp` mixed-precision training module forthcoming in PyTorch 1.6 delivers on its promise, delivering speed-ups of 50-60% in large model training jobs with just a handful of new lines of code.
-
-One of the most exciting additions expected to land in PyTorch 1.6, coming soon, is support for automatic mixed-precision training.
-
**Mixed-precision training** is a technique for substantially reducing neural net training time by performing as many operations as possible in half-precision floating point, `fp16`, instead of the (PyTorch default) single-precision floating point, `fp32`. Recent generations of NVIDIA GPUs come loaded with special-purpose tensor cores specially designed for fast `fp16` matrix operations.

-However, up until now these tensor cores have remained difficult to use, as it has required writing reduced precision operations into your model by hand. This is where the automatic in automatic mixed-precision training comes in. The `torch.cuda.amp` API allows you to implement mixed precision training into your training scripts in just five lines of code!
-
-This post is a developer-friendly introduction to mixed precision training. We will:
+PyTorch 1.6 added API support for mixed-precision training, including automatic mixed-precision training. Using these tensor cores once required writing reduced-precision operations into your model by hand. Today the `torch.cuda.amp` API can be used to implement automatic mixed precision training and reap the huge speedups it provides in as few as five lines of code!

-- Take a deep dive into mixed-precision training as a technique.
-- Introduce tensor cores: what they are and how they work.
-- Introduce the new PyTorch `amp` API.
-- Benchmark three different networks trained using `amp`.
-- Discuss which network archetypes will benefit the most from `amp`.
+**TLDR**: the `torch.cuda.amp` mixed-precision training module provides speed-ups of 50-60% in large model training jobs.

## How mixed precision works

@@ -203,4 +193,4 @@ Here is the impact that enabling mixed precision training has on the PyTorch mem

Interestingly enough, while both of the larger models saw benefit from the swap to mixed precision, UNet benefited from the swap a lot more than BERT did. PyTorch memory allocation behavior is pretty opaque to me, so I have no insight into why this might be the case.

-To learn more about mixed precision training directly from the source, see the [automatic mixed precision package](https://pytorch.org/docs/master/amp.html) and [automatic mixed precision examples](https://pytorch.org/docs/master/notes/amp_examples.html) pages in the PyTorch master docs.
+To learn more about mixed precision training directly from the source, see the [automatic mixed precision package](https://pytorch.org/docs/master/amp.html) and [automatic mixed precision examples](https://pytorch.org/docs/master/notes/amp_examples.html) pages in the PyTorch docs.
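For reference, the `torch.cuda.amp` pattern that the revised introduction above refers to looks roughly like the sketch below; the mixed-precision-specific additions are the `GradScaler`, the `autocast` context, and the three `scaler.*` calls. The tiny model, optimizer, and random batch are placeholders chosen only to keep the example self-contained; they are not the chapter's benchmark networks.

```python
import torch
import torch.nn as nn

# Minimal sketch of the torch.cuda.amp pattern (PyTorch 1.6+).
# Model, optimizer, and data are placeholders for illustration only.
device = "cuda"
model = nn.Linear(512, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

scaler = torch.cuda.amp.GradScaler()        # handles dynamic loss scaling

for step in range(10):
    inputs = torch.randn(64, 512, device=device)
    targets = torch.randint(0, 10, (64,), device=device)

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():         # forward pass runs in mixed precision
        loss = loss_fn(model(inputs), targets)
    scaler.scale(loss).backward()           # scale the loss so fp16 gradients don't underflow
    scaler.step(optimizer)                  # unscales gradients and steps (skipped on overflow)
    scaler.update()                         # adjusts the loss scale for the next iteration
```

Here `autocast` chooses `fp16` or `fp32` per operation during the forward pass, while `GradScaler` scales the loss so that small `fp16` gradients do not flush to zero.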
