
Commit 4460f6a

Merge pull request pytorch#385 from zasdfgbnm/jacobian-vec-product
Explain Jacobian-vector product in blitz/autograd_tutorial.py
2 parents f202843 + 6b50354 commit 4460f6a

1 file changed: +47 -4 lines changed


beginner_source/blitz/autograd_tutorial.py

Lines changed: 47 additions & 4 deletions
@@ -108,8 +108,48 @@
# :math:`\frac{\partial o}{\partial x_i}\bigr\rvert_{x_i=1} = \frac{9}{2} = 4.5`.

###############################################################
-# You can do many crazy things with autograd!
+# Mathematically, if you have a vector-valued function :math:`\vec{y}=f(\vec{x})`,
+# then the gradient of :math:`\vec{y}` with respect to :math:`\vec{x}`
+# is a Jacobian matrix:
+#
+# .. math::
+#   J=\left(\begin{array}{ccc}
+#   \frac{\partial y_{1}}{\partial x_{1}} & \cdots & \frac{\partial y_{m}}{\partial x_{1}}\\
+#   \vdots & \ddots & \vdots\\
+#   \frac{\partial y_{1}}{\partial x_{n}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}}
+#   \end{array}\right)
+#
+# Generally speaking, ``torch.autograd`` is an engine for computing
+# Jacobian-vector products. That is, given any vector
+# :math:`v=\left(\begin{array}{cccc} v_{1} & v_{2} & \cdots & v_{m}\end{array}\right)^{T}`,
+# it computes the product :math:`J\cdot v`. If :math:`v` happens to be
+# the gradient of a scalar function :math:`l=g\left(\vec{y}\right)`,
+# that is,
+# :math:`v=\left(\begin{array}{ccc}\frac{\partial l}{\partial y_{1}} & \cdots & \frac{\partial l}{\partial y_{m}}\end{array}\right)^{T}`,
+# then by the chain rule, the Jacobian-vector product is the
+# gradient of :math:`l` with respect to :math:`\vec{x}`:
+#
+# .. math::
+#   J\cdot v=\left(\begin{array}{ccc}
+#   \frac{\partial y_{1}}{\partial x_{1}} & \cdots & \frac{\partial y_{m}}{\partial x_{1}}\\
+#   \vdots & \ddots & \vdots\\
+#   \frac{\partial y_{1}}{\partial x_{n}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}}
+#   \end{array}\right)\left(\begin{array}{c}
+#   \frac{\partial l}{\partial y_{1}}\\
+#   \vdots\\
+#   \frac{\partial l}{\partial y_{m}}
+#   \end{array}\right)=\left(\begin{array}{c}
+#   \frac{\partial l}{\partial x_{1}}\\
+#   \vdots\\
+#   \frac{\partial l}{\partial x_{n}}
+#   \end{array}\right)
+#
+# This characteristic of the Jacobian-vector product makes it very
+# convenient to feed external gradients into a model that has
+# non-scalar output.

+###############################################################
+# Now let's take a look at an example of a Jacobian-vector product:

x = torch.randn(3, requires_grad=True)

@@ -120,9 +160,12 @@
print(y)

###############################################################
-#
-gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
-y.backward(gradients)
+# Now in this case ``y`` is no longer a scalar. ``torch.autograd``
+# cannot compute the full Jacobian directly, but if we just
+# want the Jacobian-vector product, simply pass the vector to
+# ``backward`` as an argument:
+v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
+y.backward(v)

print(x.grad)
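
As a quick numerical check of the chain-rule claim in the added text, here is a minimal sketch (not part of this commit): feeding v = dl/dy into ``y.backward()`` should reproduce dl/dx. The functions used for y and l are arbitrary choices for this example.

import torch

# Illustrative check: backward(v) with v = dl/dy reproduces dl/dx.
x = torch.randn(3, requires_grad=True)
y = x * 2 + 1                                   # vector-valued y = f(x)
l = (y ** 2).sum()                              # scalar l = g(y)

# dl/dx computed directly from the scalar l.
grad_direct, = torch.autograd.grad(l, x, retain_graph=True)

# v = dl/dy, then fed to backward() as the vector of the Jacobian-vector product.
v, = torch.autograd.grad(l, y, retain_graph=True)
y.backward(v)

print(torch.allclose(grad_direct, x.grad))      # expected: True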

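The added comment notes that ``torch.autograd`` cannot compute the full Jacobian directly. As a sketch (not part of this commit), the full Jacobian can still be assembled one Jacobian-vector product at a time by passing each standard basis vector to ``backward``; the function f below is an arbitrary example.

import torch

# Illustrative sketch: assemble the full Jacobian from repeated
# Jacobian-vector products with standard basis vectors.
def f(x):
    return x * 2 + x ** 2

x = torch.randn(3, requires_grad=True)

columns = []
for j in range(3):
    y = f(x)                         # rebuild the graph; it is freed after each backward()
    e_j = torch.zeros(3)
    e_j[j] = 1.0                     # j-th standard basis vector
    y.backward(e_j)                  # x.grad now holds d(y_j)/dx
    columns.append(x.grad.clone())
    x.grad.zero_()                   # clear the accumulated gradient before the next pass

# Column j holds d(y_j)/dx, matching the Jacobian layout used in the added text.
J = torch.stack(columns, dim=1)
print(J)                             # here a diagonal matrix with entries 2 + 2*x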