Gradient Flow Visualization with Interactive Debugging
- 1. Overview
- 2. Main Features
- 3. Architecture
- 4. Workflow with Adapter
- 5. Usage Instructions
- 6. Future Development
- 7. Contribution & Maintenance
A tool for visually debugging gradient flows between convolutional layers of neural networks.
What is supported?
- Display Gradient Flow:
  - Specify a z value in the "Z:" textbox.
  - Uses a Sankey diagram to represent gradient flows between layers.
- Interactive UI for visual gradient debugging:
  - Uses heatmaps to display gradient values.
  - Highlights points that have gradients to a selected node in the convolutional layer (input).
  - Supports padding correction via the Offset Adjustments feature.
- Input-Editing Mode:
  - Users can click on individual heatmap cells to toggle values between 0 and z, which serves as a visual debugging tool for gradients.
- Save Input:
  - The Save button stores the modified input values in `flow_info.pkl`, which is later processed by Adapters to generate gradient flow information.
- Gradient-Debugging Workflow:
  - Users run an external neural network (with the Adapter attached) in the `DEBUG_FOLDER` to compute new gradient flows using the saved input.
  - Clicking the Render button reloads the updated gradient information, reflecting the newly computed gradients.
- State Preservation:
  - After debugging, the previously saved heatmap values remain visible, allowing further adjustments and iterative debugging.
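The highlighting feature can be pictured with a small NumPy sketch. It assumes a flow matrix whose rows are input-layer nodes and whose columns are output-layer nodes, and it assumes the heatmap is flattened in row-major order. The helper name `contributing_cells` and its `threshold` parameter are hypothetical and are not part of vnittest.

```python
import numpy as np


def contributing_cells(flow, out_node, in_shape, threshold=0.0):
    """Return (row, col) heatmap cells whose flow into `out_node` exceeds `threshold`."""
    flow = np.asarray(flow)
    column = np.abs(flow[:, out_node])            # flow from every input node to out_node
    flat_idx = np.flatnonzero(column > threshold)  # indices of contributing input nodes
    rows, cols = np.unravel_index(flat_idx, in_shape)
    return list(zip(rows.tolist(), cols.tolist()))


# Toy example: a 2x2 input layer feeding a single output node.
flow = np.array([[0.5], [0.0], [0.0], [0.25]], dtype=np.float32)
cells = contributing_cells(flow, out_node=0, in_shape=(2, 2))
print(cells)  # [(0, 0), (1, 1)]
```

Raising `threshold` narrows the highlighted set to the strongest contributors, which is roughly what a threshold control would do.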
- Gradient Extraction:
  - Computes gradients for each layer and stores them in a Python dict, which is eventually saved to a pickle file with the same name, `flow_info.pkl`, in the following format:
    {
        'activation_gradients': {
            'conv2': array([[15.879998, 15.899999],
                            [15.899999, 15.929999]], dtype=float32),
            'conv3': array([[4.]], dtype=float32),
            'conv1': array([[3.97, 3.98, 3.97, 3.97],
                            [3.98, 4.  , 3.98, 3.98],
                            [3.97, 3.98, 3.97, 3.97],
                            [3.97, 3.98, 3.97, 3.97]], dtype=float32)
        },
        'gradient_flows': {
            ('conv2', 'conv3'): array([[1.], [1.], [1.], [1.]], dtype=float32),
            ('conv1', 'conv2'): array([
                [1.  , 0.99, 0.99, 0.99],
                [1.  , 1.  , 0.99, 0.99],
                [0.99, 1.  , 0.99, 0.99],
                [0.99, 1.  , 0.99, 0.99],
                [1.  , 0.99, 1.  , 0.99],
                [1.  , 1.  , 1.  , 1.  ],
                [0.99, 1.  , 0.99, 1.  ],
                [0.99, 1.  , 0.99, 1.  ],
                [0.99, 0.99, 1.  , 0.99],
                [0.99, 0.99, 1.  , 1.  ],
                [0.99, 0.99, 0.99, 1.  ],
                [0.99, 0.99, 0.99, 1.  ],
                [0.99, 0.99, 1.  , 0.99],
                [0.99, 0.99, 1.  , 1.  ],
                [0.99, 0.99, 0.99, 1.  ],
                [0.99, 0.99, 0.99, 1.  ]
            ], dtype=float32)
        }
    }
  Accordingly:
  - `flow_info['activation_gradients']`: A Python `dict` object where each key ("conv1", "conv2", ...) is the name you give to a layer of the neural network and each value is a 2D tensor of the gradients at that layer (obtained from user-defined backward hooks or by accessing the `.grad` property of PyTorch tensors).
  - `flow_info['gradient_flows']`: A `dict` object where each key is a tuple of the names of the input and output layers, representing the set of gradient flows between node pairs that connect two consecutive layers. Each value is a 2D tensor whose first dimension is the number of nodes in the input layer and whose second dimension is the number of nodes in the output layer.
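A quick way to sanity-check a `flow_info.pkl` file against this format is sketched below. The helper `check_flow_info` is a hypothetical name introduced for illustration only. It assumes that the node count of a layer equals the number of elements in its gradient array, which matches the sample data above (a 4x4 `conv1` and a 2x2 `conv2` give a 16x4 flow matrix).

```python
import os
import pickle
import tempfile

import numpy as np


def check_flow_info(flow_info):
    """Verify that each flow matrix matches the node counts of its two layers."""
    grads = flow_info["activation_gradients"]
    for (src, dst), flow in flow_info["gradient_flows"].items():
        if flow is None:  # undefined flow; skip it
            continue
        expected = (grads[src].size, grads[dst].size)
        if flow.shape != expected:
            raise ValueError(f"{src}->{dst}: expected {expected}, got {flow.shape}")
    return True


# Toy data with the same shapes as the sample above.
flow_info = {
    "activation_gradients": {
        "conv1": np.full((4, 4), 3.97, dtype=np.float32),
        "conv2": np.full((2, 2), 15.9, dtype=np.float32),
    },
    "gradient_flows": {("conv1", "conv2"): np.ones((16, 4), dtype=np.float32)},
}

# Round-trip through a temporary pickle file.
path = os.path.join(tempfile.mkdtemp(), "flow_info.pkl")
with open(path, "wb") as f:
    pickle.dump(flow_info, f)
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(check_flow_info(loaded))  # True
```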
- Flow Matrix Computation:
  - Uses values from `gradient_flows`.
  - If a flow is undefined (set to `None`), it is interpolated using the available gradient data from `activation_gradients` to maintain consistency with chain rule calculations.
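The text does not spell out the interpolation rule, so the following is only one plausible sketch and not vnittest's actual method. It assumes each input node's gradient is split across output nodes in proportion to the magnitude of the output gradients, so that every row of the flow matrix sums back to the input node's gradient. The function name `interpolate_flow` and the `eps` parameter are hypothetical.

```python
import numpy as np


def interpolate_flow(grad_in, grad_out, eps=1e-12):
    """Fill an undefined flow matrix from the gradients of two adjacent layers."""
    g_in = np.asarray(grad_in, dtype=np.float64).ravel()
    g_out = np.abs(np.asarray(grad_out, dtype=np.float64)).ravel()
    weights = g_out / (g_out.sum() + eps)  # share attributed to each output node
    return np.outer(g_in, weights)         # shape: (n_in, n_out)


grad_in = np.array([[2.0, 4.0]])
grad_out = np.array([[1.0, 3.0]])
flow = interpolate_flow(grad_in, grad_out)
print(flow.shape)  # (2, 2)
```

With these toy values, the row sums of `flow` are close to the input gradients 2 and 4, which is the consistency property the sketch aims for.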
- Sankey Diagram:
  - Each node represents a pixel or neuron, and each link represents the gradient flow between layers.
  - Uses Plotly for rendering, with fixed properties such as node positions and colors, plus dynamic thresholds for gradient filtering.
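As a rough illustration of how a flow matrix could be turned into Sankey links with threshold filtering, consider the sketch below. The function `sankey_links` and its node-numbering scheme (output nodes numbered right after input nodes) are assumptions made for this example, not the tool's internals.

```python
import numpy as np


def sankey_links(flow, threshold=0.0, src_offset=0, dst_offset=None):
    """Convert an (n_in, n_out) flow matrix into source/target/value link lists."""
    flow = np.asarray(flow)
    n_in, _ = flow.shape
    if dst_offset is None:
        dst_offset = src_offset + n_in  # number output nodes after input nodes
    src, dst = np.nonzero(np.abs(flow) > threshold)  # keep only strong flows
    return {
        "source": (src + src_offset).tolist(),
        "target": (dst + dst_offset).tolist(),
        "value": np.abs(flow[src, dst]).tolist(),
    }


flow = np.array([[1.0, 0.2], [0.0, 0.8]])
links = sankey_links(flow, threshold=0.5)
print(links)  # {'source': [0, 1], 'target': [2, 3], 'value': [1.0, 0.8]}
```

The resulting dict has the keys that Plotly's `go.Sankey(link=...)` accepts, so it could be passed along with a `node=dict(label=[...])` argument to build a figure.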
- Heatmaps:
  - Display gradient values for selected layers.
  - In Input Editing mode, the heatmap allows direct interaction to edit the underlying input values.
- ipywidgets:
  - Provides interactive components such as dropdowns (for layer selection), sliders (for node selection and threshold adjustment), and buttons (for Save and Render).
- Event Handling:
  - Any change in the UI (layer, node, threshold, or heatmap edits) triggers an update of the Sankey diagram and heatmaps.
- Standardized Adapter Interface:
  - Users must implement an Adapter that loads the saved input and gradient information from `flow_info.pkl`, processes it with their neural network, and stores the computed gradient flow back into `flow_info.pkl`.
  - The Adapter keeps the debugging tool independent of any particular network architecture.
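The load, process, and store cycle described above can be outlined without any deep learning framework. The class names `FlowInfoAdapter` and `DoublingAdapter` below are hypothetical and exist only for this sketch; vnittest does not necessarily define such a base class, and a real Adapter would run a network instead of doubling values.

```python
import os
import pickle
import tempfile


class FlowInfoAdapter:
    """Hypothetical adapter skeleton: load flow_info, compute, store it back."""

    def __init__(self, path):
        self.path = path

    def load(self):
        with open(self.path, "rb") as f:
            return pickle.load(f)

    def compute(self, flow_info):
        """Run the network and return updated gradient data (override this)."""
        raise NotImplementedError

    def save(self, flow_info):
        with open(self.path, "wb") as f:
            pickle.dump(flow_info, f)

    def run(self):
        updated = self.compute(self.load())
        self.save(updated)
        return updated


class DoublingAdapter(FlowInfoAdapter):
    """Stand-in for a real network: doubles every stored gradient."""

    def compute(self, flow_info):
        grads = flow_info["activation_gradients"]
        flow_info["activation_gradients"] = {
            name: [[2 * v for v in row] for row in rows]
            for name, rows in grads.items()
        }
        return flow_info


# Write a tiny flow_info file, then run the adapter against it.
path = os.path.join(tempfile.mkdtemp(), "flow_info.pkl")
with open(path, "wb") as f:
    pickle.dump({"activation_gradients": {"conv1": [[1.0, 2.0]]}, "gradient_flows": {}}, f)

result = DoublingAdapter(path).run()
print(result["activation_gradients"]["conv1"])  # [[2.0, 4.0]]
```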
An example Adapter from the SMap project:

    import pickle

    import numpy as np
    import torch
    from torch import nn


    class TestBot_In(nn.Module):
        def __init__(self, module=None, offset_in=0, name="in", connet2name="out"):
            super(TestBot_In, self).__init__()

            def get_activation_grad(name, connet2name="out"):
                def hook(module, grad_inputs, grad_outputs):
                    if name is not None:
                        # Get grad_out: shape [N, C_out, H_out, W_out]
                        grad_out, grad_in = None, None
                        # Keep the last non-None input gradient
                        for grad in grad_inputs:
                            if grad is not None:
                                grad_in = grad
                        H_in, W_in = grad_in.shape[-2], grad_in.shape[-1]
                        grad_in = grad_in[:, :, offset_in, :, :].reshape(H_in, W_in)
                        self.testcase.activation_gradients[name] = grad_in.cpu().numpy()
                        # None marks the flow as undefined, so the tool interpolates it
                        self.testcase.gradient_flows[(name, connet2name)] = None
                        flow_info = {"activation_gradients": self.testcase.activation_gradients,
                                     "gradient_flows": self.testcase.gradient_flows}
                        # Persist the input representation and the gradient data
                        np.save(self.testcase.out_path + self.testcase.name + "_input_representation.npy",
                                self.input_representation.reshape(H_in, W_in))
                        with open(self.testcase.out_path + self.testcase.name + "_flow_info.pkl", 'wb') as f:
                            pickle.dump(flow_info, f)
                return hook

            self.testcase = None
            # NoneBot is not defined in this excerpt; it comes from the SMap project
            self.module = NoneBot()
            if module is not None:
                self.module = module
            self.name = name
            self.connet2name = connet2name
            self.input_representation = None
            # Note: recent PyTorch versions deprecate register_backward_hook
            # in favor of register_full_backward_hook
            self.module.register_backward_hook(get_activation_grad(self.name, self.connet2name))

        def forward(self, x, mask):
            h, w = mask.shape[-2], mask.shape[-1]
            # Remember the mask so the hook can save it alongside the gradients
            self.input_representation = mask.detach().cpu().numpy().reshape(h, w)
            return self.module(x)
Inside a unit test function:

    from tools.testing.vtest.vtest_types import *

    # Initialize testing environment and a SMap3x3 instance
    # ...
    # Construct a test case with built-in testbot Adapters
    testcase = TestCase(name=f"test_{test_type}_target", testbot_in=TestBot_In(), testbot_out=TestBot_Out(), testbot_target=TestBot_Target())
    # Wrap the input of the SMap3x3 instance with the testbot Adapter to hook gradient information
    input_repr_x = self.vtestcase.testbot_in(input_repr_x, input_mask)
    # Run the neural network instance to generate the data for the vnittest tool
    weights = smap3x3(input_repr_x, input_repr_y, input_repr_z, input_mask, target_repr, self.input_mask.shape).reshape(1, -1, self.input_mask.shape[0], self.input_mask.shape[1])
    # ...
- Edit and Save Input:
  Use the debug tool to modify the input heatmap and click Save to store the current input representation in the `%DEBUG_FOLDER%input_representation.npy` file (the destination can be changed with the "Save to:" textbox in the UI).
- External Neural Network Execution:
  In a separate notebook or process within the `DEBUG_FOLDER`, run your neural network with the Adapter attached, much like setting a traditional breakpoint (as shown in the vnittest files, defined below, in the `${{ github.workspace }}/tests` folder), so that it processes the saved input, computes updated gradient flows, and writes the new data to `flow_info.pkl`.
- Render Updated Data:
  In the debug tool, click the Render button to reload the updated gradient information from the `DEBUG_FOLDER`. The visualizations (Sankey Diagram and Heatmaps) will then refresh to display the new data.
- Install from PyPI:
  You can install vnittest directly from PyPI using pip:

      pip install vnittest

- Configure DEBUG_FOLDER:
  Set the `DEBUG_FOLDER` variable (e.g., `"../tests/test_data/test_"`) to point to your data directory.
- Prepare Data Files:
  Ensure that `%DEBUG_FOLDER%flow_info.pkl`, `%DEBUG_FOLDER%input_representation.npy`, and `%DEBUG_FOLDER%target_representation.npy` are located in the `DEBUG_FOLDER`.
- Launch the Debug Notebook:
  Windows:

      SET DEBUG_FOLDER="../tests/test_data/test_"
      vnittest %DEBUG_FOLDER%

  The UI displays:
  - Sankey Diagram
  - Control Widgets: Layer Dropdown, Node Slider, and Threshold Slider.
  - Heatmaps: For Conv1, Conv2, and Target.
  - Input Section: Input Heatmap along with Render and Save buttons.
- Modify the Input:
  Click on the Input Heatmap to toggle cell values (0 ↔ z, where z is the value in the "Z:" textbox). The Save button becomes enabled upon modification.
- Save Input Changes:
  Click the Save button to save the modified input representation to `flow_info.pkl`. (This action only updates the file; it does not trigger gradient computation.)
- Run the Neural Network Externally:
  In a separate notebook or process within the `DEBUG_FOLDER`, run your neural network (with the Adapter attached) so that it processes the saved input and writes updated gradient information to file.
- Render Updated Data:
  Click the Render button in the debug tool to reload the updated gradient information from the `DEBUG_FOLDER`. The Sankey Diagram and Heatmaps will refresh accordingly. After rendering, the Save button is disabled until new input modifications occur.
- Interact and Inspect:
  Use the control widgets to select different layers, nodes, and thresholds. The visualizations update automatically.
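The cell toggle from the Modify the Input step can be mimicked on a plain NumPy array, for example to prepare an input offline. This is a minimal sketch under the assumption that a cell flips between 0 and the chosen z value. The helper `toggle_cell` is a hypothetical name and is not part of the vnittest UI code.

```python
import numpy as np


def toggle_cell(grid, row, col, z=1.0):
    """Return a copy of `grid` with one cell flipped between 0 and z."""
    out = np.array(grid, dtype=np.float32, copy=True)
    out[row, col] = 0.0 if out[row, col] != 0 else z
    return out


grid = np.zeros((2, 2), dtype=np.float32)
once = toggle_cell(grid, 0, 1, z=1.0)   # first click sets the cell to z
twice = toggle_cell(once, 0, 1, z=1.0)  # second click resets it to 0
print(once.tolist())   # [[0.0, 1.0], [0.0, 0.0]]
print(twice.tolist())  # [[0.0, 0.0], [0.0, 0.0]]
```

Returning a copy rather than editing in place keeps the previously saved state available for comparison.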
- Extended Debugging Capabilities:
  - Integration of real-time gradient feedback during training.
- Enhanced Adapter Interface:
  - Further standardize the Adapter to support various network architectures seamlessly.
- Contribution:
  Contributions, feedback, and bug reports are welcome. Please submit pull requests or open issues on the repository.
- Maintenance:
  The modular design facilitates easy updates and extensions for future debugging needs.