PyTorch experiments showing what happens when you remove ReLU from a deep network — loss curves, gradient collapse, depth sweep, decision boundaries, and activation comparison on MNIST.
Updated Feb 28, 2026 - Python
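The core observation behind the ReLU-removal experiment is that a stack of linear layers with no nonlinearity between them collapses into a single linear map, no matter how deep the stack is. A minimal numpy sketch of that collapse (numpy rather than PyTorch so it runs anywhere; the layer count, width, and weight scale here are illustrative, not the repo's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
depth, dim = 10, 8
# Random weight matrices standing in for a deep network's layers.
Ws = [rng.normal(scale=0.5, size=(dim, dim)) for _ in range(depth)]
x = rng.normal(size=dim)

def forward(x, Ws, act=None):
    """Apply each layer in turn, with an optional activation between layers."""
    h = x
    for W in Ws:
        h = W @ h
        if act is not None:
            h = act(h)
    return h

# Without an activation, the whole stack equals one precomputed matrix product:
P = np.linalg.multi_dot(Ws[::-1])      # W_L @ ... @ W_1
linear_out = forward(x, Ws)
print(np.allclose(linear_out, P @ x))  # the deep "network" is just P

# Inserting ReLU between layers breaks the collapse:
relu_out = forward(x, Ws, act=lambda h: np.maximum(h, 0.0))
print(np.allclose(relu_out, P @ x))
```

With small weight scales, the product `P` also shrinks the input and its gradient with depth, which is the gradient-collapse effect the repo's depth sweep visualizes.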
A comparative experiment between RNN and LSTM models to evaluate their ability to perform noise-robust sequence prediction. The project probes short-term vs. long-term memory by reconstructing clean input sequences from noisy data, showing how the LSTM outperforms the RNN when long-range dependencies are present.
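The denoising task described above can be set up as supervised sequence-to-sequence data: the input is a clean signal plus noise, and the target is the clean signal itself. A hypothetical data-generation sketch (the `make_denoising_batch` name, sine-wave signal, and noise level are assumptions for illustration, not the project's actual setup):

```python
import numpy as np

def make_denoising_batch(batch, seq_len, noise_std=0.3, seed=0):
    """Noisy sine sequences as inputs; the clean signal is the target."""
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0, 2 * np.pi, size=(batch, 1))
    t = np.linspace(0, 4 * np.pi, seq_len)
    clean = np.sin(t + phase)                       # (batch, seq_len)
    noisy = clean + rng.normal(scale=noise_std, size=clean.shape)
    # Add a trailing feature dimension, matching the (batch, seq, feature)
    # layout recurrent layers typically expect.
    return noisy[..., None], clean[..., None]

x, y = make_denoising_batch(32, 100)
print(x.shape, y.shape)  # (32, 100, 1) (32, 100, 1)
```

Either model can then be trained with MSE between its per-step output and `y`; the LSTM's gating lets it retain context over the full sequence, which is where the long-dependency advantage shows up.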