This is for experimental purposes, mostly to observe the math behind a neural network and how each part comes together to allow the model to "learn."
A follow-up will build a simple transformer from scratch in a similar way, though maybe with a little more help from libraries.