This program was made with the help of this TensorFlow tutorial.
I used the MNIST dataset, the version that is built into the TensorFlow Keras API. The training set consists of 60,000 images and labels, while the test set consists of 10,000 images and labels. Each image is 28x28 pixels.
Since the Keras datasets are curated for ML practice, the only preprocessing necessary was to scale the pixel values from the integer range 0-255 down to floats between 0 and 1.
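That scaling step can be sketched in NumPy. The array below is synthetic stand-in data for illustration; in the real program the images come from `tf.keras.datasets.mnist.load_data()`:

```python
import numpy as np

# Synthetic stand-in for the MNIST training images: uint8 pixels in 0-255.
# (The real arrays come from tf.keras.datasets.mnist.load_data().)
train_images = np.random.randint(0, 256, size=(1000, 28, 28), dtype=np.uint8)

# Scale the 0-255 integer range down to floats in 0-1.
train_images = train_images.astype("float32") / 255.0

print(train_images.dtype)  # float32
```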
I used a multilayer perceptron with two hidden layers, built as a sequential model, the most common form of model used in neural networks. The first layer flattens each image from a two-dimensional 28x28 grid into a one-dimensional array of 784 pixels. The first hidden layer has 128 nodes and uses the rectified linear unit (ReLU) activation function. The next layer outputs a logit array with one entry per class in the dataset (the digits 0-9). The output layer applies softmax to convert the logits to probabilities.
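To make the layer shapes concrete, here is a NumPy sketch of a single forward pass through that architecture. The weights here are random, illustrative values, not the trained parameters, and the image is fake; it just shows how each layer transforms the data:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Rectified linear unit: zero out negative values.
    return np.maximum(0, x)

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(logits - logits.max())
    return e / e.sum()

# One fake 28x28 "image" with pixel values already scaled to 0-1.
image = rng.random((28, 28)).astype("float32")

# Flatten: 28x28 -> 784.
x = image.reshape(-1)

# Hidden layer: 784 -> 128 with ReLU (random illustrative weights).
W1 = rng.normal(0, 0.05, size=(784, 128))
b1 = np.zeros(128)
h = relu(x @ W1 + b1)

# Logit layer: 128 -> 10, one logit per digit class.
W2 = rng.normal(0, 0.05, size=(128, 10))
b2 = np.zeros(10)
logits = h @ W2 + b2

# Output: softmax turns the logits into class probabilities summing to 1.
probs = softmax(logits)
print(probs.shape)  # (10,)
```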
I used the adaptive moment estimation (Adam) optimizer, one of the most popular gradient descent optimization algorithms, and an accuracy metric, which calculates how often the predicted label equals the true label.
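The accuracy metric itself is simple: take the most probable class for each image and compare it to the label. A minimal NumPy version, using a tiny made-up batch of predictions rather than real model output, could look like:

```python
import numpy as np

def accuracy(probabilities, labels):
    """Fraction of examples where the most probable class equals the label."""
    predictions = probabilities.argmax(axis=1)
    return (predictions == labels).mean()

# Made-up batch: 4 examples, 3 classes (illustrative values only).
probs = np.array([
    [0.90, 0.05, 0.05],  # predicts class 0
    [0.10, 0.80, 0.10],  # predicts class 1
    [0.20, 0.20, 0.60],  # predicts class 2
    [0.50, 0.40, 0.10],  # predicts class 0
])
labels = np.array([0, 1, 2, 1])  # last prediction is wrong

print(accuracy(probs, labels))  # 0.75
```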
The current version of the program trains for 10 epochs, and in my runs so far the model has labeled every plotted test image correctly. Lowering the number of epochs to 1 makes the model less confident about most of the labels, and it gets a few wrong. Each run of the program looks a little different, and it is fun to experiment with changing the number of epochs.
The program plots the first few results from the test dataset. Here are the results: