1 Question 3: image classification (40%)
In this question we will train a neural network to classify images of clothing. We will use Keras,
a deep learning toolkit that is built on TensorFlow (developed at Google). You can install the Keras
and TensorFlow libraries on your local machine. However, you may find it more convenient to use
Google’s Colab environment for this question, as all required libraries are pre-installed and it may
run faster than your computer. You can upload and download notebooks to and from Colab, but you
should always keep a recent version saved locally, for safety.
We will use the Fashion-MNIST dataset, which is available in Keras. A tutorial that explains the
dataset and the overall workflow of training an image classifier in Keras is available here:
I highly recommend that you go through this first to get a good background understanding for this
question.
[ ]: # TensorFlow and tf.keras
import tensorflow as tf
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
[ ]: # This makes figures that show how the training and testing accuracy and loss
# evolved against the number of epochs for the current training run
import matplotlib.pyplot as plt
plt.legend(("train accuracy", "test accuracy"))
plt.legend(("train loss", "test loss"))
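Only the two legend calls of the plotting cell are visible above. A minimal sketch of a complete helper is given below. It is an assumption, not the assignment's own code: the function name `plot_history` is hypothetical, and the dictionary keys assume the model was compiled with `metrics=["accuracy"]` and trained with validation data, so that `model.fit(...)` returns an object whose `history` attribute holds per-epoch values.

```python
import matplotlib.pyplot as plt


def plot_history(history_dict):
    """Plot train/test accuracy and loss against the epoch number.

    `history_dict` is assumed to map "accuracy", "val_accuracy", "loss"
    and "val_loss" to lists with one value per epoch.
    """
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))

    # Left panel: accuracy on the training and test data.
    ax_acc.plot(history_dict["accuracy"])
    ax_acc.plot(history_dict["val_accuracy"])
    ax_acc.set_xlabel("epoch")
    ax_acc.legend(("train accuracy", "test accuracy"))

    # Right panel: loss on the training and test data.
    ax_loss.plot(history_dict["loss"])
    ax_loss.plot(history_dict["val_loss"])
    ax_loss.set_xlabel("epoch")
    ax_loss.legend(("train loss", "test loss"))

    return fig
```

Under those assumptions it would be called as `plot_history(history.history)`, where `history` is the value returned by `model.fit(...)`.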
Load and pre-process the images so that all pixel values lie between 0 and 1.
[ ]: fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
train_images = train_images / 255.0
test_images = test_images / 255.0
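To see what the division by 255 does without downloading the dataset, the sketch below applies the same step to a synthetic array. The array `fake_images` is a made-up stand-in with the same layout as a small batch of Fashion-MNIST images (28x28 greyscale pixels stored as unsigned 8-bit integers); it is not real data.

```python
import numpy as np

# Synthetic stand-in for 5 greyscale images of 28x28 pixels,
# stored as unsigned 8-bit integers in the range 0..255.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# True division by a float promotes the integers to floating point,
# so every value ends up between 0.0 and 1.0.
scaled = fake_images / 255.0

print(fake_images.dtype, fake_images.min(), fake_images.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

The shape of the array is unchanged; only the data type and the value range differ.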
1.1 Training the network
In the cell below a lot is happening.
• learning_rate: This determines how quickly the network updates its weights in response to the
incoming gradients. If the weights change too slowly, the network may never reach the lowest loss value;
if they change too fast, you run into the danger of oscillating. Typical values range from very
small (1e-5) to 0.1.
• max_epochs: One epoch is one pass over the whole training set. Setting this number tells the
training algorithm how many passes to make over the whole data.
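The effect of learning_rate described above can be seen on a toy problem. The sketch below is not part of the assignment code: it minimises the one-dimensional function f(w) = w^2 with plain gradient descent, and the function name `gradient_descent`, the starting point, and the number of steps are all illustrative assumptions.

```python
def gradient_descent(learning_rate, max_steps=50, w0=1.0):
    """Minimise f(w) = w**2 by plain gradient descent.

    Returns the list of all visited values of w, starting with w0.
    """
    w = w0
    path = [w]
    for _ in range(max_steps):
        grad = 2.0 * w                  # derivative of w**2
        w = w - learning_rate * grad    # gradient descent update
        path.append(w)
    return path


for lr in (1e-5, 0.1, 0.95, 1.1):
    path = gradient_descent(lr)
    print(f"learning_rate={lr}: final w = {path[-1]:.4g}")
```

With this setup each update multiplies w by (1 - 2 * learning_rate). A rate of 1e-5 barely moves w away from its starting value, 0.1 shrinks it steadily towards the minimum at 0, 0.95 flips its sign on every step while shrinking, and 1.1 flips its sign while growing, so the iterates diverge.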
1.1.1 Network definition
Lines 4-7 define the architecture of the network.
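For orientation, here is a minimal sketch of a cell that matches the description in this section. It is a reconstruction under assumptions, not the assignment's own cell: the values of `learning_rate` and `max_epochs`, the `Flatten` step, and the choice of `binary_crossentropy` as the loss are all assumed, and the line numbers of this sketch do not match the line references in the notes that follow.

```python
import tensorflow as tf

learning_rate = 1e-3   # assumed value, within the typical range
max_epochs = 10        # assumed value; would be passed to model.fit as epochs

# Sequential model: data flows straight from input to output.
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(28, 28)))     # one 28x28 greyscale image
model.add(tf.keras.layers.Flatten())          # 28x28 -> vector of 784 values
model.add(tf.keras.layers.Dense(64, activation="sigmoid"))  # hidden layer
model.add(tf.keras.layers.Dense(1, activation="sigmoid"))   # single output

# compile ties the architecture to an optimizer and a loss function.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
    loss="binary_crossentropy",   # assumed loss; other losses can be swapped in
    metrics=["accuracy"],
)

model.summary()
```

Swapping the `loss` argument is the simplest way to compare different losses while keeping the rest of the setup fixed.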
• Line 4: We tell Keras that the model will be of the Sequential type, that is, data flows
from the input to the output and there are no forks or loops.
• Line 6: In Keras, Dense means a fully connected layer. To our model we add a Dense layer
with 64 neurons. Assuming our input is $x$, the output after the fully connected layer will be
of the form $y_1 = W_1 x$ (ignoring the bias term). Another important thing is the activation parameter, which we have
set to sigmoid. This is the non-linearity that is applied to the output of this layer,
that is, $y_{\sigma_1} = \sigma_1(W_1 x)$.
• Line 7: We additionally have another layer, which maps the output $y_{\sigma_1}$ to a single output,
with another sigmoid as the activation function. The output of this sigmoid is used to decide
whether the class is 0 or 1. (-1 or 1 in the case of Keras, but that conversion happens automatically and
we do not need to worry about it.)
• Line 9: An architecture alone is not enough for learning. We also need to specify a loss function
and an optimizer. For this assignment, we start with Adam (an efficient variant of
stochastic gradient descent) as the optimizer. However, we can choose different losses and
see their effect on how the network learns. All of this is brought together using compile in Keras.
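The computation described for Lines 6 and 7 can be written out by hand. The sketch below uses NumPy rather than Keras and follows the formulas in the text (no bias terms). The weights are random, the input is a synthetic vector standing in for one flattened 28x28 image, and the 0.5 decision threshold is an assumption, so the printed class carries no meaning; the point is only the shape and range of each intermediate result.

```python
import numpy as np


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

x = rng.random(784)                            # stand-in for one flattened image
W1 = rng.normal(scale=0.05, size=(64, 784))    # hidden layer weights (random)
W2 = rng.normal(scale=0.05, size=(1, 64))      # output layer weights (random)

y1 = W1 @ x                       # linear part of the Dense layer: y1 = W1 x
y_sigma1 = sigmoid(y1)            # after the activation: sigma1(W1 x)
output = sigmoid(W2 @ y_sigma1)   # second layer: a single value in (0, 1)

predicted_class = int(output[0] > 0.5)   # assumed threshold of 0.5

print(y1.shape, y_sigma1.shape, output.shape, predicted_class)
```

Training adjusts `W1` and `W2` so that `output` moves towards the correct label; the loss function measures how far it is from that label, and the optimizer decides how the weights are updated.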