Practical Tensorflow2 Guide: Hello World
3rd Mar 2019
This is a reproduction of the Hello World example from Google Codelabs.
Consider the following sets of numbers. Can you see the relationship between them?
X | -1 | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|---|
Y | -2 | 1 | 4 | 7 | 10 | 13 |
As you look at them you might notice that the X value is increasing by 1 as you read left to right, and the corresponding Y value is increasing by 3. So you probably think Y=3X plus or minus something. Then you'd probably look at the zero on X and see that Y = 1, and you'd come up with the relationship Y=3X+1.
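The by-hand reasoning above can be written as a few lines of plain Python. This is only a sketch of the manual approach, not of how a neural network learns; the names `steps`, `slope` and `intercept` are illustrative and do not appear in the original codelab.

```python
# The table from above, as plain Python lists.
xs = [-1, 0, 1, 2, 3, 4]
ys = [-2, 1, 4, 7, 10, 13]

# Step 1: how much does Y change each time X goes up by 1?
steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]
slope = steps[0]

# Step 2: the value of Y where X is 0 gives the constant term.
intercept = ys[xs.index(0)]

# Check that the rule Y = slope * X + intercept reproduces every pair.
fits = all(slope * x + intercept == y for x, y in zip(xs, ys))
print(steps, slope, intercept, fits)  # [3, 3, 3, 3, 3] 3 1 True
```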
That's almost exactly how you would use code to train a model to spot the patterns between these items of data!
Now let's look at the code to do it.
How would you train a neural network to do the equivalent task? Using data! By feeding it with a set of Xs and a set of Ys, it should be able to figure out the relationship between them.
This is obviously a very different paradigm than what you might be used to, so let's step through it piece by piece.
Imports
import tensorflow as tf
import numpy as np
from tensorflow import keras
Define and Compile the Neural Network
Next we will create the simplest possible neural network. It has 1 layer, and that layer has 1 neuron, and the input shape to it is just 1 value.
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
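To make "1 layer with 1 neuron and 1 input" concrete: a Dense layer with a single unit and no activation computes one weight times the input plus one bias. The sketch below mimics that in plain Python. The function `one_neuron` is a hypothetical stand-in rather than part of Keras, and the values w=3 and b=1 are simply the ones we hope training will find.

```python
def one_neuron(x, w, b):
    """Hypothetical stand-in for Dense(units=1) on a single input value."""
    return w * x + b

xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]

# With w=3 and b=1 the neuron reproduces the Y row of the table.
outputs = [one_neuron(x, 3.0, 1.0) for x in xs]
print(outputs)  # [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]
```

So the whole model has just two numbers to learn, which is why such a small network is enough here.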
Next we will write the code to compile our neural network. When we do so, we have to specify 2 functions, a loss and an optimizer.
If you've seen lots of math for machine learning, here's where it's usually used, but in this case it's nicely encapsulated in functions for you. But what happens here? Let's explain...
You know that in the function, the relationship between the numbers is y=3x+1.
When the computer is trying to 'learn' that, it makes a guess, maybe y=10x+10. The loss function compares the guessed answers against the known correct answers and measures how well or how badly it did.
Next, the model uses the optimizer function to make another guess. Based on the loss function's result, it will try to minimize the loss. At this point maybe it will come up with something like y=5x+5. While this is still pretty bad, it's closer to the correct result (i.e. the loss is lower).
The model will repeat this for the number of epochs you specify, which you will see shortly.
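To see what "the loss is lower" means in numbers, here is a small sketch that computes the mean squared error of the two example guesses against the six data points from the table. It uses plain Python rather than Keras, and the helper name `mean_squared_error` is our own.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]

def mean_squared_error(w, b):
    """Average squared difference between the guess w*x + b and the known y."""
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    return sum(e * e for e in errors) / len(errors)

loss_first_guess = mean_squared_error(10.0, 10.0)  # guess: y = 10x + 10
loss_second_guess = mean_squared_error(5.0, 5.0)   # guess: y = 5x + 5
loss_exact = mean_squared_error(3.0, 1.0)          # the rule behind the table

print(round(loss_first_guess, 2))   # 523.17
print(round(loss_second_guess, 2))  # 60.67
print(loss_exact)                   # 0.0
```

The second guess is still wrong, but its loss is much smaller, and that is the signal the optimizer follows.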
But first, here's how we tell it to use mean squared error for the loss and stochastic gradient descent (sgd) for the optimizer. You don't need to understand the math for these yet, but you can see that they work! :)
Over time you will learn the different and appropriate loss and optimizer functions for different scenarios.
model.compile(optimizer='sgd', loss='mean_squared_error')
Providing the Data
Next up we'll feed in some data. In this case we are taking the 6 xs and 6 ys that we used earlier. You can see that the relationship between these is that y=3x+1, so where x = -1, y = -2, and so on.
A Python library called NumPy provides lots of array-type data structures that are a de facto standard way of representing data like this. We declare that we want to use these by specifying the values as an array in NumPy using np.array().
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)
You've now written all of the code you need to define the neural network. The next step will be to train it to see if it can infer the patterns between these numbers and use those to create a model.
Training the Neural Network
The process of training the neural network, where it 'learns' the relationship between the Xs and Ys, is in the model.fit call. This is where it will go through the loop we spoke about before: making a guess, measuring how good or bad it is (aka the loss), using the optimizer to make another guess, etc. It will do it for the number of epochs you specify. When you run this code, you'll see the loss printed out for each epoch.
model.fit(xs, ys, epochs=500)
Train on 6 samples
Epoch 1/500
6/6 [==============================] - 0s 15ms/sample - loss: 47.2951
Epoch 2/500
6/6 [==============================] - 0s 1ms/sample - loss: 37.2109
Epoch 3/500
6/6 [==============================] - 0s 4ms/sample - loss: 29.2771
Epoch 4/500
6/6 [==============================] - 0s 1ms/sample - loss: 23.0351
Epoch 5/500
6/6 [==============================] - 0s 2ms/sample - loss: 18.1242
Epoch 6/500
6/6 [==============================] - 0s 1ms/sample - loss: 14.2605
Epoch 7/500
6/6 [==============================] - 0s 994us/sample - loss: 11.2207
..............
Epoch 493/500
6/6 [==============================] - 0s 840us/sample - loss: 2.6279e-07
Epoch 494/500
6/6 [==============================] - 0s 2ms/sample - loss: 2.5737e-07
Epoch 495/500
6/6 [==============================] - 0s 2ms/sample - loss: 2.5218e-07
Epoch 496/500
6/6 [==============================] - 0s 652us/sample - loss: 2.4700e-07
Epoch 497/500
6/6 [==============================] - 0s 836us/sample - loss: 2.4185e-07
Epoch 498/500
6/6 [==============================] - 0s 634us/sample - loss: 2.3691e-07
Epoch 499/500
6/6 [==============================] - 0s 3ms/sample - loss: 2.3203e-07
Epoch 500/500
6/6 [==============================] - 0s 4ms/sample - loss: 2.2733e-07
In the output above, you can see that for the first few epochs the loss value is quite large, but it's getting smaller with each step: from 47.2951 at epoch 1 to 11.2207 at epoch 7. As the training progresses, the loss soon gets very small. And by the time the training is done, the loss is extremely small (2.2733e-07 at epoch 500), showing that our model is doing a great job of inferring the relationship between the numbers.
You probably don't need all 500 epochs, and can experiment with different amounts, but as you can see from this example the loss is really small after only 50 epochs, so that might be enough!
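If you are curious what the loop of guessing, measuring and adjusting looks like, here is a hand-written sketch in plain Python. It is not what Keras runs internally. It assumes a starting guess of w=0 and b=0 (Keras picks its own starting values, so the loss numbers will differ from the log above) and a step size of 0.01, and it uses all six points in every step.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]
n = len(xs)

w, b = 0.0, 0.0        # assumed starting guess
learning_rate = 0.01   # assumed step size
epochs = 500
history = []           # loss recorded at the start of each epoch

for epoch in range(epochs):
    # Make a guess and measure how bad it is (mean squared error).
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    history.append(sum(e * e for e in errors) / n)

    # Slope of the loss with respect to w and b.
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n

    # Take a small step in the direction that lowers the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

for epoch in (1, 50, 500):
    print(f"Epoch {epoch}: loss {history[epoch - 1]:.3g}")
print(f"w = {w:.4f}, b = {b:.4f}")
```

Printing the loss at a few epochs is a quick way to judge how many epochs are really needed before the improvement levels off.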
Using the model
Ok, now you have a model that has been trained to learn the relationship between X and Y. You can use the model.predict method to have it figure out the Y for a previously unknown X. So, for example, if X = 10, what do you think Y will be? Take a guess before you run this code:
print(model.predict(np.array([10.0])))
You might have thought 31, right? But it ended up being a little over. Why do you think that is?
Neural networks deal with probabilities, so given the data that we fed the NN with, it calculated that there is a very high probability that the relationship between X and Y is Y=3X+1, but with only 6 data points we can't know for sure. Consequently, the prediction for 10 is very close to 31, but not necessarily exactly 31.
As you work with neural networks, you'll see this pattern recurring. You will almost always deal with probabilities, not certainties, and will do a little bit of coding to figure out what the result is based on the probabilities, particularly when it comes to classification.
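One concrete way to see why the answer lands near 31 rather than exactly on it: training stops after a finite number of steps, so the learned weight and bias are close to 3 and 1 but not equal to them. The sketch below is a plain-Python stand-in for the Keras model, with an assumed starting guess of zero and an assumed step size of 0.01; the helper `train` is hypothetical. It compares the prediction for X = 10 after 50 and after 500 epochs.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-2.0, 1.0, 4.0, 7.0, 10.0, 13.0]

def train(epochs, learning_rate=0.01):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # assumed starting guess
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

predictions = {}
for epochs in (50, 500):
    w, b = train(epochs)
    predictions[epochs] = w * 10.0 + b  # the model's answer for X = 10
    print(f"{epochs} epochs: prediction for 10 is {predictions[epochs]:.4f}")
```

More epochs bring the prediction closer to 31, but it only approaches that value and is not guaranteed to hit it.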
Next Steps
Believe it or not you've actually already covered most of the concepts in machine learning that you'll use in far more complex scenarios. In this lab you saw how to train a neural network to spot the relationship between two sets of numbers by defining the network. You defined a set of layers (in this case just 1) that contained neurons (also in this case, just 1), which you then compiled with a loss function and an optimizer.
This collection of a network, loss function, and optimizer handles the process of guessing the relationship between the numbers, measuring how good the guesses were, and then generating new parameters for new guesses. As you'll see in the other TensorFlow labs, this process is almost identical to what you'll use for far more complicated scenarios!
You can also learn more at TensorFlow.org