Installation
Follow the instructions to install TensorFlow. I used pip to install it.
To verify it's installed, start a Python interpreter and import the package:
$ python
>>> import tensorflow
You should be able to import it successfully without seeing a ModuleNotFoundError.
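As a quick sanity check, you can also print the installed version (the exact number will vary with your installation):
import tensorflow as tf
print(tf.__version__)  # e.g. 2.x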
Get started
Follow this quickstart for beginners. The corresponding code is available in this Colab notebook. Google Colaboratory is a free notebook environment that requires no setup and runs entirely in the cloud.
To run the code in the Colab notebook, click "Connect", then "Runtime" > "Run all".
Alternatively, you can create a main.py with the following code and run python main.py.
import tensorflow as tf

# Load the MNIST dataset and normalize pixel values from [0, 255] to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build a simple feed-forward network
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Configure the model for training
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 epochs, then evaluate on the test set
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
Note that the first time I ran this code example locally, I got this error:
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)
As mentioned in this GitHub issue, one solution on macOS is to run the following command:
$ /Applications/Python\ 3.7/Install\ Certificates.command
On my machine, the output of the Python script was as follows:
$ python main.py
Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 3s 46us/sample - loss: 0.2918 - accuracy: 0.9150
Epoch 2/5
60000/60000 [==============================] - 2s 39us/sample - loss: 0.1413 - accuracy: 0.9581
Epoch 3/5
60000/60000 [==============================] - 2s 39us/sample - loss: 0.1053 - accuracy: 0.9674
Epoch 4/5
60000/60000 [==============================] - 2s 39us/sample - loss: 0.0872 - accuracy: 0.9732
Epoch 5/5
60000/60000 [==============================] - 2s 39us/sample - loss: 0.0753 - accuracy: 0.9763
10000/10000 [==============================] - 0s 29us/sample - loss: 0.0821 - accuracy: 0.9764
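Once training finishes, you can sanity-check the model on a single test image. Here is a minimal sketch, not part of the quickstart, that can be appended to main.py:
import numpy as np

# predict returns one row of 10 class probabilities per input image
probabilities = model.predict(x_test[:1])
predicted_digit = np.argmax(probabilities[0])
print(predicted_digit, y_test[0])  # the two values should usually agree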
MNIST dataset
The MNIST dataset is a database of handwritten digits which has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
You can view a data example (for instance, the first digit in the training set) with the following Python script:
import tensorflow as tf
import matplotlib.pyplot as plt

# Load the dataset and display the first training image (a handwritten 5)
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
plt.imshow(x_train[0])
plt.show()
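To see more of the data at once, a small sketch like the following (reusing the imports above) plots the first nine training digits with their labels:
# Plot the first 9 training digits in a 3x3 grid, with labels as titles
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(x_train[i], cmap='gray')
    plt.title(int(y_train[i]))
    plt.axis('off')
plt.show()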

Keras
tf.keras is TensorFlow's implementation of the Keras API specification.
Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.
tf.keras is a high-level API to build and train models that includes first-class support for TensorFlow-specific functionality, such as eager execution, tf.data pipelines, and Estimators. tf.keras makes TensorFlow easier to use without sacrificing flexibility and performance.
Explanations
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
The above code loads the training and test datasets from MNIST, then normalizes the pixel values from the range [0, 255] to [0, 1] by dividing by 255.
x_train is an array of handwritten digits, each of which is represented as a 28x28 2D array of numbers. For example, the first element in x_train is shown above as a handwritten 5.
y_train is an array of the corresponding labels. For example, its first element is 5.
x_test and y_test are similar but belong to the test dataset.
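A quick way to confirm these shapes and labels (the values in the comments are what the standard MNIST split should produce):
print(x_train.shape)  # (60000, 28, 28)
print(y_train.shape)  # (60000,)
print(x_test.shape)   # (10000, 28, 28)
print(y_train[0])     # 5, the label of the first training digit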
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
tf.keras.models.Sequential is simply a linear stack of layers. So the above code creates a neural network with 4 layers stacked sequentially.
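As a side note, the list form above is equivalent to adding the layers one at a time with model.add:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(10, activation='softmax'))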
The first layer is tf.keras.layers.Flatten. It flattens the input. In the example, each element in x_train has dimensions (28, 28). After the Flatten layer, the output has dimensions (784,) (since 28x28=784).
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28))
])
# None is the batch dimension; it stays flexible until data is fed in
print(x_train.shape, model.output_shape)
# The output is (60000, 28, 28) (None, 784)
The 2nd and 4th layers are tf.keras.layers.Dense. It is a regular densely-connected NN layer.
Dense implements the operation output = activation(dot(input, kernel) + bias), where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True).
Take the 2nd layer tf.keras.layers.Dense(128, activation='relu') as an example: it takes the output from the previous layer as input and outputs an array of shape (None, 128).
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu')
])
print(model.output_shape)
# The output is (None, 128)
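To see the Dense formula in action, here is a small sketch of my own (not from the quickstart) that reproduces a Dense layer's output with plain NumPy:
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dense(128, activation='relu')
x = tf.random.normal((1, 784))
y = layer(x)  # calling the layer builds its kernel and bias

kernel, bias = layer.get_weights()
# output = activation(dot(input, kernel) + bias), with relu as the activation
manual = np.maximum(np.dot(x.numpy(), kernel) + bias, 0)
print(np.allclose(y.numpy(), manual, atol=1e-5))  # True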
The 3rd layer is tf.keras.layers.Dropout.
Dropout is one of the most effective and most commonly used regularization techniques for neural networks, developed by Hinton and his students at the University of Toronto.
It consists of randomly setting a fraction rate of input units to 0 at each update during training time, which helps prevent overfitting.
Let's say a given layer would normally have returned a vector [0.2, 0.5, 1.3, 0.8, 1.1] for a given input sample during training; after applying dropout, this vector will have a few zero entries distributed at random, e.g. [0, 0.5, 1.3, 0, 1.1].
The "dropout rate" is the fraction of the features that are being zeroed out; it is usually set between 0.2 and 0.5. At test time, no units are dropped out. In the classic formulation, the layer's output values are instead scaled down at test time by a factor equal to the dropout rate, to balance for the fact that more units are active than at training time; Keras implements the equivalent "inverted" scheme, scaling the kept units up by 1/(1 - rate) during training so that the test-time pass needs no adjustment.
See more explanations about Dropout in the Overfit and underfit tutorial.
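A tiny sketch makes this behavior visible (the positions of the zeros are random, so your output will vary):
import tensorflow as tf

layer = tf.keras.layers.Dropout(0.2)
x = tf.ones((1, 10))
print(layer(x, training=True))   # some entries are 0; the rest are scaled up to 1.25 (= 1/(1-0.2))
print(layer(x, training=False))  # all ones: dropout is a no-op at inference time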
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.compile configures the model for training.
The optimizer and loss function are the two arguments required for compiling a Keras model.
An optimizer is used to improve the speed and quality of training a model (reference).
TensorFlow provides some built-in optimizers; see tf.keras.optimizers. In the example we used tf.keras.optimizers.Adam.
The loss function is used to determine how far the predicted values deviate from the actual values in the training data. If your predictions are off, your loss function will output a higher number.
TensorFlow provides some built-in loss functions; see tf.keras.losses. In the example we used tf.keras.losses.sparse_categorical_crossentropy.
A metric is a function that is used to judge the performance of your model. Metric functions are to be supplied in the metrics parameter when a model is compiled. See more in Keras | Metrics.
A metric function is similar to a loss function, except that the results from evaluating a metric are not used when training the model. You may use any of the loss functions as a metric function.
TensorFlow provides some built-in metrics; see tf.keras.metrics. In the example we passed 'accuracy', which Keras resolves to an accuracy metric that calculates how often predictions match labels.
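The string shortcuts above are just shorthand. Assuming the same model and tf import as before, a roughly equivalent, more explicit compile call looks like this:
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)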
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
model.fit trains the model for a fixed number of epochs (iterations on a dataset).
model.evaluate returns the loss value & metrics values for the model in test mode. The computation is done in batches.
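With the compile settings above, model.evaluate returns the loss followed by the accuracy, which you can unpack directly; a small sketch:
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}")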