Data
Dataset
- Create the dataset
- Preprocess the dataset
- Map features and labels onto the dataset
- Shuffle the dataset
- Batch the data
- Create an iterator for loading data
- Use the iterator to feed data, or inside an input_fn for an estimator (a sketch of these steps follows below)
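A minimal sketch of this pipeline with the tf.data API, assuming images and labels are in-memory numpy arrays (the normalization inside map is illustrative):
import tensorflow as tf
# Create the dataset from in-memory arrays
dataset = tf.data.Dataset.from_tensor_slices((images, labels))
# Preprocess and map features/labels into the dict form the model_fn below expects
dataset = dataset.map(lambda image, label: ({'image': tf.cast(image, tf.float32) / 255.0}, label))
# Shuffle the dataset and create batched data
dataset = dataset.shuffle(buffer_size=10000)
dataset = dataset.batch(32)
# Create an iterator; call get_next() to feed data, e.g. inside an input_fn for an estimator
iterator = dataset.make_one_shot_iterator()
features, labels = iterator.get_next()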
Model
You can build your model through two approaches:
- A high-level API like the estimator
- A computational graph constructed with the low-level API
High Level API: Estimator
Define Model_fn
There are two ways to define a custom model_fn for your estimator. One is through tf.layers, which is similar to a Keras model. The other is to manually create the computational graph yourself. Here we will focus on tf.layers, as it is relatively easy for beginners.
We will use an MNIST convolutional neural network model I created to help explain the procedure of building a model_fn. link
def model_fn(features, labels, mode):
    input_layer = tf.reshape(features['image'], [-1, 28, 28, 1])
    tf.summary.tensor_summary('inputs', input_layer)
    conv1 = tf.layers.conv2d(input_layer, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv1')
    pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2, name='Maxpool1')
    conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv2')
    pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2, name='Maxpool2')
    flatten = tf.layers.flatten(pool2, name='Flatten1')
    dense = tf.layers.dense(flatten, 1024, activation=tf.nn.relu, name='Dense1')
    dropout = tf.layers.dropout(dense, 0.4, training=mode == tf.estimator.ModeKeys.TRAIN, name='Dropout1')
    logits = tf.layers.dense(dropout, 10, name='Dense2')
    predictions = {
        'classes': tf.argmax(logits, 1),
        'probabilities': tf.nn.softmax(logits, name='Softmax_tensor')
    }
    eval_metrics_ops = {
        'accuracy': tf.metrics.accuracy(labels=labels, predictions=predictions['classes'])
    }
    # Storing all vars into TensorBoard
    for var in tf.trainable_variables():
        variable_summaries(var)
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    if mode == tf.estimator.ModeKeys.TRAIN:
        # Cast labels to match the int64 output of tf.argmax
        corrected_prediction = tf.equal(tf.argmax(logits, 1), tf.cast(labels, tf.int64))
        accuracy = tf.reduce_mean(tf.cast(corrected_prediction, tf.float32))
        tf.summary.scalar('accuracy', accuracy)
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
        train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
        merged = tf.summary.merge_all()
        summary_hook = tf.train.SummarySaverHook(save_steps=50, output_dir="tmp/mnist_conv", summary_op=[merged])
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op, training_hooks=[summary_hook])
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics_ops)
def model_fn(features, labels, mode):
    ...
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics_ops, train_op=train_op)
The model_fn is defined with three parameters: features, labels, and mode. features is a dict of input data keyed by feature name. labels holds the labels corresponding to features. mode takes one of three values, tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL, and tf.estimator.ModeKeys.PREDICT, denoting which behavior the estimator should run.
input_layer = tf.reshape(features['image'], [-1, 28, 28, 1])
The model_fn begins with an input layer similar to the one in Keras. However, there is no native support for defining an input layer, so you have to create it yourself. The -1 in the second parameter of tf.reshape() lets Tensorflow automatically compute the batch size of the input data.
conv1 = tf.layers.conv2d(input_layer, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv1')
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2, name='Maxpool1')
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv2')
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2, name='Maxpool2')
flatten = tf.layers.flatten(pool2, name='Flatten1')
dense = tf.layers.dense(flatten, 1024, activation=tf.nn.relu, name='Dense1')
dropout = tf.layers.dropout(dense, 0.4, training=mode == tf.estimator.ModeKeys.TRAIN, name='Dropout1')
logits = tf.layers.dense(dropout, 10, name='Dense2')
After the input layer, you add the remaining layers of your model_fn. Besides using the functional interfaces of the layers, you can also create layer instances and then call them to apply the layer operations, as sketched below. Please refer here for further details.
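For example, the object-oriented counterpart of the first convolution above would look roughly like this (a sketch; the names are illustrative):
# Build the layer object once, then call it on a tensor; calling it again reuses its weights
conv1_layer = tf.layers.Conv2D(filters=32, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv1')
conv1 = conv1_layer(input_layer)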
predictions = {
    'classes': tf.argmax(logits, 1),
    'probabilities': tf.nn.softmax(logits, name='Softmax_tensor')
}
eval_metrics_ops = {
    'accuracy': tf.metrics.accuracy(labels=labels, predictions=predictions['classes'])
}
# Storing all vars into TensorBoard
for var in tf.trainable_variables():
    variable_summaries(var)
if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
if mode == tf.estimator.ModeKeys.TRAIN:
    # Cast labels to match the int64 output of tf.argmax
    corrected_prediction = tf.equal(tf.argmax(logits, 1), tf.cast(labels, tf.int64))
    accuracy = tf.reduce_mean(tf.cast(corrected_prediction, tf.float32))
    tf.summary.scalar('accuracy', accuracy)
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
    train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
    merged = tf.summary.merge_all()
    summary_hook = tf.train.SummarySaverHook(save_steps=50, output_dir="tmp/mnist_conv", summary_op=[merged])
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op, training_hooks=[summary_hook])
return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics_ops)
The rest of the model_fn defines the evaluation metrics, the training operation, and the prediction operation.
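Once the model_fn is complete, you can plug it into an estimator and train it. A minimal sketch, assuming train_input_fn is an input_fn built from the Dataset pipeline described earlier:
# Hypothetical wiring; model_dir matches the summary output dir used above
mnist_classifier = tf.estimator.Estimator(model_fn=model_fn, model_dir='tmp/mnist_conv')
mnist_classifier.train(input_fn=train_input_fn, steps=2000)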
Using TensorBoard
TensorBoard is a very powerful tool for examining your Tensorflow model. To record your model parameters, simply attach summaries to every variable in tf.trainable_variables() after defining your layers, and create a summary_hook for the estimator to record the variables.
def variable_summaries(var):
    """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
    # Strip the ':' from the variable name, since ':' is invalid in a name scope
    with tf.name_scope(''.join(var.name.split(':'))):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)
# Storing all vars into TensorBoard
for var in tf.trainable_variables():
    variable_summaries(var)
# Merging all summaries and creating a summary hook for recording params
merged = tf.summary.merge_all()
summary_hook = tf.train.SummarySaverHook(save_steps=50, output_dir="tmp/mnist_conv", summary_op=[merged])
# Adding summary_hook to the estimator through training_hooks
return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op, training_hooks=[summary_hook])
After adding these to your model_fn, you can use tensorboard --logdir YOUR_OUTPUT_DIR to launch TensorBoard and view the results.
Low Level API
Terms
- Graph
  - The model representation
  - Not executed until some operation is called, at which point only the part of the graph required for that operation is computed (see the sketch after this list)
- Variables
  - Store values that can be updated by an optimizer
- Tensor
  - The basic element in Tensorflow
  - According to Tensorflow, a Tensor is like an edge of the graph
- Operation
  - The function in the graph to be called by Session.run
  - According to Tensorflow, an Operation is like a node of the graph
- Session
  - Like an engine that processes the graph
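A minimal sketch of these semantics (the values are arbitrary):
# Building the graph runs nothing; a Session executes only what an op requires
a = tf.constant(2)
b = tf.constant(3)
c = a + b                 # Just adds an addition node to the graph
d = tf.constant(100) * 2  # Unrelated node; not computed when running `c`
with tf.Session() as sess:
    print(sess.run(c))    # 5; only the subgraph needed for `c` is executed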
Graph Construction
If you want to use the low-level API (Graph, Variables, or Sessions) to build your Tensorflow model, you first need to create a Tensorflow graph.
Define Input
The first part of your graph should be the inputs of your model. To define them, you can use tf.placeholder to host the inputs. Note that you cannot modify or assign values to the Tensor returned by tf.placeholder.
# The `shape` param indicates the input shape of the placeholder
# You can use `None` in your shape config, meaning the placeholder will hold data of arbitrary length on that dimension
# For instance, [None, 32] indicates the placeholder accepts a matrix with any number of rows and the column size fixed at 32
# `dtype` in Tensorflow indicates the type of the data
# Typical types include `tf.int32`, `tf.string`, ...
input = tf.placeholder(shape=[Batch_size, Input_dim], dtype=tf.float32)
labels = tf.placeholder(shape=[None, Output_dim], dtype=tf.int32)
Define Hidden Layers
In this part, you will specify the details of your model. In other words, you will need to manually tell your model how to compute its output, for example by composing matrix multiplications and activations as sketched below.
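A minimal sketch of one fully connected hidden layer built from raw Variables, using the placeholders above; Hidden_dim is a hypothetical layer width:
# Hidden layer: weights, biases, and a ReLU activation
weights = tf.Variable(tf.truncated_normal([Input_dim, Hidden_dim], stddev=0.1))
biases = tf.Variable(tf.zeros([Hidden_dim]))
hidden = tf.nn.relu(tf.matmul(input, weights) + biases)
# Output layer producing the `raw_output` logits used by the loss below
out_weights = tf.Variable(tf.truncated_normal([Hidden_dim, Output_dim], stddev=0.1))
out_biases = tf.Variable(tf.zeros([Output_dim]))
raw_output = tf.matmul(hidden, out_weights) + out_biases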
Define Loss Function
When you come to this phase of graph construction, you need to define the loss of your model for the optimizer to minimize. Usually the tf.losses module provides the loss function you need. However, if you have specific needs when computing your loss, you can also define your own loss function by creating a tensor that holds the loss.
loss_op = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=raw_output)
Define Optimizer
In the final part of your graph construction, you need to choose your optimizer from the algorithms here.
# To train your model, simply run `train_op` with `Session.run`; the optimizer will automatically compute the gradients and update the weights.
train_op = tf.train.AdamOptimizer().minimize(loss_op)
# If you want to modify your gradients before applying them to the model weights (typically needed when exploding gradients occur),
# first create the optimizer instance,
# get and process the gradients,
# and apply them in the end.
optimizer = tf.train.AdamOptimizer()
gvs = optimizer.compute_gradients(loss_op)
# Gradient processing goes here
train_op = optimizer.apply_gradients(gvs)  # Use the processed gradients to minimize the loss
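For instance, a common form of gradient processing is clipping by norm; a sketch (the 5.0 threshold is arbitrary):
# Hypothetical clipping step: limit each gradient's norm before applying it
clipped_gvs = [(tf.clip_by_norm(grad, 5.0), var) for grad, var in gvs if grad is not None]
train_op = optimizer.apply_gradients(clipped_gvs)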
Training
To train your model, simply call Session.run(train_op). However, if your training operation relies on placeholders, you will need to feed in data through the feed_dict parameter when calling your operation.
# Create the iterator's next-batch op once, outside the training loop
next_batch = dataset_iterator.get_next()
with tf.Session() as sess:
    for step in range(steps):
        # Fetch the next batch of data from the Dataset iterator as numpy values
        batch_values = sess.run(next_batch)
        # The values in feed_dict cannot be Tensors
        _ = sess.run(train_op, feed_dict={input: batch_values})
Evaluating and Predicting
To evaluate or predict with your model, you will need to write a series of operations that complete the task and run the final operation to get the result, much like the training process; a sketch follows below.
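A minimal sketch of evaluating accuracy with the placeholders above, assuming eval_data and eval_labels are numpy arrays of inputs and one-hot labels:
# Build evaluation ops on top of `raw_output`, then run them with the test data
correct = tf.equal(tf.argmax(raw_output, 1), tf.argmax(labels, 1))
accuracy_op = tf.reduce_mean(tf.cast(correct, tf.float32))
# In practice, run this in the same Session used for training (or after restoring a checkpoint)
acc = sess.run(accuracy_op, feed_dict={input: eval_data, labels: eval_labels})
print('accuracy:', acc)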
Debug
To be continued