Data
Dataset
- Create the dataset
- Preprocess the dataset
- Map features and labels onto the dataset
- Shuffle the dataset
- Batch the data
- Create an iterator for loading data
- Use the iterator to feed data, or inside an input_fn for an estimator (a sketch of these steps follows below)
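A minimal sketch of this pipeline with the tf.data API, assuming images and labels are in-memory numpy arrays (the normalization inside map is illustrative):
import tensorflow as tf
# Create the dataset from in-memory arrays
dataset = tf.data.Dataset.from_tensor_slices((images, labels))
# Preprocess and map features/labels into the dict form the model_fn below expects
dataset = dataset.map(lambda image, label: ({'image': tf.cast(image, tf.float32) / 255.0}, label))
# Shuffle the dataset and create batched data
dataset = dataset.shuffle(buffer_size=10000)
dataset = dataset.batch(32)
# Create an iterator; call get_next() to feed data, e.g. inside an input_fn for an estimator
iterator = dataset.make_one_shot_iterator()
features, labels = iterator.get_next()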
Model
You can build your model through two approaches:
- A high-level API like the estimator
- A computational graph constructed with the low-level API
High Level API: Estimator
Define Model_fn
There are two ways to define a custom model_fn for your estimator. One is through tf.layers, which is similar to a Keras model. The other is to manually create the computational graph yourself. Here we will focus on tf.layers, as it is relatively easy for beginners.
We will use an MNIST convolutional neural network model I created to help explain the procedure of building a model_fn. link
def model_fn(features, labels, mode):
    input_layer = tf.reshape(features['image'], [-1, 28, 28, 1])
    tf.summary.tensor_summary('inputs', input_layer)
    conv1 = tf.layers.conv2d(input_layer, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv1')
    pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2, name='Maxpool1')
    conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv2')
    pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2, name='Maxpool2')
    flatten = tf.layers.flatten(pool2, name='Flatten1')
    dense = tf.layers.dense(flatten, 1024, activation=tf.nn.relu, name='Dense1')
    dropout = tf.layers.dropout(dense, 0.4, training=mode == tf.estimator.ModeKeys.TRAIN, name='Dropout1')
    logits = tf.layers.dense(dropout, 10, name='Dense2')
    predictions = {
        'classes': tf.argmax(logits, 1),
        'probabilities': tf.nn.softmax(logits, name='Softmax_tensor')
    }
    eval_metrics_ops = {
        'accuracy': tf.metrics.accuracy(labels=labels, predictions=predictions['classes'])
    }
    # Storing all vars into TensorBoard
    for var in tf.trainable_variables():
        variable_summaries(var)
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    if mode == tf.estimator.ModeKeys.TRAIN:
        # Cast labels to match the int64 output of tf.argmax
        corrected_prediction = tf.equal(tf.argmax(logits, 1), tf.cast(labels, tf.int64))
        accuracy = tf.reduce_mean(tf.cast(corrected_prediction, tf.float32))
        tf.summary.scalar('accuracy', accuracy)
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
        train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
        merged = tf.summary.merge_all()
        summary_hook = tf.train.SummarySaverHook(save_steps=50, output_dir="tmp/mnist_conv", summary_op=[merged])
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op, training_hooks=[summary_hook])
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics_ops)
def model_fn(features, labels, mode):
    ...
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics_ops, train_op=train_op)
The model_fn is defined with three parameters: features, labels, and mode. features is a dict of input data keyed by feature name. labels holds the labels corresponding to features. mode takes one of three values, tf.estimator.ModeKeys.TRAIN, tf.estimator.ModeKeys.EVAL, and tf.estimator.ModeKeys.PREDICT, denoting which behavior the estimator should run.
input_layer = tf.reshape(features['image'], [-1, 28, 28, 1])
The model_fn begins with an input layer similar to the one in Keras. However, there is no native support for defining an input layer, so you have to create it yourself. The -1 in the second parameter of tf.reshape() lets Tensorflow automatically compute the batch size of the input data.
conv1 = tf.layers.conv2d(input_layer, filters=32, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv1')
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2, name='Maxpool1')
conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv2')
pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2, name='Maxpool2')
flatten = tf.layers.flatten(pool2, name='Flatten1')
dense = tf.layers.dense(flatten, 1024, activation=tf.nn.relu, name='Dense1')
dropout = tf.layers.dropout(dense, 0.4, training=mode == tf.estimator.ModeKeys.TRAIN, name='Dropout1')
logits = tf.layers.dense(dropout, 10, name='Dense2')
After the input layer, you add the remaining layers of your model_fn. Besides using the functional interfaces of the layers, you can also create layer instances and then call them to apply the layer operations, as sketched below. Please refer here for further details.
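For example, the object-oriented counterpart of the first convolution above would look roughly like this (a sketch; the names are illustrative):
# Build the layer object once, then call it on a tensor; calling it again reuses its weights
conv1_layer = tf.layers.Conv2D(filters=32, kernel_size=5, padding='same', activation=tf.nn.relu, name='Conv1')
conv1 = conv1_layer(input_layer)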
predictions = {
    'classes': tf.argmax(logits, 1),
    'probabilities': tf.nn.softmax(logits, name='Softmax_tensor')
}
eval_metrics_ops = {
    'accuracy': tf.metrics.accuracy(labels=labels, predictions=predictions['classes'])
}
# Storing all vars into TensorBoard
for var in tf.trainable_variables():
    variable_summaries(var)
if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
if mode == tf.estimator.ModeKeys.TRAIN:
    # Cast labels to match the int64 output of tf.argmax
    corrected_prediction = tf.equal(tf.argmax(logits, 1), tf.cast(labels, tf.int64))
    accuracy = tf.reduce_mean(tf.cast(corrected_prediction, tf.float32))
    tf.summary.scalar('accuracy', accuracy)
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
    train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
    merged = tf.summary.merge_all()
    summary_hook = tf.train.SummarySaverHook(save_steps=50, output_dir="tmp/mnist_conv", summary_op=[merged])
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op, training_hooks=[summary_hook])
return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metrics_ops)
The rest of the model_fn defines the evaluation metrics, the training operation, and the prediction operation.
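Once the model_fn is complete, you can plug it into an estimator and train it. A minimal sketch, assuming train_input_fn is an input_fn built from the Dataset pipeline described earlier:
# Hypothetical wiring; model_dir matches the summary output dir used above
mnist_classifier = tf.estimator.Estimator(model_fn=model_fn, model_dir='tmp/mnist_conv')
mnist_classifier.train(input_fn=train_input_fn, steps=2000)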
Using TensorBoard
TensorBoard is a very powerful tool for examining your Tensorflow model. To record your model parameters, simply attach summaries to every variable in tf.trainable_variables() after defining your layers, and create a summary_hook for the estimator to record the variables.
def variable_summaries(var):
    """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
    # Strip the ':' from the variable name, since ':' is invalid in a name scope
    with tf.name_scope(''.join(var.name.split(':'))):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)
# Storing all vars into TensorBoard
for var in tf.trainable_variables():
    variable_summaries(var)
# Merging all summaries and creating a summary hook for recording params
merged = tf.summary.merge_all()
summary_hook = tf.train.SummarySaverHook(save_steps=50, output_dir="tmp/mnist_conv", summary_op=[merged])
# Adding summary_hook to the estimator through training_hooks
return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op, training_hooks=[summary_hook])
After adding these to your model_fn, you can use tensorboard --logdir YOUR_OUTPUT_DIR to launch TensorBoard and view the results.
Low Level API
Terms
- Graph
  - The model representation
  - Not executed until some operation is called, at which point only the part of the graph required for that operation is computed (see the sketch after this list)
- Variables
  - Store values that can be updated by an optimizer
- Tensor
  - The basic element in Tensorflow
  - According to Tensorflow, a Tensor is like an edge of the graph
- Operation
  - The function in the graph to be called by Session.run
  - According to Tensorflow, an Operation is like a node of the graph
- Session
  - Like an engine that processes the graph
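A minimal sketch of these semantics (the values are arbitrary):
# Building the graph runs nothing; a Session executes only what an op requires
a = tf.constant(2)
b = tf.constant(3)
c = a + b                 # Just adds an addition node to the graph
d = tf.constant(100) * 2  # Unrelated node; not computed when running `c`
with tf.Session() as sess:
    print(sess.run(c))    # 5; only the subgraph needed for `c` is executed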
Graph Construction
If you want to use the low-level API (Graph, Variables, or Sessions) to build your Tensorflow model, you first need to create a Tensorflow graph.
Define Input
The first part of your graph should be the inputs of your model. To define them, you can use tf.placeholder to host the inputs. Note that you cannot modify or assign values to the Tensor returned by tf.placeholder.
# The `shape` param indicates the input shape of the placeholder
# You can use `None` in your shape config, meaning the placeholder will hold data of arbitrary length on that dimension
# For instance, [None, 32] indicates the placeholder accepts a matrix with any number of rows and the column size fixed at 32
# `dtype` in Tensorflow indicates the type of the data
# Typical types include `tf.int32`, `tf.string`, ...
input = tf.placeholder(shape=[Batch_size, Input_dim], dtype=tf.float32)
labels = tf.placeholder(shape=[None, Output_dim], dtype=tf.int32)
Define Hidden Layers
In this part, you will specify the details of your model. In other words, you will need to manually tell your model how to compute its output, for example by composing matrix multiplications and activations as sketched below.
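A minimal sketch of one fully connected hidden layer built from raw Variables, using the placeholders above; Hidden_dim is a hypothetical layer width:
# Hidden layer: weights, biases, and a ReLU activation
weights = tf.Variable(tf.truncated_normal([Input_dim, Hidden_dim], stddev=0.1))
biases = tf.Variable(tf.zeros([Hidden_dim]))
hidden = tf.nn.relu(tf.matmul(input, weights) + biases)
# Output layer producing the `raw_output` logits used by the loss below
out_weights = tf.Variable(tf.truncated_normal([Hidden_dim, Output_dim], stddev=0.1))
out_biases = tf.Variable(tf.zeros([Output_dim]))
raw_output = tf.matmul(hidden, out_weights) + out_biases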
Define Loss Function
When you come to this phase of graph construction, you need to define the loss of your model for the optimizer to minimize. Usually the tf.losses module provides the loss function you need. However, if you have specific needs when computing your loss, you can also define your own loss function by creating a tensor that holds the loss.
loss_op = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=raw_output)
Define Optimizer
In the final part of your graph construction, you need to choose your optimizer from the algorithms here.
# To train your model, simply run `train_op` with `Session.run`; the optimizer will automatically compute the gradients and update the weights.
train_op = tf.train.AdamOptimizer().minimize(loss_op)
# If you want to modify your gradients before applying them to the model weights (typically needed when exploding gradients occur),
# first create the optimizer instance,
# get and process the gradients,
# and apply them in the end.
optimizer = tf.train.AdamOptimizer()
gvs = optimizer.compute_gradients(loss_op)
# Gradient processing goes here
train_op = optimizer.apply_gradients(gvs)  # Use the processed gradients to minimize the loss
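For instance, a common form of gradient processing is clipping by norm; a sketch (the 5.0 threshold is arbitrary):
# Hypothetical clipping step: limit each gradient's norm before applying it
clipped_gvs = [(tf.clip_by_norm(grad, 5.0), var) for grad, var in gvs if grad is not None]
train_op = optimizer.apply_gradients(clipped_gvs)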
Training
To train your model, simply call Session.run(train_op). However, if your training operation relies on placeholders, you will need to feed in data through the feed_dict parameter when calling your operation.
# Create the iterator's next-batch op once, outside the training loop
next_batch = dataset_iterator.get_next()
with tf.Session() as sess:
    for step in range(steps):
        # Fetch the next batch of data from the Dataset iterator as numpy values
        batch_values = sess.run(next_batch)
        # The values in feed_dict cannot be Tensors
        _ = sess.run(train_op, feed_dict={input: batch_values})
Evaluating and Predicting
To evaluate or predict with your model, you will need to write a series of operations that complete the task and run the final operation to get the result, much like the training process; a sketch follows below.
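A minimal sketch of evaluating accuracy with the placeholders above, assuming eval_data and eval_labels are numpy arrays of inputs and one-hot labels:
# Build evaluation ops on top of `raw_output`, then run them with the test data
correct = tf.equal(tf.argmax(raw_output, 1), tf.argmax(labels, 1))
accuracy_op = tf.reduce_mean(tf.cast(correct, tf.float32))
# In practice, run this in the same Session used for training (or after restoring a checkpoint)
acc = sess.run(accuracy_op, feed_dict={input: eval_data, labels: eval_labels})
print('accuracy:', acc)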
Debug
To be continued