[TOC]
Variables hold and update model parameters. They are in-memory buffers containing tensors. They must be explicitly initialized, and they can be saved to disk during and after training. You can later restore the saved values to exercise or analyze the model.
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name='weights')
biases = tf.Variable(tf.zeros([200]), name='biases')
To create a variable, pass a tensor as its initial value to the Variable() constructor.
TensorFlow provides a collection of ops that produce tensors often used for initialization, from constants or random values. All of these ops require you to specify the shape of the tensor.
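As a rough illustration of the shape requirement, the sketch below builds a few constant and random initial-value tensors and records their static shapes. It assumes a TensorFlow installation that provides the tensorflow.compat.v1 module, which is an assumption about the environment and not something these notes specify. The names zeros_init, filled_init and so on are illustrative only.

```python
import tensorflow.compat.v1 as tf

# Assumption: graph mode, matching the TF 1.x style code in these notes.
tf.disable_eager_execution()
tf.reset_default_graph()

# Constant-valued initial tensors: the shape is the first argument.
zeros_init = tf.zeros([200])
ones_init = tf.ones([3, 4])
filled_init = tf.fill([2, 2], 0.5)

# Random initial tensors: shape first, distribution parameters after.
normal_init = tf.random_normal([784, 200], stddev=0.35)
uniform_init = tf.random_uniform([10], minval=-1.0, maxval=1.0)

# Any of these tensors can be passed to the Variable() constructor.
weights = tf.Variable(normal_init, name="weights")
biases = tf.Variable(zeros_init, name="biases")

# The variable takes its static shape from the initial value.
static_shapes = {
    "ones": ones_init.shape.as_list(),
    "filled": filled_init.shape.as_list(),
    "uniform": uniform_init.shape.as_list(),
    "weights": weights.shape.as_list(),
    "biases": biases.shape.as_list(),
}
```

Nothing is evaluated here. The ops only describe how the initial values will be produced once the initializers are run in a session.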
A variable can be pinned to a particular device when it is created:
# Pin a variable to a GPU.
with tf.device('/gpu:1'):
    v = tf.Variable(...)
# Pin a variable to a particular parameter server task.
with tf.device('/job:ps/task:7'):
    v = tf.Variable(...)
NOTE: Operations that mutate a variable, such as tf.Variable.assign and the parameter update operations in a tf.train.Optimizer, must run on the same device as the variable. Incompatible device placement directives are ignored when creating these operations.
Variable initializers must be run explicitly before any other ops in the model can run. The easiest way to do that is to add an op that runs all the variable initializers and to run that op before using the model.
Use tf.global_variables_initializer() to add an op that runs the variable initializers. Run that op only after you have fully constructed your model and launched it in a session.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()
# Later, when launching the model
with tf.Session() as sess:
    sess.run(init_op)
    ...
    # Use the model
    ...
To initialize a new variable from the value of another variable, use the other variable’s initialized_value() method. You can use the initialized value directly as the initial value for the new variable, or you can use it like any other tensor to compute a value for the new variable.
weights = tf.Variable(tf.random_normal([784, 100], stddev=0.35), name='weights')
# Create another variable with the same value as 'weights'
w2 = tf.Variable(weights.initialized_value(), name='w2')
# Create another variable with twice the value of 'weights'
w_twice = tf.Variable(weights.initialized_value()*2.0, name='w_twice')
You can also initialize only some variables by passing an explicit list of them to tf.variables_initializer.
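A minimal sketch of initializing only a subset, assuming a TensorFlow installation that provides tensorflow.compat.v1 (an assumption, not stated in these notes). It initializes biases alone and then asks TensorFlow which variables still have no value. The op name init_biases is illustrative.

```python
import tensorflow.compat.v1 as tf

# Assumption: graph mode, matching the TF 1.x style code in these notes.
tf.disable_eager_execution()
tf.reset_default_graph()

weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")

# Add an op that initializes only the listed variables.
init_biases_op = tf.variables_initializer([biases], name="init_biases")

with tf.Session() as sess:
    sess.run(init_biases_op)
    # 'biases' now has a value and can be read.
    biases_value = sess.run(biases)
    # 'weights' was not in the list, so it is still reported as uninitialized.
    uninitialized = [
        name.decode() for name in sess.run(tf.report_uninitialized_variables())
    ]
```

Reading weights in that session before running its initializer would fail, which is why a partial initializer is mainly useful when the remaining variables get their values some other way, for example from a checkpoint.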
The easiest way to save and restore a model is to use a tf.train.Saver object. Its constructor adds save and restore ops to the graph for all, or a specified list, of the variables in the graph.
To restore a model checkpoint without first rebuilding the graph in code, you must import the graph from the meta graph file (typical extension: .meta). This is done with tf.train.import_meta_graph, which returns a Saver that you can then use to perform the restore.
Variables are saved in binary files that contain a map from variable names to tensor values.
When you create a Saver object, you can optionally choose names for the variables in the checkpoint files. By default, it uses the value of the tf.Variable.name property for each variable.
To see which variables are in a checkpoint, you can use the inspect_checkpoint library, in particular its print_tensors_in_checkpoint_file function.
# Create some variables.
v1 = tf.Variable(..., name='v1')
v2 = tf.Variable(..., name='v2')
...
init_op = tf.global_variables_initializer()
# Add ops to save and restore all the variables
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work,
# and save the variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    # Do some work with the model.
    ...
    # Save the variables to disk.
    save_path = saver.save(sess, '/tmp/model.ckpt')
    print('Model saved in file: %s' % save_path)
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
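# Later, launch the model and use the saver to restore the variables from disk.
# Restored variables do not need to be initialized beforehand.
with tf.Session() as sess:
    saver.restore(sess, '/tmp/model.ckpt')
    print('Model restored.')
    # Do some work with the model
    ...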
If you do not pass any argument to tf.train.Saver(), the saver handles all variables in the graph.
Sometimes you want to change the names used for variables in the checkpoint file, or to save or restore only a subset of the variables used by a model.
You can specify the names and variables to save by passing a Python dictionary to the tf.train.Saver() constructor: keys are the names to use, values are the variables to manage.
v1 = tf.Variable(..., name='v1')
v2 = tf.Variable(..., name='v2')
# Save and restore only 'v2', under the name 'my_v2'.
saver = tf.train.Saver({'my_v2': v2})
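To make the effect of the dictionary concrete, here is a small sketch that saves with such a Saver and then lists the names stored in the checkpoint. It assumes tensorflow.compat.v1 is available and uses tf.train.list_variables to read the checkpoint contents, which is a different helper from the inspect_checkpoint library mentioned earlier. The temporary directory and the initial values are made up for the example.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

# Assumption: graph mode, matching the TF 1.x style code in these notes.
tf.disable_eager_execution()
tf.reset_default_graph()

v1 = tf.Variable(tf.zeros([2]), name="v1")
v2 = tf.Variable(tf.ones([3]), name="v2")

# Manage only v2, and store it under the checkpoint name 'my_v2'.
saver = tf.train.Saver({"my_v2": v2})

# Hypothetical location: a fresh temporary directory for this example.
checkpoint_prefix = os.path.join(tempfile.mkdtemp(), "model.ckpt")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    save_path = saver.save(sess, checkpoint_prefix)

# Each entry is a (name, shape) pair read back from the checkpoint.
saved = tf.train.list_variables(save_path)
saved_names = [name for name, _ in saved]
saved_shapes = [list(shape) for _, shape in saved]
```

Only my_v2 appears in the checkpoint. v1 was never handed to the Saver, so it is neither saved nor restored by it.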
You can think of a TF tensor as an n-dimensional array or list. A tensor has a static type and dynamic dimensions.
Only tensors may be passed between nodes in the computation graph.
Tensor rank, sometimes called order, degree, or n-dimension, is the number of dimensions of the tensor.
Rank | Math entity | Python example |
---|---|---|
0 | Scalar (magnitude only) | s = 483 |
1 | Vector (magnitude and direction) | v = [1.1, 2.2, 3.3] |
2 | Matrix (table of numbers) | m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] |
3 | 3-Tensor (cube of numbers) | t = [[[2], [4], [6]], [[8], [10], [12]], [[14],[16], [18]]] |
n | n-Tensor (you get the idea) | .... |
Rank | Shape | Dimension number | Example |
---|---|---|---|
0 | [] | 0-D | A 0-D tensor. A scalar. |
1 | [D0] | 1-D | A 1-D tensor with shape [5]. |
2 | [D0, D1] | 2-D | A 2-D tensor with shape [3, 4]. |
3 | [D0, D1, D2] | 3-D | A 3-D tensor with shape [1, 4, 3]. |
n | [D0, D1, … Dn-1] | n-D | A tensor with shape [D0, D1, … Dn-1]. |
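The relationship between rank and shape in the two tables above can be checked with plain Python lists. The helpers shape_of and rank_of below are hypothetical, written for this illustration only. They assume the nested lists are rectangular and do not validate that.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list as a list of sizes.

    A non-list value (a scalar) has shape [].
    Assumption: every sub-list at a given depth has the same length.
    """
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]  # descend along the first element
    return shape


def rank_of(value):
    """Rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape_of(value))


# The Python examples from the rank table.
s = 483
v = [1.1, 2.2, 3.3]
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]

ranks = [rank_of(x) for x in (s, v, m, t)]
shapes = [shape_of(x) for x in (s, v, m, t)]
```

Note that the rank-3 example t has shape [3, 3, 1]: its innermost lists hold a single number each.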
Data type | Python type | Description |
---|---|---|
DT_FLOAT | tf.float32 | 32-bit floating point. |
DT_DOUBLE | tf.float64 | 64-bit floating point. |
DT_INT8 | tf.int8 | 8-bit signed integer. |
DT_INT16 | tf.int16 | 16-bit signed integer. |
DT_INT32 | tf.int32 | 32-bit signed integer. |
DT_INT64 | tf.int64 | 64-bit signed integer. |
DT_UINT8 | tf.uint8 | 8-bit unsigned integer. |
DT_UINT16 | tf.uint16 | 16-bit unsigned integer. |
DT_STRING | tf.string | Variable-length byte arrays. Each element of a tensor is a byte array. |
DT_BOOL | tf.bool | Boolean. |
DT_COMPLEX64 | tf.complex64 | Complex number made of two 32-bit floating point values: real and imaginary parts. |
DT_COMPLEX128 | tf.complex128 | Complex number made of two 64-bit floating point values: real and imaginary parts. |
DT_QINT8 | tf.qint8 | 8-bit signed integer used in quantized ops. |
DT_QINT32 | tf.qint32 | 32-bit signed integer used in quantized ops. |
DT_QUINT8 | tf.quint8 | 8-bit unsigned integer used in quantized ops. |
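A short sketch of how these types show up in practice, assuming tensorflow.compat.v1 is available. It shows the dtype TensorFlow picks for plain Python values, an explicit dtype, and a conversion with tf.cast. The variable names are illustrative.

```python
import tensorflow.compat.v1 as tf

# Assumption: graph mode, matching the TF 1.x style code in these notes.
tf.disable_eager_execution()
tf.reset_default_graph()

# The dtype is fixed when the tensor is created.
float_tensor = tf.constant(1.5)       # Python float -> tf.float32
int_tensor = tf.constant(3)           # Python int   -> tf.int32
string_tensor = tf.constant("hello")  # Python str   -> tf.string
bool_tensor = tf.constant(True)       # Python bool  -> tf.bool

# Request another type explicitly when the default is not what you want.
double_tensor = tf.constant(1.5, dtype=tf.float64)

# Ops generally do not mix types implicitly; convert with tf.cast first.
int_as_float = tf.cast(int_tensor, tf.float32)
total = float_tensor + int_as_float

# DType.size is the number of bytes per element.
bytes_per_element = {"float32": tf.float32.size, "float64": tf.float64.size}
```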
def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]), name='conv1_weights')
    conv1_biases = tf.Variable(tf.zeros([32]), name='conv1_biases')
    conv1 = tf.nn.conv2d(input_images, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)
    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]), name='conv2_weights')
    conv2_biases = tf.Variable(tf.zeros([32]), name='conv2_biases')
    conv2 = tf.nn.conv2d(relu1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)
Here we have 4 variables: conv1_weights, conv1_biases, conv2_weights, and conv2_biases.
Now suppose you want to reuse this model, for example to apply your image filter to 2 different images. Calling my_image_filter() 2 times would create two separate sets of variables, 8 in total. A common way to share variables instead is to create them in a separate piece of code and pass them to the functions that use them, for example in a dictionary.
variables_dict = {
    'conv1_weights': tf.Variable(tf.random_normal([5, 5, 32, 32]), name='conv1_weights'),
    'conv1_biases': tf.Variable(tf.zeros([32]), name='conv1_biases'),
    # ... etc. ...
}
def my_image_filter(input_images, variables_dict):
    conv1 = tf.nn.conv2d(input_images, variables_dict['conv1_weights'], strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + variables_dict['conv1_biases'])
    conv2 = tf.nn.conv2d(relu1, variables_dict['conv2_weights'], strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + variables_dict['conv2_biases'])
# Both calls use the same variables.
result1 = my_image_filter(image1, variables_dict)
result2 = my_image_filter(image2, variables_dict)
However, the approach above breaks encapsulation. One way to address the problem is to use classes to create a model, where the classes take care of managing the variables they need. For a lighter solution that does not involve classes, TF provides a Variable Scope mechanism that makes it easy to share named variables while constructing a graph.
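The class-based option might look roughly like the sketch below. ImageFilter is a hypothetical class written for this illustration, not a TensorFlow API, and the placeholder shapes are assumptions chosen to match the [5, 5, 32, 32] kernels. The sketch assumes tensorflow.compat.v1 is available.

```python
import tensorflow.compat.v1 as tf

# Assumption: graph mode, matching the TF 1.x style code in these notes.
tf.disable_eager_execution()
tf.reset_default_graph()


class ImageFilter(object):
    """Hypothetical model class that owns the four filter variables."""

    def __init__(self):
        # Variables are created once, when the object is constructed.
        self.conv1_weights = tf.Variable(
            tf.random_normal([5, 5, 32, 32]), name="conv1_weights")
        self.conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
        self.conv2_weights = tf.Variable(
            tf.random_normal([5, 5, 32, 32]), name="conv2_weights")
        self.conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")

    def __call__(self, input_images):
        # Every call builds new ops but reads the same variables.
        conv1 = tf.nn.conv2d(input_images, self.conv1_weights,
                             strides=[1, 1, 1, 1], padding="SAME")
        relu1 = tf.nn.relu(conv1 + self.conv1_biases)
        conv2 = tf.nn.conv2d(relu1, self.conv2_weights,
                             strides=[1, 1, 1, 1], padding="SAME")
        return tf.nn.relu(conv2 + self.conv2_biases)


# Assumed input shape: batches of 28x28 images with 32 channels.
image1 = tf.placeholder(tf.float32, [None, 28, 28, 32])
image2 = tf.placeholder(tf.float32, [None, 28, 28, 32])

image_filter = ImageFilter()
result1 = image_filter(image1)
result2 = image_filter(image2)

num_variables = len(tf.global_variables())
```

The graph still contains 4 variables after both calls, and callers never need to know their names or shapes.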
The Variable Scope mechanism in TF consists of two main functions:
tf.get_variable()
tf.variable_scope()
tf.get_variable() is used to get or create a variable, instead of a direct call to tf.Variable. It takes an initializer rather than the initial value itself, as tf.Variable does.
An initializer is a function that takes a shape and provides a tensor with that shape. Here are some initializers available in TF:
tf.constant_initializer(value)
tf.random_uniform_initializer(a, b)
tf.random_normal_initializer(mean, stddev)
Below is a tf.get_variable() usage example:
def conv_relu(input, kernel_shape, bias_shape):
    # Create a variable named 'weights'.
    weights = tf.get_variable('weights', kernel_shape, initializer=tf.random_normal_initializer())
    # Create a variable named 'biases'.
    biases = tf.get_variable('biases', bias_shape, initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights, strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)
tf.variable_scope() pushes a namespace for variables:
def my_image_filter(input_images):
    with tf.variable_scope('conv1'):
        # Variables created here will be named 'conv1/weights', 'conv1/biases'
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope('conv2'):
        # Variables created here will be named 'conv2/weights', 'conv2/biases'
        return conv_relu(relu1, [5, 5, 32, 32], [32])
tf.get_variable() checks that already existing variables are not shared by accident. If you want to share them, you need to say so explicitly by calling reuse_variables() on the scope:
with tf.variable_scope('image_filters') as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)
This is a good way to share variables: lightweight and safe.
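The safety check can be observed directly. The sketch below, which assumes tensorflow.compat.v1 is available, asks for the same variable name twice without reuse, then again inside a reusing scope, and finally asks a reusing scope for a variable that was never created. The flag names and the variable name never_created are illustrative.

```python
import tensorflow.compat.v1 as tf

# Assumption: graph mode, matching the TF 1.x style code in these notes.
tf.disable_eager_execution()
tf.reset_default_graph()

with tf.variable_scope("image_filters"):
    first = tf.get_variable("weights", [5, 5, 32, 32])
    try:
        # Same name, reuse not requested: treated as an accident.
        tf.get_variable("weights", [5, 5, 32, 32])
        accidental_share_blocked = False
    except ValueError:
        accidental_share_blocked = True

with tf.variable_scope("image_filters", reuse=True):
    # Reuse requested explicitly: the existing variable is returned.
    shared = tf.get_variable("weights", [5, 5, 32, 32])
    try:
        # A reusing scope only looks variables up; it does not create them.
        tf.get_variable("never_created", [1])
        missing_variable_blocked = False
    except ValueError:
        missing_variable_blocked = True
```

Both failure modes raise ValueError while the graph is being built, so mistakes surface before any session is run.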
The current variable scope can be retrieved using tf.get_variable_scope(), and the reuse flag of the current variable scope can be set to True by calling tf.get_variable_scope().reuse_variables():
with tf.variable_scope('foo'):
    v = tf.get_variable('v', [1])
    tf.get_variable_scope().reuse_variables()
    v1 = tf.get_variable('v', [1])
assert v1 is v

# Sub-scopes inherit the reuse flag of the enclosing scope.
with tf.variable_scope('root'):
    # At start, the scope is not reusing
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope('foo'):
        # Open a sub-scope, still not reusing
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope('foo', reuse=True):
        # Explicitly opened a reusing scope
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope('bar'):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False
tf.variable_scope() also implicitly opens a name scope, so it affects the names of ops created inside it:
with tf.variable_scope("foo"):
    x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"
Name scopes can be opened in addition to a variable scope. They then affect only the names of ops, not the names of variables.
with tf.variable_scope('foo'):
    with tf.name_scope('bar'):
        v = tf.get_variable('v', [1])
        x = 1.0 + v
assert v.name == 'foo/v:0'
assert x.op.name == 'foo/bar/add'
When opening a variable scope using a captured object instead of a string, we do not alter the current name scope for ops.
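A sketch of opening a scope from a captured object, assuming tensorflow.compat.v1 is available. It checks only variable scope names and variable names: a scope reopened by object jumps back to its original name instead of nesting under the enclosing scope. It makes no assertion about op names. The names foo_scope, baz_scope and foo_scope_again are illustrative.

```python
import tensorflow.compat.v1 as tf

# Assumption: graph mode, matching the TF 1.x style code in these notes.
tf.disable_eager_execution()
tf.reset_default_graph()

# Capture the scope object when it is first opened.
with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])

with tf.variable_scope("bar"):
    # Opened by string: nests under the current scope.
    with tf.variable_scope("baz") as baz_scope:
        # Opened by captured object: returns to 'foo', not 'bar/baz/foo'.
        with tf.variable_scope(foo_scope, reuse=True) as foo_scope_again:
            v_again = tf.get_variable("v", [1])

scope_names = (foo_scope.name, baz_scope.name, foo_scope_again.name)
```

Because the reopened scope keeps the name foo, the lookup with reuse=True finds the variable created in the first block.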