tensorpack.tfutils package

tensorpack.tfutils.collection module

Parameters:keys (list) – list of collection keys to backup
Returns:dict – the backup

Restore from a collection backup.

Parameters:backup (dict) –
tensorpack.tfutils.collection.freeze_collection(*args, **kwds)[source]
Parameters:keys (list) – list of collection keys to freeze.
Returns:a context where the collections are restored to their initial state at the end.

tensorpack.tfutils.distributions module

class tensorpack.tfutils.distributions.Distribution(name)[source]

Bases: object

Base class of symbolic distribution utilities (the distribution parameters can be symbolic tensors).

Parameters:name (str) – the name to be used for scope and tensors in this distribution.
encoder_activation(*args, **kwargs)[source]
An activation function which transforms unconstrained raw network output to a vector of feasible distribution parameters.

Note that for each distribution, there are many feasible ways to design this function and it’s hard to say which is better. The default implementations in the distribution classes here are just one reasonable way to do this.

Parameters:dist_param – output from a network, of shape (batch, param_dim).
Returns:a tensor of the same shape, the distribution parameters.
entropy(*args, **kwargs)[source]
Entropy of this distribution parameterized by theta, estimated from a batch of samples.
\[H(x) = - E[\log p(x_i)], \text{where } p \text{ is parameterized by } \theta.\]
  • x – samples of shape (batch, sample_dim)

  • theta – model parameters of shape (batch, param_dim)


a scalar tensor, the entropy.

loglikelihood(*args, **kwargs)[source]
  • x – samples of shape (batch, sample_dim)

  • theta – model parameters of shape (batch, param_dim)


log likelihood of each sample, of shape (batch,)

name = None

Returns – int: the dimension of parameters of this distribution.

sample(*args, **kwargs)[source]

Sample a batch of vectors from this distribution parameterized by theta.

  • batch_size (int) – the batch size.

  • theta – a tensor of shape (param_dim,) or (batch, param_dim).


a batch of samples of shape (batch, sample_dim)


Returns – int: the dimension of samples out of this distribution.

class tensorpack.tfutils.distributions.CategoricalDistribution(name, cardinality)[source]

Bases: tensorpack.tfutils.distributions.Distribution

Categorical distribution of a set of classes. Each sample is a one-hot vector.

__init__(name, cardinality)[source]
Parameters:cardinality (int) – number of categories
class tensorpack.tfutils.distributions.GaussianDistribution(name, dim, fixed_std=True)[source]

Bases: tensorpack.tfutils.distributions.Distribution

__init__(name, dim, fixed_std=True)[source]
  • dim (int) – the dimension of samples.

  • fixed_std (bool) – if True, will use 1 as std for all dimensions.

class tensorpack.tfutils.distributions.ProductDistribution(name, dists)[source]

Bases: tensorpack.tfutils.distributions.Distribution

A product of a list of independent distributions.

__init__(name, dists)[source]
Parameters:dists (list) – list of Distribution.
entropy(x, theta)[source]


It returns a list, as one might use different weights for each distribution.

Returns:list[tf.Tensor] – entropy of each distribution.

tensorpack.tfutils.gradproc module

class tensorpack.tfutils.gradproc.GradientProcessor[source]

Bases: object

Base class for all gradient processors.

Subclasses should override the _process() method.


Process the symbolic gradients.

Parameters:grads (list) – list of (grad, var).
Returns:list – processed gradients, with the same type as input.
class tensorpack.tfutils.gradproc.FilterNoneGrad(verbose=True)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Skip the update and print a warning (instead of crashing) when the gradient of a variable is None.

Parameters:verbose (bool) – whether to print warning about None gradients.
class tensorpack.tfutils.gradproc.GlobalNormClip(global_norm)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Clip by global norm. The global norm is the square root of the sum of the squared L2 norms of all gradients.

See tf.clip_by_global_norm() for more information.

Parameters:global_norm (float) – the threshold to clip with.
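
For intuition, global-norm clipping can be sketched in plain Python, with lists of floats standing in for gradient tensors. The helper name clip_by_global_norm_py is hypothetical and is not part of tensorpack or TensorFlow; the rule follows the documented behavior of tf.clip_by_global_norm().

```python
import math

def clip_by_global_norm_py(grads, clip_norm):
    """Hypothetical helper: rescale all gradients jointly by their global norm."""
    # Global norm: square root of the sum of squared entries over all gradients.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is 1 when the global norm is already within the threshold.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads], global_norm

grads = [[3.0, 0.0], [0.0, 4.0]]  # two "gradients" with global norm 5
clipped, norm = clip_by_global_norm_py(grads, clip_norm=1.0)
```

Because every gradient is multiplied by the same factor, the direction of the overall update is preserved; only its length is limited.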
class tensorpack.tfutils.gradproc.MapGradient(func, regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Apply a function to every gradient whose variable name matches regex. Keep the other gradients unchanged.

__init__(func, regex='.*')[source]
  • func – takes a grad or (grad, var) pair and returns a grad. If it returns None, the gradient is discarded (hence no update to the variable will happen).

  • regex (str) – used to match variables. Defaults to match all variables.
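
The selection rule described above can be sketched without TensorFlow, using (grad, name) pairs of plain Python values. The function map_gradient_py is a hypothetical stand-in, and matching the whole variable name against the regex is an assumption of this sketch, not a statement about the library's internals.

```python
import re

def map_gradient_py(grads_and_names, func, regex=".*"):
    """Hypothetical sketch: apply func to gradients whose name matches regex."""
    pattern = re.compile(regex)
    out = []
    for grad, name in grads_and_names:
        if pattern.fullmatch(name):  # assumption: the whole name must match
            grad = func(grad)
            if grad is None:         # None means the gradient is discarded
                continue
        out.append((grad, name))
    return out

pairs = [(1.0, "conv0/W"), (2.0, "conv0/b"), (3.0, "fc/b")]
doubled = map_gradient_py(pairs, lambda g: g * 2, regex=".*/b")
dropped = map_gradient_py(pairs, lambda g: None, regex="conv0/.*")
```

Here doubled scales only the two bias gradients, and dropped removes both conv0 gradients while leaving fc/b untouched.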

class tensorpack.tfutils.gradproc.SummaryGradient[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Add a histogram summary and an RMS summary for each gradient.

class tensorpack.tfutils.gradproc.CheckGradient[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Check for numeric issues. See tf.check_numerics() for more information.

class tensorpack.tfutils.gradproc.ScaleGradient(multipliers, verbose=True, log=None)[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Scale certain gradients by a multiplier.

__init__(multipliers, verbose=True, log=None)[source]
  • multipliers (tuple or list) – tuple of (regex, float), or list of tuples.

  • verbose (bool) – whether to print logs or not

  • log – deprecated


Use double the learning rate for all the biases (as in Caffe):

ScaleGradient(('.*/b', 2))

tensorpack.tfutils.model_utils module


Print a description of the current model parameters. Skip variables starting with “tower”.

Parameters:tensors (list or tf.Tensor) – a tensor or a list of tensors
Returns:str – a string to describe the shape

tensorpack.tfutils.scope_utils module

tensorpack.tfutils.scope_utils.get_name_scope_name(*args, **kwargs)[source]
Returns:str – the name of the current name scope, without the ending ‘/’.

A decorator which automatically reuses the current variable scope if the function has been called with the same variable scope before.

tensorpack.tfutils.optimizer module

tensorpack.tfutils.optimizer.apply_grad_processors(opt, gradprocs)[source]

Wrapper around optimizers to apply gradient processors.

  • opt (tf.train.Optimizer) –

  • gradprocs (list[GradientProcessor]) – gradient processors to add to the optimizer.


a tf.train.Optimizer instance which runs the gradient processors before updating the variables.

class tensorpack.tfutils.optimizer.ProxyOptimizer(opt, name='ProxyOptimizer')[source]

Bases: tensorflow.python.training.optimizer.Optimizer

A transparent proxy which delegates all methods of tf.train.Optimizer.

apply_gradients(*args, **kwargs)[source]
compute_gradients(*args, **kwargs)[source]
get_slot(*args, **kwargs)[source]
get_slot_names(*args, **kwargs)[source]
class tensorpack.tfutils.optimizer.PostProcessOptimizer(opt, func, colocate=True)[source]

Bases: tensorpack.tfutils.optimizer.ProxyOptimizer

An optimizer which applies some “post-processing operation” per variable (e.g. clipping, quantization) after the gradient update.

__init__(opt, func, colocate=True)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Operation or None) – the operation to perform on this variable after the gradient update.

  • colocate (boolean) – colocate the function with the variable.

apply_gradients(grads_and_vars, global_step=None, name=None)[source]
class tensorpack.tfutils.optimizer.VariableAssignmentOptimizer(opt, func)[source]

Bases: tensorpack.tfutils.optimizer.PostProcessOptimizer

An optimizer which assigns each variable a new value (e.g. clipping, quantization) after the gradient update.

__init__(opt, func)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Tensor or None) – the new value to be assigned to this variable after the gradient update.
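
As a rough illustration of "assign a new value after the gradient update", the following plain-Python sketch applies one SGD step and then maps each updated value through func. Everything here (the function name, the use of plain SGD, floats instead of variables) is a hypothetical simplification; treating a None result as "leave the variable as-is" is also an assumption.

```python
def sgd_step_with_assignment(values, grads, lr, func):
    """Hypothetical sketch: gradient update followed by per-variable assignment."""
    updated = [v - lr * g for v, g in zip(values, grads)]
    result = []
    for v in updated:
        new_v = func(v)
        # Assumption: None means "no post-processing for this variable".
        result.append(v if new_v is None else new_v)
    return result

def clip_to_unit(v):
    """Example post-processing: clip the value to [-1, 1]."""
    return max(-1.0, min(1.0, v))

result = sgd_step_with_assignment([0.9, -0.2], [-2.0, 1.0], lr=0.1, func=clip_to_unit)
```

The first value moves to about 1.1 and is clipped back to 1.0; the second stays inside the range and is unchanged by the clipping.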

tensorpack.tfutils.sesscreate module

class tensorpack.tfutils.sesscreate.NewSessionCreator(target='', graph=None, config=None)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

__init__(target='', graph=None, config=None)[source]
  • target, graph, config – same as Session.__init__().

  • config – defaults to tfutils.get_default_sess_config()

class tensorpack.tfutils.sesscreate.ReuseSessionCreator(sess)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

Parameters:sess (tf.Session) – the session to reuse
class tensorpack.tfutils.sesscreate.SessionCreatorAdapter(session_creator, func)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

__init__(session_creator, func)[source]
  • session_creator (tf.train.SessionCreator) – a session creator

  • func (tf.Session -> tf.Session) – takes a session created by session_creator, and returns a new session to be returned by self.create_session.


tensorpack.tfutils.summary module

tensorpack.tfutils.summary.create_scalar_summary(name, v)[source]
  • name (str) –

  • v (float) – scalar value


tf.Summary – a tf.Summary object with name and simple scalar value v.


Add summary Ops for all trainable variables matching the regex.

  • summary_lists (list) – each is (regex, [list of summary type to perform]). Summary type can be 'mean', 'scalar', 'histogram', 'sparsity', 'rms'.

tensorpack.tfutils.summary.add_activation_summary(x, name=None)[source]

Add summary for an activation tensor x. If name is None, use x.name.

Parameters:x (tf.Tensor) – the tensor to summarize.
tensorpack.tfutils.summary.add_moving_summary(v, *args, **kwargs)[source]
  • v (tf.Tensor or list) – tensor or list of tensors to summarize. Must have scalar type.

  • args – tensors to summarize (supports positional arguments)

  • decay (float) – the decay rate. Defaults to 0.95.

  • collection (str) – the name of the collection to add EMA-maintaining ops. The default will work together with the default MovingAverageSummary callback.
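
The effect of the decay parameter can be illustrated with a plain exponential moving average. This is only a sketch of the standard EMA update; the function name is hypothetical and any bias correction the library may apply is ignored.

```python
def ema_update(average, value, decay=0.95):
    """One EMA step: keep `decay` of the old average, blend in the new value."""
    return decay * average + (1.0 - decay) * value

avg = 0.0
for loss in [1.0, 1.0, 1.0]:
    avg = ema_update(avg, loss)
# After three steps starting from 0, avg equals 1 - 0.95 ** 3.
```

A decay close to 1 gives a smoother but slower-moving summary; a smaller decay tracks recent values more closely.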

tensorpack.tfutils.symbolic_functions module

tensorpack.tfutils.symbolic_functions.accuracy(logits, label, topk=1, name='accuracy')[source]
  • logits – shape [B,C].

  • label – shape [B].

  • topk (int) – topk


a single scalar


Flatten the tensor except the first dimension.

tensorpack.tfutils.symbolic_functions.class_balanced_cross_entropy(pred, label, name='cross_entropy_loss')[source]

The class-balanced cross entropy loss, as in Holistically-Nested Edge Detection.

  • pred – of shape (b, ...). the predictions in [0,1].

  • label – of the same shape. the ground truth in {0,1}.


class-balanced cross entropy loss.

tensorpack.tfutils.symbolic_functions.class_balanced_sigmoid_cross_entropy(logits, label, name='cross_entropy_loss')[source]

This function accepts logits rather than predictions, and is more numerically stable than class_balanced_cross_entropy().

tensorpack.tfutils.symbolic_functions.contrastive_loss(left, right, y, margin, extra=False, scope='constrastive_loss')[source]

Loss for Siamese networks as described in the paper: Learning a Similarity Metric Discriminatively, with Application to Face Verification by Chopra et al.

\[\frac{1}{2} [y \cdot d^2 + (1-y) \cdot \max(0, m - d)^2], d = \Vert l - r \Vert_2\]
  • left (tf.Tensor) – left feature vectors of shape [Batch, N].

  • right (tf.Tensor) – right feature vectors of shape [Batch, N].

  • y (tf.Tensor) – binary labels of shape [Batch]. 1: similar, 0: not similar.

  • margin (float) – horizon for negative examples (y==0).

  • extra (bool) – also return distances for pos and neg.


tf.Tensor – the contrastive loss (averaged over the batch), (and optionally average_pos_dist, average_neg_dist)
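
The formula above can be checked on a tiny batch with plain Python lists in place of tensors. The function name contrastive_loss_py is hypothetical; the computation follows the stated formula and averages over the batch, as the return description says.

```python
import math

def contrastive_loss_py(left, right, y, margin):
    """Hypothetical sketch of the contrastive loss, averaged over the batch."""
    total = 0.0
    for l, r, label in zip(left, right, y):
        # Euclidean distance between the two feature vectors.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(l, r)))
        # Similar pairs (label 1) are pulled together; dissimilar pairs
        # (label 0) are pushed apart until they are at least `margin` away.
        total += 0.5 * (label * d ** 2 + (1 - label) * max(0.0, margin - d) ** 2)
    return total / len(y)

left = [[0.0, 0.0], [0.0, 0.0]]
right = [[3.0, 4.0], [3.0, 4.0]]  # both pairs are at distance 5
loss = contrastive_loss_py(left, right, y=[1, 0], margin=6.0)
```

With a distance of 5, the similar pair contributes 12.5 and the dissimilar pair contributes 0.5, so the batch average is 6.5.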


Flatten the tensor.

tensorpack.tfutils.symbolic_functions.get_scalar_var(name, init_value, summary=False, trainable=False)[source]

Get a scalar float variable with a given initial value.

  • name (str) – name of the variable.

  • init_value (float) – initial value.

  • summary (bool) – whether to summarize this variable.

  • trainable (bool) – trainable or not.


tf.Variable – the variable

tensorpack.tfutils.symbolic_functions.guided_relu(*args, **kwds)[source]
Returns:A context where the gradient of tf.nn.relu() is replaced by guided back-propagation, as described in the paper: Striving for Simplicity: The All Convolutional Net
tensorpack.tfutils.symbolic_functions.huber_loss(x, delta=1, name='huber_loss')[source]

Huber loss of x.

\[\begin{split}y = \begin{cases} \frac{x^2}{2}, & |x| < \delta \\ \delta |x| - \frac{\delta^2}{2}, & |x| \ge \delta \end{cases}\end{split}\]
  • x – the difference vector.

  • delta (float) –


a tensor of the same shape as x.
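
The piecewise definition above can be written for a single scalar as follows. This is a plain-Python sketch with a hypothetical name; the library function operates elementwise on tensors.

```python
def huber_py(x, delta=1.0):
    """Hypothetical scalar version of the Huber loss defined above."""
    if abs(x) < delta:
        return 0.5 * x * x                        # quadratic near zero
    return delta * abs(x) - 0.5 * delta * delta   # linear in the tails

values = [huber_py(v) for v in [0.5, -2.0]]
```

The two branches agree at |x| = delta, which keeps the loss continuous while limiting the influence of large differences.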

tensorpack.tfutils.symbolic_functions.prediction_incorrect(logits, label, topk=1, name='incorrect_vector')[source]
  • logits – shape [B,C].

  • label – shape [B].

  • topk (int) – topk


a float32 vector of length B with 0/1 values. 1 means incorrect prediction.
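
What the returned vector contains can be sketched with nested lists in place of a [B, C] logits tensor. The function name is hypothetical and ties between equal logits are not treated specially in this sketch.

```python
def prediction_incorrect_py(logits, labels, topk=1):
    """Hypothetical sketch: 1.0 where the label is not among the top-k logits."""
    out = []
    for row, label in zip(logits, labels):
        # Indices of the k largest logits in this row.
        top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:topk]
        out.append(0.0 if label in top else 1.0)
    return out

logits = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
wrong_top1 = prediction_incorrect_py(logits, [1, 1])
wrong_top2 = prediction_incorrect_py(logits, [1, 1], topk=2)
```

Averaging such a vector over a dataset gives the top-k error rate.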

tensorpack.tfutils.symbolic_functions.print_stat(x, message=None)[source]

A simple print Op that might be easier to use than tf.Print(). Use it like: x = print_stat(x, message='This is x').

tensorpack.tfutils.symbolic_functions.psnr(prediction, ground_truth, maxp=None, name='psnr')[source]

Peak Signal-to-Noise Ratio.

\[PSNR = 20 \cdot \log_{10}(MAX_p) - 10 \cdot \log_{10}(MSE)\]
  • prediction – a tf.Tensor representing the prediction signal.

  • ground_truth – another tf.Tensor with the same shape.

  • maxp – maximum possible pixel value of the image (255 in 8-bit images)


A scalar tensor representing the PSNR.
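
The PSNR formula can be evaluated on flat lists of pixel values as a quick check. The function name psnr_py is hypothetical, and dropping the first term when maxp is None is an assumption of this sketch.

```python
import math

def psnr_py(prediction, ground_truth, maxp=None):
    """Hypothetical sketch of PSNR = 20*log10(MAX_p) - 10*log10(MSE)."""
    mse = sum((p - g) ** 2 for p, g in zip(prediction, ground_truth)) / len(prediction)
    value = -10.0 * math.log10(mse)
    if maxp is not None:  # assumption: without maxp only the MSE term is used
        value += 20.0 * math.log10(maxp)
    return value

score = psnr_py([0.0, 0.0], [10.0, 10.0], maxp=100.0)  # MSE is 100
```

Note that the sketch, like the formula, is undefined when the two signals are identical, because the MSE is then zero.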

tensorpack.tfutils.symbolic_functions.rms(x, name=None)[source]
Returns:root mean square of tensor x.
tensorpack.tfutils.symbolic_functions.saliency_map(output, input, name='saliency_map')[source]

Produce a saliency map as described in the paper: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. The saliency map is the gradient of the max element in output w.r.t input.

Returns:tf.Tensor – the saliency map. Has the same shape as input.
tensorpack.tfutils.symbolic_functions.shapeless_placeholder(x, axis, name)[source]

Make the static shape of a tensor less specific.

If you want to feed a value to a tensor, the shape of the fed value must match the tensor’s static shape. This function creates a placeholder which defaults to x if not fed, but has a less specific static shape than x. See also tensorflow#5680.

  • x – a tensor

  • axis (int or list of ints) – these axes of x.get_shape() will become None in the output.

  • name (str) – name of the output tensor


a tensor equal to x, but shape information is partially cleared.

tensorpack.tfutils.symbolic_functions.siamese_cosine_loss(left, right, y, scope='cosine_loss')[source]

Loss for Siamese networks (cosine version). Same as contrastive_loss() but with different similarity measurement.

\[[\frac{l \cdot r}{\lVert l\rVert \lVert r\rVert} - (2y-1)]^2\]
  • left (tf.Tensor) – left feature vectors of shape [Batch, N].

  • right (tf.Tensor) – right feature vectors of shape [Batch, N].

  • y (tf.Tensor) – binary labels of shape [Batch]. 1: similar, 0: not similar.


tf.Tensor – cosine-loss as a scalar tensor.

tensorpack.tfutils.symbolic_functions.soft_triplet_loss(anchor, positive, negative, extra=True, scope='soft_triplet_loss')[source]

Loss for triplet networks as described in the paper: Deep Metric Learning using Triplet Network by Hoffer et al.

It is a softmax loss using \((anchor-positive)^2\) and \((anchor-negative)^2\) as logits.

  • anchor (tf.Tensor) – anchor feature vectors of shape [Batch, N].

  • positive (tf.Tensor) – features of positive match of the same shape.

  • negative (tf.Tensor) – features of negative match of the same shape.

  • extra (bool) – also return distances for pos and neg.


tf.Tensor – triplet-loss as scalar (and optionally average_pos_dist, average_neg_dist)

tensorpack.tfutils.symbolic_functions.triplet_loss(anchor, positive, negative, margin, extra=False, scope='triplet_loss')[source]

Loss for Triplet networks as described in the paper: FaceNet: A Unified Embedding for Face Recognition and Clustering by Schroff et al.

Learn embeddings from an anchor point and a similar input (positive) as well as a not-similar input (negative). Intuitively, a matching pair (anchor, positive) should have a smaller relative distance than a non-matching pair (anchor, negative).

\[\max(0, m + \Vert a-p\Vert^2 - \Vert a-n\Vert^2)\]
  • anchor (tf.Tensor) – anchor feature vectors of shape [Batch, N].

  • positive (tf.Tensor) – features of positive match of the same shape.

  • negative (tf.Tensor) – features of negative match of the same shape.

  • margin (float) – horizon for negative examples

  • extra (bool) – also return distances for pos and neg.


tf.Tensor – triplet-loss as scalar (and optionally average_pos_dist, average_neg_dist)
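
The margin-based formula above can be tried on a small batch of plain Python lists. The name triplet_loss_py is hypothetical, and averaging the per-triplet losses over the batch is an assumption of this sketch.

```python
def triplet_loss_py(anchor, positive, negative, margin):
    """Hypothetical sketch of max(0, m + |a-p|^2 - |a-n|^2), batch-averaged."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    losses = [
        max(0.0, margin + sq_dist(a, p) - sq_dist(a, n))
        for a, p, n in zip(anchor, positive, negative)
    ]
    return sum(losses) / len(losses)

anchor = [[0.0, 0.0], [0.0, 0.0]]
positive = [[1.0, 0.0], [1.0, 0.0]]
negative = [[2.0, 0.0], [1.0, 0.0]]
loss = triplet_loss_py(anchor, positive, negative, margin=0.5)
```

The first triplet already separates the negative by more than the margin and contributes 0; the second has equal distances and contributes exactly the margin, giving an average of 0.25.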

tensorpack.tfutils.varmanip module

class tensorpack.tfutils.varmanip.SessionUpdate(sess, vars_to_update)[source]

Bases: object

Update the variables in a session

__init__(sess, vars_to_update)[source]
  • sess (tf.Session) – a session object

  • vars_to_update – a collection of variables to update

static load_value_to_var(var, val, strict=False)[source]

Call var.load(val) with the default session.

  • var (tf.Variable) –

  • strict (bool) – Behave less strictly if set to False.

Parameters:prms (dict) – dict of {variable name: value}. Any name in prms must be in the graph and in vars_to_update.

Dump the values of all TRAINABLE + MODEL variables to a dict, and save it in npy format (loadable by DictRestore).

Parameters:path (str) – the path to save the parameters.

Dump all variables from a checkpoint to a dict.

Parameters:model_path (str) – path to a checkpoint.
tensorpack.tfutils.varmanip.get_savename_from_varname(varname, varname_prefix=None, savename_prefix=None)[source]
  • varname (str) – a variable name in the graph

  • varname_prefix (str) – an optional prefix that may need to be removed in varname

  • savename_prefix (str) – an optional prefix to prepend to every savename


str – the name used to save the variable
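
A minimal string-level sketch of this mapping is shown below. The function name is hypothetical, and the use of '/' as the separator between a prefix and the rest of the name is an assumption based on how TensorFlow scopes are usually written, not something stated in this page.

```python
def savename_from_varname_py(varname, varname_prefix=None, savename_prefix=None):
    """Hypothetical sketch: strip one prefix from the name, then add another."""
    name = varname
    # Assumption: prefixes are scope names separated from the rest by '/'.
    if varname_prefix is not None and name.startswith(varname_prefix + "/"):
        name = name[len(varname_prefix) + 1:]
    if savename_prefix is not None:
        name = savename_prefix + "/" + name
    return name

saved = savename_from_varname_py("tower0/conv0/W", "tower0", "model")
unchanged = savename_from_varname_py("conv0/W")
```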


Guess whether a name belongs to a training-only variable. Only used internally to avoid excessive logging. Do not use it.

Returns:bool – Guess whether this tensor is something only used in training.

Work around TF problems in checkpoint path handling.

Parameters:model_path – a user-input path
Returns:str – the argument that can be passed to NewCheckpointReader

tensorpack.tfutils.varreplace module

tensorpack.tfutils.varreplace.custom_getter_scope(*args, **kwds)[source]
tensorpack.tfutils.varreplace.replace_get_variable(*args, **kwargs)[source]
Parameters:fn – a function compatible with tf.get_variable.
Returns:a context with a custom getter

Return a context, where all variables (reused or not) returned by get_variable will have no gradients (they will be wrapped by tf.stop_gradient). But they will still be in TRAINABLE_VARIABLES collections so they will get saved correctly. This is useful to fix certain variables for fine-tuning.


with varreplace.freeze_variable():
    x = FullyConnected('fc', x, 1000)   # fc/* will not be trained
tensorpack.tfutils.varreplace.freeze_get_variable(*args, **kwargs)[source]
tensorpack.tfutils.varreplace.remap_get_variable(*args, **kwargs)[source]

Use fn to map the output of any variable getter.

Parameters:fn (tf.Variable -> tf.Tensor) –
Returns:a context where all the variables will be mapped by fn.

Module contents

tensorpack.tfutils.argscope(*args, **kwds)[source]
Parameters:layers (list or layer) – layer or list of layers to apply the arguments.
Returns:a context where all appearances of these layers will by default have the arguments specified by kwargs.


with argscope(Conv2D, kernel_shape=3, nl=tf.nn.relu, out_channel=32):
    x = Conv2D('conv0', x)
    x = Conv2D('conv1', x)
    x = Conv2D('conv2', x, out_channel=64)  # override argscope
Returns:dict – the current argscope.

An argscope is a dict of dict: dict[layername] = {arg: val}


Return a better session config to use as default. TensorFlow’s default session config consumes too many resources.

Parameters:mem_fraction (float) – fraction of memory to use.
Returns:tf.ConfigProto – the config to use.
Returns:int – global_step value in current graph and session
tensorpack.tfutils.get_global_step_var(*args, **kwargs)

Will automatically determine if name is a tensor name (ends with ‘:x’) or an op name. If it is an op name, the corresponding tensor name is assumed to be op_name + ':0'.

Parameters:name (str) – name of an op or a tensor
Returns:tuple – (op_name, tensor_name)
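
The naming rule described above can be sketched as a small string function. The name split_op_tensor_name is hypothetical, and accepting any numeric output index after the colon is an assumption of this sketch.

```python
def split_op_tensor_name(name):
    """Hypothetical sketch: return (op_name, tensor_name) for an op or tensor name."""
    op, sep, index = name.rpartition(":")
    if sep and index.isdigit():
        # Already a tensor name such as 'conv0/W:0'.
        return op, name
    # An op name: assume its first output, ':0'.
    return name, name + ":0"

from_tensor = split_op_tensor_name("conv0/W:0")
from_op = split_op_tensor_name("conv0/W")
```

Both calls yield the same pair, so callers can accept either form of name.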

Get a list of tensors in the default graph by a list of names.

Parameters:names (list) –

Get either tf.Operation or tf.Tensor from names.

Parameters:name (list[str] or str) – names of operations or tensors.

Return a float (for comparison), indicating tensorflow version.

class tensorpack.tfutils.SessionInit[source]

Bases: object

Base class for utilities to initialize an (existing) session.


Initialize a session

Parameters:sess (tf.Session) – the session
class tensorpack.tfutils.SaverRestore(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore a tensorflow checkpoint saved by tf.train.Saver or ModelSaver.

__init__(model_path, prefix=None, ignore=[])[source]
  • model_path (str) – a model name (model-xxxx) or a checkpoint file.

  • prefix (str) – during restore, add a prefix/ to every variable in this checkpoint

  • ignore (list[str]) – list of tensor names that should be ignored during loading, e.g. learning-rate

class tensorpack.tfutils.SaverRestoreRelaxed(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SaverRestore

Same as SaverRestore, but has more relaxed constraints.

It allows upcasting certain variables, or reshaping certain variables when there is a mismatch that can be fixed. Another advantage is that it doesn’t add any new ops to the graph. But it is also slower than SaverRestore.

tensorpack.tfutils.ParamRestore(*args, **kwargs)[source]
class tensorpack.tfutils.DictRestore(param_dict)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore variables from a dictionary.

Parameters:param_dict (dict) – a dict of {name: value}
class tensorpack.tfutils.ChainInit(sess_inits)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Initialize a session by a list of SessionInit instances, executed one by one. This can be useful for, e.g., loading several models from different files to form a composition of models.

Parameters:sess_inits (list[SessionInit]) – list of SessionInit instances.
class tensorpack.tfutils.JustCurrentSession[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

This is a no-op placeholder.


Get a corresponding model loader by looking at the file name.

Returns:SessionInit – either a DictRestore (if name ends with ‘npy’) or SaverRestore (otherwise).

Load latest checkpoint from LOG_DIR, if there is one.

Returns:SessionInit – either a JustCurrentSession, or a SaverRestore.
class tensorpack.tfutils.TowerContext(tower_name, is_training=None, index=0, vs_name='')[source]

Bases: object

A context in which the current model is being built.

__init__(tower_name, is_training=None, index=0, vs_name='')[source]
  • tower_name (str) – The name scope of the tower.

  • is_training (bool) – if None, automatically determine from tower_name.

  • index (int) – index of this tower.

  • vs_name (str) – Open a variable scope with this name, if given.


Filter the list and only keep those under the current variable scope. If this tower doesn’t contain its own variable scope, return the list as-is.

Parameters:varlist (list[tf.Variable] or list[tf.Tensor]) –

Whether this tower is supposed to have its own variables.