tensorpack.tfutils package

tensorpack.tfutils.collection module

Parameters:keys (list) – list of collection keys to back up
Returns:dict – the backup

Restore from a collection backup.

Parameters:backup (dict) – the backup dict to restore from
Parameters:keys (list) – list of collection keys to freeze.
Returns:a context; when it exits, the collections are restored to their initial state.
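The backup/restore/freeze semantics can be sketched in plain Python, treating the graph's collections as a dict. `COLLECTIONS` and the function bodies below are illustrative stand-ins, not tensorpack's actual implementation:

```python
import contextlib

# A stand-in for the graph's collections: key -> list of items.
COLLECTIONS = {'variables': ['w', 'b'], 'losses': ['l2']}

def backup_collection(keys):
    """Copy the listed collections into a backup dict."""
    return {k: list(COLLECTIONS[k]) for k in keys}

def restore_collection(backup):
    """Overwrite each collection with its backed-up contents."""
    for k, v in backup.items():
        COLLECTIONS[k] = list(v)

@contextlib.contextmanager
def freeze_collection(keys):
    """A context in which changes to the listed collections are undone on exit."""
    backup = backup_collection(keys)
    try:
        yield
    finally:
        restore_collection(backup)
```

Anything appended to a frozen collection inside the context disappears when the context exits.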

tensorpack.tfutils.gradproc module

class tensorpack.tfutils.gradproc.GradientProcessor[source]

Bases: object

Base class for all gradient processors.

Subclasses should override the _process() method.


process(grads)[source]

Process the symbolic gradients.

Parameters:grads (list) – list of (grad, var) pairs.
Returns:list – the processed gradients, in the same format as the input.
class tensorpack.tfutils.gradproc.FilterNoneGrad(verbose=True)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Skip the update and print a warning (instead of crashing) when the gradient of a variable is None.

Parameters:verbose (bool) – whether to print warning about None gradients.
class tensorpack.tfutils.gradproc.GlobalNormClip(global_norm)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Clip gradients by their global norm, i.e. the l2 norm of all gradients taken together.

See tf.clip_by_global_norm() for more information.

Parameters:global_norm (float) – the threshold to clip with.
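A rough pure-Python sketch of what tf.clip_by_global_norm() computes, with gradients as lists of floats (the helper name is made up for illustration):

```python
import math

def global_norm_clip(grads, clip_norm):
    """If the global norm (l2 norm of all gradients taken together)
    exceeds clip_norm, scale every gradient by clip_norm / global_norm."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if global_norm <= clip_norm:
        return grads
    scale = clip_norm / global_norm
    return [[g * scale for g in grad] for grad in grads]
```

All gradients are scaled by the same factor, so their relative directions are preserved.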
class tensorpack.tfutils.gradproc.MapGradient(func, regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Apply a function to all gradients whose variable name matches a regex. Keep the other gradients unchanged.

__init__(func, regex='.*')[source]
  • func – takes a grad or a (grad, var) pair and returns a grad. If it returns None, the gradient is discarded (hence no update to the variable will happen).

  • regex (str) – used to match variables. Defaults to match all variables.
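The matching logic can be sketched without TensorFlow, with gradients as plain numbers and variables as name strings. `map_gradient` is a hypothetical stand-in; the sketch assumes the regex must match the full variable name:

```python
import re

def map_gradient(func, grads, regex='.*'):
    """Apply func to each gradient whose variable name matches regex;
    keep non-matching gradients unchanged, drop those mapped to None."""
    if not regex.endswith('$'):
        regex = regex + '$'   # require the regex to match the full name
    result = []
    for grad, var in grads:
        if re.match(regex, var):
            grad = func(grad)
            if grad is None:
                continue      # discarded: no update for this variable
        result.append((grad, var))
    return result
```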

class tensorpack.tfutils.gradproc.SummaryGradient(regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

For each gradient tensor, summarize its histogram and add it to the moving summaries.

Parameters:regex (str) – same as in MapGradient.
class tensorpack.tfutils.gradproc.PrintGradient(regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Print the gradients every step with symbolic_functions.print_stat().

Parameters:regex (str) – same as in MapGradient.
class tensorpack.tfutils.gradproc.CheckGradient[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Check gradients for numeric issues. See tf.check_numerics() for more information.

class tensorpack.tfutils.gradproc.ScaleGradient(multipliers, verbose=True, log=None)[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Scale the gradients of certain variables by multipliers.

__init__(multipliers, verbose=True, log=None)[source]
  • multipliers (tuple or list) – a (regex, float) tuple, or a list of such tuples.

  • verbose (bool) – whether to print logs or not

  • log – deprecated


Example: use a doubled learning rate for all biases (as in Caffe):

ScaleGradient(('.*/b', 2))
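A plain-Python sketch of the multiplier matching. The zero-multiplier-drops-gradient behavior is an assumption for illustration, not stated above:

```python
import re

def scale_gradient(grads, multipliers):
    """Scale each gradient by the multiplier of the first matching regex.
    A multiplier of 0 drops the gradient entirely (an assumption here)."""
    if isinstance(multipliers, tuple):
        multipliers = [multipliers]   # a single (regex, float) pair
    result = []
    for grad, var in grads:
        for regex, mult in multipliers:
            if not regex.endswith('$'):
                regex = regex + '$'
            if re.match(regex, var):
                if mult != 0:
                    result.append((grad * mult, var))
                break
        else:   # no regex matched: keep the gradient unchanged
            result.append((grad, var))
    return result
```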

tensorpack.tfutils.scope_utils module


A decorator which automatically reuses the current variable scope if the function has been called with the same variable scope before.


@auto_reuse_variable_scope
def myfunc(x):
    return tf.layers.conv2d(x, 128, 3)

myfunc(x1)  # will inherit parent scope reuse
myfunc(x2)  # will reuse
with tf.variable_scope('newscope'):
    myfunc(x3)  # will inherit parent scope reuse
    myfunc(x4)  # will reuse

Return a context which either opens and caches a new top-level name scope, or re-enters an existing one.


The name scope will always be top-level. It will not be nested under any existing name scope of the caller.

tensorpack.tfutils.optimizer module

tensorpack.tfutils.optimizer.apply_grad_processors(opt, gradprocs)[source]

Wrapper around optimizers to apply gradient processors.

  • opt (tf.train.Optimizer) –

  • gradprocs (list[GradientProcessor]) – gradient processors to add to the optimizer.


Returns:a tf.train.Optimizer instance which runs the gradient processors before updating the variables.
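Conceptually, the wrapper chains the processors over the gradient list before the optimizer applies the result. A minimal sketch with toy processors, no TensorFlow:

```python
def apply_grad_processors_sketch(grads, gradprocs):
    """Feed the (grad, var) list through each processor in order,
    as the wrapped optimizer does before applying the result."""
    for proc in gradprocs:
        grads = proc(grads)
    return grads

# Toy processors standing in for FilterNoneGrad and a MapGradient:
drop_none = lambda gs: [(g, v) for g, v in gs if g is not None]
double = lambda gs: [(g * 2, v) for g, v in gs]
```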

class tensorpack.tfutils.optimizer.ProxyOptimizer(opt, name='ProxyOptimizer')[source]

Bases: tensorflow.python.training.optimizer.Optimizer

A transparent proxy which delegates all methods of tf.train.Optimizer

apply_gradients(*args, **kwargs)[source]
compute_gradients(*args, **kwargs)[source]
get_slot(*args, **kwargs)[source]
get_slot_names(*args, **kwargs)[source]
class tensorpack.tfutils.optimizer.PostProcessOptimizer(opt, func, colocate=True)[source]

Bases: tensorpack.tfutils.optimizer.ProxyOptimizer

An optimizer which applies some “post-processing operation” per variable (e.g. clipping, quantization) after the gradient update.

__init__(opt, func, colocate=True)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Operation or None) – the operation to perform for this variable after the gradient update.

  • colocate (boolean) – colocate the function with the variable.

apply_gradients(grads_and_vars, global_step=None, name=None)[source]
class tensorpack.tfutils.optimizer.VariableAssignmentOptimizer(opt, func)[source]

Bases: tensorpack.tfutils.optimizer.PostProcessOptimizer

An optimizer which assigns each variable a new value (e.g. clipping, quantization) after the gradient update.

__init__(opt, func)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Tensor or None) – the new value to be assigned to this variable after the gradient update.

class tensorpack.tfutils.optimizer.AccumGradOptimizer(opt, niter)[source]

Bases: tensorpack.tfutils.optimizer.ProxyOptimizer

An optimizer which accumulates gradients over \(k\) minimize() calls, and applies them together on every \(k\)-th minimize() call. This is equivalent to using a \(k\) times larger batch size together with a \(k\) times larger learning rate, but uses much less memory.

__init__(opt, niter)[source]
  • opt (tf.train.Optimizer) – the underlying sub-optimizer.

  • niter (int) – number of iterations to accumulate gradients.

apply_gradients(grads_and_vars, global_step=None, name=None)[source]
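The accumulation logic can be sketched in plain Python. The real optimizer keeps the accumulators as TF variables and builds conditional ops; `AccumGradSketch` below is illustrative only:

```python
class AccumGradSketch:
    """Accumulate gradients over niter minimize() calls, then apply
    their sum once and reset the accumulators."""
    def __init__(self, apply_fn, niter):
        self.apply_fn = apply_fn   # receives the summed gradients
        self.niter = niter
        self.counter = 0
        self.accum = None

    def minimize(self, grads):
        if self.accum is None:
            self.accum = [0.0] * len(grads)
        self.accum = [a + g for a, g in zip(self.accum, grads)]
        self.counter += 1
        if self.counter == self.niter:
            self.apply_fn(self.accum)
            self.accum = [0.0] * len(grads)
            self.counter = 0
```

Only every `niter`-th call triggers an actual update, which is why the memory cost stays constant while the effective batch size grows.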

tensorpack.tfutils.sesscreate module

class tensorpack.tfutils.sesscreate.NewSessionCreator(target='', graph=None, config=None)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

__init__(target='', graph=None, config=None)[source]
  • target, graph, config – same as Session.__init__().

  • config – defaults to tfutils.get_default_sess_config()

class tensorpack.tfutils.sesscreate.ReuseSessionCreator(sess)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

Parameters:sess (tf.Session) – the session to reuse
class tensorpack.tfutils.sesscreate.SessionCreatorAdapter(session_creator, func)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

__init__(session_creator, func)[source]
  • session_creator (tf.train.SessionCreator) – a session creator

  • func (tf.Session -> tf.Session) – takes a session created by session_creator, and returns a new session to be returned by self.create_session.


tensorpack.tfutils.summary module

tensorpack.tfutils.summary.add_tensor_summary(x, types, name=None, collections=None, main_tower_only=True)[source]

Summarize a tensor by different methods.

  • x (tf.Tensor) – a tensor to summarize

  • types (list[str]) – summary types, can be scalar/histogram/sparsity/mean/rms

  • name (str) – summary name. Defaults to be the op name.

  • collections (list[str]) – collections of the summary ops.

  • main_tower_only (bool) – Only run under main training tower. If set to True, calling this function under other TowerContext has no effect.


with tf.name_scope('mysummaries'):  # to not mess up tensorboard
    add_tensor_summary(
        tensor, ['histogram', 'rms', 'sparsity'], name='mytensor')
tensorpack.tfutils.summary.add_param_summary(*summary_lists, **kwargs)[source]

Add summary ops for all trainable variables matching the regex, under a reused ‘param-summary’ name scope. This function is a no-op if not calling from main training tower.

  • summary_lists (list) – each is (regex, [list of summary type]). Summary type is defined in add_tensor_summary().

  • collections (list[str]) – collections of the summary ops.


add_param_summary(
    ('.*/W', ['histogram', 'rms']),
    ('.*/gamma', ['scalar'])
)
tensorpack.tfutils.summary.add_activation_summary(x, types=None, name=None, collections=None)[source]

Call add_tensor_summary() under a reused ‘activation-summary’ name scope. This function is a no-op if not calling from main training tower.

  • x (tf.Tensor) – the tensor to summarize.

  • types (list[str]) – summary types, defaults to ['sparsity', 'rms', 'histogram'].

  • name (str) – if is None, use x.name.

  • collections (list[str]) – collections of the summary ops.

tensorpack.tfutils.summary.add_moving_summary(*args, **kwargs)[source]

Add moving average summary for some tensors. This function is a no-op if not calling from main training tower.

  • args – tensors to summarize

  • decay (float) – the decay rate. Defaults to 0.95.

  • collection (str or None) – the name of the collection to add EMA-maintaining ops. The default will work together with the default MovingAverageSummary callback.



Returns:list of tensors returned by assign_moving_average(), which can be used to maintain the EMA.
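Each maintained EMA follows the standard update rule `ema = decay * ema + (1 - decay) * value`; a one-line sketch (the function name is made up):

```python
def assign_moving_average_sketch(ema, value, decay=0.95):
    """One exponential-moving-average update step."""
    return decay * ema + (1 - decay) * value
```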

tensorpack.tfutils.symbolic_functions module

tensorpack.tfutils.symbolic_functions.accuracy(logits, label, topk=1, name='accuracy')[source]
  • logits – shape [B,C].

  • label – shape [B].

  • topk (int) – topk


Returns:a single scalar
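The computation can be sketched in pure Python on nested lists (`accuracy_sketch` is a stand-in, not tensorpack's symbolic implementation):

```python
def accuracy_sketch(logits, labels, topk=1):
    """Fraction of rows whose true label is among the topk largest logits."""
    correct = 0
    for row, label in zip(logits, labels):
        top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:topk]
        correct += label in top
    return correct / len(labels)
```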


Flatten the tensor except the first dimension.

tensorpack.tfutils.symbolic_functions.class_balanced_cross_entropy(pred, label, name='cross_entropy_loss')[source]

The class-balanced cross entropy loss, as in Holistically-Nested Edge Detection.

  • pred – of shape (b, …). the predictions in [0,1].

  • label – of the same shape. the ground truth in {0,1}.


Returns:the class-balanced cross entropy loss.
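Following the HED paper, the positive term is weighted by beta = #negatives / #total and the negative term by (1 - beta). A pure-Python sketch on flat lists (the function name and eps value are illustrative):

```python
import math

def class_balanced_cross_entropy_sketch(pred, label, eps=1e-12):
    """Cross entropy with the positive term weighted by
    beta = #negatives / #total and the negative term by (1 - beta)."""
    n = len(label)
    beta = (n - sum(label)) / n
    loss_pos = -beta * sum(y * math.log(p + eps)
                           for p, y in zip(pred, label)) / n
    loss_neg = -(1 - beta) * sum((1 - y) * math.log(1 - p + eps)
                                 for p, y in zip(pred, label)) / n
    return loss_pos + loss_neg
```

The weighting keeps the loss from being dominated by the (usually far more numerous) negative pixels.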

tensorpack.tfutils.symbolic_functions.class_balanced_sigmoid_cross_entropy(logits, label, name='cross_entropy_loss')[source]

This function accepts logits rather than predictions, and is more numerically stable than class_balanced_cross_entropy().


Flatten the tensor.

tensorpack.tfutils.symbolic_functions.get_scalar_var(name, init_value, summary=False, trainable=False)[source]

Get a scalar float variable with certain initial value

  • name (str) – name of the variable.

  • init_value (float) – initial value.

  • summary (bool) – whether to summary this variable.

  • trainable (bool) – trainable or not.


Returns:tf.Variable – the variable

Returns:a context where the gradient of tf.nn.relu() is replaced by guided back-propagation, as described in the paper Striving for Simplicity: The All Convolutional Net.
tensorpack.tfutils.symbolic_functions.prediction_incorrect(logits, label, topk=1, name='incorrect_vector')[source]
  • logits – shape [B,C].

  • label – shape [B].

  • topk (int) – topk


Returns:a float32 vector of length B with 0/1 values. 1 means an incorrect prediction.

tensorpack.tfutils.symbolic_functions.print_stat(x, message=None)[source]

A simple print Op that might be easier to use than tf.Print(). Use it like: x = print_stat(x, message='This is x').

tensorpack.tfutils.symbolic_functions.psnr(prediction, ground_truth, maxp=None, name='psnr')[source]

Peak Signal-to-Noise Ratio.

\[PSNR = 20 \cdot \log_{10}(MAX_p) - 10 \cdot \log_{10}(MSE)\]
  • prediction – a tf.Tensor representing the prediction signal.

  • ground_truth – another tf.Tensor with the same shape.

  • maxp – maximum possible pixel value of the image (255 for 8-bit images)


Returns:a scalar tensor representing the PSNR.
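The formula above translates directly; a pure-Python sketch on flat pixel lists (`psnr_sketch` is a made-up name):

```python
import math

def psnr_sketch(prediction, ground_truth, maxp=255.0):
    """PSNR = 20 * log10(maxp) - 10 * log10(MSE)."""
    mse = sum((p - g) ** 2
              for p, g in zip(prediction, ground_truth)) / len(prediction)
    return 20 * math.log10(maxp) - 10 * math.log10(mse)
```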

tensorpack.tfutils.symbolic_functions.rms(x, name=None)[source]
Returns:root mean square of tensor x.
tensorpack.tfutils.symbolic_functions.saliency_map(output, input, name='saliency_map')[source]

Produce a saliency map as described in the paper: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. The saliency map is the gradient of the max element in output w.r.t input.

Returns:tf.Tensor – the saliency map. Has the same shape as input.
tensorpack.tfutils.symbolic_functions.shapeless_placeholder(x, axis, name)[source]

Make the static shape of a tensor less specific.

When you feed a value to a tensor, the shape of the value must match the tensor’s static shape. This function creates a placeholder which defaults to x if not fed, but has a less specific static shape than x. See also tensorflow#5680.

  • x – a tensor

  • axis (int or list of ints) – these axes of x.get_shape() will become None in the output.

  • name (str) – name of the output tensor


Returns:a tensor equal to x, but with its shape information partially cleared.

tensorpack.tfutils.varmanip module

class tensorpack.tfutils.varmanip.SessionUpdate(sess, vars_to_update)[source]

Bases: object

Update the variables in a session

__init__(sess, vars_to_update)[source]
  • sess (tf.Session) – a session object

  • vars_to_update – a collection of variables to update

static load_value_to_var(var, val, strict=False)[source]

Call var.load(val) with the default session.

  • var (tf.Variable) –

  • strict (bool) – behave less strictly if set to False.

Parameters:prms (dict) – dict of {variable name: value}. Any name in prms must exist both in the graph and in vars_to_update.

Dump value of all TRAINABLE + MODEL variables to a dict, and save as npy/npz format (loadable by DictRestore).

Parameters:path (str) – the file name to save the parameters to. Must end with npy or npz.

Dump all variables from a checkpoint to a dict.

Parameters:model_path (str) – path to a checkpoint.
Returns:dict – a name:value dict

Work around TF problems in checkpoint path handling.

Parameters:model_path – a user-input path
Returns:str – the argument that can be passed to NewCheckpointReader

tensorpack.tfutils.varreplace module


Return a context where all trainable variables (reused or not) returned by get_variable will have no gradients (they will be wrapped by tf.stop_gradient). They will still be in the TRAINABLE_VARIABLES collection, so they will get saved correctly. This is useful for fixing certain variables during fine-tuning.


with varreplace.freeze_variable():
    x = FullyConnected('fc', x, 1000)   # fc/* will not be trained

Use fn to map the output of any variable getter.

Parameters:fn (tf.Variable -> tf.Tensor) –
Returns:a context where all the variables will be mapped by fn.


with varreplace.remap_variables(lambda var: quantize(var)):
    x = FullyConnected('fc', x, 1000)   # fc/{W,b} will be quantized

Module contents

tensorpack.tfutils.argscope(layers, **kwargs)[source]
Parameters:layers (list or layer) – layer or list of layers to apply the arguments.
Returns:a context where all appearances of these layers will by default have the arguments specified by kwargs.


with argscope(Conv2D, kernel_shape=3, nl=tf.nn.relu, out_channel=32):
    x = Conv2D('conv0', x)
    x = Conv2D('conv1', x)
    x = Conv2D('conv2', x, out_channel=64)  # override argscope
Returns:dict – the current argscope.

An argscope is a dict of dict: dict[layername] = {arg: val}
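The dict-of-dicts behavior can be sketched as a stack of scopes pushed and popped by a context manager. This is a simplification of tensorpack's actual implementation, with layers identified by name strings:

```python
import contextlib

_ARGSCOPE = [{}]   # stack of argscopes: dict[layername] = {arg: val}

@contextlib.contextmanager
def argscope(layers, **kwargs):
    """Push default kwargs for the given layers; pop them on exit."""
    if not isinstance(layers, list):
        layers = [layers]
    new_scope = {name: dict(args) for name, args in _ARGSCOPE[-1].items()}
    for layer in layers:
        new_scope.setdefault(layer, {}).update(kwargs)
    _ARGSCOPE.append(new_scope)
    try:
        yield
    finally:
        _ARGSCOPE.pop()

def get_arg_scope():
    """Return the current (innermost) argscope."""
    return _ARGSCOPE[-1]
```

Nested argscopes copy and update the outer scope, which is why an inner kwarg overrides an outer one only for the duration of the inner context.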


Return a better session config to use as the default. TensorFlow’s default session config consumes too many resources.

Parameters:mem_fraction (float) – fraction of memory to use.
Returns:tf.ConfigProto – the config to use.
Returns:int – the global_step value in the current graph and session
Returns:tf.Tensor – the global_step variable in the current graph. Creates one if it doesn’t exist.
class tensorpack.tfutils.SessionInit[source]

Bases: object

Base class for utilities that initialize an (existing) session.


Initialize a session

Parameters:sess (tf.Session) – the session
class tensorpack.tfutils.ChainInit(sess_inits)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Initialize a session by a list of SessionInit instance, executed one by one. This can be useful for, e.g., loading several models from different files to form a composition of models.

Parameters:sess_inits (list[SessionInit]) – list of SessionInit instances.
class tensorpack.tfutils.SaverRestore(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore a tensorflow checkpoint saved by tf.train.Saver or ModelSaver.

__init__(model_path, prefix=None, ignore=[])[source]
  • model_path (str) – a model name (model-xxxx) or a checkpoint file.

  • prefix (str) – during restore, add a prefix/ for every variable in this checkpoint

  • ignore (list[str]) – list of tensor names that should be ignored during loading, e.g. learning-rate

class tensorpack.tfutils.SaverRestoreRelaxed(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SaverRestore

Same as SaverRestore, but has more relaxed constraints.

It allows upcasting certain variables, or reshaping certain variables when there is a mismatch that can be fixed. Another advantage is that it doesn’t add any new ops to the graph. However, it is also slower than SaverRestore.

class tensorpack.tfutils.DictRestore(variable_dict)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore variables from a dictionary.

Parameters:variable_dict (dict) – a dict of {name: value}
class tensorpack.tfutils.JustCurrentSession[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

This is a no-op placeholder.


Get a corresponding model loader by looking at the file name.

Returns:SessInit – either a DictRestore (if name ends with ‘npy/npz’) or SaverRestore (otherwise).

Try to load the latest checkpoint from logger.LOG_DIR, if one exists.

Returns:SessInit – either a JustCurrentSession, or a SaverRestore.
class tensorpack.tfutils.TowerContext(tower_name, is_training=None, index=0, use_vs=False)[source]

Bases: object

A context in which the current model is being built.

__init__(tower_name, is_training=None, index=0, use_vs=False)[source]
  • tower_name (str) – The name scope of the tower.

  • is_training (bool) – if None, automatically determine from tower_name.

  • index (int) – index of this tower, only used in training.

  • use_vs (bool) – Open a new variable scope with this name.


Filter the list and only keep those under the current variable scope. If this tower doesn’t contain its own variable scope, return the list as-is.

Parameters:varlist (list[tf.Variable] or list[tf.Tensor]) –

Whether this tower is supposed to have its own variables.