tensorpack.tfutils package

tensorpack.tfutils.collection module

Parameters:keys (list) – list of collection keys to back up. Defaults to all keys in the graph.
Returns:dict – the backup

Restore from a collection backup.

Parameters:backup (dict) –
Parameters:keys (list) – list of collection keys to freeze.
Returns:a context in which the collections are restored to their initial state on exit.
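The backup, restore and freeze behaviour described above can be illustrated without TensorFlow. This is only a sketch under stated assumptions: `COLLECTIONS` is a plain dict standing in for the graph's collections, and the helper names `backup`, `restore` and `freeze` are hypothetical, not tensorpack's own functions.

```python
import copy
from contextlib import contextmanager

# Hypothetical stand-in for graph collections: {key: list of items}.
COLLECTIONS = {"losses": ["l2"], "summaries": ["s0"]}

def backup(keys=None):
    """Copy the selected collections (all of them if keys is None)."""
    if keys is None:
        keys = list(COLLECTIONS)
    return {k: copy.copy(COLLECTIONS[k]) for k in keys}

def restore(saved):
    """Overwrite each backed-up collection with its saved content."""
    for k, items in saved.items():
        COLLECTIONS[k] = copy.copy(items)

@contextmanager
def freeze(keys):
    """Restore the given collections to their initial state on exit."""
    saved = backup(keys)
    try:
        yield
    finally:
        restore(saved)

with freeze(["losses"]):
    COLLECTIONS["losses"].append("temp")    # visible only inside the context
    COLLECTIONS["summaries"].append("s1")   # not frozen, so it persists
```

After the `with` block, only the frozen collection has been rolled back.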

tensorpack.tfutils.gradproc module

class tensorpack.tfutils.gradproc.GradientProcessor[source]

Bases: object

Base class for all gradient processors.

Subclasses should override the _process() method.


Process the symbolic gradients.

Parameters:grads (list) – list of (grad, var).
Returns:list – processed gradients, with the same type as input.
class tensorpack.tfutils.gradproc.FilterNoneGrad(verbose=True)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Skip the update and print a warning (instead of crashing) when the gradient of a variable is None.

Parameters:verbose (bool) – whether to print warning about None gradients.
class tensorpack.tfutils.gradproc.GlobalNormClip(global_norm)[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Clip by global norm. The global norm is the square root of the sum of the squared L2 norms of all gradients.

See tf.clip_by_global_norm() for more information.

Parameters:global_norm (float) – the threshold to clip with.
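As a rough illustration of what clipping by global norm computes, here is a plain-Python sketch that follows the formula documented for tf.clip_by_global_norm(). Gradients are modelled as flat lists of floats, and the function name is only for illustration.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Scale every gradient by clip_norm / max(global_norm, clip_norm)."""
    # Global norm: square root of the sum of squares over all gradients.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads], global_norm

grads = [[3.0, 0.0], [0.0, 4.0]]   # global norm = sqrt(9 + 16) = 5
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
unchanged, _ = clip_by_global_norm(grads, clip_norm=10.0)
```

When the global norm is already below the threshold, the scale factor is 1 and the gradients pass through unchanged.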
class tensorpack.tfutils.gradproc.MapGradient(func, regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.GradientProcessor

Apply a function to every gradient whose variable name matches regex. Keep the other gradients unchanged.

__init__(func, regex='.*')[source]
  • func – takes a grad or (grad, var) pair and returns a grad. If it returns None, the gradient is discarded (hence no update to the variable will happen).

  • regex (str) – used to match variables. Defaults to match all variables.
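The matching and discarding behaviour can be sketched in plain Python. This is not the library's implementation: variables are represented only by their names, `map_gradient` is a hypothetical helper, and matching the whole name with `re.fullmatch` is an assumption.

```python
import re

def map_gradient(grads_and_vars, func, regex=".*"):
    """Apply func to gradients whose variable name matches regex.

    Assumption: the whole name must match (re.fullmatch). A None result
    drops the pair, so that variable receives no update.
    """
    out = []
    for grad, name in grads_and_vars:
        if re.fullmatch(regex, name):
            grad = func(grad)
            if grad is None:
                continue
        out.append((grad, name))
    return out

pairs = [(1.0, "conv0/W"), (2.0, "conv0/b"), (3.0, "fc/W")]
halved = map_gradient(pairs, lambda g: g * 0.5, regex=".*/W")
dropped = map_gradient(pairs, lambda g: None, regex="fc/.*")
```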

class tensorpack.tfutils.gradproc.SummaryGradient(regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

For each gradient tensor, summarize its histogram and add it to moving summaries.

Parameters:regex (str) – same as in MapGradient.
class tensorpack.tfutils.gradproc.PrintGradient(regex='.*')[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Print the gradients every step with symbolic_functions.print_stat().

Parameters:regex (str) – same as in MapGradient.
class tensorpack.tfutils.gradproc.CheckGradient[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Check for numeric issues. See tf.check_numerics() for more information.

class tensorpack.tfutils.gradproc.ScaleGradient(multipliers, verbose=True, log=None)[source]

Bases: tensorpack.tfutils.gradproc.MapGradient

Scale certain gradients by a multiplier.

__init__(multipliers, verbose=True, log=None)[source]
  • multipliers (tuple or list) – tuple of (regex, float), or list of tuples.

  • verbose (bool) – whether to print logs or not

  • log – deprecated


Use double the learning rate for all biases (as in Caffe):

ScaleGradient(('.*/b', 2))

tensorpack.tfutils.tower module

class tensorpack.tfutils.tower.TowerContext(tower_name, is_training, index=0, vs_name='')[source]

Bases: object

A context in which the current model is being built.

__init__(tower_name, is_training, index=0, vs_name='')[source]
  • tower_name (str) – The name scope of the tower.

  • is_training (bool) –

  • index (int) – index of this tower, only used in training.

  • vs_name (str) – Open a new variable scope with this name.


Get items from this collection that are added in the current tower.


Whether this tower is supposed to have its own variables.

class tensorpack.tfutils.tower.TowerFuncWrapper(tower_fn, inputs_desc)[source]

Bases: object

A wrapper around a tower function (a function which builds one tower, i.e. one replica of the model). It keeps track of the name scope, variable scope and input/output tensors each time the function is called.

TowerTrainer needs this so that it knows how to build a predictor.

__init__(tower_fn, inputs_desc)[source]
  • tower_fn – a function which builds one tower in the graph. It takes several input tensors and could return anything.

  • inputs_desc ([InputDesc]) – use this to figure out the right name for the input tensors.


Returns:a TowerTensorHandles object, which can access the tower handles by either indices or names.

class tensorpack.tfutils.tower.TowerTensorHandle(ctx, input, output, inputs_desc=None)[source]

Bases: object

When a function is called multiple times under each tower, it becomes hard to keep track of the scope and access those tensors in each tower. This class provides easy access to the tensors as well as the inputs/outputs created in each tower.


Get a tensor in this tower. The name can be: 1. The name of the tensor without any tower prefix. 2. The name of an InputDesc, if it is used when building the tower.


Get a variable used in this tower.


The list of input tensors used to build the tower.


The output returned by the tower function.

class tensorpack.tfutils.tower.TowerTensorHandles(handles)[source]

Bases: object

Wrap a list of TowerTensorHandle objects to support access to them by index or name.

Parameters:name_or_index (str or int) –
Returns:a TowerTensorHandle.
Returns:A TowerTensorHandles, containing only the inference towers.
Returns:A TowerTensorHandles, containing only the training towers.

tensorpack.tfutils.scope_utils module


A decorator which automatically reuses the current variable scope if the function has been called with the same variable scope before.


# myfunc is assumed to be wrapped with the decorator described above
def myfunc(x):
    return tf.layers.conv2d(x, 128, 3)

myfunc(x1)  # will inherit parent scope reuse
myfunc(x2)  # will reuse
with tf.variable_scope('newscope'):
    myfunc(x3)  # will inherit parent scope reuse
    myfunc(x4)  # will reuse
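The idea behind the decorator can be sketched without TensorFlow. In this simplified model the variable scope is passed explicitly as a string, `auto_reuse` is a hypothetical name, and the first call in a scope simply gets `reuse=False` instead of inheriting the parent scope's setting.

```python
import functools

def auto_reuse(func):
    """Call func with reuse=True once it has been called in the same scope."""
    seen_scopes = set()

    @functools.wraps(func)
    def wrapper(scope, *args):
        reuse = scope in seen_scopes   # True from the second call onwards
        seen_scopes.add(scope)
        return func(scope, *args, reuse=reuse)

    return wrapper

@auto_reuse
def build(scope, x, reuse):
    # Hypothetical builder: just reports the scope and the reuse flag.
    return (scope, reuse)

calls = [build("", 1), build("", 2), build("newscope", 3), build("newscope", 4)]
```

Each scope is tracked separately, so entering a new scope starts again with a non-reusing call.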
tensorpack.tfutils.scope_utils.cached_name_scope(name, top_level=True)[source]

Return a context which either opens and caches a new name scope, or reenters an existing one.

Parameters:top_level (bool) – if True, the name scope will always be top-level. It will not be nested under any existing name scope of the caller.
Returns:A decorator which makes the function happen under a name scope, which is named by the function itself.


# rms is assumed to be wrapped with the decorator returned above
def rms(x):
    return tf.sqrt(tf.reduce_mean(tf.square(x)))  # will be under name scope 'rms'


Add a reuse option.

tensorpack.tfutils.optimizer module

tensorpack.tfutils.optimizer.apply_grad_processors(opt, gradprocs)[source]

Wrapper around optimizers to apply gradient processors.

  • opt (tf.train.Optimizer) –

  • gradprocs (list[GradientProcessor]) – gradient processors to add to the optimizer.


Returns:a tf.train.Optimizer instance which runs the gradient processors before updating the variables.

class tensorpack.tfutils.optimizer.ProxyOptimizer(opt, name='ProxyOptimizer')[source]

Bases: tensorflow.python.training.optimizer.Optimizer

A transparent proxy which delegates all methods of tf.train.Optimizer.

apply_gradients(*args, **kwargs)[source]
compute_gradients(*args, **kwargs)[source]
get_slot(*args, **kwargs)[source]
get_slot_names(*args, **kwargs)[source]
class tensorpack.tfutils.optimizer.PostProcessOptimizer(opt, func, colocate=True)[source]

Bases: tensorpack.tfutils.optimizer.ProxyOptimizer

An optimizer which applies some “post-processing operation” per variable (e.g. clipping, quantization) after the gradient update.

__init__(opt, func, colocate=True)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Operation or None) – the operation to perform on this variable after the gradient update.

  • colocate (boolean) – colocate the function with the variable.

apply_gradients(grads_and_vars, global_step=None, name=None)[source]
class tensorpack.tfutils.optimizer.VariableAssignmentOptimizer(opt, func)[source]

Bases: tensorpack.tfutils.optimizer.PostProcessOptimizer

An optimizer which assigns each variable a new value (e.g. clipping, quantization) after the gradient update.

__init__(opt, func)[source]
  • opt (tf.train.Optimizer) –

  • func (tf.Variable -> tf.Tensor or None) – the new value to be assigned to this variable after the gradient update.
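A minimal sketch of the "update, then assign a new value" pattern, written in plain Python rather than against the optimizer API. Parameters are plain floats keyed by name, and the helper names and the clipping rule are hypothetical.

```python
def sgd_step_with_assignment(params, grads, lr, func):
    """One plain SGD step, then replace each value by func(name, value).

    func returns the new value, or None to leave the parameter as is.
    """
    updated = {}
    for name, value in params.items():
        value = value - lr * grads[name]          # the gradient update
        new_value = func(name, value)             # the post-update assignment
        updated[name] = value if new_value is None else new_value
    return updated

def clip_weights(name, value):
    # Hypothetical rule: clip weights to [-1, 1], leave everything else alone.
    if name.endswith("/W"):
        return max(-1.0, min(1.0, value))
    return None

params = {"fc/W": 0.75, "fc/b": 0.75}
grads = {"fc/W": -0.5, "fc/b": -0.5}
result = sgd_step_with_assignment(params, grads, lr=1.0, func=clip_weights)
```

Both parameters move to 1.25 after the step, but only the weight is then clipped back to 1.0.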

class tensorpack.tfutils.optimizer.AccumGradOptimizer(opt, niter)[source]

Bases: tensorpack.tfutils.optimizer.ProxyOptimizer

An optimizer which accumulates gradients across \(k\) minimize() calls and applies them together in every \(k\)-th minimize() call. This is equivalent to using a \(k\) times larger batch size plus a \(k\) times larger learning rate, but uses much less memory.

Note that this implementation may not support all models. E.g., it doesn’t support sparse gradient update.
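The stated equivalence can be checked on a toy example. The sketch below assumes the accumulated gradients are summed (which is what the equivalence implies) and uses fixed scalar gradients for illustration. Both function names are hypothetical.

```python
def accumulated_sgd(w, grads, lr, k):
    """Accumulate k gradients, then apply their sum in a single SGD step."""
    acc, steps = 0.0, 0
    for g in grads:
        acc += g
        steps += 1
        if steps % k == 0:
            w -= lr * acc          # apply the summed gradient every k-th call
            acc = 0.0
    return w

def large_batch_sgd(w, grads, lr, k):
    """SGD on the mean gradient of each k-sized batch, with a k-times larger rate."""
    for i in range(0, len(grads), k):
        batch = grads[i:i + k]
        w -= (k * lr) * (sum(batch) / k)
    return w

grads = [0.5, 1.5, 1.0, 3.0]       # fixed per-call gradients, for illustration only
a = accumulated_sgd(1.0, grads, lr=0.25, k=2)
b = large_batch_sgd(1.0, grads, lr=0.25, k=2)
```

Both routes end at the same parameter value, while the accumulating version only ever holds one small batch at a time.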

__init__(opt, niter)[source]
  • opt (tf.train.Optimizer) – the underlying sub-optimizer.

  • niter (int) – number of iterations to accumulate gradients.

apply_gradients(grads_and_vars, global_step=None, name=None)[source]

tensorpack.tfutils.sesscreate module

class tensorpack.tfutils.sesscreate.NewSessionCreator(target='', graph=None, config=None)[source]

Bases: tensorflow.python.training.monitored_session.ChiefSessionCreator

__init__(target='', graph=None, config=None)[source]
  • target, graph, config – same as Session.__init__().

  • config – defaults to tfutils.get_default_sess_config()

class tensorpack.tfutils.sesscreate.ReuseSessionCreator(sess)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

Parameters:sess (tf.Session) – the session to reuse
class tensorpack.tfutils.sesscreate.SessionCreatorAdapter(session_creator, func)[source]

Bases: tensorflow.python.training.monitored_session.SessionCreator

__init__(session_creator, func)[source]
  • session_creator (tf.train.SessionCreator) – a session creator

  • func (tf.Session -> tf.Session) – takes a session created by session_creator, and returns a new session to be returned by self.create_session.


tensorpack.tfutils.sessinit module

class tensorpack.tfutils.sessinit.SessionInit[source]

Bases: object

Base class for utilities to load variables into an (existing) session.


Initialize a session

Parameters:sess (tf.Session) – the session
class tensorpack.tfutils.sessinit.ChainInit(sess_inits)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Initialize a session with a list of SessionInit instances, executed one by one. This can be useful for, e.g., loading several models from different files to form a composition of models.

Parameters:sess_inits (list[SessionInit]) – list of SessionInit instances.
class tensorpack.tfutils.sessinit.SaverRestore(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore a TensorFlow checkpoint saved by tf.train.Saver or ModelSaver.

__init__(model_path, prefix=None, ignore=[])[source]
  • model_path (str) – a model name (model-xxxx) or a checkpoint file.

  • prefix (str) – during restore, add a prefix/ to the name of every variable in this checkpoint

  • ignore (list[str]) – list of tensor names that should be ignored during loading, e.g. learning-rate

class tensorpack.tfutils.sessinit.SaverRestoreRelaxed(model_path, prefix=None, ignore=[])[source]

Bases: tensorpack.tfutils.sessinit.SaverRestore

Same as SaverRestore, but has more relaxed constraints.

It allows upcasting or reshaping certain variables when there is a mismatch that can be fixed. Another advantage is that it doesn’t add any new ops to the graph. But it is also slower than SaverRestore.

class tensorpack.tfutils.sessinit.DictRestore(variable_dict)[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

Restore variables from a dictionary.

Parameters:variable_dict (dict) – a dict of {name: value}
class tensorpack.tfutils.sessinit.JustCurrentSession[source]

Bases: tensorpack.tfutils.sessinit.SessionInit

This is a no-op placeholder


Get a corresponding model loader by looking at the file name.

Returns:SessionInit – either a DictRestore (if the name ends with ‘npy’ or ‘npz’) or a SaverRestore (otherwise).
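The file-name dispatch can be pictured as follows. This sketch only returns the name of the class that would be chosen. `pick_loader` is a hypothetical helper, and matching on the `.npy` / `.npz` suffix including the dot is an assumption.

```python
def pick_loader(filename):
    """Return the name of the loader class suggested by the file name."""
    if filename.endswith((".npy", ".npz")):
        return "DictRestore"       # parameters stored as a NumPy dict
    return "SaverRestore"          # anything else is treated as a checkpoint

choices = [pick_loader(p) for p in ("model.npz", "weights.npy", "train_log/model-10000")]
```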

tensorpack.tfutils.summary module

tensorpack.tfutils.summary.add_tensor_summary(x, types, name=None, collections=None, main_tower_only=True)[source]

Summarize a tensor by different methods.

  • x (tf.Tensor) – a tensor to summarize

  • types (list[str]) – summary types, can be scalar/histogram/sparsity/mean/rms

  • name (str) – summary name. Defaults to be the op name.

  • collections (list[str]) – collections of the summary ops.

  • main_tower_only (bool) – Only run under main training tower. If set to True, calling this function under other TowerContext has no effect.


with tf.name_scope('mysummaries'):  # to not mess up tensorboard
    add_tensor_summary(
        tensor, ['histogram', 'rms', 'sparsity'], name='mytensor')
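For readers unfamiliar with the scalar summary types, the sketch below shows what 'mean', 'rms' and 'sparsity' would compute on a flat list of values. It assumes 'rms' means root mean square and 'sparsity' means the fraction of exactly-zero entries. The function names are illustrative only.

```python
import math

def mean(values):
    return sum(values) / len(values)

def rms(values):
    """Root mean square of the values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)

activations = [3.0, 0.0, 4.0, 0.0]
stats = {"mean": mean(activations), "rms": rms(activations), "sparsity": sparsity(activations)}
```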
tensorpack.tfutils.summary.add_param_summary(*summary_lists, **kwargs)[source]

Add summary ops for all trainable variables matching the regex, under a reused ‘param-summary’ name scope. This function is a no-op if not called from the main training tower.

  • summary_lists (list) – each is (regex, [list of summary type]). Summary type is defined in add_tensor_summary().

  • collections (list[str]) – collections of the summary ops.


add_param_summary(
    ('.*/W', ['histogram', 'rms']),
    ('.*/gamma', ['scalar']),
)
tensorpack.tfutils.summary.add_activation_summary(x, types=None, name=None, collections=None)[source]

Call add_tensor_summary() under a reused ‘activation-summary’ name scope. This function is a no-op if not called from the main training tower.

  • x (tf.Tensor) – the tensor to summarize.

  • types (list[str]) – summary types, defaults to ['sparsity', 'rms', 'histogram'].

  • name (str) – if None, use x.name.

  • collections (list[str]) – collections of the summary ops.

tensorpack.tfutils.summary.add_moving_summary(*args, **kwargs)[source]

Add moving average summaries for some tensors. This function is a no-op if not called from the main training tower.

  • args – tensors to summarize

  • decay (float) – the decay rate. Defaults to 0.95.

  • collection (str or None) – the name of the collection to add EMA-maintaining ops. The default will work together with the default MovingAverageSummary callback.



Returns:list of tensors returned by assign_moving_average, which can be used to maintain the EMA.
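The exponential moving average (EMA) behind a moving summary follows a simple recurrence. The sketch below shows the standard update with the decay as a parameter. It ignores any bias correction or other details the library may apply, and `update_ema` is a hypothetical helper.

```python
def update_ema(ema, value, decay=0.95):
    """One exponential-moving-average step."""
    return decay * ema + (1.0 - decay) * value

history = []
ema = 0.0
for loss in [4.0, 0.0, 2.0]:
    ema = update_ema(ema, loss, decay=0.5)   # small decay so the effect is visible
    history.append(ema)
```

A decay close to 1, such as the default 0.95, makes the summary change more slowly and smooths out per-step noise.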

tensorpack.tfutils.varmanip module

class tensorpack.tfutils.varmanip.SessionUpdate(sess, vars_to_update)[source]

Bases: object

Update the variables in a session

__init__(sess, vars_to_update)[source]
  • sess (tf.Session) – a session object

  • vars_to_update – a collection of variables to update

static load_value_to_var(var, val, strict=False)[source]

Call var.load(val) with the default session.

  • var (tf.Variable) –

  • strict (bool) – Behave less strictly if set to False.

Parameters:prms (dict) – dict of {variable name: value}. Any name in prms must be in the graph and in vars_to_update.

Dump value of all TRAINABLE + MODEL variables to a dict, and save as npz format (loadable by DictRestore).

Parameters:path (str) – the file name to save the parameters. Must end with npz.

Load all variables from a checkpoint to a dict.

Parameters:model_path (str) – path to a checkpoint.
Returns:dict – a name:value dict

Work around TF problems in checkpoint path handling.

Parameters:model_path – a user-input path
Returns:str – the argument that can be passed to NewCheckpointReader

tensorpack.tfutils.varreplace module

tensorpack.tfutils.varreplace.freeze_variables(stop_gradient=True, skip_collection=False)[source]

Return a context to freeze variables, by wrapping tf.get_variable with a custom getter. It works by either applying tf.stop_gradient on the variables, or by keeping them out of the TRAINABLE_VARIABLES collection, or both.


with varreplace.freeze_variables(stop_gradient=False, skip_collection=True):
    x = FullyConnected('fc', x, 1000)   # fc/* will not be trained
  • stop_gradient (bool) – if True, variables returned from get_variable will be wrapped with tf.stop_gradient and therefore have no gradient when used later. Note that the created variables may still have gradients when accessed by other approaches (e.g. by name, or by collection).

  • skip_collection (bool) – if True, do not add the variable to TRAINABLE_VARIABLES collection. As a result they will not be trained by default.


Use fn to map the output of any variable getter.

Parameters:fn (tf.Variable -> tf.Tensor) –
Returns:a context where all the variables will be mapped by fn.


with varreplace.remap_variables(lambda var: quantize(var)):
    x = FullyConnected('fc', x, 1000)   # fc/{W,b} will be quantized
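The getter-mapping idea can be sketched without TensorFlow. Here `get_variable` reads from a plain dict, `remap` is a hypothetical context manager, and rounding stands in for a quantization function.

```python
from contextlib import contextmanager

_getter_maps = []                  # stack of active mapping functions

def get_variable(name, store):
    """Look up a value and pass it through every active mapping function."""
    value = store[name]
    for fn in _getter_maps:
        value = fn(value)
    return value

@contextmanager
def remap(fn):
    """Within this context, every value returned by get_variable is mapped by fn."""
    _getter_maps.append(fn)
    try:
        yield
    finally:
        _getter_maps.pop()

store = {"fc/W": 2.6}
with remap(lambda v: float(round(v))):     # stand-in for a quantizer
    inside = get_variable("fc/W", store)
outside = get_variable("fc/W", store)
```

The stored value itself is untouched. Only what the getter hands back inside the context is changed.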

Other functions in tensorpack.tfutils module


Return a tf.ConfigProto to use as default session config. You can modify the returned config to fit your needs.

Parameters:mem_fraction (float) – fraction of memory to use.
Returns:tf.ConfigProto – the config to use.
Returns:the global_step variable in the current graph. It is created if it doesn’t exist.
Returns:int – global_step value in current graph and session
tfutils.argscope(layers, **kwargs)
Parameters:layers (list or layer) – layer or list of layers to apply the arguments.
Returns:a context where all appearances of these layers will by default have the arguments specified by kwargs.


with argscope(Conv2D, kernel_shape=3, nl=tf.nn.relu, out_channel=32):
    x = Conv2D('conv0', x)
    x = Conv2D('conv1', x)
    x = Conv2D('conv2', x, out_channel=64)  # override argscope
Returns:dict – the current argscope.

An argscope is a dict of dict: dict[layername] = {arg: val}
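The dict-of-dict structure can be made concrete with a small pure-Python model. This is a sketch of the idea rather than the library's code: the scope stack, `get_arg_scope` and the `conv2d` stand-in layer are hypothetical, and layers are keyed by their function name.

```python
from contextlib import contextmanager

_scope_stack = [{}]        # innermost scope last; each is {layername: {arg: val}}

def get_arg_scope():
    return _scope_stack[-1]

@contextmanager
def argscope(layers, **kwargs):
    if not isinstance(layers, (list, tuple)):
        layers = [layers]
    # Copy the enclosing scope so nested scopes extend it without mutating it.
    new_scope = {name: dict(args) for name, args in get_arg_scope().items()}
    for layer in layers:
        new_scope.setdefault(layer.__name__, {}).update(kwargs)
    _scope_stack.append(new_scope)
    try:
        yield
    finally:
        _scope_stack.pop()

def conv2d(name, **kwargs):
    """Hypothetical layer: merges scope defaults with explicit arguments."""
    args = dict(get_arg_scope().get("conv2d", {}))
    args.update(kwargs)                    # explicit arguments override the scope
    return args

with argscope(conv2d, kernel_shape=3, out_channel=32):
    conv0 = conv2d("conv0")
    conv2 = conv2d("conv2", out_channel=64)
after = conv2d("conv3")
```

Outside the `with` block the defaults no longer apply, which mirrors how the scope is only active inside its context.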