tensorpack.models package

tensorpack.models.BatchNorm(scope_name, inputs, training=None, momentum=0.9, epsilon=1e-05, center=True, scale=True, gamma_initializer=<tf.python.ops.init_ops.Ones object>, data_format='channels_last', internal_update=False)[source]

Mostly equivalent to tf.layers.batch_normalization, but differs in the following ways:

  1. Accepts data_format rather than axis. For 2D input, this argument will be ignored.

  2. Default value for momentum and epsilon is different.

  3. Default value for training is automatically obtained from TowerContext.

  4. Support the internal_update option.

Parameters:internal_update (bool) – if False, add EMA update ops to tf.GraphKeys.UPDATE_OPS. If True, update EMA inside the layer by control dependencies.

Variable Names:

  • beta: the bias term. Will be zero-inited by default.

  • gamma: the scale term. Will be one-inited by default. Input will be transformed by x * gamma + beta.

  • mean/EMA: the moving average of mean.

  • variance/EMA: the moving average of variance.

Note

  1. About multi-GPU training: moving averages across GPUs are not aggregated. Batch statistics are computed independently. This is consistent with most frameworks.

  2. Combinations of training and ctx.is_training:

    • training == ctx.is_training: standard BN. EMA are maintained during training and used during inference. This is the default.

    • training and not ctx.is_training: still use batch statistics in inference.

    • not training and ctx.is_training: use EMA to normalize in training. This is useful when you load a pre-trained BN and don't want to fine-tune the EMA. EMA will not be updated in this case.
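
Example of BatchNorm usage (a minimal sketch; it assumes a TF1-style graph built inside an active TowerContext, e.g. within ModelDesc.build_graph, so that training can be inferred, and the placeholder input is hypothetical):

import tensorflow as tf
from tensorpack.models import BatchNorm

image = tf.placeholder(tf.float32, [None, 32, 32, 3])    # hypothetical NHWC input
l = BatchNorm('bn0', image)                      # training mode inferred from TowerContext
l = BatchNorm('bn1', l, internal_update=True)    # EMA updated via control dependencies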

tensorpack.models.BatchRenorm(scope_name, x, rmax, dmax, momentum=0.9, epsilon=1e-05, center=True, scale=True, gamma_initializer=None, data_format='channels_last')[source]

Batch Renormalization layer, as described in the paper: Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models. This implementation is a wrapper around tf.layers.batch_normalization.

Parameters:
  • x (tf.Tensor) – a NHWC or NC tensor.

  • rmax, dmax – scalar tensors, the maximum allowed corrections.

  • momentum (float) – momentum (decay rate) of the moving average.

  • epsilon (float) – epsilon to avoid divide-by-zero.

  • use_scale, use_bias (bool) – whether to use the extra affine transformation or not.

Returns:

tf.Tensor – a tensor named output with the same shape as x.

Variable Names:

  • beta: the bias term.

  • gamma: the scale term. Input will be transformed by x * gamma + beta.

  • moving_mean, renorm_mean, renorm_mean_weight: See TF documentation.

  • moving_variance, renorm_stddev, renorm_stddev_weight: See TF documentation.

tensorpack.models.layer_register(log_shape=False, use_scope=True)[source]
Parameters:
  • log_shape (bool) – log input/output shape of this layer

  • use_scope (bool or None) – Whether to call this layer with an extra first argument as the scope name. When set to None, it can be called either with or without the scope name argument; it will figure this out by checking whether the first argument is a string or not.

Returns:

A decorator used to register a layer.

Examples:

import tensorflow as tf
from tensorpack.models import layer_register

@layer_register(use_scope=True)
def add10(x):
    return x + tf.get_variable('W', shape=[10])
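
With use_scope=True, the registered layer is then called with a scope name as its first argument (a sketch; the placeholder input is hypothetical):

x = tf.placeholder(tf.float32, [None, 10])   # hypothetical input
out = add10('add10_0', x)                    # creates the variable 'add10_0/W'
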
class tensorpack.models.VariableHolder(**kwargs)[source]

Bases: object

A proxy to access variables defined in a layer.

__init__(**kwargs)[source]
Parameters:kwargs – {name:variable}
all()[source]
Returns:list of all variables
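In practice you rarely construct a VariableHolder yourself; the output tensor of a registered layer carries one as its variables attribute. A sketch, assuming a layer such as FullyConnected has been applied and the placeholder input is hypothetical:

import tensorflow as tf
from tensorpack.models import FullyConnected

feat = tf.placeholder(tf.float32, [None, 128])   # hypothetical NC input
fc = FullyConnected('fc0', feat, 10)
w = fc.variables.W                     # the layer's weight variable
all_vars = fc.variables.all()          # list of all variables created by the layer
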
tensorpack.models.Conv2D(scope_name, inputs, filters, kernel_size, strides=(1, 1), padding='same', data_format='channels_last', dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=<function variance_scaling_initializer.<locals>._initializer>, bias_initializer=<tf.python.ops.init_ops.Zeros object>, kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, split=1)[source]

A wrapper around tf.layers.Conv2D. Some differences to maintain backward-compatibility:

  1. Default kernel initializer is variance_scaling_initializer(2.0).

  2. Default padding is ‘same’.

  3. Supports the ‘split’ argument to do group convolution (see the example below).

Variable Names:

  • W: weights

  • b: bias
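
Example of Conv2D usage (a minimal sketch following the signature above; the placeholder input is hypothetical and a TF1-style graph is assumed):

import tensorflow as tf
from tensorpack.models import Conv2D

x = tf.placeholder(tf.float32, [None, 28, 28, 16])          # hypothetical NHWC input
l = Conv2D('conv0', x, filters=32, kernel_size=3, activation=tf.nn.relu)
g = Conv2D('conv1', l, 32, 3, split=4)                      # group convolution with 4 groups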

tensorpack.models.Conv2DTranspose(scope_name, inputs, filters, kernel_size, strides=(1, 1), padding='same', data_format='channels_last', activation=None, use_bias=True, kernel_initializer=<function variance_scaling_initializer.<locals>._initializer>, bias_initializer=<tf.python.ops.init_ops.Zeros object>, kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None)[source]

A wrapper around tf.layers.Conv2DTranspose. Some differences to maintain backward-compatibility:

  1. Default kernel initializer is variance_scaling_initializer(2.0).

  2. Default padding is ‘same’

Variable Names:

  • W: weights

  • b: bias

tensorpack.models.FullyConnected(scope_name, inputs, units, activation=None, use_bias=True, kernel_initializer=<function variance_scaling_initializer.<locals>._initializer>, bias_initializer=<tf.python.ops.init_ops.Zeros object>, kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None)[source]

A wrapper around tf.layers.Dense. One difference to maintain backward-compatibility: Default weight initializer is variance_scaling_initializer(2.0).

Variable Names:

  • W: weights of shape [in_dim, out_dim]

  • b: bias
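
Example of FullyConnected usage (a sketch; the placeholder input is hypothetical):

import tensorflow as tf
from tensorpack.models import FullyConnected

feat = tf.placeholder(tf.float32, [None, 512])    # hypothetical NC input
logits = FullyConnected('fc0', feat, 10)          # creates variables fc0/W and fc0/b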

tensorpack.models.ImageSample(scope_name, inputs, borderMode='repeat')[source]

Sample the images using the given coordinates, by bilinear interpolation. This was described in the paper: Spatial Transformer Networks.

Parameters:
  • inputs (list) – [images, coords]. images has shape NHWC. coords has shape (N, H’, W’, 2), where each pair in the last dimension is a (y, x) real-valued coordinate.

  • borderMode – either “repeat” or “constant” (zero-filled)

Returns:

tf.Tensor – a tensor named output of shape (N, H’, W’, C).
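
Example of ImageSample usage (a sketch; both placeholders are hypothetical):

import tensorflow as tf
from tensorpack.models import ImageSample

images = tf.placeholder(tf.float32, [None, 64, 64, 3])    # NHWC images
coords = tf.placeholder(tf.float32, [None, 32, 32, 2])    # (y, x) coordinates into the images
warped = ImageSample('sample', [images, coords], borderMode='repeat')   # shape (N, 32, 32, 3)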

tensorpack.models.LayerNorm(scope_name, x, epsilon=1e-05, use_bias=True, use_scale=True, gamma_init=None, data_format='channels_last')[source]

Layer Normalization layer, as described in the paper: Layer Normalization.

Parameters:
  • x (tf.Tensor) – a 4D or 2D tensor. When 4D, the layout should match data_format.

  • epsilon (float) – epsilon to avoid divide-by-zero.

  • use_scale, use_bias (bool) – whether to use the extra affine transformation or not.

tensorpack.models.InstanceNorm(scope_name, x, epsilon=1e-05, use_affine=True, gamma_init=None, data_format='channels_last')[source]

Instance Normalization, as in the paper: Instance Normalization: The Missing Ingredient for Fast Stylization.

Parameters:
  • x (tf.Tensor) – a 4D tensor.

  • epsilon (float) – avoid divide-by-zero

  • use_affine (bool) – whether to apply learnable affine transformation

class tensorpack.models.LinearWrap(tensor)[source]

Bases: object

A simple wrapper to easily create a “linear” graph, consisting of layers / symbolic functions with only one input & output.

__call__()[source]
Returns:tf.Tensor – the underlying wrapped tensor.
__init__(tensor)[source]
Parameters:tensor (tf.Tensor) – the tensor to wrap
apply(func, *args, **kwargs)[source]

Apply a function on the wrapped tensor.

Returns:LinearWrap – LinearWrap(func(self.tensor(), *args, **kwargs)).
apply2(func, *args, **kwargs)[source]

Apply a function on the wrapped tensor. The tensor will be the second argument of func.

Returns:LinearWrap – LinearWrap(func(args[0], self.tensor(), *args[1:], **kwargs)).
print_tensor()[source]

Print the underlying tensor and return self. Can be useful to get the name of tensors inside LinearWrap.

Returns:self
tensor()[source]

Equivalent to self.__call__().

Returns:tf.Tensor – the underlying wrapped tensor.
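
Example of chaining registered layers with LinearWrap (a sketch; the placeholder input is hypothetical, and the trailing () unwraps the result back to a tf.Tensor):

import tensorflow as tf
from tensorpack.models import LinearWrap, Conv2D, MaxPooling, FullyConnected

image = tf.placeholder(tf.float32, [None, 28, 28, 1])   # hypothetical input
logits = (LinearWrap(image)
          .Conv2D('conv0', 32, 3, activation=tf.nn.relu)
          .MaxPooling('pool0', 2)
          .FullyConnected('fc0', 10)())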
tensorpack.models.Maxout([scope_name, ]x, num_unit)[source]

Maxout as in the paper Maxout Networks.

Parameters:
  • x (tf.Tensor) – an NHWC or NC tensor. The channel dimension must be statically known.

  • num_unit (int) – an int. C must be divisible by num_unit.

Returns:

tf.Tensor – of shape NHW(C/num_unit) named output.

tensorpack.models.PReLU(scope_name, x, init=0.001, name='output')[source]

Parameterized ReLU as in the paper Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.

Parameters:
  • x (tf.Tensor) – input

  • init (float) – initial value for the learnable slope.

  • name (str) – name of the output.

Variable Names:

  • alpha: learnable slope.

tensorpack.models.BNReLU([scope_name, ]x, name=None)[source]

A shorthand of BatchNormalization + ReLU.
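BNReLU is commonly passed as the activation of a convolution (a sketch, assuming an active TowerContext so that BatchNorm can infer its training mode; the placeholder input is hypothetical):

import tensorflow as tf
from tensorpack.models import Conv2D, BNReLU

x = tf.placeholder(tf.float32, [None, 32, 32, 3])   # hypothetical NHWC input
l = Conv2D('conv0', x, 64, 3, activation=BNReLU)    # Conv -> BatchNorm -> ReLU
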

tensorpack.models.MaxPooling(scope_name, inputs, pool_size, strides=None, padding='valid', data_format='channels_last')[source]

Same as tf.layers.MaxPooling2D. Default strides is equal to pool_size.

tensorpack.models.FixedUnPooling(scope_name, x, shape, unpool_mat=None, data_format='channels_last')[source]

Unpool the input by computing a Kronecker product with a fixed matrix.

Parameters:
  • x (tf.Tensor) – a 4D image tensor

  • shape – int or (h, w) tuple

  • unpool_mat – a tf.Tensor or np.ndarray 2D matrix with size=shape. If None, a matrix with 1 at the top-left corner will be used.

Returns:

tf.Tensor – a 4D image tensor.

tensorpack.models.AvgPooling(scope_name, inputs, pool_size, strides=None, padding='valid', data_format='channels_last')[source]

Same as tf.layers.AveragePooling2D. Default strides is equal to pool_size.

tensorpack.models.GlobalAvgPooling(scope_name, x, data_format='channels_last')[source]

Global average pooling as in the paper Network In Network.

Parameters:x (tf.Tensor) – a 4D tensor.
Returns:tf.Tensor – a NC tensor named output.
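Example of GlobalAvgPooling usage (a sketch, typically placed right before the final classifier; the placeholder input is hypothetical):

import tensorflow as tf
from tensorpack.models import GlobalAvgPooling, FullyConnected

l = tf.placeholder(tf.float32, [None, 7, 7, 512])    # hypothetical NHWC feature map
feat = GlobalAvgPooling('gap', l)                     # -> NC tensor of shape (N, 512)
logits = FullyConnected('fc', feat, 10)
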
tensorpack.models.BilinearUpSample(scope_name, x, shape)[source]

Deterministically upsample the input images with bilinear interpolation.

Parameters:
  • x (tf.Tensor) – a NHWC tensor

  • shape (int) – the upsample factor

Returns:

tf.Tensor – a NHWC tensor.

tensorpack.models.regularize_cost(regex, func, name='regularize_cost')[source]

Apply a regularizer on trainable variables matching the regex, and print the matched variables (only print once in multi-tower training). In replicated mode, it will only regularize variables within the current tower.

Parameters:
  • regex (str) – a regex to match variable names, e.g. “conv.*/W”

  • func – the regularization function, which takes a tensor and returns a scalar tensor. E.g., tf.contrib.layers.l2_regularizer.

Returns:

tf.Tensor – a scalar, the total regularization cost.

Example

cost = cost + regularize_cost("fc.*/W", l2_regularizer(1e-5))
tensorpack.models.regularize_cost_from_collection(name='regularize_cost')[source]

Get the cost from the regularizers in tf.GraphKeys.REGULARIZATION_LOSSES. If in replicated mode, will only regularize variables created within the current tower.

Parameters:name (str) – the name of the returned tensor
Returns:tf.Tensor – a scalar, the total regularization cost.
tensorpack.models.l2_regularizer(scale, scope=None)[source]

Returns a function that can be used to apply L2 regularization to weights.

Small values of L2 can help prevent overfitting the training data.

Parameters:
  • scale – A scalar multiplier Tensor. 0.0 disables the regularizer.

  • scope – An optional scope name.

Returns:

A function with signature l2(weights) that applies L2 regularization.

Raises:

ValueError – If scale is negative or if scale is not a float.

tensorpack.models.l1_regularizer(scale, scope=None)[source]

Returns a function that can be used to apply L1 regularization to weights.

L1 regularization encourages sparsity.

Parameters:
  • scale – A scalar multiplier Tensor. 0.0 disables the regularizer.

  • scope – An optional scope name.

Returns:

A function with signature l1(weights) that applies L1 regularization.

Raises:

ValueError – If scale is negative or if scale is not a float.

tensorpack.models.Dropout([scope_name, ]x, *args, **kwargs)[source]

Same as tf.layers.dropout. However, for historical reasons, the first positional argument is interpreted as keep_prob rather than drop_prob. Use the explicit rate= keyword argument to avoid ambiguity.
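
Example (a sketch; whether dropout is applied follows the training mode of the surrounding TowerContext, and the placeholder input is hypothetical):

import tensorflow as tf
from tensorpack.models import Dropout

x = tf.placeholder(tf.float32, [None, 512])   # hypothetical input
l = Dropout(x, rate=0.5)                      # use the explicit rate= keyword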

tensorpack.models.ConcatWith([scope_name, ]x, tensor, dim)[source]

A wrapper around tf.concat to cooperate with LinearWrap.

Parameters:
  • x (tf.Tensor) – input

  • tensor (list[tf.Tensor]) – a tensor or list of tensors to concatenate with x. x will be at the beginning

  • dim (int) – the dimension along which to concatenate

Returns:

tf.Tensor – tf.concat([x] + tensor, dim).
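
Example of ConcatWith usage (a sketch; both placeholders are hypothetical):

import tensorflow as tf
from tensorpack.models import ConcatWith

x = tf.placeholder(tf.float32, [None, 16, 16, 32])      # hypothetical feature map
skip = tf.placeholder(tf.float32, [None, 16, 16, 32])   # hypothetical skip connection
merged = ConcatWith(x, skip, 3)                          # tf.concat([x, skip], 3)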

tensorpack.models.SoftMax([scope_name, ]x, use_temperature=False, temperature_init=1.0)[source]

A SoftMax layer (w/o linear projection) with optional temperature, as defined in the paper Distilling the Knowledge in a Neural Network.

Parameters:
  • x (tf.Tensor) – input of any dimension. Softmax will be performed on the last dimension.

  • use_temperature (bool) – use a learnable temperature or not.

  • temperature_init (float) – initial value of the temperature.

Returns:

tf.Tensor – a tensor of the same shape named output.

Variable Names:

  • invtemp: 1.0/temperature.