tensorpack.callbacks package
Everything other than the training iterations happens in the callbacks. Most of the fancy things you want to do will probably end up here. See the relevant tutorial: Callbacks.
class tensorpack.callbacks.Callback
Bases: object
Base class for all callbacks. See Write a Callback for a more detailed explanation of the callback methods.

graph
The graph.
Type: tf.Graph

Note
These attributes are available only after (and including) _setup_graph().
_setup_graph()
Called before finalizing the graph. Override this method to set up the ops used in the callback. This is the same as tf.train.SessionRunHook.begin().
_before_train()
Called right before the first iteration. The main difference from _setup_graph() is that at this point the graph is finalized and a default session is initialized. Override this method to, e.g., run some operations under the session.
This is similar to tf.train.SessionRunHook.after_create_session(), but different: it is called after the session is initialized by tfutils.SessionInit.
_before_run(ctx)
Called before every hooked_sess.run() call; it registers extra ops/tensors to run in the next call. This method is the same as tf.train.SessionRunHook.before_run. Refer to the TensorFlow docs for more details.
_after_run(run_context, run_values)
Called after every hooked_sess.run() call; it processes the values requested by the corresponding before_run(). It is equivalent to tf.train.SessionRunHook.after_run(); refer to the TensorFlow docs for more details.
_before_epoch()
Called right before each epoch. Usually you should use the trigger() callback to run something between epochs. Use this method only when something really needs to run immediately before each epoch.
_after_epoch()
Called right after each epoch. Usually you should use the trigger() callback to run something between epochs. Use this method only when something really needs to run immediately after each epoch.
_trigger_step()
Called after each Trainer.run_step() completes. Defaults to no-op. You can override it to implement, e.g., a ProgressBar.
_trigger_epoch()
Called after the completion of every epoch. Defaults to calling self.trigger().
_trigger()
Override this method to define a general trigger behavior, to be used with trigger schedulers. Note that the schedulers (e.g. PeriodicTrigger) might call this method both inside an epoch and after an epoch.
When used without a scheduler, this method will by default be called by trigger_epoch().
property chief_only
Whether to only run this callback on the chief training process.
Returns: bool
get_tensors_maybe_in_tower(names)
Get tensors in the graph with the given names. Will automatically check the first training tower if no existing tensor is found with the given name.
Returns: [tf.Tensor]
name_scope = ''
A name scope for ops created inside this callback. Defaults to the name of the class, but can be set per-instance.
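As an illustration, here is a minimal sketch of a custom callback; the class name and the tensor name 'total_cost' are hypothetical, not part of the API:

import tensorflow as tf
from tensorpack.callbacks import Callback

class PrintCost(Callback):
    def _setup_graph(self):
        # before the graph is finalized: look up the tensor by name
        self._cost = self.get_tensors_maybe_in_tower(['total_cost'])[0]
    def _before_run(self, ctx):
        # ask hooked_sess.run to also fetch the cost tensor with each step
        return tf.train.SessionRunArgs(fetches=self._cost)
    def _after_run(self, run_context, run_values):
        self._last_cost = run_values.results
    def _trigger_epoch(self):
        print('cost after epoch {}: {}'.format(self.epoch_num, self._last_cost))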
class tensorpack.callbacks.ProxyCallback(cb)
Bases: tensorpack.callbacks.base.Callback
A callback which proxies all methods to another callback. It's useful as a base class for callbacks which decorate other callbacks.
class tensorpack.callbacks.CallbackFactory(setup_graph=None, before_train=None, trigger=None, after_train=None)
Bases: tensorpack.callbacks.base.Callback
Create a callback with some lambdas.
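For instance, a quick one-off action can be written without defining a new class. A sketch, assuming each lambda receives the callback instance as its only argument:

from tensorpack.callbacks import CallbackFactory

# print a line whenever the callback is triggered (once per epoch by default)
cb = CallbackFactory(trigger=lambda self: print('epoch', self.epoch_num, 'done'))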
class tensorpack.callbacks.StartProcOrThread(startable, stop_at_last=True)
Bases: tensorpack.callbacks.base.Callback
Start some threads or processes before training.

__init__(startable, stop_at_last=True)
Parameters:
- startable (list) – list of processes or threads which have a start() method. Can also be a single instance of a process or thread.
- stop_at_last (bool) – whether to stop the processes or threads after training. It will use Process.terminate() or StoppableThread.stop(), but will do nothing on a normal threading.Thread or other startable objects.
class tensorpack.callbacks.RunOp(op, run_before=True, run_as_trigger=True, run_step=False, verbose=False)
Bases: tensorpack.callbacks.base.Callback
Run an Op.

__init__(op, run_before=True, run_as_trigger=True, run_step=False, verbose=False)
Parameters:
- op (tf.Operation or function) – an Op, or a function that returns the Op in the graph. The function will be called after the main graph has been created (in the setup_graph() callback).
- run_before (bool) – run the Op before training.
- run_as_trigger (bool) – run the Op on every trigger() call.
- run_step (bool) – run the Op every step (along with training).
- verbose (bool) – print logs when the op is run.
Example
The DQN example uses this callback to update the target network.
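A sketch of that pattern, assuming the graph contains an op named 'update_target_network' (a hypothetical name):

import tensorflow as tf
from tensorpack.callbacks import RunOp

def get_update_op():
    # called after the main graph has been created
    return tf.get_default_graph().get_operation_by_name('update_target_network')

# run the op once before training, and again on every trigger (every epoch by default)
cb = RunOp(get_update_op, run_before=True, run_as_trigger=True, verbose=True)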
class tensorpack.callbacks.RunUpdateOps(collection=None)
Bases: tensorpack.callbacks.graph.RunOp
Run ops from the collection UPDATE_OPS every step. The ops will be hooked to trainer.hooked_sess and run along with each hooked_sess.run call.
Be careful when using UPDATE_OPS if your model contains more than one sub-network. Perhaps not all updates are supposed to be executed in every iteration.
This callback is one of the DEFAULT_CALLBACKS().
class tensorpack.callbacks.ProcessTensors(names, fn)
Bases: tensorpack.callbacks.base.Callback
Fetch extra tensors along with each training step, and call some function over the values. It uses the _{before,after}_run methods to inject tf.train.SessionRunHooks into the session. You can use it to print tensors, save tensors to file, etc.
Example:
ProcessTensors(['mycost1', 'mycost2'], lambda c1, c2: print(c1, c2, c1 + c2))
class tensorpack.callbacks.DumpTensors(names)
Bases: tensorpack.callbacks.graph.ProcessTensors
Dump some tensors to a file. Every step this callback fetches the tensors and writes them to an npz file under logger.get_logger_dir(). The dump can be loaded by dict(np.load(filename).items()).
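Loading such a dump back follows directly from the above (the path is a placeholder):

import numpy as np

# each key in the npz file is one of the dumped tensor names
tensors = dict(np.load('train_log/some-dump.npz').items())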
class tensorpack.callbacks.DumpTensorAsImage(tensor_name, prefix=None, map_func=None, scale=255)
Bases: tensorpack.callbacks.base.Callback
Dump a tensor to image(s) under logger.get_logger_dir() once triggered.
Note that it requires the tensor to be directly evaluable, i.e. either the inputs are not its dependency (e.g. it depends only on the weights of the model), or the inputs are feed-free (in which case this callback will take an extra datapoint from the input pipeline).

__init__(tensor_name, prefix=None, map_func=None, scale=255)
Parameters:
- tensor_name (str) – the name of the tensor.
- prefix (str) – the filename prefix for saved images. Defaults to the Op name.
- map_func – map the value of the tensor to an image or list of images of shape [h, w] or [h, w, c]. If None, will use identity.
- scale (float) – a multiplier on pixel values, applied after map_func.
class tensorpack.callbacks.CheckNumerics(run_as_trigger=True, run_step=False)
Bases: tensorpack.callbacks.graph.RunOp
Check variables in the graph for NaN and Inf. Raises an exception if such an error is found.
class tensorpack.callbacks.Callbacks(cbs)
Bases: tensorpack.callbacks.base.Callback
A container to hold all callbacks and trigger them iteratively.
This is only used by the base trainer to run all the callbacks. Users do not need to use this class.
class tensorpack.callbacks.CallbackToHook(cb)
Bases: tensorflow.python.training.session_run_hook.SessionRunHook
Hooks are less powerful than callbacks, so the conversion is incomplete: it only converts the before_run/after_run calls.
This is only for the internal implementation of before_run/after_run callbacks. You shouldn't need to use this.
class tensorpack.callbacks.HookToCallback(hook)
Bases: tensorpack.callbacks.base.Callback
Make a tf.train.SessionRunHook into a callback. Note that when SessionRunHook.after_create_session is called, the coord argument will be None.
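For instance, an existing TensorFlow hook such as tf.train.StepCounterHook can be reused as a callback (a minimal sketch):

import tensorflow as tf
from tensorpack.callbacks import HookToCallback

cb = HookToCallback(tf.train.StepCounterHook(every_n_steps=100))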
class tensorpack.callbacks.TFLocalCLIDebugHook(*args, **kwargs)
Bases: tensorpack.callbacks.hooks.HookToCallback
Use the hook tfdbg.LocalCLIDebugHook in tensorpack.
class tensorpack.callbacks.ScalarStats(names, prefix='validation')
Bases: tensorpack.callbacks.inference.Inferencer
Statistics of some scalar tensor. The value will be averaged over all given datapoints.
Note that the average of accuracy over all batches is not necessarily the accuracy of the whole dataset. See ClassificationError for details.
class tensorpack.callbacks.Inferencer
Bases: tensorpack.callbacks.base.Callback
Base class of Inferencer. Inferencer is a special kind of callback that should be called by InferenceRunner. It has the methods _get_fetches and _on_fetches, which are like SessionRunHooks, except that they will be used only by InferenceRunner.

_after_inference()
Called after a round of inference ends. Returns a dict of scalar statistics which will be logged to monitors.
class tensorpack.callbacks.ClassificationError(wrong_tensor_name='incorrect_vector', summary_name='validation_error')
Bases: tensorpack.callbacks.inference.Inferencer
Compute the true classification error in batch mode, from a "wrong" tensor.
The "wrong" tensor is supposed to be a binary vector containing whether each sample in the batch is incorrectly classified. You can use tf.nn.in_top_k to produce this vector.
This Inferencer produces the "true" error, which could be different from ScalarStats('error_rate'). It takes into account the fact that batches might not have the same size in testing (because the size of the test set might not be a multiple of the batch size). Therefore the result can be different from averaging the error rate of each batch.
You can also use the "correct prediction" tensor; then this inferencer will give you "classification accuracy" instead of error.
class tensorpack.callbacks.BinaryClassificationStats(pred_tensor_name, label_tensor_name, prefix='val')
Bases: tensorpack.callbacks.inference.Inferencer
Compute precision / recall in binary classification, given the prediction vector and the label vector.
class tensorpack.callbacks.InferenceRunnerBase(input, infs)
Bases: tensorpack.callbacks.base.Callback
Base class for inference runner.
Note
InferenceRunner will use input.size() to determine how many iterations to run, so you are responsible for ensuring that input.size() is accurate.
Only works with instances of TowerTrainer.

__init__(input, infs)
Parameters:
- input (InputSource) – the input to use. Must have an accurate size().
- infs (list[Inferencer]) – list of Inferencer to run.
class tensorpack.callbacks.InferenceRunner(input, infs, tower_name='InferenceTower', tower_func=None, device=0)
Bases: tensorpack.callbacks.inference_runner.InferenceRunnerBase
A callback that runs a list of Inferencer on some InputSource.

__init__(input, infs, tower_name='InferenceTower', tower_func=None, device=0)
Parameters:
- input (InputSource or DataFlow) – the InputSource to run inference on. If given a DataFlow, will use FeedInput.
- infs (list) – a list of Inferencer instances.
- tower_name (str) – the name scope of the tower to build. If multiple InferenceRunners are used, each needs a different tower_name.
- tower_func (tfutils.TowerFunc or None) – the tower function used to build the graph. Defaults to trainer.tower_func, called under a training=False TowerContext, but you can change it to a different tower function if you need to run inference with several different graphs.
- device (int) – the device to use.
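Putting the pieces together, a typical configuration might look like this (dataset_val and the tensor names are assumptions about your model):

from tensorpack.callbacks import InferenceRunner, ScalarStats, ClassificationError

cb = InferenceRunner(
    dataset_val,  # a DataFlow over the validation set, assumed to exist
    [ScalarStats('cost'), ClassificationError('incorrect_vector')])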
class tensorpack.callbacks.DataParallelInferenceRunner(input, infs, gpus, tower_name='InferenceTower', tower_func=None)
Bases: tensorpack.callbacks.inference_runner.InferenceRunnerBase
Inference with data-parallel support on multiple GPUs. It will build one predict tower on each GPU, and run prediction with a large total batch in parallel on all GPUs. It will run the remainder (when the total size of the input is not a multiple of #GPUs) sequentially.

__init__(input, infs, gpus, tower_name='InferenceTower', tower_func=None)
Parameters:
- input (DataFlow or QueueInput) –
- tower_name (str) – the name scope of the tower to build. If multiple InferenceRunners are used, each needs a different tower_name.
- tower_func (tfutils.TowerFunc or None) – the tower function used to build the graph. The tower function will be called under a training=False TowerContext. Defaults to trainer.tower_func, but you can change it to a different tower function if you need to run inference with several different models.
class tensorpack.callbacks.SendStat(command, names)
Bases: tensorpack.callbacks.base.Callback
An equivalent of SendMonitorData, but as a normal callback.
class tensorpack.callbacks.InjectShell(file='INJECT_SHELL.tmp', shell='ipython')
Bases: tensorpack.callbacks.base.Callback
Allow users to create a specific file as a signal to pause and interactively debug the training. Once the trigger() method is called, it detects whether the file exists, and opens an IPython/pdb shell if it does. In the shell, self is this callback, self.trainer is the trainer, and from that you can access everything else.
Example:
callbacks=[InjectShell('/path/to/pause-training.tmp'), ...]
# the following command will pause the training and start a shell when the epoch finishes:
$ touch /path/to/pause-training.tmp
class tensorpack.callbacks.EstimatedTimeLeft(last_k_epochs=5, median=True)
Bases: tensorpack.callbacks.base.Callback
Estimate the time left until completion of training.
class tensorpack.callbacks.MonitorBase
Bases: tensorpack.callbacks.base.Callback
Base class for monitors which monitor training progress, by processing different types of summary/statistics from the trainer.

process_event(evt)
Parameters:
- evt (tf.Event) – the most basic format acceptable by tensorboard. It could include Summary, RunMetadata, LogMessage, and more.
class tensorpack.callbacks.Monitors(monitors)
Bases: tensorpack.callbacks.base.Callback
Merge monitors together for the trainer to use.
In training, each trainer will create a Monitors instance, and you can access it through trainer.monitors. You should use trainer.monitors for logging, and it will dispatch your logs to each sub-monitor.

get_history(name)
Get a history of the scalar value of some data.
If you run multiprocess training, keep in mind that the data is perhaps only available on the chief process.
Returns: a list of (global_step, value) pairs – history data for this scalar

get_latest(name)
Get the latest scalar value of some data.
If you run multiprocess training, keep in mind that the data is perhaps only available on the chief process.
Returns: scalar

put_event(evt)
Put a tf.Event. The step and wall_time fields of the tf.Event will be filled automatically.
Parameters:
- evt (tf.Event) –
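Inside a custom callback, the merged monitors can be queried like this (a sketch; the stat name 'val-error' is an assumption):

# e.g. inside a Callback's _trigger() method:
latest = self.trainer.monitors.get_latest('val-error')
history = self.trainer.monitors.get_history('val-error')  # [(global_step, value), ...]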
class tensorpack.callbacks.TFEventWriter(logdir=None, max_queue=10, flush_secs=120, split_files=False)
Bases: tensorpack.callbacks.monitor.MonitorBase
Write summaries to a TensorFlow event file.

__init__(logdir=None, max_queue=10, flush_secs=120, split_files=False)
Parameters:
- logdir – logger.get_logger_dir() by default.
- max_queue, flush_secs – same as in tf.summary.FileWriter.
- split_files – if True, split events into multiple files rather than appending to a single file. Useful on certain filesystems where append is expensive.
class tensorpack.callbacks.JSONWriter
Bases: tensorpack.callbacks.monitor.MonitorBase
Write all scalar data to a json file under logger.get_logger_dir(), grouped by their global step. If an earlier json history file is found, it will append to it.

FILENAME = 'stats.json'
The name of the json file. Do not change it.
class tensorpack.callbacks.ScalarPrinter(enable_step=False, enable_epoch=True, whitelist=None, blacklist=None)
Bases: tensorpack.callbacks.monitor.MonitorBase
Print scalar data to the terminal.

__init__(enable_step=False, enable_epoch=True, whitelist=None, blacklist=None)
Parameters:
- enable_step, enable_epoch (bool) – whether to print the monitor data (if any) between steps or between epochs.
- whitelist (list[str] or None) – A list of regex. Only names matching some regex will be allowed for printing. Defaults to match all names.
- blacklist (list[str] or None) – A list of regex. Names matching any regex will not be printed. Defaults to match no names.
class tensorpack.callbacks.SendMonitorData(command, names)
Bases: tensorpack.callbacks.monitor.MonitorBase
Execute a command with some specific scalar monitor data. This is useful for, e.g., building a custom statistics monitor.
It will try to send once it receives all the stats.
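For example, appending a monitored scalar to a file via a shell command. A sketch, assuming the command string is substituted with str.format-style {stat_name} fields; the path and stat name are placeholders:

from tensorpack.callbacks import SendMonitorData

cb = SendMonitorData('echo {validation_error} >> /tmp/stats.log', ['validation_error'])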
class tensorpack.callbacks.CometMLMonitor(experiment=None, tags=None, **kwargs)
Bases: tensorpack.callbacks.monitor.MonitorBase
Send scalar data and the graph to https://www.comet.ml.
Note
comet_ml requires you to import comet_ml before importing tensorflow or tensorpack.
The "automatic output logging" feature of comet_ml will make the training progress bar appear to freeze. Therefore the feature is disabled by default.

property experiment
The comet_ml.Experiment instance.
class tensorpack.callbacks.HyperParam
Bases: object
Base class for a hyperparam.

property readable_name
A name to display.
class tensorpack.callbacks.GraphVarParam(name, shape=())
Bases: tensorpack.callbacks.param.HyperParam
A variable in the graph (e.g. learning_rate) can be a hyperparam.
class tensorpack.callbacks.ObjAttrParam(obj, attrname, readable_name=None)
Bases: tensorpack.callbacks.param.HyperParam
An attribute of an object can be a hyperparam.
class tensorpack.callbacks.HyperParamSetter(param)
Bases: tensorpack.callbacks.base.Callback
An abstract base callback to set hyperparameters.
Once the trigger() method is called, the method _get_value_to_set() will be used to get a new value for the hyperparameter.

__init__(param)
Parameters:
- param (HyperParam or str) – if a str, it is assumed to be a GraphVarParam.
class tensorpack.callbacks.HumanHyperParamSetter(param, file_name='hyper.txt')
Bases: tensorpack.callbacks.param.HyperParamSetter
Set a hyperparameter by loading its value from a file each time the callback gets triggered. This is useful for manually tuning some parameters (e.g. learning_rate) without interrupting the training.

__init__(param, file_name='hyper.txt')
Parameters:
- param – same as in HyperParamSetter.
- file_name (str) – a file containing the new value of the parameter. Each line in the file is a k:v pair, for example, learning_rate:1e-4. If the pair is not found, the param will not be changed.
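For example, to hand-tune the learning rate while training runs (a sketch):

from tensorpack.callbacks import HumanHyperParamSetter

callbacks = [HumanHyperParamSetter('learning_rate')]
# while training runs, edit the file (one k:v pair per line) to change
# the value on the next trigger:
#   $ echo learning_rate:1e-4 > hyper.txt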
class tensorpack.callbacks.ScheduledHyperParamSetter(param, schedule, interp=None, step_based=False, set_at_beginning=True)
Bases: tensorpack.callbacks.param.HyperParamSetter
Set hyperparameters by a predefined epoch-based schedule.

__init__(param, schedule, interp=None, step_based=False, set_at_beginning=True)
Parameters:
- param – same as in HyperParamSetter.
- schedule (list) – with the format [(epoch1, val1), (epoch2, val2), (epoch3, val3)]. Each (ep, val) pair means to set the param to "val" after the completion of epoch ep. If ep == 0, the value will be set before the first epoch (because by default the first epoch is epoch 1). The epoch numbers have to be increasing.
- interp (str or None) – either None or 'linear'. If None, the parameter will only be set when the specific epoch or step is reached exactly. If 'linear', perform linear interpolation (but no extrapolation) every time this callback is triggered.
- step_based (bool) – interpret schedule as (step, value) instead of (epoch, value).
- set_at_beginning (bool) – at the start of training, the current value may be different from the expected value according to the schedule. If this option is True, set the value anyway even though the current epoch/step is not at the scheduled time. If False, the value will only be set according to the schedule, i.e. it will only be set if the current epoch/step is at the scheduled time.
Example
ScheduledHyperParamSetter('learning_rate', [(30, 1e-2), (60, 1e-3), (85, 1e-4), (95, 1e-5)]),
class tensorpack.callbacks.StatMonitorParamSetter(param, stat_name, value_func, threshold, last_k, reverse=False)
Bases: tensorpack.callbacks.param.HyperParamSetter
Change the param by monitoring the change of a scalar statistic. The param will be changed when the scalar does not decrease/increase enough.
Once triggered, this callback observes the latest value of stat_name from the monitor backend.
This callback will then change a hyperparameter param by new_value = value_func(old_value), if:
min(history) >= history[0] - threshold, where history = [the most recent k observations of stat_name]
Note
The statistic of interest must be updated at a frequency higher than or equal to this callback. For example, using PeriodicTrigger(StatMonitorParamSetter(...), every_k_steps=100) is meaningless if the statistic to be monitored is only updated every 500 steps.
Callbacks are executed in order. Therefore, if the statistic to be monitored is created after this callback, the behavior of this callback may get delayed.
Example
If validation error wasn't decreasing for 5 epochs, decay the learning rate by 0.2:
StatMonitorParamSetter('learning_rate', 'val-error', lambda x: x * 0.2, threshold=0, last_k=5)

__init__(param, stat_name, value_func, threshold, last_k, reverse=False)
Parameters:
- param – same as in HyperParamSetter.
- stat_name (str) – name of the statistic.
- value_func (float -> float) – a function which returns a new value taking the old value.
- threshold (float) – change threshold.
- last_k (int) – use the last k observations of the statistic.
- reverse (bool) – monitor increasing instead of decreasing. If True, param will be changed when max(history) <= history[0] + threshold.
class tensorpack.callbacks.HyperParamSetterWithFunc(param, func)
Bases: tensorpack.callbacks.param.HyperParamSetter
Set the parameter by a function of the epoch number and the old value.

__init__(param, func)
Parameters:
- param – same as in HyperParamSetter.
- func – param will be set by new_value = func(epoch_num, old_value), where epoch_num is the number of epochs that have finished.
Example
Decrease by a factor of 0.9 every two epochs:
HyperParamSetterWithFunc('learning_rate', lambda e, x: x * 0.9 if e % 2 == 0 else x)
class tensorpack.callbacks.GPUUtilizationTracker(devices=None)
Bases: tensorpack.callbacks.base.Callback
Summarize the average GPU utilization within an epoch.
It will start a process to obtain GPU utilization through NVML every second within the epoch (the trigger_epoch time is not included), and write the average utilization to monitors.
This callback creates a process, therefore it is not safe to use with MPI.
class tensorpack.callbacks.GraphProfiler(dump_metadata=False, dump_tracing=True, dump_event=False)
Bases: tensorpack.callbacks.base.Callback
Enable profiling by installing session hooks, and write tracing files / events / metadata to logger.get_logger_dir().
The tracing files can be loaded from chrome://tracing. The metadata files can be processed by the tfprof command line utils. The event is viewable from tensorboard.
Tips:
Note that profiling is by default enabled for every step and is expensive. You probably want to schedule it less frequently, e.g.:
EnableCallbackIf(
    GraphProfiler(dump_tracing=True, dump_event=True),
    lambda self: self.trainer.global_step > 20 and self.trainer.global_step < 30)
class tensorpack.callbacks.GPUMemoryTracker(devices=0)
Bases: tensorpack.callbacks.base.Callback
Track peak memory used on each GPU device every epoch, by tf.contrib.memory_stats. The peak memory comes from the MaxBytesInUse op, which is the peak memory used in recent session.run calls. See https://github.com/tensorflow/tensorflow/pull/13107.
class tensorpack.callbacks.HostMemoryTracker
Bases: tensorpack.callbacks.base.Callback
Track free RAM on the host.
When triggered, it writes the size of free RAM into monitors.
class tensorpack.callbacks.ThroughputTracker(samples_per_step=None)
Bases: tensorpack.callbacks.base.Callback
This callback writes the training throughput (in terms of either steps/sec or samples/sec) to the monitors every time it is triggered. The throughput is computed based on the duration between consecutive triggers.
The time spent on callbacks after each epoch is excluded.
class tensorpack.callbacks.ModelSaver(max_to_keep=10, keep_checkpoint_every_n_hours=0.5, checkpoint_dir=None, var_collections=None)
Bases: tensorpack.callbacks.base.Callback
Save the model once triggered.

__init__(max_to_keep=10, keep_checkpoint_every_n_hours=0.5, checkpoint_dir=None, var_collections=None)
Parameters:
- max_to_keep (int) – the same as in tf.train.Saver.
- keep_checkpoint_every_n_hours (float) – the same as in tf.train.Saver. Note that "keep" does not mean "create", but means "don't delete".
- checkpoint_dir (str) – defaults to logger.get_logger_dir().
- var_collections (str or list of str) – collection of the variables (or list of collections) to save.
class tensorpack.callbacks.MinSaver(monitor_stat, reverse=False, filename=None, checkpoint_dir=None)
Bases: tensorpack.callbacks.base.Callback
Separately save the model with the minimum value of some statistic.

__init__(monitor_stat, reverse=False, filename=None, checkpoint_dir=None)
Example
Save the model with minimum validation error to "min-val-error.tfmodel":
MinSaver('val-error')
Note
It assumes that ModelSaver is used with the same checkpoint_dir and appears earlier in the callback list. The default for both ModelSaver and MinSaver is checkpoint_dir=logger.get_logger_dir().
Callbacks are executed in the order they are defined. Therefore you'd want to use this callback after the callback (e.g. InferenceRunner) that produces the statistic.
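A sketch of the intended ordering (dataset_val is an assumed validation DataFlow):

from tensorpack.callbacks import ModelSaver, InferenceRunner, ClassificationError, MinSaver

callbacks = [
    ModelSaver(),  # writes the checkpoints that MinSaver copies from
    InferenceRunner(dataset_val,
                    [ClassificationError('incorrect_vector', 'val-error')]),
    MinSaver('val-error'),  # must come after the callback producing 'val-error'
]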
class tensorpack.callbacks.MaxSaver(monitor_stat, filename=None, checkpoint_dir=None)
Bases: tensorpack.callbacks.saver.MinSaver
Separately save the model with the maximum value of some statistic.
See the docs of MinSaver for details.
class tensorpack.callbacks.TensorPrinter(names)
Bases: tensorpack.callbacks.base.Callback
Prints the value of some tensors in each step. It's an example of how before_run/after_run work.
class tensorpack.callbacks.ProgressBar(names=[])
Bases: tensorpack.callbacks.base.Callback
A progress bar based on tqdm.
This callback is one of the DEFAULT_CALLBACKS().
class tensorpack.callbacks.SessionRunTimeout(timeout_in_ms)
Bases: tensorpack.callbacks.base.Callback
Add a timeout option to each sess.run call.
class tensorpack.callbacks.MovingAverageSummary(collection='MOVING_SUMMARY_OPS', train_op=None)
Bases: tensorpack.callbacks.base.Callback
Maintain the moving average of summarized tensors in every step, by ops added to the collection. Note that it only maintains the moving averages by updating the relevant variables in the graph; the actual summary should be done in other callbacks.
This callback is one of the DEFAULT_CALLBACKS().

__init__(collection='MOVING_SUMMARY_OPS', train_op=None)
Parameters:
- collection (str) – the collection of EMA-maintaining ops. The default value would work with the tensors you added by tfutils.summary.add_moving_summary(), but you can use other collections as well.
- train_op (tf.Operation or str) – the (name of the) training op to associate the maintaining ops with. If not provided, the EMA-maintaining ops will be hooked to trainer.hooked_sess and be executed in every iteration. Otherwise, the EMA-maintaining ops will be executed whenever the training op is executed.
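The typical pairing, sketched below: register tensors with add_moving_summary() when building the graph, and keep the default MovingAverageSummary() callback to run the maintaining ops (cost stands in for any scalar tensor in your graph):

from tensorpack.tfutils.summary import add_moving_summary

# inside the graph-building code; this also adds the EMA-maintaining ops
# to the 'MOVING_SUMMARY_OPS' collection that this callback reads
add_moving_summary(cost)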
tensorpack.callbacks.MergeAllSummaries(period=0, run_alone=False, key=None)
Evaluate all summaries by tf.summary.merge_all, and write them to logs.
This callback is one of the DEFAULT_CALLBACKS().
Parameters:
- period (int) – by default the callback summarizes once every epoch. This option (if not set to 0) makes it additionally summarize every period steps.
- run_alone (bool) – whether to evaluate the summaries alone. If True, summaries will be evaluated after each epoch alone. If False, summaries will be evaluated together with the sess.run calls, in the last step of each epoch. For SimpleTrainer, it needs to be False because summaries may depend on inputs.
- key (str) – the collection of summary tensors. Same as in tf.summary.merge_all. Defaults to tf.GraphKeys.SUMMARIES.
class tensorpack.callbacks.SimpleMovingAverage(tensors, window_size)
Bases: tensorpack.callbacks.base.Callback
Monitor the Simple Moving Average (SMA), i.e. an average within a sliding window, of some tensors.
class tensorpack.callbacks.PeriodicTrigger(triggerable, every_k_steps=None, every_k_epochs=None, before_train=False)
Bases: tensorpack.callbacks.base.ProxyCallback
Trigger a callback every k global steps or every k epochs by its trigger() method.
Most existing callbacks which do something every epoch are implemented with the trigger() method. By default the trigger() method will be called every epoch. This wrapper can make the callback run at a different frequency.
All other methods (before/after_run, trigger_step, etc.) of the given callback are unaffected. They will still be called as-is.
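For example, to run inference only once every three epochs (a sketch; dataset_val is assumed to exist):

from tensorpack.callbacks import PeriodicTrigger, InferenceRunner, ScalarStats

cb = PeriodicTrigger(
    InferenceRunner(dataset_val, [ScalarStats('cost')]),
    every_k_epochs=3)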
class tensorpack.callbacks.PeriodicCallback(callback, every_k_steps=None, every_k_epochs=None)
Bases: tensorpack.callbacks.trigger.EnableCallbackIf
The {before,after}_epoch, {before,after}_run and trigger_{epoch,step} methods of the given callback will be enabled only when global_step % every_k_steps == 0 or epoch_num % every_k_epochs == 0. The other methods are unaffected.
Note that this can only make a callback run less frequently than it otherwise would. If you have a callback that by default runs every epoch by its trigger() method, use PeriodicTrigger to schedule it more frequently than that.

__init__(callback, every_k_steps=None, every_k_epochs=None)
Parameters:
- callback (Callback) – a Callback instance.
- every_k_steps (int) – enable the callback when global_step % k == 0. Set to None to ignore.
- every_k_epochs (int) – enable the callback when epoch_num % k == 0. Also enable when the last step finishes (epoch_num == max_epoch and local_step == steps_per_epoch - 1). Set to None to ignore.
every_k_steps and every_k_epochs can both be set, but cannot both be None.
class tensorpack.callbacks.EnableCallbackIf(callback, pred)
Bases: tensorpack.callbacks.base.ProxyCallback
Disable the {before,after}_epoch, {before,after}_run and trigger_{epoch,step} methods of a callback, unless some condition is satisfied. The other methods are unaffected.
A more accurate name for this callback would be "DisableCallbackUnless", but that's too ugly.
Note
If you use {before,after}_run, pred will be evaluated only in before_run.
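For example, to enable a callback only after the tenth epoch (a sketch; my_callback stands for any Callback instance):

from tensorpack.callbacks import EnableCallbackIf

# `self` in the predicate is the EnableCallbackIf instance;
# epoch_num is inherited from Callback
cb = EnableCallbackIf(my_callback, lambda self: self.epoch_num > 10)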