tensorpack.dataflow.dataset package
-
class
tensorpack.dataflow.dataset.BSDS500(name, data_dir=None, shuffle=True)
Bases: tensorpack.dataflow.base.RNGDataFlow
Berkeley Segmentation Data Set and Benchmarks 500 dataset.
Produces an (image, label) pair, where image has shape (321, 481, 3 (BGR)) and values in [0, 255]. label is a floating-point image of shape (321, 481) with values in [0, 1]. The value of each pixel is (number of times it is annotated as edge) / (total number of annotators for this image).
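The label definition above can be illustrated with a small sketch. The helper name edge_probability and the toy annotation maps are hypothetical and not part of tensorpack; the code only mirrors the stated formula on nested Python lists.

```python
def edge_probability(annotations):
    """Average binary edge maps from several annotators into one label map.

    `annotations` is a list of 2-D binary maps (lists of rows), one per
    annotator, where 1 marks an edge pixel. Hypothetical helper that only
    mirrors the label definition in the text.
    """
    if not annotations:
        raise ValueError("need at least one annotator")
    n = len(annotations)
    height, width = len(annotations[0]), len(annotations[0][0])
    return [
        [sum(a[y][x] for a in annotations) / n for x in range(width)]
        for y in range(height)
    ]


# Three annotators labelling a toy 2x3 image.
toy_annotations = [
    [[1, 0, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 0, 0]],
    [[1, 1, 0], [0, 0, 0]],
]
toy_label = edge_probability(toy_annotations)
```

A pixel marked by every annotator gets the value 1.0, a pixel marked by one of three annotators gets 1/3, and an unmarked pixel gets 0.0.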
-
class
tensorpack.dataflow.dataset.Caltech101Silhouettes(name, shuffle=True, dir=None)
Bases: tensorpack.dataflow.base.RNGDataFlow
Produces [image, label] from the Caltech101 Silhouettes dataset. image is 28x28 with values in [0, 1]; label is an int in the range [0, 100].
-
class
tensorpack.dataflow.dataset.CifarBase(train_or_test, shuffle=None, dir=None, cifar_classnum=10)
Bases: tensorpack.dataflow.base.RNGDataFlow
Produces [image, label] from the Cifar10/100 dataset. image is 32x32x3 with values in [0, 255]; label is an int.
-
class
tensorpack.dataflow.dataset.Cifar10(train_or_test, shuffle=None, dir=None)
Bases: tensorpack.dataflow.dataset.cifar.CifarBase
Produces [image, label] from the Cifar10 dataset. image is 32x32x3 with values in [0, 255]; label is an int.
-
class
tensorpack.dataflow.dataset.Cifar100(train_or_test, shuffle=None, dir=None)
Bases: tensorpack.dataflow.dataset.cifar.CifarBase
Similar to Cifar10.
-
class
tensorpack.dataflow.dataset.ILSVRCMeta(dir=None)
Bases: object
Provides methods to access metadata for the ILSVRC12 dataset.
-
get_image_list(name, dir_structure='original')
- Parameters
name (str) – ‘train’ or ‘val’ or ‘test’
dir_structure (str) – same as in ILSVRC12.__init__().
- Returns
list – list of (image filename, label)
-
-
class
tensorpack.dataflow.dataset.ILSVRC12(dir, name, meta_dir=None, shuffle=None, dir_structure=None)
Bases: tensorpack.dataflow.dataset.ilsvrc.ILSVRC12Files
The ILSVRC12 classification dataset, also known as the commonly used 1000-class ImageNet subset. This dataflow produces uint8 images of shape [h, w, 3 (BGR)] and an integer label in [0, 999]. The label map follows the synsets.txt file in http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz, which can also be queried using ILSVRCMeta.
-
__init__(dir, name, meta_dir=None, shuffle=None, dir_structure=None)
- Parameters
dir (str) – A directory containing a subdirectory named name, which contains the images in a structure described below.
name (str) – One of ‘train’, ‘val’ or ‘test’.
shuffle (bool) – Shuffle the dataset. Defaults to True if name == ‘train’.
dir_structure (str) – One of ‘original’ or ‘train’. The directory structure of the ‘val’ directory. ‘original’ means the original decompressed directory, which contains only a flat list of image files (as below). If set to ‘train’, a two-level directory structure like that of ‘dir/train/’ is expected. By default, the structure is detected automatically. You probably do not need to care about this option, because ‘original’ is what people usually have.
Example:
When dir_structure == ‘original’, dir should have the following structure:
    dir/
      train/
        n02134418/
          n02134418_198.JPEG
          ...
        ...
      val/
        ILSVRC2012_val_00000001.JPEG
        ...
      test/
        ILSVRC2012_test_00000001.JPEG
        ...
With the downloaded ILSVRC12_img_*.tar, you can use the following commands to build the above structure:
    mkdir val && tar xvf ILSVRC12_img_val.tar -C val
    mkdir test && tar xvf ILSVRC12_img_test.tar -C test
    mkdir train && tar xvf ILSVRC12_img_train.tar -C train && cd train
    find -type f -name '*.tar' | parallel -P 10 'echo {} && mkdir -p {/.} && tar xf {} -C {/.}'
When dir_structure == ‘train’, dir should have the following structure:
    dir/
      train/
        n02134418/
          n02134418_198.JPEG
          ...
        ...
      val/
        n01440764/
          ILSVRC2012_val_00000293.JPEG
          ...
        ...
      test/
        ILSVRC2012_test_00000001.JPEG
        ...
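The difference between the two ‘val’ layouts can be checked with a few lines of standard-library Python. The sketch below is one possible way to tell them apart and is not necessarily how tensorpack detects the structure; the helper guess_val_structure is hypothetical, and the toy directories are created only for the demonstration.

```python
import os
import tempfile


def guess_val_structure(val_dir):
    """Guess the layout of a 'val' directory (hypothetical helper).

    Returns 'train' if val_dir contains at least one subdirectory
    (two-level layout), and 'original' if it holds only files.
    """
    for entry in sorted(os.listdir(val_dir)):
        if os.path.isdir(os.path.join(val_dir, entry)):
            return "train"
    return "original"


# Build two toy layouts in a temporary directory and classify them.
with tempfile.TemporaryDirectory() as root:
    flat = os.path.join(root, "flat", "val")
    nested = os.path.join(root, "nested", "val")
    os.makedirs(flat)
    os.makedirs(os.path.join(nested, "n01440764"))
    open(os.path.join(flat, "ILSVRC2012_val_00000001.JPEG"), "w").close()
    open(os.path.join(nested, "n01440764", "ILSVRC2012_val_00000293.JPEG"), "w").close()

    flat_guess = guess_val_structure(flat)
    nested_guess = guess_val_structure(nested)
```

Here flat_guess corresponds to the ‘original’ layout and nested_guess to the ‘train’ layout.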
-
-
class
tensorpack.dataflow.dataset.ILSVRC12Files(dir, name, meta_dir=None, shuffle=None, dir_structure=None)
Bases: tensorpack.dataflow.base.RNGDataFlow
Same as ILSVRC12, but produces the filenames of the images instead of decoded NumPy arrays. This is useful when cv2.imread is a bottleneck and you want to decode the images in smarter ways (e.g. in parallel).
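As a rough illustration of decoding in parallel, the sketch below maps (filename, label) pairs through a thread pool. It does not use tensorpack or OpenCV: load_bytes is a hypothetical stand-in for a real decoder such as cv2.imread, decode_in_parallel is a hypothetical helper, and the toy files exist only to make the example self-contained.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def load_bytes(path):
    """Stand-in for image decoding: read the raw file contents."""
    with open(path, "rb") as f:
        return f.read()


def decode_in_parallel(datapoints, decode=load_bytes, workers=4):
    """Turn (filename, label) pairs into (decoded, label) pairs using threads.

    Executor.map returns results in input order, so labels stay aligned.
    """
    filenames = [fname for fname, _ in datapoints]
    labels = [label for _, label in datapoints]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        decoded = list(pool.map(decode, filenames))
    return list(zip(decoded, labels))


# Create three toy "image" files and decode them concurrently.
with tempfile.TemporaryDirectory() as root:
    toy_points = []
    for label in range(3):
        path = os.path.join(root, "img_%d.bin" % label)
        with open(path, "wb") as f:
            f.write(bytes([label]) * 4)
        toy_points.append((path, label))
    results = decode_in_parallel(toy_points)
```

Threads are used here only for simplicity; whether threads or processes are the better fit depends on the decoder being used.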
-
class
tensorpack.dataflow.dataset.TinyImageNet(dir, name, shuffle=None)
Bases: tensorpack.dataflow.base.RNGDataFlow
The TinyImageNet classification dataset, with 200 classes and 500 images per class. See https://tiny-imagenet.herokuapp.com/.
It produces [image, label] where image is a 64x64x3(BGR) image, label is an integer in [0, 200).
-
class
tensorpack.dataflow.dataset.Mnist(train_or_test, shuffle=True, dir=None)
Bases: tensorpack.dataflow.base.RNGDataFlow
Produces [image, label] from the MNIST dataset. image is 28x28 with values in [0, 1]; label is an int.
-
class
tensorpack.dataflow.dataset.FashionMnist(train_or_test, shuffle=True, dir=None)
Bases: tensorpack.dataflow.dataset.mnist.Mnist
Same API as Mnist, but more fashion.
-
class
tensorpack.dataflow.dataset.Places365Standard(dir, name, shuffle=None)
Bases: tensorpack.dataflow.base.RNGDataFlow
The Places365-Standard Dataset, in low-resolution format only. Produces BGR images of shape (256, 256, 3) with values in [0, 255].
-
__init__(dir, name, shuffle=None)
- Parameters
dir – Path to the Places365-Standard dataset in its “easy directory structure”. See http://places2.csail.mit.edu/download.html
name – One of “train” or “val”.
shuffle (bool) – Shuffle the dataset. Defaults to True if name == ‘train’.
-
-
class
tensorpack.dataflow.dataset.RNGDataFlow
Bases: tensorpack.dataflow.base.DataFlow
A DataFlow with an RNG.
-
rng = None
self.rng is a np.random.RandomState instance that is initialized correctly (with different seeds in each process) in RNGDataFlow.reset_state().
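The pattern of creating the RNG in reset_state() rather than in the constructor can be sketched without tensorpack. The class ToyRNGFlow below is hypothetical, it uses the standard library's random.Random in place of np.random.RandomState, and the seeding scheme (object id, process id and current time mixed together) is an assumption chosen for illustration, not a statement of what tensorpack does.

```python
import os
import random
import time


class ToyRNGFlow:
    """Toy stand-in for a DataFlow that owns a per-process RNG."""

    rng = None  # created in reset_state(), not in __init__

    def __init__(self, size):
        self.size = size

    def reset_state(self):
        # Assumed seeding scheme: mix object id, process id and current time
        # so that forked worker processes do not share one random stream.
        seed = (id(self) + os.getpid() + time.time_ns()) % (2 ** 32)
        self.rng = random.Random(seed)

    def __iter__(self):
        # Yield the indices 0..size-1 in a random order drawn from self.rng.
        indices = list(range(self.size))
        self.rng.shuffle(indices)
        return iter(indices)


flow = ToyRNGFlow(5)
rng_before = flow.rng   # still None: reset_state() has not run yet
flow.reset_state()      # call once in the process that will iterate
shuffled = list(flow)   # some permutation of 0..4
```

Because the seed is drawn when reset_state() runs, calling it after a worker process has been forked gives that worker its own random stream instead of a copy of the parent's.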
-