lightonml.datasets

This module contains functions that load some common datasets. Each function returns tuples of train and test examples and labels. Grayscale images are flattened; RGB images have shape (3, width, height). All functions look for a .lightonml_config file to read the data location. If the file does not exist, they create one with your home directory as the default data directory. To change the location, edit the config file .lighton.json.
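
The lookup described above can be pictured with a small sketch. It is not lightonml's implementation: the helper name `resolve_data_dir`, the JSON layout and the `data_location` key are all assumptions made for illustration, and the config path is passed in as an argument rather than guessed.

```python
import json
from pathlib import Path


def resolve_data_dir(config_path):
    """Return the data directory named in a JSON config file.

    Hypothetical sketch: if the file is missing, create it with the
    home directory as the default location, as the text describes.
    The "data_location" key is an assumed name, not lightonml's.
    """
    config_path = Path(config_path)
    if not config_path.exists():
        # No config yet: write one that points at the home directory.
        default = {"data_location": str(Path.home())}
        config_path.write_text(json.dumps(default, indent=2))
        return Path.home()
    # Config exists: use whatever location it records.
    config = json.loads(config_path.read_text())
    return Path(config["data_location"])
```

Editing the stored value and calling the helper again would then return the new directory, which mirrors the "change the config file" step in the text.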

CIFAR10()

Data Loader for the CIFAR10 dataset.

Returns:
  • (X_train, y_train) (tuple of np.ndarray of np.uint8, of shape (50000, 3, 32, 32) and (50000,)) – train CIFAR10 images and labels.
  • (X_test, y_test) (tuple of np.ndarray of np.uint8, of shape (10000, 3, 32, 32) and (10000,)) – test CIFAR10 images and labels.
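
RGB arrays are returned channel-first, while many plotting tools expect the channel axis last. A minimal sketch of the conversion follows, assuming NumPy is available. The helper `to_channels_last` and the random stand-in batch are illustrative and not part of lightonml; the stand-in uses the standard 32 x 32 CIFAR image size.

```python
import numpy as np


def to_channels_last(images):
    """Move the channel axis of an (N, 3, A, B) batch to the end: (N, A, B, 3)."""
    return np.transpose(images, (0, 2, 3, 1))


# Stand-in for X_train: four random uint8 images in channel-first layout.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(4, 3, 32, 32), dtype=np.uint8)

X_last = to_channels_last(X_demo)
print(X_last.shape)
```

The transpose only reorders axes, so pixel values and dtype are unchanged.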
CIFAR100()

Data Loader for the CIFAR100 dataset.

Returns:
  • (X_train, y_train) (tuple of np.ndarray of np.uint8, of shape (50000, 3, 32, 32) and (50000,)) – train CIFAR100 images and labels.
  • (X_test, y_test) (tuple of np.ndarray of np.uint8, of shape (10000, 3, 32, 32) and (10000,)) – test CIFAR100 images and labels.
FashionMNIST()

Data Loader for the FashionMNIST dataset.

Returns:
  • (X_train, y_train) (tuple of np.ndarray of np.uint8, of shape (60000, 784) and (60000,)) – train flattened FashionMNIST images and labels.
  • (X_test, y_test) (tuple of np.ndarray of np.uint8, of shape (10000, 784) and (10000,)) – test flattened FashionMNIST images and labels.
MNIST()

Data Loader for the MNIST dataset.

Returns:
  • (X_train, y_train) (tuple of np.ndarray of np.uint8, of shape (60000, 784) and (60000,)) – train flattened MNIST images and labels.
  • (X_test, y_test) (tuple of np.ndarray of np.uint8, of shape (10000, 784) and (10000,)) – test flattened MNIST images and labels.
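
The flattened grayscale arrays can be unpacked and reshaped back into 28 x 28 images as sketched below, assuming NumPy is available. The small arrays are stand-ins with the documented structure so the snippet runs without downloading data, and `unflatten` is a hypothetical helper rather than a lightonml function.

```python
import numpy as np


def unflatten(images, side=28):
    """Reshape (N, side*side) flattened images to (N, side, side)."""
    return images.reshape(-1, side, side)


# Stand-in with the documented structure; with lightonml installed,
# this pair of tuples would come from calling MNIST() instead.
dataset = (
    (np.zeros((6, 784), dtype=np.uint8), np.arange(6, dtype=np.uint8)),
    (np.full((2, 784), 255, dtype=np.uint8), np.array([0, 1], dtype=np.uint8)),
)

# Unpack the two (examples, labels) tuples.
(X_train, y_train), (X_test, y_test) = dataset

train_images = unflatten(X_train)
# uint8 pixels span 0..255; convert to float32 in [0, 1] before training.
test_scaled = X_test.astype(np.float32) / 255.0
print(train_images.shape, test_scaled.dtype)
```

The same pattern applies to the other flattened loaders, since they share the 784-column layout.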
STL10(unlabeled=False)

Data Loader for the STL10 dataset.

Parameters: unlabeled (bool, defaults to False) – if True, also returns the unlabeled part of the dataset.
Returns:
  • (X_train, y_train) (tuple of np.ndarray of np.uint8, of shape (5000, 3, 96, 96) and (5000,)) – train STL10 images and labels.
  • (X_test, y_test) (tuple of np.ndarray of np.uint8, of shape (8000, 3, 96, 96) and (8000,)) – test STL10 images and labels.
  • X_unlabeled (np.ndarray of np.uint8, of shape (100000, 3, 96, 96)) – unlabeled images from STL10, returned only if unlabeled is True.
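
Because the unlabeled split is large, it can help to estimate memory use before requesting it. The arithmetic below uses only the shapes and the one-byte np.uint8 element size listed above; the helper `size_in_bytes` is illustrative. By this estimate the unlabeled array alone takes roughly 2.6 GiB.

```python
from math import prod

# Shapes as documented for the STL10 loader.
shapes = {
    "X_train": (5000, 3, 96, 96),
    "X_test": (8000, 3, 96, 96),
    "X_unlabeled": (100000, 3, 96, 96),
}


def size_in_bytes(shape, itemsize=1):
    """Number of bytes for a dense array; np.uint8 uses 1 byte per element."""
    return prod(shape) * itemsize


for name, shape in shapes.items():
    print(f"{name}: {size_in_bytes(shape) / 2**30:.2f} GiB")
```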
SignMNIST()

Data Loader for the SignMNIST dataset. Each training and test case carries a label (0-25) that maps one-to-one to an alphabetic letter A-Z. The source dataset contains no cases for 9 (J) or 25 (Z), because those letters involve motion.

https://www.kaggle.com/datamunge/sign-language-mnist/home

Returns:
  • (X_train, y_train) (tuple of np.ndarray of np.uint8, of shape (27455, 784) and (27455,)) – train flattened SignMNIST images and labels.
  • (X_test, y_test) (tuple of np.ndarray of np.uint8, of shape (7172, 784) and (7172,)) – test flattened SignMNIST images and labels.
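
The label-to-letter mapping described for SignMNIST can be sketched in a few lines. `label_to_letter` is a hypothetical helper, not part of lightonml; it only encodes the 0-25 to A-Z correspondence stated above.

```python
import string


def label_to_letter(label):
    """Map an integer label in 0..25 to the uppercase letter A..Z."""
    label = int(label)  # also accepts NumPy integer scalars
    if not 0 <= label <= 25:
        raise ValueError(f"label must be in 0..25, got {label}")
    return string.ascii_uppercase[label]


# Decode a short sequence of labels into text.
print("".join(label_to_letter(l) for l in [7, 4, 11, 11, 14]))
```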