lightonml.utils

This module contains utility functions for handling data and for loading and saving models.

class OptionalProfiler(condition)

Bases: object

Wrapper to profile functions.

Parameters: condition (bool) – switch to return the decorated function.
condition

Switch to return the decorated function.

Type: bool
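The idea of a profiler that is switched on by a flag can be sketched with the standard library. This is an illustration only, not lightonml's implementation: the class name ConditionalProfiler and the use of cProfile are assumptions, and the sketch assumes that a false condition leaves the function unprofiled.

```python
import cProfile
import functools
import io
import pstats


class ConditionalProfiler:
    """Hypothetical stand-in for illustration; not lightonml's OptionalProfiler."""

    def __init__(self, condition):
        self.condition = condition

    def __call__(self, func):
        if not self.condition:
            # Profiling disabled: hand back the original function untouched.
            return func

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            profiler = cProfile.Profile()
            try:
                # Run the function under the profiler and return its result.
                return profiler.runcall(func, *args, **kwargs)
            finally:
                # Print the five most expensive entries by cumulative time.
                stream = io.StringIO()
                stats = pstats.Stats(profiler, stream=stream)
                stats.sort_stats("cumulative").print_stats(5)
                print(stream.getvalue())

        return wrapper


@ConditionalProfiler(condition=False)
def square(x):
    return x * x
```

With condition=False the decorator adds no overhead, because the name square is bound to the undecorated function itself.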
cast_01_to_uint8(X)

Casts binary data to uint8.

Parameters: X (np.ndarray) – input data.
Returns: X_uint8 – input data in uint8.
Return type: np.ndarray
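A cast of this kind can be sketched in plain NumPy. The helper name cast_binary_to_uint8 is hypothetical, and the check that the input contains only 0 and 1 is an assumption added for the example; the reference above does not say whether the library validates its input.

```python
import numpy as np


def cast_binary_to_uint8(X):
    """Hypothetical helper: verify that X is binary, then cast it to uint8."""
    X = np.asarray(X)
    # Assumption: reject anything that is not exactly 0 or 1.
    if not np.isin(X, (0, 1)).all():
        raise ValueError("expected an array containing only 0 and 1")
    return X.astype(np.uint8)


X = np.array([[0.0, 1.0], [1.0, 0.0]])
X_uint8 = cast_binary_to_uint8(X)
```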
download(url, directory='')

Downloads data from url into directory.

get_ml_data_dir_path()

Gets the path of the data directory from the JSON config file.

Returns: location of the data folder.
Return type: pathlib.Path
load_data_from_numpy_archive(path_to_file)

Loads data from NumPy archive.

Parameters: path_to_file (str) – path to the NumPy archive to load.
Returns:
  • (X_train, y_train) (tuple of np.ndarray) – train set.
  • (X_test, y_test) (tuple of np.ndarray) – test set.
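Reading a train/test split from an .npz archive can be sketched as follows. The helper load_split_archive is hypothetical, and the key names X_train, y_train, X_test and y_test are assumptions: the reference above does not state how the archive is laid out.

```python
import os
import tempfile

import numpy as np


def load_split_archive(path_to_file):
    """Hypothetical loader; the archive key names are assumptions."""
    with np.load(path_to_file) as archive:
        # Arrays are read when indexed, so they stay valid after the file closes.
        train = (archive["X_train"], archive["y_train"])
        test = (archive["X_test"], archive["y_test"])
    return train, test


with tempfile.TemporaryDirectory() as tmp:
    # Build a small toy archive so the example is self-contained.
    path = os.path.join(tmp, "toy.npz")
    np.savez(path,
             X_train=np.zeros((4, 2)), y_train=np.arange(4),
             X_test=np.ones((2, 2)), y_test=np.arange(2))
    (X_train, y_train), (X_test, y_test) = load_split_archive(path)
```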
load_model(model_path)

Loads the model from a pickle file.

Parameters: model_path (str) – path to the pickle file of the model.
Returns: model – instance of the model.
Return type: BaseEstimator, RegressorMixin or TransformerMixin and children
save_model(model, model_name, model_path='models')

Saves a model in a pickle file.

Parameters:
  • model (BaseEstimator, RegressorMixin or TransformerMixin and children) – instance of the model to save.
  • model_name (str) – name under which the model is saved in the pickle file.
  • model_path (str, defaults to 'models') – path for the pickle file of the saved model.
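A pickle round trip of the kind load_model and save_model describe can be sketched with the standard library alone. The helpers save_pickled and load_pickled are hypothetical; treating model_path as a directory and appending a .pkl suffix to model_name are assumptions made for the example, not documented behavior of lightonml.

```python
import os
import pickle
import tempfile


def save_pickled(model, model_name, model_path="models"):
    """Hypothetical helper: pickle `model` to <model_path>/<model_name>.pkl."""
    os.makedirs(model_path, exist_ok=True)  # assumption: model_path is a directory
    file_path = os.path.join(model_path, model_name + ".pkl")
    with open(file_path, "wb") as f:
        pickle.dump(model, f)
    return file_path


def load_pickled(file_path):
    """Hypothetical helper: read an object back from a pickle file."""
    with open(file_path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    # A plain dict stands in for a fitted estimator.
    toy_model = {"coef": [0.5, -1.0], "intercept": 0.1}
    saved_to = save_pickled(toy_model, "toy", model_path=tmp)
    restored = load_pickled(saved_to)
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from trusted sources.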
select_subset(X, y, classes=range(0, 10), ratio=1, random_state=None)

Selects a subset of a dataset.

Parameters:
  • X (2D np.ndarray) – input data.
  • y (np.ndarray) – targets.
  • classes (list or np.ndarray, defaults to range(0, 10)) – classes in the dataset.
  • ratio (float, defaults to 1) – controls the ratio between examples.
  • random_state (int, RandomState instance or None, optional, defaults to None) – controls the pseudo random number generator used to subsample the dataset.
Returns:
  • X (np.ndarray) – subsampled data.
  • y (np.ndarray) – subsampled targets.
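One possible reading of class-based subsampling is sketched below in plain NumPy. It is an assumption-heavy illustration, not the library's algorithm: the helper subsample_by_class is hypothetical, classes is taken to list the labels to keep, ratio is taken to be the fraction of matching examples retained, and random_state is limited to an int or None because the sketch uses np.random.default_rng.

```python
import numpy as np


def subsample_by_class(X, y, classes=range(10), ratio=1.0, random_state=None):
    """Hypothetical helper: keep a random fraction of the rows whose label is in `classes`."""
    rng = np.random.default_rng(random_state)
    # Indices of the examples whose label belongs to the requested classes.
    kept = np.flatnonzero(np.isin(y, list(classes)))
    # Assumption: `ratio` is the fraction of those examples to retain.
    n_keep = int(ratio * kept.size)
    chosen = rng.choice(kept, size=n_keep, replace=False)
    return X[chosen], y[chosen]


X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
X_sub, y_sub = subsample_by_class(X, y, classes=[0, 1], ratio=0.5, random_state=0)
```

Passing the same integer seed reproduces the same subset, which is the usual purpose of a random_state argument.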