lightonml.encoding

lightonml.encoding.base

Encoders

These modules contain implementations of Encoders that transform data into the binary uint8 format required by the OPU.

class BinaryThresholdEncoder(threshold_enc=25, greater_is_one=True)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements binary encoding using a threshold function.

Parameters:
  • threshold_enc (int) – Threshold for the binary encoder. Must be in the interval [0, 255]
  • greater_is_one (bool) – If True, above threshold is 1 and below 0. Vice versa if False.
threshold_enc

Threshold for the binary encoder. Must be in the interval [0, 255]

Type:int
greater_is_one

If True, above threshold is 1 and below 0. Vice versa if False.

Type:bool
fit(X, y=None)

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters:
  • X (np.ndarray,) – the input data to encode.
  • y (np.ndarray,) – the targets data.
Returns:

self

Return type:

BinaryThresholdEncoder

transform(X, y=None)

Transforms a uint8 array with values in [0, 255] into a binary uint8 array with values in {0, 1}.

Parameters:X (np.ndarray of uint8,) – the input data to encode.
Returns:X_enc – the encoded data.
Return type:np.ndarray of uint8,
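
The thresholding described above can be illustrated with a minimal NumPy sketch. It does not call lightonml: binary_threshold is a hypothetical stand-in, and the strict comparison (X > threshold_enc) is an assumption, since the text does not say how values equal to the threshold are treated.

```python
import numpy as np

def binary_threshold(X, threshold_enc=25, greater_is_one=True):
    """Hypothetical stand-in for BinaryThresholdEncoder.transform."""
    # Assumption: values strictly above the threshold count as "above".
    above = X > threshold_enc
    # With greater_is_one=False the mapping is inverted.
    bits = above if greater_is_one else ~above
    return bits.astype(np.uint8)

X = np.array([[0, 25, 26, 255]], dtype=np.uint8)
encoded = binary_threshold(X)                         # above threshold -> 1
inverted = binary_threshold(X, greater_is_one=False)  # above threshold -> 0
```

The output keeps the shape of the input and only changes the value range.
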
class ConcatenatingBitPlanDecoder(n_bits=8, decoding_decay=0.5)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a decoding that works by concatenating bitplanes.

n_bits MUST be the same value used in SeparatedBitPlanEncoder. Read more in the Examples section.

Parameters:
  • n_bits (int, defaults to 8,) – number of bits used during the encoding.
  • decoding_decay (float, defaults to 0.5,) – decay to apply to the bits during the decoding.
n_bits

number of bits used during the encoding.

Type:int,
decoding_decay

decay to apply to the bits during the decoding.

Type:float, defaults to 0.5,
fit(X, y=None)

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters:
  • X (np.ndarray) –
  • y (np.ndarray, optional, defaults to None.) –
Returns:

self

Return type:

ConcatenatingBitPlanDecoder

transform(X, y=None)

Performs the decoding.

Parameters:X (2D np.ndarray of uint8 or uint16,) – input data to decode.
Returns:X_dec – decoded data.
Return type:2D np.ndarray of floats
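
One possible reading of "concatenating bitplanes" is sketched below in plain NumPy. Everything here is an assumption rather than the library's implementation: the input is taken to hold n_bits blocks of n_samples rows (plane-major), plane b is weighted by decoding_decay**b, and the weighted planes of each sample are laid side by side. concatenate_bit_planes is a hypothetical name.

```python
import numpy as np

def concatenate_bit_planes(X, n_bits=8, decoding_decay=0.5):
    """Hypothetical sketch: one output row per sample, planes side by side."""
    n_rows, n_features = X.shape
    n_samples = n_rows // n_bits
    # Assumption: rows are ordered plane by plane.
    planes = X.reshape(n_bits, n_samples, n_features).astype(np.float64)
    # Assumption: plane b is scaled by decoding_decay ** b.
    weights = decoding_decay ** np.arange(n_bits)
    planes = planes * weights[:, None, None]
    # (n_bits, n_samples, n_features) -> (n_samples, n_bits * n_features)
    return np.moveaxis(planes, 0, 1).reshape(n_samples, n_bits * n_features)

X_enc = np.arange(12, dtype=np.uint8).reshape(4, 3)  # 2 planes x 2 samples
X_dec = concatenate_bit_planes(X_enc, n_bits=2)
```

Under these assumptions the decoded array has n_bits times as many columns as the encoded one and n_bits times fewer rows.
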
class Float32Encoder(sign_bit=True, exp_bits=8, mantissa_bits=23)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements an encoding that works by separating bitplanes and selecting how many bits to keep for the sign, exponent and mantissa of the float32.

Parameters:
  • sign_bit (bool, defaults to True,) – if True keeps the bit for the sign.
  • exp_bits (int, defaults to 8,) – number of bits of the exponent to keep.
  • mantissa_bits (int, defaults to 23,) – number of bits of the mantissa to keep.
sign_bit

if True keeps the bit for the sign.

Type:bool, defaults to True,
exp_bits

number of bits of the exponent to keep.

Type:int, defaults to 8,
mantissa_bits

number of bits of the mantissa to keep.

Type:int, defaults to 23,
n_bits

total number of bits to keep.

Type:int,
indices

list of the indices of the bits to keep.

Type:list,
fit(X, y=None)

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters:
  • X (2D np.ndarray) –
  • y (1D np.ndarray) –
Returns:

self

Return type:

Float32Encoder

transform(X)

Performs the encoding.

Parameters:X (2D np.ndarray of uint8, 16, 32 or 64 [n_samples, n_features],) – input data to encode.
Returns:X_enc – encoded input data.
Return type:2D np.ndarray of uint8 [n_samples*n_bits, n_features],
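
To make the sign/exponent/mantissa selection concrete, here is a minimal NumPy sketch based on the IEEE 754 binary32 layout (bit 31 is the sign, bits 30 to 23 the exponent, bits 22 to 0 the mantissa). The helper names are hypothetical, and two points are assumptions: that the most significant exponent and mantissa bits are the ones kept, and that the planes are stacked along the first axis in that order.

```python
import numpy as np

def float32_bit_indices(sign_bit=True, exp_bits=8, mantissa_bits=23):
    """Hypothetical helper: positions of the float32 bits to keep."""
    indices = [31] if sign_bit else []
    # Assumption: keep the most significant exponent bits first...
    indices.extend(range(30, 30 - exp_bits, -1))
    # ...and likewise the most significant mantissa bits.
    indices.extend(range(22, 22 - mantissa_bits, -1))
    return indices

def float32_bit_planes(X, indices):
    """Hypothetical helper: one uint8 plane per kept bit, stacked on axis 0."""
    # Reinterpret the float32 bytes as uint32 without changing them.
    raw = np.ascontiguousarray(X, dtype=np.float32).view(np.uint32)
    planes = [((raw >> np.uint32(i)) & np.uint32(1)).astype(np.uint8)
              for i in indices]
    return np.concatenate(planes, axis=0)

indices = float32_bit_indices(sign_bit=True, exp_bits=2, mantissa_bits=1)
X = np.array([[1.0, -2.0]], dtype=np.float32)
X_enc = float32_bit_planes(X, indices)  # shape (n_samples * n_bits, n_features)
```

With the default arguments the index list has 1 + 8 + 23 = 32 entries, i.e. every bit of the float32 is kept.
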
class MixingBitPlanDecoder(n_bits=8, decoding_decay=0.5)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a decoding that works by mixing bitplanes.

n_bits MUST be the same value used in SeparatedBitPlanEncoder. Read more in the Examples section.

Parameters:
  • n_bits (int, defaults to 8,) – number of bits used during the encoding.
  • decoding_decay (float, defaults to 0.5,) – decay to apply to the bits during the decoding.
n_bits

number of bits used during the encoding.

Type:int,
decoding_decay

decay to apply to the bits during the decoding.

Type:float, defaults to 0.5,
fit(X, y=None)

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters:
  • X (np.ndarray) –
  • y (np.ndarray, optional, defaults to None.) –
Returns:

self

Return type:

MixingBitPlanDecoder

transform(X, y=None)

Performs the decoding.

Parameters:X (2D np.ndarray of uint8 or uint16,) – input data to decode.
Returns:X_dec – decoded data.
Return type:2D np.ndarray of floats
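
One way to picture "mixing bitplanes" is a weighted sum over the planes, sketched below in plain NumPy. This is an illustration under stated assumptions, not the library's code: the input is assumed to hold n_bits blocks of n_samples rows (plane-major), and plane b is assumed to be weighted by decoding_decay**b before summing. mix_bit_planes is a hypothetical name.

```python
import numpy as np

def mix_bit_planes(X, n_bits=8, decoding_decay=0.5):
    """Hypothetical sketch: weighted sum of the planes of each sample."""
    n_rows, n_features = X.shape
    n_samples = n_rows // n_bits
    # Assumption: rows are ordered plane by plane.
    planes = X.reshape(n_bits, n_samples, n_features).astype(np.float64)
    # Assumption: plane b is scaled by decoding_decay ** b.
    weights = decoding_decay ** np.arange(n_bits)
    # Summing over the plane axis keeps the number of features unchanged.
    return np.tensordot(weights, planes, axes=1)

X_enc = np.arange(12, dtype=np.uint8).reshape(4, 3)  # 2 planes x 2 samples
X_dec = mix_bit_planes(X_enc, n_bits=2)
```

Under these assumptions the decoded array has one row per original sample and the same number of columns as the encoded input.
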
class NoDecoding

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a No-Op Decoding class for API consistency.

class NoEncoding

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a No-Op Encoding class for API consistency.

class SeparatedBitPlanEncoder(n_bits=8, starting_bit=0)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements an encoding that works by separating bitplanes.

n_bits + starting_bit must not exceed the bit width of the data fed to the encoder. E.g. if X.dtype is uint8, then n_bits + starting_bit must be at most 8. If instead X.dtype is uint32, then n_bits + starting_bit must be at most 32.

Read more in the Examples section.

Parameters:
  • n_bits (int, defaults to 8,) – number of bits to keep during the encoding. Must be positive.
  • starting_bit (int, defaults to 0,) – bit used to start the encoding, previous bits will be thrown away. Must be non-negative.
n_bits

number of bits to keep during the encoding.

Type:int,
starting_bit

bit used to start the encoding, previous bits will be thrown away.

Type:int,
fit(X, y=None)

No-op. This method doesn’t do anything. It exists purely for compatibility with the scikit-learn transformer API.

Parameters:
  • X (2D np.ndarray) –
  • y (1D np.ndarray) –
Returns:

self

Return type:

SeparatedBitPlanEncoder

transform(X)

Performs the encoding.

Parameters:X (2D np.ndarray of uint8, 16, 32 or 64 [n_samples, n_features],) – input data to encode.
Returns:X_enc – encoded input data.
Return type:2D np.ndarray of uint8 [n_samples*n_bits, n_features]
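
The bit-plane separation can be sketched in plain NumPy as follows. separate_bit_planes is a hypothetical stand-in for the transform; the ordering of the output rows (one block of n_samples rows per bit, starting from starting_bit and going up) is an assumption, chosen only to match the documented output shape.

```python
import numpy as np

def separate_bit_planes(X, n_bits=8, starting_bit=0):
    """Hypothetical sketch: one binary plane per kept bit, stacked on axis 0."""
    one = X.dtype.type(1)
    planes = []
    for b in range(starting_bit, starting_bit + n_bits):
        # Shift bit b down to position 0, then mask everything else.
        shifted = np.right_shift(X, X.dtype.type(b))
        planes.append((shifted & one).astype(np.uint8))
    # Assumption: planes are stacked plane by plane along the first axis.
    return np.concatenate(planes, axis=0)

X = np.array([[5, 2], [7, 0]], dtype=np.uint8)
X_enc = separate_bit_planes(X, n_bits=3)  # shape (n_samples * n_bits, n_features)
```

Here the first two rows of X_enc hold bit 0 of every entry, the next two rows bit 1, and the last two rows bit 2.
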
class SequentialBaseTwoEncoder(n_gray_levels=16)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Implements a base 2 encoding.

E.g. \(5\) is written \(101\) in base 2: \(5 = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0\). Each bit is repeated as many times as its weight (4, 2 and 1), so the encoder will give 1111001.

Parameters:n_gray_levels (int,) – number of values that can be encoded. Must be a power of 2.
n_gray_levels

number of values that can be encoded. Must be a power of 2.

Type:int,
n_bits

number of bits needed to encode n_gray_levels values.

Type:int,
offset

value to subtract to get the minimum to 0.

Type:float,
scale

scaling factor to normalize the data.

Type:float,
fit(X, y=None)

Computes parameters for the normalization.

Must be run only on the training set to avoid leaking information to the dev/test set.

Parameters:
  • X (np.ndarray of uint [n_samples, n_features],) – the input data to encode.
  • y (np.ndarray,) – the targets data.
Returns:

self

Return type:

SequentialBaseTwoEncoder

normalize(X)

Normalize the data in the right range before the integer casting.

Parameters:X (np.ndarray of uint [n_samples, n_features],) – the input data to normalize.
Returns:X_norm – normalized data.
Return type:np.ndarray of uint8 [n_samples, n_features],
transform(X, y=None)

Performs the encoding.

Parameters:X (2D np.ndarray of uint [n_samples, n_features],) – input data to encode.
Returns:X_enc – encoded input data.
Return type:2D np.ndarray of uint8 [n_samples, n_features*(n_gray_levels-1)]
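
The example for this class (5 encoded as 1111001) is consistent with repeating each base 2 digit as many times as its weight, which also gives the documented n_gray_levels-1 columns per feature. The sketch below reproduces that behaviour in plain NumPy. It is a hypothetical illustration: it skips the offset/scale normalization, assumes the input already holds integers in [0, n_gray_levels-1], and assumes the codes of the features are laid out one after the other.

```python
import numpy as np

def base_two_repeat(X, n_gray_levels=8):
    """Hypothetical sketch: repeat each binary digit 2**b times, MSB first."""
    n_bits = n_gray_levels.bit_length() - 1   # n_gray_levels is a power of 2
    X = np.asarray(X, dtype=np.uint8)
    pieces = []
    for b in range(n_bits - 1, -1, -1):
        bit = (X >> b) & 1                     # shape (n_samples, n_features)
        # Repeat the digit along a new last axis, once per unit of weight.
        pieces.append(np.repeat(bit[..., None], 2 ** b, axis=-1))
    code = np.concatenate(pieces, axis=-1)     # (..., n_gray_levels - 1)
    # Assumption: the codes of the features are concatenated per sample.
    return code.reshape(X.shape[0], -1).astype(np.uint8)

X_enc = base_two_repeat(np.array([[5, 2]]), n_gray_levels=8)
```

With n_gray_levels=8, the value 5 becomes 1111001 and the value 2 becomes 0000110, so X_enc has 2 * 7 = 14 columns.
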

lightonml.encoding.utils

This module contains all the functions that are used to implement the data formatting. Coordinates are (y, x).

compute_indices_1d_macro_pixels(n_features, rectangle_shape, feature_shape='square')

Computes indices of the macro pixels in a linear way. It means that it is using the whole width of the rectangle even if the quotient (rectangle_width / macro_pixel_width) is not an integer.

Parameters:
  • n_features (int,) – number of features.
  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.
  • feature_shape (string,) – shape of the macropixels (‘rectangle’ or ‘square’)
Returns:

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.
  • factor (int,) – size of the macropixels.

compute_indices_2d_macro_pixels(n_features, rectangle_shape)

Computes indices of the macro pixels as a classical zoom-in of the square features.

Parameters:
  • n_features (int,) – number of features.
  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.
Returns:

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.
  • factor (int,) – size of the macropixels.
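
The "zoom-in" idea can be illustrated with a small NumPy sketch that works on pixel values rather than on indices. It is not the library function: zoom_square_features is a hypothetical name, the features are assumed to form a perfect square, the factor is assumed to be the largest integer zoom that fits in the rectangle, and the top-left placement is an arbitrary choice for the illustration.

```python
import math
import numpy as np

def zoom_square_features(x, rectangle_shape):
    """Hypothetical sketch: blow each feature up into a factor x factor block."""
    side = math.isqrt(x.size)
    if side * side != x.size:
        raise ValueError("this sketch assumes a square number of features")
    # Assumption: use the largest integer zoom that fits in both directions.
    factor = min(rectangle_shape[0] // side, rectangle_shape[1] // side)
    grid = x.reshape(side, side)
    zoomed = np.repeat(np.repeat(grid, factor, axis=0), factor, axis=1)
    # Assumption: the zoomed square is placed in the top-left corner.
    canvas = np.zeros(rectangle_shape, dtype=x.dtype)
    canvas[:zoomed.shape[0], :zoomed.shape[1]] = zoomed
    return canvas, factor

x = np.array([1, 0, 0, 1], dtype=np.uint8)
canvas, factor = zoom_square_features(x, rectangle_shape=(4, 6))
```

Each feature ends up covering a macro pixel of factor by factor elements, which is the quantity returned as factor above.
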

compute_indices_centered(n_features, rectangle_shape)

Computes indices in order to have the feature values in a square centered in the area.

Parameters:
  • n_features (int,) – number of features.
  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.
Returns:

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.
  • factor (int,) – size of the macropixels.

compute_indices_lined(n_features, rectangle_shape)

Computes indices in order to have the feature values positioned in line.

Parameters:
  • n_features (int,) – number of features.
  • rectangle_shape (tuple of 2 integers,) – shape of the rectangle area.
Returns:

  • indices (np.ndarray of int64/int32,) – indices in the rectangle.
  • factor (int,) – size of the macropixels.

compute_new_indices_greater_rectangle(indices, old_rectangle_shape, old_rectangle_position, new_rectangle_shape)

Computes new indices from an old rectangle to a new and larger one. It is useful in the case where the region of interest is smaller than the DMD area.

Parameters:
  • indices (np.ndarray of uint64/uint32,) – indices in the old rectangle.
  • old_rectangle_shape (tuple of 2 integers,) – shape of the old rectangle.
  • old_rectangle_position (tuple of 2 integers,) – position of the origin of the old rectangle inside the new one (distance from the top margin, distance from the left margin).
  • new_rectangle_shape (tuple of 2 integers,) – shape of the new rectangle.
Returns:

new_indices – indices in the new rectangle.

Return type:

np.ndarray of uint64/uint32,
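
The index conversion can be sketched with NumPy's unravel_index and ravel_multi_index. The function below is a hypothetical illustration, not the library's code, and it assumes that the indices are flat row-major indices and that positions follow the (y, x) convention stated for this module.

```python
import numpy as np

def shift_indices_to_larger_rectangle(indices, old_shape, old_position, new_shape):
    """Hypothetical sketch: re-express flat indices in a larger rectangle."""
    # Flat index -> (row, column) inside the old rectangle.
    rows, cols = np.unravel_index(indices, old_shape)
    # Offset by the position of the old rectangle inside the new one.
    rows = rows + old_position[0]
    cols = cols + old_position[1]
    # (row, column) -> flat index inside the new rectangle.
    return np.ravel_multi_index((rows, cols), new_shape)

old_indices = np.array([0, 4, 5])
new_indices = shift_indices_to_larger_rectangle(
    old_indices, old_shape=(2, 3), old_position=(1, 2), new_shape=(4, 6)
)
```

For instance, index 4 is (1, 1) in the 2x3 rectangle, which becomes (2, 3) in the 4x6 rectangle, i.e. flat index 2 * 6 + 3 = 15.
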

get_formatting_function(n_features, position='2d_macro_pixels', roi_shape=(1140, 912), roi_position=(0, 0), dmd_shape=(1140, 912))

Returns a formatting function that takes feature vectors and returns the DMD-formatted vectors.

Parameters:
  • n_features (int,) – number of features.
  • position (str,) – type of formatting.
  • roi_shape (tuple of ints,) – shape of the ROI on the DMD.
  • roi_position (tuple of ints,) – position of the ROI on the DMD.
  • dmd_shape (tuple of ints,) – shape of the DMD.
Returns:

  • formatting_func (callable,) – callable to perform the formatting of the input arrays.
  • factor (int,) – size of the macropixels.
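
As a rough picture of what such a formatting callable could do, the sketch below builds a closure that writes feature values at precomputed flat indices of a DMD-sized vector. This is a hypothetical illustration only: make_formatting_function is not part of lightonml, macro pixels are ignored (one element per feature), and the flat row-major layout of the output is an assumption.

```python
import numpy as np

def make_formatting_function(indices, dmd_shape=(1140, 912)):
    """Hypothetical sketch: return a callable that scatters features on the DMD."""
    n_pixels = dmd_shape[0] * dmd_shape[1]
    indices = np.asarray(indices)

    def formatting_func(X):
        X = np.atleast_2d(np.asarray(X, dtype=np.uint8))
        # Assumption: one flattened DMD frame per sample, zero outside the ROI.
        out = np.zeros((X.shape[0], n_pixels), dtype=np.uint8)
        out[:, indices] = X
        return out

    return formatting_func

# Small shapes keep the example readable; a real DMD would be much larger.
formatting_func = make_formatting_function([8, 15, 16], dmd_shape=(4, 6))
formatted = formatting_func([[1, 0, 1]])
```

Returning a closure lets the index computation happen once, while the callable itself only performs the scatter for each batch.
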