Labeller

class gtda.time_series.Labeller(size=10, func=<function std>, func_params=None, percentiles=None, n_steps_future=1)[source]

Target creation from sliding windows over a univariate time series.

Useful to define a time series forecasting task in which labels are obtained from future values of the input time series, via the application of a function to time windows.

Parameters
  • size (int, optional, default: 10) – Size of each sliding window.

  • func (callable, optional, default: numpy.std) – Function to be applied to each window.

  • func_params (dict or None, optional, default: None) – Additional keyword arguments for func.

  • percentiles (list of real numbers between 0 and 100 inclusive, or None, optional, default: None) – If None, creates a target for a regression task. Otherwise, creates a target for an n-class classification task where n = len(percentiles) + 1.

  • n_steps_future (int, optional, default: 1) – Number of steps in the future for the predictive task.
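The role of size, func and func_params can be illustrated with a short NumPy-only sketch. It is not the library's implementation: the helper name window_values is hypothetical, and the alignment of window values with input samples (which depends on n_steps_future) is deliberately left out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def window_values(series, size, func=np.std, func_params=None):
    """Apply func to every length-`size` window of a 1D series (stride 1).

    Hypothetical helper for illustration only; not part of gtda.
    """
    series = np.asarray(series).ravel()
    kwargs = {} if func_params is None else dict(func_params)
    # One row per window: shape (len(series) - size + 1, size).
    windows = sliding_window_view(series, size)
    # func must accept an `axis` argument, as NumPy reductions do.
    return func(windows, axis=1, **kwargs)


series = np.arange(10)
mins = window_values(series, size=3, func=np.min)
print(mins)  # [0 1 2 3 4 5 6 7]

# Extra keyword arguments reach func through func_params.
medians = window_values(series, size=3, func=np.percentile,
                        func_params={"q": 50})
```

A series of length 10 yields 8 windows of size 3, so the result has 8 entries.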

thresholds_

Values corresponding to each percentile, based on data seen in fit.

Type

list of floats or None if percentiles is None
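The relation between percentiles, thresholds and the resulting n = len(percentiles) + 1 classes can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's code: the helper to_classes is hypothetical, thresholds are computed on the raw values, percentiles are assumed sorted in increasing order, and a value equal to a threshold is assigned to the upper class.

```python
import numpy as np


def to_classes(values, percentiles):
    """Turn continuous values into class labels via percentile thresholds.

    Hypothetical helper for illustration only; not part of gtda.
    """
    values = np.asarray(values, dtype=float)
    # One threshold per requested percentile.
    thresholds = [float(np.percentile(values, p)) for p in percentiles]
    # np.digitize counts how many thresholds each value reaches, which
    # gives labels in the range 0 .. len(percentiles).
    labels = np.digitize(values, thresholds)
    return thresholds, labels


values = np.arange(8)
thresholds, labels = to_classes(values, percentiles=[25, 75])
print(thresholds)  # [1.75, 5.25]
print(labels)      # [0 0 1 1 1 1 2 2]
```

Two percentiles produce three classes, matching n = len(percentiles) + 1.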

Examples

>>> import numpy as np
>>> from gtda.time_series import Labeller
>>> # Create a time series
>>> X = np.arange(10)
>>> labeller = Labeller(size=3, func=np.min)
>>> # Fit and transform X
>>> X, y = labeller.fit_transform_resample(X, X)
>>> print(X)
[1 2 3 4 5 6 7 8]
>>> print(y)
[0 1 2 3 4 5 6 7]

__init__(size=10, func=<function std>, func_params=None, percentiles=None, n_steps_future=1)[source]

Initialize self. See help(type(self)) for accurate signature.

fit(X, y=None)[source]

Compute thresholds_ and return the estimator.

Parameters
  • X (ndarray of shape (n_samples,) or (n_samples, 1)) – Univariate time series to build a target for.

  • y (None) – There is no need for a target, yet the pipeline API requires this parameter.

Returns

self

Return type

object

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (ndarray of shape (n_samples,) or (n_samples, 1)) – Univariate time series to build a target for.

  • y (None) – There is no need for a target, yet the pipeline API requires this parameter.

Returns

Xt – The cut input time series.

Return type

ndarray of shape (n_samples_new,)

fit_transform_resample(X, y, **fit_params)

Fit to data, then transform the input and resample the target.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X and a resampled version of y.

Parameters
  • X (ndarray of shape (n_samples, …)) – Input data.

  • y (ndarray of shape (n_samples,)) – Target data.

Returns

  • Xt (ndarray of shape (n_samples, …)) – Transformed input.

  • yr (ndarray of shape (n_samples, …)) – Resampled target.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

resample(y, X=None)[source]

Resample y.

Parameters
  • y (ndarray of shape (n_samples,)) – Time series to build a target for.

  • X (None) – There is no need for X, yet the pipeline API requires this parameter.

Returns

yr – Target for the prediction task.

Return type

ndarray of shape (n_samples_new,)

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

object
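The <component>__<parameter> convention is inherited from scikit-learn and can be shown with a generic scikit-learn pipeline. The step names "scale" and "model" and the estimators used here are chosen only for illustration and are unrelated to Labeller itself.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Step names ("scale", "model") are arbitrary labels chosen for this sketch.
pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge(alpha=1.0))])

# Nested parameters are addressed as <component>__<parameter>.
# set_params returns the estimator itself, so calls can be chained.
pipe.set_params(model__alpha=0.1, scale__with_mean=False)

# get_params(deep=True) exposes the same nested names.
params = pipe.get_params(deep=True)
print(params["model__alpha"])      # 0.1
print(params["scale__with_mean"])  # False
```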

transform(X, y=None)[source]

Cuts X so it is aligned with y.

Parameters
  • X (ndarray of shape (n_samples,) or (n_samples, 1)) – Univariate time series to build a target for.

  • y (None) – There is no need for a target, yet the pipeline API requires this parameter.

Returns

Xt – The cut input time series.

Return type

ndarray of shape (n_samples_new,)

transform_resample(X, y)

Transform the input and resample the target.

Applies transform to X and resample to y using the already fitted estimator, and returns a transformed version of X and a resampled version of y.

Parameters
  • X (ndarray of shape (n_samples, …)) – Input data.

  • y (ndarray of shape (n_samples,)) – Target data.

Returns

  • Xt (ndarray of shape (n_samples, …)) – Transformed input.

  • yr (ndarray of shape (n_samples, …)) – Resampled target.