# Topology in time series forecasting¶

This notebook shows how `giotto-tda`

can be used to create topological
features for time series forecasting tasks, and how to integrate them
into `scikit-learn`

–compatible pipelines.

In particular, we will concentrate on topological features which are
created from consecutive **sliding windows** over the data. In sliding
window models, a single time series array `X`

of shape
`(n_timestamps, n_features)`

is turned into a time series of windows
over the data, with a new shape
`(n_windows, n_samples_per_window, n_features)`

. There are two main
issues that arise when building forecasting models with sliding windows:

`n_windows`

is smaller than`n_timestamps`

. This is because we cannot have more windows than there are timestamps without padding`X`

, and this is not done by`giotto-tda`

.`n_timestamps - n_windows`

is even larger if we decide to pick a large stride between consecutive windows.The target variable

`y`

needs to be properly “aligned” with each window so that the forecasting problem is meaningful and e.g. we don’t “leak” information from the future. In particular,`y`

needs to be “resampled” so that it too has length`n_windows`

.

To deal with these issues, `giotto-tda`

provides a selection of
transformers with `resample`

, `transform_resample`

and
`fit_transform_resample`

methods. These are inherited from a
`TransformerResamplerMixin`

base class. Furthermore, `giotto-tda`

provides a drop-in replacement for `scikit-learn`

’s `Pipeline`

which extends it to allow chaining `TransformerResamplerMixin`

s with
regular `scikit-learn`

estimators.

If you are looking at a static version of this notebook and would like to run its contents, head over to GitHub and download the source.

## See also¶

Topology of time series which explains how transforming time series into point clouds via the

*Takens embedding*procedure makes topological feature extraction possible.Topological feature extraction using VietorisRipsPersistence and PersistenceEntropy for a quick introduction to general topological feature extraction in

`giotto-tda`

.

**License: AGPLv3**

`SlidingWindow`

¶

Let us start with a simple example of a “time series” `X`

with a
corresponding target `y`

of the same length.

```
import numpy as np
n_timestamps = 10
X, y = np.arange(n_timestamps), np.arange(n_timestamps) - n_timestamps
X, y
```

```
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
array([-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]))
```

We can instantiate our sliding window transformer-resampler and run it
on the pair `(X, y)`

:

```
from gtda.time_series import SlidingWindow
window_size = 3
stride = 2
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X, y)
X_sw, yr
```

```
(array([[1, 2, 3],
[3, 4, 5],
[5, 6, 7],
[7, 8, 9]]),
array([-7, -5, -3, -1]))
```

We note a couple of things: - `fit_transform_resample`

returns a pair:
the window-transformed `X`

and the resampled and aligned `y`

. -
`SlidingWindow`

has made a choice for us on how to resample `y`

and
line it up with the windows from `X`

: a window on `X`

corresponds to
the *last* value in a corresponding window over `y`

. This is common in
time series forecasting where, for example, `y`

could be a shift of
`X`

by one timestamp. - Some of the initial values of `X`

may not be
found in `X_sw`

. This is because `SlidingWindow`

only ensures the
*last* value is represented in the last window, regardless of the
stride.

## Multivariate time series example: Sliding window + topology `Pipeline`

¶

`giotto-tda`

’s topology transformers expect 3D input. But our
`X_sw`

above is 2D. How do we capture interesting properties of the
topology of input time series then? For univariate time series, it turns
out that a good way is to use the “time delay embedding” or “Takens
embedding” technique explained in the first part of Topology of time
series.
But as this involves an extra layer of complexity, we leave it for later
and concentrate for now on an example with a simpler API which also
demonstrates the use of a `giotto-tda`

`Pipeline`

.

Surprisingly, this involves multivariate time series input!

```
rng = np.random.default_rng(42)
n_features = 2
X = rng.integers(0, high=20, size=(n_timestamps, n_features), dtype=int)
X
```

```
array([[ 1, 15],
[13, 8],
[ 8, 17],
[ 1, 13],
[ 4, 1],
[10, 19],
[14, 15],
[14, 15],
[10, 2],
[16, 9]])
```

We are interpreting this input as a time series in two variables, of
length `n_timestamps`

. The target variable is the same `y`

as
before.

```
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X, y)
X_sw, yr
```

```
(array([[[13, 8],
[ 8, 17],
[ 1, 13]],
[[ 1, 13],
[ 4, 1],
[10, 19]],
[[10, 19],
[14, 15],
[14, 15]],
[[14, 15],
[10, 2],
[16, 9]]]),
array([-7, -5, -3, -1]))
```

`X_sw`

is now a complicated-looking array, but it has a simple
interpretation. Again, `X_sw[i]`

is the `i`

-th window on `X`

, and
it contains `window_size`

samples from the original time series. This
time, the samples are not scalars but 1D arrays.

What if we suspect that the way in which the **correlations** between
the variables evolve over time can help forecast the target `y`

? This
is a common situation in neuroscience, where each variable could be data
from a single EEG sensor, for instance.

`giotto-tda`

exposes a `PearsonDissimilarity`

transformer which
creates a 2D dissimilarity matrix from each window in `X_sw`

, and
stacks them together into a single 3D object. This is the correct format
(and information content!) for a typical topological transformer in
`gtda.homology`

. See also Topological feature extraction from
graphs
for an in-depth look. Finally, we can extract simple scalar features
using a selection of transformers in `gtda.diagrams`

.

```
from gtda.time_series import PearsonDissimilarity
from gtda.homology import VietorisRipsPersistence
from gtda.diagrams import Amplitude
PD = PearsonDissimilarity()
X_pd = PD.fit_transform(X_sw)
VR = VietorisRipsPersistence(metric="precomputed")
X_vr = VR.fit_transform(X_pd) # "precomputed" required on dissimilarity data
Ampl = Amplitude()
X_a = Ampl.fit_transform(X_vr)
X_a
```

```
array([[0.18228669, 0. ],
[0.03606068, 0. ],
[0.28866041, 0. ],
[0.01781238, 0. ]])
```

Notice that we are not acting on `y`

above. We are simply creating
features from each window using topology! *Note*: it’s two features per
window because we used the default value for `homology_dimensions`

in
`VietorisRipsPersistence`

, not because we had two variables in the
time series initially!

We can now put this all together into a `giotto-tda`

`Pipeline`

which combines both the sliding window transformation on `X`

and
resampling of `y`

with the feature extraction from the windows on
`X`

.

*Note*: while we could import the `Pipeline`

class and use its
constructor, we use the convenience function `make_pipeline`

instead,
which is a drop-in replacement for
scikit-learn’s.

```
from sklearn import set_config
set_config(display='diagram') # For HTML representations of pipelines
from gtda.pipeline import make_pipeline
pipe = make_pipeline(SW, PD, VR, Ampl)
pipe
```

Pipeline(steps=[('slidingwindow', SlidingWindow(size=3, stride=2)), ('pearsondissimilarity', PearsonDissimilarity()), ('vietorisripspersistence', VietorisRipsPersistence(metric='precomputed')), ('amplitude', Amplitude())])

SlidingWindow(size=3, stride=2)

PearsonDissimilarity()

VietorisRipsPersistence(metric='precomputed')

Amplitude()

Finally, if we have a *regression* task on `y`

we can add a final
estimator such as scikit-learn’s `RandomForestRegressor`

as a final
step in the previous pipeline, and fit it!

```
from sklearn.ensemble import RandomForestRegressor
RFR = RandomForestRegressor()
pipe = make_pipeline(SW, PD, VR, Ampl, RFR)
pipe
```

Pipeline(steps=[('slidingwindow', SlidingWindow(size=3, stride=2)), ('pearsondissimilarity', PearsonDissimilarity()), ('vietorisripspersistence', VietorisRipsPersistence(metric='precomputed')), ('amplitude', Amplitude()), ('randomforestregressor', RandomForestRegressor())])

SlidingWindow(size=3, stride=2)

PearsonDissimilarity()

VietorisRipsPersistence(metric='precomputed')

Amplitude()

RandomForestRegressor()

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

```
(array([-5.96, -4.4 , -4.16, -2.4 ]), 0.76264)
```

## Univariate time series – `TakensEmbedding`

and `SingleTakensEmbedding`

¶

The first part of Topology of time
series
explains a commonly used technique for converting a univariate time
series into a single **point cloud**. Since topological features can be
extracted from any point cloud, this is a gateway to time series
analysis using topology. The second part of that notebook shows how to
transform a *batch* of time series into a batch of point clouds, and how
to extract topological descriptors from each of them independently.
While in that notebook this is applied to a time series classification
task, in this notebook we are concerned with topology-powered
*forecasting* from a single time series.

Reasoning by analogy with the multivariate case above, we can look at
sliding windows over `X`

as small time series in their own right and
track the evolution of *their* topology against the variable of interest
(or against itself, if we are interested in unsupervised tasks such as
anomaly detection).

There are two ways in which we can implement this idea in
`giotto-tda`

: 1. We can first apply a `SlidingWindow`

, and then an
instance of `TakensEmbedding`

. 2. We can *first* compute a global
Takens embedding of the time series via `SingleTakensEmbedding`

, which
takes us from 1D/column data to 2D data, and *then* partition the 2D
data of vectors into sliding windows via `SlidingWindow`

.

The first route ensures that we can run our “topological feature
extraction track” in parallel with other feature-generation pipelines
from sliding windows, without experiencing shape mismatches. The second
route seems a little upside-down and it is not generally recommended,
but it has the advantange that globally “optimal” parameters for the
“time delay” and “embedding dimension” parameters can be computed
automatically by `SingleTakensEmbedding`

.

Below is what each route would look like.

*Remark:* In the presence of noise, a small sliding window size is
likely to reduce the reliability of the estimate of the time series’
local topology.

### Option 1: `SlidingWindow`

+ `TakensEmbedding`

¶

`TakensEmbedding`

is not a `TransformerResamplerMixin`

, but this is
not a problem in the context of a `Pipeline`

when we order things in
this way.

```
from gtda.time_series import TakensEmbedding
X = np.arange(n_timestamps)
window_size = 5
stride = 2
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X, y)
X_sw, yr
```

```
(array([[1, 2, 3, 4, 5],
[3, 4, 5, 6, 7],
[5, 6, 7, 8, 9]]),
array([-5, -3, -1]))
```

```
time_delay = 1
dimension = 2
TE = TakensEmbedding(time_delay=time_delay, dimension=dimension)
X_te = TE.fit_transform(X_sw)
X_te
```

```
array([[[1, 2],
[2, 3],
[3, 4],
[4, 5]],
[[3, 4],
[4, 5],
[5, 6],
[6, 7]],
[[5, 6],
[6, 7],
[7, 8],
[8, 9]]])
```

```
VR = VietorisRipsPersistence() # No "precomputed" for point clouds
Ampl = Amplitude()
RFR = RandomForestRegressor()
pipe = make_pipeline(SW, TE, VR, Ampl, RFR)
pipe
```

Pipeline(steps=[('slidingwindow', SlidingWindow(size=5, stride=2)), ('takensembedding', TakensEmbedding()), ('vietorisripspersistence', VietorisRipsPersistence()), ('amplitude', Amplitude()), ('randomforestregressor', RandomForestRegressor())])

SlidingWindow(size=5, stride=2)

TakensEmbedding()

VietorisRipsPersistence()

Amplitude()

RandomForestRegressor()

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

```
(array([-3., -3., -3.]), 0.0)
```

### Option 2: `SingleTakensEmbeding`

+ `SlidingWindow`

¶

Note that `SingleTakensEmbedding`

is also a
`TransformerResamplerMixin`

, and that the logic for
resampling/aligning `y`

is the same as in `SlidingWindow`

.

```
from gtda.time_series import SingleTakensEmbedding
X = np.arange(n_timestamps)
STE = SingleTakensEmbedding(parameters_type="search", time_delay=2, dimension=3)
X_ste, yr = STE.fit_transform_resample(X, y)
X_ste, yr
```

```
(array([[0, 2],
[1, 3],
[2, 4],
[3, 5],
[4, 6],
[5, 7],
[6, 8],
[7, 9]]),
array([-8, -7, -6, -5, -4, -3, -2, -1]))
```

```
window_size = 5
stride = 2
SW = SlidingWindow(size=window_size, stride=stride)
X_sw, yr = SW.fit_transform_resample(X_ste, yr)
X_sw, yr
```

```
(array([[[1, 3],
[2, 4],
[3, 5],
[4, 6],
[5, 7]],
[[3, 5],
[4, 6],
[5, 7],
[6, 8],
[7, 9]]]),
array([-3, -1]))
```

From here on, it is easy to push a very similar pipeline through as in the multivariate case:

```
VR = VietorisRipsPersistence() # No "precomputed" for point clouds
Ampl = Amplitude()
RFR = RandomForestRegressor()
pipe = make_pipeline(STE, SW, VR, Ampl, RFR)
pipe
```

Pipeline(steps=[('singletakensembedding', SingleTakensEmbedding(dimension=3, time_delay=2)), ('slidingwindow', SlidingWindow(size=5, stride=2)), ('vietorisripspersistence', VietorisRipsPersistence()), ('amplitude', Amplitude()), ('randomforestregressor', RandomForestRegressor())])

SingleTakensEmbedding(dimension=3, time_delay=2)

SlidingWindow(size=5, stride=2)

VietorisRipsPersistence()

Amplitude()

RandomForestRegressor()

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

```
(array([-2.07, -2.07]), -0.004899999999999904)
```

### Integrating non-topological features¶

The best results are obtained when topological methods are used not in
isolation but in **combination** with other methods. Here’s an example
where, in parallel with the topological feature extraction from local
sliding windows using **Option 2** above, we also compute the mean and
variance in each sliding window. A `scikit-learn`

`FeatureUnion`

is
used to combine these very different sets of features into a single
pipeline object.

```
from functools import partial
from sklearn.preprocessing import FunctionTransformer
from sklearn.pipeline import FeatureUnion
from sklearn.base import clone
mean = FunctionTransformer(partial(np.mean, axis=1, keepdims=True))
var = FunctionTransformer(partial(np.var, axis=1, keepdims=True))
pipe_topology = make_pipeline(TE, VR, Ampl)
feature_union = FeatureUnion([("window_mean", mean),
("window_variance", var),
("window_topology", pipe_topology)])
pipe = make_pipeline(SW, feature_union, RFR)
pipe
```

Pipeline(steps=[('slidingwindow', SlidingWindow(size=5, stride=2)), ('featureunion', FeatureUnion(transformer_list=[('window_mean', FunctionTransformer(func=functools.partial(, axis=1, keepdims=True))), ('window_variance', FunctionTransformer(func=functools.partial( , axis=1, keepdims=True))), ('window_topology', Pipeline(steps=[('takensembedding', TakensEmbedding()), ('vietorisripspersistence', VietorisRipsPersistence()), ('amplitude', Amplitude())]))])), ('randomforestregressor', RandomForestRegressor())])

SlidingWindow(size=5, stride=2)

FeatureUnion(transformer_list=[('window_mean', FunctionTransformer(func=functools.partial(, axis=1, keepdims=True))), ('window_variance', FunctionTransformer(func=functools.partial( , axis=1, keepdims=True))), ('window_topology', Pipeline(steps=[('takensembedding', TakensEmbedding()), ('vietorisripspersistence', VietorisRipsPersistence()), ('amplitude', Amplitude())]))])

FunctionTransformer(func=functools.partial(, axis=1, keepdims=True))

FunctionTransformer(func=functools.partial(, axis=1, keepdims=True))

TakensEmbedding()

VietorisRipsPersistence()

Amplitude()

RandomForestRegressor()

```
pipe.fit(X, y)
y_pred = pipe.predict(X)
score = pipe.score(X, y)
y_pred, score
```

```
(array([-4.34, -3.48, -1.58]), 0.8747)
```

## Endogeneous target preparation with `Labeller`

¶

Let us say that we simply wish to predict the future of a time series
from itself. This is very common in the study of financial markets for
example. `giotto-tda`

provides convenience classes for target
preparation from a time series. This notebook only shows a very simple
example: many more options are described in `Labeller`

’s
documentation.

If we wished to create a target `y`

from `X`

such that `y[i]`

is
equal to `X[i + 1]`

, while also modifying `X`

and `y`

so that they
still have the same length, we could proceed as follows:

```
from gtda.time_series import Labeller
X = np.arange(10)
Lab = Labeller(size=1, func=np.max)
Xl, yl = Lab.fit_transform_resample(X, X)
Xl, yl
```

```
(array([0, 1, 2, 3, 4, 5, 6, 7, 8]), array([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

Notice that we are feeding two copies of `X`

to
`fit_transform_resample`

in this case!

This is what fitting an end-to-end pipeline for future prediction using topology could look like. Again, you are encouraged to include your own non-topological features in the mix!

```
SW = SlidingWindow(size=5)
TE = TakensEmbedding(time_delay=1, dimension=2)
VR = VietorisRipsPersistence()
Ampl = Amplitude()
RFR = RandomForestRegressor()
# Full pipeline including the regressor
pipe = make_pipeline(Lab, SW, TE, VR, Ampl, RFR)
pipe
```

Pipeline(steps=[('labeller', Labeller(func=, size=1)), ('slidingwindow', SlidingWindow(size=5)), ('takensembedding', TakensEmbedding()), ('vietorisripspersistence', VietorisRipsPersistence()), ('amplitude', Amplitude()), ('randomforestregressor', RandomForestRegressor())])

Labeller(func=, size=1)

SlidingWindow(size=5)

TakensEmbedding()

VietorisRipsPersistence()

Amplitude()

RandomForestRegressor()

```
pipe.fit(X, X)
y_pred = pipe.predict(X)
y_pred
```

```
array([6.948, 6.948, 6.948, 6.948, 6.948])
```

## Where to next?¶

There are two additional simple

`TransformerResamplerMixin`

s in`gtda.time_series`

:`Resampler`

and`Stationarizer`

.The sort of pipeline for topological feature extraction using Takens embedding is a bit crude. More sophisticated methods exist for extracting robust topological summaries from (windows on) time series. A good source of inspiration is the following paper:

Persistent Homology of Complex Networks for Dynamic State Detection, by A. Myers, E. Munch, and F. A. Khasawneh.

The module

`gtda.graphs`

contains several transformers implementing the main algorithms proposed there.Advanced users may be interested in

`ConsecutiveRescaling`

, which can be found in`gtda.point_clouds`

.The notebook Case study: Lorenz attractor is an advanced use-case for

`TakensEmbedding`

and other time series forecasting techniques inspired by topology.