Deep Historical Consistent Neural Network

Module

class prosper_nn.models.dhcnn.dhcnn.DHCNN(n_state_neurons: int, n_features_Y: int, past_horizon: int, forecast_horizon: int, deepness: int, sparsity: float = 0.0, activation=torch.tanh, init_state: torch.Tensor | None = None, learn_init_state: bool = True, teacher_forcing: float = 1, decrease_teacher_forcing: float = 0)[source]

Bases: Module

The DHCNN class creates a Deep Historical Consistent Neural Network. The model stacks multiple HCNNs in levels, where the state of each lower level is passed to the level above it. The first level is an HCNN with the GRU variant 3 implementation.

Parameters:
  • n_state_neurons (int) – The dimension of the state in the HCNN Cell. It must be a positive integer with n_state_neurons >= n_features_Y.

  • n_features_Y (int) – The number of features in the data at each time step. It must be a positive integer.

  • past_horizon (int) – The past horizon gives the number of time steps into the past for which observations are available. It determines the number of comparisons between expectation and observation and therefore the amount of teacher forcing.

  • forecast_horizon (int) – The forecast horizon gives the number of time steps into the future for which no observations are available. It equals the number of forecast steps the model returns.

  • deepness (int) – The number of stacked HCNNs in the neural network. A deepness of 1 results in a normal Historical Consistent Neural Network with the GRU variant 3 implementation.

  • sparsity (float) – The share of weights in the matrix A that are set to zero. These weights are not trainable and therefore stay zero. For large matrices (dimension > 50) this can be necessary to guarantee numerical stability, and it increases the long-term memory of the model.

  • activation (Callable) – The activation function that is applied to the output of the hidden layers. The same function is used on all hidden layers. Defaults to torch.tanh.

  • init_state (torch.Tensor) – The initial states of the model. Optional; chosen randomly if not given. If given, it should have shape (deepness, 1, n_state_neurons).

  • learn_init_state (bool) – Whether the initial hidden state is learned during training.

  • teacher_forcing (float) – The probability that teacher forcing is applied to a single state neuron. The decision is redrawn in every time step, which enforces stochastic learning if the value is smaller than 1.

  • decrease_teacher_forcing (float) – The amount by which teacher_forcing is decreased each epoch.

Return type:

None
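
For reference, a minimal construction sketch that shows the optional arguments from the signature above; the parameter values here are arbitrary and chosen only for illustration:

import torch
from prosper_nn.models.dhcnn import DHCNN

dhcnn = DHCNN(
    n_state_neurons=10,
    n_features_Y=2,
    past_horizon=20,
    forecast_horizon=5,
    deepness=3,
    sparsity=0.2,                    # 20 % of the weights in matrix A are fixed to zero
    activation=torch.tanh,           # applied to all hidden layers
    teacher_forcing=1.0,             # teacher forcing applied to every state neuron
    decrease_teacher_forcing=0.05,   # reduce the probability by 0.05 each epoch
)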

forward(Y: Tensor)[source]
Parameters:

Y (torch.Tensor) – Y must be 3-dimensional with shape (past_horizon, batchsize, n_features_Y). This time series of observations is used to train the model to predict future observations.

Returns:

Contains, for each HCNN level, the past_error (the forecasting errors along the past_horizon, where Y is known) and the forecast (the forecast along the forecast_horizon). Both can be used for backpropagation. shape = (deepness, past_horizon + forecast_horizon, batchsize, n_features_Y)

Return type:

torch.Tensor
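
The returned tensor stacks the past errors and the forecasts along the time dimension. A minimal sketch of how the output can be separated, assuming dhcnn and Y_batch are defined as in the example below:

output = dhcnn(Y_batch)  # shape: (deepness, past_horizon + forecast_horizon, batchsize, n_features_Y)
past_errors = output[:, :past_horizon]   # errors where observations exist; trained towards zero
forecast = output[:, past_horizon:]      # forecasts for the future time steps
forecast_level_0 = forecast[0]           # the first dimension indexes the HCNN levels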

Note

The model uses an individual Historical Consistent Neural Network cell at each depth level.

Example

import torch

from prosper_nn.models.dhcnn import DHCNN
import prosper_nn.utils.generate_time_series_data as gtsd
import prosper_nn.utils.create_input_ecnn_hcnn as ci

# Define network and data parameters
past_horizon = 10
forecast_horizon = 5
n_features_Y = 2
n_data = 20
n_state_neurons = 3
deepness = 3
batchsize = 5

# Initialise Deep Historical Consistent Neural Network
dhcnn = DHCNN(n_state_neurons, n_features_Y, past_horizon, forecast_horizon, deepness)

# Generate data with "unknown" variables U
Y, U = gtsd.sample_data(n_data, n_features_Y=n_features_Y - 1, n_features_U=1)
Y = torch.cat((Y, U), 1)
Y_batches = ci.create_input(Y, past_horizon, batchsize)

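# The past errors are trained towards zero, so the targets are a zero tensor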
targets = torch.zeros((deepness, past_horizon, batchsize, n_features_Y))

# Train model
optimizer = torch.optim.Adam(dhcnn.parameters())
loss_function = torch.nn.MSELoss()

for epoch in range(10):
    for batch_index in range(0, Y_batches.shape[0]):
        Y_batch = Y_batches[batch_index]
        model_output = dhcnn(Y_batch)
        past_errors, forecast = torch.split(model_output, past_horizon, dim=1)

        dhcnn.zero_grad()
        loss = loss_function(past_errors, targets)
        loss.backward()
        optimizer.step()
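
After training, a forecast can be produced without gradient tracking. A minimal sketch, reusing the objects from the example above (the choice of Y_batches[-1] as input is only illustrative):

# Produce a forecast after training
with torch.no_grad():
    output = dhcnn(Y_batches[-1])
past_errors, forecast = torch.split(output, past_horizon, dim=1)
print(forecast.shape)  # (deepness, forecast_horizon, batchsize, n_features_Y)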