Historical Consistent Neural Network with Known U

Module

class prosper_nn.models.hcnn_known_u.hcnn_known_u.HCNN_KNOWN_U(n_state_neurons: int, n_features_U: int, n_features_Y: int, past_horizon: int, forecast_horizon: int, sparsity: float = 0.0, activation: Type[torch.autograd.Function] = torch.tanh, init_state: torch.Tensor | None = None, learn_init_state: bool = True, teacher_forcing: float = 1, decrease_teacher_forcing: float = 0, backward_full_Y: bool = True, ptf_in_backward: bool = True)[source]

Bases: Module

The HCNN_KNOWN_U class creates a Historical Consistent Neural Network with known features appended at each time step.

A Historical Consistent Neural Network belongs to the class of Recurrent Neural Networks. Unlike the common HCNN, which does not take any inputs, the special feature of this architecture is that the model is provided with the future values of features whenever they are known (e.g., holidays). These features are called U. The model can be seen as a combination of the HCNN (features are modeled internally and forecasted) and the ECNN (feature values are provided to the model even along the forecast horizon).

Parameters:
  • n_state_neurons (int) – The dimension of the state in the HCNN_KNOWN_U Cell. It must be a positive integer with n_state_neurons >= n_features_Y.

  • n_features_U (int) – The number of features for which future values are known at each time step. Must be a positive integer or zero.

  • n_features_Y (int) – The number of features (including targets) whose dynamics are supposed to be modeled (and forecasted) internally. Must be a positive integer.

  • past_horizon (int) – The past horizon gives the number of time steps into the past whose observations are used for forecasting. It determines the number of comparisons between expectation and observation and therefore the amount of teacher forcing.

  • forecast_horizon (int) – The forecast horizon gives the number of time steps into the future that are supposed to be forecasted. It equals the number of forecast steps the model returns.

  • sparsity (float) – The share of weights that are set to zero in the matrix A. These weights are not trainable and therefore always zero. For big matrices (dimension > 50) this can be necessary to guarantee numerical stability and it increases the long-term memory of the model.

  • activation (Type[torch.autograd.Function]) – The activation function that is applied on the output of the hidden layers. The same function is used on all hidden layers. No function is applied if no function is given.

  • init_state (torch.Tensor) – The initial state of the HCNN model. Can be given optionally and is chosen randomly if not specified.

  • learn_init_state (bool) – Whether the initial hidden state is learned during training.

  • teacher_forcing (float) – The probability that teacher forcing is applied to a single state neuron. This is repeated in each time step and therefore enforces stochastic learning if the value is smaller than 1.

  • decrease_teacher_forcing (float) – The amount by which teacher_forcing is decreased each epoch.

Return type:

None
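A minimal instantiation sketch with an explicitly provided initial state. The shape (1, n_state_neurons) used for init_state is an assumption made for illustration and is not specified in the signature above; verify it against the implementation.

import torch
from prosper_nn.models.hcnn_known_u import hcnn_known_u

n_state_neurons = 20
init_state = torch.zeros(1, n_state_neurons)  # assumed shape, for illustration only

model = hcnn_known_u.HCNN_KNOWN_U(
    n_state_neurons=n_state_neurons,
    n_features_U=10,
    n_features_Y=5,
    past_horizon=15,
    forecast_horizon=5,
    init_state=init_state,
    learn_init_state=False,  # keep the provided initial state fixed
)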

adjust_teacher_forcing()[source]

Decrease teacher_forcing each epoch by decrease_teacher_forcing until it reaches zero.

Return type:

None
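A hedged sketch of how this method could be used: it assumes adjust_teacher_forcing is called manually once per epoch in the training loop, so that teacher_forcing shrinks by decrease_teacher_forcing after every epoch.

from prosper_nn.models.hcnn_known_u import hcnn_known_u

model = hcnn_known_u.HCNN_KNOWN_U(
    n_state_neurons=20,
    n_features_U=10,
    n_features_Y=5,
    past_horizon=15,
    forecast_horizon=5,
    teacher_forcing=1.0,
    decrease_teacher_forcing=0.01,
)
for epoch in range(10):
    # ... forward, backward and optimizer steps over all batches ...
    model.adjust_teacher_forcing()  # teacher_forcing is reduced by 0.01, floored at zero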

forward(U: Tensor, Y: Tensor)[source]

Parameters:
  • U (torch.Tensor) – U should be 3-dimensional with shape = (past_horizon + forecast_horizon, batchsize, n_features_U). This time series of known features is appended to the hidden state at each time step, along both the past and the forecast horizon, in order to predict future observations. It only makes sense for features whose future values are known.

  • Y (torch.Tensor) – Y should be 3-dimensional with shape = (past_horizon, batchsize, n_features_Y). This time series of observations is used to train the model to predict future observations. It contains the features (including targets) whose dynamics are supposed to be modeled internally and then forecasted.

Returns:

Contains past_error, the forecasting errors along the past_horizon where Y is known, and forecast, the forecast along the forecast_horizon. Both can be used for backpropagation. The shape is (past_horizon + forecast_horizon, batchsize, n_features_Y).

Return type:

torch.Tensor
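A brief sketch of the expected tensor shapes (the parameter values are illustrative assumptions, not defaults): U spans the full past plus forecast horizon, Y only the past horizon, and the returned tensor can be split into past errors and forecasts.

import torch
from prosper_nn.models.hcnn_known_u import hcnn_known_u

past_horizon, forecast_horizon = 15, 5
batchsize, n_features_U, n_features_Y = 5, 10, 5

model = hcnn_known_u.HCNN_KNOWN_U(20, n_features_U, n_features_Y, past_horizon, forecast_horizon)

U = torch.randn(past_horizon + forecast_horizon, batchsize, n_features_U)
Y = torch.randn(past_horizon, batchsize, n_features_Y)

output = model(U, Y)  # shape: (past_horizon + forecast_horizon, batchsize, n_features_Y)
past_error, forecast = torch.split(output, past_horizon)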

Example

import prosper_nn.utils.generate_time_series_data as gtsd
import prosper_nn.utils.create_input_ecnn_hcnn as ci
from prosper_nn.models.hcnn_known_u import hcnn_known_u
import torch

# Define network parameters
n_features_U = 10  # setting this to zero reverts to a vanilla HCNN with teacher forcing
batchsize = 5
past_horizon = 15
forecast_horizon = 5
n_state_neurons = 20
n_data = 50
n_features_Y = 5
sparsity = 0
teacher_forcing = 1
decrease_teacher_forcing = 0.0001

# Generate data
Y, U = gtsd.sample_data(n_data, n_features_Y, n_features_U)
Y_batches, U_batches = ci.create_input(
    Y,
    past_horizon,
    batchsize,
    U,
    True,  # Has to be True for HCNN_KNOWN_U
    forecast_horizon,
)

# Inspect the shapes of the created batches
Y_batches.shape, U_batches.shape

# Initialize HCNN_KNOWN_U
hcnn_known_u = hcnn_known_u.HCNN_KNOWN_U(
    n_state_neurons,
    n_features_U,
    n_features_Y,
    past_horizon,
    forecast_horizon,
    sparsity,
    teacher_forcing=teacher_forcing,
    decrease_teacher_forcing=decrease_teacher_forcing,
)

# Set the optimizer, loss and targets
optimizer = torch.optim.Adam(hcnn_known_u.parameters(), lr=0.01)
loss_function = torch.nn.MSELoss()
targets = torch.zeros((past_horizon, batchsize, n_features_Y))

# Train model
epochs = 150
for epoch in range(epochs):
    for batch_index in range(0, U_batches.shape[0]):
        hcnn_known_u.zero_grad()
        U_batch = U_batches[batch_index]
        Y_batch = Y_batches[batch_index]
        model_out = hcnn_known_u(U_batch, Y_batch)
        past_error, forecast = torch.split(model_out, past_horizon)
        loss = loss_function(past_error, targets)
        loss.backward()
        optimizer.step()
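
# Forecast with the trained model (a hedged sketch: the last batch is reused as the
# most recent data, and the first of the n_features_Y features is assumed to be the target)
hcnn_known_u.eval()
with torch.no_grad():
    model_out = hcnn_known_u(U_batches[-1], Y_batches[-1])
past_error, forecast = torch.split(model_out, past_horizon)
target_forecast = forecast[:, :, 0]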

# Minimal standalone usage with random data
from prosper_nn.models.hcnn_known_u.hcnn_known_u import HCNN_KNOWN_U

hcnn_known_u_model = HCNN_KNOWN_U(50, 3, 5, 10, 5)
U_input = torch.randn(15, 1, 3)  # (past_horizon + forecast_horizon, batchsize, n_features_U)
Y_input = torch.randn(10, 1, 5)  # (past_horizon, batchsize, n_features_Y)
past_error, forecast = torch.split(hcnn_known_u_model(U_input, Y_input), 10)