Historical Consistent Neural Network

Module

Prosper_nn provides implementations for specialized time series forecasting neural networks and related utility functions.

Copyright (C) 2022 Nico Beck, Julia Schemm, Henning Frechen, Jacob Fidorra,

Denni Schmidt, Sai Kiran Srivatsav Gollapalli

This file is part of Prosper_nn.

Prosper_nn is free software: you can redistribute it and/or modify

it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

class prosper_nn.models.hcnn.hcnn.HCNN(n_state_neurons: int, n_features_Y: int, past_horizon: int, forecast_horizon: int, sparsity: float = 0.0, activation: ~typing.Type[~torch.autograd.function.Function] = <built-in method tanh of type object>, cell_type: str = 'hcnn_cell', init_state: ~torch.Tensor | None = None, learn_init_state: bool = True, teacher_forcing: float = 1, decrease_teacher_forcing: float = 0, backward_full_Y: bool = True, ptf_in_backward: bool = True)[source]

Bases: Module

The HCNN class creates a Historical Consistent Neural Network.

A Historical Consistent Neural Network belongs to the class of Recurrent Neural Networks. A special feature of the architecture is that it has no input in the common sense. Instead, all the inputs are also interpreted as targets that are equally important. So the architecture does not distinguish between input and output features. Furthermore, it uses teacher forcing to correct the hidden state.

Parameters:
  • n_state_neurons (int) – The dimension of the state in the HCNN Cell. It must be a positive integer with n_state_neurons >= n_features_Y.

  • n_features_Y (int) – The number of features at each time step. It must be a positive integer.

  • past_horizon (int) – The past horizon gives the number of time steps into the past for which observations are available. It represents the number of comparisons between expectation and observation and therefore the amount of teacher forcing.

  • forecast_horizon (int) – The forecast horizon gives the number of time steps into the future for which no observations are available. It represents the number of forecast steps the model returns.

  • sparsity (float) – The share of weights that are set to zero in the matrix A. These weights are not trainable and therefore always zero. For big matrices (dimension > 50) this can be necessary to guarantee numerical stability, and it increases the long-term memory of the model (see the sketch after this parameter list).

  • activation (Type[torch.autograd.Function]) – The activation function that is applied to the output of the hidden layers. The same function is used on all hidden layers. No function is applied if none is given.

  • cell_type (str) – The recurrent cell used inside the network. Possible choices: hcnn_cell (the standard HCNN cell) or hcnn_gru_3_variant (a variant of the gated recurrent unit).

  • init_state (torch.Tensor) – The initial state of the HCNN model. Can be given optionally and is chosen randomly if not specified.

  • learn_init_state (boolean) – Whether the initial hidden state is learned during training.

  • teacher_forcing (float) – The probability that teacher forcing is applied for a single state neuron. In each time step this is repeated and therefore enforces stochastic learning if the value is smaller than 1.

  • decrease_teacher_forcing (float) – The amount by which teacher_forcing is decreased each epoch.

Return type:

None
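The effect of the sparsity parameter can be pictured as a fixed zero mask on the transition matrix A. The following is a minimal, illustrative sketch of that idea and not necessarily how the library implements it internally:

import torch

# Illustrative sketch: a fixed random mask zeroes a share of the entries of A.
# Masked entries stay zero and receive zero gradient, so they are effectively
# not trainable.
n_state_neurons = 60
sparsity = 0.2

A = torch.nn.Parameter(torch.randn(n_state_neurons, n_state_neurons))
mask = (torch.rand(n_state_neurons, n_state_neurons) >= sparsity).float()

def sparse_A():
    return A * mask  # entries where mask == 0 are always zero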

adjust_teacher_forcing()[source]

Decrease teacher_forcing each epoch by decrease_teacher_forcing until it reaches zero.

Return type:

None
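A minimal sketch of how adjust_teacher_forcing can be combined with decrease_teacher_forcing during training (the parameter values are illustrative):

from prosper_nn.models.hcnn import HCNN

# Teacher forcing starts at 1.0 and is decreased by 0.1 after every epoch
# until it reaches zero.
hcnn = HCNN(n_state_neurons=3, n_features_Y=2, past_horizon=10, forecast_horizon=5,
            teacher_forcing=1.0, decrease_teacher_forcing=0.1)

for epoch in range(20):
    # ... run the usual training loop over the batches here ...
    hcnn.adjust_teacher_forcing()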

disable_calculate_forecast()[source]
enable_calculate_forecast()[source]
forward(Y: Tensor)[source]
Parameters:

Y (torch.Tensor) – Y should be 3-dimensional with the shape = (past_horizon, batchsize, n_features_Y). This time series of observations is used for training the model in order to predict future observations.

Returns:

Contains past_error, the forecasting errors along the past_horizon where Y is known, and forecast, the forecast along the forecast_horizon. Both can be used for backpropagation. shape = (past_horizon + forecast_horizon, batchsize, n_features_Y)

Return type:

torch.Tensor

reset_cell_outputs(batchsize, device, dtype)[source]

Example

import torch

from prosper_nn.models.hcnn import HCNN
import prosper_nn.utils.generate_time_series_data as gtsd
import prosper_nn.utils.create_input_ecnn_hcnn as ci

# Define network and data parameters
past_horizon = 10
forecast_horizon = 5
n_features_Y = 2
n_data = 20
n_state_neurons = 3
batchsize = 5

# Initialise Historical Consistent Neural Network
hcnn = HCNN(n_state_neurons, n_features_Y, past_horizon, forecast_horizon)

# Generate data with "unknown" variables U
Y, U = gtsd.sample_data(n_data, n_features_Y=n_features_Y - 1, n_features_U=1)
Y = torch.cat((Y, U), 1)
Y_batches = ci.create_input(Y, past_horizon, batchsize)

targets = torch.zeros((past_horizon, batchsize, n_features_Y))

# Train model
optimizer = torch.optim.Adam(hcnn.parameters())
loss_function = torch.nn.MSELoss()

for epoch in range(10):
    for batch_index in range(0, Y_batches.shape[0]):
        Y_batch = Y_batches[batch_index]
        model_output = hcnn(Y_batch)
        past_error, forecast = torch.split(model_output, past_horizon)

        hcnn.zero_grad()
        loss = loss_function(past_error, targets)
        loss.backward()
        optimizer.step()

# Quick check with random data:
# n_state_neurons=5, n_features_Y=1, past_horizon=20, forecast_horizon=5
hcnn = HCNN(5, 1, 20, 5)
input = torch.randn(20, 1, 1)
past_error, forecast = torch.split(hcnn(input), 20)

Reference

Zimmermann HG., Tietz C., Grothmann R. (2012) Forecasting with Recurrent Neural Networks: 12 Tricks. In: Montavon G., Orr G.B., Müller KR. (eds) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol 7700. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35289-8_37

Historical Consistent Neural Network Cell

Module

class prosper_nn.models.hcnn.hcnn_cell.HCNNCell(n_state_neurons: int, n_features_Y: int, sparsity: float = 0.0, activation: ~typing.Type[~torch.autograd.function.Function] = <built-in method tanh of type object>, teacher_forcing: float = 1, backward_full_Y: bool = True, ptf_in_backward: bool = True)[source]

Bases: Module

The HCNNCell class models one forecast step in a Historical Consistent Neural Network. By applying the cell recursively, an HCNN network can be built. Mathematically, the output of one cell is calculated as:

\[s_{t+1} = A \tanh \left( s_t -[\mathbb{1}, 0]^T \cdot ( [\mathbb{1}, 0] s_t -y_t^d) \right)\]
\[y_t = [\mathbb{1}, 0] \cdot s_t\]
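The following is a minimal sketch of these two equations in plain PyTorch, written only to illustrate the math and not the internal implementation of the cell (the tensor E stands for the matrix \([\mathbb{1}, 0]\)):

import torch

n_state_neurons, n_features_Y, batchsize = 5, 2, 1

A = torch.randn(n_state_neurons, n_state_neurons)   # state transition matrix
E = torch.eye(n_features_Y, n_state_neurons)        # [1, 0]: reads the first n_features_Y state neurons

state = torch.randn(batchsize, n_state_neurons)     # s_t
observation = torch.randn(batchsize, n_features_Y)  # y_t^d

expectation = state @ E.T                            # y_t = [1, 0] s_t
corrected = state - (expectation - observation) @ E  # s_t - [1, 0]^T ([1, 0] s_t - y_t^d)
next_state = torch.tanh(corrected) @ A.T             # s_{t+1} = A tanh(...)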
Parameters:
  • n_state_neurons (int) – The dimension of the state in the HCNN Cell. It must be a positive integer with n_state_neurons >= n_features_Y.

  • n_features_Y (int) – The number of features at each time step. It must be a positive integer.

  • sparsity (float) – The share of weights that are set to zero in the matrix A. These weights are not trainable and therefore always zero. For big matrices (dimension > 50) this can be necessary to guarantee numerical stability and increases the long-term memory of the model.

  • activation (Type[torch.autograd.Function]) – The activation function that is applied to the output of the hidden layers. The same function is used on all hidden layers. No function is applied if none is given.

  • teacher_forcing (float) – The probability that teacher forcing is applied for a single state neuron. In each time step this is repeated and therefore enforces stochastic learning if the value is smaller than 1. Since not all state neurons are corrected in that case, this is called partial teacher forcing (ptf).

  • backward_full_Y (bool) – Whether partial teacher forcing dropout is applied after or before the output is calculated. If True, the dropout layer is applied afterwards and the output contains the errors of all features. If False, dropout is applied before and the output contains only the errors that are not dropped; the remaining entries are zero and therefore contain no error.

  • ptf_in_backward (bool) – If True, the dropout layer is handled as usual in the backward pass. If False, the dropout layer is skipped in the backward pass.

Return type:

None

forward(state: Tensor, observation: Tensor | None = None)[source]
Parameters:
  • state (torch.Tensor) – The previous state of the HCNN. shape = (batch_size, n_state_neurons)

  • observation (torch.Tensor) – The observation is the data for the given timestamp which should be learned. It contains all observational features in the batch and has the shape = (batchsize, n_features_Y). It is an optional variable. If no variable is given, the observation is not subtracted from the expectation to create the output variable. Additionally, no teacher forcing is applied on the state vector.

Returns:

  • state (torch.Tensor) – The updated state of the HCNN.

  • output (torch.Tensor) – The output of the HCNN Cell. If an observation is given, this output is calculated by the expectation minus the observation. If no observation is given, the output is equal to the expectation.

set_teacher_forcing(teacher_forcing: float) → None[source]

Set teacher forcing to a specific value, both in the dropout layer and as an attribute of the cell.

Parameters:

teacher_forcing (float) – The value teacher forcing is set to in the cell.

Return type:

None

class prosper_nn.models.hcnn.hcnn_cell.PartialTeacherForcing(p: float = 0.5, inplace: bool = False)[source]

Bases: Dropout

Applies Dropout as partial teacher forcing. Therefore, scaling of the Dropout is reverted, so that the partial teacher forcing sets the values to the original observation.
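A minimal sketch of the idea behind reverting the dropout scaling (illustrative, not the class internals): nn.Dropout scales kept entries by 1 / (1 - p) in training mode, so multiplying by (1 - p) afterwards restores the original values while dropped entries stay zero.

import torch

p = 0.5
dropout = torch.nn.Dropout(p)
error = torch.randn(1, 5)
# Kept entries keep their original value, dropped entries are zero.
partially_forced = dropout(error) * (1 - p)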

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(input: Tensor) → Tensor[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

prosper_nn.models.hcnn.hcnn_cell.no_dropout_backward(module, grad_in, grad_out)[source]

Example

import torch
from prosper_nn.models.hcnn.hcnn_cell import HCNNCell

hcnn_cell = HCNNCell(5, 1)
observation = torch.randn(1, 1)
state = torch.randn(1, 5)
outputs = []
for i in range(6):
    state, output = hcnn_cell(state, observation)
    outputs.append(output)
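Since the observation argument is optional, the same cell can also be applied without an observation; the returned output then equals the expectation. A minimal continuation of the example above:

# Forecast steps: no observation is passed, so no teacher forcing is applied
# and the output is the expectation.
forecasts = []
for t in range(3):
    state, expectation = hcnn_cell(state)
    forecasts.append(expectation)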

Historical Consistent Neural Network GRU Variant 3 Cell

Module

class prosper_nn.models.hcnn.hcnn_gru_cell.HCNN_GRU_3_variant(n_state_neurons: int, n_features_Y: int, sparsity: float = 0.0, activation: ~typing.Type[~torch.autograd.function.Function] = <built-in method tanh of type object>, teacher_forcing: float = 1, backward_full_Y: bool = True, ptf_in_backward: bool = True)[source]

Bases: Module

The HCNN_GRU_3_variant class models one forecast step in an HCNN with a cell similar to the GRU variant 3 in the following paper. One difference is that \(r_t\) is fixed to a vector of ones in our implementation.

R. Dey and F. M. Salem, “Gate-variants of Gated Recurrent Unit (GRU) neural networks,” 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 2017, pp. 1597-1600, doi: 10.1109/MWSCAS.2017.8053243

By applying the cell recursively, an HCNN network can be built. Mathematically, the output of one cell is calculated as follows, where \(s_t^{\prime}\) serves as an intermediate result:

\[s_{t}^\prime = \tanh \left( s_t -[\mathbb{1}, 0]^T \cdot ( [\mathbb{1}, 0] s_t -y_t^d) \right)\]
\[s_{t+1} = (1 - \sigma(update\_vector)) \circ s_{t}^\prime + \sigma(update\_vector) \circ A \tanh (s_{t}^\prime)\]
\[\hat{y}_t = [\mathbb{1}, 0] \cdot s_t\]
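A minimal sketch of these update equations in plain PyTorch, only to illustrate the math and not the internal implementation of the cell (E stands for \([\mathbb{1}, 0]\); update_vector is a learnable parameter in the real cell):

import torch

n_state_neurons, n_features_Y, batchsize = 5, 2, 1

A = torch.randn(n_state_neurons, n_state_neurons)   # state transition matrix
E = torch.eye(n_features_Y, n_state_neurons)        # [1, 0]
update_vector = torch.randn(n_state_neurons)        # learnable in the real cell

state = torch.randn(batchsize, n_state_neurons)     # s_t
observation = torch.randn(batchsize, n_features_Y)  # y_t^d

expectation = state @ E.T                                      # y_hat_t = [1, 0] s_t
s_prime = torch.tanh(state - (expectation - observation) @ E)  # interim state s_t'
gate = torch.sigmoid(update_vector)
next_state = (1 - gate) * s_prime + gate * (torch.tanh(s_prime) @ A.T)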
Parameters:
  • n_state_neurons (int) – The dimension of the state in the HCNN Cell. It must be a positive integer with n_state_neurons >= n_features_Y.

  • n_features_Y (int) – The number of features at each time step. It must be a positive integer.

  • sparsity (float) – The share of weights that are set to zero in the matrix A. These weights are not trainable and therefore always zero. For big matrices (dimension > 50) this can be necessary to guarantee numerical stability and increases the long-term memory of the model.

  • activation (Type[torch.autograd.Function]) – The activation function that is applied to the output of the hidden layers. The same function is used on all hidden layers. No function is applied if none is given.

  • teacher_forcing (float) – The probability that teacher forcing is applied for a single state neuron. In each time step this is repeated and therefore enforces stochastic learning if the value is smaller than 1. Since not all state neurons are corrected in that case, this is called partial teacher forcing (ptf).

  • backward_full_Y (bool) – Whether partial teacher forcing dropout is applied after or before the output is calculated. If True, the dropout layer is applied afterwards and the output contains the errors of all features. If False, dropout is applied before and the output contains only the errors that are not dropped; the remaining entries are zero and therefore contain no error.

  • ptf_in_backward (bool) – If True, the dropout layer is handled as usual in the backward pass. If False, the dropout layer is skipped in the backward pass.

Return type:

None

forward(state: Tensor, observation: Tensor | None = None)[source]
Parameters:
  • state (torch.Tensor) – The previous state of the HCNN. shape = (batch_size, n_state_neurons)

  • observation (torch.Tensor) – The observation is the data for the given timestamp which should be learned. It contains all observational features in the batch and has the shape = (batchsize, n_features_Y). It is an optional variable. If no variable is given, the observation is not subtracted from the expectation to create the output variable. Additionally, no teacher forcing is applied on the state vector.

Returns:

  • state (torch.Tensor) – The updated state of the HCNN.

  • output (torch.Tensor) – The output of the HCNN Cell. If an observation is given, this output is calculated by the expectation minus the observation. If no observation is given, the output is equal to the expectation.

set_teacher_forcing(teacher_forcing: float) → None[source]

Set teacher forcing to a specific value, both in the dropout layer and as an attribute of the cell.

Parameters:

teacher_forcing (float) – The value teacher forcing is set to in the cell.

Return type:

None

Example

import torch
from prosper_nn.models.hcnn.hcnn_gru_cell import HCNN_GRU_3_variant

hcnn_cell = HCNN_GRU_3_variant(5, 1)
observation = torch.randn(1, 1)
state = torch.randn(1, 5)
outputs = []
for i in range(6):
    state, output = hcnn_cell(state, observation)
    outputs.append(output)