{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Historical Consistent Neural Network with known features\n",
"\n",
"In this notebook we demonstrate how the prosper_nn package can be used to build and analyze a Historical Consistent Neural Network (HCNN) with (partly) known features.\n",
"Similar to the other tutorials, it begins with a simple version of an HCNN_known_u and shows how the data and the training loop should look like.\n",
"If you want to build an ensemble please refer to [HCNN tutorial](Hcnn.ipynb#Ensemble-of-Historical-Consistent-Neural-Networks)."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Theory\n",
"\n",
"Historical Consistent Neural Networks with known features U are based on the architectures of HCNNs and ECNNs. As a result it belongs to the class of Recurrent Neural Networks. The picture below shows the architecture of the model.\n",
"\n",
"The common HCNN model treats all features as targets and forecasts them into the future. The version of the HCNN presented in this notebook, however, distinguishes between features we want to forecast and features we already know for the future (like e.g. holidays or months). The latter is what 'known U' refers to. As a result, we use all the advantages of the common HCNN while also utilizing our known information to capacity. Instead of trying to forecast holidays in the HCNN (and failing), we provide the model with this information even along the forecast horizon. In a way, this model is like a combination of HCNN (model features internally if we don't know them for the future) and ECNN (supply the model with external features if we know them for the future).\n",
"\n",
"We facilitate the inclusion of known features by concatenating them to the state $r$. These known features are used for calculating the next state but are themselves not modeled as part of the next state. This is why $dim(r_t) = (batchsize, n\\_state\\_neurons + n\\_features\\_U)$ whereas $dim(s_t) = (batchsize, n\\_state\\_neurons)$. Accordingly, $dim(A) = ((n\\_state\\_neurons + n\\_features\\_U) , n\\_state\\_neurons)$.\n",
"\n",
"To calculate the state of the next time step, a non-linearity ($\\tanh$) and the state transition matrix $A$ are applied. These update steps describe the implementation of the HCNN_known_u cell that calculates the output and the following state for one time step. In formula each cell performs the following calculation.\n",
"For readability we use these abbreviations:\n",
" - nst = n_state_neurons \n",
" - nfY = n_features_Y \n",
" - nfU = n_features_U \n",
"\n",
"$$\\hat{z}_t = [\\mathbb{I}_{nfY}, \\mathbb{0}_{nst-nfY}] s_t -y_t^d$$\n",
"$$r_t = [\\mathbb{I}_{nst}, \\mathbb{0}_{nst, nfU}]^T \\cdot s_t - [\\mathbb{I}_{nfY}, \\mathbb{0}_{nfY, nst-nfY}, \\mathbb{0}_{nfY, nfU}]^T \\cdot \\hat{z}_t + [\\mathbb{0}_{nfU, nst}, \\mathbb{I}_{nfU}]^T u_{t+1} $$\n",
"$$ \n",
"s_{t+1} = A \\tanh (r_t) $$\n",
"$$ y_t = [\\mathbb{I}_{nfY}, \\mathbb{0}_{nst-nfY}] \\cdot s_t $$\n",
"The first part of $r$ contains the data, the middle portion the hidden features and the end the info from the external features U.\n",
"\n",
"\n",
""
]
},
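{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check of these dimensions, the following worked example plugs in the values used later in this notebook (`n_state_neurons = 30`, `n_features_U = 2`, `batchsize = 5`). It assumes the batch-first (row-vector) convention suggested by $dim(A)$ above, so the matrix product is written as $\\tanh(r_t) \\cdot A$. This is meant as an illustration and not necessarily how the package implements the product internally.\n",
"\n",
"$$\\underbrace{\\tanh(r_t)}_{(5,\\, 30 + 2)} \\cdot \\underbrace{A}_{(30 + 2,\\, 30)} = \\underbrace{s_{t+1}}_{(5,\\, 30)}$$\n",
"\n",
"Under this convention, the two known features enter the computation of $s_{t+1}$ through the last two rows of $A$, but they are not part of $s_{t+1}$ itself."
]
},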
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": [
"hide_cell"
]
},
"outputs": [],
"source": [
"import sys, os\n",
"\n",
"sys.path.append(os.path.abspath(\"../../..\"))\n",
"sys.path.append(os.path.abspath(\"../..\"))\n",
"sys.path.append(os.path.abspath(\"..\"))\n",
"sys.path.append(os.path.abspath(\".\"))"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import torch.optim as optim\n",
"\n",
"from prosper_nn.models.hcnn_known_u import hcnn_known_u\n",
"from prosper_nn.models.ensemble import Ensemble\n",
"\n",
"import prosper_nn.utils.generate_time_series_data as gtsd\n",
"import prosper_nn.utils.create_input_ecnn_hcnn as ci\n",
"\n",
"import prosper_nn.utils.neuron_correlation_hidden_layers as nchl\n",
"import prosper_nn.utils.visualization as visualization\n",
"from prosper_nn.utils import visualize_forecasts\n",
"from prosper_nn.utils import sensitivity_analysis\n",
"torch.manual_seed(0)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data Preparation \n",
"\n",
"For the data creation look at the [ECNN tutorial](ECNN.ipynb#Data-preparation).\n",
"If we set `n_features_U` to zero, we would revert this architecture back to vanilla HCNN. "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"past_horizon = 30\n",
"forecast_horizon = 5\n",
"n_features_U = 2\n",
"n_features_Y = 3\n",
"future_U = True\n",
"batchsize = 5\n",
"n_batches = 4\n",
"n_data = 100"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"For HCNN_known_u the target data should be in the `shape=(past_horizon, batchsize, n_features_Y)` and the external features which are known for the future in the shape `shape=(past_horizon + forecast_horizon, batchsize, n_features_U)`. "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\bkn\\Projekte\\Prosper\\prosper\\prosper_nn\\utils\\create_input_ecnn_hcnn.py:57: UserWarning: For the last values of Y there are not enough future Us, so they will be discarded.\n",
" warnings.warn(\"For the last values of Y there are not enough \"\n",
"C:\\Users\\bkn\\AppData\\Local\\Temp\\ipykernel_19076\\3361968738.py:3: UserWarning: The number of sequences generated from the data are not a multiple of batchsize. The first 1 sequences will be discarded.\n",
" Y_batches, U_batches = ci.create_input(\n"
]
}
],
"source": [
"# generate data with n_features_U and n_features_Y\n",
"Y, U = gtsd.sample_data(n_data, n_features_Y, n_features_U)\n",
"Y_batches, U_batches = ci.create_input(\n",
" Y, past_horizon, batchsize, U, future_U, forecast_horizon\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Equal to HCNN, the targets of the HCNN_known_u should be in the same shape as $Y$, that is `shape=(past_horizon, batchsize, n_features_Y)`. Because the output of the HCNN_known_u is already the comparison between observation and expectation in the past horizon, the targets are zeros."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"targets = torch.zeros((past_horizon, batchsize, n_features_Y))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Single Historical Consistent Neural Network with known features (HCNN_known_u) \n",
"\n",
"In this section, we apply a HCNN with known features. We first start with the initialization of the model. Then we discuss the training loop and create forecasts for the data we generated. At the end of the section we evaluate the model and analyze it."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialization\n",
"\n",
"Compared to the [HCNN](../api/hcnn.rst) we only have to specify the `n_featues_U` additionally."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"n_state_neurons = 30\n",
"sparsity = 0"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"hcnn_known_u_model = hcnn_known_u.HCNN_KNOWN_U(\n",
" n_state_neurons,\n",
" n_features_U,\n",
" n_features_Y,\n",
" past_horizon,\n",
" forecast_horizon,\n",
" sparsity,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Set optimizer and loss function."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"optimizer = optim.Adam(hcnn_known_u_model.parameters(), lr=0.001)\n",
"loss_function = nn.MSELoss()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training Loop\n",
"\n",
"In this training loop the output of the HCNN_known_u has `shape=(past_horizon + forecast_horizon, batchsize, n_features_Y)`. So there is no difference to the basic HCNN and we only have to give the `U_batch` in the forward of the model."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"epochs = 150\n",
"\n",
"total_loss = epochs * [0]\n",
"for epoch in range(epochs):\n",
" for batch_index in range(0, U_batches.shape[0]):\n",
" hcnn_known_u_model.zero_grad()\n",
"\n",
" U_batch = U_batches[batch_index]\n",
" Y_batch = Y_batches[batch_index]\n",
"\n",
" model_out = hcnn_known_u_model(U_batch, Y_batch)\n",
"\n",
" past_error, forecast = torch.split(model_out, past_horizon)\n",
"\n",
" losses = [loss_function(past_error[i], targets[i]) for i in range(past_horizon)]\n",
" loss = sum(losses)\n",
" loss.backward()\n",
" \n",
" optimizer.step()\n",
" total_loss[epoch] += loss.detach()\n"
]
},
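{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Written out, the loss accumulated for one batch in the loop above is the sum over the past horizon of the mean squared model output. This assumes the default `reduction='mean'` of `nn.MSELoss`, which averages over all `batchsize * n_features_Y` entries of `past_error[i]`. The symbol $B$ is shorthand for `batchsize` and is introduced here only for readability.\n",
"\n",
"$$L = \\sum_{t=1}^{past\\_horizon} \\frac{1}{B \\cdot nfY} \\sum_{b=1}^{B} \\sum_{k=1}^{nfY} \\left( \\hat{z}_{t,b,k} - 0 \\right)^2$$\n",
"\n",
"Because the targets are zero, minimizing $L$ pushes the expectation $y_t$ towards the observation $y_t^d$ at every past time step. The forecast part of the model output does not enter the loss."
]
},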
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Forecast \n",
"\n",
"For a final prediction we only need to forward the U and Y we want to use for the actual forecast through the model. Again, we get the forecast and also the error the model still makes on the known Y."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"example_pred_U = torch.reshape(\n",
" U[0 : (past_horizon + forecast_horizon), :],\n",
" (past_horizon + forecast_horizon, 1, n_features_U),\n",
").float()\n",
"example_pred_Y = torch.reshape(\n",
" Y[0 : (past_horizon + forecast_horizon), :],\n",
" (past_horizon + forecast_horizon, 1, n_features_Y),\n",
").float()\n",
"\n",
"with torch.no_grad():\n",
" hcnn_known_u_model.eval()\n",
"\n",
" model_output = hcnn_known_u_model(example_pred_U, example_pred_Y[:past_horizon])\n",
" past_errors, forecast = torch.split(model_output, past_horizon)\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evaluation\n",
"#### Postprocessing\n",
"Because the output of the model has different meaning for `past_horizon` and `forecast_horizon`, the `expected_timeseries` can be calculated by adding the real observation data `Y` on the `past_error` for the `past_horizon` and concatenate the result to the `forecast` of the model."
]
},
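{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In terms of the formulas from the theory section, this postprocessing can be summarized as follows. Here $T$ is shorthand for `past_horizon` and $out_t$ denotes the model output at time step $t$; both symbols are introduced only for this illustration.\n",
"\n",
"$$y_t = \\begin{cases} out_t + y_t^d & \\text{for } t \\le T \\quad (out_t = \\hat{z}_t) \\\\ out_t & \\text{for } t > T \\end{cases}$$\n",
"\n",
"Since $\\hat{z}_t$ is the expectation minus the observation $y_t^d$, adding the observation back recovers the model expectation $y_t$ in the past horizon."
]
},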
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"expected_timeseries = torch.cat(\n",
" (torch.add(past_errors.squeeze(), Y[:past_horizon]), forecast.squeeze()), dim=0\n",
").detach()\n",
"\n",
"visualize_forecasts.plot_time_series(\n",
" expected_time_series=expected_timeseries[:, 0],\n",
" target=Y[: past_horizon + forecast_horizon, 0],\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "prosper",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
},
"vscode": {
"interpreter": {
"hash": "a604604040b0261c277bc75aa34f15c6f86bb9bc8166d3b0f73ab3af3d1b81ef"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}