more electricity example tweaks; bump version
jwdink committed Jan 8, 2025
1 parent 9d1da45 commit 6a99bf5
Showing 2 changed files with 56 additions and 77 deletions.
131 changes: 55 additions & 76 deletions docs/examples/electricity.ipynb
@@ -76,7 +76,7 @@
{
"data": {
"text/plain": [
"<torch._C.Generator at 0x11d1b06f0>"
"<torch._C.Generator at 0x1146446f0>"
]
},
"execution_count": 3,
@@ -1183,18 +1183,10 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 25,
"id": "cc9c3c39",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/jacobdink/miniconda3/envs/bark-phone/lib/python3.10/site-packages/torch/nn/modules/lazy.py:181: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment.\n"
]
}
],
"outputs": [],
"source": [
"from torchcast.utils.training import SeasonalEmbeddingsTrainer\n",
"\n",
@@ -1259,7 +1251,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 26,
"id": "33119450",
"metadata": {},
"outputs": [
@@ -1276,10 +1268,10 @@
{
"data": {
"text/plain": [
"<torchcast.utils.training.SeasonalEmbeddingsTrainer at 0x3f62271c0>"
"<torchcast.utils.training.SeasonalEmbeddingsTrainer at 0x16213db10>"
]
},
"execution_count": 25,
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
@@ -1327,7 +1319,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 27,
"id": "8fa1e51b",
"metadata": {},
"outputs": [
@@ -1381,7 +1373,7 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 28,
"id": "3b49fb78",
"metadata": {},
"outputs": [
@@ -1424,22 +1416,16 @@
},
{
"cell_type": "markdown",
"id": "95187033",
"id": "d1ab34ad-2c71-4083-9566-410413f51230",
"metadata": {},
"source": [
"How should we incorporate our `season_embedder` neural-network into a state-space model? There are at least two options:\n",
"\n",
"#### Option 1\n",
"\n",
"The first option is to create our fourier-features on the dataframe, and pass these as features into a dataloader.\n",
"\n",
"1. First, we create our time-series model:"
"How should we incorporate our `season_embedder` neural-network into a state-space model? First, we create our time-series model:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "fd7ca2f5",
"execution_count": 29,
"id": "b68d8b86-04fe-4613-8d11-a80ae1d43f3c",
"metadata": {},
"outputs": [],
"source": [
@@ -1457,15 +1443,17 @@
},
{
"cell_type": "markdown",
"id": "caf84e63",
"id": "141cd31c-fb28-4a35-9761-e40b5a7dd262",
"metadata": {},
"source": [
"2. Next, we add our season features to the dataframe, and create a dataloader, passing these feature-names to the `X_colnames` argument:"
"Then, we have two options:\n",
"\n",
"1. The first option is to create our fourier-features on the dataframe, and pass these as features into a dataloader."
]
},
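Note on the first option: the hunk only shows the prose, so here is a minimal, standard-library sketch of what "Fourier features" for a timestamp could look like. The helper `fourier_features`, its arguments, and the 1970 epoch are assumptions made for illustration; this is not torchcast's `add_season_features`, whose implementation is not shown in this diff.

```python
import math
from datetime import datetime, timedelta
from typing import List


def fourier_features(t: datetime, period_days: float, K: int) -> List[float]:
    """Hypothetical helper: sin/cos pairs for harmonics k = 1..K of one seasonal period."""
    epoch = datetime(1970, 1, 1)  # arbitrary reference point (an assumption)
    elapsed_days = (t - epoch).total_seconds() / 86400.0
    x = elapsed_days / period_days  # number of periods elapsed since the epoch
    feats: List[float] = []
    for k in range(1, K + 1):
        feats.append(math.sin(2 * math.pi * k * x))
        feats.append(math.cos(2 * math.pi * k * x))
    return feats


# Two days of hourly timestamps, encoded with a daily period and two harmonics.
times = [datetime(2014, 1, 1) + timedelta(hours=h) for h in range(48)]
daily = [fourier_features(t, period_days=1.0, K=2) for t in times]
```

Each timestamp becomes `2 * K` columns, and timestamps exactly one period apart get (numerically) the same values, which is what lets a downstream network learn a repeating seasonal shape.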
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 30,
"id": "f3244f64",
"metadata": {},
"outputs": [],
@@ -1491,23 +1479,12 @@
"id": "fda8243f",
"metadata": {},
"source": [
"Finally, we train the model, either rolling our own training loop...\n",
"\n",
"```python\n",
"for i in range(num_epochs):\n",
"    for batch in dataloader_kf_nn:\n",
"        batch = batch.to(DEVICE)\n",
"        y, X = batch.tensors\n",
"        predictions = kf_nn(y, X=X, start_offsets=batch.start_offsets)\n",
"        # use predictions.log_prob on optimizer, etc.\n",
"```\n",
"\n",
"...or, even better, using a tool like PyTorch Lightning. Torchcast also includes a simple tool for this, the `StateSpaceTrainer`:"
"...then we'd train our model with a tool like PyTorch Lightning. Torchcast also includes a simple tool for this, the `StateSpaceTrainer`:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 31,
"id": "39eff5f3",
"metadata": {},
"outputs": [],
@@ -1530,21 +1507,19 @@
"id": "4ab296fc",
"metadata": {},
"source": [
"#### Option 2\n",
"\n",
"An even simpler (though less general) option is just to leverage the util methods in the `SeasonalEmbeddingsTrainer`, which handles converting a `TimeSeriesDataset` into a tensor of fourier terms:"
"2. An even simpler (though less general) option is just to leverage the util methods in the `SeasonalEmbeddingsTrainer`, which handles converting a `TimeSeriesDataset` into a tensor of fourier terms:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 32,
"id": "a7d0abfa",
"metadata": {},
"outputs": [],
"source": [
"def dataset_to_kwargs(batch: TimeSeriesDataset) -> dict:\n",
"    seasonX = season_trainer.times_to_model_mat(batch.times()).to(dtype=torch.float, device=DEVICE)\n",
"    return {'X' : season_trainer.module.season_nn(seasonX)}\n",
"    return {'X' : season_trainer.module(seasonX)}\n",
"\n",
"ss_trainer = StateSpaceTrainer(\n",
"    module=kf_nn,\n",
@@ -1558,12 +1533,12 @@
"id": "535db134",
"metadata": {},
"source": [
"Then we don't need to use `add_season_features` when creating our data-loader, since `times_to_model_mat` will create them per-batch as needed (which will be much easier on our GPU's memory):"
"Then we don't need to use `add_season_features` when creating our data-loader, since `season_trainer.times_to_model_mat` will create them per-batch as needed (which will be much easier on our GPU's memory):"
]
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 33,
"id": "160ecea4",
"metadata": {},
"outputs": [],
Expand Down Expand Up @@ -1595,14 +1570,14 @@
"state-space models are much slower to train _per_ epoch). So it's much more efficient to pre-train the network first. Then it's up to\n",
"us whether we want to continue training the network, or just freeze its weights (i.e. exclude it from the optimizer) and just train the\n",
"state-space models' parameters. Here we're freezing them by not assigning the network as an attribute (so that the parameters don't get\n",
"passed to when we run ``torch.optim.Adam(kf_nn.parameters(), lr=.05)``.\n",
"passed to when we run ``torch.optim.Adam(kf_nn.parameters(), lr=.05)``).\n",
"\n",
"</div>"
]
},
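Note on the freezing remark: a small PyTorch sketch of the mechanism it describes, under stated assumptions. `embedder` and `head` are hypothetical stand-ins (plain `nn.Linear` layers) for the pre-trained season network and the state-space model; they are not torchcast objects. Because `embedder` is never assigned as an attribute of `head`, `head.parameters()` does not include its weights, so the optimizer never updates them.

```python
import torch
from torch import nn

torch.manual_seed(0)

embedder = nn.Linear(6, 4)  # stand-in for the pre-trained season network (assumption)
head = nn.Linear(4, 1)      # stand-in for the model whose parameters we train (assumption)

# Only `head`'s parameters are handed to the optimizer; `embedder` is used
# through a plain function call, so it is never registered on `head`.
optimizer = torch.optim.Adam(head.parameters(), lr=0.05)

x = torch.randn(8, 6)
y = torch.randn(8, 1)

before_emb = embedder.weight.detach().clone()
before_head = head.weight.detach().clone()

loss = ((head(embedder(x)) - y) ** 2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

emb_unchanged = torch.equal(embedder.weight, before_emb)
head_changed = not torch.equal(head.weight, before_head)
```

Gradients are still computed for `embedder` during `backward()`; they are simply never applied. Wrapping the embedder call in `torch.no_grad()` would skip that computation as well, at the cost of ruling out later fine-tuning without a code change.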
{
"cell_type": "code",
"execution_count": 42,
"execution_count": 34,
"id": "51fdb529",
"metadata": {},
"outputs": [
@@ -1639,30 +1614,22 @@
},
{
"cell_type": "markdown",
"id": "0cec0bcd",
"id": "0cc5d6fe-fe0c-4c54-9a40-7c02faacf24a",
"metadata": {},
"source": [
"### Evaluation"
]
},
{
"cell_type": "markdown",
"id": "a36e1ec8",
"metadata": {},
"source": [
"#### Generating Torchcast Forecasts for all groups"
"Now we'll create forecasts for all the groups, and back-transform them, for plotting and evaluation."
]
},
{
"cell_type": "code",
"execution_count": 43,
"execution_count": 35,
"id": "7e38a631",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "598fdfa4ec4c479b9aeb5760bd5d61cf",
"model_id": "19c334095df54e6d8157b2c31a4b998e",
"version_major": 2,
"version_minor": 0
},
@@ -1861,7 +1828,7 @@
"[9548098 rows x 8 columns]"
]
},
"execution_count": 43,
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
@@ -1881,8 +1848,6 @@
"\n",
"    df_all_preds = []\n",
"    for batch in tqdm(dataloader_all):\n",
"        # if example_group not in batch.group_names:\n",
"        #     continue\n",
"        batch = batch.to(DEVICE)\n",
"        seasonX = season_trainer.times_to_model_mat(batch.times()).to(dtype=torch.float, device=DEVICE)\n",
"        pred = kf_nn(batch.tensors[0], X=season_trainer.module(seasonX), start_offsets=batch.start_offsets)\n",
@@ -1897,7 +1862,7 @@
},
{
"cell_type": "code",
"execution_count": 44,
"execution_count": 36,
"id": "6907f9bb",
"metadata": {},
"outputs": [
@@ -1918,14 +1883,30 @@
"plot_2x2(df_all_preds.query(\"group==@example_group\"), actual_colname='kW', split_dt=SPLIT_DT)"
]
},
{
"cell_type": "markdown",
"id": "2c3bc7dd-f0c8-453a-8310-2c7c182087b5",
"metadata": {},
"source": [
"Success! If our example group is representative, our forecasting model was able to use the embeddings to capture complex seasonal structure."
]
},
{
"cell_type": "markdown",
"id": "0cec0bcd",
"metadata": {},
"source": [
"### Evaluation"
]
},
{
"cell_type": "markdown",
"id": "2ab8a44f",
"metadata": {},
"source": [
"#### A Simple Baseline\n",
"\n",
"We've seen that, for this dataset, generating forecasts that are *sane* is an already an achievement.\n",
"We've seen that, for this dataset, generating forecasts that are *sane* is already an achievement.\n",
"\n",
"But of course, ideally we'd actually have some kind of a quantitative measure of how good our forecasts are.\n",
"\n",
@@ -1934,7 +1915,7 @@
},
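Note on the baseline: the baseline's construction falls outside the hunks shown here, so the following is only a guess at its spirit, prompted by the name `df_baseline365`: a seasonal-naive forecast that predicts each timestamp with the value observed 365 days earlier. The helpers `seasonal_naive` and `mean_abs_error` and the toy data are hypothetical and use only the standard library.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def seasonal_naive(history: Dict[datetime, float],
                   forecast_times: List[datetime],
                   lag_days: int = 365) -> Dict[datetime, Optional[float]]:
    """Predict each time with the value seen `lag_days` earlier (None if unavailable)."""
    lag = timedelta(days=lag_days)
    return {t: history.get(t - lag) for t in forecast_times}


def mean_abs_error(actual: Dict[datetime, float],
                   predicted: Dict[datetime, Optional[float]]) -> float:
    """Mean absolute error over the times that have both an actual and a prediction."""
    errs = [abs(actual[t] - p) for t, p in predicted.items()
            if p is not None and t in actual]
    return sum(errs) / len(errs) if errs else float("nan")


# Toy data: one day of hourly readings in 2013, and the same day in 2014 shifted up by 2.
history = {datetime(2013, 1, 1, h): 10.0 + h for h in range(24)}
actual = {datetime(2014, 1, 1, h): 12.0 + h for h in range(24)}

preds = seasonal_naive(history, list(actual))
mae = mean_abs_error(actual, preds)
```

A baseline like this costs almost nothing to compute, which makes it a useful floor: a forecasting model that cannot beat "same time last year" on held-out data has not learned much beyond the seasonal pattern.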
{
"cell_type": "code",
"execution_count": 45,
"execution_count": 38,
"id": "67852453",
"metadata": {},
"outputs": [
@@ -1989,7 +1970,7 @@
},
{
"cell_type": "code",
"execution_count": 46,
"execution_count": 39,
"id": "34f7097c-b5ef-4384-bfc5-b0b7b61c1396",
"metadata": {},
"outputs": [
@@ -2169,7 +2150,7 @@
"[19096196 rows x 7 columns]"
]
},
"execution_count": 46,
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
@@ -2178,8 +2159,6 @@
"df_compare = (df_all_preds[['group', 'mean', 'time', 'kW', 'dataset']]\n",
" .rename(columns={'mean' : 'torchcast'})\n",
" .merge(df_baseline365, how='left'))\n",
"assert (df_compare['baseline'].notnull() | (df_compare['dataset'] == 'train')).all()\n",
"assert df_compare['torchcast'].notnull().all()\n",
"\n",
"df_compare_long = df_compare.melt(\n",
" id_vars=['group', 'time', 'kW', 'dataset'], \n",
@@ -2203,7 +2182,7 @@
},
{
"cell_type": "code",
"execution_count": 48,
"execution_count": 40,
"id": "724fcadb-b243-4ea1-9666-3ac99f6e7b9a",
"metadata": {},
"outputs": [
@@ -2213,7 +2192,7 @@
"<Axes: title={'center': 'Torchcast vs. Baseline: Error over Time'}, xlabel='date', ylabel='Abs(Error)'>"
]
},
"execution_count": 48,
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
},
@@ -2260,7 +2239,7 @@
},
{
"cell_type": "code",
"execution_count": 54,
"execution_count": 41,
"id": "8cdac807",
"metadata": {},
"outputs": [
2 changes: 1 addition & 1 deletion torchcast/__init__.py
@@ -1 +1 @@
__version__ = '0.4.2'
__version__ = '0.4.3'
