Fix admonition formatting
uadnan committed Jan 6, 2025
1 parent 9cd078a commit 4e7419d
Showing 1 changed file with 14 additions and 4 deletions.
18 changes: 14 additions & 4 deletions docs/examples/electricity.ipynb
@@ -1576,10 +1576,20 @@
"id": "5ae5e987",
"metadata": {},
"source": [
"<div class=\"alert alert-info\">\n",
" <b>Training End-to-End</b>\n",
" <p>Above, we never actually registered <code>season_trainer.module</code> as an attribute of our KalmanFilter (i.e. we didn't do <code>kf_nn.season_nn = season_trainer.module</code>). This means that we won't continue training the embeddings as we train our KalmanFilter. Why not? For that matter, why did we pre-train in the first place? Couldn't we have just registered an untrained embeddings network and trained the whole thing end to end?</p>\n",
" <p>In practice, neural-networks have many more parameters and take many more epochs than our state-space models (and conversely, our state-space models are much slower to train _per_ epoch). So it's much more efficient to pre-train the network first. Then it's up to us whether we want to continue training the network, or just freeze its weights (i.e. exclude it from the optimizer) and just train the state-space models' parameters. Here we're freezing them by not assigning the network as an attribute (so that the parameters don't get passed to when we run <code>torch.optim.Adam(kf_nn.parameters(), lr=.05)</code>.</p>\n",
"<div class=\"admonition note\">\n",
"<div class=\"admonition-title\">Training End-to-End</div>\n",
"\n",
"Above, we never actually registered ``season_trainer.module`` as an attribute of our KalmanFilter (i.e. we didn't do\n",
"``kf_nn.season_nn = season_trainer.module``). This means that we won't continue training the embeddings as we train our KalmanFilter.\n",
"Why not? For that matter, why did we pre-train in the first place? Couldn't we have just registered an untrained embeddings network\n",
"and trained the whole thing end to end?\n",
"\n",
"In practice, neural-networks have many more parameters and take many more epochs than our state-space models (and conversely, our\n",
"state-space models are much slower to train _per_ epoch). So it's much more efficient to pre-train the network first. Then it's up to\n",
"us whether we want to continue training the network, or just freeze its weights (i.e. exclude it from the optimizer) and just train the\n",
"state-space models' parameters. Here we're freezing them by not assigning the network as an attribute (so that the parameters don't get\n",
"passed to when we run ``torch.optim.Adam(kf_nn.parameters(), lr=.05)``.\n",
"\n",
"</div>"
]
},
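For context on the note being reformatted above: the freezing-by-omission pattern it describes relies on how `torch.nn.Module` collects parameters. Below is a minimal plain-PyTorch sketch of that behavior; `TinyModel`, `season_nn`, and `register_nn` are illustrative stand-ins, not names from the notebook.

```python
import torch
from torch import nn

# Stand-in for a pre-trained embeddings network (e.g. season_trainer.module).
season_nn = nn.Linear(4, 2)

class TinyModel(nn.Module):
    """Stand-in for the KalmanFilter: it owns only its own (state-space) parameters."""
    def __init__(self, register_nn: bool = False):
        super().__init__()
        self.ss_param = nn.Parameter(torch.randn(3))
        if register_nn:
            # Assigning the network as an attribute registers it as a sub-module,
            # so its weights appear in .parameters() and would keep training.
            self.season_nn = season_nn

kf_frozen = TinyModel(register_nn=False)
kf_joint = TinyModel(register_nn=True)

print(sum(p.numel() for p in kf_frozen.parameters()))  # 3: state-space params only
print(sum(p.numel() for p in kf_joint.parameters()))   # 13: also the Linear(4, 2) weights

# Only parameters returned by .parameters() reach the optimizer, so the
# un-registered network stays frozen during state-space training.
optimizer = torch.optim.Adam(kf_frozen.parameters(), lr=.05)
```

This is why skipping the `kf_nn.season_nn = season_trainer.module` assignment is enough to exclude the embeddings from `torch.optim.Adam(kf_nn.parameters(), lr=.05)`.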
