Regarding this tutorial: https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html

I just have a question (which probably sounds very stupid). I am just wondering: is it necessary to optimize the parameters of the decoder and the encoder separately here?

encoder_optimizer = optim.SGD(encoder.parameters(), lr=learning_rate)
decoder_optimizer = optim.SGD(decoder.parameters(), lr=learning_rate)
...
loss.backward()
encoder_optimizer.step()
decoder_optimizer.step()

So decoder.parameters() doesn't include encoder.parameters()? I can't just do decoder_optimizer.step()? The loss is backpropagated all the way through, but the parameters aren't all updated?

Thanks
Though this is outdated already, to the best of my understanding, in PyTorch a parameter becomes registered to a Module when it is assigned to self in the constructor (either directly as an nn.Parameter or inside a submodule). So the encoder and decoder each hold references only to the parameters of the submodules registered to them: encoder.parameters() returns only the weights and biases contained within the encoder object, and likewise for the decoder. With such a design, the backward call computes all the gradients, but decoder_optimizer.step() only updates the weights of the decoder's layers.
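This is easy to verify. Below is a minimal sketch (using toy nn.Linear modules rather than the tutorial's EncoderRNN/AttnDecoderRNN, and made-up shapes) showing that the two parameter sets are disjoint, and that decoder_optimizer.step() leaves the encoder untouched even though backward() computed its gradients:

import torch
import torch.nn as nn
import torch.optim as optim

# Toy stand-ins for the tutorial's EncoderRNN / AttnDecoderRNN.
encoder = nn.Linear(4, 8)
decoder = nn.Linear(8, 4)

# Each module only "sees" the parameters assigned to it (or to its
# submodules) in __init__, so the two parameter sets are disjoint.
assert {id(p) for p in encoder.parameters()}.isdisjoint(
    {id(p) for p in decoder.parameters()})

decoder_optimizer = optim.SGD(decoder.parameters(), lr=0.1)

loss = decoder(encoder(torch.randn(2, 4))).sum()
loss.backward()                      # gradients are computed for BOTH modules

encoder_before = [p.detach().clone() for p in encoder.parameters()]
decoder_optimizer.step()             # ...but only the decoder's weights change
assert all(torch.equal(b, p) for b, p in zip(encoder_before, encoder.parameters()))
assert all(p.grad is not None for p in encoder.parameters())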
I have, however, another question on a similar topic. My approach for assigning parameters to the optimizer in a multi-module architecture is to give a single optimizer the combined parameter lists of all modules, roughly as in the sketch below (reusing the toy encoder/decoder names from above; the exact modules and learning rate are placeholders):
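import itertools
import torch
import torch.optim as optim

# One optimizer over the combined parameters of every module
# (reusing the toy encoder/decoder from the sketch above; values are placeholders).
learning_rate = 0.01
optimizer = optim.SGD(
    itertools.chain(encoder.parameters(), decoder.parameters()),
    lr=learning_rate,
)

# ...inside the training loop:
optimizer.zero_grad()
loss = decoder(encoder(torch.randn(2, 4))).sum()   # placeholder forward pass / loss
loss.backward()
optimizer.step()   # one call updates both encoder and decoder weights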
Is there any particular reason for assigning separate optimizer objects in such a scenario, apart from the fact that it enables us to configure optimizers differently for each module? Is there some mistake with my approach?