Skip to content
This repository has been archived by the owner on Nov 1, 2021. It is now read-only.

Gradient direct assignment is missing #126

Open
bartvm opened this issue May 27, 2016 · 5 comments · May be fixed by #127
Open

Gradient direct assignment is missing #126

bartvm opened this issue May 27, 2016 · 5 comments · May be fixed by #127

Comments

@bartvm
Copy link
Contributor

bartvm commented May 27, 2016

When using direct assignment and the tensor being assigned is a function of parameters, those parameters don't seem to get a gradient i.e. params.W1[i] = params.W2 * x will result in a zero gradient for W2. The following code is a minimal test case:

grad = require 'autograd'
torch = require 'torch'

params={
  W=torch.range(0, 8):view(3, 3),
  storage=torch.zeros(3, 3)
}

function f(params, x)
  params.storage[2] = params.W * x
  return torch.mean(params.storage)
end

grad, _ = grad(f)(params, torch.ones(3))
print(grad.W)

The gradient of W here is zero, while it should be torch.ones(3, 3) / 9.

When I try to run the same thing with {optimize=true} I get an error:

/home/vanmerb/torch/install/bin/luajit: ...re/lua/5.1/autograd/runtime/codegen/backend/lua/init.lua:550: attempt to index local 'node' (a nil value)
stack traceback:
        ...re/lua/5.1/autograd/runtime/codegen/backend/lua/init.lua:550: in function 'addNodeTargets'
        ...re/lua/5.1/autograd/runtime/codegen/backend/lua/init.lua:572: in function 'generateCode'
        ...re/lua/5.1/autograd/runtime/codegen/backend/lua/init.lua:748: in function 'generateFn'
        .../install/share/lua/5.1/autograd/runtime/codegen/init.lua:140: in function <.../install/share/lua/5.1/autograd/runtime/codegen/init.lua:114>
        autograd_subtensor.lua:14: in main chunk
        [C]: in function 'dofile'
        ...merb/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
        [C]: at 0x00405e90
@bartvm
Copy link
Contributor Author

bartvm commented May 27, 2016

@alexbw

@alexbw
Copy link
Collaborator

alexbw commented May 27, 2016

Oh without optimized=true, this will never work, the assignment functionality is limited to optimized mode.
Then the issue is that bug. One thing that I'm wary of is that you're overwriting the params. Taking the gradient of the output w.r.t. to an overwritten variable is odd. Perhaps try making a clone of params.storage first?
Alternately, pass in storage as a non-primary argument.

@bartvm
Copy link
Contributor Author

bartvm commented May 27, 2016

Making a clone of params.storage first doesn't make a difference.

Passing storage as a non-primary argument doesn't work: You would be trying to assign an autograd node (params.W * x) to a row of a normal Torch tensor. It gives the following error as expected:

/home/vanmerb/torch/install/bin/luajit: bad argument #3 to '?' (torch.*Tensor expected, got table)
stack traceback:
        [C]: at 0x7f6537022070
        [C]: in function '__newindex'
        autograd_subtensor.lua:9: in function 'fun'
        ...all/share/lua/5.1/autograd/runtime/direct/DirectTape.lua:113: in function 'funOnly'
        ...all/share/lua/5.1/autograd/runtime/direct/DirectTape.lua:217: in function <...all/share/lua/5.1/autograd/runtime/direct/DirectTape.lua:215>
        autograd_subtensor.lua:13: in main chunk
        [C]: in function 'dofile'
        ...merb/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
        [C]: at 0x00405e90

After some debugging, I tracked it down to a bug in the gradient of set. The following commit fixes it and gives the correct gradient for me even with optimize=false: bartvm@7decd96. What was happening is that the gradient function w.r.t. params.storage was overriding the gradient of params.W * x because the latter was a view on the former.

@alexbw
Copy link
Collaborator

alexbw commented May 27, 2016

Awesome, thanks. Can you open a PR, and I'll merge it right quick?

On Fri, May 27, 2016 at 4:34 PM Bart van Merriënboer <
[email protected]> wrote:

Making a clone of params.storage first doesn't make a difference.

Passing storage as a non-primary argument doesn't work: You would be
trying to assign an autograd node (params.W * x) to a row of a normal
Torch tensor. It gives the following error as expected:

/home/vanmerb/torch/install/bin/luajit: bad argument #3 to '?' (torch.*Tensor expected, got table)
stack traceback:
[C]: at 0x7f6537022070
[C]: in function '__newindex'
autograd_subtensor.lua:9: in function 'fun'
...all/share/lua/5.1/autograd/runtime/direct/DirectTape.lua:113: in function 'funOnly'
...all/share/lua/5.1/autograd/runtime/direct/DirectTape.lua:217: in function <...all/share/lua/5.1/autograd/runtime/direct/DirectTape.lua:215>
autograd_subtensor.lua:13: in main chunk
[C]: in function 'dofile'
...merb/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405e90

After some debugging, I tracked it down to a bug in the gradient of set.
The following commit fixes it and gives the correct gradient for me even
with optimize=false: bartvm/torch-autograd@7decd96
bartvm@7decd96.
What was happening is that the gradient function w.r.t. params.storage
was overriding the gradient of params.W * x because the latter was a view
on the former.


You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
#126 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAJ4jy2aYz8SYxod9dG1XfDStB288NtGks5qF1U4gaJpZM4Iot21
.

@bartvm
Copy link
Contributor Author

bartvm commented May 27, 2016

I realized that when g[k] is a scalar the call to clone fails, so just need to test for that. Maybe good idea to add a unit test too, I'll try and do that tonight or tomorrow.

@bartvm bartvm linked a pull request May 27, 2016 that will close this issue
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants