-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allows functions to accept std::vectors with non-standard allocators #948
Allows functions to accept std::vectors with non-standard allocators #948
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to allocate std::vector
memory in the autodiff arena? It sounds like it might lead to a memory leak because we only free the autodiff memory after the reverse pass. For example, what happens with this:
for (n in 1:N) {
array[n] real y;
for (m in 1:n) { ... }
}
Will that just store the array contents in the arena? The default RAII allocator implementation with the heap would free the memory for y
after each loop in 1:N
.
Is the idea to only do this for the block variables like parameters?
Where do you think there are extraneous copies of std::vector
being made? Isn't everything call by constant reference? Or are the function returns not properly optimizing for move semantics?
…DF returns that return an eigen type. Use auto for function returns
Yep! Specifically we want to let the stan math backend know that data and transformed data does not need to be copied to the memory arena. So like a good example is csr_matrix_times_vector(int, int, vector, array int, array int, vector) Those two integer arrays right now get copied onto the memory arena because we need them for the reverse pass. But, if we could construct those int arrays in the model constructor taking in a // In model constructor
// local_allocator uses a local version of stan math's stack_alloc to get memory
stan::local_allocator local_alloc;
std::vector<int, stan::local_allocator> z(N, local_alloc); Then in Stan's For the example for (n in 1:N) {
array[n] real y;
for (m in 1:n) { ... }
}
We make a hard copy of these data arrays when we pass them to the stack allocator for the reverse pass callback. |
We'll that's not entirely true. We allocate space on the stack allocator for them if none exists already and then just copy the values over. But with the scheme I have in mind here we would totally avoid having to copy those values over / allocate the data |
Thanks for explaining. That sounds really useful. |
I'm not sure if this is still something desirable, but I wanted to note that this would make using custom C++ with the |
Gonna close this for now and come back to it. Imo still a good idea but I need to rethink it |
I think @SteveBronder is right in that if we could build std::vector instances using memory from our arena, then we wouldn't have to monkey with separate stacks to destroy them and it should be faster because our allocation is very homogeneous. |
Summary
This PR is related to stan-dev/math#2421 where it would be nice to call user defined functions with
std::vectors
that use astan::math::arena_allocator<>
. It changes UDF function signatures such as the below to also have anAlloc{id}__
allocator template.This will also allow us to do a similar trick to #865 so that eventually we can have
std::vectors
of data allocated once in the model, then Stan math can just reuse those vectors without having to make copies of them.I've also changed the return type to just beauto
. This is safe as long as the actual definition is made before the function is first called. Is that a safe assumption? tmk UDF's that are left undefined are normally passed in via a header during compilation.EDIT: ^was not a safe assumption
Submission Checklist
Release notes
Allow user defined functions to accept standard vectors with non-standard allocators
Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)