Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allows functions to accept std::vectors with non-standard allocators #948

Conversation

SteveBronder
Copy link
Contributor

@SteveBronder SteveBronder commented Aug 19, 2021

Summary

This PR is related to stan-dev/math#2421 where it would be nice to call user defined functions with std::vectors that use a stan::math::arena_allocator<>. It changes UDF function signatures such as the below to also have an Alloc{id}__ allocator template.

template <typename T0__, typename T1__, typename Alloc1__, typename T2__,
Eigen::Matrix<stan::promote_args_t<stan::value_type_t<T0__>, T1__, typename Alloc2__> T2__>, -1, 1>
foo2(const T0__& a_arg__, const std::vector<Eigen::Matrix<T1__, -1, -1>>& b,
     const std::vector<Eigen::Matrix<T2__, 1, -1>>& c,
     std::ostream* pstream__) ;
template <typename T0__, typename T1__, typename T2__>
auto foo2(const T0__& a_arg__, const std::vector<T1__, Alloc1__>& b,
          const std::vector<T2__, Alloc2__>& c, std::ostream* pstream__) ;

This will also allow us to do a similar trick to #865 so that eventually we can have std::vectors of data allocated once in the model, then Stan math can just reuse those vectors without having to make copies of them.

I've also changed the return type to just be auto. This is safe as long as the actual definition is made before the function is first called. Is that a safe assumption? tmk UDF's that are left undefined are normally passed in via a header during compilation.

EDIT: ^was not a safe assumption

Submission Checklist

  • Run unit tests
  • Documentation
    • If a user-facing facing change was made, the documentation PR is here:
    • OR, no user-facing changes were made

Release notes

Allow user defined functions to accept standard vectors with non-standard allocators

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)

Copy link
Member

@bob-carpenter bob-carpenter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to allocate std::vector memory in the autodiff arena? It sounds like it might lead to a memory leak because we only free the autodiff memory after the reverse pass. For example, what happens with this:

for (n in 1:N) {
  array[n] real y;
  for (m in 1:n) { ... }
}

Will that just store the array contents in the arena? The default RAII allocator implementation with the heap would free the memory for y after each loop in 1:N.

Is the idea to only do this for the block variables like parameters?

Where do you think there are extraneous copies of std::vector being made? Isn't everything call by constant reference? Or are the function returns not properly optimizing for move semantics?

…DF returns that return an eigen type. Use auto for function returns
@SteveBronder
Copy link
Contributor Author

Why do we want to allocate std::vector memory in the autodiff arena? It sounds like it might lead to a memory leak because we only free the autodiff memory after the reverse pass. For example, what happens with this:

Is the idea to only do this for the block variables like parameters?

Yep! Specifically we want to let the stan math backend know that data and transformed data does not need to be copied to the memory arena. So like a good example is csr_matrix_times_vector() where we have

csr_matrix_times_vector(int, int, vector, array int, array int, vector)

Those two integer arrays right now get copied onto the memory arena because we need them for the reverse pass. But, if we could construct those int arrays in the model constructor taking in a local_allocator type that stan's arena_allocator type knew it didn't need to make copies for then we could avoid having to copy those arrays to the arena. The code in the model constructor would look something like the below where a local_allocator class is used when constructing the std::vector

// In model constructor
// local_allocator uses a local version of stan math's stack_alloc to get memory
stan::local_allocator local_alloc;
std::vector<int, stan::local_allocator> z(N, local_alloc);

Then in Stan's arena_allocator (the allocator used for putting std::vector data onto the memory arena) we would have a constructor that takes in a local_allocator and doesn't copy the memory

For the example

for (n in 1:N) {
  array[n] real y;
  for (m in 1:n) { ... }
}

Will that just store the array contents in the arena? The default RAII allocator implementation with the heap would free the memory for y after each loop in 1:N.

y here would not use the local allocator. My plan would be that only "global block" parameters in data and transformed data would use this.

Where do you think there are extraneous copies of std::vector being made? Isn't everything call by constant reference? Or are the function returns not properly optimizing for move semantics?

We make a hard copy of these data arrays when we pass them to the stack allocator for the reverse pass callback.

@SteveBronder
Copy link
Contributor Author

SteveBronder commented Aug 19, 2021

We make a hard copy of these data arrays when we pass them to the stack allocator for the reverse pass callback.

We'll that's not entirely true. We allocate space on the stack allocator for them if none exists already and then just copy the values over. But with the scheme I have in mind here we would totally avoid having to copy those values over / allocate the data

@bob-carpenter
Copy link
Member

Thanks for explaining. That sounds really useful.

@WardBrian
Copy link
Member

I'm not sure if this is still something desirable, but I wanted to note that this would make using custom C++ with the USER_HEADER argument more complicated since they would also need to have these extra templates. Not saying that's a blocker, just something worth considering as a cost/benefit item

@SteveBronder
Copy link
Contributor Author

Gonna close this for now and come back to it. Imo still a good idea but I need to rethink it

@bob-carpenter
Copy link
Member

I think @SteveBronder is right in that if we could build std::vector instances using memory from our arena, then we wouldn't have to monkey with separate stacks to destroy them and it should be faster because our allocation is very homogeneous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants