Return value policy gray zone #888
Comments
This is a potential answer to wjakob#888, also inspired by the discussion in wjakob#589. Seeking feedback on the approach before I work on docs and tests.

When returning unique or shared ownership to Python when some Python instances already hold a non-owning reference to the same C++ object, we have a few choices:

* Create a new owning instance in addition to the existing non-owning one. (This PR's approach.)
* Upgrade the existing non-owning instance to owning. (More appealing in some ways, but hard to implement with shared ownership since nanobind doesn't have holders.)
* Just return the non-owning instance and forget about the ownership. (nanobind's current approach, which sometimes causes problems.)

This PR also handles the fact that a non-owning instance can dangle. That is, the referenced C++ object could be destroyed while the Python instance is still alive -- especially on PyPy, where it's hard to control when a Python instance dies. Dangling instances are always non-owning, so they are mostly handled by the same logic that handles `rv_policy` changes.

The remaining piece of support for dangling instances is to acknowledge that, if the C++ referent is freed, a new instance could be allocated with its same address, even one with internal storage. (I have observed this in production.) So, there is some new logic in `inst_new_int` to remove the previous must-be-dangling instances from `inst_c2p`, rather than crashing because they exist.

This also implies a new state that a nanobind instance can be in: if `inst->offset == 0`, the instance refers to no C++ object and is not stored in the `inst_c2p` map.
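The address-reuse case can be pictured with a small standalone model. This is only a sketch under stated assumptions: `ToyInstance`, `ToyRegistry`, and `register_new` are hypothetical names, a null `ptr` stands in for the `inst->offset == 0` state, and the model keeps a single instance per address, so it does not reproduce the PR's "new owning instance next to the non-owning one" behavior or the real `inst_new_int`.

```cpp
#include <cassert>
#include <unordered_map>

// Toy stand-in for a Python instance; NOT nanobind's real layout.
struct ToyInstance {
    void *ptr = nullptr;  // nullptr plays the role of "offset == 0": refers to nothing
    bool owns = false;
};

// Toy stand-in for the C++ pointer -> Python instance map.
struct ToyRegistry {
    std::unordered_map<void *, ToyInstance *> c2p;

    // Register `inst` as the wrapper for address `p`. A leftover entry at the
    // same address is assumed to be a dangling non-owning wrapper, so it is
    // detached and evicted instead of being treated as a fatal error.
    void register_new(ToyInstance &inst, void *p, bool owns) {
        auto it = c2p.find(p);
        if (it != c2p.end()) {
            assert(!it->second->owns && "an owning instance cannot dangle");
            it->second->ptr = nullptr;  // stale wrapper now refers to no object
            c2p.erase(it);
        }
        inst.ptr = p;
        inst.owns = owns;
        c2p.emplace(p, &inst);
    }
};

// Simulates: non-owning wrapper created, C++ object destroyed, new object
// allocated at the same address. `slot` stands in for the reused address;
// nothing is ever deleted through the toy wrappers.
inline bool stale_instance_is_detached() {
    ToyRegistry reg;
    int slot = 0;
    ToyInstance stale, fresh;
    reg.register_new(stale, &slot, /*owns=*/false);
    reg.register_new(fresh, &slot, /*owns=*/true);
    return stale.ptr == nullptr && reg.c2p.size() == 1 &&
           reg.c2p.at(&slot) == &fresh;
}
```

Calling `stale_instance_is_detached()` exercises the eviction path: the stale wrapper ends up detached and only the fresh one remains in the map.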
As it happens, I attempted to solve this problem some months ago, following the discussion in #589. I gave up on it because I was running into a few too many fiddly corner cases, but with fresh eyes all of them were resolved pretty easily and I've uploaded the result as #889. That PR is focused on the correctness issues here, not the performance ones. I don't think straight-up doubling the number of rv_policies is a very user-friendly solution (correctly choosing between
A quick thought (that might appear simpler to developers) would be
Is it useful to have both
And,
They're at least needed internally in order to communicate the value category to It would be theoretically possible to distinguish between "user-facing" RVPs (which get mentioned in For reference, the converter between "external" and "internal" RVPs in current nanobind is
cc @hawkinsp @vfdev-5 @oremanj
Nanobind exposes various return value policies that generally do an OK job. But there is a gray zone where their current behavior is IMO confusing. In particular, `rv_policy::move`, `rv_policy::take_ownership`, and `rv_policy::reference_internal` all check if an existing Python object is already associated with the pointer/reference. In that case, they directly return that and the return value policy is ignored.

This can lead to leaks. For example, suppose that code returns an object twice -- once using `rv_policy::reference`, and later using `rv_policy::take_ownership`. The second ownership transfer will never occur.

It can also lead to weird/unexpected behavior. For example,
will not actually move the field when somebody else has previously created a reference to `s.field`.

Confusion aside, these lookups to check for the existence of an instance also have a non-negligible cost (hash table traversal) that would be nice to avoid.
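To make the leak scenario concrete, here is a minimal standalone model of "look up an existing instance first, then ignore the policy". It is a sketch, not nanobind code: `Policy`, `Instance`, `Registry`, and `wrap` are hypothetical names, and the map only approximates the role of nanobind's internal pointer-to-instance table.

```cpp
#include <cassert>
#include <unordered_map>

// Toy stand-ins; NOT nanobind's real data structures.
enum class Policy { reference, take_ownership };

struct Instance {
    void *ptr;
    bool owns;  // would destroying the wrapper delete the C++ object?
};

struct Registry {
    std::unordered_map<void *, Instance> c2p;  // C++ pointer -> wrapper

    // Mimics the behavior described above: an existing wrapper wins and the
    // requested policy is silently dropped.
    Instance &wrap(void *p, Policy policy) {
        auto it = c2p.find(p);  // hash table lookup on every return
        if (it != c2p.end())
            return it->second;  // policy ignored
        bool owns = (policy == Policy::take_ownership);
        return c2p.emplace(p, Instance{p, owns}).first->second;
    }
};

// Returns true if the second, owning return was ignored.
inline bool ownership_transfer_lost() {
    Registry reg;
    int *value = new int(42);
    Instance &first = reg.wrap(value, Policy::reference);  // non-owning wrapper
    Instance &second = reg.wrap(value, Policy::take_ownership);
    assert(&first == &second);  // the same wrapper came back
    bool lost = !second.owns;   // still non-owning: nobody will free `value`
    delete value;               // the toy cleans up by hand
    return lost;
}
```

In this model `ownership_transfer_lost()` returns `true`: the second call finds the non-owning wrapper and never records that ownership was supposed to move to Python.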
But it is not so simple to change this behavior.
For example, one low-hanging fruit that I was looking at just now was to disable the search for existing instances when the user passes `rv_policy::move` (which is used for pass-by-value, so this would be a great optimization that hits many use cases).

However, this breaks overloaded assignment operators (specifically `nb::self() += nb::self()` used in `test14_operators`). This is related to a PR by @oremanj (#803). The operator is overloaded with `rv_policy::move`, and enforcing that now breaks preservation of the same object for an in-place update.

Taking a step back, the ideal behavior for in-place operators is to return the same Python object if the operator returns `*this`, and copy or move otherwise (perhaps move in the case of pass-by-value, and copy in remaining cases?). We always move for pass-by-value, so it seems that we need a `return_existing_or_copy` return value policy...

Next, I looked at disabling the instance search for `rv_policy::take_ownership`. This causes many unique pointer-related tests to fail in `test_holders.py`, and the test suite eventually segfaults in `tests/test_thread.py`.

The segfault is instructive: it happens in a function binding that essentially does `[](Value *value) -> Value * { return value; }` (pointer pass-through). The default value policy `automatic` turns into `take_ownership`, and that's the wrong one to use here. So making any change here will probably break quite a lot of user code.

I thought that it would be useful to have a discussion about whether others think this is a problem, and how to improve it.
One potential solution that I was thinking about is to separate the return value policy into two independent aspects:
So we could, e.g., have

* `always_copy`
* `always_move`
* `always_take_ownership`
* `return_existing_or_copy`
* `return_existing_or_move`
* `return_existing_or_take_ownership`

(with the current policy names mapping to the `return_existing_*` variants).

By numbering them suitably, the use of these (many) policies could be dealt with using bit arithmetic in the implementation. To take the negative position, this change makes something that users already find confusing even more overwhelming.
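One way the "suitable numbering" could look is sketched here. The encoding is an assumption made for illustration (two low bits for the action, one flag bit for the existing-instance search); `rv_action`, `rv_policy_ex`, and the helper functions are hypothetical and do not match nanobind's actual `rv_policy` enum.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding; nanobind's real rv_policy enum is different.
enum class rv_action : std::uint8_t { copy = 0, move = 1, take_ownership = 2 };

constexpr std::uint8_t action_mask   = 0x3;  // low two bits: what to do with the object
constexpr std::uint8_t existing_flag = 0x4;  // bit 2: search for an existing instance first

enum class rv_policy_ex : std::uint8_t {
    always_copy                       = 0,
    always_move                       = 1,
    always_take_ownership             = 2,
    return_existing_or_copy           = existing_flag | 0,
    return_existing_or_move           = existing_flag | 1,
    return_existing_or_take_ownership = existing_flag | 2
};

// Does this policy require the hash table lookup?
constexpr bool searches_existing(rv_policy_ex p) {
    return (static_cast<std::uint8_t>(p) & existing_flag) != 0;
}

// Which action applies when no existing instance is used?
constexpr rv_action action_of(rv_policy_ex p) {
    return static_cast<rv_action>(static_cast<std::uint8_t>(p) & action_mask);
}

// Combine the two independent aspects into one policy value.
inline rv_policy_ex make_policy(rv_action action, bool search_existing) {
    std::uint8_t bits = static_cast<std::uint8_t>(action);
    assert(bits <= 2 && "unknown action");
    if (search_existing)
        bits = static_cast<std::uint8_t>(bits | existing_flag);
    return static_cast<rv_policy_ex>(bits);
}

static_assert(searches_existing(rv_policy_ex::return_existing_or_copy), "flag bit set");
static_assert(!searches_existing(rv_policy_ex::always_move), "flag bit clear");
static_assert(action_of(rv_policy_ex::return_existing_or_move) == rv_action::move,
              "action survives masking");
```

With this layout the implementation can branch on a single mask test to decide whether to do the lookup, and then switch on `action_of(p)` for the fallback, so six names cost no more dispatch logic than three.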
Your feedback would be greatly appreciated!