Add public APIs to Access Underlying cudf and pandas Objects from cudf.pandas Proxy Objects #17629

Merged 27 commits on Jan 29, 2025
Changes from 1 commit
b5eea1f
Add a public api to get fast slow objects
galipremsagar Dec 19, 2024
7bc76e5
Merge remote-tracking branch 'upstream/branch-25.02' into 17524
galipremsagar Jan 24, 2025
3cdfe94
update names and add fast paths
galipremsagar Jan 24, 2025
34375dc
centralize logic
galipremsagar Jan 25, 2025
31f9e99
fix
galipremsagar Jan 25, 2025
72ba73f
cleanup
galipremsagar Jan 25, 2025
3fd679f
Merge branch 'branch-25.02' into 17524
galipremsagar Jan 25, 2025
37764c2
Apply suggestions from code review
galipremsagar Jan 25, 2025
bbe0fa4
Apply suggestions from code review
galipremsagar Jan 27, 2025
cf6888f
wrap result
galipremsagar Jan 27, 2025
528c189
Merge branch 'branch-25.02' into 17524
galipremsagar Jan 27, 2025
6b744f4
Update faq.md
galipremsagar Jan 27, 2025
9ec0215
style
galipremsagar Jan 27, 2025
a3c49fd
add is_cudf_pandas.. APIs
galipremsagar Jan 27, 2025
3f06e70
update docs
galipremsagar Jan 27, 2025
95ba799
Merge remote-tracking branch 'upstream/branch-25.02' into 17524
galipremsagar Jan 27, 2025
a2e97f5
Apply suggestions from code review
galipremsagar Jan 27, 2025
b96f8ff
Apply suggestions from code review
galipremsagar Jan 27, 2025
0044d8f
update API
galipremsagar Jan 28, 2025
a77fecc
revert cudf.pandas spilling into cudf
galipremsagar Jan 28, 2025
aafbd31
Merge branch 'branch-25.02' into 17524
galipremsagar Jan 28, 2025
f15d37b
Update docs/cudf/source/cudf_pandas/faq.md
galipremsagar Jan 28, 2025
f94253d
cleanup
galipremsagar Jan 28, 2025
daaa5df
update api
galipremsagar Jan 28, 2025
eaa23cb
Merge branch 'branch-25.02' into 17524
galipremsagar Jan 28, 2025
9c58a71
cleanup
galipremsagar Jan 28, 2025
69837f3
Merge branch 'branch-25.02' into 17524
galipremsagar Jan 28, 2025
18 changes: 18 additions & 0 deletions docs/cudf/source/cudf_pandas/faq.md
@@ -142,6 +142,24 @@ cuDF (learn more in [this
blog](https://medium.com/rapids-ai/easy-cpu-gpu-arrays-and-dataframes-run-your-dask-code-where-youd-like-e349d92351d)) and the [RAPIDS Accelerator for Apache Spark](https://nvidia.github.io/spark-rapids/)
provides a similar configuration-based plugin for Spark.


## Recommendations for type-aware libraries

Some libraries are `cudf`- or `pandas`-aware: they check for the concrete `cudf` or `pandas` types and may not accept `cudf.pandas` proxy objects. When passing data to such libraries, retrieve the real underlying object from the proxy with one of the following methods:

- `get_cudf_pandas_fast_object()`: This method returns the fast `cudf` object from the proxy.
- `get_cudf_pandas_slow_object()`: This method returns the slow `pandas` object from the proxy.

Here is an example of how to use these methods:

```python
# Assuming `proxy_obj` is a cudf.pandas proxy object
fast_obj = proxy_obj.get_cudf_pandas_fast_object()
slow_obj = proxy_obj.get_cudf_pandas_slow_object()

# Now you can use `fast_obj` and `slow_obj` with libraries that are cudf or pandas aware
```
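
As a rough sketch of how calling code might apply this recommendation, the helper below unwraps an object only when it exposes `get_cudf_pandas_slow_object()`, and passes everything else through unchanged. The names `unwrap_for_pandas` and `FakeProxy` are hypothetical and exist only for illustration; they are not part of `cudf`. `FakeProxy` stands in for a real proxy so the snippet runs without a GPU.

```python
def unwrap_for_pandas(obj):
    """Return the underlying slow object if `obj` looks like a proxy.

    Hypothetical helper (not a cudf API): it duck-types on the
    `get_cudf_pandas_slow_object` method name introduced in this PR.
    """
    getter = getattr(obj, "get_cudf_pandas_slow_object", None)
    if callable(getter):
        return getter()
    # Not a proxy: hand the object back untouched.
    return obj


class FakeProxy:
    """Stand-in for a cudf.pandas proxy, used only to exercise the helper."""

    def __init__(self, slow):
        self._slow = slow

    def get_cudf_pandas_slow_object(self):
        return self._slow


payload = {"a": [1, 2, 3]}
unwrapped = unwrap_for_pandas(FakeProxy(payload))  # the wrapped dict
passthrough = unwrap_for_pandas(payload)           # same dict, unchanged
```

Duck-typing on the method name keeps the helper usable whether or not `cudf.pandas` is active, since plain objects simply fall through.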

(are-there-any-known-limitations)=
## Are there any known limitations?

8 changes: 8 additions & 0 deletions python/cudf/cudf/pandas/fast_slow_proxy.py
@@ -204,6 +204,12 @@ def _fsproxy_fast_to_slow(self):
            return fast_to_slow(self._fsproxy_wrapped)
        return self._fsproxy_wrapped

    def get_cudf_pandas_fast_object(self):
        return self._fsproxy_slow_to_fast()

    def get_cudf_pandas_slow_object(self):
        return self._fsproxy_fast_to_slow()

    @property  # type: ignore
    def _fsproxy_state(self) -> _State:
        return (
@@ -221,6 +227,8 @@ def _fsproxy_state(self) -> _State:
        "_fsproxy_slow_type": slow_type,
        "_fsproxy_slow_to_fast": _fsproxy_slow_to_fast,
        "_fsproxy_fast_to_slow": _fsproxy_fast_to_slow,
        "get_cudf_pandas_fast_object": get_cudf_pandas_fast_object,
        "get_cudf_pandas_slow_object": get_cudf_pandas_slow_object,
        "_fsproxy_state": _fsproxy_state,
    }

6 changes: 6 additions & 0 deletions python/cudf/cudf_pandas_tests/test_cudf_pandas.py
@@ -1885,3 +1885,9 @@ def test_dataframe_setitem():
    new_df = df + 1
    df[df.columns] = new_df
    tm.assert_equal(df, new_df)


def test_dataframe_get_fast_slow_methods():
    df = xpd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3]})
    assert isinstance(df.get_cudf_pandas_fast_object(), cudf.DataFrame)
    assert isinstance(df.get_cudf_pandas_slow_object(), pd.DataFrame)