WIP: [DO NOT MERGE] introduce libraft wheels #2531

Draft
wants to merge 27 commits into
base: branch-25.02

jameslamb
Member

@jameslamb jameslamb commented Dec 17, 2024

Replaces #2360, contributes to rapidsai/build-planning#33.

Proposes packaging libraft as a wheel, which is then re-used by:

Notes for Reviewers

If you see this note, that means this is not ready for review.

Dependency Flows

libraft wheels contain all of:

  • raft headers (and everything they require)
  • libraft.a (static library)
  • libraft.so (shared object)

libcugraph and libcuml need libraft at build time to statically link against libraft.a.

pylibcugraph, cugraph, and cuml need libraft at build time just to get the RAFT headers (since those are used in libcugraph's and libcuml's public headers).

---
title: Build dependencies
---
flowchart TD
    A[libraft] -->B[pylibraft]
    A --> C[raft-dask]
    A --> D[libcugraph]
    D --> E[pylibcugraph]
    D --> F[cugraph]
    A --> G[libcuml]
    A --> H[cuml]
    A --> E
    B --> E
    A --> F
    B --> F
    E --> F
    G --> H
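
One rough way to sanity-check the build order implied by the flowchart above is to feed its edges to `tsort` (part of GNU coreutils). This is only a sketch: the function name `build_order` is made up for illustration, and the edge list is transcribed by hand from the diagram, not generated from any RAPIDS build metadata.

```shell
# Hypothetical helper: print one valid build order for the wheels in the
# "Build dependencies" flowchart. Each input line is "dependency dependent",
# so tsort prints every dependency before the wheels that need it.
build_order() {
    tsort <<'EOF'
libraft pylibraft
libraft raft-dask
libraft libcugraph
libcugraph pylibcugraph
libcugraph cugraph
libraft libcuml
libraft cuml
libraft pylibcugraph
pylibraft pylibcugraph
libraft cugraph
pylibraft cugraph
pylibcugraph cugraph
libcuml cuml
EOF
}

build_order
```

Because libraft has no incoming edges it comes first in any valid ordering. The relative order of wheels that do not depend on each other (for example pylibraft and libcuml) carries no meaning.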

Nothing in cugraph would need libraft at runtime, because libcugraph is statically linked against libraft.
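
A claim like this can be spot-checked by looking at the dynamic section of the built shared library. The sketch below assumes `readelf` from GNU binutils is available. The helper name `needs_libraft` and the library path in the usage comment are placeholders, not something taken from this PR.

```shell
# Hypothetical helper: reads `readelf -d` output on stdin and succeeds (exit 0)
# only if one of the DT_NEEDED entries names libraft.so.
needs_libraft() {
    grep -E '\(NEEDED\).*\[libraft\.so' > /dev/null
}

# Usage sketch (placeholder path, adjust to wherever the wheel was unpacked):
#   if readelf -d /path/to/libcugraph.so | needs_libraft; then
#       echo "libcugraph.so has a runtime dependency on libraft.so"
#   else
#       echo "no DT_NEEDED entry for libraft.so"
#   fi
```

If libcugraph really is statically linked against `libraft.a`, the expectation is that no `NEEDED` entry for `libraft.so` shows up.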

---
title: Runtime dependencies
---
flowchart LR
    A[libraft] --> B[pylibraft]
    A --> C[raft-dask]

Presumably wholegraph could follow a similar pattern.

Size changes (CUDA 12, Python 3.12, x86_64)

| wheel | num files (before) | num files (this PR) | size (before) | size (this PR) |
|:---|---:|---:|---:|---:|
| libraft | --- | --- | --- | --- |
| pylibraft | --- | --- | --- | --- |
| raft-dask | --- | --- | --- | --- |
| libcugraph | --- | --- | --- | --- |
| pylibcugraph | 190 | --- | 900M | --- |
| cugraph | 314 | --- | 901M | --- |
| libcuml | --- | --- | --- | --- |
| cuml | --- | --- | --- | --- |
| TOTAL | --- | --- | --- | --- |

NOTES: size = compressed, "before" = 2024-12-17 nightlies (rapidsai/cugraph@5c8f850), cugraph libraries from rapidsai/cugraph#4804

how I calculated those (click me)
docker run \
    --rm \
    -v "$(pwd)":/opt/work:ro \
    -w /opt/work \
    --network host \
    --env RAPIDS_NIGHTLY_DATE=2024-12-06 \
    --env RAPIDS_NIGHTLY_SHA=5c8f850 \
    --env RAPIDS_PR_NUMBER=4804 \
    --env RAPIDS_PY_CUDA_SUFFIX=cu12 \
    --env RAPIDS_REPOSITORY=rapidsai/cugraph \
    --env WHEEL_DIR_BEFORE=/tmp/wheels-before \
    --env WHEEL_DIR_AFTER=/tmp/wheels-after \
    -it rapidsai/ci-wheel:cuda12.5.1-rockylinux8-py3.12 \
    bash

# everything below runs inside the container shell started above
mkdir -p "${WHEEL_DIR_BEFORE}"
mkdir -p "${WHEEL_DIR_AFTER}"

cpp_projects=(
    libcugraph
)
py_projects=(
    cugraph
    pylibcugraph
)

for project in "${py_projects[@]}"; do
    # before
    RAPIDS_BUILD_TYPE=nightly \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="branch-25.02" \
    RAPIDS_SHA=${RAPIDS_NIGHTLY_SHA} \
        rapids-download-wheels-from-s3 python "${WHEEL_DIR_BEFORE}"

    # after
    #RAPIDS_BUILD_TYPE=pull-request \
    #RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    #RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \
    #    rapids-download-wheels-from-s3 python "${WHEEL_DIR_AFTER}"
done

for project in "${cpp_projects[@]}"; do
    # after
    RAPIDS_BUILD_TYPE=pull-request \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \
        rapids-download-wheels-from-s3 cpp "${WHEEL_DIR_AFTER}"
done

pip install pydistcheck
pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    "${WHEEL_DIR_BEFORE}"/*.whl

pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    "${WHEEL_DIR_AFTER}"/*.whl
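
For the TOTAL row, one possible approach is to sum the on-disk size of every wheel in each directory. A wheel is a zip archive, so its file size is its compressed size. The helper below is a hypothetical sketch, not part of the workflow above, and it reports raw bytes where `pydistcheck` prints a human-readable figure.

```shell
# Hypothetical helper: print the combined size, in bytes, of all *.whl files
# directly inside the given directory. The body runs in a subshell so its
# variables do not leak into the caller.
total_wheel_bytes() (
    dir="$1"
    total=0
    for whl in "$dir"/*.whl; do
        # when nothing matches, the glob stays unexpanded; skip it
        [ -e "$whl" ] || continue
        size=$(wc -c < "$whl")
        total=$((total + size))
    done
    printf '%s\n' "$total"
)

# Usage sketch, inside the container from the script above:
#   before=$(total_wheel_bytes "${WHEEL_DIR_BEFORE}")
#   after=$(total_wheel_bytes "${WHEEL_DIR_AFTER}")
#   echo "before=${before} after=${after} delta=$((after - before))"
```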

@jameslamb jameslamb added the labels "5 - DO NOT MERGE", "improvement", "non-breaking", and "2 - In Progress" Dec 17, 2024


@jameslamb jameslamb changed the title WIP: introduce libraft wheels WIP: [DO NOT MERGE] introduce libraft wheels Dec 17, 2024
rapids-bot bot pushed a commit that referenced this pull request Jan 7, 2025
…nup (#2532)

Proposes some cleanup of packaging details, noticed while I was working on #2531

* removes runtime dependencies on `joblib` and `numba` for `raft-dask`
   - *`raft-dask` doesn't directly import from these libraries, and the git blame didn't suggest any other reason that they were being pinned here*
   - *checked with `git grep -E 'joblib|numba'`*
* removes `setup.cfg` files
   - *these are currently being ignored by tools, in favor of identical configuration in `pyproject.toml` and `.flake8` files*
   - e.g. https://github.com/rapidsai/raft/blob/bfd190687ee396374b7106d9ac26add73b57b22a/.pre-commit-config.yaml#L16-L19
* packages license files in conda packages
   - *think these were just missed in the round of PRs like this: rapidsai/cuml#6061*
* removes some outdated / inaccurate comments in packaging configs

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #2532
@github-actions github-actions bot added the cpp label Jan 7, 2025
@jameslamb
Member Author

/ok to test

Labels
2 - In Progress, 5 - DO NOT MERGE, ci, CMake, cpp, improvement, non-breaking, python
1 participant