Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove cu::Context class #316

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Remove cu::Context class #316

wants to merge 14 commits into from

Conversation

csbnw
Copy link
Contributor

@csbnw csbnw commented Feb 5, 2025

Remove the cu::Context class. The Context Management is now done in the constructor of the cu::Device class.
When a Primary Context was already active, e.g. when combining cudawrappers with the NVIDIA Runtime API, that context is retained. If not, a new Context is created.

Some tests had to be removed, while others were adapted slightly. When building in HIP mode, the Context code is disabled. The code is tested locally on NVIDIA en AMD en all pass.

Description

Related issues:

Instructions to review the pull request

  • Check that CHANGELOG.md has been updated if necessary

@csbnw csbnw requested review from john-romein and loostrum February 5, 2025 09:29
@john-romein
Copy link
Contributor

I do not think that I like it. It is a major deviation from the idea to stay close to the CUDA driver API. Also, it breaks basically all the libraries and applications that we have.

@csbnw
Copy link
Contributor Author

csbnw commented Feb 6, 2025

Going forward, I see a couple of options. In no particular order:

  1. Don't merge the suggested changed
  2. Keep the cu::Context, as is
  3. Keep the cu::Context, but make all functions no-ops
  4. Proceed with merging this code

For 1. and 2., we would also have to revert the deprecation message to improve the user experience.

I would be strongly in favour of option 3. or 4. Option 3. maintains compatibility with existing code, but could also confuse the user. People arguably shouldn't use the main branch and instead use any of the released versions. If they stick to an older version, API changes are not really an issue.

There is one thing that still may be need to be addressed for 3. or 4., the cu::Context::setCurrent. Some codes need it. In case of option 3., this won't work as there will be no CUcontext left. We would have to add something like cu::Device::setCurrentContext.

@csbnw csbnw requested a review from matmanc February 6, 2025 07:51
if (active) {
checkCudaCall(cuDevicePrimaryCtxRetain(&_pctx, _obj));
} else {
checkCudaCall(cuCtxCreate(&_pctx, flags, _obj));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you create it here you are responsible for it so maybe you can set the the context inside a ptr and change the manager to do nothing in the first two cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we should indeed clean-up afterwards. I now added a _context_manager member that does this. It releases the context in case there was already a primary context, or it destroys it when it created one.

@matmanc
Copy link
Contributor

matmanc commented Feb 6, 2025

If we leave it as is we should print something during compilation when using HIP.

The context functions are doing nothing...

@john-romein
Copy link
Contributor

Can't we just make them noops for AMD only? Can a Context::setCurrent() be safely ignored on AMD GPUs?

csbnw added 11 commits February 6, 2025 16:07
The Context Management is now done in the constructor of the cu::Device
class.
To this end, even when the primary context is retained, the returned
Cucontext object must be stored in the _context_manager.
With HIP on an AMD GPU, allocating CU_MEMORYTYPE_UNIFIED has type
CU_MEMORYTYPE_HOST, causing the check to fail. It is weird that this
didn't cause problems before changing the Context class. Alternatively,
we could query and store the memory type in the constuctor.
@csbnw
Copy link
Contributor Author

csbnw commented Feb 6, 2025

@matmanc

If we leave it as is we should print something during compilation when using HIP.

The context functions are doing nothing...

and @john-romein

Can't we just make them noops for AMD only? Can a Context::setCurrent() be safely ignored on AMD GPUs?

Can you check the latest code to see how I solved it? There is no need for no-ops. Both in CUDA and HIP mode, you can use the same cu::Device interface.

The only difference between the two is that CUDA will complain about an invalid device context when you try to use the device without calling cuCtxSetCurrent first. This can now be done with the new Device::ctxSetCurrent function. With HIP, nothing happens (the code will just work). I added test cases to test_cu for these scenarios.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants