Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCLcompat] Add support for device attribute MAX_PITCH and ASYNC_ENGINE_COUNT #16533

Open
wants to merge 2 commits into
base: sycl
Choose a base branch
from

Conversation

the-slow-one
Copy link

CU_DEVICE_ATTRIBUTE_AYSNC_ENGINE_COUNT is associated with value 2 for bidirectional data transfer between host and device while executing the kernel and is associated to 1 in case of unidirectional data transfer and for no support of data transfer and parallel execution it's 0.

CU_DEVICE_ATTRIBUTE_MAX_PITCH return INT_MAX, indicating there is no limits! Intel GPU shows the same behavior too!

Signed-off-by: Deepak Raj H R

@the-slow-one the-slow-one requested a review from a team as a code owner January 7, 2025 06:30
@@ -444,6 +444,13 @@ class device_ext : public sycl::device {
return get_device_info().get_local_mem_size();
}

// For CU_DEVICE_ATTRIBUTE_MAX_PITCH, NVidia GPUs return MAX_INT
Copy link
Contributor

@zhiweij1 zhiweij1 Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't mention CUDA-related text here.

Copy link
Contributor

@GeorgeWeb GeorgeWeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, otherwise syclcompat looks okay.

@@ -444,6 +444,10 @@ class device_ext : public sycl::device {
return get_device_info().get_local_mem_size();
}

int get_max_pitch() const { return INT_MAX; }
Copy link
Contributor

@GeorgeWeb GeorgeWeb Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may prefer to use numeric_limits<int>::max() from the C++ <limits> library instead which is equivalent to INT_MAX from <climits>.

Also, while this compiles fine due to limits probably being included indirectly from another include, it is better to include the library/headers at the top of this file explicitly.

@JackAKirk
Copy link
Contributor

JackAKirk commented Jan 7, 2025

CU_DEVICE_ATTRIBUTE_AYSNC_ENGINE_COUNT is associated with value 2 for bidirectional data transfer between host and device while executing the kernel and is associated to 1 in case of unidirectional data transfer and for no support of data transfer and parallel execution it's 0.

I may be misremembering, but I don't think the above is correct. IIRC querying CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT can return greater than 2 depending upon the NVLINK capabilities.

In any case I don't understand why you are always returning 0 for get_async_engine_count if you want to map it to the CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT device query?
The correct way of dealing with such device queries in a future-proof manner for all existing and future devices is to add a query in urDeviceGetInfo that calls cuDeviceGetAttribute using CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT https://github.com/oneapi-src/unified-runtime/blob/main/source/adapters/cuda/device.cpp#L41
and then calling this UR API via the sycl compat entry point using a mapped CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT query.

It may also have different mappings to other backends (e.g. AMD), and the call to urDeviceGetInfo allows such backends to return an appropriate value.

Copy link
Contributor

@GeorgeWeb GeorgeWeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the minor comment I left earlier, looking at this again, @JackAKirk is right that it seems a little strange to return 0 for all backends from get_async_engine_count when this should be correctly queried via device::get_info which calls into the respective backend query. I am still okay with this if the above stated may not be needed or if there's a plan to follow up with it in case it is needed. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants