Kernel property use_root_sync not specified correctly for kernel functors #16451

Open
HPS-1 opened this issue Dec 20, 2024 · 0 comments
Labels: bug, confirmed

Describe the bug

As suggested in #14785, we are about to deprecate the parallel_for and single_task overloads from the sycl_ext_oneapi_kernel_properties extension in favor of the alternative interface provided by the sycl_ext_oneapi_enqueue_functions extension. With the new interface, a user who wants to specify kernel properties must use a named function object (kernel functor) instead of a kernel lambda; the functor exposes its properties via get(sycl::ext::oneapi::experimental::properties_tag). (See the note in this doc: https://github.com/intel/llvm/blob/974aec94af2ab81014895cf961895b5d2c06fc29/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc)
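
For readers unfamiliar with the functor-based interface, here is a rough, standalone sketch of how a library could discover properties from a kernel functor at compile time. It is not the DPC++ implementation: every name in the sketch namespace (properties_tag, use_root_sync_key, properties, has_get_properties, kernel_properties_of, needs_root_sync) is a hypothetical stand-in, and it only assumes that detection is based on whether get(properties_tag) is callable on the kernel type.

```cpp
// Standalone C++17 sketch; all names are hypothetical stand-ins, NOT SYCL types.
#include <type_traits>
#include <utility>

namespace sketch {

struct properties_tag {};     // stand-in for the tag passed to get()
struct use_root_sync_key {};  // stand-in for the use_root_sync property

// Minimal compile-time property list.
template <typename... Props> struct properties {
  template <typename P> static constexpr bool has_property() {
    return (std::is_same_v<P, Props> || ...);  // false for an empty list
  }
};

// Primary template: T has no usable get(properties_tag).
template <typename T, typename = void>
struct has_get_properties : std::false_type {};

// Chosen when calling get(properties_tag{}) on a T object is well-formed.
template <typename T>
struct has_get_properties<
    T, std::void_t<decltype(std::declval<T &>().get(properties_tag{}))>>
    : std::true_type {};

// Yields the functor's property list, or an empty list for plain callables.
template <typename T> constexpr auto kernel_properties_of() {
  if constexpr (has_get_properties<T>::value)
    return decltype(std::declval<T &>().get(properties_tag{})){};
  else
    return properties<>{};
}

// What a launcher in this sketch would consult before submitting the kernel.
template <typename T> constexpr bool needs_root_sync() {
  return decltype(kernel_properties_of<T>())::template has_property<
      use_root_sync_key>();
}

// A functor that advertises the property, and one that does not.
struct RootGroupFunctor {
  void operator()() const {}
  auto get(properties_tag) const { return properties<use_root_sync_key>{}; }
};

struct PlainFunctor {
  void operator()() const {}
};

static_assert(needs_root_sync<RootGroupFunctor>());
static_assert(!needs_root_sync<PlainFunctor>());

} // namespace sketch
```

In this sketch the property travels with the kernel type, so a launcher has to query it itself (here through needs_root_sync) instead of receiving it as an explicit argument, as the lambda-based overloads do.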

However, this method does not appear to set the sycl::ext::oneapi::experimental::use_root_sync property correctly. As a result, UR errors occur when the kernel executes statements that depend on that property, such as:

auto root = it.ext_oneapi_get_root_group();
sycl::group_barrier(root);

To reproduce

Here are two scripts for contrast: the first fails and the second passes.
Script with kernel functor (fails):

#include <cstdlib>
#include <type_traits>
#include <sycl/builtins.hpp>
#include <sycl/detail/core.hpp>
#include <sycl/ext/oneapi/experimental/root_group.hpp>
#include <sycl/group_barrier.hpp>
#include <sycl/kernel_bundle.hpp>
struct RootGroupKernel {
  RootGroupKernel(){}
  void operator()(sycl::nd_item<1> it) const {
    auto root = it.ext_oneapi_get_root_group();
    sycl::group_barrier(root);
  }
  auto get(sycl::ext::oneapi::experimental::properties_tag) {
    return sycl::ext::oneapi::experimental::properties{sycl::ext::oneapi::experimental::use_root_sync};
  }
};
int main() {
  sycl::queue q;
  sycl::range<1> R1{1};
  sycl::nd_range<1> NDR1{R1, R1};
  q.submit([&](sycl::handler &h) {
    h.parallel_for(NDR1, RootGroupKernel());
  });
  return EXIT_SUCCESS;
}

Script with kernel lambda (passes; if you remove props from the parallel_for() argument list it fails, which shows that the property is what matters here):

#include <cassert>
#include <cstdlib>
#include <type_traits>
#include <sycl/builtins.hpp>
#include <sycl/detail/core.hpp>
#include <sycl/ext/oneapi/experimental/root_group.hpp>
#include <sycl/group_barrier.hpp>
#include <sycl/kernel_bundle.hpp>
int main() {
  sycl::queue q;
  sycl::range<1> R1{1};
  sycl::nd_range<1> NDR1{R1, R1};
  const auto props = sycl::ext::oneapi::experimental::properties{sycl::ext::oneapi::experimental::use_root_sync};
  q.submit([&](sycl::handler &h) {
    h.parallel_for(NDR1, props, [=](sycl::nd_item<1> it) {
      auto root = it.ext_oneapi_get_root_group();
      sycl::group_barrier(root);
    });
  });
  return EXIT_SUCCESS;
}

Environment

The issue occurs in Intel/GEN12 environments (and possibly NVIDIA/CUDA). AMD/HIP environments do not seem to be affected.

Additional context

No response

HPS-1 added the bug label Dec 20, 2024