Skip to content

Commit

Permalink
[SYCL] Use static address space cast for atomic_ref ctor in SPIR-…
Browse files Browse the repository at this point in the history
…V path (intel#15384)

From SYCL 2020 specification:

> The sycl::atomic_ref class also has a template parameter AddressSpace,
> which allows the application to make an assertion about the address
> space of the object of type T that it references. The default value
> for this parameter is access::address_space::generic_space, which
> indicates that the object could be in either the global or local
> address spaces. If the application knows the address space, it can set
> this template parameter to either access::address_space::global_space
> or access::address_space::local_space as an assertion to the
> implementation. Specifying the address space via this template
> parameter may allow the implementation to perform certain
> optimizations. Specifying an address space that does not match the
> object’s actual address space results in undefined behavior

We use `ext::oneapi::experimental::static_address_cast` to do that. It's
not implemented for CUDA/HIP yet, that path continues using
`sycl::address_space_cast` that performs runtime checks:

> An implementation must return nullptr if the run-time value of pointer
> is not compatible with Space, and must issue a compiletime diagnostic
> if the deduced address space for pointer is not compatible with Space.
  • Loading branch information
aelovikov-intel committed Sep 18, 2024
1 parent 032d36a commit 0b65c98
Show file tree
Hide file tree
Showing 2 changed files with 30 additions and 1 deletion.
11 changes: 10 additions & 1 deletion sycl/include/sycl/atomic_ref.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,8 @@

#include <sycl/access/access.hpp> // for address_space
#include <sycl/bit_cast.hpp> // for bit_cast
#include <sycl/memory_enums.hpp> // for getStdMemoryOrder, memory_order
#include <sycl/ext/oneapi/experimental/address_cast.hpp>
#include <sycl/memory_enums.hpp> // for getStdMemoryOrder, memory_order

#ifdef __SYCL_DEVICE_ONLY__
#include <sycl/detail/spirv.hpp>
Expand Down Expand Up @@ -157,8 +158,16 @@ class atomic_ref_base {
}

#ifdef __SYCL_DEVICE_ONLY__
#if defined(__SPIR__)
explicit atomic_ref_base(T &ref)
: ptr(ext::oneapi::experimental::static_address_cast<AddressSpace>(
&ref)) {}
#else
// CUDA/HIP don't support `ext::oneapi::experimental::static_address_cast`
// yet.
explicit atomic_ref_base(T &ref)
: ptr(address_space_cast<AddressSpace, access::decorated::no>(&ref)) {}
#endif
#else
// FIXME: This reinterpret_cast is UB, but happens to work for now
explicit atomic_ref_base(T &ref)
Expand Down
20 changes: 20 additions & 0 deletions sycl/test/check_device_code/atomic_ref.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
// RUN: %clangxx -O3 -fsycl -fsycl-device-only -fno-discard-value-names -S -emit-llvm -fno-sycl-instrument-device-code -o - %s | FileCheck %s

#include <sycl/sycl.hpp>

// CHECK-LABEL: define dso_local spir_func noundef i32 @_Z17atomic_ref_globalRi(
// CHECK-SAME: ptr addrspace(4) noundef align 4 dereferenceable(4) [[I:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] !srcloc [[META6:![0-9]+]] !sycl_fixed_targets [[META7:![0-9]+]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[CALL_I_I_I_I_I_I:%.*]] = tail call spir_func noundef ptr addrspace(1) @_Z33__spirv_GenericCastToPtr_ToGlobalPvi(ptr addrspace(4) noundef [[I]], i32 noundef 5) #[[ATTR3:[0-9]+]]
// CHECK-NEXT: [[CALL3_I_I:%.*]] = tail call spir_func noundef i32 @_Z18__spirv_AtomicLoadPU3AS1KiN5__spv5Scope4FlagENS1_19MemorySemanticsMask4FlagE(ptr addrspace(1) noundef [[CALL_I_I_I_I_I_I]], i32 noundef 1, i32 noundef 898) #[[ATTR4:[0-9]+]]
// CHECK-NEXT: ret i32 [[CALL3_I_I]]
//
SYCL_EXTERNAL auto atomic_ref_global(int &i) {
// Verify that we use _Z33__spirv_GenericCastToPtr_ToGlobalPvi that doesn't
// perform dynamic address space validation.
sycl::atomic_ref<int, sycl::memory_order::acq_rel, sycl::memory_scope::device,
sycl::access::address_space::global_space>
a(i);
return a.load();
}

0 comments on commit 0b65c98

Please sign in to comment.