Return device total global memory for MaxAllocSize #1181

Merged (1 commit) on Apr 10, 2024
16 changes: 5 additions & 11 deletions source/adapters/cuda/device.hpp
@@ -68,17 +68,11 @@ struct ur_device_handle_t_ {
}

// Max size of memory object allocation in bytes.
- // The minimum value is max(min(1024 × 1024 ×
- // 1024, 1/4th of CL_DEVICE_GLOBAL_MEM_SIZE),
- // 32 × 1024 × 1024) for devices that are not of type
- // CL_DEVICE_TYPE_CUSTOM.
- size_t Global = 0;
- UR_CHECK_ERROR(cuDeviceTotalMem(&Global, cuDevice));
-
- auto QuarterGlobal = static_cast<uint32_t>(Global / 4u);
-
- MaxAllocSize = std::max(std::min(1024u * 1024u * 1024u, QuarterGlobal),
- 32u * 1024u * 1024u);
+ // The minimum value is max (1/4th of info::device::global_mem_size,
+ // 128*1024*1024) if this SYCL device is not device_type::custom.
+ // CUDA doesn't really have this concept, and could allow almost 100% of
+ // global memory in one allocation, but is dependent on device usage.
+ UR_CHECK_ERROR(cuDeviceTotalMem(&MaxAllocSize, cuDevice));
}

~ur_device_handle_t_() { cuDevicePrimaryCtxRelease(CuDevice); }
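
To make the effect of this hunk concrete, the sketch below mirrors the 11 deleted lines and the 5 added lines as two pure functions, with no CUDA dependency. The helper names `oldMaxAllocSize` and `newMaxAllocSize` and the constants `MiB`, `GiB`, `Cap` and `Floor` are hypothetical and do not appear in the PR; the `Global` argument stands in for the value that `cuDeviceTotalMem` would report. One thing the sketch exposes is that the deleted code narrowed `Global / 4u` to `uint32_t`, so a quarter of 4 GiB or more wraps modulo 2^32. The PR text shown here does not say whether that motivated the change, so treat it as an observation about the old code rather than the stated rationale.

```cpp
#include <algorithm>
#include <cstdint>

constexpr std::uint64_t MiB = 1024ull * 1024ull;
constexpr std::uint64_t GiB = 1024ull * MiB;

// Hypothetical helper mirroring the deleted lines: clamp a quarter of
// global memory to the range [32 MiB, 1 GiB], using 32-bit arithmetic.
constexpr std::uint64_t oldMaxAllocSize(std::uint64_t Global) {
  constexpr std::uint32_t Cap = 1024u * 1024u * 1024u; // 1 GiB
  constexpr std::uint32_t Floor = 32u * 1024u * 1024u; // 32 MiB
  // As in the deleted code, the quarter is narrowed to 32 bits,
  // so values of 4 GiB or more wrap around modulo 2^32.
  auto QuarterGlobal = static_cast<std::uint32_t>(Global / 4u);
  return std::max(std::min(Cap, QuarterGlobal), Floor);
}

// Hypothetical helper mirroring the added line: the reported maximum
// allocation size is simply the device's total global memory.
constexpr std::uint64_t newMaxAllocSize(std::uint64_t Global) {
  return Global;
}

// 8 GiB device: a quarter is 2 GiB, capped to 1 GiB.
static_assert(oldMaxAllocSize(8 * GiB) == 1 * GiB, "capped at 1 GiB");
// 16 GiB device: a quarter is 4 GiB, which wraps to 0 in 32 bits,
// so only the 32 MiB floor remains.
static_assert(oldMaxAllocSize(16 * GiB) == 32 * MiB, "wraps to the floor");
// Small device: a quarter of 64 MiB is below the floor.
static_assert(oldMaxAllocSize(64 * MiB) == 32 * MiB, "raised to the floor");
// The new behaviour reports the full size.
static_assert(newMaxAllocSize(16 * GiB) == 16 * GiB, "total global memory");
```

The `static_assert` lines are checked at compile time (C++14 or later), so the file needs no `main` to confirm the values. Under the old formula no device could report more than 1 GiB, while the new code reports the whole of global memory, which the added comment describes as an upper bound that depends on device usage.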