This repository has been archived by the owner on Aug 11, 2023. It is now read-only.

cl::sycl::exception #84

Closed
huoyushi opened this issue Jan 3, 2018 · 13 comments

Comments

@huoyushi

huoyushi commented Jan 3, 2018

I built this SDK against OpenCL 1.2. The build succeeds, but when I run the hello world sample I get this error:

terminate called after throwing an instance of 'cl::sycl::exception'
Aborted (core dumped)

This is my environment:
Ubuntu 16.04, GCC 5.4.0, ComputeCpp 0.5.0

clinfo :

Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.0 AMD-APP (2482.3)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Platform Extensions function suffix AMD

Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name Hainan
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 1.2 AMD-APP (2482.3)
Driver Version 2482.3
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Board Name (AMD) AMD Radeon HD 8500M
Device Topology (AMD) PCI-E, 04:00.0
Max compute units 4
SIMD per compute unit (AMD) 4
SIMD width (AMD) 16
SIMD instruction width (AMD) 1
Max clock frequency 850MHz
Graphics IP (AMD) 6.0
Device Partition (core)
Max number of sub-devices 4
Supported partition types none specified
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple 64
Wavefront width (AMD) 64
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (n/a)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 32, Little-Endian
Global memory size 2140311552 (1.993GiB)
Global free memory (AMD) <printDeviceInfo:68: get number of CL_DEVICE_GLOBAL_FREE_MEMORY_AMD : error -33>
Global memory channels (AMD) 2
Global memory banks per channel (AMD) 8
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 1591773593 (1.482GiB)
Unified memory for Host and Device No
Minimum alignment for any data type 128 bytes
Alignment of base address 2048 bits (256 bytes)
Global Memory cache type Read/Write
Global Memory cache size 16384
Global Memory cache line 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 bytes
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Local
Local memory size 32768 (32KiB)
Local memory syze per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max constant buffer size 65536 (64KiB)
Max number of constant args 8
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 1514963480756412943ns (Wed Jan 3 15:11:20 2018)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) No
SPIR versions 1.2
printf() buffer size 1048576 (1024KiB)
Built-in kernels
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD]
clCreateContext(NULL, ...) [default] Success [AMD]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name Hainan
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name Hainan

ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.8
ICD loader Profile OpenCL 1.2
NOTE: your OpenCL library declares to support OpenCL 1.2,
but it seems to support up to OpenCL 2.1 too.

computecpp_info :

ComputeCpp Info (CE 0.5.0)


Toolchain information:

GLIBC version: 2.23
GLIBCXX: 20160609
This version of libstdc++ is supported.


Device Info:

Discovered 1 devices matching:
platform :
device type :


Device 0:

Device is supported : UNTESTED - Vendor not tested on this OS
CL_DEVICE_NAME : Hainan
CL_DEVICE_VENDOR : Advanced Micro Devices, Inc.
CL_DRIVER_VERSION : 2482.3
CL_DEVICE_TYPE : CL_DEVICE_TYPE_GPU

@rodburns
Contributor

rodburns commented Jan 3, 2018

Can you please run this again using gdb to obtain a stack trace and post the trace here?

@DuncanMcBain
Member

Running the "accessors" sample should also have some more informative output, hopefully!

@huoyushi
Author

huoyushi commented Jan 3, 2018

Running the accessors sample gives this error, @DuncanMcBain:
SYCL exception caught: Error: [ComputeCpp:RT0407] Failed to create OpenCL command queue

@huoyushi
Author

huoyushi commented Jan 3, 2018

This is the gdb output, @rodburns:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6299700 (LWP 8444)]
[New Thread 0x7ffff5a98700 (LWP 8445)]
[New Thread 0x7ffff5297700 (LWP 8446)]
[New Thread 0x7ffff4a96700 (LWP 8447)]
[New Thread 0x7ffff4295700 (LWP 8448)]
[Thread 0x7ffff6299700 (LWP 8444) exited]
[Thread 0x7ffff4a96700 (LWP 8447) exited]
[New Thread 0x7ffff6299700 (LWP 8449)]
[Thread 0x7ffff4295700 (LWP 8448) exited]
[Thread 0x7ffff5297700 (LWP 8446) exited]
[Thread 0x7ffff5a98700 (LWP 8445) exited]
[Thread 0x7ffff6299700 (LWP 8449) exited]
SYCL exception caught: Error: [ComputeCpp:RT0407] Failed to create OpenCL command queue[Inferior 1 (process 8440) exited with code 02]

@rodburns
Contributor

rodburns commented Jan 3, 2018

Thanks for sharing that. An exception is being thrown in the code. Can you put try and catch around the code and output the error shown in the exception? This should give the cl error code and will help to track down what the problem is.

@huoyushi
Author

huoyushi commented Jan 4, 2018

Hi @rodburns @DuncanMcBain.
I debugged the accessors program in gdb with breakpoints. The exception
Error: [ComputeCpp:RT0407] Failed to create OpenCL command queue[Inferior 1 (process 8440) exited with code 02]
is thrown when the queue is defined at line 44:
/* We can also create a queue that uses the default selector in
43 * the queue's default constructor. */
44 queue myQueue;

I also tested my device with this code:
#include <CL/sycl.hpp>
#include <iostream>

int main() {
  // List every device the SYCL runtime can see.
  auto devices = cl::sycl::device::get_devices();
  std::cout << devices.size() << " devices\n";
  // Print the name of each device.
  for (const auto& d : devices)
    std::cout << d.get_info<cl::sycl::info::device::name>() << "\n";
}

Here is the output:
1 devices
Hainan

@huoyushi
Author

huoyushi commented Jan 4, 2018

@rodburns @DuncanMcBain Sorry for not replying right away. I stepped into the queue definition, and it seems the error is caused by the library
/opt/amdgpu-pro/lib/x86_64-linux-gnu/libamdocl12cl64.so

Here is the gdb detail:

Breakpoint 1, main ()
at /home/huoyushi/computecpp-sdk/samples/accessors/accessors.cpp:44
44 queue myQueue;
(gdb) s
cl::sycl::property_list::property_list<, void, void>() (this=0x7fffffffda70)
at /usr/local/computecpp/include/SYCL/property.h:201
201 property_list(propertyTN &&... props) {
(gdb) s
std::vector<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::vector (this=0x7fffffffda70)
at /usr/include/c++/5/bits/stl_vector.h:257
257 : _Base() { }
(gdb) s
std::_Vector_base<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::_Vector_base (
this=0x7fffffffda70) at /usr/include/c++/5/bits/stl_vector.h:125
125 : _M_impl() { }
(gdb) s
std::_Vector_base<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::_Vector_impl::_Vector_impl (this=0x7fffffffda70) at /usr/include/c++/5/bits/stl_vector.h:87
87 : _Tp_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage()
(gdb) s
std::allocator<std::shared_ptr<cl::sycl::detail::property_base> >::allocator (
this=0x7fffffffda70) at /usr/include/c++/5/bits/allocator.h:113
113 allocator() throw() { }
(gdb) s
__gnu_cxx::new_allocator<std::shared_ptr<cl::sycl::detail::property_base> >::new_allocator (this=0x7fffffffda70) at /usr/include/c++/5/ext/new_allocator.h:79
79 new_allocator() _GLIBCXX_USE_NOEXCEPT { }
(gdb) s
std::_Vector_base<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::_Vector_impl::_Vector_impl (this=0x7fffffffda70) at /usr/include/c++/5/bits/stl_vector.h:88
88 { }
(gdb) s
cl::sycl::property_list::property_list<, void, void>() (this=0x7fffffffda70)
at /usr/local/computecpp/include/SYCL/property.h:202
202 this->reserve(sizeof...(props));
(gdb) s
std::vector<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::reserve (
this=0x7fffffffda70, __n=0) at /usr/include/c++/5/bits/vector.tcc:68
68 if (__n > this->max_size())
(gdb) s
std::vector<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::max_size (
this=0x7fffffffda70) at /usr/include/c++/5/bits/stl_vector.h:660
660 { return _Alloc_traits::max_size(_M_get_Tp_allocator()); }
(gdb) s
std::_Vector_base<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::_M_get_Tp_allocator (
this=0x7fffffffda70) at /usr/include/c++/5/bits/stl_vector.h:118
118 { return *static_cast<const _Tp_alloc_type*>(&this->_M_impl); }
(gdb) s
std::allocator_traits<std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::max_size (__a=...) at /usr/include/c++/5/bits/alloc_traits.h:551
551 { return __a.max_size(); }
(gdb) s
__gnu_cxx::new_allocator<std::shared_ptr<cl::sycl::detail::property_base> >::max_size (this=0x7fffffffda70) at /usr/include/c++/5/ext/new_allocator.h:114
114 { return size_t(-1) / sizeof(_Tp); }
(gdb) s
std::vector<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::reserve (
this=0x7fffffffda70, __n=0) at /usr/include/c++/5/bits/vector.tcc:70
70 if (this->capacity() < __n)
(gdb) s
std::vector<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::capacity (
this=0x7fffffffda70) at /usr/include/c++/5/bits/stl_vector.h:736
736 - this->_M_impl._M_start); }
(gdb) n
std::vector<std::shared_ptr<cl::sycl::detail::property_base>, std::allocator<std::shared_ptr<cl::sycl::detail::property_base> > >::reserve (
this=0x7fffffffda70, __n=0) at /usr/include/c++/5/bits/vector.tcc:85
85 }
(gdb) n
cl::sycl::property_list::property_list<, void, void>() (this=0x7fffffffda70)
at /usr/local/computecpp/include/SYCL/property.h:203
203 detail::add_properties<propertyTN...>::apply(
(gdb) n
205 }
(gdb) n
[New Thread 0x7ffff6299700 (LWP 16381)]
[New Thread 0x7ffff5a98700 (LWP 16382)]
[New Thread 0x7ffff5297700 (LWP 16383)]
[New Thread 0x7ffff4a96700 (LWP 16384)]
[New Thread 0x7ffff4295700 (LWP 16385)]
0x00007fffed458a71 in ?? ()
from /opt/amdgpu-pro/lib/x86_64-linux-gnu/libamdocl12cl64.so

@DuncanMcBain
Member

Hi @huoyushi, thanks for looking into this. The way to get the error code from the exception is to use the get_cl_code() method of the exception thrown. It will be a number which we can then map back to the OpenCL function call to see what has gone wrong.

@huoyushi
Author

huoyushi commented Jan 5, 2018

Hi @DuncanMcBain @rodburns Here is the detail of the exception:
get_file_name: queue_detail.cpp
get_line_number: 123
get_cl_error_message: CL_OUT_OF_HOST_MEMORY
get_cl_code: -6
get_description(): Error: [ComputeCpp:RT0407] Failed to create OpenCL command queue
e.what: Error: [ComputeCpp:RT0407] Failed to create OpenCL command queue

@DuncanMcBain
Member

Hi @huoyushi, thanks for your excellent detective work! From the OpenCL specification, "CL_OUT_OF_HOST_MEMORY [is returned] if there is a failure to allocate resources required by the OpenCL implementation on the host". Bizarre as it sounds, how long has it been since you restarted your computer? Sometimes OpenCL implementations can leak objects over time, and eventually be unable to allocate even the most basic OpenCL objects. I can tell you that the function call that is failing is very simple, but I've been surprised before :)

@mirh

mirh commented Jan 5, 2018

CL_OUT_OF_HOST_MEMORY

lukeiwanski/tensorflow#167
If anything, it's unfortunate that every time this happens you have to waste half a day debugging, instead of gracefully being shown the precise "name" for the usual error.

@Ruyk
Contributor

Ruyk commented Jan 5, 2018

Thanks @mirh , that is a good point. We can sometimes detect this error and present a different message string, so I'll add an internal ticket to do so on a future ComputeCpp release...

@DuncanMcBain
Member

Oh, I forgot about that issue, and they never responded to us. Thanks for the reminder!
