Question: Performance implications when using tensors [N, C, 1, 1] vs [N, C] #1703
Suppose I want to perform the `miopenOpTensor` operation on a 2D tensor. What are the performance implications when `miopenOpTensor` is used with a `[N, C, 1, 1]` tensor vs a `[N, C]` tensor? I found that there are different code paths for tensors with different numbers of dimensions; for example, the `[N, C, 1, 1]` case will go to the code path under `USE_4D_TENSOR_GENERIC`.
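For concreteness, the two variants look roughly like this (a minimal sketch, assuming a valid `miopenHandle_t` and float device buffers of `N * C` elements; error checking omitted):

```c
#include <miopen/miopen.h>

/* Sketch: describe the same packed N x C buffer either as a 2D tensor or as
 * a 4D [N, C, 1, 1] tensor, then run the same miopenOpTensor call. */
void add_2d_vs_4d(miopenHandle_t handle, const void* A, const void* B, void* C,
                  int N, int Cdim)
{
    float alpha1 = 1.0f, alpha2 = 1.0f, beta = 0.0f;

    miopenTensorDescriptor_t desc2d, desc4d;
    miopenCreateTensorDescriptor(&desc2d);
    miopenCreateTensorDescriptor(&desc4d);

    /* Variant 1: a genuine 2D descriptor [N, C] with packed strides. */
    int dims[2]    = {N, Cdim};
    int strides[2] = {Cdim, 1};
    miopenSetTensorDescriptor(desc2d, miopenFloat, 2, dims, strides);

    /* Variant 2: the same data described as 4D [N, C, 1, 1]. */
    miopenSet4dTensorDescriptor(desc4d, miopenFloat, N, Cdim, 1, 1);

    /* C = alpha1 * A + alpha2 * B (beta = 0). Using desc2d here; passing
     * desc4d for all three descriptors instead selects the 4D code path
     * (USE_4D_TENSOR_GENERIC). */
    miopenOpTensor(handle, miopenTensorOpAdd,
                   &alpha1, desc2d, A,
                   &alpha2, desc2d, B,
                   &beta,   desc2d, C);

    miopenDestroyTensorDescriptor(desc2d);
    miopenDestroyTensorDescriptor(desc4d);
}
```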
-
Some details from looking at the code:

Thus we will have the following two kernels:

I don't expect much performance difference for a single run, but that's just a guess without data to support it. Here are the comments that explain the naming of the tensor axes (in case it was not clear):

For 4D:

For 5D:
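The axis naming presumably follows the standard NCHW/NCDHW layout convention; as a hedged sketch (an assumption about the convention, not a quote from the source):

```c
/* Standard axis naming (assumed convention):
 * For 4D: [N, C, H, W]     - batch, channels, height, width (NCHW)
 * For 5D: [N, C, D, H, W]  - batch, channels, depth, height, width (NCDHW)
 */
```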
-
The reason for my question is that MIOpen has different code paths for tensors with different numbers of dimensions. So I wanted to clarify whether it makes sense to create tensor descriptors with the exact number of dimensions, or whether I could just generalize the logic to always use a 4D tensor. It appears to work from a functional perspective, but I'm worried about performance.
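As an illustration of the "always 4D" generalization (a hypothetical helper; the name and the padding scheme are illustrative, not part of the MIOpen API):

```c
#include <miopen/miopen.h>

/* Hypothetical helper: pad any shape with up to four dimensions to 4D by
 * appending size-1 dimensions, so [N, C] becomes [N, C, 1, 1] and one 4D
 * code path handles every input. */
static miopenStatus_t set_desc_padded_to_4d(miopenTensorDescriptor_t desc,
                                            miopenDataType_t type,
                                            const int* dims, int ndims)
{
    int padded[4] = {1, 1, 1, 1};
    for (int i = 0; i < ndims && i < 4; ++i)
        padded[i] = dims[i];
    return miopenSet4dTensorDescriptor(desc, type,
                                       padded[0], padded[1],
                                       padded[2], padded[3]);
}
```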
-
For better performance I recommend using the former approach (tensor descriptors that match the exact number of dimensions) for now. AFAIU the singular dimensions (those of size 1) can be relatively easily "collapsed into nothing", so the library would automatically switch from 4D to 3D tensors, for example. But this kind of optimization has not yet been designed/implemented. Please let us know if you would like this optimization to be implemented (and indicate how urgent/important this is for you).
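A sketch of that collapsing idea (an illustration of the concept, not an existing MIOpen routine): in a packed tensor, dimensions of extent 1 contribute nothing to indexing, so they can be dropped before dispatching a kernel.

```c
/* Drop every size-1 dimension, e.g. [N, C, 1, 1] -> [N, C], so a lower-
 * dimensional kernel can be dispatched. Hypothetical, for illustration. */
static int collapse_unit_dims(const int* dims, int ndims, int* out)
{
    int n = 0;
    for (int i = 0; i < ndims; ++i)
        if (dims[i] != 1)
            out[n++] = dims[i];
    if (n == 0)          /* every extent was 1: keep one scalar dimension */
        out[n++] = 1;
    return n;
}
```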
-
That was mostly a general question about possible approaches. Thanks for clarifying this.