gpucoder.stencilKernel
Create CUDA code for stencil functions
Description
B = gpucoder.stencilKernel(FUN,A,[M N],shape,param1,param2...) applies the function FUN to each [M,N] sliding window of the input A. Function FUN is called for each [M,N] submatrix of A and computes an element of output B. The index of this element corresponds to the center of the [M,N] window.
FUN is the handle to a user-defined function that returns a scalar output of the same type as the input.
C = FUN(X,param1,param2, ...)
X is the [M,N] submatrix of the original input A. X can be zero-padded when necessary, for instance at the boundaries of input A. X and the window can also be 1-D.
C is a scalar-valued output of FUN. It is the output computed for the center element of the [M,N] array X and is assigned to the corresponding element of the output array B.
param1,param2 are optional arguments. Pass these arguments if FUN requires parameters in addition to the input window.
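The per-window behavior described above can be sketched in plain MATLAB. This is only an illustrative reference loop for the 'same' shape, assuming odd window sizes and explicit zero padding; the helper name stencilSameRef is hypothetical and is not part of GPU Coder, and it makes no claim about how the generated CUDA kernel is organized.

```matlab
function B = stencilSameRef(fun, A, win, varargin)
% stencilSameRef  Hypothetical reference helper (not a GPU Coder function).
% Emulates the 'same' shape: FUN is called once per [M,N] window of a
% zero-padded copy of A, and its scalar result is stored at the window center.
    assert(all(mod(win, 2) == 1), 'This sketch assumes odd window sizes.');
    m = win(1);
    n = win(2);
    pm = floor(m/2);                    % rows of zero padding on each side
    pn = floor(n/2);                    % columns of zero padding on each side
    [rows, cols] = size(A);

    % Zero-padded copy of A, same class as the input.
    P = zeros(rows + 2*pm, cols + 2*pn, class(A));
    P(pm+1:pm+rows, pn+1:pn+cols) = A;

    B = zeros(rows, cols, class(A));
    for r = 1:rows
        for c = 1:cols
            X = P(r:r+m-1, c:c+n-1);        % [M,N] window centered on A(r,c)
            B(r,c) = fun(X, varargin{:});   % optional param1, param2, ...
        end
    end
end
```

For instance, `stencilSameRef(@(X,k) k*sum(X(:)), ones(3), [3 3], 2)` forwards the extra argument 2 to the window function in the same position that param1 occupies in the signature above.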
The window [M,N] must be less than or equal to the size of A, with the same shape as A.
If A is a 1-D row vector, the window must be [1,N].
If A is a 1-D column vector, the window must be [N,1].
shape determines the size of the output array B. It can have one of three possible values:
'same' - Returns output B that is the same size as A.
'full' - (default) Returns the full output. The size of B is greater than the size of A: if A is of size (x,y), the size of B = [x + floor(M/2), y + floor(N/2)].
'valid' - Returns only those parts of the output that are computed without the zero-padded edges of A. The size of B = [x - floor(M/2), y - floor(N/2)].
The input A must be a vector or matrix with a numeric type supported by FUN. The class of B is the same as the class of A.
Code generation is supported only for fixed size outputs. Shape and window must be compile-time constants because they determine the size of the output.
Examples
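A minimal sketch of a stencil entry-point function, assuming a 3-by-3 mean filter as the window operation. The function names meanImgFilt and windowMean are illustrative choices, not names required by GPU Coder.

```matlab
function B = meanImgFilt(A) %#codegen
% Apply a 3-by-3 mean filter to A. The window size and the shape are
% literal constants, so the size of B is fixed at compile time.
    B = gpucoder.stencilKernel(@windowMean, A, [3 3], 'same');
end

function out = windowMean(X)
% X is one 3-by-3 window of A (zero-padded at the boundaries).
% Return a scalar of the same class as the input.
    out = cast(mean(X(:)), class(X));
end
```

One possible way to generate a MEX function from this sketch is `cfg = coder.gpuConfig('mex');` followed by `codegen -config cfg -args {ones(512,512,'double')} meanImgFilt`, where the 512-by-512 input size is an arbitrary example value.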
Limitations
For very large input sizes, the gpucoder.stencilKernel function may produce CUDA code that does not numerically match the MATLAB® simulation. In such cases, consider reducing the size of the input to produce accurate results.
See Also
Functions
codegen | coder.gpu.kernel | gpucoder.matrixMatrixKernel | coder.gpu.constantMemory | gpucoder.reduce | gpucoder.sort | coder.gpu.nokernel