c++ - 3D Convolution with Intel MKL -
i have written c/c++
code uses intel mkl compute 3d convolution of array has about 300×200×200
elements. want apply kernel either 3×3×3
or 5×5×5
. both 3d input array , kernel have real values.
this 3d array stored 1d array of type double
in columnwise fashion. kernel of type double
, saved columnwise. example,
for( int k = 0; k < nk; k++ ) // loop through height. for( int j = 0; j < nj; j++ ) // loop through rows. for( int = 0; < ni; i++ ) // loop through columns. { ijk = + ni * j + ni * nj * k; my3darray[ ijk ] = 1.0; }
for computation of convolution, want perform not-in-place
fft on input array , kernel , prevent them getting modified (i need use them later in code) , backward computation in-place
.
when compare result obtained code 1 obtained matlab
different. kindly me fix issue? missing in code?
here matlab
code used:
a = ones( 10, 10, 10 ); kernel = ones( 3, 3, 3 ); aconvolved = convn( a, kernel, 'same' );
here c/c++
code:
#include <stdio.h> #include "mkl.h" void conv3d( double *in, double *ker, double *out, int nrows, int ncols, int nheights) { int ni = nrows; int nj = ncols; int nk = nheights; double *in_fft = new double [ni*nj*nk]; double *ker_fft = new double [ni*nj*nk]; dfti_descriptor_handle fft_desc = 0; mkl_long sizes[] = { nk, nj, ni }; mkl_long strides[] = { 0, nj*ni, ni, 1 }; dfticreatedescriptor( &fft_desc, dfti_double, dfti_real, 3, sizes ); dftisetvalue ( fft_desc, dfti_placement , dfti_not_inplace); // out-of-place computation. dftisetvalue ( fft_desc, dfti_input_strides , strides ); dftisetvalue ( fft_desc, dfti_output_strides, strides ); dftisetvalue ( fft_desc, dfti_backward_scale, 1/ni/nj/nk ); dfticommitdescriptor( fft_desc ); dfticomputeforward ( fft_desc, in , in_fft ); dfticomputeforward ( fft_desc, ker, ker_fft ); (long long = 0; < (long long)ni*nj*nk; ++i ) out[i] = in_fft[i]*ker_fft[i]; // in-place computation. dftisetvalue ( fft_desc, dfti_placement, dfti_inplace ); dfticommitdescriptor( fft_desc ); dfticomputebackward ( fft_desc, out ); dftifreedescriptor ( &fft_desc ); delete[] in_fft; delete[] ker_fft; } int main(int argc, char* argv[]) { int n = 10; int nkernel = 3; double *a = new double [n*n*n]; // array real. double *aconvolved = new double [n*n*n]; // convolved array real. double *kernel = new double [nkernel*nkernel*nkernel]; // kernel real. // fill array 'real' numbers. for( int = 0; < n*n*n; i++ ) a[ ] = 1.0; // fill kernel 'real' numbers. for( int = 0; < nkernel*nkernel*nkernel; i++ ) kernel[ ] = 1.0; // calculate convolution. conv3d( a, kernel, aconvolved, n, n, n ); printf("convolved:\n"); for( int = 0; < n*n*n; i++ ) printf( "%15.8f\n", aconvolved[i] ); delete[] a; delete[] kernel; delete[] aconvolved; return 0; }
you can't reverse fft real-valued frequency data (just magnitude). forward fft needs output complex data. done setting dfti_forward_domain
setting dfti_complex
.
dfticreatedescriptor( &fft_desc, dfti_double, dfti_complex, 3, sizes );
doing implicitly sets backward domain complex too.
you need complex data type. like,
mkl_complex16* in_fft = new mkl_complex16[ni*nj*nk];
this means have multiply both real , imaginary parts:
for (size_t = 0; < (size_t)ni*nj*nk; ++i) { out_fft[i].real = in_fft[i].real * ker_fft[i].real; out_fft[i].imag = in_fft[i].imag * ker_fft[i].imag; }
the output of inverse fft complex, , assuming input data real, can grab .real
component , result. means you'll need temporary complex output array (say, out_fft
above).
also note avoid artifacts, want size of fft (at least) m+n-1 on each dimension. choose next highest power of 2 speed.
i suggest implement in matlab first, using ffts. there many such implementations available (example), start basics , make simple function on own.
Comments
Post a Comment