Running parallel OpenCL kernels
I have been looking into OpenCL for a little while to see if it will be useful in my context, and while I understand the basics, I'm not sure I understand how to force multiple instances of a kernel to run in parallel.
In my situation, the application I want to run is inherently sequential and takes (in some cases) a large input (hundreds of MB). However, the application in question has a number of different options/flags that can be set, which in some cases make it faster or slower. My hope is that I can rewrite the application for OpenCL and then execute each option/flag combination in parallel, rather than guessing which sets of flags to use.
My question is this: how many kernels can a graphics card run in parallel? Is this something that can be looked at when purchasing? Is it linked to the number of shaders, the memory, or the size of the application/kernel?
Additionally, while the input to the application is the same for each execution, each execution will modify the data in a different way. Would I need to transfer the input data to each kernel separately to allow this, or can each kernel allocate "local" memory?
Finally, would this even require multiple kernels, or could I use work-items instead? In that case, how do I determine how many work-items can run in parallel?
(reference: http://www.drdobbs.com/parallel/a-gentle-introduction-to-opencl/231002854?pgno=3)
Your question seems to pop up from time to time in various forums and on SO. The feature you would use to run kernels separately at the hardware level is called device fission. Read more about the extension on this page, or google "cl_ext_device_fission".
This extension has been enabled on CPUs for a long time, but not on GPUs. The newest graphics hardware might support device fission. You probably need a GPU from at least Q2 2014 or newer, but you will have to research this.
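As a rough model of what fission would give you for the flag experiment: in core OpenCL 1.2, `clCreateSubDevices` with `CL_DEVICE_PARTITION_EQUALLY` splits a device into as many sub-devices as it can, each with n compute units, and any leftover compute units go unused. The sketch below is plain Python arithmetic, not OpenCL API code. The function name and the 8-unit device are made up for illustration.

```python
def partition_equally(max_compute_units, units_per_subdevice):
    """Model CL_DEVICE_PARTITION_EQUALLY: return (sub-device count, unused units).

    Hypothetical helper; it only mirrors the arithmetic of the partitioning
    rule and does not talk to any OpenCL runtime.
    """
    if units_per_subdevice < 1 or units_per_subdevice > max_compute_units:
        raise ValueError("units_per_subdevice must be in 1..max_compute_units")
    # As many equal sub-devices as fit; the remainder is left idle.
    return divmod(max_compute_units, units_per_subdevice)


# Assumed example device with 8 compute units.
print(partition_equally(8, 2))
print(partition_equally(8, 3))
```

If each flag combination gets its own sub-device, the first value is an upper bound on how many combinations can run side by side this way, assuming the device supports partitioning at all.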
The way to get kernels to run in parallel using OpenCL in software is to queue them on different command queues on the same device. Some developers say that multiple queues harm performance, but I don't have experience with this personally.
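On the work-item alternative raised in the question: one option is a single kernel launch whose global size equals the number of flag combinations, with each work-item decoding its own id into a set of flags. Here is a minimal sketch of that indexing in plain Python, where a host-side loop stands in for the NDRange and `gid` plays the role of `get_global_id(0)`. The flag names and the `process` callback are hypothetical, since the post does not name the real options.

```python
# Hypothetical option names; the original post does not list the real flags.
FLAGS = ("fast_path", "prefetch", "vectorize")


def global_size(flags=FLAGS):
    """One work-item per on/off combination of the flags."""
    return 1 << len(flags)


def decode_flags(gid, flags=FLAGS):
    """Map a flat work-item id to a flag set: bit i of gid controls flags[i]."""
    return {name: bool((gid >> bit) & 1) for bit, name in enumerate(flags)}


def run_all(process, data, flags=FLAGS):
    """Stand-in for one NDRange launch: the loop plays the role of work-items."""
    results = {}
    for gid in range(global_size(flags)):
        # Each configuration modifies the data differently, so each one
        # works on its own copy rather than on the shared input.
        private_copy = list(data)
        results[gid] = process(private_copy, decode_flags(gid, flags))
    return results


if __name__ == "__main__":
    # Toy workload: the result differs depending on one flag.
    outcome = run_all(lambda buf, f: sum(buf) + (1 if f["fast_path"] else 0), [1, 2, 3])
    print(outcome[0], outcome[1])
```

The per-configuration copy is the important part: with hundreds of MB of input, those copies would have to live in device global memory, because OpenCL local memory is typically only tens of KB per work-group. Whether a single work-item running a whole sequential program performs acceptably depends heavily on the device.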