core - OpenCL work-items and streaming processors -


What is the relation between a work-item and a streaming processor (CUDA core)? I read somewhere that the number of work-items should exceed the number of cores, otherwise there is no performance improvement. Why is that so? I thought 1 core represents 1 work-item. Can someone help me understand this?

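GPUs, and most other hardware, tend to do arithmetic much faster than they can access most of their available memory. Having many more work-items than you have processors lets the scheduler stagger the memory use: while some work-items are waiting for their data to arrive, the work-items that have already read their data are using the ALU hardware to do the processing. With only one work-item per core, the ALU sits idle every time that work-item waits on memory.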
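To make the staggering concrete, here is a rough toy model in Python, not real OpenCL and not how any particular GPU scheduler works. It assumes a single core with one ALU, where each work-item repeatedly issues a memory read and then does some arithmetic on the result. The function name `alu_utilization` and the figures used (400 cycles of memory latency, 20 cycles of arithmetic) are made up for illustration only.

```python
def alu_utilization(num_work_items, mem_latency, alu_cycles, rounds):
    """Fraction of cycles the ALU is busy on one core (toy model).

    Each work-item does `rounds` iterations of: wait `mem_latency`
    cycles for a read, then use the ALU for `alu_cycles` cycles.
    Reads from different work-items are assumed to overlap freely.
    """
    # Every work-item issues its first read at cycle 0.
    ready_at = [mem_latency] * num_work_items
    remaining = [rounds] * num_work_items
    clock = 0
    busy = 0
    while any(remaining):
        # Scheduler picks the unfinished work-item whose data arrives first.
        candidates = [k for k in range(num_work_items) if remaining[k]]
        i = min(candidates, key=lambda k: ready_at[k])
        # If that data has not arrived yet, the ALU idles until it does.
        clock = max(clock, ready_at[i])
        clock += alu_cycles
        busy += alu_cycles
        remaining[i] -= 1
        # The work-item immediately issues its next read.
        ready_at[i] = clock + mem_latency
    return busy / clock


if __name__ == "__main__":
    # Hypothetical latencies: 400-cycle read, 20 cycles of arithmetic.
    for n in (1, 2, 4, 8, 16, 21):
        print(n, "work-items per core ->", round(alu_utilization(n, 400, 20, 10), 3))
```

In this model a single work-item keeps the ALU busy for only 20 of every 420 cycles, and utilization climbs as more work-items share the core, because some other work-item's data is usually ready while the rest are still waiting on memory.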

Here is a page about optimization in OpenCL. Scroll down to "2.4. Removing 'Costly' Global GPU Memory Access", which goes into this concept.

