core - OpenCL work-items and streaming processors -


What is the relation between a work-item and a streaming processor (CUDA core)? I read somewhere that the number of work-items should exceed the number of cores, otherwise there is no performance improvement. Why is that so? I thought 1 core represents 1 work-item. Can someone help me understand this?

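GPUs, like most other hardware, tend to do arithmetic much faster than they can access most of their available memory. Having many more work-items than you have processors lets the scheduler stagger the memory use: while some work-items are waiting on memory, the work-items that have already read their data are using the ALU hardware to do the processing.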
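To make the latency-hiding argument concrete, here is a rough toy model in Python, not a description of any real GPU scheduler. It assumes each work-item repeatedly waits `memory_latency_cycles` for a load and then computes for `compute_cycles`, and that the core can switch to another ready work-item at no cost. The function name and the cycle counts (10 and 300) are made up for illustration.

```python
def alu_utilization(work_items_per_core, compute_cycles, memory_latency_cycles):
    """Fraction of cycles the ALU is busy in a toy latency-hiding model.

    Assumption: every work-item alternates between one memory wait and one
    compute phase, and switching between work-items is free.
    """
    # One full load-then-compute round for a single work-item.
    period = compute_cycles + memory_latency_cycles
    # ALU work available during that round if all resident work-items overlap.
    busy = work_items_per_core * compute_cycles
    # The ALU cannot be more than 100% busy.
    return min(1.0, busy / period)


if __name__ == "__main__":
    # Hypothetical numbers: 10 cycles of arithmetic per 300-cycle memory access.
    for n in (1, 2, 4, 8, 16, 32):
        print(n, round(alu_utilization(n, 10, 300), 3))
```

In this model a single work-item per core leaves the ALU idle most of the time, and utilization only saturates once enough work-items are resident to cover the memory wait. Real hardware schedules in groups (warps or wavefronts) and has other limits, so treat this only as intuition.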

Here is a page about optimization in OpenCL. Scroll down to "2.4. Removing 'costly' global GPU memory access", which goes into this concept.

