Heterogeneous Computing with OpenCL 2.0 teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully-integrated Accelerated Processing Units (APUs). This fully-revised edition includes the latest enhancements in OpenCL 2.0 including:
• Shared virtual memory to increase programming flexibility and reduce data transfers that consume resources • Dynamic parallelism which reduces processor load and avoids bottlenecks • Improved imaging support and integration with OpenGL
Designed to work on multiple platforms, OpenCL will help you more effectively program for a heterogeneous future. Written by leaders in the parallel computing and OpenCL communities, this book explores memory spaces, optimization techniques, extensions, debugging and profiling. Multiple case studies and examples illustrate high-performance algorithms, distributing work across heterogeneous systems, embedded domain-specific languages, and will give you hands-on OpenCL experience to address a range of fundamental parallel algorithms.
- Updated content to cover the latest developments in OpenCL 2.0, including improvements in memory handling, parallelism, and imaging support
- Explanations of principles and strategies to learn parallel programming with OpenCL, from understanding the abstraction models to thoroughly testing and debugging complete applications
- Example code covering image analytics, web plugins, particle simulations, video editing, performance optimization, and more
|Sold by:||Barnes & Noble|
|File size:||13 MB|
|Note:||This product may take a few minutes to download.|
About the Author
Dr. Kaeli has co-authored more than 200 critically reviewed publications. His research spans a range of areas including microarchitecture to back-end compilers and software engineering. He leads a number of research projects in the area of GPU Computing. He presently serves as the Chair of the IEEE Technical Committee on Computer Architecture. Dr. Kaeli is an IEEE Fellow and a member of the ACM.
Perhaad Mistry works in AMD’s developer tools group at the Boston Design Center focusing on developing debugging and performance profiling tools for heterogeneous architectures. He is presently focused on debugger architectures for upcoming platforms shared memory and discrete Graphics Processing Unit (GPU) platforms. Perhaad has been working on GPU architectures and parallel programming since CUDA 0.8 in 2007. He has enjoyed implementing medical imaging algorithms for GPGPU platforms and architecture aware data structures for surgical simulators. Perhaad's present work focuses on the design of debuggers and architectural support for performance analysis for the next generation of applications that will target GPU platforms.
Perhaad graduated after 7 years with a PhD from Northeastern University in Electrical and Computer Engineering and was advised by Dr. David Kaeli who the leads Northeastern University Computer Architecture Research Laboratory (NUCAR). Even after graduating, Perhaad is still a member of NUCAR and is advising on research projects on performance analysis of parallel architectures. He received a BS in Electronics Engineering from University of Mumbai and an MS in Computer Engineering from Northeastern University in Boston. He is presently based in Boston.
Dana Schaa received a BS in Computer Engineering from Cal Poly, San Luis Obispo, and an MS and PhD in Electrical and Computer Engineering from Northeastern University. He works on GPU architecture modeling at AMD, and has interests and expertise that include memory systems, microarchitecture, performance analysis, and general purpose computing on GPUs. His background includes the development OpenCL-based medical imaging applications ranging from real-time visualization of 3D ultrasound to CT image reconstruction in heterogeneous environments. Dana married his wonderful wife Jenny in 2010, and they live together in San Jose with their charming cats.
Table of Contents
Foreword Ch 1: Introduction Ch 2: Device Architectures Ch 3: Introduction to OpenCL Ch 4: Examples Ch 5: Execution Model Ch 6: host-side memory model Ch 7: device-side memory model Ch 8: Implementation Ch 9: Case study: Image Clustering and Search Ch 10: Profiling and Debugging Ch 11: C++ AMP Ch 12: WebCL Ch 13: Foreign Lands: Plugging OpenCL In