Advanced School on HPC Computing with GPU Accelerators


Large-scale GPU clusters are becoming the standard in the HPC scientific computing community. These architectures combine multiple CPU sockets and GPU devices in the same node, and their complexity can be hard to tame.

The Advanced School on HPC Computing with GPU Accelerators provides a comprehensive training path for developers who want to take their scientific applications to the next level.

The course will start with an architectural overview of modern GPU-based heterogeneous clusters, focusing on their components, computing units, memory interconnections and data movement needs. We will teach you how to profile your applications, identify bottlenecks, and select or optimize computationally intensive sections to run on GPU accelerators. We will explain how to exploit concurrent execution on both CPUs and GPUs while optimizing data transfers and communications.

The course will cover both a high-level (pragma-based) programming approach for quickly starting a port, and a lower-level (language instructions) approach for finer-grained, computationally intensive tasks. Special attention will be given to performance tuning and to techniques for overcoming common bottlenecks in data movement and memory access patterns.
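As an illustration of the pragma-based approach, here is a minimal sketch in C. It assumes OpenACC as the directive model, which this page does not name, so treat it as one possible example rather than the course's actual material. The function name `saxpy` and its parameters are hypothetical, chosen for this sketch only.

```c
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 *
 * Assumption: OpenACC is used as the pragma-based model.
 * With an OpenACC-enabled compiler, the directive requests that the loop
 * be offloaded, copying x to the device and y to the device and back.
 * A compiler without OpenACC support ignores the pragma, and the loop
 * runs unchanged on the CPU. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loop body is the same whether or not it is offloaded, which is what makes directives attractive as a first porting step. A lower-level approach would instead rewrite the loop as an explicit device kernel and manage memory transfers by hand.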


By the end of the course, students will be able to:

  • understand the strengths and weaknesses of GPUs as accelerators
  • program GPU-accelerated applications using both higher- and lower-level programming approaches
  • profile an application, identify bottlenecks, make a porting plan, then refine and improve it
  • make best use of independent execution queues for concurrent computing/data-movement operations

Target audience: 

Researchers and programmers interested in porting scientific applications or in using efficient post-processing and data-analysis techniques on modern heterogeneous HPC architectures.


A basic knowledge of C or Fortran is mandatory. The development environment will be on Linux systems. A basic knowledge of any parallel programming technique or paradigm is recommended.

Intended for: 
Research Institutions
Duration: 5 days
Next courses

No editions of this course are currently scheduled.

Any questions?

For HPC and computer graphics courses, write to


Cineca is a non-profit consortium made up of 102 Italian national institutions: universities, Italian research institutions and the Italian Ministries of Universities and Education.

Today it is the largest Italian computing centre and one of the most important worldwide. With more than seven hundred employees, it operates in the technology transfer sector through high performance scientific computing, the management and development of networks and web-based services, and the development of complex information systems for treating large amounts of data.

It develops advanced Information Technology applications and services, acting as a link between the academic world, the sphere of pure research, and the world of industry and Public Administration.

Visit the Cineca website