DAT400 - High-performance parallel programming  
Hög-prestanda parallell programmering
Syllabus adopted 2019-02-07 by Head of Programme (or corresponding)
Owner: MPHPC
7,5 Credits
Grading: TH - Five, Four, Three, Fail
Education cycle: Second-cycle
Major subject: Computer Science and Engineering, Information Technology

Teaching language: English
Application code: 86114
Open for exchange students: Yes
Block schedule: B+
Maximum participants: 80

Module | Credits | Grading | Examination dates
0119 Laboratory | 3,0 c | UG | -
0219 Examination | 4,5 c | TH | 30 Oct 2019 pm M; 07 Jan 2020 am M; 24 Aug 2020 am J

In programs



Examiner: Miquel Pericas



Eligibility

In order to be eligible for a second cycle course the applicant needs to fulfil the general and specific entry requirements of the programme that owns the course. (If the second cycle course is owned by a first cycle programme, second cycle entry requirements apply.)
Exemption from the eligibility requirement: Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling these requirements.

Course specific prerequisites

The course DAT017 - Machine oriented programming, or a similar course, is required. The course TDA384 - Principles for Concurrent programming is recommended.


Aim

This course covers parallel programming models, efficient programming methodologies and performance tools, with the objective of developing highly efficient parallel programs.

Learning outcomes (after completion of the course the student should be able to)

Knowledge and Understanding
- List the different types of parallel computer architectures, programming models and paradigms, as well as different schemes for synchronization and communication.
- List the typical steps to parallelize a sequential algorithm.
- List different methodologies for analysing the performance of parallel program systems.

Competence and skills
- Apply performance analysis methodologies to determine the bottlenecks in the execution of a parallel program
- Predict the upper limit to the performance of a parallel program
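
As an illustration of the last skill above, one common way to bound the achievable speedup is Amdahl's law: if a fraction p of the sequential runtime can be parallelized perfectly over n workers, the speedup is at most 1 / ((1 - p) + p / n). The syllabus does not name a specific model, so the choice of Amdahl's law and the function names below are assumptions made for this sketch.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the runtime scales perfectly."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_limit(parallel_fraction: float) -> float:
    """Speedup limit as the number of workers grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical program in which 90% of the runtime is parallelizable.
    for workers in (2, 8, 64):
        print(workers, round(amdahl_speedup(0.9, workers), 2))
    print("limit", round(amdahl_limit(0.9), 2))
```

The bound ignores communication and synchronization overhead, so measured speedups will normally fall below it.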

Judgment and approach
- Given a particular piece of software, identify which performance bottlenecks limit the efficiency of its parallel code and select appropriate strategies to overcome these bottlenecks
- Design energy-aware parallelization strategies based on a specific algorithm's structure and the organization of the computing system
- Argue which performance analysis methods are important given a specific context


Content

The course consists of a set of lectures and laboratory sessions. The lectures start with an overview of parallel computer architectures and parallel programming models and paradigms. Mechanisms for synchronization and data exchange form an important part of this discussion. Next, performance analysis of parallel programs is covered. The course proceeds with a discussion of tools and techniques for developing parallel programs in shared address spaces. This section covers popular programming environments such as pthreads and OpenMP. Next, the course discusses the development of parallel programs for distributed address spaces. The focus in this part is on the Message Passing Interface (MPI). Finally, the course discusses programming approaches for executing applications on accelerators such as GPUs. This part introduces the CUDA (Compute Unified Device Architecture) programming environment.
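
To give a rough feel for the fork-join structure that shared-address-space environments such as pthreads and OpenMP rely on, the following Python sketch decomposes a reduction into chunks, processes the chunks in worker threads, and combines the partial results. It is not course material: the function names are hypothetical, and Python threads merely stand in for pthreads or OpenMP threads. Because of the interpreter lock in standard CPython, this CPU-bound example shows the structure only and is not expected to run faster than the sequential version.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, parts):
    """Decomposition: cut `data` into at most `parts` contiguous chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """Work done independently by each thread on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Fork: one task per chunk. Join: wait for all tasks, then reduce."""
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    # Combining happens in the calling thread after all workers finish,
    # so no lock is needed around the shared result.
    return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), workers=3))
```

Keeping a private partial sum per chunk and reducing at the end avoids synchronizing on a shared accumulator, which is the same idea behind reduction clauses in OpenMP.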

The lectures are complemented with a set of laboratory sessions in which participants explore the topics introduced in the lectures. During the lab sessions, participants parallelize sample programs over a variety of parallel architectures, and use performance analysis tools to detect and remove bottlenecks in the parallel implementations of the programs.


Organisation

The teaching consists of theory-oriented lectures and lab sessions in which the participants develop code for different types of parallel computer systems.


Literature

Thomas Rauber, Gudula Rünger: Parallel Programming for Multicore and Cluster Systems (2nd edition, 2013)

Examination including compulsory elements

The course is examined by an individual written exam, carried out in an examination hall, and a laboratory report written in groups of two.

The final grade of the course is based on the weighted average of the grades of the individual subcourses. Each individual subcourse is graded on a scale of F, 3, 4, 5.

Published: Thu 04 Feb 2021.