
Syllabus for EDA283 - Parallel computer organization and design
Syllabus adopted 2015-02-10 by Head of Programme (or corresponding)
Owner: MPCSN
7,5 Credits
Grading: TH - Five, Four, Three, Not passed
Education cycle: Second-cycle
Major subject: Computer Science and Engineering, Electrical Engineering, Information Technology

Teaching language: English
Open for exchange students
Block schedule: B

Course module                        Credit distribution   Examination dates
0115 Project                         3,0 c   Grading: TH
0215 Written and oral assignments    3,0 c   Grading: TH
0315 Laboratory                      1,5 c   Grading: TH

In programs



Professor Sally McKee


EDA280   Parallel computer systems
EDA281   Parallel computer organization and design
EDA282   Parallel computer organization and design



In order to be eligible for a second cycle course the applicant needs to fulfil the general and specific entry requirements of the programme that owns the course. (If the second cycle course is owned by a first cycle programme, second cycle entry requirements apply.)
Exemption from the eligibility requirement: Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling these requirements.

Course specific prerequisites

The course DAT105 Computer architecture or equivalent is required. The course TDA383 Concurrent programming is recommended.


Aim

From 1975 to 2005, the computer industry accomplished a phenomenal mission: in 30 years, we put a personal computer on every desk and in every pocket. In 2005, however, mainstream computing hit a wall, and the industry undertook a new mission: to put a personal parallel supercomputer on every desk, in every home, and in every pocket. In 2011, we completed the transition to parallel computing in all mainstream form factors, with the arrival of multicore tablets and smartphones. Soon this "build out" of multicore will deliver mainstream quad- and eight-core tablets, and even the last single-core gaming console will become multicore. For the first time in the history of computing, mainstream hardware is no longer a single-processor von Neumann machine.
Power and temperature have joined performance as first-class design goals. High-performance computing platforms now strive for the highest performance/watt. This course looks at the design of current multicore systems with an eye towards how those designs are likely to evolve over the next decade. We also cover the historical origins of many design strategies that have re-emerged in current systems in different forms and contexts (e.g., data parallelism, VLIW parallelism, and thread-level parallelism).

Learning outcomes (after completion of the course the student should be able to)

Students will:
- master the terminology and key concepts of parallel computer architecture in order to follow research advances in this field;
- understand the principles behind parallel computer organization (especially principles for the design of the communication substrate to support different programming models);
- understand different models of parallelism from a historical perspective; and
- exhibit basic skills in the design of software for parallel computers and an understanding of the issues involved in designing efficient parallel software.

Students will strengthen communication skills and demonstrate their learning through participating in lectures, labs, and smaller project-group meetings; completing homework assignments (to demonstrate breadth of learning); and writing a research survey in collaboration with students in their project groups and in cooperation with the instructor (to demonstrate depth of learning on a parallel computer design subject of their choosing). 


Content

The course covers architectural techniques for designing parallel computers (in particular, techniques supporting major programming paradigms like message passing, shared memory, data parallelism, explicit instruction-level parallelism [VLIW], and modern combinations of these, such as the combined thread-and-data parallelism supported by GPUs).

The content is divided into several parts. The first part gives a historical overview of the types of machines introduced at key points in history, highlighting which innovations had lasting impact right away, which resurfaced later under different technology and cost parameters, and which fell by the wayside (so far). By looking at the evolution of (high-performance, parallel) computer designs from this perspective, students will see that even though a particular computer may not succeed in the marketplace, the innovations it realizes may have far-reaching and recurring impact.
We will touch on interesting emerging architectures (e.g., ultra low power tiled architectures and architectures emphasizing FPGA-based reconfigurability) being investigated at Chalmers. 

The next part covers Flynn's taxonomy, implementation of data parallel "supercomputers" (and lessons learned from them), and implementation of computers with wide issue of explicitly parallel (and statically controlled) instructions (VLIW). These studies highlight the context in which each strategic approach was proposed and developed, and then in what contexts the ideas and design principles reappear today.

In order to motivate deeper discussions of different classes of parallel computers, the third part covers the canonical steps in designing efficient parallel  software. Important concepts are decomposing a sequential program into parallel  threads (or parallel data), balancing the load across (parallel) architectural resources,  reducing communication, and synchronizing parallel activity (e.g., parallel threads). 
Lab assignments will specifically focus on shared-memory machines, highlighting ramifications of different memory coherence strategies for different kinds of workloads.

The fourth part focuses on design principles for small-scale shared-memory parallel computers, e.g., design principles for multicore microprocessors that support thread-level parallelism. Important concepts covered are cache coherence and memory consistency. We study bus-based snoopy-cache protocols, the inclusion property, and multi-phase protocols.
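To make the snoopy-protocol material concrete, the per-line state machine of MESI (a standard invalidation-based snoopy protocol) can be written as a small transition function. This is a simplified sketch with hypothetical names; it omits bus arbitration, flushes, and transient states that a real multi-phase implementation needs.

```c
/* MESI state machine for one cache line, as seen by one cache.
   PR_* events come from the local processor; BUS_* events are
   transactions snooped from other caches on the shared bus. */
typedef enum { MESI_M, MESI_E, MESI_S, MESI_I } mesi_t;
typedef enum { PR_RD, PR_WR, BUS_RD, BUS_RDX } mesi_event_t;

/* 'shared' flags whether another cache asserted the shared line
   in response to our bus read (i.e., other copies exist). */
mesi_t mesi_next(mesi_t s, mesi_event_t e, int shared) {
    switch (s) {
    case MESI_I:
        if (e == PR_RD)   return shared ? MESI_S : MESI_E; /* read miss */
        if (e == PR_WR)   return MESI_M;   /* read-for-ownership */
        return MESI_I;                     /* snooped traffic: no copy here */
    case MESI_S:
        if (e == PR_WR)   return MESI_M;   /* upgrade: invalidate others */
        if (e == BUS_RDX) return MESI_I;   /* another cache takes ownership */
        return MESI_S;
    case MESI_E:
        if (e == PR_WR)   return MESI_M;   /* silent upgrade: no bus traffic */
        if (e == BUS_RD)  return MESI_S;   /* another reader appears */
        if (e == BUS_RDX) return MESI_I;
        return MESI_E;
    case MESI_M:
        if (e == BUS_RD)  return MESI_S;   /* flush dirty data, keep a copy */
        if (e == BUS_RDX) return MESI_I;   /* flush and invalidate */
        return MESI_M;
    }
    return MESI_I;
}
```

The E state is what distinguishes MESI from the simpler MSI: a line known to be the only copy can be written without any bus transaction.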

The fifth part deals with scalability of parallel computers, i.e., architectural techniques for scaling the number of processors to a higher count, specifically with respect to cache coherence protocols.

The sixth part deals with interconnection networks, an essential component in chip multiprocessors and scalable parallel computer systems. Concepts covered are routing, switching, and topology design for scalable interconnects.
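As a small taste of the routing material, dimension-order (XY) routing on a 2D mesh fits in a few lines. This sketch assumes a coordinate system where x grows eastward and y grows northward; the names are hypothetical, not from the course.

```c
/* Dimension-order (XY) routing on a 2D mesh: route the packet fully
   in the X dimension, then in Y. The fixed X-then-Y order forbids
   cyclic channel dependences, making the scheme deadlock-free. */
#include <stdlib.h>

typedef enum { EAST, WEST, NORTH, SOUTH, LOCAL } port_t;

/* Next output port for a packet at router (x, y) headed to (dx, dy). */
port_t xy_route(int x, int y, int dx, int dy) {
    if (dx > x) return EAST;
    if (dx < x) return WEST;
    if (dy > y) return NORTH;
    if (dy < y) return SOUTH;
    return LOCAL;  /* arrived: deliver to the attached node */
}

/* Minimal hop count between two mesh nodes (Manhattan distance). */
int mesh_hops(int x, int y, int dx, int dy) {
    return abs(dx - x) + abs(dy - y);
}
```

XY routing is minimal (never takes a non-shortest hop) but inflexible: it cannot route around congestion or faults, which is the motivation for the adaptive routing schemes also covered in this part.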

A common thread running through all parts is a discussion of cost tradeoffs with respect to performance, power, energy, verifiability, programmability, and maintainability. A second unifying theme is the memory bottleneck and the importance of efficient resource management. Example project topics include multithreading, relaxed memory consistency models, prefetching, and memory access scheduling.


Organisation

The course is organized into lectures, exercises, lab assignments, and a longer writing project, all of which strive to build communication skills while focusing on the principles and practices of parallel computer design.


Literature

See separate literature list.
We will refer to the course textbook, but we will also rely on articles from the research literature, from trade magazines, and possibly from the popular press.


Examination

Examination is by a multi-week written project rather than a conventional exam.

Published: Thu 04 Feb 2021.