CS798 Multicore programming - W21 term

Instructor: Trevor Brown
E-mail: trevor ( d o t ) brown ( a t ) uwaterloo ( d o t ) ca     (include 798 in your subject)
Times: Asynchronous pre-recorded videos
Tutorials / discussion: weekly synchronous online, time to be decided by class poll
My office: DC 2338 ( not that I am allowed in it :) )
Course syllabus

Relevant Links

Slides

Class Topics Video Slides Supplementary material (Optional!)
Lecture 1 - Sep. 4 Threading model, linearizability, atomics, counters See Learn v2: PowerPoint | PDF (lower quality) Recommended: Paper that introduced Linearizability
Lecture 2 - Sep. 9 Sharded counters, cache coherence, false sharing, padding, locks, fetch-and-add See Learn v3: PowerPoint | PDF (lower quality)
Lecture 3 - Sep. 11 Instruction reordering, stacks See Learn v4: PowerPoint | PDF (lower quality) Recommended: Treiber stack (original paper)
A later stack technique (elimination)
Even later stacks
Hazard pointers (memory reclamation)
Treiber stack with hazard pointers for memory reclamation
Lecture 4 - Sep. 16 Linearizability proof for the lock-free stack, ordering vs concurrency See Learn v3: Powerpoint | PDF (lower quality)
Lecture 5 - Sep. 18 Harnessing disorder: relaxed (multi-)queues, NUMA effects and thread pinning See Learn v2: Powerpoint | PDF (lower quality) First lock-free queue | Newer queues
Multi-Queue paper (mostly about graph algorithms)
Lecture 6 - Sep. 23 Concurrent hash tables (lock-based and lock-free) See Learn v2: Powerpoint | PDF (lower quality)
Lecture 7 - Sep. 25 Concurrent hash tables with expansion See Learn v2: Powerpoint | PDF (lower quality) Expandable hash table we are studying
Lecture 8 - Sep. 30 Finishing hash table expansion, starting linked data structures See Learn v3: Powerpoint | PDF (lower quality)
Lecture 9 - Oct. 2 Sketch of Harris list, difficulty of doubly-linked lists with lock-free searches See Learn v2: Powerpoint | PDF (lower quality) Recommended(!): Harris' linked list
Lecture 10 - Oct. 7 Higher level sync primitives: Doubly-linked list via k-word CAS See Learn v2: Powerpoint | PDF (lower quality)
Lecture 11 - Oct. 9 Implementing KCAS (and DCSS) from single word CAS See Learn v2: Powerpoint | PDF (lower quality) Recommended: KCAS implementation we are studying
Lecture 12 - Oct. 21 Implementing KCAS, reclaiming descriptors (briefly) See Learn v1: Powerpoint | PDF (lower quality)
Lecture 13 - Oct. 23 Hardware transactional memory, and TLE See Learn v1: Powerpoint | PDF (lower quality) GCC transactional memory intrinsics
Lecture 14 - Oct. 28 Advanced transactional memory See Learn v2: Powerpoint | PDF (lower quality) My paper that I mentioned in class (wrapping code in a transaction and repeatedly transforming it)
Lecture 15 - Oct. 30 Versioned locks, snapshots, version-lock-based KCAS See Learn v3: Powerpoint | PDF (lower quality)
Lecture 16 - Nov. 4 Optimizing lock-based KCAS with HTM, OpenMP, debugging/perf See Learn v2: Powerpoint | PDF (lower quality)
Lecture 17 - Nov. 6 Epoch-based memory reclamation, experimental methodology See Learn v2: Powerpoint | PDF (lower quality) See section 5.2.3 of Fraser's thesis for a discussion of EBR
Another paper that describes EBR and does some performance experiments

Assignments

See Learn.

Student presentation guidelines

DETAILS NOT YET DECIDED. Talks might be synchronous or recorded videos. I will probably resolve this with a class vote/poll. More soon, on Learn...