DM207 I/O-Efficient Algorithms and Data Structures

Fall 2015
Rolf Fagerberg


Official Course Description

See the course description at the web pages of the faculty.

Time and Place

The course starts Monday, August 31.

The schedule is as follows:

Teaching material

The course is based on handouts (no textbook).

Examination

The final exam is oral, with grades on the 7-point marking scale. The exam date is January 26. There are projects which must be passed during the course in order to attend the oral exam.

At the oral exam, you will draw one topic from a list of exam topics. The topic delineates the part of curriculum which you are to present. The full details of the exam format and curriculum are described at the bottom of the list of exam topics.

Lectures

Date Time Room Contents Reading
Monday, August 31 09-11 Imada seminar room Introduction to the course. The I/O-model. Pages 1--18 of the introduction sldes. For the original appearance of the I/O-model, see Section 1 of The Input/Output Complexity of Sorting and Related Problems. A. Aggarwal and J. S. Vitter. Communications of the ACM 31 (9): 1116--1127, 1988 (not curriculum). reading guide: set P=1 [i.e. forget about the parameter P], this parameter has since been removed from the basic model.
Wednesday, September 2 08-10 Imada seminar room Overview of basic bounds in the I/O-model. Stacks, and queues. Selection. Multi-way mergesort. Pages 19--22 of the introduction slides. If you want a more detailed reminder of the classic linear selection algorithm, it can be found in any good textbook on algorithms and data structures, e.g. Section 9.3 in Introduction to Algorithms, Cormen, Leiserson, Rivest, Stein, 3rd edition, MIT Press, 2009 (not curriculum). Pages 1-6 of the slides on sorting.
Monday, September 7 09-11 Imada seminar room Sorting by distribution: multiway quicksort. Pages 7-8 of the slides. External partition element finding, lecture notes by L. Arge and M. G. Lagoudakis.
Wednesday, September 9 08-10 Imada seminar room Start on lower bound proof for sorting. Pages 9-19 of the slides. For the original appearance of the upper and lower bounds on sorting, see sections 2, 3, 4, and 5 of The Input/Output Complexity of Sorting and Related Problems. A. Aggarwal and J. S. Vitter. Communications of the ACM 31 (9): 1116--1127, 1988 (not curriculum). Reading guide: set P=1 [i.e. forget about the parameter P], and do not read about other problems than sorting.
Monday, September 14 09-11 Imada seminar room End of lower bound proof for sorting. Pages 11-22 of the slides.
Wednesday, September 16 08-10 Imada seminar room CPU time for multiway mergesort. Permuting, upper and lower bounds. Notes from lecture on CPU time for multiway mergesort. The slides on permuting. For the original appearance of the upper and lower bounds on permuting, see sections 2, 3, 4, and 5 of The Input/Output Complexity of Sorting and Related Problems. A. Aggarwal and J. S. Vitter. Communications of the ACM 31 (9): 1116--1127, 1988 (not curriculum) Reading guide: set P=1 [i.e. forget about the parameter P], and do not read about other problems than permuting.
Monday, September 21 09-11 Imada seminar room Searching. Multiway search trees (alias (ab)-trees or B-trees). Section 1 and page 3 in External Memory Geometric Data Structures, lecture notes by L. Arge. Page 1 of notes from lectures on (a,b)-trees.
Wednesday, September 23 08-10 Imada seminar room Handout and discussion of Project 1. Rebalancing of (a,b)-trees. Rest of section 2 in External Memory Geometric Data Structures, lecture notes by L. Arge. Rest of notes from lecture on (a,b)-trees.
Monday, September 28 09-11 Imada seminar room Amortized analysis of rebalancing of (a,b)-trees. Page 1-4 of Amortized Analysis of (a,b)-Trees, lecture notes by R. Fagerberg. Notes from lecture on amortized rebalancing of (a,b)-trees.
Wednesday, September 30 09-11 Imada seminar room End of amortized analysis of rebalancing of (a,b)-trees. Lower bounds for searching. Page 1-4 of Amortized Analysis of (a,b)-Trees, lecture notes by R. Fagerberg. Notes from lecture on amortized rebalancing of (a,b)-trees. Notes from lecture on optimality of (a,b)-trees for comparison based searching.
Monday, October 5 09-11 Imada seminar room No lecture due to lecturer traveling.  
Wednesday, October 7 08-10 Imada seminar room Weight-balanced B-trees. Section 3 in External Memory Geometric Data Structures, lecture notes by L. Arge. Notes from lecture on weight-balanced B-trees.
Monday, October 19 09-11 Imada seminar room External priority queues. Sections 2, 3, and 4 (until end of proof of Thm. 4) in Heaps and heapsort on secondary storage by R. Fadel, K.V. Jakobsen, J. Katajainen, and J. Teuhola. In Theoretical Computer Science, 220 (2), 1999, pages 345-362. Notes from lecture on external priority queues.
Wednesday, October 21 08-10 Imada seminar room External interval trees. Section 6 in External Memory Geometric Data Structures, lecture notes by L. Arge. Notes from lecture on external interval trees (part I).
Monday, October 26 09-11 Imada seminar room More on interval trees. Notes from lecture on external interval trees (part II).
Wednesday, October 28 08-10 Imada seminar room End of external interval trees. Start on external priority search trees. Section 7 in External Memory Geometric Data Structures, lecture notes by L. Arge.
Monday, November 2 09-11 Imada seminar room End of external priority search trees. Section 7 in External Memory Geometric Data Structures, lecture notes by L. Arge.
Wednesday, November 4 08-10 Imada seminar room External range trees. Section 8.1 (and "8.0") in External Memory Geometric Data Structures, lecture notes by L. Arge.
Monday, November 9 09-11 Imada seminar room No lecture.

Wednesday, November 11 08-10 Imada seminar room Hand-back of project 1. List ranking. Notes from lecture on list ranking and Section 3.1 in An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms by L. Arge, M.A. Bender, E. Demaine, B. Holland-Minkley, and J.I. Munro. In SIAM Journal on Computing, 36(6), 2007, pages 1672-1695. (For now, just disregard the word cache-oblivious in the text.)
Monday, November 16 09-11 Imada seminar room Feedback on project 1. End of list ranking.

Wednesday, November 18 08-10 Imada seminar room Algorithms on trees: Euler tour, DFS. Start on BFS in general undirected graphs. Notes from lecture on Euler tour. Section 3.2 in An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms by L. Arge, M.A. Bender, E. Demaine, B. Holland-Minkley, and J.I. Munro. In SIAM Journal on Computing, 36(6), 2007, pages 1672-1695. (For now, just disregard the word cache-oblivious in the text.) Section 4 in External-Memory Breadth-First Search with Sublinear I/O by K. Mehlhorn and U. Meyer. In proceedings of 10th Ann. European Symposium on Algorithms (ESA), Lecture Notes in Computer Science vol. 2461, Springer, 2002, pages 723-735. Reading guide: set D=1 (single disk).
Monday, November 23 09-11 Imada seminar room External BFS in undirected graphs in o(V) time. Section 3, 5.2, and 6 in External-Memory Breadth-First Search with Sublinear I/O by K. Mehlhorn and U. Meyer. In proceedings of 10th Ann. European Symposium on Algorithms (ESA), Lecture Notes in Computer Science vol. 2461, Springer, 2002, pages 723-735. Reading guide: set D=1 (single disk), and skip Section 5.1 on the randomized preprocessing phase (some of the statements regarding this have turned out not to hold). We discuss the deterministic version only (which requires reading Section 5.2).
Wednesday, November 25 08-10 Imada seminar room Start on external MST, version based on Prim's algorithm. Section 2.1 in On External-Memory MST, SSSP and Multi-way Planar Graph Separation by L. Arge, G.S. Brodal, L. Toma. In Journal of Algorithms, 53(2), 2004, pages 186-206. [The same material is covered in section 3.4.2 in An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms by L. Arge, M.A. Bender, E. Demaine, B. Holland-Minkley, and J.I. Munro. In SIAM Journal on Computing, 36(6), 2007, pages 1672-1695. (For now, just disregard the word cache-oblivious in the text.)]
Monday, November 30 09-11 Imada seminar room More on external MST, version based on Boruvka's algorithm. Section 2.2.1 in On External-Memory MST, SSSP and Multi-way Planar Graph Separation by L. Arge, G.S. Brodal, L. Toma. In Journal of Algorithms, 53(2), 2004, pages 186-206. The same material is covered in section 3.4.1 in An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms by L. Arge, M.A. Bender, E. Demaine, B. Holland-Minkley, and J.I. Munro. In SIAM Journal on Computing, 36(6), 2007, pages 1672-1695. (For now, just disregard the word cache-oblivious in the text.) Note that the method for distributing the IDs of the new connected components during a contraction stage is not the same in the two papers. In class, we used the method from the latter of the two papers.
Wednesday, December 2 08-10 Imada seminar room End of external MST, the superphase algorithm. Handout and discussion of Project 2. Section 2.2.2 in On External-Memory MST, SSSP and Multi-way Planar Graph Separation by L. Arge, G.S. Brodal, L. Toma. In Journal of Algorithms, 53(2), 2004, pages 186-206.
Monday, December 7 09-11 Imada seminar room End of the superphase algorithm. The cache-oblivious model. Section 38.1 in Cache-Oblivious Data Structures by L. Arge, G.S. Brodal, and R. Fagerberg [handout in class]. In Handbook of Data Structures and Applications, Dinesh Mehta and Sartaj Sahni (ed.), CRC Press, 2005. The slides on the cache-oblivious model (up to page 21).
Wednesday, December 9 08-10 Imada seminar room Cache-oblivious double for-loop. Cache-oblivious static searching. Section 38.2.1 in Cache-Oblivious Data Structures by L. Arge, G.S. Brodal, and R. Fagerberg [handout in class]. In Handbook of Data Structures and Applications, Dinesh Mehta and Sartaj Sahni (ed.), CRC Press, 2005. The slides on the cache-oblivious model (pages 15-23).
Monday, December 14 09-11 Imada seminar room Cache-obliviuos range search. Cache-obliviuos dynamic searching. Sections 38.2.1 and 38.3.1 in Cache-Oblivious Data Structures by L. Arge, G.S. Brodal, and R. Fagerberg [handout in class]. In Handbook of Data Structures and Applications, Dinesh Mehta and Sartaj Sahni (ed.), CRC Press, 2005. The slides on the cache-oblivious model (pages 24-32).
Wednesday, December 16 08-10 Imada seminar room End of cache-oblivious dynamic searching. Sections 38.3.1 in Cache-Oblivious Data Structures by L. Arge, G.S. Brodal, and R. Fagerberg [handout in class]. In Handbook of Data Structures and Applications, Dinesh Mehta and Sartaj Sahni (ed.), CRC Press, 2005. Sections 3.1, and 3.2 in Cache-Oblivious Search Trees via Binary Trees of Small Height by G.S. Brodal, R. Fagerberg, and R. Jacob. In Proc. 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 39-48, 2002.
Monday, December 21 09-11 Imada seminar room Cache-obliviuos sorting. Section 38.2.2 in Cache-Oblivious Data Structures by L. Arge, G.S. Brodal, and R. Fagerberg [handout in class]. In Handbook of Data Structures and Applications, Dinesh Mehta and Sartaj Sahni (ed.), CRC Press, 2005. The slides on the cache-oblivious model (pages 33-44).


Maintained by Rolf Fagerberg (rolf@imada.sdu.dk)