DM207 I/O-Efficient Algorithms and Data Structures

Fall 2011
Rolf Fagerberg


Official Course Description

See the course description at the web pages of the faculty.

Time and Place

The course starts Wednesday, August 31 (the teacher has overlapping other obligations Monday, August 29).

The schedule in the 1st quarter is as follows:

  • Mondays 14.15 to 16.00 in U49B (Sept 5), U49E (Sept 19), and U49 (Sept 12, Sept 26, Oct 3, Oct 10).
  • Wednesdays 14.15 to 16.00 in Imadas seminar room.

The schedule in the 2nd quarter is as follows:

with the following exceptions:

  • Monday, Nov 7 (instead of Tuesday, Nov 8) 10:15 to 12:00 in U49B
  • Thursday, Nov 10 in U49C
  • Monday, Nov 21 (instead of Tuesday, Nov 22) 10:15 to 12:00 in U57 (10-11) and U141 (11-12).
  • In week 48, only the Thursday lecture will take place (Thursday, December 1).
  • Monday, Dec 5 (instead of Tuesday, Dec 6) 10:15 to 12:00 in U57 (10-11) and U17 (11-12).
  • Monday, Dec 19 (instead of Tuesday, Nov 29) 10:15 to 12:00 in U26
  • Tuesday, Dec 20 there will only be lectures 15:15 to 16:00 (still in Imadas seminar room).

Teaching material

The course is based on handouts (no textbook).

Examination

The final exam is oral, with grades on the 7-point marking scale. There are projects which must be passed during the course in order to attend the oral exam.

The exam date is January 24. The exam will take place in Imadas seminar room, with preparation in U49.

There will be a spørgetime (session for asking questions on the exam and the curriculum) Monday, January 23, at 10.15 in U49C.

The curriculum is everything mentioned under the heading Reading in the list of lectures below (respecting the reading guides stated).

At the oral exam, you will draw one topic from a list of exam topics. The topic delineates the part of curriculum which you are to present. The full details of the exam format are described at the bottom of the list of exam topics.

Lectures

  • August 31. Introduction to course. The I/O-model.

    Reading: The slides, pages 1-19. Section 1 of The Input/Output Complexity of Sorting and Related Problems. A. Aggarwal and J. S. Vitter. Communications of the ACM 31 (9): 1116--1127, 1988 (reading guide: set P=1 [i.e. forget about the parameter P], this parameter has since been removed from the basic model.)

  • September 5. Stacks, and queues. Selection.

    Reading: Last page two pages of the slides. If you want a more detailed reminder of the classic linear selection algorithm, it can be found in any good textbook on algorithms and data structures, e.g. Section 9.3 in Introduction to Algorithms, Cormen, Leiserson, Rivest, Stein, 3rd edition, MIT Press, 2009 (not curriculum).

  • September 7. More on classical sorting algorithms viewed in the I/O-model. Optimal sorting in the I/O-model: multiway mergesort, start on lower bound proof.

    Reading: Pages 1-6 and 9-15 of the slides.

  • September 12. End of lower bound proof for sorting.

    Reading: Pages 16-20 and 22 of the slides.

  • September 14. Last bits of lower bound proof for sorting. Sorting by distribution: multiway quicksort.

    Reading: Pages 21 and 7-8 of the slides. External partition element finding, lecture notes by L. Arge and M. G. Lagoudakis. For the original appearance of the upper and lower bounds on sorting, see sections 2, 3, 4, and 5 of The Input/Output Complexity of Sorting and Related Problems. A. Aggarwal and J. S. Vitter. Communications of the ACM 31 (9): 1116--1127, 1988 (Reading guide: set P=1 [i.e. forget about the parameter P], and do not read about other problems than sorting.)

  • September 19. Handout and discussion of Project 1. Start on permuting, upper and lower bounds.

    Reading: Pages 1-7 of the slides.

  • September 21. More details of permuting lower bound proof.

    Reading: Pages 1-7 of the slides.

    For the original appearance of the upper and lower bounds on permuting, see sections 2, 3, 4, and 5 of The Input/Output Complexity of Sorting and Related Problems. A. Aggarwal and J. S. Vitter. Communications of the ACM 31 (9): 1116--1127, 1988 (Reading guide: set P=1 [i.e. forget about the parameter P], and do not read about other problems than permuting.)

  • September 26. A few last math details from permuting lower bound proof. Start on searching: upper and lower bounds for static external comparison based searching.

    Reading: Same as last time, plus Section 1 and page 3 in External Memory Geometric Data Structures, lecture notes by L. Arge. Overview over which machines in the Imada terminal room not to use for experiments.

  • September 28. Dynamic searching: (a,b)-trees (aka. B-trees).

    Reading: Page 4 in External Memory Geometric Data Structures, lecture notes by L. Arge. Pages 1-4 in notes from lectures on (a,b)-trees.

  • October 3. More on (a,b)-trees.

    Reading: Rest of section 2 in External Memory Geometric Data Structures, lecture notes by L. Arge. Pages 5-6 in notes from lecture on (a,b)-trees. Pages 1-4 in Amortized Analysis of (a,b)-Trees, lecture notes by R. Fagerberg. Notes from lecture on amortized rebalancing of (a,b)-trees.

  • October 5. End of amortized analysis of (a,b)-trees.

    Reading: Rest of Amortized Analysis of (a,b)-Trees, lecture notes by R. Fagerberg. Notes from lecture on amortized rebalancing of (a,b)-trees.

  • October 10. Weight-balanced B-trees.

    Section 3 in External Memory Geometric Data Structures, lecture notes by L. Arge. Notes from lecture on weight-balanced B-trees.

  • October 12. Discussion of Project 1. End of weight-balanced B-trees.

    Reading: Same as last time.

  • November 7. Handback of Project 1. Start on external priority queues.

    Reading: Sections 2, 3, and 4 (until end of proof of Thm. 4 on page 12) in Heaps and heapsort on secondary storage by R. Fadel, K.V. Jakobsen, J. Katajainen, and J. Teuhola. In Theoretical Computer Science, 220 (2), 1999, pages 345-362. Notes from lecture on external priority queues.

  • November 10. Handout and discussion of Project 2. More on external priority queues.

    Reading: Same as last time.

  • November 15. End of external priority queues. Start on external interval trees.

    Reading: Section 6 in External Memory Geometric Data Structures, lecture notes by L. Arge. Notes from lecture on external interval trees (part I).

  • November 17. More on external interval trees.

    Reading: Same as last time.

  • November 21. End of external interval trees.

    Reading: Notes from lecture on external interval trees (part II).

  • November 24. External priority search trees.

    Reading: Section 7 in External Memory Geometric Data Structures, lecture notes by L. Arge.

  • December 1. End of external priority search trees.

    Reading: Same as last time.

  • December 5. External range trees.

    Reading: Section 8.1 (and "8.0") in External Memory Geometric Data Structures, lecture notes by L. Arge.

  • December 8. End of external range trees. Start on list ranking.

    Reading: Same as last time, plus first pages of notes from lecture on list ranking and Section 3.1 in An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms by L. Arge, M.A. Bender, E. Demaine, B. Holland-Minkley, and J.I. Munro. In SIAM Journal on Computing, 36(6), 2007, pages 1672-1695. (For now, just disregard the word cache-oblivious in the text.)

  • December 13. End of list ranking.

    Reading: Rest of notes from lecture on list ranking. Section 3.1 in An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms by L. Arge, M.A. Bender, E. Demaine, B. Holland-Minkley, and J.I. Munro. In SIAM Journal on Computing, 36(6), 2007, pages 1672-1695. (For now, just disregard the word cache-oblivious in the text.)

  • December 15. Algorithms on trees: Euler tour, DFS. Start on BFS on undirected graphs. Handout of Project 3.

    Reading: Notes from lecture on Euler tour. Section 3.2 in An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms by L. Arge, M.A. Bender, E. Demaine, B. Holland-Minkley, and J.I. Munro. In SIAM Journal on Computing, 36(6), 2007, pages 1672-1695. (For now, just disregard the word cache-oblivious in the text.) Section 1, 2, and 4 in External-Memory Breadth-First Search with Sublinear I/O by K. Mehlhorn and U. Meyer. In proceedings of 10th Ann. European Symposium on Algorithms (ESA), Lecture Notes in Computer Science vol. 2461, Springer, 2002, pages 723-735. Reading guide: set D=1 (single disk).

  • December 19. External BFS in o(V) time.

    Reading: Section 3, 5.2, and 6 in External-Memory Breadth-First Search with Sublinear I/O by K. Mehlhorn and U. Meyer. In proceedings of 10th Ann. European Symposium on Algorithms (ESA), Lecture Notes in Computer Science vol. 2461, Springer, 2002, pages 723-735. Reading guide: set D=1 (single disk), and skip Section 5.1 on the randomized preprocessing phase (some of the statements regarding this have turned out not to hold). We discuss the deterministic version only (which requires reading Section 5.2).

  • December 20. Cache-oblivious model. Cache-oblivious static searching.

    Reading: Chapter Cache-oblivious Model by R. Fagerberg. In Encyclopedia of Algorithms, Ming-Yang Kao (ed.), Springer 2008. Sections 38.1 and 38.2.1 in Cache-Oblivious Data Structures by L. Arge, G.S. Brodal, and R. Fagerberg. In Handbook of Data Structures and Applications, Dinesh Mehta and Sartaj Sahni (ed.), CRC Press, 2005.

  • December 22. Handback of Project 2. Cache-oblivious dynamic searching.

    Reading: Section 38.3.1 in Cache-Oblivious Data Structures by L. Arge, G.S. Brodal, and R. Fagerberg. In Handbook of Data Structures and Applications, Dinesh Mehta and Sartaj Sahni (ed.), CRC Press, 2005. Sections 1, 2, 3.1, and 3.2 in Cache-Oblivious Search Trees via Binary Trees of Small Height by G.S. Brodal, R. Fagerberg, and R. Jacob. In Proc. 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 39-48, 2002. Reading guide: You may skip any mentioning of BFS, DFS, and Inorder layout (only the van Emde Boas layout is in focus in this course). The implicit navigation in Section 2 is not curriculum.


Maintained by Rolf Fagerberg (rolf@imada.sdu.dk)