COMPUTER SCIENCE COLLOQUIUM (and Graduate Seminar) Input/Output Efficient Algorithms: Theory and Practice Professor Lars Arge Department of Computer Science Duke University April 27 (Monday), 1998 at 4:00 p.m. 1320 Digital Computer Laboratory In many modern applications that deal with massive data sets, communication between internal and external memory is the bottleneck in the computation. This is due to the huge difference in access time of fast internal memory and slower external memory such as disks. In order to amortize this access time over a large amount of data, disks typically read or write large blocks of contiguous data at once. In the area of Input/Output (or I/O) algorithms the main goal is to develop algorithms that minimize the number of such block transfers (I/Os). In this talk, I discuss several novel data structures (or index structures) and illustrate how they can be used to develop provably efficient I/O algo- rithms. In particular I will discuss the so-called buffer tree technique for designing efficient data structures for external memory. The buffer tree technique can be used to obtain I/O-efficient algorithms for a number of problems in areas such as computational geometry, graph theory and string processing. I will also discuss how the external segment tree can be used to develop I/O-efficient algorithms for a number of large-scale problems in- volving geometric data. Most problems considered will be motivated by the so-called map overlay problem, a key problem in geographic information sys- tems. Finally, I will present empirical results showing that often a significant speedup is attained by the new algorithms over currently used methods. I also illustrate that the theoretically developed algorithms are much more robust in their ability to handle all input distributions than previously known algorithms. large@cs.duke.edu http://www.cs.duke.edu/~large Lars Arge is being interviewed for a position in this department. Prof. Leonard Pitt is his technical host.