Polytechnic University, Brooklyn, NY
Large web search engines must answer thousands of queries per second with interactive response times. Because the underlying data sets typically span multiple terabytes, a single query may require processing hundreds of megabytes or even gigabytes of index data. To keep up with this immense workload, large search engines employ clusters of thousands of machines and rely on techniques such as caching, index compression, and index and query pruning to improve scalability.
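As a rough, purely illustrative back-of-envelope calculation (the figures are assumptions, not numbers from the talk): at 1,000 queries per second with each query scanning 100 MB of index data, an engine would need to read about 1,000 x 100 MB = 100 GB of index data per second, far more than a single machine can sustain, which is why the workload is spread across thousands of machines and reduced through caching, compression, and pruning.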
After a general overview of search engine architecture, we discuss some of our recent work on efficient query execution in large engines. We first focus on pruning techniques in query execution and report query throughput results for various pruning policies on a search engine prototype developed in our group. In the main part of the talk, we discuss recent work on caching and its performance implications in search engines. In particular, we propose and discuss a three-level caching architecture and various cache admission and eviction policies that can significantly improve query throughput in large engines.
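To make the notion of cache admission and eviction policies concrete, the following is a minimal illustrative sketch in Python (the capacity and admission threshold are hypothetical parameters, and this is not the three-level architecture or the specific policies proposed in the talk): a query result cache that admits a query only after it has been seen a given number of times and evicts the least recently used entry when full.

from collections import Counter, OrderedDict

class ResultCache:
    """Illustrative result cache: frequency-based admission + LRU eviction."""

    def __init__(self, capacity=1000, admit_after=2):
        self.capacity = capacity        # max number of cached result sets (assumed value)
        self.admit_after = admit_after  # admit a query only after this many occurrences (assumed value)
        self.freq = Counter()           # how often each query string has been seen
        self.cache = OrderedDict()      # query -> cached results, kept in LRU order

    def get(self, query):
        self.freq[query] += 1
        if query in self.cache:
            self.cache.move_to_end(query)   # mark as most recently used
            return self.cache[query]
        return None                         # miss: caller evaluates the query against the index

    def put(self, query, results):
        # Admission policy: skip queries seen fewer than `admit_after` times,
        # so one-off queries do not pollute the cache.
        if self.freq[query] < self.admit_after:
            return
        self.cache[query] = results
        self.cache.move_to_end(query)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry

In a multi-level design, similar admission and eviction decisions can be applied at different granularities, for example to cached query results versus cached portions of the inverted index data.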
*Joint work with Xiaohui Long