Processing long queries against short text: Top-k advertisement matching in news stream applications

Publication Type:
Journal Article
ACM Transactions on Information Systems, 2017, 35 (3)
Issue Date:
Filename Description Size
a28-zhang.pdfPublished Version748.52 kB
Adobe PDF
Full metadata record
© 2017 ACM. Many real applications in real-time news stream advertising call for efficient processing of long queries against short text. In such applications, dynamic news feeds are regarded as queries to match against an advertisement (ad) database for retrieving the k most relevant ads. The existing approaches to keyword retrieval cannot work well in this search scenario when queries are triggered at a very high frequency. To address the problem, we introduce new techniques to significantly improve search performance. First, we devise a two-level partitioning for tight upper bound estimation and a lazy evaluation scheme to delay full evaluation of unpromising candidates, which can bring three to four times performance boosting in a database with 7 million ads. Second, we propose a novel rank-aware block-oriented inverted index to further improve performance. In this index scheme, each entry in an inverted list is assigned a rank according to its importance in the ad. Then, we introduce a block-at-a-time search strategy based on the index scheme to support amuch tighter upper bound estimation and a very early termination. We have conducted experiments with real datasets, and the results show that the rank-aware method can further improve performance by an order of magnitude.
Please use this identifier to cite or link to this item: