Ad verba per numeros
Wednesday, August 5, 2009, 09:59 PM
Gazpacho and summer rash: Lexical relationships from temporal patterns of web search queries. E. Alfonseca, M. Ciaramita and K. Hall. Empirical Methods in Natural Language Processing (EMNLP). 2009.
Abstract: In this paper we investigate temporal patterns of web search queries. We carry out several evaluations to analyze the properties of temporal profiles of queries, revealing promising semantic and pragmatic relationships between words. We focus on two applications: query suggestion and query categorization. The former shows a potential for time-series similarity measures to identify specific semantic relatedness between words, which results in state-of-the-art performance in query suggestion while providing complementary information to more traditional distributional similarity measures. The query categorization evaluation suggests that the temporal profile alone is not a strong indicator of broad topical categories.
I really enjoyed it, especially because one of my students (Manuel Tejeiro) recently finished his final-year project, which was nothing other than a framework for time series analysis. In fact, for most of the testing he used the AOL 2006 query log, obtaining fairly interesting results (each input query is followed by the results returned for it):
- california lottery
- lottery, ny lottery, georgia lottery, michigan lottery, mass lottery, calottery.com, ohio lottery, new jersey lottery, njlottery, ...
- academy awards
- oscars, crash, oscar winners, box office, walk the line, ...
- disney channel
- www.disneychannel.com, disneychannel, cartoonnetwork, disneychannel.com, ikea (?), nick.com, hilary duff, ...
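To give a flavour of how results like these can come out of time series alone, here is a minimal sketch in Python. It is not the method from the paper nor Manuel's framework: it simply assumes that each query is represented by its vector of daily counts and that queries are ranked by Pearson correlation between those vectors. The function names (`temporal_profiles`, `pearson`, `suggest`) and the toy counts are hypothetical and made up for illustration; they are not taken from the AOL log.

```python
from collections import defaultdict
from datetime import date
from math import sqrt


def temporal_profiles(log, days):
    """Map each query to its vector of daily counts over `days`."""
    index = {d: i for i, d in enumerate(days)}
    profiles = defaultdict(lambda: [0] * len(days))
    for query, day in log:
        if day in index:
            profiles[query][index[day]] += 1
    return dict(profiles)


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def suggest(query, profiles, k=5):
    """Return the k queries whose temporal profile correlates best with `query`."""
    target = profiles[query]
    scored = [(pearson(target, p), q) for q, p in profiles.items() if q != query]
    scored.sort(reverse=True)
    return [q for _, q in scored[:k]]


# Toy data (invented for illustration): one week of daily counts per query.
days = [date(2006, 3, d) for d in range(1, 8)]
toy_counts = {
    "academy awards": [1, 1, 2, 5, 9, 3, 1],  # burst in the middle of the week
    "oscars":         [0, 1, 2, 6, 8, 2, 1],  # same burst, different wording
    "ikea":           [3, 3, 3, 2, 3, 3, 3],  # roughly flat, unrelated
}
# Expand the counts into (query, day) records, as they would appear in a log.
log = [(q, d) for q, counts in toy_counts.items()
       for d, c in zip(days, counts) for _ in range(c)]

profiles = temporal_profiles(log, days)
top = suggest("academy awards", profiles, k=1)
print(top)
```

On this toy data the top suggestion for "academy awards" is "oscars", because both queries peak on the same days. Correlation on raw daily counts is only one possible choice; a real system would probably smooth or normalise the profiles first and would need something faster than an all-pairs comparison.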
Some related papers:
- Identifying similarities, periodicities and bursts for online search queries by Michail Vlachos, Christopher Meek, Zografoula Vagena, Dimitrios Gunopulos. In SIGMOD'04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data (2004), pp. 131-142.
- Trend Detection in Folksonomies by A. Hotho, R. Jaschke, C. Schmitz, G. Stumme. In Proceedings of the First International Conference on Semantics and Digital Media Technology, Vol. 4306 (2006), pp. 56-70.
- Why we search: visualizing and predicting user behavior by Eytan Adar, Daniel S. Weld, Brian N. Bershad, Steven S. Gribble. In WWW'07: Proceedings of the 16th international conference on World Wide Web (2007), pp. 161-170.
- Why Are They Excited? Identifying and Explaining Spikes in Blog Mood Levels by K. Balog, G. Mishne, M. de Rijke. In Association for Computational Linguistics (2006).
P.S. If you are still hungry, there is yet another paper on query log analysis with food in its title: "From 'dango' to 'japanese cakes': Query Reformulation Models and Patterns" by Paolo Boldi, Francesco Bonchi, Carlos Castillo and Sebastiano Vigna.