Long story short...
I completed my PhD in 2005 under the supervision of Darío Álvarez Gutiérrez. During that time I developed an n-gram based vector method to process free text that was well suited to different NLP and Information Retrieval tasks such as clustering, labeling, automatic summarization and searching [1, 2, 3].
After completing the PhD I moved on to query log analysis. I was interested in exploiting query logs not only to improve search quality, particularly regarding informational queries [4], but also to "pulse" public opinion.
Needless to say, query logs from search engines are difficult to obtain, and even the largest available ones are small when compared to the actual volume of queries submitted by users.
Fortunately, just as research on query logs was becoming exceedingly difficult (circa 2006, the AOL-gate), Twitter was born. The beauty of Twitter is that (the majority of) tweets are public, they are much richer than queries and, moreover, they can be associated with a given user (with their accompanying social network) and geolocated (to some extent). You can experience the power of Twitter in my Historical Twitter Archive. (The historical, 2006-2009, Twitter archive is no longer online; if you are interested in it, please read this.)
Hence, since that moment I've been researching social media issues, from influence [5] to spam [7], by way of user profiling [6]. However, I've focused above all on public opinion, specifically on political opinion [10] and its feasibility for forecasting political outcomes such as elections [8, 9]. You can check my list of publications, but I feel that a short selection can provide a better perspective of both my background and my current research lines.
Selected references
PhD papers
- Gayo-Avello, Daniel; Álvarez-Gutiérrez, Darío; Gayo-Avello, José (2004). Naive algorithms for keyphrase extraction and text summarization from a single document inspired by the protein biosynthesis process. In Biologically Inspired Approaches to Advanced Information Technology (pp. 440-455). Springer.
- Gayo-Avello, Daniel; Álvarez-Gutiérrez, Darío; Gayo-Avello, José (2004). One Size Fits All? A Simple Technique to Perform Several NLP Tasks. In J.L. Vicedo et al. (Eds.), Advances in Natural Language Processing: 4th International Conference, EsTAL 2004, LNAI 3230 (pp. 267-278).
- Gayo-Avello, Daniel; Álvarez-Gutiérrez, Darío; Gayo-Avello, José (2005). Application of variable length n-gram vectors to monolingual and bilingual information retrieval. In Multilingual Information Access for Text, Speech and Images (pp. 73-82). Springer.
Papers on query log mining
- Gayo-Avello, Daniel; Brenes, David J. (2009). Making the road by searching -- A search engine based on Swarm Information Foraging. arXiv preprint arXiv:0911.3979.
Social media research
- Gayo-Avello, Daniel; Brenes, David J.; Fernández-Fernández, Diego; Fernández-Menéndez, María E.; García-Suárez, Rodrigo (2011). De retibus socialibus et legibus momenti. EPL (Europhysics Letters), 94(3), 38001.
- Gayo-Avello, Daniel (2011). All liaisons are dangerous when all your friends are known to us. In Proceedings of the 22nd ACM conference on Hypertext and hypermedia (pp. 171-180). ACM.
- Gayo-Avello, Daniel (2013). Nepotistic relationships in Twitter and their impact on rank prestige algorithms. Information Processing & Management, 49(6), 1250-1280.
Public opinion in social media
- Gayo-Avello, Daniel (2011). Don't turn social media into another 'Literary Digest' poll. Communications of the ACM, 54(10), 121-128.
- Gayo-Avello, Daniel (2012). No, you cannot predict elections with Twitter. IEEE Internet Computing, 16(6), 91-94.
- Gayo-Avello, Daniel (2015). Political Opinion. In Y. Mejova, I. Weber, & M. Macy (Eds.), Twitter: A Digital Socioscope (pp. 52-74). Cambridge University Press.