Neural Networks for Information Retrieval

Introduction
Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies a user’s information need. However, we know that, in reality, almost no data are truly ‘unstructured’. Patterns exist in most documents (for example, a text document might have a structure in the form of headings, paragraphs, footnotes, etc.) and we can use Machine Learning (ML) to identify these patterns and structures. This blog introduces how researchers are incorporating machine learning in IR systems to solve some commonly seen problems in information retrieval.
In most image, language, and speech understanding tasks, Deep Learning (DL) has established state-of-the-art performance. Information retrieval is no different. Taking cues from language understanding models, deep learning is making its way into information retrieval and extraction tasks. Over the years, Neural Networks (NNs) have emerged as a popular machine-learning paradigm, and they are now being deployed in IR systems. Since NNs can extract features from raw input data themselves, they offer a considerable advantage over many other learning strategies. Consequently, newer approaches focus on designing neural models to solve IR problems.

History
Using neural networks in IR is a relatively new technique. SIGIR 2017, a popular ACM conference, featured a tutorial [1] on Neural Networks for Information Retrieval (NN4IR) on its very first day, drawing a large audience. The aim of the tutorial, as stated by the speakers, was to give a clear overview of current tried-and-trusted neural methods in IR and how they benefit IR research. It covered key architectures and promising future directions. According to [1], researchers have proposed many different architectures and paradigms, such as auto-encoders, RNNs, CNNs, and, more recently, GANs (Generative Adversarial Networks), and most of these have been applied in IR settings. Moreover, they have been applied to all key parts of the typical modern IR pipeline, such as core ranking algorithms, knowledge graphs, text similarity, entity retrieval, language modeling, question answering, and dialogue systems. Several other papers on neural models in IR have also been published [2, 3, 4].

Employing NNs in IR Models
Quite a few IR models have employed NNs. We discuss some of them in brief:

1.  IR model that uses user browsing behaviour
A user’s browsing habits can significantly help in improving many search-based IR systems. However, various types of bias make it difficult to interpret user clicks accurately. A user might prefer clicking the results at the top of the page, or results that grab attention visually or by some other means. These are referred to as position bias and attention bias, respectively. To account for these biases, traditional IR systems use a Probabilistic Graphical Model (PGM). With the advent of deep learning, however, Recurrent Neural Networks (RNNs) are being used instead. These RNNs learn about the biases by finding patterns in the click data.
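To make the bias toward top-ranked results concrete, here is a minimal sketch of one well-known graphical click model, the position-based model, in which a click requires both that the user examines a rank and that the document is attractive. The post does not say which PGM a given system uses, so the choice of model, the function names, and every number below are assumptions made purely for illustration.

```python
# Position-based click model (illustrative sketch):
#   P(click) = P(user examines the rank) * P(document is attractive)
# The examination probabilities are hypothetical, not measured values.
EXAMINATION = {1: 1.0, 2: 0.6, 3: 0.3}


def click_probability(rank, attractiveness):
    """Probability of a click on a document shown at the given rank."""
    return EXAMINATION[rank] * attractiveness


def debiased_attractiveness(clicks, impressions, rank):
    """Estimate attractiveness by dividing the raw click-through rate
    by the examination probability of the rank where it was shown."""
    raw_ctr = clicks / impressions
    return raw_ctr / EXAMINATION[rank]


# Two equally attractive documents receive very different click rates
# simply because one is shown at rank 1 and the other at rank 3.
top = click_probability(1, 0.5)
low = click_probability(3, 0.5)
print(top, low)
```

A neural click model would not hard-code the examination table; it would have to learn an equivalent effect from logged clicks.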

2.  IR model that ranks the results of a query
Ranking the documents that match a query is an important task in information retrieval. In most IR systems, documents are ranked by a score, obtained either from their cosine similarity with the query or from a ranking function such as BM25. Neural nets can learn to rank documents even without an explicit ranking objective, and they can of course also be trained with one.
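For readers unfamiliar with the two traditional scoring schemes mentioned above, the following sketch ranks a toy collection with cosine similarity over raw term counts and with BM25. The tokenizer, the example documents, and the parameter values k1=1.5 and b=0.75 are assumptions chosen for illustration; the BM25 variant shown adds 1 inside the logarithm of the IDF term so that scores stay non-negative.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_scores(query, docs):
    q = Counter(tokenize(query))
    return [cosine_similarity(q, Counter(tokenize(d))) for d in docs]


def bm25_scores(query, docs, k1=1.5, b=0.75):
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter(t for d in tokenized for t in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            n_t = doc_freq[term]
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def rank(scores):
    """Document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


docs = [
    "neural networks for ranking documents",
    "cooking pasta at home",
    "ranking with neural ranking models",
]
query = "neural ranking"
print(rank(cosine_scores(query, docs)))
print(rank(bm25_scores(query, docs)))
```

Both functions rely only on exact term overlap, which is the limitation that learned neural rankers try to move past.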

3.  Find items based on their description
This is a fairly common problem encountered in many IR systems. The traditional approach involves counting occurrences of the query terms in the description text (e.g., BM25). Modern IR systems use a neural network that has been trained on relevant labeled data (supervised learning). The Deep Structured Semantic Model (DSSM) is a deep neural architecture that performs semantic matching between documents and queries [5].
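The overall shape of DSSM-style matching can be sketched as follows: text is broken into letter trigrams (the "word hashing" step of the original DSSM design), each side is encoded, and candidates are compared to the query by cosine similarity followed by a softmax. This is only a sketch under stated assumptions: the `encode` function is a placeholder that returns its input unchanged, whereas a real DSSM passes the trigram vector through trained non-linear layers, and the `gamma` value and example strings are hypothetical.

```python
import math
from collections import Counter


def letter_trigrams(text):
    """Word hashing: pad each word with '#' and count its letter trigrams."""
    grams = Counter()
    for word in text.lower().split():
        padded = f"#{word}#"
        for i in range(len(padded) - 2):
            grams[padded[i:i + 3]] += 1
    return grams


def encode(trigram_counts):
    # Placeholder (assumption): a trained DSSM would map this sparse
    # vector to a dense semantic vector through several learned layers.
    return trigram_counts


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_probabilities(query, documents, gamma=10.0):
    """Softmax over scaled cosine similarities between query and documents."""
    q = encode(letter_trigrams(query))
    sims = [cosine(q, encode(letter_trigrams(d))) for d in documents]
    exps = [math.exp(gamma * s) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


documents = ["lightweight shoe for runners", "stainless steel kettle"]
probs = match_probabilities("running shoes", documents)
print(probs)
```

Even without any learned layers, the trigram representation already links "running shoes" to "shoe for runners", although the two share no exact word, which hints at why sub-word inputs help with vocabulary mismatch.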

4.  Response generation to conversational stimuli
Neural nets can be combined with natural language processing to build generative models that produce an output for a given input. The sequence-to-sequence (seq2seq) encoder-decoder model behind Google’s Gmail Smart Reply is one such example. Conversational and dialog systems (chatbots) are another. In general, the model here either infers the answer from unstructured data (like textual documents that do not necessarily contain the answer literally), or generates natural language from structured data (like data from knowledge graphs or external memories).

Challenges
Although deep learning is taking root in information retrieval, IR systems still pose a few challenges:
1.  NNs cannot easily process the full text of a document in one go, yet researchers find it desirable to learn all components of a full IR system in an end-to-end fashion.
2.  We lack well-established metrics for evaluating neural conversational systems.
3.  A chatbot should be able to stay consistent throughout the course of a conversation. We might want to give a persona to the chatbot.
As the research in this field progresses, it will expose many more challenges and might solve a few as well.

Conclusion
Machine learning is playing an important role in many IR systems, and a growing number of them apply deep learning. The fast pace of modern-day research has given rise to many different approaches for solving various IR problems. We have discussed a few of them in brief. The details of these approaches are beyond the scope of this blog. To read more, kindly refer to the references below. Neural nets are a promising fit for IR; however, some challenges remain. Further research in this field will provide more insight into how to overcome these challenges and might even bring out some new ones.

References
[1] Neural Networks for Information Retrieval, SIGIR '17 tutorial, arXiv
[2] Neural Models for Information Retrieval, arXiv
[3] Machine learning for information retrieval: Neural networks, symbolic learning, and genetic algorithms, Wiley Online Library
[4] A neural network for probabilistic information retrieval, ACM Digital Library
[5] Information Retrieval using Deep Learning
