Novelty & Diversity in Information Retrieval

We all use information retrieval (IR) systems in our daily lives, from the Google search engine and Google Scholar to shopping websites, recommender systems and ticket reservation. Have you ever wondered whether such IR systems could be improved? The answer is yes: the obvious goal is to better satisfy the user's requirements. But the real question is how. To tune any IR system, we first need a good measure of how well the system satisfied those requirements.


Most existing evaluation methods do not consider novelty & diversity. They mainly compute a value for each (document, query) pair and put little emphasis on what the user has already seen.

What is novelty & diversity in IR?

Let's take an example where the user searches for a particular query & the system fetches relevant results, but most of the documents it shows repeat one another. Even if every result is relevant, the repetition does not add any novel information.
Take another example where the user searches for a query that may have multiple contexts or meanings. The result set should include diverse results that complement the information already retrieved & deliver a result for every context.

Key Points:

•    In short, novelty is how well the IR system avoids redundancy in the results it returns.
•    Diversity is how well the system resolves ambiguity in the returned results.

Evaluating Novelty & Diversity:

α-nDCG (α-normalized discounted cumulative gain) is an evaluation measure for ranked result lists. It is widely used to measure the effectiveness of web search engines, and it accounts for both novelty & diversity.



In α-nDCG, the probability that the document at rank k is relevant, given the documents ranked above it, is used as the gain value G[k].
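
To make the measure concrete, here is a minimal Python sketch of α-nDCG under the commonly used formulation: each document is judged against a set of subtopics (the different meanings or aspects of the query), and the gain for a subtopic is multiplied by (1 - α) for every earlier document that already covered it. The function names, the representation of a document as a set of subtopic labels, and the greedy approximation of the ideal ranking (built here from the ranked documents themselves) are assumptions made for illustration, not something taken from a particular library.

```python
import math


def subtopic_gain(subtopics, seen, alpha):
    # Gain of one document: each subtopic it covers contributes
    # (1 - alpha) ** (number of earlier documents covering that subtopic).
    return sum((1 - alpha) ** seen.get(s, 0) for s in subtopics)


def alpha_dcg(ranking, alpha=0.5):
    # ranking: list of sets; each set holds the subtopics a document covers.
    seen = {}  # subtopic -> how many higher-ranked documents covered it
    score = 0.0
    for rank, subtopics in enumerate(ranking, start=1):
        score += subtopic_gain(subtopics, seen, alpha) / math.log2(rank + 1)
        for s in subtopics:
            seen[s] = seen.get(s, 0) + 1
    return score


def greedy_ideal(docs, alpha=0.5):
    # Greedy approximation of the ideal ordering (assumption: finding the
    # exact ideal is expensive, so pick the highest-gain document each step).
    remaining = list(docs)
    seen = {}
    ideal = []
    while remaining:
        best = max(remaining, key=lambda d: subtopic_gain(d, seen, alpha))
        remaining.remove(best)
        ideal.append(best)
        for s in best:
            seen[s] = seen.get(s, 0) + 1
    return ideal


def alpha_ndcg(ranking, alpha=0.5):
    # Normalize by the score of the (approximately) ideal reordering.
    ideal_score = alpha_dcg(greedy_ideal(ranking, alpha), alpha)
    return alpha_dcg(ranking, alpha) / ideal_score if ideal_score > 0 else 0.0


# Hypothetical judgments: "a" and "b" are two meanings of an ambiguous query.
redundant = [{"a"}, {"a"}, {"b"}]  # repeats meaning "a" before showing "b"
diverse = [{"a"}, {"b"}, {"a"}]    # covers both meanings early
print(alpha_ndcg(redundant), alpha_ndcg(diverse))
```

With α = 0.5 the diverse ordering scores higher than the redundant one, because the second document about meaning "a" earns only half the gain. With α = 0 redundancy is not penalized at all, and the measure behaves like ordinary nDCG over subtopic counts.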


