Problems Involved in Building Temporal Information Retrieval Systems


In information retrieval, the queries given to a system are often vague. This vagueness can lead to outdated results that are of no use to the user.

For example, a query about the weather is not well served if the system returns data from the past.

           "Time should also be a factor in returning relevant results"

Similarly, in news search, if a user asks for the top news story, the system should return the top story of that day, not of an arbitrary day.


If events can be assigned dates, document retrieval becomes more precise, so temporal tagging should be performed. For example, the Independence Day of India is 15 August.
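
As a minimal sketch of such temporal tagging, assume a small hand-built lookup table of recurring events. The names EVENT_DATES and tag_events are hypothetical and used only for illustration; a real system would need a much larger event list or a trained tagger.

```python
# Hypothetical gazetteer: lowercase event name -> (month, day).
# Only the Independence Day entry comes from the example in the text.
EVENT_DATES = {
    "independence day of india": (8, 15),
}


def tag_events(text):
    """Return (event, (month, day)) pairs for every known event named in text."""
    lowered = text.lower()
    return [(name, when) for name, when in EVENT_DATES.items() if name in lowered]


tags = tag_events("Celebrations for the Independence Day of India begin early.")
```

Here tags holds one pair linking the event name to 15 August, which an indexer could store alongside the document.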

The time can be mentioned in the query, in the document or in both.

Time in the Document:

The time attribute of a document can come from the text of the document or from the date on which the document was posted.


  • Time as the publication date of the document: This is easy to manage, because results can be filtered according to the user's time preference. For example, suppose a user wants all the news of the past day. The system first notes the date of the query and then limits the search range to the day before the query was issued.

  • Time in the contents of the document: The system should store the temporal information found in documents while indexing them, but this is not as easy as it seems. Dates mentioned explicitly can be extracted during tokenization, but some phrases refer to a date implicitly, for example "ban on crackers on this Diwali in Delhi NCR by Supreme Court". Time can also appear in a document in terms of a duration ("three years ago").
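
Both cases above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Document class, the function names, and the regular expressions are hypothetical, explicit dates are reduced to four-digit years, and relative expressions are resolved against the publication date. Implicit references such as "this Diwali" are not handled and would need event knowledge.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Document:
    title: str
    published: date
    text: str


def published_in_past_day(docs, query_date):
    """Keep only documents published on the day before the query date."""
    target = query_date - timedelta(days=1)
    return [d for d in docs if d.published == target]


# Assumed patterns: four-digit years, and "<n> years ago" with small numbers.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
AGO_RE = re.compile(r"\b(\d+|one|two|three|four|five)\s+years?\s+ago\b", re.IGNORECASE)


def years_mentioned(doc):
    """Collect years stated explicitly or as '<n> years ago' in the text."""
    years = {int(y) for y in YEAR_RE.findall(doc.text)}
    for amount in AGO_RE.findall(doc.text):
        n = int(amount) if amount.isdigit() else NUMBER_WORDS[amount.lower()]
        # Resolve the relative expression against the publication year.
        years.add(doc.published.year - n)
    return sorted(years)
```

An indexer could call years_mentioned once per document and store the result, so that later queries can be matched against it without re-reading the text.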

Time mentioned in the Query:

The time period can be mentioned in the query directly or indirectly.

IMPLICITLY: "Mumbai Attacks", "Earthquake in Gujarat" etc.
EXPLICITLY: "World Cup 2015 winner", "Winter Olympics 2018 Pyeongchang" etc.
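
A rough way to separate the two kinds of query is sketched here, assuming that a query counts as explicit only when it contains a four-digit year. The function name time_mention and the pattern are hypothetical; anything without a year is labelled implicit, even though deciding which event and date it refers to would need extra knowledge.

```python
import re

# Assumed pattern: a four-digit year between 1000 and 2099.
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def time_mention(query):
    """Return 'explicit' if the query contains a four-digit year, else 'implicit'."""
    return "explicit" if YEAR_RE.search(query) else "implicit"


labels = {q: time_mention(q) for q in ["Mumbai Attacks", "World Cup 2015 winner"]}
```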

The query can relate to the past, the present, or the future.

PAST: "British Rule in India"
PRESENT: "Nirav Modi PNB Scam"
FUTURE: "2019 Elections in India"
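
For queries that carry an explicit year, this orientation can be estimated by comparing that year with the date on which the query is issued. The sketch below is an assumption-laden illustration: the function name orientation is hypothetical, only the first four-digit year is considered, and the query date used in the example is invented. Queries without a year are reported as unknown.

```python
import re
from datetime import date

# Assumed pattern: a four-digit year between 1000 and 2099.
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def orientation(query, query_date):
    """Classify a query as past, present, or future from its first explicit year."""
    match = YEAR_RE.search(query)
    if match is None:
        return "unknown"  # no explicit year; would need event knowledge
    year = int(match.group(1))
    if year < query_date.year:
        return "past"
    if year > query_date.year:
        return "future"
    return "present"


# Hypothetical query date chosen only for this example.
result = orientation("2019 Elections in India", date(2018, 3, 1))
```

With that assumed query date, result is "future", while a query such as "British Rule in India" comes back as unknown because nothing in its text states a year.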

These are some of the problems that must be addressed in order to build a temporal information retrieval system.
