Geographic Information Retrieval

Geographic Information Retrieval (GIR) combines techniques from Information Retrieval and Geographic Information Systems to answer textual queries that have a geographic dimension. A GIR system interprets the geographic knowledge contained in user queries and web documents; for example, it can answer a query such as "restaurants in Delhi". GIR can also be used to extract and resolve the meaning of location references in unstructured text. Once the location references have been correctly identified, the system indexes the information so that it can be searched and retrieved in response to a query.

[Figure: GIR system]

Issues in GIR :

· Detecting Geographic References : The first issue in building a GIR system is detecting genuine geographic references, because a place name can denote either a geographical location or some other entity, such as an organization. In "talks with Washington", for example, the name stands for a government rather than a point on the map. Solving this requires analyzing the text to distinguish a geographical place from other kinds of entity.

· Disambiguating place names : Once it has been confirmed that a name is used in a geographical sense, the next challenge is to determine uniquely which place it refers to, since many places share a name, e.g. Newport or Springfield. The ambiguity is removed by using contextual clues within the document.

· Vague Geographic Terminology : Users sometimes phrase queries with informal region names whose boundaries are not precisely defined, e.g. "South of France" or "the Midwest" in the USA, and it is difficult for a GIR system to produce results for such queries. To resolve them, GIR makes use of a gazetteer, described in a later section of this post.

· Geographical relevance ranking : Once the system has found a set of relevant results, ranking them in order of relevance to the user query is very important. Textual relevance can be computed as a score that takes into account how frequently the query terms occur in the retrieved documents. A spatial score can also be used to measure the geometric match between the query footprint and the document footprint.
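The ranking idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: footprints are simplified to axis-aligned bounding boxes, the textual score is a plain term frequency, the spatial score is overlap area divided by union area, and the two are mixed with a hypothetical weight `alpha`. None of these choices come from a specific GIR system; the names `BBox`, `text_score`, `spatial_score` and `combined_score` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in degrees; a simplified stand-in for a footprint."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def area(self) -> float:
        # A box with min > max on either axis is treated as empty.
        width = max(0.0, self.max_lon - self.min_lon)
        height = max(0.0, self.max_lat - self.min_lat)
        return width * height

    def intersection(self, other: "BBox") -> "BBox":
        # May produce an "inverted" box when there is no overlap; area() returns 0 then.
        return BBox(
            max(self.min_lon, other.min_lon),
            max(self.min_lat, other.min_lat),
            min(self.max_lon, other.max_lon),
            min(self.max_lat, other.max_lat),
        )


def text_score(query_terms, doc_text):
    """Fraction of document tokens that are query terms (plain term frequency)."""
    tokens = doc_text.lower().split()
    if not tokens:
        return 0.0
    wanted = {term.lower() for term in query_terms}
    return sum(1 for tok in tokens if tok in wanted) / len(tokens)


def spatial_score(query_fp, doc_fp):
    """Overlap area divided by union area of the two footprints (0 when disjoint)."""
    inter = query_fp.intersection(doc_fp).area()
    union = query_fp.area() + doc_fp.area() - inter
    return inter / union if union > 0 else 0.0


def combined_score(query_terms, query_fp, doc_text, doc_fp, alpha=0.5):
    """Weighted mix of textual and spatial relevance; alpha is an assumed tuning knob."""
    return (alpha * text_score(query_terms, doc_text)
            + (1 - alpha) * spatial_score(query_fp, doc_fp))
```

Documents would then be sorted by `combined_score` in descending order. A real system would likely use a stronger textual model (such as TF-IDF or BM25) and true polygon footprints instead of boxes.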

Components of GIR :

1) Semantic Similarity : Semantic similarity is a technique for measuring how close (or, equivalently, how distant) a set of documents or terms are on the basis of their meaning or semantic content. It can be computed from topological similarity, for example the path length between concepts in an ontology, or with tools such as the WNetSS API, a Java API based on the WordNet semantic resource. In geoinformatics, the SIM-DL similarity server is used to calculate similarity between concepts stored in an ontology.

2) Word-sense disambiguation : WSD is the process of identifying which sense of a word with multiple meanings is intended in a sentence. There are two approaches. Deep approaches try to analyze the complete text against a body of world knowledge; they are rarely used in practice, because such a complete body of knowledge is usually not available. Shallow approaches do not analyze the complete text; they look only at the surrounding words and tag the word with a sense accordingly. For example, if "bass" has nearby words like "fishing" or "river", it is most likely used in the fish sense.
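The shallow approach can be illustrated with a small Python sketch. It assumes a hand-written table of cue words per sense (`SENSE_CUES`, hypothetical and far too small for real use) and picks the sense whose cues overlap most with a fixed window of words around the target. This is a toy version of the idea, not the method of any particular WSD tool.

```python
# Hypothetical cue words for each sense of an ambiguous word (illustrative only).
SENSE_CUES = {
    "bass": {
        "fish": {"fishing", "river", "lake", "caught"},
        "music": {"guitar", "band", "play", "sound"},
    }
}


def disambiguate(word, sentence, window=3, cues=SENSE_CUES):
    """Return the sense whose cue words appear most often near `word`, or None."""
    # Lowercase and strip simple punctuation so "river." matches "river".
    tokens = [tok.strip(".,;:!?\"'").lower() for tok in sentence.split()]
    if word not in tokens or word not in cues:
        return None
    i = tokens.index(word)
    # Only the `window` words on each side are inspected (the "shallow" part).
    context = set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
    scores = {sense: len(context & cue_words) for sense, cue_words in cues[word].items()}
    best = max(scores, key=scores.get)
    # With no cue word nearby there is no evidence, so no sense is assigned.
    return best if scores[best] > 0 else None


print(disambiguate("bass", "He went fishing for bass in the river."))
print(disambiguate("bass", "She plays bass guitar in a band."))
```

The same windowed-context idea carries over to place names: the words around "Washington" can hint at whether a location or an organization is meant.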

[Figure: an example of word-sense ambiguity]

Gazetteer :

A gazetteer is a geographical directory used together with a map or an atlas. GIR often relies on a gazetteer to obtain information about the social statistics and physical features of a country, city or region. Its content ranges from the dimensions of peaks and waterways to population, GDP and literacy data.
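To make the role of a gazetteer concrete, here is a minimal Python sketch of a lookup that also uses contextual clues to choose between places sharing a name, as described under "Disambiguating place names". The `GAZETTEER` table is a hypothetical in-memory stand-in with only a few entries and approximate coordinates; a real gazetteer is much larger and richer, and the `resolve` function is an assumption made for illustration.

```python
import re

# Tiny illustrative gazetteer: name -> candidate places (coordinates are approximate).
GAZETTEER = {
    "springfield": [
        {"name": "Springfield", "admin": "Illinois", "country": "USA", "lat": 39.80, "lon": -89.65},
        {"name": "Springfield", "admin": "Massachusetts", "country": "USA", "lat": 42.10, "lon": -72.59},
    ],
    "newport": [
        {"name": "Newport", "admin": "Wales", "country": "UK", "lat": 51.58, "lon": -3.00},
        {"name": "Newport", "admin": "Rhode Island", "country": "USA", "lat": 41.49, "lon": -71.31},
    ],
}


def resolve(place, document, gazetteer=GAZETTEER):
    """Pick the candidate whose region or country is mentioned in the document.

    Returns None when the name is unknown or the context does not single out
    one candidate.
    """
    candidates = gazetteer.get(place.lower(), [])
    if not candidates:
        return None
    text = document.lower()

    def clue_count(entry):
        # Whole-word matches only, so "UK" does not match inside another word.
        clues = (entry["admin"], entry["country"])
        return sum(
            1 for clue in clues
            if re.search(r"\b" + re.escape(clue.lower()) + r"\b", text)
        )

    scored = [(clue_count(entry), entry) for entry in candidates]
    best = max(score for score, _ in scored)
    winners = [entry for score, entry in scored if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None


match = resolve("Springfield", "The Springfield plant in Massachusetts reopened.")
print(match["admin"] if match else "unresolved")
```

Returning `None` when the clues tie is a deliberate choice in this sketch: an unresolved name is usually safer to index than a wrongly resolved one.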

Research Areas of GIR :

· Automatic generation of natural language photo captions.

· Exploiting 3D city models to acquire knowledge about what a camera image shows.

· Building Spatial search engines.

· Developing web mining techniques and creating a web questionnaire to acquire knowledge about vague place names.
