Want to find a place to do that exciting thing you have in mind?
Well, you can google it, but can you bear going through all the results just to find one city name? If not, CitySearcher is the solution.
CitySearcher is a search engine that suggests cities from all over the world based on your general interests.
Here is an example.
If you think of "History", which places come to mind first? Agra, Jaipur, Odisha, Kedarnath.
CitySearcher does the same thing: the "interest" acts as a query over a corpus of travel documents containing information about cities, and the system predicts the cities that best match your interest.
This model uses an information retrieval system to fetch the results. Processing such queries requires semantic matching of words.
Imagine a document that describes a city as "The city is rich in flora and is surrounded by a river." What happens if the query is "Valley of flowers"? Even though the query shares no words with the document, it is semantically relevant to this city. So semantic matching is one of the critical features of this model.
Challenges in this model
- Cities whose names are also ordinary words with an unrelated meaning are problematic. For example, the city name Mobile will be ranked highly for the query "Gadgets" even though the city is not famous for gadgets. Such mismatches can seriously hurt the quality of the results.
Overview of functionality
- The model uses word2vec to represent the target city by combining the semantic relationships of all the words in its document.
- Then, for each pair of city and interest, it finds the similarity between the words in the city document and the interest.
- Besides calculating the similarity score, it also clusters the cities by topic. It first matches the interest to a topic, then computes the similarity between the interest and the city documents inside that cluster.
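The steps above can be sketched roughly as a two-stage lookup. This is only an illustration under several assumptions: the word vectors are hand-made 2-d toy values rather than real word2vec output, "combining" word vectors is taken to mean averaging them, and the topic names and the grouping in `CLUSTERS` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical toy data: topic -> city -> word vectors of the city document.
CLUSTERS = {
    "heritage": {
        "Agra": [[0.9, 0.1], [0.8, 0.3]],
        "Jaipur": [[0.7, 0.4], [0.6, 0.5]],
    },
    "mountains": {
        "Kedarnath": [[0.1, 0.9], [0.2, 0.8]],
    },
}

def search(interest_vec, clusters):
    """Pick the closest topic cluster, then rank the cities inside it."""
    # Stage 1: represent each cluster by the mean of its city vectors.
    centroids = {
        topic: mean_vector([mean_vector(words) for words in cities.values()])
        for topic, cities in clusters.items()
    }
    best_topic = max(centroids, key=lambda t: cosine(interest_vec, centroids[t]))

    # Stage 2: rank only the cities that belong to the chosen cluster.
    cities = clusters[best_topic]
    ranked = sorted(
        cities,
        key=lambda city: cosine(interest_vec, mean_vector(cities[city])),
        reverse=True,
    )
    return best_topic, ranked
```

Restricting the second stage to one cluster means the interest is compared with fewer documents, and cities from unrelated topics never enter the ranking.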
Now, how do we resolve mismatched semantic results, such as the city "Sale", which is not famous for "shopping" but still gets a high rank for that interest under this model? Machine learning techniques are used to resolve this issue. Wondering how?
Here is the solution: the model is trained on such city-document pairs, and users can rate such documents. These ratings are used to remove a target city that is a semantic mismatch with the document.
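To make the role of ratings concrete, here is a deliberately simplified sketch. It is not the trained model described above: it only drops a city for an interest once enough users have rated that pair poorly. The class name `RatingFilter`, the 1 to 5 star scale, and both thresholds are assumptions made for illustration.

```python
from collections import defaultdict

class RatingFilter:
    """Drop (city, interest) pairs that users have rated poorly."""

    def __init__(self, min_mean=2.5, min_votes=2):
        self.min_mean = min_mean      # assumed cut-off on a 1-5 star scale
        self.min_votes = min_votes    # do not judge a pair on too few votes
        self._ratings = defaultdict(list)

    def rate(self, city, interest, stars):
        """Record one user rating for a city returned for an interest."""
        self._ratings[(city, interest)].append(stars)

    def keep(self, city, interest):
        """Return True unless the pair has enough votes and a low mean."""
        votes = self._ratings.get((city, interest), [])
        if len(votes) < self.min_votes:
            return True
        return sum(votes) / len(votes) >= self.min_mean

    def apply(self, ranked_cities, interest):
        """Filter a ranked list of city names, preserving its order."""
        return [city for city in ranked_cities if self.keep(city, interest)]
```

For example, after two one-star ratings for the pair ("Sale", "shopping"), `apply(["Sale", "Jaipur"], "shopping")` would return only `["Jaipur"]`, while "Sale" would still be returned for other interests.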
Techniques used
- The words in the city document and the interest are represented as vectors by the well-known 'word2vec' model.
- To calculate the similarity between the interest ('itr') and the city document, it computes the cosine similarity between 'itr' and the words in the document, averages the top scores, and ranks the cities corresponding to the documents in decreasing order of this average.
Here k is the number of top scores that are averaged, and S_i is the cosine similarity between 'itr' and a word in the document.
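The scoring described above can be sketched as follows. The vectors in `TOY_DOCS` are made-up 2-d values standing in for real word2vec embeddings, and the function names are hypothetical; the only part taken from the description is the idea of averaging the top k cosine similarities and sorting cities by that average.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def top_k_average(interest_vec, word_vecs, k):
    """Average the k highest similarities S_i between 'itr' and the words."""
    scores = sorted(
        (cosine_similarity(interest_vec, w) for w in word_vecs),
        reverse=True,
    )
    top = scores[:k]
    return sum(top) / len(top) if top else 0.0

def rank_cities(interest_vec, city_docs, k=2):
    """Return (city, score) pairs sorted by decreasing score."""
    scored = [
        (city, top_k_average(interest_vec, vecs, k))
        for city, vecs in city_docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up toy vectors: one list of word vectors per city document.
TOY_DOCS = {
    "Mobile": [[0.1, 0.9], [0.0, 1.0], [0.2, 0.8]],
    "Agra": [[0.9, 0.1], [0.8, 0.2], [0.0, 1.0]],
}
```

Averaging only the top k scores means a long document is not penalized for containing many words that are unrelated to the interest, as long as a few of its words match well.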
Conclusion
This idea is exciting and can be used to find a city that matches our personal interests. People no longer need to search the Web with queries like "Best places to do paragliding" or "Places rich in flora and fauna" and waste their time going through hundreds of results to select the one best place.
References:
Mohamed Abdel Maksoud et al. "CitySearcher: A City Search Engine For Interests." In Proceedings of SIGIR ’17, Shinjuku, Tokyo, Japan, August 07-11, 2017
https://en.wikipedia.org/wiki/Word2vec