Automatic acquisition of lexico-semantic information for question answering
Lonneke van der Plas - 18 April 2008

Abstract: People may use a wide variety of terminology to describe the same concept. Dealing with terminological variation appropriately seems important for applications such as question answering (QA). For example, a user might ask a question using words that are not found in the document that holds the answer.

The way words are distributed over contexts tells us something about their semantic relatedness. For example, words that are found in the same syntactic relation, such as words that appear in the object position of the verb 'to drink', have something in common: liquidity. By calculating the distributional similarity between words, we can acquire lexico-semantic information automatically. The use of syntactic contexts for finding semantically related words is rather common, and so is the proximity-based (bag-of-words) approach. We used multilingual parallel corpora to define yet another type of context: the translations a word receives in several languages, acquired through automatic word alignment. The idea behind this is that words that receive the same translations in other languages, such as 'autumn' and 'fall', are often synonymous.

In the first part of the talk I will describe the different methods we used to acquire related words, and I will characterise the types of lexical relations that result from the various methods. In the second part of the talk I will show what happens when we try to apply the lexico-semantic information to QA.
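The distributional idea described in the abstract can be illustrated with a minimal sketch. It assumes each word is represented by counts of the contexts it occurs in and that similarity is measured with the cosine; the abstract does not say which similarity measure or weighting was actually used, so that choice is an assumption. The toy (word, context) pairs, the context labels such as `obj_of:drink` and `fr:automne`, and the function names `build_vectors` and `cosine` are all invented for illustration and do not come from the talk.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_vectors(pairs):
    """Turn (word, context) observations into per-word context count vectors."""
    vectors = defaultdict(Counter)
    for word, context in pairs:
        vectors[word][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (an assumed measure)."""
    dot = sum(count * v[feature] for feature, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical syntactic contexts: the verb a noun is the object of.
syntactic_pairs = [
    ("beer", "obj_of:drink"), ("beer", "obj_of:brew"), ("beer", "obj_of:pour"),
    ("wine", "obj_of:drink"), ("wine", "obj_of:pour"),
    ("book", "obj_of:read"), ("book", "obj_of:write"),
]

# Hypothetical alignment contexts: translations obtained by word alignment.
alignment_pairs = [
    ("autumn", "fr:automne"), ("autumn", "de:Herbst"),
    ("fall", "fr:automne"), ("fall", "de:Herbst"), ("fall", "fr:chute"),
    ("winter", "fr:hiver"), ("winter", "de:Winter"),
]

syntactic = build_vectors(syntactic_pairs)
alignment = build_vectors(alignment_pairs)

print("beer ~ wine:", cosine(syntactic["beer"], syntactic["wine"]))
print("beer ~ book:", cosine(syntactic["beer"], syntactic["book"]))
print("autumn ~ fall:", cosine(alignment["autumn"], alignment["fall"]))
print("autumn ~ winter:", cosine(alignment["autumn"], alignment["winter"]))
```

The same two functions serve both context types; only the definition of a context changes. In this toy data, words sharing syntactic contexts ('beer', 'wine') score higher than unrelated ones, and words sharing translations ('autumn', 'fall') score higher than words with different translations.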