Natural Language Processing in Information Retrieval Research

Natural Language Processing

To spare searchers from memorizing Boolean or other query languages, some systems let them type a question in ordinary language and use it as the query; this is known as "Natural Language Processing" (NLP). The simplest processing just removes stopwords and runs a vector search or other statistical match on the remaining terms. More sophisticated systems try to extract concepts through linguistic analysis and match them against concepts extracted by the indexer. Others categorize the form of the question and use it to shape the query, so that "who is" questions are treated differently from "how many" or "why" questions; a good example of this approach is the AskJeeves system.
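As a rough illustration of the simplest approach described above, the following Python sketch strips stopwords from a typed question and ranks documents by cosine similarity over raw term counts. It also shows a toy version of classifying a question by its form. Everything here is an assumption made for illustration: the stopword list, the function names, and the category labels are hypothetical, and the sketch is not the method of any particular system, AskJeeves included.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; real systems use longer ones.
STOPWORDS = {"a", "an", "and", "are", "do", "does", "how", "in", "is",
             "many", "of", "the", "to", "what", "who", "why"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(question, documents):
    """Return documents sharing terms with the question, best match first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked if score > 0]


def question_form(question):
    """Classify a question by its leading words (illustrative labels only)."""
    q = question.lower().strip()
    for prefix, form in (("who is", "person"),
                         ("how many", "quantity"),
                         ("why", "reason")):
        if q.startswith(prefix):
            return form
    return "other"


if __name__ == "__main__":
    docs = [
        "The capital of France is Paris.",
        "Boolean query languages use AND, OR, NOT.",
        "Paris hosts many museums.",
    ]
    print(search("What is the capital of France?", docs))
    print(question_form("How many moons does Mars have?"))
```

A system that uses the form of the question could route each label to a different query strategy, for example restricting "person" questions to biographical sources. That routing step is left out here because the text does not describe how any real system does it.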

