Web Site Searching and the User Experience
BayCHI, January 1999
Avi Rappoport, Search Tools Consulting
Search Quality
Information Retrieval Theory
- Recall vs. precision
  - Recall = the share of all relevant pages that the search retrieves
  - Precision = the share of retrieved pages that are relevant
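As a rough illustration of the two measures, here is a minimal sketch that scores one query. The function name `precision_recall` and the page IDs are hypothetical, and it assumes someone has already judged which pages are relevant.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given collections of page IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant pages the engine actually returned
    # Guard against empty sets so the sketch never divides by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 pages returned, 3 of them relevant, 6 relevant pages exist.
retrieved = {"p1", "p2", "p3", "p4"}
relevant = {"p1", "p2", "p3", "p7", "p8", "p9"}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Returning more pages tends to raise recall and lower precision, which is why the two are usually reported together.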

- Extended character sets and multiple languages
  - Multiple extended ASCII encodings: Unix vs. Windows vs. Mac vs. Unicode
  - Non-Roman character sets
  - Multiple languages: English "mole" vs. Spanish "mole" (same spelling, different words)
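To make the encoding problem concrete, the sketch below encodes one word four ways and then decodes it with the wrong table. Mapping "Unix" to ISO 8859-1, "Windows" to CP1252, "Mac" to Mac Roman, and "Unicode" to UTF-8 is an assumption for illustration, and `to_index_term` is a hypothetical helper, not part of any particular search tool.

```python
import unicodedata

WORD = "café"

# Assumed platform-to-encoding mapping (illustrative only).
ENCODINGS = {
    "Unix (ISO 8859-1)": "latin_1",
    "Windows (CP1252)": "cp1252",
    "Mac (Mac Roman)": "mac_roman",
    "Unicode (UTF-8)": "utf_8",
}

# The same word becomes different bytes depending on the encoding.
encoded = {label: WORD.encode(codec) for label, codec in ENCODINGS.items()}


def to_index_term(raw: bytes, codec: str) -> str:
    """Decode with the declared encoding, then normalize so all forms compare equal."""
    return unicodedata.normalize("NFC", raw.decode(codec)).casefold()


# Decoding with the right codec always yields the same index term...
terms = {to_index_term(encoded[label], codec) for label, codec in ENCODINGS.items()}

# ...but reading Mac Roman bytes as CP1252 silently produces a different word.
misread = encoded["Mac (Mac Roman)"].decode("cp1252")

for label, raw in encoded.items():
    print(f"{label:20} {raw!r}")
print("misread:", misread)
```

An indexer that guesses the encoding wrongly will store the misread form, so a correctly typed query will not match it.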
- Stopwords
  - "a", "an", and "the" are often removed from the index
  - Common words may be vital to some queries, e.g. song titles such as "Because of You"
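The song-title problem can be sketched in a few lines. The stopword list here is a hypothetical one chosen for the example; real search engines ship their own lists.

```python
# Hypothetical stopword list (an assumption, not taken from any specific engine).
STOPWORDS = {"a", "an", "and", "the", "of", "because", "you"}


def index_terms(text, stopwords=STOPWORDS):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [word for word in text.lower().split() if word not in stopwords]


ordinary = index_terms("The History of the Blues")
song_title = index_terms("Because of You")

print(ordinary)    # the content words survive
print(song_title)  # nothing is left to search on
```

When every word of a query is a stopword, the engine has no terms left to match, so the title can never be found.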
- Stemming
  - Indexing and searching on root forms rather than word forms, so searching for "paid" will retrieve "pay" (can be problematic!)
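A toy stemmer shows both the benefit and the risk. This is a deliberately naive sketch, not a published stemming algorithm: the irregular-form table, the suffix list, and the documents are all hypothetical.

```python
# Hypothetical lookup table for irregular forms such as "paid".
IRREGULAR = {"paid": "pay", "ran": "run", "mice": "mouse"}
# Naive suffix list, tried in order; real stemmers use far more careful rules.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Map a word to a crude root form."""
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    for suffix in SUFFIXES:
        # Only strip a suffix if at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Build an inverted index keyed by stemmed terms."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(stem(word), set()).add(doc_id)
    return index


docs = {
    "d1": "pay the invoice",
    "d2": "invoice paid",
    "d3": "new releases",
    "d4": "news archive",
}
index = build_index(docs)

paid_hits = index[stem("paid")]  # finds both "pay" and "paid"
news_hits = index[stem("news")]  # also finds "new": the problematic case
print(sorted(paid_hits), sorted(news_hits))
```

The same rule that usefully merges "pay" and "paid" also merges "news" with "new", which is the kind of false match the slide warns about.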
- Fuzzy matching
  - Using linguistic analysis to match words that are similar in many ways: spelling, sound, etc.
- ZDNet (Thunderstone Texis)
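Two simple forms of fuzzy matching can be sketched with standard techniques: spelling similarity via Python's `difflib.get_close_matches`, and sound similarity via a simplified Soundex code. This is only an illustration of the idea; it is not a description of how Thunderstone Texis or any other product matches words, and the vocabulary and cutoff are assumptions.

```python
import difflib

# Hypothetical index vocabulary.
VOCABULARY = ["search", "stemming", "precision", "recall"]

# Spelling similarity: find the closest indexed word to a misspelled query.
spelling = difflib.get_close_matches("serch", VOCABULARY, n=1, cutoff=0.8)

# Simplified Soundex: letters that sound alike share a digit.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(word):
    """Return a four-character sound code, e.g. the same code for Smith and Smyth."""
    word = word.lower()
    if not word:
        return ""
    first = word[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        # Skip vowels (no code) and repeats of the previous code.
        if code and code != prev:
            digits.append(code)
        # "h" and "w" do not separate letters with the same code; vowels do.
        if ch not in "hw":
            prev = code
    return (first.upper() + "".join(digits) + "000")[:4]


print(spelling)
print(soundex("Smith"), soundex("Smyth"), soundex("Robert"), soundex("Rupert"))
```

A looser cutoff or a coarser sound code finds more variants but also more unrelated words, so fuzzy matching trades precision for recall.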