Search Tools Report
Xapian Code Library & Omega Search
Project Description
Xapian Mailing Lists
Price: free, open source (GPL)
Platform: Linux, Solaris, Mac OS X, FreeBSD, NetBSD, OpenBSD, HP-UX, Tru64, IRIX, and other Unix platforms, as well as Microsoft Windows.
Xapian is an actively developed, open source, high-performance text retrieval system, based on years of research and scalable to very large document collections. It now includes the Omega search engine, an application built on the code library that makes it relatively simple to install and run.
Features
- Scales to hundreds of millions of pages and supports index database files larger than 2 GB.
- Bindings for many popular languages, including C/C++, Java, Perl, PHP, Python, Tcl, C#, and Ruby.
- Full access to the index: from term to documents, and also from document to terms.
- Indexes XML and RDF, and can treat sections as separate documents.
- Allows simultaneous updating and searching; new documents rapidly become searchable.
- Word stemming in 12 European languages.
- Search features include Boolean operators, phrase/proximity search, and wildcards.
- Probabilistic relevance algorithm: "important" words get more weight.
- Relevance feedback for query expansion and related documents.
- Omega, the indexing and search application, provides:
- An indexer that handles HTML, PHP, PDF, PostScript, plain text, OpenOffice/StarOffice, OpenDocument, Microsoft Word, WordPerfect, RTF, and Perl POD documentation; other formats when converters are available; and most common database formats via the Perl DBI module.
- A CGI search module with a clean, customizable default format, or output as XML or CSV.
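Several of the features above (access to the index in both directions, Boolean search, and extra weight for "important" words) can be illustrated with a small toy sketch. This is plain Python and does not use Xapian or its bindings; the class name TinyIndex, its methods, and the weighting formula are assumptions chosen for illustration, not Xapian's API or its actual relevance algorithm.

```python
import math
from collections import defaultdict


class TinyIndex:
    """Toy in-memory index (hypothetical; not Xapian's API or storage format)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.doc_terms = {}               # document id -> set of its terms

    def add(self, doc_id, text):
        # Naive tokenizer: lowercase and split on whitespace, no stemming.
        terms = set(text.lower().split())
        self.doc_terms[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def docs_for(self, term):
        # Term -> documents lookup.
        return sorted(self.postings.get(term.lower(), set()))

    def terms_for(self, doc_id):
        # Document -> terms lookup.
        return sorted(self.doc_terms.get(doc_id, set()))

    def search_and(self, *terms):
        # Boolean AND: documents that contain every query term.
        sets = [self.postings.get(t.lower(), set()) for t in terms]
        return sorted(set.intersection(*sets)) if sets else []

    def weight(self, term):
        # A common IDF-style formula (an assumption for this sketch):
        # terms found in fewer documents receive a higher weight.
        n = len(self.doc_terms)
        df = len(self.postings.get(term.lower(), ()))
        return math.log((n - df + 0.5) / (df + 0.5) + 1.0)


index = TinyIndex()
index.add(1, "open source search engine")
index.add(2, "open source database")
index.add(3, "open access research literature")

print(index.docs_for("open"))              # [1, 2, 3]
print(index.terms_for(2))                  # ['database', 'open', 'source']
print(index.search_and("open", "source"))  # [1, 2]
print(index.weight("engine") > index.weight("open"))  # True
```

In this sketch "engine" outweighs "open" because it appears in only one of the three documents; a production system would add stemming, term positions for phrase search, and on-disk storage.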
Examples
- Die Zeit - German newspaper.
- TheyWorkForYou.com - Searching Hansard, the official report of the UK House of Commons.
- Citebase - Citebase Search is a search and citation analysis tool for the free, online research literature.
- Qoop - Dutch online auction site.
- Recoll - Free open source personal desktop search for Unix and Linux based on Xapian.