Search Tools Code Library Report
Lucene
Product Information
Now part of the Apache Jakarta project
Platform: Java (designed for cross-platform use)
Price: free, open source, Apache Software License
Features
- Very fast indexing, minimal RAM required
- Index compression to 30% of original text
- Indexes text and HTML, document classes available for XML, PDF and RTF
- Search supports phrase and Boolean queries, using plus and minus signs, quote
marks, and parentheses for grouping
- Allows single- and multiple-character wildcards in the middle or at the end
of search words, plus fuzzy and proximity searches
- Will search for punctuation such as + or ?
- Field searches for title, author, etc., and date-range searching
- Supports most European languages
- Option to store and display full text of indexed documents
- Search results in relevance order
- APIs for file format conversion, languages and user interfaces
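
The query features listed above can be illustrated with a few query strings in the syntax accepted by Lucene's standard query parser. The following is a minimal sketch that only builds and prints strings; it does not call Lucene itself. The class name, the escape helper, the field names (title, author, modified) and the date format are assumptions made for illustration, since actual field names and date formats depend on how a given index was built.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QuerySyntaxExamples {

    // Characters treated as operators by the query syntax (assumed list,
    // based on the documented special characters).
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    /** Hypothetical helper: backslash-escapes operator characters so that
     *  punctuation such as + or ? is searched literally. */
    public static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** One example query string per feature, in listing order. */
    public static Map<String, String> examples() {
        Map<String, String> q = new LinkedHashMap<>();
        q.put("phrase", "\"search engine\"");
        q.put("required/prohibited", "+lucene -commercial");
        q.put("boolean with grouping", "(java OR jakarta) AND index");
        q.put("single-character wildcard", "te?t");
        q.put("multiple-character wildcard", "index*");
        q.put("fuzzy", "roam~");
        q.put("proximity", "\"jakarta apache\"~10");
        // Field names below are hypothetical; they must match the index.
        q.put("field", "title:lucene AND author:goetz");
        q.put("date range", "modified:[20020101 TO 20021231]");
        q.put("punctuation", escape("C++"));
        return q;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : examples().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

In a real application, strings like these would be handed to Lucene's query parser together with an analyzer and a default field name.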
Articles & Reviews
- jGuru Lucene FAQ, jguru.com,
updated July 2002 by Otis Gospodnetic
Helpful information on indexing, searching, updates, configuration, etc.
- The Lucene search engine: Powerful, flexible, and free, JavaWorld,
September 2000, by Brian Goetz
Thoughtful description of implementing the Lucene search engine for searching
Eyebrowse email archives, which are stored in a MySQL database. Discusses
the features, including the powerful indexing and updating scheme, in some
detail, and includes code snippets showing how to call the API.
Examples