Search Tools Product Report
Platform: available as a server appliance, software for Solaris,
Linux or Windows 2000, or a remote search service
Price: contact company
- Fast but polite robot crawler for indexing internal and external web sites
- Flexible include/exclude rules using regexp (grep) patterns
- Accesses SSL secure sites via HTTPS
- Handles proxy servers and password protected areas.
- Indexes mounted file system volumes in native formats, NFS, Samba,
- Handles file formats: HTML, ASCII text, RTF, Microsoft Word, Excel, PowerPoint,
Acrobat PDF, PostScript
- Indexes XML files and searches within XML tags; can define DTDs and metadata.
- Indexes relational databases (MySQL, Oracle, PostgreSQL) through JDBC interfaces.
- Update scheduler options.
- Metadata fields: URL, images, mailtos, hrefs, anchors, Dublin Core and AGLS
- External metadata assignment to documents or directories
- Supports Western European languages (ISO-8859-1): Afrikaans, Basque, Catalan,
Danish, Dutch, English, Faeroese, Finnish, French, Galician, German, Icelandic,
Irish, Italian, Norwegian, Portuguese, Scottish, Spanish, and Swedish
- Advanced search interface for additional metadata and query operators
- Synonym list
- Spellchecking using aspell, dictionaries and/or the site content text.
- Can enable stemming for searches.
- Can search in specified subsites or combined meta-collections
- Search page and results page customization using HTML templates.
- Results sorted by relevance, with extra weight for metadata matches
- Date sort option
- Shows "Featured pages" (manual recommendations).
- Shows search words in context and/or metadata content in results pages
- Option to view cached versions of files
- Advanced results customization uses Perl syntax for extensive flexibility
- Can return results in XML
- Web-browser administration interface for general customization
- Extensive config files for complete control
- Query log reports include most common queries, no-matches, time taken
- Based on PADRE (PArallel Document Retrieval Engine), started in 1994
- Scales to over 18 million web pages (100 gigabytes) on low-end hardware.
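Illustration of the include/exclude rules listed above: a minimal Python sketch of
how a crawler might apply regexp patterns to decide which URLs to index. This is not
the product's own rule syntax, which this report does not show. The pattern lists,
the example.com URLs, the function name should_crawl, and the choice that exclude
rules take precedence over include rules are all assumptions made for the example.

```python
import re

# Hypothetical rule sets, invented for this sketch.
INCLUDE_PATTERNS = [r"^https?://www\.example\.com/"]
EXCLUDE_PATTERNS = [r"/private/", r"\.(zip|exe)$"]


def should_crawl(url, include=INCLUDE_PATTERNS, exclude=EXCLUDE_PATTERNS):
    """Return True if url matches an include pattern and no exclude pattern."""
    # Assumption: an exclude match always wins over an include match.
    if any(re.search(pattern, url) for pattern in exclude):
        return False
    return any(re.search(pattern, url) for pattern in include)


if __name__ == "__main__":
    for candidate in (
        "https://www.example.com/docs/index.html",
        "https://www.example.com/private/notes.html",
        "http://other.example.org/",
    ):
        print(candidate, should_crawl(candidate))
```

Checking exclusions first keeps the rule set easy to reason about: a broad include
pattern can cover a whole site while a few narrow exclude patterns carve out the
areas that should stay out of the index.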
Page Created 2003-07-02