As of January, 2012, this site is no longer being updated, due to work and health issues
Search Tools Product Report
Doclinx Search Engines
Product Information
Price: TeraXML Enterprise Search and Docsan CD Publisher each start at $10,000; contact company for further price information.
Platform: Windows; Unix (Solaris, Linux); Java J2EE.
Features
- TeraXML Enterprise Search is an enterprise full-text and XML search engine
- Scales to terabytes of data
- Stores security information for each document
- Indexes office file formats, HTML, Adobe Acrobat PDF, Microsoft Word and over 200 other formats.
- Can index content from Lotus Notes, Oracle and most other relational databases.
- Fast indexing, can handle over 4 GB per hour
- Compressed indexes, as little as 25% of original document size.
- Designed for multiple load-balanced search servers
- Structured search using XQuery and XPath
- Text search query language supports Boolean operators, phrase search, fuzzy search, wildcards, proximity and stemming
- Handles European and Asian languages using Unicode (UTF-8)
- Global Media Monitoring System
- Uses advanced speech recognition and text-mining technologies in multiple languages to automate the process of intelligence gathering from news broadcasts and internet websites.
- Real-time Speech Recognition module converts filtered speech into text.
- Named entities such as proper names of persons, organizations and geographical locations, as well as monetary, numeric, and other expressions, are identified in the text created by the Speech Recognition module.
- News broadcasts are automatically separated into stories and classified into one or more pre-defined topics.
- Text and metadata are stored in XML format and fed to the XML indexing engine, which creates a searchable archive.
- Text indexer performs parallel, incremental indexing and offers scalability.
- Distributed architecture allows implementation of large clusters of speech indexers and search servers.
- Integrates with a notification system to send alerts when selected words are spoken in the broadcast.
- Customizable web-based user interface allows end users to search by spoken words, speaker names, topics, channels or program names.
- Uses Java and XML technologies to enable integration with existing infrastructure.
- TeraXML Language Analyzer
- Analyzes, identifies and categorizes important words and phrases contained in textual data.
- Performs tokenization, part-of-speech tagging, sentence boundary detection, base noun phrase detection, and named entity detection.
- Targeted full-text queries.
- Conceptual document maps.
- Visualization of relationships between entities.
- XML messaging API enables deployment of distributed applications in a platform-independent manner.
- Available for English, Arabic, French, Italian, German, Spanish, Chinese, Japanese and Korean.
- Docsan E-commerce Server
- Online sales system for content-based products such as reports or music files.
- Includes the TeraXML search engine
- Shopping cart included
- Personalization tools
- Integrates with corporate databases, sales tax, credit card processing.
- Scales across multiple servers to handle large user loads and provide fail-over capability.
- Docsan CD Publisher
- Java CD content publishing system
- Publishes to CD and/or the web.
- Integrated live web update
- 200 supported document formats.
- Stores index on CD
- Handles European and Asian languages
- Full-text and fielded search using a browser
Examples
- igrep - A vertical niche search engine specifically aimed at developers and other people deeply interested in technology.
- MarcoPolo Search Engine - Provides access to all of the educational resources created by the MarcoPolo Partners.
- ADIN - Australian Drug Information Network.
Page Modified 2007-12-18