As of January, 2012, this site is no longer being updated, due to work and health issues

SearchTools.com

Search Tools News for the Year 2001


See also: Site change details

 

November 1, 2001

AskJeeves: New SearchTools Report

Site or intranet version of this question-answering search engine, supports many languages, natural-language processing for queries and search log analysis to improve the answer matrix.

October 24, 2001

Commerce Search Engines: Report Update

The quality of the search engine on an online store has a direct relationship to that store's bottom line, so it's even more important to make it work! Research analyst reports describe common problems with product catalog searching. This report includes a checklist of the most important functions and interface elements of an e-commerce search engine, and a new listing of the most prominent search engines.

Ripfire Ignite: New SearchTools Report

Designed for both structured (databases and XML) and free-text searching, this system is often used for e-commerce sites, integrated as Java middleware. It has realtime index updating, spellchecking, custom synonym listings, clustering search results into categories, and very fast retrieval.

October 23, 2001

Metadata Search: Report Update

Metadata, structured information about documents, can improve search engine results significantly. This report covers metadata and search engines, including new resources such as XML and RDF metadata, the Dublin Core NISO standard, Adobe XMP metadata within files, and topic maps.
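As a rough illustration of how an indexer might pick up Dublin Core metadata from an HTML page, the sketch below collects meta tags whose names start with "DC." using Python's standard HTML parser. The sample page, the class name DublinCoreParser and the helper extract_dublin_core are invented for this example; real search engines differ in which fields they recognize and how they weight them.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collects <meta name="DC.*" content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        if name.lower().startswith("dc.") and content is not None:
            # Store under the lower-cased element name, e.g. "title".
            self.fields.setdefault(name[3:].lower(), []).append(content)


def extract_dublin_core(html_text):
    """Return a dict mapping Dublin Core element names to lists of values."""
    parser = DublinCoreParser()
    parser.feed(html_text)
    return parser.fields


# Hypothetical sample page, for illustration only.
SAMPLE_PAGE = """
<html><head>
<title>Quarterly Report</title>
<meta name="DC.title" content="Quarterly Sales Report">
<meta name="DC.creator" content="A. Author">
<meta name="DC.date" content="2001-10-23">
<meta name="keywords" content="ignored, not Dublin Core">
</head><body>...</body></html>
"""

if __name__ == "__main__":
    print(extract_dublin_core(SAMPLE_PAGE))
```

A search engine could then index these values as separate fields, so that a query can be limited to, say, the creator or the date.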

Searching PDF: Report and Listings Update

Advice for web site and intranet managers on site search engines and PDF files includes suggestions for preparing PDF for searching, the new Adobe XMP metadata, identifying PDF indexing problems, and displaying PDF files in search results. Lists 44 site search engines which index and search PDF files.

Open Source Search Engines: Listings Update

Now includes a summary of Eric Lease Morgan's comparative review of eight leading open source search engines, as well as listings for twenty open-source search engines.

New Search Engines

Dieselpoint: New SearchTools Report

Designed for online catalogs and e-commerce, this Java search engine indexes database fields as well as HTML and other text files.

Everyfind: New SearchTools Report

Using JavaScript on Windows, this search engine provides multilingual indexing for web sites and CD/DVD distribution. Extensive customization for results pages.

Juggernautsearch: New SearchTools Report

Perl search engine designed to scale to millions of documents; the Pro version adds sophisticated indexing controls.

Orangevalley Intranet Search Engine: New SearchTools Report

Windows search engine with a spider for crawling intranets, uses ASP for searching. Search results show a snippet of text with the matched words highlighted, and searches are logged for later analysis.

Enterprise Search: New SearchTools Report

Scalable Windows search engine provides extensive control for indexing spider, multiple languages, search zones, customized results formatting and relevance rankings, and search logging.

URL Spider Pro: New SearchTools Report

Smaller-scale Windows search engine, but otherwise similar to Enterprise Search.

Updated Search Engines

ISearch Ported to VMS

Those in desperate need of a search engine for the VMS operating system should contact A/WWW Enterprises at www.awcubed.com which has ported the open-source search engine ISearch to VMS.

Obsolete and Discontinued Search Engines

JHLSearch Discontinued

Java Search engine no longer available.

Twirlix Directory Discontinued

Portal ASP remote search service has closed down.

RightSearch Search Engine Acquired

Company has been bought and the technology incorporated into other applications.

Search-It Service Discontinued

The search server has not responded to queries for the last week, nor does anyone answer email, so I think this service has been discontinued.

SeekIt Service Discontinued

The server does not respond, nor is there any way to contact the company.

SearchTools Site Links Updated

I've noticed an increasing rate of link-rot due to site reorganization as well as company failures. I've done my best to remove these links from the SearchTools site, but there may well be more. Please let me know if you notice any additional problems.

October 17, 2001

ht://Dig Security Flaw

The open-source search engine ht://Dig has reported a security vulnerability, and posted updates and patches to fix the problem. Administrators running versions 3.1.0b2 through 3.1.5, and 3.2 betas, should update immediately.

October 15, 2001

EoExchange to close down

EoExchange, a provider of corporate portal search and taxonomy development services, is closing due to market and capital constraints.

October 12, 2001

Automatic Categorization Report Update

Automatic categorization is a hot topic these days, as the next frontier in search and navigation functionality. Large web sites and intranets need tools to group their reams of information into coherent categories, so they're looking to automated systems. Grouping search results by category can also provide context and allow searchers to locate the most fruitful areas quickly. Our report now has links to some excellent articles on this topic, while the Classification Tools page lists many new products.

Search Engines for Databases Report Update

Database search and text search functionality is merging, to the benefit of end-users. Databases are starting to improve their ability to index and search large amounts of text, while text-search engines are storing more database structure such as field names and value formats (number, date, price, etc.). This report describes the advantages of each approach and links to database search software.

New Search Engine Implementation Consultants Report

This page lists consultants who can help with installing, configuring or tuning a search engine for your site. This is not a recommendation, simply a list. If you are a consultant, please contact searchtools.com to get added to this list.

New Search Engine Reports

DMP Scout Windows Search Toolkit: New Report

This search code library uses linguistic analysis to improve retrieval, based on research from Lernout & Hauspie.

ebhath Perl Arabic/Roman Search Engine: New Report

Free open-source Perl search engine works with Arabic and Roman code pages, allows a customized header and footer for results pages.

Educesoft Windows ASP Search Engine: New Report

Search engine uses Windows ASP (Active Server Pages), indexes using file system, provides a browser administration interface and highlights matched text in title/description fields in search results.

Elise Matching Engine: New Report

Fuzzy searching for structured data, works with relational databases and standard domain vocabulary.

F3DSearch Portal Search Engine in Perl: New Report

Perl search engine for Unix and Windows is designed for topical portals. It uses an indexing robot to gather pages, provides customizable templates, a special relevance algorithm and results grouped in categories.

IBM Intelligent Miner for Text Search Engine: New Report

Multilingual search engine with robot crawler, scales to very large numbers of documents. Includes language identification and linguistic analysis, clustering and categorization, many file formats and all standard query formats. Java administration interface, available on Windows, Unix and OS/390.

Nathra Arabic/Roman Scalable Search Engine: New Report

Indexes and stems both Arabic and Roman text, scales up for large sites and topical portals. Includes a customizable header and footer for results pages. Runs on Unix and Windows.

orenge Retrieval Engine: New Report

Designed for Knowledge Management, e-commerce and complex customer support applications, uses natural-language processing when possible. Runs on Unix and Windows.

Zoom Free Windows Search Engine: New Report

Free search engine indexes using the local file system, provides templates for results page customization.

Discontinued Search Engines

XRS and BUS XML Search Engines No Longer Available

These XML search engines were the projects of a professor who has moved on, and the pages are no longer accessible.

October 11, 2001

Open Source Search Engines Listing

There are over a dozen open source search engine projects, mainly on Unix and Java. Many have active user bases and development groups, and some can index hundreds of thousands of web pages. These are generally free, but require technical resources to compile, configure and maintain the software.

New Articles on E-Commerce and Enterprise Search

InternetWorld Conference Postponed

InternetWorld Fall 2001 has been rescheduled to December 10 through 14: Avi Rappoport of SearchTools.com will be speaking on Tuesday December 11, 2:45 to 3:45.

August 14, 2001

Intranet and Enterprise Information Portal Searching Report

As internal corporate networks get larger and more information is available in digital format, enterprises are installing search engines on their intranets. This allows them to find valuable documents quickly, avoid duplication of effort, take advantage of research and analysis already performed and make better use of resources. Many companies are also using this network to allow employee self-service for human resources and supply ordering, activities which are also appropriate for searching.

Enterprise Information Portals provide a starting point for people to access information and applications on the entire Intranet. They generally include a search engine for locating internal and external information; security features, so a person only has to use a single password; personalization so they get appropriate information; access to databases and enterprise applications; and so on.

EIPs are starting to create categories and directories of information, and using them with full-text searching to make the most effective presentation of relevant results, and provide alerting services using information filtering to watch for news and other incoming information of interest.

August 13, 2001

Microsoft Index Server Vulnerable to CodeRed Worm

The CodeRed worm can exploit a flaw in the Microsoft Index Server search engine to install itself and attack other machines. All administrators should install the patches and reboot to remove this vulnerability.

SearchTools Survey Results

New survey analysis up through the end of October 2000. We wanted to learn more about the relationship of search engines and web sites, and how web site managers view search engines.

We now have 1075 survey results, as of July 12, 2001, covering the topics of why site managers have or have not installed search engines, correlations of the sizes of sites and the installation of search engines, frequency of updates, file formats served, languages, and number of languages used on sites.

For web administrator ratings of the search tools they've used, see the Survey Ratings page. This includes evaluations of the most popular search engines (with seven or more responses), other products, and custom development.

Atomica: New MetaSearch Engine

Metasearch engine queries multiple locations and unifies content in the results.

SearchExpress: New Search Engine

Operates locally or as a remote search service, indexes many file formats and can scan and OCR paper documents, scales to millions of pages, ActiveX code library available.

Visual Net: New Search and Visualization Engine

Provides visual mapping of data to group related topics in enterprises.

SearchEngine Site Search Service Update

This remote search service will now index up to 30,000 pages for free, although search results pages will show their advertising. Paid versions are available to remove advertising and company logos at a very reasonable price. Based in the UK, this ASP search service is particularly useful to sites in Europe, as indexing and search results don't have to travel to the US and take chances with network quality and latency.

MondoSearch Product/Service Update

MondoSoft, which has acquired the Searchbutton remote search ASP, has a new version with synonym lists and additional vocabulary help, phrase searching, recommended page categories, highlighting terms in results pages and many other useful features.

AOL PLWeb Search Engine Support

AOL affirms continued support for the PLS search products, including PLWeb and CPL, in a response to questions on their mailing list. PLS is the search engine used for all AOL Time Warner sites including AOL, Netscape and ICQ. It continues to be available free of charge but with only partial source code.

QueryServer: Updated MetaSearch Engine

Dataware Knowledge Seeker metasearch server is now available from Open Text.

August 9, 2001

Multimedia Search Engines Report

As more digital multimedia archives are developed, they require specialized search engines that can index and search these formats. Video and audio are hard to browse, so search engines can save significant time and effort in locating useful content.

Indexing multimedia is much more complex than indexing text. In some cases the media can be converted to text: broadcast television often includes digital text as closed-captions for the hearing impaired, and scene titles and captions within a video can be converted to text using OCR. Speech-recognition technology can digitize words spoken on audio tracks. Continuous media, such as video, also can be broken up into chunks by transitional effects, for better precision in results. Some groups are also working on form and shape recognition, which could allow searchers to draw a shape, such as a bridge or a tumor; or select an example picture and find others like it.

August 8, 2001

New Search Engine: Northern Light

Search service of the Northern Light search engine incorporates many Enterprise Information Portal (EIP) features, including security and personalization. Now integrated with Corporate Yahoo PortalBuilder service.

New Search Engine: Amberfish

Beta version of a new high-performance search engine with efficient indexing and searching.

New Search Engine: OpenFTS

Free open-source search engine from Russia, based on the PostgreSQL database, optimized for fast index updating.

Asian Text Retrieval Workshop

Evaluation of Asian language text retrieval, question answering and text summarization, following on the TREC workshops. Also includes cross-language information retrieval in Chinese, Korean, Japanese and English. Runs from September 2001 through October 2002, participants get a chance to perform tests, participate in discussions, receive evaluations of their software and publish their results. Anonymous participation is permitted.

June 28, 2001

New Search Engine: JXTA Search

Some of the smartest folks working on peer-to-peer computing designed this interchange standard to allow a central server to accept queries, distribute them to the appropriate search servers and return the results to the original clients. It was known as InfraSearch, then GoneSilent, and is now part of Sun's JXTA project.

New Search Engine: Windex Search

A Java search engine from France: the indexing is done first, and then a Java applet is used to search and display results. The index and applet can be distributed on CD or DVD disc, or from a web site.

mnoGoSearch Ported to Windows and Mac OS X

The Russian search engine mnoGoSearch has been ported to Windows, although that version is not free. This engine uses a database back end instead of an inverted index, and includes interfaces for PHP, Perl and so on. The Unix version has been ported to Mac OS X, as have the Onix and ht://Dig search engines.

Inktomi Search Updates to version 4.1.2

The new version of Inktomi's search software (formerly Ultraseek) includes more features for Japanese and Korean languages, updates to file format filters including double-byte PDF 1.2 files, support for cookies, summaries of MS Word documents and improved support for US Government Section 508 standards (disabled accessibility compliance).

XML Query Working Group Status

On June 12, the W3C XML Query Working group published new drafts of papers on XML Query language definition, use cases, formal semantics, data model and syntax.

June 27, 2001

MondoSearch Acquires Searchbutton

MondoSoft, which provides both remote search services and search software, has agreed to acquire Searchbutton.com, another leading remote search service. MondoSearch strengths include interfaces in many languages, unique frame recognition while indexing, showing results in categories, and date sensitivity. Searchbutton strengths are simple yet powerful administration interfaces and excellent search reporting. While this consolidation reduces some of the competition in the field, we hope that the merged company will be stronger and more able to withstand the current downturn. Searchbutton customers can move to the MondoSearch service, which starts at $420 per month, or the MondoSearch software on internal servers, starting at $6,200.

New Search Engine: 80-20 Discovery

Uses complex neural net and concept retrieval algorithms rather than simple word matching. Can distribute searching to multiple servers and integrates with Windows security.

New Search Engine: LikeIt

Performs pattern-matching on parts of words rather than whole words, for better recall. Can run locally or be hosted remotely, also available as an ANSI C code library.
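One common way to match on parts of words is to compare overlapping character trigrams. The sketch below shows that general idea only; it is not LikeIt's actual algorithm, which is not described here, and the function names and the 0.3 threshold are arbitrary choices for illustration.

```python
def trigrams(word):
    """Return the set of 3-character fragments of a padded, lower-cased word."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard overlap of the two words' trigram sets, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def fuzzy_lookup(query, vocabulary, threshold=0.3):
    """Return vocabulary words similar to the query, best match first."""
    scored = [(similarity(query, word), word) for word in vocabulary]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [word for score, word in ranked if score >= threshold]


if __name__ == "__main__":
    # A misspelled query still finds the intended index term.
    print(fuzzy_lookup("serch", ["search", "engine", "index"]))
```

Because a misspelled word still shares most of its fragments with the correct spelling, this kind of matching improves recall at the cost of some extra index size and the occasional false match.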

New Search Engine: XML Query Engine

Free-text search for XML fields in hierarchies, compatible with XQL and XQuery (W3C Query Language draft). Available in the form of an Enterprise Java Bean.

New Product Report: Autonomy Search

Autonomy has been around for many years, but has not emphasized its search engine, which uses Bayesian pattern recognition to match queries to documents. This search is integrated into the Autonomy EIP, including automated categorization, document similarity matching, and adaptive and collaborative filtering technology.

Quiver To Integrate Classification with Inktomi Search

The Quiver taxonomy and categorization tool will integrate Inktomi Search Software to provide a complete information retrieval and navigation system.

New RobotsTxt.org Web Site

Martijn Koster has moved the Robots.txt and Robots Meta Tag information from the old webcrawler site to its own site. The new site was registered in September 2000 and there is a note that it will be updated in Q3 2000, but all the information I could find was the old standard.

June 18, 2001

AltaVista Search Enterprise and Personal Search

AltaVista has announced two new products: Enterprise Search and Personal Search. The enterprise version is designed to allow organizations to make use of internal information in structured databases and unstructured formats, such as personal archives and email servers. It will integrate with corporate security and access control systems as well as organizational policies for adjusting results relevance rankings. The Personal version will index text on workstations, allowing individuals to use the Web search paradigm to search their own hard drives and file stores.

An AP story interviewed privacy advocates who expressed concern about corporate intrusion into obscure and intimate files of individual employees. This may also expose companies to demands for evidence in harassment and employment law cases. Other analysts point out that any information on business machines is owned by that business, and AltaVista promises tools to limit indexing and protect specific areas.

Although AltaVista will index over 200 file formats, including ZIP, it cannot break encryption, so we expect to see a sudden upsurge of personal encryption utility use among attentive employees.

FizzyLabs closed

FizzyLabs, which provided related items based on AI and document-similarity analysis, has shut down, another victim of the dotcom downturn.

June 5, 2001

New Pricing for Atomz Remote Search Prime Service

Atomz Search Prime is now available for $600 per year for indexing and searching 1,000 pages, down from $2000. The free version remains available for up to 500 pages and does not display any advertising banners, though it does require an Atomz logo graphic. The paid versions provide more control, more frequent updates and telephone technical support. Atomz also offers an Enterprise Search for larger sites, ecommerce stores and enterprise intranets.

June 1, 2001

New Search Engine: Recommind MindServer

New search engine and automatic classification and categorization system uses semantic analysis to find the underlying topics of documents and return the most useful results first. Designed for Intranets and Enterprise Information Portals.

May 4, 2001

Searching MP3 Metadata

ID3 versions 1 and 2 offer searchable information about MP3 music files. A few search engines recognize this information, and we hope to see more in the future. This report provides a short background discussion, links to the ID3 information, and listings of search engines which can find the MP3 metadata.
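For readers curious about what a search engine has to do to read this metadata, here is a minimal sketch of parsing an ID3 version 1 tag, which is a fixed 128-byte block at the end of the file beginning with the characters "TAG". The function name parse_id3v1 is invented for this example, and decoding the text as Latin-1 is an assumption, since ID3v1 does not declare an encoding. ID3 version 2 uses a different, more elaborate layout at the start of the file and is not handled here.

```python
import struct

ID3V1_SIZE = 128
# 3 bytes "TAG" (skipped), then title, artist, album (30 bytes each),
# year (4 bytes), comment (30 bytes) and a 1-byte genre code.
ID3V1_FORMAT = "<3x30s30s30s4s30sB"


def _text(raw):
    """Decode a fixed-width field, dropping NUL padding and stray spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(data):
    """Return a dict of ID3v1 fields from raw MP3 bytes, or None if absent."""
    if len(data) < ID3V1_SIZE:
        return None
    tag = data[-ID3V1_SIZE:]
    if tag[:3] != b"TAG":
        return None
    title, artist, album, year, comment, genre = struct.unpack(ID3V1_FORMAT, tag)
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre": genre,  # numeric code; mapping it to a name needs a lookup table
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as mp3_file:
        print(parse_id3v1(mp3_file.read()))
```

An indexer could store these values as searchable fields alongside the file name and location.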

Meta Search Engines

As sites and Intranets get more complex, it's nice to search all the data at one time. MetaSearch engines can send requests to multiple text and database search interfaces, then present the results to users. This report provides a little background, some information on the Z39.50 metasearch standard, and listings of meta search engines and toolkits.
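The basic metasearch flow can be sketched in a few lines: send the query to each back end, put the scores on a common scale, and merge duplicates. Everything in this example is hypothetical, including the two stand-in back ends and their URLs; it does not use Z39.50, and real metasearch engines must also cope with timeouts, differing query syntaxes and far less comparable relevance scores.

```python
def metasearch(query, backends, limit=10):
    """Query every backend and merge hits into one ranked list.

    backends maps a source name to a callable that takes a query string
    and returns a list of (url, score) pairs, with higher scores better.
    """
    merged = {}
    for name, search in backends.items():
        hits = search(query)
        if not hits:
            continue
        # Scale each source's scores to 0..1 so they can be compared.
        top = max(score for _, score in hits) or 1.0
        for url, score in hits:
            normalized = score / top
            best = merged.get(url)
            if best is None or normalized > best[0]:
                merged[url] = (normalized, name)
    ranked = sorted(merged.items(), key=lambda item: (-item[1][0], item[0]))
    return [(url, score, source) for url, (score, source) in ranked[:limit]]


# Hypothetical stand-ins for a text search engine and a database search.
def text_engine(query):
    return [("http://example.com/a", 8.0), ("http://example.com/b", 4.0)]


def database_engine(query):
    return [("http://example.com/b", 0.9), ("http://example.com/c", 0.45)]


if __name__ == "__main__":
    sources = {"text": text_engine, "database": database_engine}
    for url, score, source in metasearch("widgets", sources):
        print(f"{score:.2f}  {url}  (best from {source})")
```

Normalizing by each source's top score is only one simple choice; it treats every back end's best hit as equally good, which may not suit collections of very different quality.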

Smart Logik (previously Brightstation and Muscat) originally started developing this code library as open source (Open Muscat or Omsee) as the next version of its high-performance text retrieval system. It uses a probabilistic relevance algorithm, providing an efficient and scalable library for indexing and searching data. As of the end of April 2001, the company closed down the Omsee open source effort, so outside developers have started a SourceForge project to continue working on the code. As of July 1, 2001, the name is changing to OmSearch to avoid any confusion.

April 18, 2001

NNGroup report on e-commerce search engines (late 2000)

This report analyzes searching on online store sites, focusing on user experience. It says very much the same things we've been saying about search forms, results pages and search failures. Includes some solid test data backing up the recommendations to use a search box, recognize synonyms, accept various operators and errors, show helpful results metadata, explain results, handle search failure, and perform extensive search log analysis. Well worth the $45 to download the PDF report: for links to other articles, see the SearchTools.com E-Commerce Search page.

Article: Robot Exclusion Standard (Robots.txt) Has Legal Value

No Bots Allowed! Interactive Week, April 12, 2001 by James C. Luh
Describes how eBay's court case against the auction aggregator Bidder's Edge was won based in part on eBay's use of robots.txt. eBay's lawyers likened the directives in their robots.txt file to a "no trespassing" sign, and say that the court agreed with them. Martijn Koster, developer of the standard, says he has mixed feelings about enforcement based on it -- that's one reason it's a convention and not a formal IETF or W3C standard.
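For those writing their own indexing robots, honoring robots.txt takes very little code. The sketch below uses the robot parser in Python's standard library; the robots.txt content, the robot names and the example.com URLs are made up for illustration and have nothing to do with the actual files in the court case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named robot is shut out entirely,
# everyone else is only kept out of /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAggregatorBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, user_agent, url):
    """Return True if the rules allow this robot to request the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    for agent in ("ExampleAggregatorBot/1.0", "OtherBot"):
        for url in ("http://www.example.com/listings/1",
                    "http://www.example.com/private/x"):
            print(agent, url, may_fetch(rules, agent, url))
```

A robot would normally fetch the file from the site root before crawling and run this check before every request. The directives remain advisory: nothing in the protocol itself stops a robot that ignores them.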

nexTrieve Ultralite

This search engine uses fuzzy matching extensively, to match terms misspelled either in the search query or the web pages. It has special features for indexing mailing lists, and provides speedy results on large collections, even on low-end servers.

April 5, 2001

Usability Testing for Search Field Location

Michael Bernard reports on a formal usability test of various standard web page elements, including the search field. Results show that both novice and experienced web users expect the search field and button to be in the center at the top or bottom of the page, or at the upper right corner.

April 4, 2001

Inktomi Search Software version 4.1

New version of the Inktomi site and enterprise search engine (formerly Ultraseek) now indexes content databases using ODBC on Windows NT/2000 and direct Oracle access on Unix. It also includes support for Korean, per-language synonym lists, XML attribute searching and automatic title generation for WML pages.

mnoGoSearch for Windows

Windows beta version of the mnoGoSearch software includes a graphical interface for indexing.

Alkaline 1.4

Swiss search engine Alkaline now has custom indexing metatags, numeric searches, multiple services on Windows NT, and a bare-bones Perl API.

FusionBot Updates

The remote search service FusionBot now indexes PDF files, password-protected areas and can use the HTTPS protocol.

HomepageSearchEngine version 3.3

HomepageSearchEngine now highlights term matches in found pages when clicking on results, and can use an HTML "template" page with SSI or PHP layout commands.

AltaVista Enterprise Search Seminars

AltaVista Search seminar on how search works for large institutions, with a presentation by Andrei Broder, AltaVista Chief Scientist, and other search experts. Dates and Locations: April 17 in Boston; April 18 in New York City; April 19 in Washington, DC; April 24 in Santa Clara (SF Bay Area); April 25 in Los Angeles.

Omsee - new name for Open Muscat

Omsee is an open-source probabilistic relevance engine, designed as an efficient and scalable library for indexing and searching data.

April 3, 2001

dtSearch version 6

This version has a robot indexing spider, Unicode and XML hierarchy support, and indexing update scheduling.

Search For Success: Internet Week Article

An expert in customer service points out that fixing the site search may be much more cost-effective than complex CRM solutions and provides specific suggestions.

April 2, 2001

Namazu Japanese Search Engine

Namazu is a free open-source Japanese search engine for Unix and Windows, and seems to be written in Perl. It only indexes local files, with no robot crawling, but it looks very nice from here.

March 30, 2001

Microsoft SharePoint (formerly Tahoe)

SharePoint is an Enterprise Information Portal, with a search engine which can index and search text. It's compatible with Exchange Public Folders, file servers, Web sites and Lotus Notes databases. It uses a variant of SQL for queries, and probabilistic relevance ranking in results, with a "best bets" feature emphasizing frequently-linked documents.

RDF querying using Squish

Squish is an experimental query engine in Java which accepts SQL-style queries and searches through RDF documents.

March 29, 2001

Survey Results: Online Stores Need Good Search Engines

Not All Site Features Turn Online Shoppers Into Buyers PricewaterhouseCoopers, March 6, 2001
A survey of 547 Internet users in January of this year found that over three-quarters of the respondents use search features (77%).

Search functionality is considered the most important feature for online shopping by 43%, beating product information (40%), when choosing where to shop: both features led customer service, personalization and wish lists in selecting sites. When deciding what to buy, search functions also play an important role, although enlarged product images, availability and comparison guides are more directly involved.

All this supports our proposition that e-commerce sites should concentrate on providing excellent search results rather than expensive and complex interactive features.

New and Updated Search Tools

March 21, 2001

Peer To Peer Search

Sun has just bought InfraSearch, temporarily known as GoneSilent, a distributed search engine started by one of the founders of the Gnutella P2P protocol. According to Sun's Press Release, this technology will be incorporated into software developed by Project Juxtapose, their peer-to-peer research incubator.

Watch this space for more news and analysis on peer-to-peer searching.

March 16, 2001

Updated Search Tools

Conferences

March 1, 2001

Articles

Revving Up the Search Engines to Keep the E-Aisles Clear New York Times, February 28, 2001 by Lisa Guernsey (registration may be required to read this article)
Discusses the difficulty of locating items in online stores, referring to the Forrester report of last spring. Describes the use of thesaurus tools for synonym searching and taking advantage of database structure in online stores. Quotes the vendors Mercado, which provides search for WebVan and Tower Records, and EasyAsk, as well as the chief scientist at Verity.

New Search Tools

XML Query Standards Progress

The World Wide Web Consortium XML Query Working Group has released a new version of the requirements for building a standard XML query language, as of February 16. This describes how the standard XML query language should work, with general goals, usage scenarios, terminology, data model information, functionality, and how the language should fit with the other XML standards. They've also posted XQuery: A Query Language for XML which is an implementation of a query language based on the requirements.

The discussion on the XML Query Public Mailing List has been brisk and thoughtful, indicating it is a good step towards a standard query language but there are some aspects of the current requirements document which will need revision. This is a public mailing list for interested parties; to join, just send a "subscribe" message to www-ql-request@w3.org, and see the archives for past discussions.

For more information, see our report on Searching XML and our list of XML Searching Resources.

February 26, 2001

New Search Tools

February 21, 2001

New Search Tools

Obsolete and Discontinued Search Tools

February 16, 2001

New Search Studies

New Search Tools and Services

February 15, 2001

Updated Topics

New Search Tools and Services

Updated Search Tools

Discontinued Search Tools

February 12, 2001

NQL (Network Query Language)

Powerful scripting language automates intelligent agent information transport for web site indexing, metasearching and accessing databases, email stores and more.


For earlier news, see the 2000, 1999 and 1998 news archive pages