As of January 2012, this site is no longer being updated, due to work and health issues.
See also: Site change details
AskJeeves: New SearchTools Report
Site or intranet version of this question-answering search engine, supports many languages, natural-language processing for queries and search log analysis to improve the answer matrix.
Commerce Search Engines: Report Update
The quality of the search engine on an online store has a direct relationship to that store's bottom line, so it's even more important to make it work! Research analyst reports describe common problems with product catalog searching. This report includes a checklist of the most important functions and interface elements of an e-commerce search engine, and a new listing of the most prominent search engines.
Ripfire Ignite: New SearchTools Report
Designed for both structured (databases and XML) and free-text searching, this system is often used for e-commerce sites, integrated as Java middleware. It has realtime index updating, spellchecking, custom synonym listings, clustering search results into categories, and very fast retrieval.
Metadata Search: Report Update
Metadata, structured information about documents, can improve search engine results significantly. This report covers metadata and search engines, including new resources such as XML and RDF metadata, the Dublin Core NISO standard, Adobe XMP metadata within files, and topic maps.
Searching PDF: Report and Listings Update
Advice for web site and intranet managers on site search engines and PDF files includes suggestions for preparing PDF for searching, the new Adobe XMP metadata, identifying PDF indexing problems, and displaying PDF files in search results. Lists 44 site search engines which index and search PDF files.
Open Source Search Engines: Listings Update
Now includes a summary of Eric Lease Morgan's comparative review of eight leading open source search engines, as well as listings for twenty open-source search engines.
New Search Engines.
Dieselpoint: New SearchTools Report
Designed for online catalogs and e-commerce, this Java search engine indexes database fields as well as HTML and other text files.
Everyfind: New SearchTools Report
Juggernautsearch: New SearchTools Report
Perl search engine designed to scale to millions of documents, Pro version adds sophisticated indexing controls.
Orangevalley Intranet Search Engine: New SearchTools Report
Windows search engine with a spider for crawling intranets, uses ASP for searching. Search results show a snippet of text with the match words highlighted, searches are logged for later analysis.
Enterprise Search: New SearchTools Report
Scalable Windows search engine provides extensive control for indexing spider, multiple languages, search zones, customized results formatting and relevance rankings, and search logging.
URL Spider Pro: New SearchTools Report
Smaller-scale Windows search engine, but otherwise similar to Enterprise Search.
Updated Search Engines
ISearch Ported to VMS
Those in desperate need of a search engine for the VMS operating system should contact A/WWW Enterprises at www.awcubed.com which has ported the open-source search engine ISearch to VMS.
Obsolete and Discontinued Search Engines
Java Search engine no longer available.
Twirlix Directory Discontinued
Portal ASP remote search service has closed down.
RightSearch Search Engine Acquired
Company has been bought and the technology incorporated into other applications.
Search-It Service Discontinued
The search server has not responded to queries for the last week, nor does anyone answer email, so I think this service has been discontinued.
SeekIt Service Discontinued
The server does not respond, nor is there any way to contact the company.
SearchTools Site Links Updated
I've noticed an increasing rate of link-rot due to site reorganization as well as company failures. I've done my best to remove these links from the SearchTools site, but there may well be more. Please let me know if you notice any additional problems.
ht://Dig Security Flaw
The open-source search engine ht://Dig has reported a security vulnerability, and posted updates and patches to fix it. Administrators running versions 3.1.0b2 through 3.1.5, or the 3.2 betas, should update immediately.
EoExchange, a provider of corporate portal search and taxonomy development services, is closing due to market and capital constraints.
Automatic categorization is a hot topic these days, as the next frontier in search and navigation functionality. Large web sites and intranets need tools to group their reams of information into coherent categories, so they're looking to automated systems. Grouping search results by category can also provide context and allow searchers to locate the most fruitful areas quickly. Our report now has links to some excellent articles on this topic, while the Classification Tools page lists many new products.
Database search and text search functionality is merging, to the benefit of end-users. Databases are starting to improve their ability to index and search large amounts of text, while text-search engines are storing more database structure such as field names and value formats (number, date, price, etc.). This report describes the advantages of each approach and links to database search software.
This page lists consultants who can help with installing, configuring or tuning a search engine for your site. This is not a recommendation, simply a list. If you are a consultant, please contact searchtools.com to get added to this list.
New Search Engine Reports
This search code library uses linguistic analysis to improve retrieval, based on research from Lernout & Hauspie.
Free open-source Perl search engine works with Arabic and Roman code pages, allows a customized header and footer for results pages.
Search engine uses Windows ASP (Active Server Pages), indexes using file system, provides a browser administration interface and highlights matched text in title/description fields in search results.
Fuzzy searching for structured data, works with relational databases and standard domain vocabulary.
Perl search engine for Unix and Windows is designed for topical portals. It uses an indexing robot to gather pages, provides customizable templates, a special relevance algorithm and results grouped in categories.
Multilingual search engine with robot crawler, scales to very large numbers of documents. Includes language identification and linguistic analysis, clustering and categorization, many file formats and all standard query formats. Java administration interface, available on Windows, Unix and OS/390.
Indexes and stems both Arabic and Roman text, scales up for large sites and topical portals. Includes a customizable header and footer for results pages. Runs on Unix and Windows.
Designed for Knowledge Management, e-commerce and complex customer support applications, uses natural-language processing when possible. Runs on Unix and Windows.
Free search engine indexes using the local file system, provides templates for results page customization.
Discontinued Search Engines
These XML search engines were the projects of a professor who has moved on, and the pages are no longer accessible.
There are over a dozen open source search engine projects, mainly on Unix and Java. Many have active user bases and development groups, some can index hundreds of thousands of web pages. These are generally free, but require technical resources to compile, configure and maintain the software.
New Articles on E-Commerce and Enterprise Search
- Desperately Seeking Search Technology [Commentary] BusinessWeek Online, September 24, 2001 by Robert D. Hof
Quotes eminent analysts from Jupiter, Patricia Seybold Group and Forrester to support the value of a good search engine on commerce web sites in particular. Recommends ultrafast updates, as exemplified by FAST on eBay, tolerance of misspellings [and typos], synonym recognition such as EasyAsk on LandsEnd and search fields on every page, like Ritz Interactive. Also suggests using Amazon-like recommendations and providing information stored in private product databases to web search indexers such as Google.
- Seeking far and wide for the right data InfoWorld, August 27 / September 3, 2001 by Cathleen Moore.
Describes the value of search engines and categorization as essential elements of corporate portal infrastructures, to handle the "deluge" of information within enterprises. Quotes Aberdeen analyst Guy Creese who points out that without a good way to search, corporations would be "blowing their investment in the content". Covers recent announcements of search and categorization features by Autonomy, Verity, AltaVista, iPhrase, and Smartlogik (Muscat).
- Enterprise Portals: The Current Big Thing [Survey Results] InformationWeek, July 23, 2001
Describes results of a survey of 100 IT and business professionals, whose companies hope for better productivity and efficiency using a portal approach. The most search-related aspect is the desire to "improve decision-making" (around 75% report this as a goal). Enterprise Portals are used mainly by employees, but half are used by customers and/or business partners, so security must be a major element. Budgets for enterprise portals are low, 1% to 5% of overall IT spending, and very few companies are delivering "richer" content (multimedia, presumably) due to cost, privacy and bandwidth concerns.
InternetWorld Conference Postponed
InternetWorld Fall 2001 has been rescheduled to December 10 through 14: Avi Rappoport of SearchTools.com will be speaking on Tuesday December 11, 2:45 to 3:45.
Intranet and Enterprise Information Portal Searching Report
As internal corporate networks get larger and more information is available in digital format, enterprises are installing search engines on their intranets. This allows them to find valuable documents quickly, avoid duplication of effort, take advantage of research and analysis already performed and make better use of resources. Many companies are also using this network to allow employee self-service for human resources and supply ordering, activities which are also appropriate for searching.
Enterprise Information Portals provide a starting point for people to access information and applications on the entire Intranet. They generally include a search engine for locating internal and external information; security features, so a person only has to use a single password; personalization so they get appropriate information; access to databases and enterprise applications; and so on.
EIPs are starting to create categories and directories of information, and using them with full-text searching to make the most effective presentation of relevant results, and provide alerting services using information filtering to watch for news and other incoming information of interest.
Microsoft Index Server Vulnerable to CodeRed Worm
The CodeRed worm can exploit a flaw in the Microsoft Index Server search engine to install itself and attack other machines. All administrators should install the patches and reboot to remove this vulnerability.
SearchTools Survey Results
New survey analysis up through the end of October 2000. We wanted to learn more about the relationship of search engines and web sites, and how web site managers view search engines.
We now have 1075 survey results, as of July 12, 2001, covering the topics of why site managers have or have not installed search engines, correlations of the sizes of sites and the installation of search engines, frequency of updates, file formats served, languages, and number of languages used on sites.
For web administrator ratings of the search tools they've used, see the Survey Ratings page. This includes evaluations of the most popular search engines (with seven or more responses), other products, and custom development.
Atomica: New MetaSearch Engine
Metasearch engine queries multiple locations and unifies content in the results.
SearchExpress: New Search Engine
Operates locally or as a remote search service, indexes many file formats and can scan and OCR paper documents, scales to millions of pages, ActiveX code library available.
Visual Net: New Search and Visualization Engine
Provides visual mapping of data to group related topics in enterprises.
SearchEngine Site Search Service Update
This remote search service will now index up to 30,000 pages for free, although search results pages will show their advertising. Paid versions are available to remove advertising and company logos at a very reasonable price. Based in the UK, this ASP search service is particularly useful to sites in Europe, as indexing and search results don't have to travel to the US and take chances with network quality and latency.
MondoSearch Product/Service Update
MondoSoft, which has acquired the Searchbutton remote search ASP, has a new version with synonym lists and additional vocabulary help, phrase searching, recommended page categories, highlighting terms in results pages and many other useful features.
AOL PLWeb Search Engine Support
AOL affirms continued support for the PLS search products, including PLWeb and CPL in a response to questions on their mailing list. PLS is the search engine used for all AOL Time Warner sites including AOL, Netscape and ICQ. It continues to be available free of charge but with only partial source code.
QueryServer: Updated MetaSearch Engine
Dataware Knowledge Seeker metasearch server is now available from Open Text.
Multimedia Search Engines Report
As more digital multimedia archives are developed, they require specialized search engines that can index and search these formats. Video and audio are hard to browse, so search engines can save significant time and effort in locating useful content.
Indexing multimedia is much more complex than indexing text. In some cases the media can be converted to text: broadcast television often includes digital text as closed-captions for the hearing impaired, and scene titles and captions within a video can be converted to text using OCR. Speech-recognition technology can digitize words spoken on audio tracks. Continuous media, such as video, also can be broken up into chunks by transitional effects, for better precision in results. Some groups are also working on form and shape recognition, which could allow searchers to draw a shape, such as a bridge or a tumor; or select an example picture and find others like it.
New Search Engine: Northern Light
Search service of the Northern Light search engine incorporates many Enterprise Information Portal (EIP) features, including security and personalization. Now integrated with Corporate Yahoo PortalBuilder service.
New Search Engine: Amberfish
Beta version of a new high-performance search engine with efficient indexing and searching.
New Search Engine: OpenFTS
Free open-source search engine from Russia, based on the PostgreSQL database, optimized for fast index updating.
Asian Text Retrieval Workshop
Evaluation of Asian language text retrieval, question answering and text summarization, following on the TREC workshops. Also includes cross-language information retrieval in Chinese, Korean, Japanese and English. Runs from September 2001 through October 2002; participants get a chance to perform tests, participate in discussions, receive evaluations of their software and publish their results. Anonymous participation is permitted.
New Search Engine: JXTA Search
Some of the smartest folks working on peer-to-peer computing designed this interchange standard to allow a central server to accept queries, distribute them to the appropriate search servers and return the results to the original clients. It was known as InfraSearch, then GoneSilent, and is now part of Sun's JXTA project.
New Search Engine: Windex Search
A Java search engine from France: the indexing is done first, and then a Java applet is used to search and display results. The index and applet can be distributed on CD or DVD disc, or from a web site.
mnoGoSearch Ported to Windows and Mac OS X
The Russian search engine mnoGoSearch has been ported to Windows, although that version is not free. This engine uses a database back end instead of an inverted index, and includes interfaces for PHP, Perl and so on. The Unix version has been ported to Mac OS X, as have the Onix and ht://Dig search engines.
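For readers unfamiliar with the term, an inverted index maps each word to the set of documents that contain it, so a query only has to look up its own terms. The following is a minimal sketch in Python, not mnoGoSearch code or anyone's production design; the function names and the sample documents are invented for illustration.

```python
import re
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

# Hypothetical sample documents, invented for this example.
DOCS = {
    "a.html": "Site search engines for intranets",
    "b.html": "Open source search toolkit",
    "c.html": "Intranets and portals",
}

INDEX = build_inverted_index(DOCS)
print(sorted(search(INDEX, "search intranets")))
```

A database back end stores comparable word-to-document rows in tables instead of a purpose-built index file.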
Inktomi Search Updates to version 4.1.2
The new version of Inktomi's search software (formerly Ultraseek) includes more features for Japanese and Korean languages, updates to file format filters including double-byte PDF 1.2 files, support for cookies, summaries of MS Word documents and improved support for US Government Section 508 standards (disabled accessibility compliance).
XML Query Working Group Status
On June 12, the W3C XML Query Working group published new drafts of papers on XML Query language definition, use cases, formal semantics, data model and syntax.
MondoSearch Acquires Searchbutton
MondoSoft, which provides both remote search services and search software, has agreed to acquire Searchbutton.com, another leading remote search service. MondoSearch strengths include interfaces in many languages, unique frame recognition while indexing, showing results in categories, and date sensitivity. Searchbutton strengths are simple yet powerful administration interfaces and excellent search reporting. While this consolidation reduces some of the competition in the field, we hope that the merged company will be stronger and more able to withstand the current downturn. Searchbutton customers can move to the MondoSearch service, which starts at $420 per month, or the MondoSearch software on internal servers, starting at $6,200.
New Search Engine: 80-20 Discovery
Uses complex neural net and concept retrieval algorithms rather than simple word matching. Can distribute searching to multiple servers and integrates with Windows security.
New Search Engine: LikeIt
Performs pattern-matching on parts of words rather than whole words, for better recall. Can run locally or be hosted remotely, also available as an ANSI C code library.
New Search Engine: XML Query Engine
Free-text search for XML fields in hierarchies, compatible with XQL and XQuery (W3C Query Language draft). Available in the form of an Enterprise Java Bean.
New Product Report: Autonomy Search
Autonomy has been around for many years, but has not emphasized its search engine, which uses Bayesian pattern recognition to match queries to documents. This search is integrated into the Autonomy EIP, including automated categorization, document similarity matching, and adaptive and collaborative filtering technology.
Quiver To Integrate Classification with Inktomi Search
The Quiver taxonomy and categorization tool will integrate Inktomi Search Software to provide a complete information retrieval and navigation system.
New RobotsTxt.org Web Site
Martijn Koster has moved the Robots.txt and Robots Meta Tag information from the old webcrawler site to its own site. The new site was registered in September 2000 and there is a note that it will be updated in Q3 2000, but all the information I could find was the old standard.
AltaVista Search Enterprise and Personal Search
AltaVista has announced two new products: Enterprise Search and Personal Search. The enterprise version is designed to allow organizations to make use of internal information in structured databases and unstructured formats, such as personal archives and email servers. It will integrate with corporate security and access control systems as well as organizational policies for adjusting results relevance rankings. The Personal version will index text on workstations, allowing individuals to use the Web search paradigm to search their own hard drives and file stores.
An AP story interviewed privacy advocates who expressed concern about corporate intrusion into obscure and intimate files of individual employees. This may also expose companies to demands for evidence in harassment and employment law cases. Other analysts point out that any information on business machines is owned by that business, and AltaVista promises tools to limit indexing and protect specific areas.
Although AltaVista will index over 200 file formats, including ZIP, it cannot break encryption, so we expect to see a sudden upsurge of personal encryption utility use among attentive employees.
FizzyLabs, which provided related items based on AI and document-similarity analysis, has shut down, another victim of the dotcom downturn.
New Pricing for Atomz Remote Search Prime Service
Atomz Search Prime is now available for $600 per year for indexing and searching 1,000 pages, down from $2000. The free version remains available for up to 500 pages and does not display any advertising banners, though it does require an Atomz logo graphic. The paid versions provide more control, more frequent updates and telephone technical support. Atomz also offers an Enterprise Search for larger sites, ecommerce stores and enterprise intranets.
New Search Engine: Recommind MindServer
New search engine and automatic classification and categorization system uses semantic analysis to find the underlying topics of documents and return the most useful results first. Designed for Intranets and Enterprise Information Portals.
Searching MP3 Metadata
ID3 versions 1 and 2 offer searchable information about MP3 music files. A few search engines recognize this information, and we hope to see more in the future. This report provides a short background discussion, links to the ID3 information, and listings of search engines which can find the MP3 metadata.
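As a rough illustration of what an indexer has to read, an ID3 version 1 tag is a fixed 128-byte block at the end of the MP3 file, beginning with the characters "TAG". The sketch below parses only that version 1 layout (ID3 version 2 is stored differently and is not handled here); the function names and the Latin-1 decoding are assumptions made for this example.

```python
import struct

# ID3v1 layout: "TAG", title(30), artist(30), album(30), year(4),
# comment(30), genre(1) = 128 bytes at the very end of the file.
ID3V1_FORMAT = "<3s30s30s30s4s30sB"

def _text(raw):
    """Cut at the first NUL byte, decode, and trim padding spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

def parse_id3v1(data):
    """Return a dict of ID3v1 fields from file bytes, or None if no tag."""
    if len(data) < 128:
        return None
    block = data[-128:]
    marker, title, artist, album, year, comment, genre = struct.unpack(
        ID3V1_FORMAT, block
    )
    if marker != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre": genre,  # numeric genre code
    }

def read_id3v1(path):
    """Read the last 128 bytes of a file and parse them as an ID3v1 tag."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)            # jump to end of file
        size = handle.tell()
        handle.seek(max(size - 128, 0))
        return parse_id3v1(handle.read())
```

A search engine could pass the returned title, artist and album strings to its indexer as fielded metadata.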
Meta Search Engines
As sites and Intranets get more complex, it's nice to search all the data at one time. MetaSearch engines can send requests to multiple text and database search interfaces, then present the results to users. This report provides a little background, some information on the Z39.50 metasearch standard, and listings of meta search engines and toolkits.
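The basic metasearch flow described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each back end is a hypothetical function that takes a query and returns a ranked list of hits, results are interleaved in rank order, and duplicates are dropped by URL. It does not speak Z39.50 or any real engine's protocol.

```python
from itertools import zip_longest

def metasearch(query, backends):
    """Send the query to every back end and interleave the ranked results."""
    result_lists = [backend(query) for backend in backends]
    merged, seen = [], set()
    # Take the first hit from each source, then the second from each, etc.
    for tier in zip_longest(*result_lists):
        for hit in tier:
            if hit is None or hit["url"] in seen:
                continue
            seen.add(hit["url"])
            merged.append(hit)
    return merged

# Two stand-in back ends with canned results, invented for this example.
def text_backend(query):
    return [{"url": "/docs/a.html", "source": "text"},
            {"url": "/docs/b.html", "source": "text"}]

def database_backend(query):
    return [{"url": "/docs/b.html", "source": "database"},
            {"url": "/catalog/42", "source": "database"}]

for hit in metasearch("widgets", [text_backend, database_backend]):
    print(hit["url"], hit["source"])
```

Real metasearch engines also have to translate query syntax for each source and reconcile relevance scores, which this sketch leaves out.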
Smart Logik (previously Brightstation and Muscat) originally started developing this code library as open source (Open Muscat or Omsee) as the next version of its high-performance text retrieval system. It uses a probabilistic relevance algorithm, providing an efficient and scalable library for indexing and searching data. As of the end of April, 2001, the company closed down the Omsee open source effort, so outside developers have started a SourceForge project to continue working on the code. As of July 1 2001, the name is changing to OmSearch to avoid any confusion.
NNGroup report on e-commerce search engines (late 2000)
This report analyzes searching on online store sites, focusing on user experience. It says very much the same things we've been saying about search forms, results pages and search failures. Includes some solid test data backing up the recommendations to use a search box, recognize synonyms, accept various operators and errors, show helpful results metadata, explain results, handle search failure, and perform extensive search log analysis. Well worth the $45 to download the PDF report: for links to other articles, see the SearchTools.com E-Commerce Search page.
Article: Robot Exclusion Standard (Robots.txt) Has Legal Value
No Bots Allowed! Interactive Week, April 12, 2001 by James C. Luh
Describes how eBay's court case against the auction aggregator Bidder's Edge was won based in part on eBay's use of robots.txt. eBay's lawyers likened the directives in their robots.txt file to a "no trespassing" sign, and say that the court agreed with them. Martijn Koster, developer of the standard, says he has mixed feelings about enforcement based on it -- that's one reason it's a convention and not a formal IETF or W3C standard.
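For a sense of how a well-behaved robot honors these directives, here is a small sketch using the `urllib.robotparser` module from the Python standard library. The robots.txt content, the robot names and the URLs are all invented for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everyone is kept out of /private/,
# and one named robot is excluded from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleAggregatorBot
Disallow: /
"""

def build_parser(text):
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

PARSER = build_parser(ROBOTS_TXT)

def may_fetch(agent, url):
    """True if the given robot name is allowed to request the URL."""
    return PARSER.can_fetch(agent, url)

print(may_fetch("ExampleAggregatorBot", "http://www.example.com/listings/"))
print(may_fetch("SomeOtherBot", "http://www.example.com/listings/"))
print(may_fetch("SomeOtherBot", "http://www.example.com/private/notes.html"))
```

Nothing in the file technically stops a robot that ignores it, which is why the legal weight given to it in this case matters.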
This search engine uses fuzzy matching extensively, to match terms misspelled either in the search query or the web pages. It has special features for indexing mailing lists, and provides speedy results on large collections, even on low-end servers.
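To illustrate the general idea of fuzzy matching (not this product's actual algorithm, which is not described here), the sketch below uses the similarity matching in Python's standard `difflib` module to map a misspelled query term onto words in an index vocabulary. The vocabulary and the 0.8 cutoff are arbitrary choices for the example.

```python
import difflib

# Hypothetical list of words known to the index.
VOCABULARY = ["search", "engine", "index", "spider"]

def fuzzy_lookup(term, vocabulary, cutoff=0.8):
    """Return up to three vocabulary words that closely resemble the term."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)

print(fuzzy_lookup("serch", VOCABULARY))
print(fuzzy_lookup("Engin", VOCABULARY))
```

Lowering the cutoff catches more misspellings but also returns more unrelated words, so it is worth tuning against real search logs.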
April 5, 2001
Usability Testing for Search Field Location
Michael Bernard reports on a formal usability test of various standard web page elements, including the search field. Results show that both novice and experienced web users expect the search field and button to be in the center at the top or bottom of the page, or at the upper right corner.
Inktomi Search Software version 4.1
New version of the Inktomi site and enterprise search engine (formerly Ultraseek) now indexes content databases using ODBC on Windows NT/2000 and direct Oracle access on Unix. It also includes support for Korean, per-language synonym lists, XML attribute searching and automatic title generation for WML pages.
mnoGoSearch for Windows
Windows beta version of the mnoGoSearch software includes a graphical interface for indexing.
Swiss search engine Alkaline now has custom indexing metatags, numeric searches, multiple services on Windows NT, and a bare-bones Perl API.
The remote search service FusionBot now indexes PDF files, password-protected areas and can use the HTTPS protocol.
HomepageSearchEngine version 3.3
HomepageSearchEngine now highlights term matches in found pages when clicking on results, and can use an HTML "template" page with SSI or PHP layout commands.
AltaVista Enterprise Search Seminars
AltaVista Search seminar on how search works for large institutions, with a presentation by Andrei Broder, AltaVista Chief Scientist, and other search experts. Dates and Locations: April 17 in Boston; April 18 in New York City; April 19 in Washington, DC; April 24 in Santa Clara (SF Bay Area); April 25 in Los Angeles.
Omsee - new name for Open Muscat
Omsee is an open-source probabilistic relevance engine, designed as an efficient and scalable library for indexing and searching data.
dtSearch version 6
This version has a robot indexing spider, Unicode and XML hierarchy support, and indexing update scheduling.
Search For Success: Internet Week Article
An expert in customer service points out that fixing the site search may be much more cost-effective than complex CRM solutions and provides specific suggestions.
Namazu Japanese Search Engine
Namazu is a free open-source Japanese search engine for Unix and Windows, which seems to be written in Perl. It only indexes local files, with no robot crawling, but it looks very nice from here.
Microsoft SharePoint (formerly Tahoe)
SharePoint is an Enterprise Information Portal, with a search engine which can index and search text. It's compatible with Exchange Public Folders, file servers, Web sites and Lotus Notes databases. It uses a variant of SQL for queries, and probabilistic relevance ranking in results, with a "best bets" feature emphasizing frequently-linked documents.
RDF querying using Squish
Squish is an experimental query engine in Java which accepts SQL-style queries and searches through RDF documents.
Survey Results: Online Stores Need Good Search Engines
Not All Site Features Turn Online Shoppers Into Buyers PricewaterhouseCoopers, March 6, 2001
A survey of 547 Internet users in January of this year found that over three-quarters of the respondents use search features (77%).
Search functionality is considered the most important feature for online shopping by 43%, beating product information (40%), when choosing where to shop: both features led customer service, personalization and wish lists in selecting sites. When deciding what to buy, search functions also play an important role, although enlarged product images, availability and comparison guides are more directly involved.
All this supports our proposition that e-commerce sites should concentrate on providing excellent search results rather than expensive and complex interactive features.
New and Updated Search Tools
- LexiQuest provides natural-language analysis and cross-language searching, along with automatic classification and categorization.
Peer To Peer Search
Sun has just bought InfraSearch, temporarily known as GoneSilent, a distributed search engine started by one of the founders of the Gnutella P2P protocol. According to Sun's Press Release, this technology will be incorporated into software developed by Project Juxtapose, their peer-to-peer research incubator.
Watch this space for more news and analysis on peer-to-peer searching.
Updated Search Tools
- MondoSearch has a new version 4.1, most notably compatible with Adobe Acrobat (PDF) and Microsoft Office file formats.
- Searchbutton now has "SearchNames", allowing search administrators to display specific pages in response to certain queries. For example, a commerce site with a main page for "perfumes" might skip the results listings and take its customers directly to that page when they type the word into the search box.
- Inktomi Hosted Portal Search Engine improves crawling, indexing and relevance.
- SiteMiner no longer offers a free remote search service, which we assume is a result of the downturn in web advertising.
- Search Engine Meeting 2001 - the Infonortics conference tends to draw a nice variety of search engine developers, researchers, and search administrators. It's the major conference in the field. Avi Rappoport of SearchTools will be speaking about new developments in site and enterprise search engines.
- InfoToday 2001 (NationalOnline), May 15-17 in New York City has a whole track on search engines, searching and electronic publishing. It will include information on metadata, automatic classification, portals, and vertical search engines.
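The query-to-page shortcut described in the Searchbutton item above can be pictured as a lookup that runs before the normal search. This is a generic sketch of the idea, not Searchbutton's implementation; the table contents, page path and function names are hypothetical.

```python
# Hypothetical table an administrator might maintain:
# normalized query text -> page to show instead of a results list.
DIRECT_PAGES = {
    "perfumes": "/departments/perfumes.html",
}

def normalize(query):
    """Lowercase the query and collapse extra whitespace."""
    return " ".join(query.lower().split())

def route_query(query, run_search):
    """Return a redirect for configured queries, else normal search results."""
    target = DIRECT_PAGES.get(normalize(query))
    if target is not None:
        return ("redirect", target)
    return ("results", run_search(query))

# run_search stands in for whatever function performs the real search.
print(route_query("  Perfumes ", lambda q: []))
print(route_query("soap", lambda q: ["/products/soap.html"]))
```

Search logs are the natural source for deciding which queries deserve an entry in such a table.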
Revving Up the Search Engines to Keep the E-Aisles Clear New York Times, February 28, 2001 by Lisa Guernsey (registration may be required to read this article)
Discusses the difficulty of locating items in online stores, referring to the Forrester report of last spring. Describes the use of thesaurus tools for synonym searching and taking advantage of database structure in online stores. Quotes the vendors Mercado, which provides search for WebVan and Tower Records, and EasyAsk, as well as the chief scientist at Verity.
New Search Tools
- EasyAsk - sophisticated metasearch engine can query both SQL databases and text search engines. Recognizes natural-language and keyword searches, database categories and term synonyms.
XML Query Standards Progress
The World Wide Web Consortium XML Query Working Group has released a new version of the requirements for building a standard XML query language, as of February 16. This describes how the standard XML query language should work, with general goals, usage scenarios, terminology, data model information, functionality, and how the language should fit with the other XML standards. They've also posted XQuery: A Query Language for XML which is an implementation of a query language based on the requirements.
The discussion on the XML Query Public Mailing List has been brisk and thoughtful, indicating it is a good step towards a standard query language but there are some aspects of the current requirements document which will need revision. This is a public mailing list for interested parties; to join, just send a "subscribe" message to firstname.lastname@example.org, and see the archives for past discussions.
For more information, see our report on Searching XML and our list of XML Searching Resources.
New Search Tools
- ASPSeek - free open-source search engine written in C++, can handle multiple character sets and languages, includes phrase searching, optional stemming and easy customization.
New Search Tools
- Darwin SET - search toolkit for portals includes categorization (including e-commerce detection), caching for quick retrieval, dynamic weighting of search results based on click throughs and use patterns.
- Lucene - free open-source search toolkit in Java is designed for large indexes, flexible data sources, fast searching, field and date-range searching.
- Intelliseek Enterprise Search Server - a metasearch engine designed to aggregate search results from many sources, including Oracle, Lotus Notes, news feeds and more.
- HomePageSearchEngine - simple search engine for sites, can search on the fly or store text in a searchable index. Available in 20 languages including Chinese and Arabic.
Obsolete and Discontinued Search Tools
New Search Studies
- "Sex" Popular on the Web, Many People Inefficient at Reaching Their Online Destinations (content removed, see summary) Alexa Research, February 14, 2001
To no one's surprise, this comprehensive study of 10 web search engines over two years finds that the single most common search is for the word "sex" (0.3289%), with "porn", "nudes" and "xxx" also among the common searches. More interestingly, web users get confused by the search field: it is quite common to find them typing in URLs, such as "hotmail", "yahoo" and "ebay". We have found both of these common problems to be true on site search engines as well, and recommend designing No-Matches Pages with this problem in mind.
- Roper Study on "Web Rage" while Searching Roper Starch Surveys, December 18, 2000 (removed from the Roper site: see the ZDNet article on the study).
Commissioned by the search engine WebTop, this reports the results of a study of people doing general web searches saying:
On average, it takes 12 minutes of searching the Web for specific information before Internet users get frustrated. Almost one in five Internet users get frustrated if their search takes them up to five minutes to complete. About half of Internet users have the patience to search the Web for longer than 15 minutes.
Danny Sullivan of SearchEngineWatch has termed this "search rage", based on the statistic that a total of 71% of people reported being frustrated at some time during searching, and over half get frustrated by irrelevant information in search results. However, more than 75% usually or always find what they're looking for.
SearchTools Advice For Search Engines
In response to both the WebTop and Alexa surveys, we believe that better interfaces, especially in search results, will alleviate some of this pain. SearchTools advice to search administrators is:
- search index should be both complete and current
- emphasize the most content-rich pages, such as product descriptions and FAQ answers
- display helpful context (titles, categories and meta description summaries)
- show word matches in the results listings (hit highlighting)
- provide help with search failure by designing useful No-Matches Pages
- perform regular reviews of search logs and reports to discover what people search for
- use search log data to have the search engine respond well to all reasonable searches.
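The log-review advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log is assumed to be available as (query, result count) pairs, and the function name `summarize_search_log` and the sample data are hypothetical, not taken from any particular search engine's log format.

```python
from collections import Counter


def summarize_search_log(entries, top_n=3):
    """Return (most common queries, most common no-match queries).

    `entries` is an iterable of (query, result_count) pairs; this input
    shape is an assumption for the sketch, not a real log format.
    """
    all_queries = Counter()
    failed_queries = Counter()
    for query, result_count in entries:
        # Fold case and collapse whitespace so trivial variants count together.
        normalized = " ".join(query.lower().split())
        if not normalized:
            continue
        all_queries[normalized] += 1
        if result_count == 0:
            failed_queries[normalized] += 1
    return all_queries.most_common(top_n), failed_queries.most_common(top_n)


# Hypothetical sample data for illustration only.
sample_log = [
    ("Hotmail", 0),
    ("printer drivers", 12),
    ("hotmail", 0),
    ("Printer  Drivers", 9),
    ("returns policy", 0),
    ("printer drivers", 11),
]

top, failed = summarize_search_log(sample_log)
print("Most common searches:", top)
print("Most common no-match searches:", failed)
```

The no-match list is the one to review first: frequent failed searches suggest either content missing from the index or terms that the No-Matches Page should handle explicitly.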
New Search Tools and Services
- ReachSite works with the ReachCast content management system to automatically index data from various sources, including Oracle document databases.
- New Information Foraging Theory and "the scent of information" links to the User Experience page.
- Search conferences page now up to date.
- New separate pages for Natural-Language Processing and Cross-Language Information Retrieval.
New Search Tools and Services
- Spiderline remote search service has configurable results pages and an interesting pricing model, 1 penny per search with a small indexing fee. This should be very useful for startup topical portals, which have low traffic but want to offer rich search indexes.
- iPhrase performs complex metasearches and natural-language searching.
- ic-find is a sophisticated search engine within a full-featured eBusiness portal from the German company 7d.
- ZNOW performs both indexing and automatic classification, so search results get put into categories based on the text of the pages.
- Vivisimo performs clustering and classification on search results from other engines.
Updated Search Tools
- The free open-source ISearch engine now has a Windows version. It also has a simplified version created by Bryn Dole of DMOZ, based on his changes to provide search for the ODP (Open Directory Project). It is faster, smaller, simpler, and indexes individual Chinese and Japanese characters in UTF-8.
- Atomz Search provides simple integration with Apple's Sherlock metasearch client on Mac OS. While the company is adding content-management products, it continues to develop its remote search services.
- Isys:web indexes many file formats and databases, and will even convert them to HTML for viewing search results. Also shows the search word matches within the pages.
Discontinued Search Tools
- Folio SiteDirector has been discontinued by its purchaser, LivePublish NextPage, but the technology lives on in NextPage Search.
NQL (Network Query Language)
Powerful scripting language automates intelligent agent information transport for web site indexing, metasearching and accessing databases, email stores and more.
For earlier news, see the 2000, 1999 and 1998 news archive pages.