As of January, 2012, this site is no longer being updated, due to work and health issues

Search Tools

SearchTools News


See the News page for more recent information



SearchTools News - December 31, 1998

Remote Search Hosting Services

Remote search services crawl your site and store the information in an index on their server. When someone enters a query in the search form on your site, the form submits it to the search engine host. The host receives the query, does the lookup in the index, formats the results, and sends them back as an HTML page with links directly to the pages on your site.
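As a rough illustration of that round trip, here is a minimal Python sketch of what a remote host does with a query. The page data, the example.com URLs and the function names are all made up for illustration; real services use far more sophisticated indexes and ranking.

```python
from html import escape

# Hypothetical pages, standing in for what the remote crawler collected.
PAGES = {
    "http://www.example.com/index.html": "Welcome to our site about search tools",
    "http://www.example.com/faq.html": "Frequently asked questions about remote search",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs that contain it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Return the URLs containing every query word (a simple AND search)."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

def format_results(urls):
    """Render the hits as an HTML list linking directly to the original pages."""
    items = "".join(f'<li><a href="{escape(u)}">{escape(u)}</a></li>' for u in urls)
    return f"<ul>{items}</ul>"

INDEX = build_index(PAGES)
```

Calling format_results(search(INDEX, "remote search")) would produce the HTML fragment that the host sends back to the visitor's browser.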

Some services provide simple, straightforward searches, while others offer powerful advanced search functions such as proximity operators (NEAR) and date-range searching. Some are free and supported by advertising, others charge by the page, by the search or monthly.
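To make the proximity idea concrete, here is a small Python sketch of a NEAR test. The five-word window and the function name are assumptions chosen for illustration; each search engine defines its own distance and syntax.

```python
def near(text, term_a, term_b, max_distance=5):
    """Return True if term_a and term_b occur within max_distance words of each other."""
    words = text.lower().split()
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    # Any pair of occurrences close enough together satisfies the operator.
    return any(abs(a - b) <= max_distance for a in positions_a for b in positions_b)
```

A page matches "remote NEAR site" only when the two words appear close together, which tends to be a better sign of relevance than both words simply appearing somewhere on the page.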

Remote search services do not require any programming or local access to the web server. They act like standard robot spiders, following links on your site, rather than using the local file system. For hosted sites with limited server access, these services are an excellent option.
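The link-following step can be sketched with Python's standard library. This is only an illustration of how a spider might discover pages on one site; the class and function names are hypothetical, and a real robot would also fetch the pages, honor robots.txt and avoid revisiting URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect the absolute URLs of all <a href="..."> links on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))

def same_site_links(base_url, html_text):
    """Return only the links that stay on the same host as base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    host = urlparse(base_url).netloc
    return [u for u in collector.links if urlparse(u).netloc == host]
```

Because the spider sees only what a browser would see, pages that are not linked from anywhere on the site will not be found this way.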

We expect to have a full comparative report on remote search hosting services in February; in the meantime, you can try them out on our search test page.

Source Libraries for Search Code

Many of you may be interested in adding searching and text retrieval to your own applications. We've started a listing page for search source code, beginning with the excellent Findex package from LexTech and the dtSearch source package. You can also look at the open source search applications (listed on the same page), but be sure to check the license terms before including the code in your own programs.

New & Updated Site Search Tools Reports

BayCHI Presentation

The ACM's Computer-Human Interface group (SIGCHI) in the SF Bay Area will sponsor a presentation on Interface for Web Site Search and Navigation on January 12, 1999 at Xerox PARC. SearchTools' Avi Rappoport will be discussing the most important issues in designing a web site search interface, as well as architectural and site navigation issues. For more information, see BayCHI.org.

SearchTools News - December 17, 1998

Knowledge Management Stocks Go Wild

Verity has reported a third quarter of profits, and Open Text is doing a deal with Nortel after failing to take over PC Docs/Fulcrum, which is partnering with ChiliSoft to integrate Chili!ASP and CyberDOCS. All of their stocks are up (today at least).

SearchTools News - December 16, 1998

New Survey!

Take the Search Tools survey and help us get a picture of your web site -- and why you are (or are not) installing site search tools. Intranet sites are welcome too. It's a chance to explain what you have found that works and what doesn't. We'll write up a big article in February about the results, and will send you a copy of the report directly if you so choose.

SearchTools News - November 29, 1998

Overview Articles

Add search to your site CNET Builder.com, November 17, 1998 by Avi Rappoport
A comprehensive article by our own Avi Rappoport, covering background information, choosing the right site search engine for your site, testing and installing the software, designing the interface, and setting up a maintenance program. Includes many links to sites and products, summaries of some of the most attractive programs and an example of the installation process using the DNA Files web site.

Other new articles include Inter@ctive Week's Perfecting Corporate Search Engines, useful tips from chami.com, and A review of robot based internet search services (1996).

We recently separated the Overviews page into individual pages for Books and Articles, Links, Newsgroups & Mailing Lists, and Training and hope you find the organization useful.

Cross-Language Searching

As more sites present non-English text, site search engines must index and search these pages effectively. This ranges from handling extended characters (such as those in cîté and søk), through non-Roman character sets, to searching data in languages different from that of the original query. For links to resources, see the new Cross-Language Information Retrieval section.
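One common way to handle extended characters is to fold them to plain letters at both index time and query time, so that a search for "cite" also finds "cîté". The Python sketch below shows the idea under some assumptions: the function name and the small mapping table are illustrative, and folding ø to o is a choice that suits some audiences but not Scandinavian ones, where they are distinct letters.

```python
import unicodedata

# Characters with no Unicode decomposition need an explicit mapping.
# This table is illustrative, not complete.
EXTRA_FOLDS = str.maketrans({"ø": "o", "Ø": "O"})

def fold(text):
    """Lowercase text and strip accents so accented and plain forms match."""
    decomposed = unicodedata.normalize("NFKD", text.translate(EXTRA_FOLDS))
    # After NFKD, an accented letter becomes a base letter plus combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()
```

Applying fold() to every indexed word and to every query term makes the match accent-insensitive. It does nothing for non-Roman scripts or cross-language queries, which need quite different techniques.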

Metasearch

Metasearch is the process of accessing multiple search engines at once and presenting the results organized in a useful way. In addition to metasearch web sites and client applications, some site search programs can provide this service. See the new Metasearch section in the webwide search page for details.
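The organizing step can be as simple as interleaving each engine's ranked list and dropping duplicates. The Python sketch below shows that approach; the function name is hypothetical, and real metasearch tools may instead re-rank by score or group results by source.

```python
from itertools import zip_longest

def merge_results(*ranked_lists):
    """Interleave ranked URL lists round-robin, dropping duplicate URLs."""
    merged, seen = [], set()
    # Each tier holds the Nth result from every engine (None where a list ran out).
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

Round-robin merging treats every engine's ranking as equally trustworthy, which is an assumption: relevance scores from different engines are generally not comparable, so simple interleaving is often a safer default than sorting by score.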

Surveys

Results of an Intranet Journal survey and an ht://Dig survey are now on the Surveys page.

We will be doing our own survey shortly -- watch for a special announcement!

New Site Search Tools

Conferences

The Infonortics Search Engines and Beyond conference for 1999 will concentrate on "Developing efficient knowledge management systems". Industry speakers include Ramana Rao of Inxight, Danny Sullivan of SearchEngineWatch, Rick Kenny of Fulcrum, John Snyder of Muscat, Mark Krellenstein of Northern Light, Dan Miller of Ask Jeeves, and Ellen Voorhees of NIST/TREC. There will also be a number of academic and research speakers.

Y2K and Site Search Tools

Most search engines will not have any trouble in the year 2000: they will not fail because, in most cases, they do not depend on date comparisons.

The most serious likely problem area is index updating. Some automated indexers compare the last update date and time to the current date and time to decide if they should run again. These may have trouble in the year 2000, if the programmers did not store the date as a four-digit number. In that case, the indexer could get confused every time it encounters a file modification date seemingly in the future (for example, the program thinks it's the year 1900 but the file was modified in 1998). The discrepancy could cause the indexer to re-index rather than updating, which could significantly affect server performance.
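The following Python sketch shows how that confusion can arise. Both functions are hypothetical stand-ins for logic an indexer might contain; they are not taken from any actual product.

```python
def expand_year_naive(two_digit_year):
    """Buggy expansion: always assumes the century is 19xx."""
    return 1900 + two_digit_year

def looks_like_future_file(file_year, current_year):
    """Hypothetical check: is the file dated later than the indexer's idea of 'now'?"""
    return file_year > current_year

# In 2000 the stored two-digit year is 00, which expands to 1900.
clock_year = expand_year_naive(0)
confused = looks_like_future_file(1998, clock_year)   # True: 1998 > 1900
correct = looks_like_future_file(1998, 2000)          # False with a four-digit year
```

If the indexer responds to a "future" file by rebuilding the whole index, every file on the site would trigger that path once the clock rolls over.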

Other possible problem areas include administration and search log code, which may also have difficulty if features depend on two-digit years. In addition, searching on date ranges will not work if the file modification dates are stored in the index with two-digit years.
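The date-range problem is easy to reproduce. In this Python sketch, dates are plain (year, month, day) tuples and the function name is made up for illustration; the point is only that a range spanning the end of 1999 breaks when years are stored as two digits.

```python
def in_range(date, start, end):
    """Return True if date falls between start and end, inclusive."""
    return start <= date <= end

# Search range: December 1, 1999 through January 31, 2000.
# File modified: January 15, 2000.
two_digit = in_range((0, 1, 15), (99, 12, 1), (0, 1, 31))           # False: 00 sorts before 99
four_digit = in_range((2000, 1, 15), (1999, 12, 1), (2000, 1, 31))  # True
```

A quick test along these lines, run against your own search engine with a file dated in 2000, is one way to conduct the checks suggested here.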

You should check the documentation, code (if available), the developer's web site, support mailing list or newsgroup first, but if this issue is not covered, conduct your own tests or contact the company before the end of 1999.

SearchTools News - November 13, 1998

Ultraseek to Index and Search XML

According to a story in Wired News, Infoseek's site search tool, Ultraseek version 3.0, will support searching XML. The product is due for release next Tuesday (November 17, 1998). Tim Bray, co-editor of the W3C XML standard, welcomed the news cautiously, warning that implementing high-volume and high-performance search of structured text is extremely difficult.

SearchTools News - October 26, 1998

Site Search Panel at Builder.Com Live - December 7-9, 1998, New Orleans

The panel will consist of Avi Rappoport of Search Tools Consulting; Louis Rosenfeld, author of Information Architecture for the World Wide Web, an excellent book on site design and information architecture (including searching); and Jakob Nielsen of the Nielsen Norman Group and the UseIt web site. We'll be covering various aspects of choosing, installing and improving site searching, and hope to see you there.

MondoSearch (new search tools)

New product offers automatic categorization (Yahoo-style directories), frame-handling, indexing of pages generated dynamically from databases and other sources, audio and video search, multilanguage filtering, and more.

PicoSearch and NetCreations PinPoint (new Remote Indexers)

New remote index and search engines index your site and store the results on their server. When a site visitor enters their search word or phrase and presses the search button, the remote server application receives the data, performs the search and returns the results. Try them out on our search page.

New Versions of Search Tools

Thunderstone Webinator version 2.5

New version includes optional metasearch: searching on multiple webwide search engines. Webinator is also available as a free remote indexer -- we have an example running.

Ultraseek - New Version 2.1, User Group Meeting

Ultraseek version 2.1 includes speed improvements, date range searching, indexing of documents on SSL (https) servers, indexing of newsgroups, XML tag support, distributed indexing and robot spider cooperation, more language support (Swedish, Norwegian and Danish added to French, German, Dutch, Spanish, Italian, Portuguese, Japanese and English), among a number of other features and bug fixes.

[The user group meeting in New Orleans has been postponed]

Phantom - New Version Announced

Maxum has recently announced the 2.2 version of the Phantom search tool, which will have PDF indexing, meta tag indexing, more results customization, and several nice administration features.

Quadralay WebWorks Search

WebWorks Search version 2.0.7 is available for download on Windows 95/98/NT. This update release adds support for IIS 4.0.

Site Search Installation Example: US Department of Education, Cross-Site Indexing Project 1997

Another good example of the process of choosing and installing a site search tool, in this case covering several Education Department sites. The group set up a requirements document, and tested Netscape Catalog (later replaced by Compass Server), InQuery, Verity Search '97 and Ultraseek, which they ultimately chose.

New Book: Web Developer.com Guide to Search Engines

A wide-ranging book covering everything from the beginnings of the robot spiders crawling and indexing the web to analysis of the major webwide search engines to detailed information on installing and configuring six local site search tools. The programs covered are AltaVista Search Intranet, Excite for Web Servers, Harvest, ht://Dig, Phantom and Ultraseek. It also describes BDDBot, an ongoing collaborative project to create a Java web server and search spider, released as open source under the GNU General Public License. Use the following links to buy from Amazon or Computer Literacy and you'll support this site.

New Book: Web Navigation: Designing the User Experience

New book on designing web site navigation, from the simple to the complex enterprise site. Use these links to buy at Amazon or Computer Literacy and support this site.

dc:DC - The Sixth Dublin Core Metadata Workshop

The Dublin Core is a simple set of metadata for describing web pages, such as the copyright information, subject, language and so on. The workshop will focus on practical implementation and interoperability. The meeting is in Washington DC, November 2-4, 1998.
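For readers who have not seen Dublin Core in practice, here is a small Python sketch that generates HTML meta tags for a page. The "DC." name prefix follows a common convention for embedding Dublin Core in HTML, but the exact element names and the sample values are assumptions for illustration; check the Dublin Core documentation for the current element set.

```python
from html import escape

def dublin_core_meta(fields):
    """Build one <meta> tag per Dublin Core element, in the order given."""
    return "\n".join(
        f'<meta name="DC.{escape(name)}" content="{escape(value)}">'
        for name, value in fields.items()
    )

tags = dublin_core_meta({
    "Subject": "site search tools",
    "Language": "en",
    "Rights": "Copyright 1998 Example Corp.",
})
```

A search indexer that recognizes these tags can use them for fielded searching, for example limiting a query to pages with a given subject or language.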

See the Conferences page for other meeting information.

SearchTools News - September 14, 1998

Search Administrators Information

Search Tool Product Information

SearchTools News - August 29, 1998

InQuizit Product Report

New product promises true natural-language processing for site searching. A test version should be up at their web site shortly.

Domino Extended Search Product Report

Allows Domino servers to provide search access to multiple Notes, ODBC and webwide search databases simultaneously.

Migration Path from PLS to Thunderstone Texis

Thunderstone Software will provide price discounts and support for customers who want to migrate from PLS to the Texis integrated relational database and search system. As reported here, PLS was bought out by America Online and the products are now freeware.

XML Query Language Proposal

The XML-QL (Query Language for XML) proposal describes a query language that treats an XML file much like a database queried with SQL, rather than as a free-text document. The focus of this proposal is EDI (Electronic Data Interchange) data, as opposed to a library or information retrieval approach. It provides examples that use specific data and element patterns and construct new results listings.

RDF Revision Posted

Conferences

Extended listings of information retrieval and related conferences are on a new page.

Thunderstone Webinator Remote Search

An alternate search option for this site, provided for us by Thunderstone. You can compare it with the Phantom and SearchButton search engines, and more options will come soon.

SearchTools News - August 10, 1998

XML Search Tool Announced

The BUS (Bottom-Up Scheme) search engine indexes XML and SGML text and recognizes document hierarchies and structure.

Search Tools Product Information

New Articles and Links

SearchTools News - July 26, 1998

Web Server 4D Site-Search Product Report

New SearchButton Search

An alternate index and search option for this site, provided for us by New Idea Engineering. You can compare it with the Phantom search engine, and more options will come soon.

SearchTools News - July 20, 1998

Verity Corporate News

A columnist at The Street.com, Herb Greenberg, had an article about Verity in the premium (paid) section of thestreet.com dated July 15, 1998, which was partially reprinted in the San Francisco Chronicle.

Infoseek's Java Search Engine Project

Infoseek is planning to create a very configurable Java search tool with source code included. The idea is to provide it as an add-on to databases such as Oracle, to browsers, email programs and the desktop. Due for developer release in July, final release in December (page apparently last updated in May, 1998). Other Java Search Tools are already available.

Product Updates

Articles

SearchTools News - July 11, 1998

Searching PDF (Adobe Acrobat) Files

Important information on serving PDF files, including weaknesses in online interaction, considerations for searching, and search tools which can index and search the format.

Web Admin's Guide to Site Search Tools Updated

SOIF and RDM

The Summary Object Interchange Format and Resource Description Message mechanism are designed to allow indexes to work together and update as needed, rather than forcing search indexers to re-crawl each site redundantly. This can improve site searching of multiple very large sites.

Search Tools Product Updates

SearchTools News - June 25, 1998

Search Notes from Web Design '98 in San Francisco

SearchTools News - June 17, 1998

PLWeb and all PLS products are now freeware from AOL, but no custom development, training or support is provided. This is the executable object code, rather than the source code, although the Perl scripts can be modified. The license allows royalty-free use, though you must include a "Powered by PLS" notice with a link to AOL.

SearchTools News - June 12, 1998

This site was the Cool Tool of the Day yesterday, and we're happy to welcome everyone who came from the cooltool site. The review indicates we're doing things right, and warmed the cockles of our hearts.

We have some articles forthcoming in Net Professional Magazine, including one on Web Site Search Tools for Mac WebServers.

SearchTools News - June 10, 1998

Lycos, Inc. has announced that they have received a patent on spider technology. They are claiming exclusive rights to "automated software robots which index the Web and collect targeted information from millions of different Web sites around the world...". The announcement stresses the patent's importance in recognizing Lycos as a Web pioneer, but contains no information on whether Lycos will attempt to enforce exclusive rights or require licensing from the many other webwide search engines which use this technique. Reuters reports that Lycos previously said it would defend patent rights aggressively. Danny Sullivan of SearchEngineWatch thinks that other search crawlers use sufficiently different technology that they would not infringe on the Lycos patent.

Added a Thunderstone Webinator page.

SearchTools News - June 3, 1998

Created product lists by platform: Java, Mac, Unix, and Windows, in addition to the alphabetical listing.

More links for intranets and knowledge management.

SearchTools News - May 31, 1998

New Host: www.searchtools.com is up!

New Search: site is now indexed and searched by the Phantom engine!

Ultraseek wins 1998 Network Computing Well-Connected Award for Intranet Search Engine

The editors were particularly impressed with Ultraseek's natural language interface, administration and search results.

Good Computerworld Articles on Site Search

Inktomi is selling their engine as a site search tool, but I can't find any articles or reviews.

New Metadata page.

More information on What is a Web Site Search Tool? (including sites which do not need them).

More information on Netscape Compass Server.

SearchTools News - May 4, 1998

Found some new site search tools

Cybotics
Java search engine that works on the server side using the Java Servlet API; it looked very good at first glance.
ht://Dig
Unix freeware with source code.
Inference Find
Article describes how the engine is used.
WebWorks Search (Quadralay)
Provides automation for topical catalogs, which can be created by authors and content editors.

Information Architecture

I've been reading the Information Architecture book written by the Argus folks and it's great -- I wish I'd written it myself. It covers more than just site search tools: it goes back to the basics of site design, scalability, navigation, coherence, maintenance and so on. It makes you think about the web site as a whole, rather than as separate pages or even sections. Highly recommended! You can buy it online.

New Idea Engineering

Company provides consulting and training on Verity and other search tools, including the Guerilla Verity Class.

Site Changes

Rearranged the tools pages so all tools with actual information get their own pages, also added platform compatibility icons to the Tools page and added a Related Topics section.

SearchTools News - April 24, 1998

Notes on Search Issues from the Web.Builder conference in San Francisco, April 14-16, 1998.

The Scent of Information
Jared Spool of User Interface Engineering reported on a study they conducted last year about locating data on web sites. They took a range of people, from non-computer users to experts, and had them look up data on certain web sites, such as C|Net and Car Talk. The subjects had several questions to answer with a limited time for each site, but knew the data was there somewhere. Among other navigation problems, the UIE group discovered some serious difficulties interpreting the results of site searches, especially cryptic results lists such as coin ID numbers, and the desire of users to incrementally improve searches by whittling down the results, rather than reissuing a search.

Cataloging Web Sites
Netscape's Information Architect, Ira Kleinberg, gave a presentation on his work in setting up a context and meta-structure for their web site, using the Dublin Core tags as a base. He recommends cataloging the sites to improve navigation, version control, ownership rights and reusability of information.

XML, DHTML, DOMs, CSS, and other acronyms.
XML was a constant topic at the conference, and Brian Travis of XMLU gave a good presentation about it.

RDF (Resource Description Framework) was very big. Based on the MCF work done at Apple, RDF provides metadata: information about information. It will allow better navigation within sites and let agents exchange data between sites. Netscape/Mozilla is experimenting with automatic site maps and improved bookmarks using this format, and it's a proposed W3C standard.


Navigation and Visualization
Earl Rennison from Perspecta Systems showed some interesting interfaces providing feedback and context in site navigation/search. For an example, see AllTheNews.

Note: both domains had disappeared by September 1998.

Alexa
Brewster Kahle showed his most recent version of the Alexa search toolbar. While it's not a site search tool, it's really cool. There are several hundred thousand users and their data tracks are providing helpful feedback to others. The Mac version is in test.

 


For more news, see the Current News Page and the 1999 News Archive.