
As of January, 2012, this site is no longer updated, due to work and health issues.

Guide to Search Tools

Federated Search Systems - History and Development

see also: the Federated Search report

In the early history of the Web, when bandwidth and disk space were much more expensive, the SOIF and RDM federated indexing technologies were designed so that local servers could gather and index data and then pass it on to search servers. This allowed indexes to work together and update as needed, rather than forcing each search indexer to crawl each site separately.

Z39.50

Z39.50 is a standard developed for library and other bibliographic databases. It provides a common interface for federated search across a multitude of database formats. It has a standard messaging and wire protocol, which predates HTTP and is much more complex, with an elaborate session interaction system. Unlike the stateless HTTP protocol, Z39.50 has no tools for dealing with unavailable servers, and a search will not return until the slowest server replies. It assumes "shared content semantic knowledge" oriented around library collections. Although the basic functionality is available (send a query, get a result set, get a record from the result set), implementers found some important elements undefined and proceeded to use their own interpretations, which broke interoperability. In addition, the results were not necessarily readable and often required significant post-processing.
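The cost of waiting for the slowest server can be illustrated with a small sketch. This is not Z39.50 itself. It is a generic parallel fan-out in Python, and the names search_source and federated_search, along with the simulated delays, are hypothetical stand-ins for real catalogue connections. It shows one common remedy, a deadline after which late sources are reported rather than waited for.

```python
import concurrent.futures as cf
import time


def search_source(name, delay, query):
    """Hypothetical stand-in for one remote catalogue; sleeps to simulate latency."""
    time.sleep(delay)
    return [f"{name}: record matching {query!r}"]


def federated_search(sources, query, timeout):
    """Query every source in parallel and keep whatever arrives before the deadline.

    sources maps a source name to its simulated delay in seconds (an assumption
    made for this sketch; a real client would hold connection details instead).
    """
    pool = cf.ThreadPoolExecutor(max_workers=len(sources))
    futures = {
        pool.submit(search_source, name, delay, query): name
        for name, delay in sources.items()
    }
    # Wait only until the deadline, not until the slowest source replies.
    done, not_done = cf.wait(futures, timeout=timeout)
    results = []
    for fut in done:
        results.extend(fut.result())
    late = [futures[fut] for fut in not_done]
    pool.shutdown(wait=False)  # return now; slow workers finish in the background
    return sorted(results), sorted(late)


if __name__ == "__main__":
    found, late = federated_search({"fast": 0.0, "slow": 0.6}, "whales", timeout=0.2)
    print(found)
    print("no reply in time from:", late)
```

Without the timeout argument, the call to wait would block until every source had answered, which is the behavior described above.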

Z39.50 References

Z39.50 standard

For an in-depth explanation, see Z39.50: An Overview of Development and the Future.

United States Library of Congress listing of Z39.50 software

ZNG Initiative: Z39.50 Next Generation - new web service based on Z39.50 and the web technologies HTTP, XML, URI and SOAP/RPC.

Z39.50 Made Simple

Z39.50 Examples

Z39.50 Search Tools

WAIS

HARVEST

STARTS

1998 multilingual federated search article

JXTA - Sun's Federated Search and P2P Protocols

JXTA is Sun's definition for peer-to-peer communications. It includes standard communications protocols for queries and for distributing queries based on source coverage. The sources advertise their contents with metadata, and peers can choose where to send the search query. Search indexes can be centralized, decentralized, or mixed. The search engines use their own internal algorithms for retrieval and relevance, but the search response

See the SearchTools Coverage of JXTA Search for current information.
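The idea of sources advertising their coverage so that a peer can decide where to send a query might be sketched as follows. This does not reproduce JXTA's actual message formats. The Advertisement class, its topics field, and route_query are hypothetical names chosen for illustration, and the matching rule (any shared term) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """Hypothetical metadata a source publishes about what it covers."""
    source: str
    topics: frozenset


def route_query(query, advertisements):
    """Return the sources whose advertised topics share a term with the query."""
    terms = set(query.lower().split())
    return [ad.source for ad in advertisements if terms & ad.topics]


ads = [
    Advertisement("recipes", frozenset({"cooking", "food"})),
    Advertisement("cars", frozenset({"auto", "engine"})),
]
print(route_query("engine repair", ads))
```

Each selected source would then run the query with its own retrieval and relevance algorithms, as the paragraph above notes.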

XQuery Version 1

 

"Best Sources" Problem

  • Hundreds or thousands of sources
  • Can't query them all, performance problems
  • Try to get users to choose domains
  • Ray Larson, UC Berkeley
    • Perform scanning of databases to list contents
    • Choose based on the best matches
  • Intelliseek
    • Analyze query for context
    • Find best sources for query type
    • Choose best-performing sources based on responsiveness & quality
  • Jon Kleinberg - social networks
    • Finding experts
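
One way to picture the source selection approaches listed above is a simple scoring function that combines how well a source matches the query with its responsiveness and quality. This is a minimal sketch, not the method used by any of the projects named. The Source fields, the weights, and the speed formula are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    topics: set          # terms the source is known to cover (e.g. from scanning it)
    avg_seconds: float   # observed average response time
    quality: float       # 0.0 to 1.0, a hypothetical editorial or measured rating


def score(source, query_terms, w_match=0.6, w_speed=0.2, w_quality=0.2):
    """Weighted mix of topic match, responsiveness, and quality (weights assumed)."""
    if query_terms:
        match = len(query_terms & source.topics) / len(query_terms)
    else:
        match = 0.0
    speed = 1.0 / (1.0 + source.avg_seconds)  # 1.0 for instant, near 0 for slow
    return w_match * match + w_speed * speed + w_quality * source.quality


def best_sources(sources, query, limit=3):
    """Return the names of the highest-scoring sources for this query."""
    terms = set(query.lower().split())
    ranked = sorted(sources, key=lambda s: score(s, terms), reverse=True)
    return [s.name for s in ranked[:limit]]


catalogue = [
    Source("med", {"health", "medicine"}, 0.5, 0.9),
    Source("news", {"politics", "health"}, 0.2, 0.5),
    Source("slowmed", {"health", "medicine"}, 9.0, 0.9),
]
print(best_sources(catalogue, "health medicine", limit=2))
```

Only the selected sources are then queried, which avoids the performance cost of sending every query to hundreds or thousands of sources.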

Open Archives Metadata Harvesting Protocol

New Google APIs

 

Page Modified: 2011-2-1