How to get more out of an intranet search engine - needs analysis, content inventory, rich indexing,
date issues, multimedia, dealing with search failure, using analytics to tune, and more.
Description of search metrics and analytics in various forms and examples of how insights from log
analysis can improve the content, presentation and functionality of search engines.
Workshop: introduction to enterprise search, high-level views and then technical details of robot
crawling, indexing issues, security and access control, query processing, retrieval, relevance ranking,
search interface, analytics and choosing a search engine.
The basics -- gathering, indexing, query processing, retrieval, relevance ranking, UI, and log analysis.
And then the fun stuff: alerting, IA, taxonomies, faceted metadata, multimedia, compliance, social networking,
and personalization.
Short explanation of the competing forces in webwide search -- end-users, search engines, content publishers
and advertisers -- and how it all seems to work now.
Extensive coverage of various aspects of enterprise search engines for intranets and web sites. Covers
gathering and spidering, index issues, query processing, retrieval, relevance ranking, search form and
results page user interface, maintenance, search log analysis and issues in choosing a search engine.
Describes what metrics and logs are available for search analysis, what to look for, how to use the
tools, comparing navigational and topical search, and examples of addressing problem queries.
Describes the value of federated search among many data sources, the history
of Z39.50 as an interchange standard, and the new SRW (Search/Retrieve
Web Service) as a modern API to simplify the process.
Faceted metadata allows systems to generate a dynamic browse interface
to search results, providing the most flexible and helpful access to information.
As library catalogs are entirely composed of metadata, most of it reasonably
well faceted, this seems a uniquely good application of this technology.
Summary of current work on Distributed Search, P2P and Metasearch; integrating
search and CMS systems, and search and security / authorization / access control
systems.
Analyzes the purchase of the Inktomi enterprise search engine (formerly
Ultraseek) by Verity. This article describes the business relationships
and plans of the companies, features of the Inktomi search engine, and the
search engine marketplace.
Databases and Search Engines, Guest Lecture, UC Berkeley
School of Information Management and Systems, class 257, September 2002
Explanation of the issues involved with searching text stored in databases, advantages and disadvantages
of text search compared with database native search.
Description of the benefits of adding search to web sites created by CMSs,
survey of leading open source search engines, CMSs with search, and proposal
to use syndication standards for search index update notification.
Problems include using search as a crutch, publishing unsearchable content,
ignoring user needs, inventing complex new interfaces, providing incomplete
coverage, hiding search, creating obscure rules for search, showing confusing
results pages, and confusing no-matches pages.
Presentation on enterprise search functionality, illustrated with examples
of successful and unsuccessful interfaces, and a summary of the back-end
processing.
Addressing the weaknesses of centralized search engines, covering alternatives
including peer search, metasearch engines with screen scraping, and distributed
search protocols for server-to-server communication about queries and results.
Presentation to computer-human interface experts on the evolution of web
search engines from text interfaces, metadata and cryptic Boolean commands
to graphical browser interfaces and clever full-text retrieval.
Moderated sessions and presented information on adding search to a web site,
and on controlling search engine indexing spiders (also known as robots or
crawlers).
Comprehensive review of the functionality of five large-scale search engines:
Ultraseek (then Inktomi Search Software), AltaVista Search, Atomz Enterprise,
Searchbutton Corporate and Excalibur RetrievalWare. Evaluates indexing, robots,
language and file format compatibility, metadata, special fields, query options,
customization and search relevance ranking. Article also discusses the benefits
and problems of Natural-Language search, database vs. full-text search and
open-source search engines.
"Site Search Engines" OnlineWorld99
Conference & Exhibition for Internet Researchers and Managers, October
1999
Presentation to librarians, information brokers and researchers at a conference
in Chicago.
Robots and Spiders and Crawlers, Oh My! Ultraseek White Paper, September
1999
Detailed discussion of how search engine indexing robots follow links and
read Web pages to store the information in search indexes. Includes coverage
of problem areas such as image maps, frames, JavaScript and dynamic data.
Notes describe how the Ultraseek Spider handles these problems.
Detailed report on the Infonortics Search Engines Meeting in Boston, providing
an overview of the state of research and products in searching, portals, filters,
clustering, summarization, video and sound information retrieval, cross-language
searching and text mining.
"Add Search To Your Site" Web Design & Development Conference, June 1999.
Presentation to webmasters, administrators and designers at a conference
in San Francisco.
Step by step information on setting up a Mac web server using WebSTAR, including
administration, standard add-ons, security, privacy, optimization, maintenance
and logging.
Detailed description of the issues involved in choosing and implementing
search on a web site or intranet. Includes a checklist, ideas for testing,
interface design, maintenance and summaries of the leading products.
Description of several sites which retrieve and collate search results from
several webwide search engines.
"Making your site searchable", Net Professional, October,
1998 (v2, n2).
Background on the issues involved in making a site work well with both local
and webwide search engines. Explains how to use robots.txt, meta tags, frames,
titles and relative links for best site display.
Description of the benefits and issues of site search tools, including indexers
vs. crawlers. Examines the features of iHound, Boolean Search, Phantom and
WebSTAR Search.
"Web Site Search Tools for the Mac: Choosing a Web Server Search Engine", Net Professional, August,
1998 (v2, n1).
In-depth comparative review of the four top web site search tools on the
Mac: Apple e.g., iHound, Phantom and WebSTAR Search, with sidebars on using
the Unix ht://Dig with WebTEN, and using WebSonar as a publishing tool. Includes
a feature chart and many screenshots.
WebSTAR Reference Manual, versions 3 and 4, StarNine Technologies, February, 1998 and June 1999.
User guide and technical reference manual for the WebSTAR web server. Includes sections
on configuration, web server security, IP addresses, DNS, virtual domains
using IP multihoming and virtual hosts, server-side includes, Secure Sockets
Layer, proxy, FTP and mail servers, in over 400 pages. Created the manual in FrameMaker
and wrote scripts to output in print, PDF and HTML, all from the same source.