User-agent: *
Disallow: /test/robots/disallow/
Disallow: /test/robots/noindex/
Disallow: /test/robots/partial
Allow: /test/robots/allow/
Disallow: /test/robots/wild*
Allow: /test/robots/wildcard*
Disallow: /test/robots/wildcard/longer*
# added 2009-03-06 testing the order matched
Allow: /test/robots/wildcard/l*
Allow: /test/robots/a*
# added 2009-10-29 playing with wildcards
Disallow /test/robots/wildcard*
Disallow: /test/robots/*z
Disallow: /test/relativelinks/2ndlevel/http://
# weird old directories
Disallow: /test/relativelinks/rtestprob/http://searchtools/about/
Disallow: /test/relativelinks/rtestprob/http://searchtools/analysis/
Disallow: /test/relativelinks/rtestprob/http://searchtools/guide/
Disallow: /test/relativelinks/rtestprob/http://searchtools/info/
Disallow: /test/relativelinks/rtestprob/http://searchtools/pub/
Disallow: /test/relativelinks/rtestprob/http://searchtools/robots/
Disallow: /test/relativelinks/rtestprob/http://searchtools/search/
Disallow: /test/relativelinks/rtestprob/http://searchtools/site/
Disallow: /test/relativelinks/rtestprob/http://searchtools/slides/
Disallow: /test/relativelinks/rtestprob/http://searchtools/surveys/
Disallow: /test/relativelinks/rtestprob/http://searchtools/tools/
Disallow: /slides/examples/
Disallow: /slides/ia/images/*.html
Disallow: /%20%20NEW%20&%20IN%20PROGRESS/
Disallow: /info/conferences-past.html
# redirects mostly
Disallow: /background/
Disallow: /blog/
Disallow: /cgi-bin/
Disallow: /info/articles/
Disallow: /info/meetings/examples/
Disallow: /info/meetings/thunderlizard/examples/
Disallow: /info/robots/
Disallow: /info/slides/
Disallow: /lists/
Disallow: /related/
Disallow: /reviews/
Disallow: /searchtools/
# testing allow
Allow: /site/sitemap.xml
Allow: /site/site/contact*
Disallow: /site/
Disallow: /ST/
Disallow: /st/
Disallow: /St/
Disallow: /wr/

# test of semicolon after disallow, 2009-10-29
# (Google reads it and reports a warning and then syntax error)
User-agent: *
Disallow; /test/robotx/

# test of semicolon after disallow 2009-10-29
# (interesting that Google reads it and reports a syntax error)
User-agent: inktomi
Disallow; /test/robot

# test of empty user agent field on optional content, 2009-10-29
User-agent:
Disallow: /site/rss-archives/

# updated 2002-03-22 (disallow rtestprob links)
# updated 2002-06-25 (disallow info/slides links, info/robots/)
# updated 2002-07-25 (disallow /searchtools/ which is an alias)
# updated 2005-09-09 (rearranged as per Enrico's advice)
# updated 2007-01-24 (added site directory to disallow list)
# updated 2007-01-25 (removed rss disallow, file is gone)
# updated 2008-06-24 (added wildcard tests lines 6-10)
# updated 2008-07-02 (added allows for site subdirectory)
# updated 2009-03-06 (added Google accepts notes, wildcard/l*)
# updated 2013-01-31 (disallow %20NEW paths