RE: Disallowing files for robots
by Roxy <4Roxy(at)autumnweb.com>
Date: Sun, 26 Nov 2000 22:59:30 -0500
To: hwg-techniques(at)hwg.org
References: hotmail
At 08:58 PM 11/26/00, Bob wrote:
> Here is the correct robots .txt for the search engines.
>
><META NAME="Robot" CONTENT="NOINDEX">
><META NAME="Robot" CONTENT="NOFOLLOW">
>
>bob
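Note that those are meta tags, which go in the head of a web page, not in a
robots.txt file. The documented name is "ROBOTS" (plural), and both values
can share one tag: <META NAME="ROBOTS" CONTENT="NOINDEX,NOFOLLOW">. Some SEs
don't "obey" these tags. The stats from my host and my log files show the
number of times the robots.txt file is accessed. I can see that dozens of
robots check that file weekly, and most will obey it.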
http://info.webcrawler.com/mak/projects/robots/exclusion-user.html
and
http://info.webcrawler.com/mak/projects/robots/exclusion-admin.html
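For anyone who wants to run the same kind of log check, here is a rough sketch. It assumes the server writes Apache-style "combined" log lines, which is only an assumption since hosts differ. The function name count_robots_txt_hits and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Pulls the request path and user-agent out of one "combined" format log line.
# Assumption: the log uses that format; adjust the pattern for other layouts.
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical sample lines, not taken from any real log.
SAMPLE_LINES = [
    '192.0.2.1 - - [26/Nov/2000:22:59:30 -0500] "GET /robots.txt HTTP/1.0" 200 68 "-" "ExampleBot/1.0"',
    '192.0.2.7 - - [26/Nov/2000:23:01:02 -0500] "GET /index.html HTTP/1.0" 200 5120 "-" "Mozilla/4.0"',
    '192.0.2.1 - - [27/Nov/2000:04:10:11 -0500] "GET /robots.txt HTTP/1.0" 200 68 "-" "ExampleBot/1.0"',
]


def count_robots_txt_hits(lines):
    """Count requests for /robots.txt, grouped by user-agent string."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        path = match.group("path").split("?")[0]  # ignore any query string
        if path == "/robots.txt":
            hits[match.group("agent")] += 1
    return hits


if __name__ == "__main__":
    for agent, count in count_robots_txt_hits(SAMPLE_LINES).most_common():
        print(count, agent)
```

To use it on a real log, pass an open file object instead of SAMPLE_LINES. Keep in mind that a robot fetching robots.txt only shows that it looked at the file, not that it obeyed it.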
There's LOTS of information about meta tags at
http://www.searchenginewatch.com/ and they also cover the robots.txt file.
Many SEs have a "help" section, where they explain how to stop an SE spider
from accessing certain files. For example, AltaVista has a whole tutorial
section for building web pages. They specifically state that they obey the
"Robot Exclusion Standard,"
http://doc.altavista.com/adv_search/ast_haw_avoiding.html
and if you follow enough links, you'll find the information at
http://info.webcrawler.com/mak/projects/robots/faq.html#prevent
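To make the robots.txt side concrete, here is a small sketch of what such a file can look like and how a well-behaved robot would read it. The file contents, the paths, the example.com host and the "ExampleBot" name are all hypothetical. The check uses Python's standard urllib.robotparser module, which only illustrates the rules and is not what any particular search engine actually runs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every robot is asked to stay out of one directory
# and away from one single file. The paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/notes.html
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/private/report.html",
        "http://www.example.com/drafts/notes.html",
    ):
        print(url, is_allowed("ExampleBot", url))
```

The file itself has to sit at the top level of the site (for example /robots.txt) for robots to find it. As with the meta tags, it is a request rather than an access control, so anything truly private needs real protection on the server.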
I hope that helps,
Roxanne
Not just putting your business on the Web
Promoting your business on the Web!
Autumn Web ~ http://autumnweb.com/
design / development / promotion / search engine optimization
+ tutorials, web page help, free graphics for personal sites.
* -- * -- * -- * -- * -- * -- * -- * -- * -- * -- *
HWG hwg-techniques mailing list archives,
maintained by Webmasters @ IWA