Re: Meta tags and search engines
by "Abhay S. Kushwaha" <abhay(at)kushwaha.com>
Date: Fri, 22 Oct 1999 13:22:14 +0530
To: "Basics [HWG]" <hwg-basics(at)hwg.org>
References: aol
There is a lot of confusion among webmasters about search "engines"
and search "directories" and I find it strange that it is so.
Search Directory
----------------
Eg: Yahoo! [1], DMOZ [2]
They are basically archiving URLs and indexing them in various
categories that people (visitors) can browse and reach. They
don't care what your "META" tags tell them.
Their indexers: HUMANS
Yes, editing at DMOZ was an eye-opener on how directories are
inherently different from engines. When you submit your site, it
is stored in an "unreviewed" database till an "indexer" comes along,
reviews your site and adds a description to it. Now that indexer
may or may not use your META DESCRIPTION, and may or may not
write a description for your site at all. You might have come across
sites at Yahoo! that do not have any description? Just sloppy
indexing.
Directories do have search capabilities, but what kind of search is
it? Here, let me give the example of Yahoo! and DMOZ since they
seem to be top-of-list right now.
First, DMOZ :
DMOZ has a simple search. It searches the directory
structure first (that is, the category names). If it finds
any matches, they are displayed at the top of the results. Then it
searches the database of all sites indexed under the various
categories. A 'search word' present in the 'site
title' is given preference over one present in the 'site
description'.
It is but a simple IN-SITE search. As is clear from above,
your META commands don't mean anything here!
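
To make that search order concrete, here is a rough Python sketch. It is only an illustration of the behaviour described above, not DMOZ's actual code: the category names, the site entries and the function name directory_search are all made up.

```python
# Hypothetical in-memory directory; every entry here is invented for illustration.
CATEGORIES = ["Shopping/Gifts", "Arts/Music/Christmas Carols"]
SITES = [
    {"title": "Gift World", "description": "Coupons and vouchers for every occasion"},
    {"title": "Coupon Corner", "description": "Gift ideas and discount codes"},
    {"title": "Tree Farm", "description": "Fresh Christmas trees"},
]


def directory_search(word):
    """Return (category hits, site hits), with title hits ahead of description hits."""
    w = word.lower()
    # Step 1: search the category names; these go at the top of the results.
    category_hits = [c for c in CATEGORIES if w in c.lower()]
    # Step 2: search the listed sites; a hit in the title outranks a hit
    # that only appears in the editor-written description.
    title_hits = [s["title"] for s in SITES if w in s["title"].lower()]
    description_hits = [
        s["title"]
        for s in SITES
        if w not in s["title"].lower() and w in s["description"].lower()
    ]
    return category_hits, title_hits + description_hits
```

Searching for "coupon" lists "Coupon Corner" ahead of "Gift World" even though "Gift World" was stored first, because the word is in the title of one and only in the description of the other. Note that the page's own META tags are never consulted anywhere.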
Second, Yahoo! :
Yahoo! has a wee bit more complex search. Yahoo! uses the Inktomi
engine, which searches not only the site (DMOZ like) but also
the Inktomi database (HotBot like). I guess you could
call it a hybrid now but it is essentially a directory.
How are things stored in the Inktomi database? It has its own
BOT which goes out and indexes sites (engine like), and
searches run against the keywords it has found. I haven't looked
closely at the "pages result" to develop an opinion of whether your
META commands mean anything to the Inktomi BOT or not.
Search Engines
--------------
Eg: Altavista [3], HotBot [4]
Now these are what you call "search engines" - mammoth sites
containing HUGE databases of keyword-indexed URLs of millions
upon millions of pages. Here is where your META commands matter
the most.
Their indexers: BOTS, CRAWLERS (don't ask me the difference!)
They come to the specified page. *Most* of them look at the META
commands and store the DESCRIPTION and KEYWORDS. But they don't
stop there. They *will* still go through the rest of your page
indexing words they think are relevant. (Every BOT has a different
'exclusion' list. Come now, you didn't actually believe that
words like 'the', 'and', etc. were indexed!) Then they follow
your links and the story repeats itself.
Just remember that your KEYWORDS is a list of words that will be
attributed to your URL even if they do not occur later in the page
itself. It is a "minimum list" and the BOT will always add more
words from the page when it parses through it (if it finds any that
are not included in your META command).
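
As a rough illustration of the above, here is a small Python sketch using the standard html.parser module. It is an assumption-laden toy, not how any real BOT works: the STOP_WORDS list, the PageIndexer class and the index_page function are invented names, and link following is left out.

```python
from html.parser import HTMLParser

# A tiny, made-up 'exclusion' list; each real BOT had its own.
STOP_WORDS = {"the", "and", "a", "of", "for", "to", "in"}


class PageIndexer(HTMLParser):
    """Collects the META description/keywords and the words found in the page text."""

    def __init__(self):
        super().__init__()
        self.description = ""
        self.keywords = set()
        self.body_words = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name == "description":
            self.description = content
        elif name == "keywords":
            # KEYWORDS entries are comma-separated; an entry may be a phrase.
            self.keywords = {k.strip().lower() for k in content.split(",") if k.strip()}

    def handle_data(self, data):
        # Index every word of the page text that is not on the exclusion list.
        for word in data.lower().split():
            word = word.strip(".,!?;:\"'()")
            if word and word not in STOP_WORDS:
                self.body_words.add(word)


def index_page(html):
    """Return (description, index terms) for one page."""
    parser = PageIndexer()
    parser.feed(html)
    parser.close()
    # KEYWORDS is the 'minimum list'; words found on the page are added to it.
    return parser.description, parser.keywords | parser.body_words
```

Feeding it a page whose KEYWORDS contain "voucher" but whose body never uses that word still leaves "voucher" in the returned terms, alongside whatever the body contributed.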
I might as well say a word about repetition. It is a self-defeating
cause. This is so because 50%+ of traffic to a site will come from
directories alone, and a directory does not care what you put in your
META tags. Top engines like AltaVista, HotBot, Excite, etc. penalise
you for repetition and other unfair practices which have been
identified by their respective teams. So, the net benefit turns out
to be negative.
Keywords & Key phrases
----------------------
Don't know the accuracy of this theory. But it states that a word
is different from a phrase in a search. :)
Meaning what? Meaning that "Gift Coupon" is different from "Gift",
"Coupon" (in the latter case, they appear independently). There was
a query where a lot of key phrases starting with "Christmas" were
listed. Those were independent phrases and are useful in cases
when people use a phrase search instead of an AND search. eg:
Christmas Tree, Christmas Carols
should not generate any hit if searching solely for "Christmas" or
"Carols" unless these words appear separately. eg:
Christmas Tree, Christmas Carols, Christmas, Carols.
I think I've made the difference clear now?
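
For those who prefer code, the theory can be sketched in Python like this. Remember it is only the theory as stated above, whose accuracy I can't vouch for; the function names parse_keywords, phrase_hit and and_hit are made up for the example.

```python
def parse_keywords(meta_content):
    """Split a comma-separated KEYWORDS value into lower-cased entries."""
    return {entry.strip().lower() for entry in meta_content.split(",") if entry.strip()}


def phrase_hit(query, entries):
    """Phrase search: the whole query must match one KEYWORDS entry exactly."""
    return query.strip().lower() in entries


def and_hit(query, entries):
    """AND search: every word of the query must appear as an entry of its own."""
    return all(word in entries for word in query.lower().split())


short_list = parse_keywords("Christmas Tree, Christmas Carols")
full_list = parse_keywords("Christmas Tree, Christmas Carols, Christmas, Carols")
```

With short_list, a phrase search for "Christmas Carols" hits but a search for "Christmas" alone does not. With full_list, both hit, and so does an AND search for "Christmas Carols".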
Phew! Now I go and have a glass of water! ;-)
---
[1] http://www.yahoo.com
[2] http://www.dmoz.org
[3] http://www.altavista.com
[4] http://www.hotbot.com
[abhay]
PS: Wrote most of the above from observation, experience and simple
common sense. So, it might turn out that some content is
not fully accurate. My attitude-analyser gives it a 98.743%
truth probability. [wink]
----- Original Message -----
From: <DALLASSTA(at)aol.com>
Sent: Tuesday, October 19, 1999 10:13 PM
> I do editing for dmoz.org. Their search engine places the
> sites in Alphbetical Order and really doesn't deal with the
> tags anyway. I write to the site, and request a description
> from the webmaster. But I am still trying to grasp the other,
> with the spider engines and all!
HTML: hwg-basics mailing list archives,
maintained by Webmasters @ IWA