Re: Meta tags and search engines

by "Abhay S. Kushwaha" <abhay(at)kushwaha.com>

 Date:  Fri, 22 Oct 1999 13:22:14 +0530
 To:  "Basics [HWG]" <hwg-basics(at)hwg.org>
 References:  aol
There is a lot of confusion among webmasters about search "engines"
and search "directories", which I find strange.

Search Directory
----------------
   Eg: Yahoo! [1], DMOZ [2]
   They are basically archiving URLs and indexing them in various
   categories that people (visitors) can browse and reach. They
   don't care what your "META" tags tell them.

   Their indexers: HUMANS
   Yes, editing at DMOZ was an eye-opener on how directories are
   inherently different from engines. When you submit your site, it
   is stored in an "unreviewed" database till an "indexer" comes along,
   reviews your site and adds a description to it. That indexer
   may or may not use your META DESCRIPTION, and may or may not
   write a description for your site at all. You might have come
   across sites at Yahoo! that do not have any description? Just
   sloppy indexing.

   They do have searching capabilities, though. But what kind of
   search is it? Here, let me take Yahoo! and DMOZ as examples, since
   they seem to be top of the list right now.

   First, DMOZ:
      DMOZ has a simple search. It first searches the directory
      structure (that is, the category names, by the way). If it
      finds any matches, they are displayed at the top of the
      results. Then it searches the database of all sites indexed
      under the various categories. A 'search word' present in the
      'site title' is given preference over one present only in the
      'site description'.

      It is but a simple IN-SITE search. As is clear from the above,
      your META tags don't mean anything here!
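
      The search order described above can be sketched roughly as
      follows. This is only an illustration built from my reading of
      the behaviour, not DMOZ's actual code: the sample categories,
      the sample sites and the directory_search function are all
      made up for the example.

```python
# Hypothetical sketch of a directory-style search: categories first,
# then sites, with title matches ranked above description matches.
# The data and names below are assumptions made for illustration.

CATEGORIES = [
    "Shopping/Gifts",
    "Arts/Music/Christmas",
]

SITES = [
    {"title": "Holiday Crafts", "description": "Christmas gift ideas"},
    {"title": "Christmas Carols Online", "description": "Lyrics and sheet music"},
]


def directory_search(word, categories, sites):
    """Return (matching categories, matching sites with title hits first)."""
    needle = word.lower()
    # Step 1: look for the word in the category names.
    category_hits = [c for c in categories if needle in c.lower()]
    # Step 2: look for the word in the site titles.
    title_hits = [s for s in sites if needle in s["title"].lower()]
    # Step 3: sites that match only in the editor-written description.
    description_hits = [
        s for s in sites
        if needle not in s["title"].lower()
        and needle in s["description"].lower()
    ]
    return category_hits, title_hits + description_hits


cats, hits = directory_search("christmas", CATEGORIES, SITES)
print(cats)
print([s["title"] for s in hits])
```

      Note that nothing in the sketch ever reads the page itself, so
      whatever sits in the page's META tags never enters the picture.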

   Second, Yahoo!:
      Yahoo! has a wee bit more complex search. Yahoo! uses the
      Inktomi engine, which searches not only the site itself (DMOZ
      like) but also the Inktomi database (HotBot like). I guess you
      could call it a hybrid now, but it is essentially a directory.

      How are things stored in the Inktomi database? The site has its
      own BOT, which goes out and indexes sites (engine like), and it
      searches the keywords that BOT has found. I haven't looked
      closely enough at the "pages result" to form an opinion on
      whether your META tags mean anything to the Inktomi BOT.

Search Engines
--------------
   Eg: Altavista [3], HotBot [4]
   Now these are what you call "search engines" - mammoth sites
   containing HUGE databases of keyword-indexed URLs of millions
   upon millions of pages. Here is where your META tags matter
   the most.

   Their indexers: BOTS, CRAWLERS (don't ask me the difference!)
   They come to the specified page. *Most* of them look at the META
   tags and store the DESCRIPTION and KEYWORDS. But they don't
   stop there. They *will* still go through the rest of your page,
   indexing words they think are relevant. (Every BOT has a different
   'exclusion' list. Come now, you didn't actually believe that
   words like 'the', 'and', etc. were indexed!) Then they follow
   your links and the story repeats itself.

   Just remember that your KEYWORDS tag is a list of words that will
   be attributed to your URL even if they do not occur later in the
   page itself. It is a "minimum list", and the BOT will always add
   more words from the page when it parses through it (if it finds any
   that are not included in your META tag).
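
   To make the "minimum list" idea concrete, here is a small sketch
   of what such a BOT might do with one page. It is an assumption-laden
   toy, not how any real engine works: the PageIndexer class, the
   STOP_WORDS list and the sample page are invented for illustration.
   It uses only Python's standard html.parser module.

```python
from html.parser import HTMLParser

# Assumed 'exclusion' list; every real bot would have its own.
STOP_WORDS = {"the", "and", "a", "of", "to", "in"}


class PageIndexer(HTMLParser):
    """Collects the META description and keywords, page words, and links."""

    def __init__(self):
        super().__init__()
        self.description = ""
        self.keywords = set()
        self.links = []
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "description":
                self.description = content
            elif name == "keywords":
                # META keywords are stored as given: the "minimum list".
                for entry in content.split(","):
                    if entry.strip():
                        self.keywords.add(entry.strip().lower())
        elif tag == "a" and attrs.get("href"):
            # Links are remembered so the bot can visit them next.
            self.links.append(attrs["href"])
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return
        # Words from the page text are added, minus the exclusion list.
        for word in data.lower().split():
            word = word.strip(".,;:!?\"'()")
            if word and word not in STOP_WORDS:
                self.keywords.add(word)


PAGE = """<html><head>
<title>Gift Shop</title>
<meta name="description" content="Gifts for the holidays">
<meta name="keywords" content="gift coupon, christmas">
</head><body>
<p>Buy the best candles and cards.</p>
<a href="/catalogue.html">Catalogue</a>
</body></html>"""

indexer = PageIndexer()
indexer.feed(PAGE)
indexer.close()
print(indexer.description)
print(sorted(indexer.keywords))
print(indexer.links)
```

   In the sample page, "christmas" appears only in the KEYWORDS tag
   yet still ends up attached to the URL, while "candles" comes from
   the page text and "the" is dropped by the exclusion list.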

I might as well say a word about repetition. It is a self-defeating
cause. This is so because 50%+ of the traffic to a site will come from
directories alone, and they do not care what you put there. Top engines
like Altavista, Hotbot, Excite, etc. penalise you for repetition and
other unfair practices that have been identified by their respective
teams. So the performance benefit turns negative.

Keywords & Key phrases
----------------------
   I don't know how accurate this theory is, but it states that a
   word is different from a phrase in a search. :)

   Meaning what? Meaning that "Gift Coupon" is different from "Gift",
   "Coupon" (in the latter case, the words appear independently).
   There was a query where a lot of key phrases starting with
   "Christmas" were listed. Those were independent phrases and are
   useful when people use a phrase search instead of an AND search.
   eg:
      Christmas Tree, Christmas Carols
   should not generate any hit when searching solely for "Christmas" or
   "Carols" unless these words also appear separately. eg:
      Christmas Tree, Christmas Carols, Christmas, Carols
   I think I've made the difference clear now?
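
   For what it is worth, the theory can be written down in a few
   lines. This sketch simply assumes the theory is true, i.e. that an
   engine treats each comma-separated KEYWORDS entry as one
   indivisible unit; parse_keywords and matches are names I made up,
   and real engines may well behave differently.

```python
def parse_keywords(meta_content):
    """Split a KEYWORDS value on commas; each entry is kept as one phrase."""
    return [e.strip().lower() for e in meta_content.split(",") if e.strip()]


def matches(query, entries):
    """True only if the query equals a whole keyword entry (assumed rule)."""
    return query.strip().lower() in entries


phrases_only = parse_keywords("Christmas Tree, Christmas Carols")
with_words = parse_keywords("Christmas Tree, Christmas Carols, Christmas, Carols")

print(matches("Christmas Carols", phrases_only))  # phrase search: hit
print(matches("Carols", phrases_only))            # single word: no hit
print(matches("Carols", with_words))              # single word listed: hit
```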

Phew! Now I go and have a glass of water! ;-)

---
 [1] http://www.yahoo.com
 [2] http://www.dmoz.org
 [3] http://www.altavista.com
 [4] http://www.hotbot.com

[abhay]

PS: Wrote most of the above from observation, experience and simple
    common sense. So, it might turn out that some content is
    not fully accurate. My attitude-analyser gives it a 98.743%
    probability of truth. [wink]

----- Original Message -----
From: <DALLASSTA(at)aol.com>
Sent: Tuesday, October 19, 1999 10:13 PM


>   I do editing for dmoz.org.  Their search engine places the
> sites in alphabetical order and really doesn't deal with the
> tags anyway. I write to the site, and request a description
> from the webmaster. But I am still trying to grasp the other,
> with the spider engines and all!

HTML: hwg-basics mailing list archives, maintained by Webmasters @ IWA