Optimizing Flash for Search Engines
by Collette McNeill <collette(at)mlwebworks.com>
Date: Wed, 07 May 2003 05:12:33 -0700
To: hwg-techniques(at)mail.hwg.org
Hi list, I thought I'd forward a useful article.
http://www.searchenginewatch.com/searchday/article.php/2200921
Optimizing Flash for Search Engines
By Craig Fifield <craigfif(at)microsoft.com>, Guest Writer
May 6, 2003
A special report from the Search Engine Strategies conference in Boston,
MA, March 4-6, 2003.
Macromedia Flash and other non-HTML formats can pose problems for search
engines, unless you take appropriate steps to optimize the content.
"Search engines were originally built to index and serve HTML documents,"
said Tim Mayer, Vice President of Web Search at FAST. "Now the web has
become more diverse in content types, knowing how to treat Flash and other
types of content has become more important for search engines."
"These other content types present different challenges to the search
engines," Mayer continued. "For example, Flash files generally contain too
little text whereas PDF documents contain too much text. The technology to
include differing content types and score them appropriately will become
even more important as new areas in web search become more important --
such as real time data which will provide the challenge of lacking inbound
links."
Participants in this panel discussed how crawlers interact with non-HTML
content, and offered a number of workarounds and optimization tips.
Search Engines and Flash
Flash is the leading vector graphics technology for creating design-focused
web sites. Over 98 percent of Internet users can view Flash content with
the Flash player software already installed in their browsers. Over 490
million people use the Flash player.
Gregory Markel, Founder/President of Infuse Creative, an entertainment and
technology consulting company, discussed issues related to search engine
visibility and Flash sites. "The good news is that FAST Search and Google
can follow embedded links within the [Flash] files," he said.
FAST built its Flash indexing capabilities using Macromedia's Flash search
engine software developer's kit (SDK). The SDK was designed to convert a
Flash file's text and links into HTML for indexing.
"Not all search engine spiders have the ability to crawl or index Flash,"
Markel said. "As far as I am able to determine, Google has not included the
Flash-SDK setup for indexing, like FAST. But Google can follow embedded links."
Markel warned that Macromedia's SDK solution is far from perfect. "All
it does is it takes whatever [content] is there, and converts it to an HTML
version. But the converted HTML doesn't include anything you actually need
to do well in the search engines. No title tags, alt tags, body text, etc.
SDK is a step in the right direction, but has a long way to go."
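The gap Markel describes can be checked mechanically. The following is a
minimal sketch, not the SDK itself: it assumes you already have the converted
HTML as a string, and it uses Python's standard html.parser to report whether
the page has a title, which links a crawler could follow, and how many images
lack alt text. The `converted` string, the `PageAudit` class, and the `audit`
function are hypothetical names made up for illustration; actual SDK output
may look different.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collects the title text, link targets, and images lacking alt text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.images_without_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and not attrs.get("alt"):
            self.images_without_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title>...</title> counts toward the title.
        if self._in_title:
            self.title += data


def audit(html):
    """Summarize what a text-and-links crawler would find in `html`."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    return {
        "has_title": bool(parser.title.strip()),
        "links": parser.links,
        "images_without_alt": parser.images_without_alt,
    }


# Hypothetical stand-in for converted output: body text and links, no title.
converted = '<html><body>Welcome <a href="/products.html">Products</a></body></html>'
report = audit(converted)
print(report)
```

On a page like this the links are still discoverable, but there is no title
for a search engine to rank or display, which is the shortcoming described
above.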
"One of the big problems with Flash content is that it's very hard to
find," stated Tim Mayer. "We have a lot of Flash content in the FAST index,
though I've rarely come across a Flash file, myself, in the main search
results."
One of the reasons for the paucity of optimized Flash files is that the
search engine industry hasn't adopted the SDK as a standard, explained Mayer.
"The SEOs out there don't know that we're actually going to index their
files," he said, "so they don't prepare them in an optimized way (for the
SDK). This will change as more search engines adopt this."
Mayer recommended that webmasters restrict the text to what they want
indexed. For example, making the "skip intro" a graphic file is better than
making it a text link. "Keep what you want as indexable as text," said
Mayer. "Make what you don't want as Flash graphics. We will also take out
links from Flash sites and follow them."
Multimedia Files and Search Engines
Few search engines provide search for audio and video file formats.
Currently, AltaVista, FAST Search, and Singingfish support the following
multimedia formats: Windows Media (Windows Media Encoder), RealMedia, MP3,
and QuickTime.
Multimedia content can provide a better user experience in some industries.
"Multimedia content enables an immersive and emotive user experience beyond
text-based content," said Ken Berkun, Founder and VP of Strategy at
Singingfish. "A 30 second music clip is a strong advertisement for a CD."
When you create a multimedia file, you have the opportunity to give it
metadata. "Give each file a title, copyright stream, author, description,
and keywords," said Berkun. "Every single one of these medias have these
fields."
"You would be surprised at how little that's done," continued Berkun. "The
most popular titles in our database is the default - nothing. People spend
thousands or even hundreds of thousands of dollars on the production and
don't even take the time to add a title."
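A pre-publication check for the fields Berkun lists could look like the
sketch below. It assumes the metadata has already been read into a plain
dictionary; how you read it from a Windows Media, RealMedia, MP3, or
QuickTime file depends on the format and your tooling, and is not shown.
The field names, the `missing_metadata` function, and the sample clip are
hypothetical.

```python
# Fields taken from Berkun's list; "copyright" stands in for the copyright stream.
REQUIRED_FIELDS = ("title", "copyright", "author", "description", "keywords")


def missing_metadata(metadata):
    """Return the required fields that are absent or blank in `metadata`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical clip whose title was left at the empty default.
clip = {"title": "", "author": "Example Band", "keywords": "rock, live"}
print(missing_metadata(clip))
```

For this sample the check reports the title, copyright, and description as
missing, so the file would be flagged before it goes live.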
In addition to having metadata in the multimedia file, Berkun also advised
having actual content on the HTML page that contains the file. Include
accurate anchor text and an around multimedia files. And make sure that
your entire web site is spiderable.
"Singingfish does spider HTML sites with multimedia," he said. "We handle
frames pretty well, but we've had mixed results with JavaScript. We can't
handle Flash, although we hope to get to it someday."
PDF Documents and Search Engines
Shari Thurow, Marketing Director of Grantastic Designs, a full-service
search engine marketing and design firm, addressed PDF files and the search
engines. "PDF stands for portable document format, which is a universal
file format that preserves fonts, colors, graphic images, and formatting of
any source document," she said.
Unlike Flash documents, PDF documents frequently appear within regular
search results. Thurow stated that she typically finds PDF-formatted
technical manuals, white papers, press kits, and spec sheets in the top 30
search engine results pages in both Google and FAST Search. Because of
this, Thurow commonly optimizes PDF documents for various industries.
Thurow also stressed the importance of using the Robots Exclusion Protocol
on some PDF documents. "Search engines have made it clear that they do not
want redundant content in their indices," she said. "Even having both a PDF
and HTML version of the same content is redundant. For that reason, I place
the Robots Exclusion Protocol on the PDF version. HTML format is better for
the search engines, anyway, since HTML files tend to be smaller in file size."
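One way to confirm that an exclusion rule does what Thurow describes is to
test it with Python's standard urllib.robotparser before deploying it. This
is a sketch under assumptions: the robots.txt content, the example.com URLs,
and the "ExampleBot" user agent are all made up for illustration, and a real
crawler may interpret rules somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the PDF copy, leave the HTML copy crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /docs/whitepaper.pdf
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a placeholder crawler name; it falls under the "*" rule.
pdf_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/docs/whitepaper.pdf")
html_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/docs/whitepaper.html")
print(pdf_allowed, html_allowed)
```

If the rule is written correctly, the PDF URL is reported as not fetchable
while the HTML version stays open to crawlers, so only one copy of the
content is indexed.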
PDF documents can be submitted through the normal submit URL forms and the
paid inclusion programs at the major search engines.
HWG hwg-techniques mailing list archives,
maintained by Webmasters @ IWA