Optimizing Flash for Search Engines
When Google announced last summer that it was now indexing SWF files, the news raised many questions in the Flash community: "What exactly is being crawled?", "Do we need to do anything different?", "Which Flash Player versions?" At the same time, Google, too, was a little unclear about the extent to which this content would be crawled:
"We've developed an algorithm that explores Flash files in the same way that a person would, by clicking buttons, entering input, and so on. Our algorithm remembers all of the text that it encounters along the way, and that content is then available to be indexed. We can't tell you all of the proprietary details, but we can tell you that the algorithm's effectiveness was improved by utilizing Adobe's new Searchable SWF library."
So which SWF content is being crawled and indexed?
Since then, many Flash developers and SEO experts have run numerous tests to determine how much of this content is crawled. Sites such as flashnseo.com have done a series of experiments to find answers to these questions.
Based on those tests, here is what is being crawled and indexed in SWFs:
- All text that users can see as they interact with your Flash file. This includes static text, dynamic text, and dynamic text inserted via ActionScript.
- External XML loaded into the SWF. Search engines are indexing XML files loaded externally into the SWF. However, the search result link points to the XML file as a separate entity rather than to the HTML page that hosts the SWF. This reinforces the practice of writing clean XML and keeping out of it any information you wouldn't want public.
- URLs. Guava UK ran some nice studies on how URLs are crawled. From these experiments, they discovered that Googlebot could follow URLs in Flash movies as well as hidden URLs contained within ActionScript code. This raises the question, "Does this open the door for spammers to start Flash Bombing with hundreds of hidden links?" My guess is that search engines will eventually, if they don't already, treat Flash Bombing the same way they treat sites that hide links with CSS or use other Black Hat SEO methods.
- SWFs embedded with JavaScript, for example with SWFObject, can also be crawled.
From all the experiments I've come across, it seems as though SWFs for all versions of Flash Player are being crawled and, better yet, they are being crawled without developers having to modify their code (provided the SWFs are set up as described above).
Possible pitfalls of indexed SWFs
While these are all nice improvements, I see a few red flags that are worth keeping in mind when developing your Flash.
- SWFs indexed as separate entities. While I think it's great that SWFs can now be crawled and indexed, one thing to keep in mind is that these SWFs are listed in results as the SWF file itself, separate from its HTML, the same way PDFs are listed in search results. This leads me to believe that if I'm relying on any kind of JavaScript communication with my SWF, that communication may be lost when the SWF is viewed apart from its HTML file.
For example, if I use variables in SWFObject to populate my content, that content won't be present. If the SWF communicates with any HTML elements by way of JavaScript, that communication will be lost when only the SWF is viewed.
I'd like to see search engines include a "View as HTML" link, the same way they do for most PDFs.
- Hierarchy cannot be set on specific chunks of content. I have yet to come across any examples of search engines assigning hierarchy to Flash content the way they do with HTML elements such as header tags, lists, emphasis, etc.
- XML listed in results as a separate entity. As I mentioned above, XML is listed in results the same way SWFs are: as an individual file. Make sure that your XML is written clearly enough to still be usable when viewed on its own, and make sure you're not storing information in your XML that you wouldn't want the public to see.
The number one most important Flash/SEO factor
I recently finished the book Search Engine Optimization for Flash by Todd Perkins. It's definitely worth reading if you're interested in learning more about optimizing your SWFs in both Flash and Flex, and it teaches through easy-to-follow, well-written examples.
My favorite line from the book is where Todd writes:
"The most important factor for determining your page rank in search engine results pages is based on your HTML code."
It's such a true statement, and I hope it doesn't get lost amid all the search engine/SWF advancements. Yes, SWFs are being crawled and indexed, but it all begins with your HTML code. Keep HTML fundamentals in mind, such as:
- Page Titles
- Alternative Flash content
- Header tags and description text within your alternative Flash content
- Meta info using good keywords (not quite as dominant as it was years back, but still very useful)
- And just keeping in mind all other good, clean, healthy SEO and HTML practices
Alternate Flash content with SWFObject
Part of having well-written HTML lies in the HTML within your alternate Flash content. SWFObject is an easy-to-use, standards-friendly method of embedding Flash content that relies on one small JavaScript file.
SWFObject is nothing new to the Flash community by any means, but it was recently added to Google's list of standards-compliant open source JavaScript libraries. One of the key benefits of using SWFObject is the ability to include alternative content that is shown when either JavaScript is disabled or the user's Flash Player is not up to date.
The key, then, is to use the alternative content area for any headers, description text, and other related content associated with the SWF. This content can be indexed by search engines and styled however you wish using CSS. More detail can be found in the SWFObject documentation.
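One way to check that a page's alternative content really carries headers and description text is to parse the HTML roughly the way a crawler without Flash or JavaScript would see it. The following is only a minimal sketch in Python, not part of SWFObject itself. It assumes SWFObject's dynamic publishing style, where swfobject.embedSWF replaces a container element by id; the sample page, the flashContent id, tour.swf, and the AltContentAudit class are all made up for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class AltContentAudit(HTMLParser):
    """Collects headings and visible text inside one container element."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0        # 0 means we are outside the container
        self.headings = []    # heading tag names found inside the container
        self.text = []        # non-empty text chunks found inside the container

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            if tag in HEADING_TAGS:
                self.headings.append(tag)
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1    # entered the alternative content container

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.text.append(data.strip())


# Hypothetical page: the div's contents are the alternative content that
# SWFObject swaps out for the SWF when Flash and JavaScript are available.
PAGE = """
<html><head><title>Product tour</title>
<script src="swfobject.js"></script>
<script>swfobject.embedSWF("tour.swf", "flashContent", "800", "600", "9.0.0");</script>
</head><body>
<div id="flashContent">
  <h1>Product tour</h1>
  <p>An interactive walkthrough of the product's main features.</p>
</div>
</body></html>
"""

audit = AltContentAudit("flashContent")
audit.feed(PAGE)
print("Headings found:", audit.headings)
print("Text found:", audit.text)
```

If the headings list comes back empty, the alternative content probably needs a header tag and some description text before the page is published.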
Deep linking
Deep linking is another great Flash SEO/usability practice. It takes the user to different states of your application while providing unique virtual URLs that can point to a specific section or application state. One popular deep linking framework among Flash developers today is SWFAddress.
Some of the benefits of deep linking are:
- Allows Back/Forward functionality within the Flash application.
- Allows users to bookmark or share different states within the application.
- Utilizes browser history.
- Virtual URLs can be indexed by search engines.
- Virtual URLs can be used to create sitemaps that can be submitted to search engines.
One thing worth noting with deep linking is that while search engines index these unique virtual URLs, the URLs all live in the same physical HTML page. That means the same HTML content is being indexed under each of those URLs.
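To make the sitemap idea concrete, here is a rough sketch in Python that turns a list of application states into a sitemap in the sitemaps.org format. The build_sitemap helper, the example.com address, the state names, and the hash-based /#/state/ URL pattern are all assumptions for illustration; use whatever virtual URL format your deep linking setup actually produces.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(base_url, states):
    """Return sitemap XML with one <url> entry per application state.

    The "/#/state/" pattern is an assumed virtual URL format, not a rule.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for state in states:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        loc.text = f"{base_url.rstrip('/')}/#/{state.strip('/')}/"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical application states exposed through deep linking.
sitemap = build_sitemap("http://www.example.com/", ["portfolio", "about", "contact"])
print(sitemap)
```

Since every one of these URLs resolves to the same physical HTML page, a sitemap like this only helps search engines discover the states; it does not give each state its own HTML content.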
Moral of the Story
Google is leading the charge on crawling SWFs, with other search engines such as Yahoo! and MSN following suit, and these are great strides. However, search engines are not yet all crawling SWF content to the same extent, so be careful how much you rely on crawled SWFs to earn your page rank. It all starts with well-written HTML and alternative Flash content. Using methods such as deep linking won't hurt. Most importantly, we must use Flash responsibly.
I'd love to hear more insights/feedback on this matter. I feel there's much more to learn here. If anyone has anything to add, drop a line in the comments.