Category Archives: SEO Industry

Google Algorithm Update Timeline

I have been documenting Google algorithm and SERP updates since late 2010. I’ve probably missed a few, so please feel free to suggest them in the comments! To my knowledge, the last Google PageRank update was August 2, 2012. Google Panda Update #21 began November 7, 2012.

  • 2012.11.07 Panda Update #21
  • 2012.08.20 Panda 3.9.1
  • 2012.08.14 Top 10 Limits to 7 Results for Many Queries
  • 2012.08.10 Repeat DMCA Offenders Penalty
  • 2012.07.24 Panda 3.9 Refresh
  • 2012.06.25 Panda 3.8 Refresh
  • 2012.06.08 Panda 3.7 (1% of Searches in U.S. and Globally)
  • 2012.05.25 Penguin 1.1
  • 2012.04.27 Panda 3.6
  • 2012.04.24 Penguin 1
  • 2012.04.19 Panda 3.5
  • 2012.03.23 Panda 3.4 Refresh
  • 2012.02.26 Panda 3.3
  • 2012.02.03 Panda Refresh
  • 2012.01.25 Panda Update 3.2 – Data Refresh
  • 2012.01.10 Google Search + Your World
  • 2011.11.18 Panda Update 3.1
  • 2011.11.03 Algorithm “Freshness” Update – 35% of searches
  • 2011.10.19 Panda Update 2.5.3
  • 2011.10.13 Panda Update 2.5.2
  • 2011.10.09 Panda Update 2.5.1
  • 2011.09.28 Panda Update 2.5
  • 2011.08.?? Panda Update 2.4
  • 2011.07.23 Panda Update 2.3
  • 2011.06.16 Panda Update 2.2
  • 2011.05.10 Panda Update 2.1
  • 2011.04.11 Panda Update 2.0
  • 2011.02.24 Farmer/Panda Update 1.0
  • 2010.12.02 “Sentiment” found to be an algorithm consideration in Google – reviews, etc.
  • 2010.11.17 Same domain until now did not appear more than two times in Google SERPs – not anymore
  • 2010.11.09 Google launches instant preview of results when hovering over magnifying glass
  • 2010.11.01 Google begins ranking internal pages over home page for some short tail keyword searches
  • 2010.10.27 Google moves map to right sidebar, mixes map and organic listings in SERPs
  • 2010.10.18 Google allows users to specify custom locations
  • 2010.09.30 Google introduces keyword navigation in SERPs with arrow next to currently selected result
  • 2010.09.07 Google Instant launches

Summary of Structured Data: Schema, Microformats and RDFa

Contrary to a common misconception, sitemaps and microdata are not the same thing. A prioritized sitemap emphasizing the most frequently updated and/or important content on your website is indeed helpful, but it does nothing for Google and Bing’s move toward microdata and, in turn, enhanced search result listings, or “rich snippets”. A sitemap is merely a map of the page URLs within a site. Microdata like Schema serves an entirely different purpose – to take content within a page and categorize it, so the search engines can quickly discern what your content is about and deliver it to the right searchers. Schema uses a specific vocabulary to categorize different types of data, so when a search engine reads a user query and detects searcher intent, it can deliver better, richer results to the searcher.

For example, at its most basic, Schema markups will help a search engine quickly understand if your content is about a person, place, thing, product, event, organization, creative work or review. (There are others – these are the biggest, baddest ones in my opinion.) To delve deeper, if your content is about something like a recipe, you can also categorize parts of that recipe – which part is the description, name, image, ingredients, cook time, type of cuisine, and so on. One possible impact of using Schema markup language (which is essentially simple HTML tags wrapped around existing content on the page) is richer snippets in the search results.
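
To make the recipe example a little more concrete, here is a minimal Python sketch that wraps existing recipe content in schema.org Recipe microdata. The function name, the sample recipe and the example.com image URL are all made up for illustration; the `itemscope`, `itemtype` and `itemprop` attributes and the property names (`name`, `image`, `description`, `cookTime`, `recipeCuisine`, `recipeIngredient`) follow my reading of the schema.org vocabulary, so double-check them against the current documentation before relying on this.

```python
from html import escape


def recipe_microdata(name, description, image_url, ingredients, cook_time_iso, cuisine):
    """Return an HTML fragment that tags recipe content with schema.org microdata.

    cook_time_iso is an ISO 8601 duration such as "PT25M" (25 minutes).
    """
    ingredient_items = "\n".join(
        f'    <li itemprop="recipeIngredient">{escape(item)}</li>'
        for item in ingredients
    )
    return (
        '<div itemscope itemtype="https://schema.org/Recipe">\n'
        f'  <h1 itemprop="name">{escape(name)}</h1>\n'
        f'  <img itemprop="image" src="{escape(image_url)}" alt="{escape(name)}">\n'
        f'  <p itemprop="description">{escape(description)}</p>\n'
        f'  <meta itemprop="cookTime" content="{escape(cook_time_iso)}">\n'
        f'  <span itemprop="recipeCuisine">{escape(cuisine)}</span>\n'
        f'  <ul>\n{ingredient_items}\n  </ul>\n'
        '</div>'
    )


# Hypothetical sample data, purely for illustration.
snippet = recipe_microdata(
    name="Skillet Cornbread",
    description="A quick cornbread baked in a cast-iron skillet.",
    image_url="https://www.example.com/images/cornbread.jpg",
    ingredients=["1 cup cornmeal", "1 cup buttermilk"],
    cook_time_iso="PT25M",
    cuisine="American",
)
print(snippet)
```

The visible content stays the same for human readers; the extra attributes only label which part is the name, the image, the cook time and so on.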

Take, for example, the “cornbread recipe” Google search result below. It shows the image, number of reviews, star rating, cook time and calorie count right on the search results page. Search engines are quickly able to pull those parts from your page to display in the search results, because you’ve flagged parts of your content as that type of data.

Schema Recipe Rich Snippet

Schema, Microformats and RDFa are all currently acceptable ways to categorize data on the page, but many experts are leaning toward Schema winning out as the ideal microdata vocabulary that will likely be accepted by the major search engines in the future. The three major search engines – Google, Bing and Yahoo – came out in June of last year advocating Schema in particular, though they said that the other formats would continue to be supported.

There is no evidence that these Schema markups currently affect search engine rankings, though representatives of multiple search engines (including Stefan Weitz from Bing, in a presentation at MozCon 2011 in Seattle) have stated publicly that this new way of categorizing data will become increasingly important. Also, richer search results with more accessible data and images have been shown time and again to increase click-through rates to your content, as the searcher’s eye is drawn to concise, visual data. There is no guarantee that the search engines will elect to use the microdata in a rich snippet, but marking up your content greatly increases the likelihood.

Essentially, Schema, microformats and RDFa take the guesswork out of detecting what your content is about for search engines. They take easy-to-read content written for people and translate it into easy-to-categorize data for search engines.

Here’s my SlideShare on the topic. Structured Data: Schema, Microformats & the Future of SEO:

How do I allow search engines to crawl and index restricted content?

So you have white papers, articles or other information reserved for registered users or paying customers only on your site. (Otherwise known as “restricted content.”) The question is – how do you allow search engine crawlers to index these restricted content pages, without giving away all your restricted content for free? What are the SEO best practices for restricted content? You certainly don’t want to be penalized for showing the search engines one set of content, yet showing the visitors to your site something entirely different.

In order for this content to be visible to potential customers or new members, you probably want these pages showing up in the search results index, right? But you also don’t want to give away the content for free or to users without registering. News organizations (like Time.com for example) do this exact thing all the time!

The two best solutions I know of to date are the Preview method and the First Click Free method, but I welcome your alternative suggestions.

The “Preview Restricted Content” Method

Time.com Restricted Content Login

You can show a preview of the content up to a certain number of words, like Time.com does in this screen shot. You have optimized meta data and a short bit of the optimized content, but the full view isn’t available until a user signs in, registers or pays. Once a user signs in, registers or pays, you grant access to the full content on a separate unique page, which you could theoretically block from crawling in robots.txt or keep out of the index with a noindex robots meta tag (though be sure not to block or noindex the content preview page – that totally defeats the purpose). So show a preview, allow it to be indexed, but block or noindex the full content.
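
As a rough sketch of the preview idea, here is how the word cutoff might look in Python. The function name and the 100-word default are my own placeholders, not anything Time.com has published; pick whatever limit makes sense for your content.

```python
def make_preview(full_text, max_words=100):
    """Return (preview_text, was_truncated) for a restricted article.

    The preview keeps at most max_words words; the full text is only
    served once the visitor has signed in, registered or paid.
    """
    words = full_text.split()
    if len(words) <= max_words:
        # Short pieces fit entirely inside the preview.
        return full_text, False
    return " ".join(words[:max_words]) + "...", True


article = "one two three four five"
preview, truncated = make_preview(article, max_words=3)
print(preview, truncated)
```

The `was_truncated` flag is there so the page template knows whether to show the sign-in or payment prompt underneath the preview.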

It’s certainly not a perfect solution – you don’t have the full content for the bots to crawl and rank. Which is no doubt why Google created a different solution back in 2008 called “Google First Click Free for Web Search.”

The “First Click Free” Method

The basic premise is that the first page a user clicks in the search engine results will be displayed in full without requiring registration or payment. From that point, you block the user with a requirement to log in or make a payment to view additional content when he or she tries to navigate to another restricted resource. The content that visitors see and Googlebot sees is identical, and the first-click user must be able to see the full article even if it has multiple pages (which you can specify to display all on one page for Googlebot and visitors, apparently, or you can use a cookie instead). You need to make sure your robots.txt file lets Googlebot access all the restricted content in full. To recognize a first-click visitor, check the referring URL in the HTTP request-header field, which will be a Google domain.
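
Here is a minimal Python sketch of the referrer check described above. The host list is a small illustrative sample (Google runs many country domains), and both function names are hypothetical. Keep in mind that the Referer header is supplied by the browser and can be spoofed, so treat this as a sketch of the logic rather than real access control.

```python
from urllib.parse import urlparse

# Illustrative sample only; a real list would need to cover every
# Google search domain you want to honor.
GOOGLE_REFERRER_HOSTS = {"google.com", "www.google.com", "www.google.co.uk"}


def is_google_referrer(referer):
    """True if the Referer header value points at a listed Google host."""
    if not referer:
        return False
    host = urlparse(referer).hostname
    return host is not None and host in GOOGLE_REFERRER_HOSTS


def can_view_full_article(referer, is_signed_in):
    """Signed-in users always see the article; others only on a first click from Google."""
    return is_signed_in or is_google_referrer(referer)


print(can_view_full_article("https://www.google.com/search?q=cornbread", False))
print(can_view_full_article("https://www.example.com/some-page", False))
```

An exact-match host list is deliberately strict: a lookalike host such as google.com.example.net fails the check, which a loose substring test would let through.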

Although it’s useful for sites with lots of restricted content, since that content can be indexed fully while most of it stays protected from users initially, First Click Free isn’t a perfect solution either. It has two inherent problems. First, I imagine that many users who searched very specific and/or long-tail key terms to arrive at your content could get everything they need with that one click, and have no intention of paying or registering for your restricted content. Second, if you only have a few restricted resources, or even just one, a “first click free” solution could give away your entire revenue or registration stream. In this case, I suggest the preview solution I mentioned above instead.

The Google Analytics (not set) and (not provided) Nightmare

The impact of Google Analytics displaying (not set) and (not provided) has been a hot topic of discussion lately within the search industry, and things are about to get much worse. Mozilla Firefox enabled Google’s HTTPS encrypted search as its default search service this week, and the change should reach regular users within the next few months according to this article (thanks @webaddict for passing it on). This could impact searches from up to 25% of Internet users who currently use Firefox as their default browser.

I recently did an analysis of one of the larger sites I work on, and the following surfaced. (Organic and Google total traffic numbers are removed for client privacy.) I found that (not provided) keywords do make up single-digit percentages of Google traffic, hovering around 4%, which is in line with what Google’s Matt Cutts claimed back in November in a back and forth with SEOmoz’s Rand Fishkin. But I also discovered that the proportion of (not set) keywords is in the double digits and has been growing steadily and inexplicably since November.

Google Analytics (not set) and (not provided) Data

I’m sure other SEOs can relate – this is a tracking nightmare. We’re all left wondering, “Why are there so many (not set) and (not provided) keywords in my Google Analytics?” with nowhere to turn for answers – or data. My year-over-year comparisons have been rendered unreliable, as I have no way of knowing how my work is affecting the sites in the long run. There is also no way of knowing which keywords are hidden behind (not set) and (not provided). Yes, we can still track traffic, top content, engagement, and so on. But it will be an increasingly difficult struggle to identify which search audience and keyword traffic is most relevant and converts best on our sites, so we can in turn continue to grow and develop content that is most relevant and useful for our users.

So what do (not set) and (not provided) in Google Analytics really mean? From what I’ve gathered, these are the current explanations:

(not provided) – This marker is a result of Google encrypting the key terms that drove traffic to your site if the searcher was a logged-in Google user. Google announced in October 2011 that they wanted to “protect personalized search results” by encrypting those search terms – even though the searcher’s personally identifying data is not revealed to us in the Analytics console – and that SSL Search would become the default search experience for those users. So while this traffic is reported as organic search traffic, you no longer get access to the query terms. Oh, and PPC AdWords users still get to see their keywords; that data is unaffected.

(not set) – The Google Analytics blog in 2009 said that (not set) is “any direct visit or referral visit… because it does not have a keyword, ad content or any other campaign information associated with the visit.” This problem has been attributed to faulty auto-tagging on destination URLs and gclid redirection for keywords in paid campaigns. Hopefully one of you dear readers can explain this part to me, though: Many have said that (not set) refers to traffic coming from referrals or direct landings. However, I am struggling to understand why referral and direct traffic are coming from Traffic Sources > Search > Organic (excluding paid/PPC) and identifying their source as Google organic. Why is it in Search at all, instead of under Direct or Referral traffic? This has been identified as a common issue with AdWords traffic, but according to Analytics this is not paid traffic.

I welcome your insight and expertise in the comments, because I am honestly stumped on the (not set) issue.

UPDATE – SEPTEMBER 2012

Here are updated percentages of (not set) and (not provided) Google organic keyword traffic from November 2011 through August 2012. As you can see, (not set) continues to climb and is now almost at 19% of organic Google keyword traffic. Add in (not provided), and I can’t identify nearly a quarter of the keywords Google organic is sending traffic for. This is still just on this one site (which runs both PPC and organic), and (not set) spiked very suddenly in November from virtually none beforehand. It’s almost entirely desktop traffic, not mobile traffic.

Month       (not provided) %   (not set) %
November    4.00%              13.53%
December    3.98%              14.62%
January     4.23%              14.98%
February    4.06%              15.37%
March       4.59%              16.27%
April       4.21%              13.92%
May         4.39%              16.15%
June        4.37%              16.35%
July        2.63%              16.80%
August      4.94%              18.79%
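
For anyone who wants the combined “invisible keyword” share per month, here is a quick Python sketch that adds the two columns together. The figures are copied straight from the table above; the variable names are just my own.

```python
# (month, not provided %, not set %) from November 2011 through August 2012
monthly = [
    ("November", 4.00, 13.53),
    ("December", 3.98, 14.62),
    ("January", 4.23, 14.98),
    ("February", 4.06, 15.37),
    ("March", 4.59, 16.27),
    ("April", 4.21, 13.92),
    ("May", 4.39, 16.15),
    ("June", 4.37, 16.35),
    ("July", 2.63, 16.80),
    ("August", 4.94, 18.79),
]

# Share of Google organic keyword traffic with no usable keyword at all.
combined = {
    month: round(not_provided + not_set, 2)
    for month, not_provided, not_set in monthly
}

for month, total in combined.items():
    print(f"{month:<9} {total:5.2f}%")
```

August works out to 4.94% + 18.79% = 23.73%, which is where the “nearly a quarter” figure comes from.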

 

UPDATE – NOVEMBER 2012

We had a handful of findings and undertook some projects to clean up our Analytics in hopes of solving this (not set) and (not provided) problem. I’m thrilled to say that the (not set) keywords have plummeted to a comfortable zero. However, for the site referenced above, the (not provided) keywords shot up almost immediately, by nearly as much as (not set) went down – roughly 20% – bringing the total amount of invisible Google organic keywords back to that 25% range we struggled with before.

So what that tells us is that these terms may be interchangeable, though I’m still looking into that and how it lines up with the rise in full SSL browsers encrypting searches, and will update when I know more. Regardless, the “war on keywords” goes on as more full SSL browsers begin to surface, and as Google Chrome gets more and more locked down even for non-logged-in users. It’s unfortunate, really.

Here’s what we did:

  1. Found and replaced all instances of duplicate Analytics code or any outdated Urchin Analytics code with the most current, up-to-date Analytics tracking code. (Truthfully, this should have been in good working order anyway, but we found several parts of the site that still had old or duplicate code on them.)
  2. Found that AdWords was connected to two separate Analytics accounts. We eliminated one duplicate account and ensured all AdWords channels were connected to the proper Analytics UA.
  3. Double-checked all ad tagging and made sure that all ads were pointing to landing pages with the proper www. version of the URL rather than the non-www. version and being forced through a canonicalization redirect.
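
For Item 3, a small script can flag ad destination URLs that would hit the canonicalization redirect. This is only a sketch in Python: the function name is made up, the example.com URLs are placeholders, and it assumes (as on our site) that the www. version is the canonical one.

```python
from urllib.parse import urlparse


def non_www_destinations(urls):
    """Return the destination URLs whose host does not start with "www."."""
    flagged = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if not host.startswith("www."):
            flagged.append(url)
    return flagged


# Placeholder ad destination URLs for illustration.
ad_urls = [
    "http://www.example.com/landing-page",
    "http://example.com/landing-page",
]
print(non_www_destinations(ad_urls))
```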

All of these are important, but if you do nothing else, look into Item 1. Ever since we scoured the site for outdated Analytics code and replaced it with the most current version, we haven’t seen a single “(not set)” keyword in GA under Search > Organic keywords.
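
If you want to automate the Item 1 hunt, something like the following Python sketch could scan a page’s HTML for leftover Urchin code and repeated tracking calls. It assumes the legacy tracker shows up as `urchin.js` or `urchinTracker` and that the async snippet sets its account with `_gaq.push(['_setAccount', ...])`; the function name and the UA-12345-1 ID are hypothetical, and your own pages may embed the code differently.

```python
import re

# Strings that suggest the old Urchin tracker is still on the page.
LEGACY_MARKERS = ("urchin.js", "urchinTracker")

# Matches _setAccount calls such as _gaq.push(['_setAccount', 'UA-12345-1']).
SET_ACCOUNT = re.compile(r"_setAccount['\"]\s*,\s*['\"](UA-\d+-\d+)['\"]")


def audit_tracking(html):
    """Report legacy Urchin markers and every _setAccount call found in html."""
    legacy = [marker for marker in LEGACY_MARKERS if marker in html]
    accounts = SET_ACCOUNT.findall(html)
    return {
        "legacy_markers": legacy,
        "accounts": accounts,
        # More than one _setAccount call hints at duplicated tracking code.
        "duplicate": len(accounts) > 1,
    }


# Made-up page source with one legacy include and a doubled async snippet.
page = """
<script src="http://www.google-analytics.com/urchin.js"></script>
<script>_gaq.push(['_setAccount', 'UA-12345-1']);</script>
<script>_gaq.push(['_setAccount', 'UA-12345-1']);</script>
"""
print(audit_tracking(page))
```

Run it over the saved HTML of each template on the site; any page that reports legacy markers or a duplicate is a candidate for cleanup.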

If this still doesn’t solve your problem, please check out the comments below for a very informative community discussion with lots of potential theories to consider. And if something else worked for you, please share in the comments below so the community can benefit from your learnings!

The SEO Importance of Writing New New Content

A colleague sent me this article about how many sites affected by the Google Panda algorithm updates are still struggling to recover, though at this point it could arguably be phrased “have failed to recover.” The article advises site owners in many situations to simply scrap all their content and start over. Fortunately I had no sites significantly affected by the Google Panda algorithm update, so it would be easy to wipe my brow, feel like I dodged a bullet and call it good. But Michael Martinez brings up a great point. He said he hadn’t created new content in years. It got me thinking about how often we search engine optimizers get stuck in a rut of just making formulaic on-page SEO changes to existing stuff on the site, doing the occasional linkbuilding push, and even getting lazy and automating or syndicating content for the sake of getting content on the site – regardless of how junky it is.

So many SEOs write content only to appease the search engines. So many website managers write content only to make sales. But they’re missing the point. What’s in it for the visitor? If we all took a minute to really think about what the visitors are looking for, we’d see that writing all new content isn’t as hard as we’re making it out to be in our heads. We’re just procrastinating. You know better than anyone what you’re writing about, so just sit down and do it. And make it valuable. And make it interesting. Maybe I’m overstating how simple it is to write great content (I have a journalism degree), but I feel that any great SEO should also be a great writer, or at least a great reader.

I can’t say that I’ve syndicated garbage content, but I can say that I am guilty of letting slide the creation of new, evergreen content (not just fleeting blog articles) that is meant to be useful and relevant with serious longevity. I get used to tweaking and pushing the old stuff, which is great and all, but I get hung up on it because it’s easier to do than to plop down and start researching and writing something new. There are so many benefits to adding legitimate new content to your sites, including providing a more relevant and useful resource to your visitors and potential visitors, but also expanding into new key terms you may not have ranked for before. So my conversations this week will center not on modifying old stuff or trying to capture and comment on fleeting trends for our blogs, but instead on how we can start writing what I’ll call incremental new content. Surely there are fresh questions from folks that we’re not answering if we haven’t added any legitimately new content in years. What areas have we not thought of that we can reach into? What are entirely new areas of coverage we’re missing?

So I will be going through each of my sites and my clients’ sites this week and looking for those areas of opportunity for incremental new content. Each of these sites can speak from a position of very niche expertise, and there are countless good, valuable, new content pages to be had. There are resources out there that we can provide that others aren’t yet, or that no one’s organized into a usable medium yet. It’ll take real thought and legwork, but new new content on our sites is worth it – and not just for SEO.