Wednesday, November 12, 2014

The Danger of Crossing Algorithms: Uncovering The Cloaked Panda Update During Penguin 3.0

Posted by glenngabe

Penguin 3.0 was one of the most anticipated algorithm updates in recent years when it rolled out on October 17, 2014. Penguin hadn't run for over a year at that point, and there were many webmasters sitting in Penguin limbo waiting for recovery. They had cleaned up their link profiles, disavowed what they could, and were simply waiting for the next update or refresh. Unfortunately, Google was wrestling with the algo internally and over twelve months passed without an update.

So when Pierre Far finally announced Penguin 3.0 a few days later on October 21, a few things stood out. First, this was not a new algorithm like Gary Illyes had explained it would be at SMX East. It was a refresh and underscored the potential problems Google was battling with Penguin (cough, negative SEO).

Second, we were not seeing the impact that we expected. The rollout seemed to begin with a heavier international focus and the overall U.S. impact has been underwhelming to say the least. There were definitely many fresh hits globally, but there were a number of websites that should have recovered but didn't for some reason. And many are still waiting for recovery today.

Third, the rollout would be slow and steady and could take weeks to fully complete. That's unusual, but makes sense given the microscope Penguin 3.0 was under. And this third point (the extended rollout) is even more important than most people think. Many webmasters are already confused when they get hit during an acute algorithm update (for example, when an algo update rolls out on one day). But the confusion gets exponentially worse when there is an extended rollout.

The more time that goes by between the initial launch and the impact a website experiences, the more questions pop up. Was it Penguin 3.0 or was it something else? Since I work heavily with algorithm updates, I've heard similar questions many times over the past several years. And the extended Penguin 3.0 rollout is a great example of why confusion can set in. That's my focus today.

Penguin, Pirate, and the anomaly on October 24

With the Penguin 3.0 rollout, we also had Pirate 2 rolling out. And yes, there are some websites that could be impacted by both. That added a layer of complexity to the situation, but nothing like what was about to hit. You see, I picked up a very strange anomaly on October 24. And I clearly saw serious movement on that day (starting late in the day ET).

So, if there was a third algorithm update, then that's three potential algo updates rolling out at the same time. More about this soon, but it underscores the confusion that can set in when we see extended rollouts, with a mix of confirmed and unconfirmed updates.

Penguin 3.0 tremors and analysis

Since I do a lot of Penguin work, and have researched many domains impacted by Penguin in the past, I heavily studied the Penguin 3.0 rollout. I then published a blog post analyzing the first ten days of Penguin 3.0, which included some interesting findings for sure.

And based on the extended rollout, I definitely saw Penguin tremors beyond the initial October 17 launch. For example, check out the screenshot below of a website seeing Penguin impact on October 17, 22, and 25.

But as mentioned earlier, something else happened on October 24 that set off sirens in my office. I started to see serious movement on sites impacted by Panda, and not Penguin. And when I say serious movement, I'm referring to major traffic gains or losses all starting on October 24. Again, these were sites heavily dealing with Panda and had clean link profiles. Check out the trending below from October 24 for several sites that saw impact.

A good day for a Panda victim:


A bad day for a Panda victim:


And an incredibly frustrating day for a 9/5 recovery that went south on 10/24:

I saw this enough that I tweeted heavily about it and included a section about Panda in my Penguin 3.0 blog post. And that's when something wonderful happened, and it highlights the true beauty and power of the internet.

As more people saw my tweets and read my post, I started receiving messages from other webmasters explaining that they saw the same exact thing, and on their websites dealing with Panda and not Penguin. And not only did they tell me about it, they showed me the impact.

I received emails containing screenshots and tweets with photos from Google Analytics and Google Webmaster Tools. It was amazing to see, and it confirmed that we had just experienced a Panda update in the middle of a multi-week Penguin rollout. Yes, read that line again. Panda during Penguin, right when the internet world was clearly focused on Penguin 3.0.

That was a sneaky move, Google… very sneaky. :)

So, based on what I explained earlier about webmaster confusion and algorithms, can you tell what happened next? Yes, massive confusion ensued. We had the trifecta of algorithm updates with Penguin, Pirate, and now Panda.

Webmaster confusion and a reminder of the algo sandwich from 2012

So, we had a major algorithm update during two other major algorithm updates (Penguin and Pirate) and webmaster confusion was hitting extremely high levels. And I don't blame anyone for being confused. I'm neck deep in this stuff and it confused me at first.

Was the October 24 update a Penguin tremor or was this something else? Could it be Pirate? And if it was indeed Panda, it would have been great if Google told us it was Panda! Or did they want to throw off SEOs analyzing Penguin and Pirate? Does anyone have a padded room I can crawl into?

Once I realized this was Panda, and started to communicate the update via Twitter and my blog, I had a number of people ask me a very important question:

"Glenn, would Google really roll out two or three algorithm updates so close together, or at the same time?"

Why yes, they would. Anyone remember the algorithm sandwich from April of 2012? That's when Google rolled out Panda on April 19, then Penguin 1.0 on April 24, followed by Panda on April 27. Yes, we had three algorithm updates all within ten days. And let's not forget that the Penguin update on April 24, 2012 was the first of its kind! So yes, Google can, and will, roll out multiple major algos around the same time.

Where are we headed? It's fascinating, but not pretty

Panda is near real-time now

When Panda 4.1 rolled out on September 23, 2014, I immediately disliked the title and version number of the update. Danny Sullivan named it 4.1, so it stuck. But for me, that was not 4.1… not even close. It was more like 4.75. You see, there have been a number of Panda tremors and updates since P4.0 on May 20, 2014.

I saw what I was calling "tremors" nearly weekly based on having access to a large amount of Panda data (across sites, categories, and countries). And based on what I was seeing, I reached out to John Mueller at Google to clarify the tremors. John's response was great and confirmed what I was seeing. He explained that there was not a set frequency for algorithms like Panda. Google can roll out an algorithm, analyze the SERPs, refine the algo to get the desired results, and keep pushing it out. And that's exactly what I was seeing (again, almost weekly since Panda 4.0).

When Panda and Penguin meet in real time…

…they will have a cup of coffee and laugh at us. :) So, since Panda is near real-time, the crossing of major algorithm updates is going to happen. And we just experienced an important one on October 24 with Penguin, Pirate, and Panda. But it could (and probably will) get more chaotic than what we have now. We are quickly approaching a time when major algorithm updates crafted in a lab will be unleashed on the web in near real-time or in actual real time.

And if organic search traffic from Google is important to you, then pay attention. We're about to take a quick trip into the future of Google and SEO. And after hearing what I have to say, you might just want the past back…

Google's brilliant object-oriented approach to fighting webspam

I have presented at the past two SES conferences about Panda, Penguin, and other miscellaneous disturbances in the force. More about those "other disturbances" soon. In my presentation, one of my slides looks like this:

Over the past several years, Google has been using a brilliant, object-oriented approach to fighting webspam and low quality content. Webspam engineers can craft external algorithms in a lab and then inject them into the real-time algorithm whenever they want. It's brilliant because it isolates specific problems, while also being extremely scalable. And by the way, it should scare the heck out of anyone breaking the rules.

For example, we have Panda, Penguin, Pirate, and Above the Fold. Each was crafted to target a specific problem and can be unleashed on the web whenever Google wants. Sure, there are undoubtedly connections between them (either directly or indirectly), but each specific algo is its own black box. Again, it's object-oriented.

Now, Panda is a great example of an algorithm that has matured to where Google highly trusts it. That's why Google announced in June of 2013 that Panda would roll out monthly, over ten days. And that's also why it matured even more with Panda 4.0 (and why I've seen tremors almost weekly).

And then we had Gary Illyes explain that Penguin was moving along the same path. At SMX East, Gary explained that the new Penguin algorithm (which clearly didn't roll out on October 17) would be structured in a way where subsequent updates could be rolled out more easily. You know, like Panda.

And by the way, what if this happens to Pirate, Above the Fold, and other algorithms that Google is crafting in its Frankenstein lab? Well my friends, then we'll have absolute chaos and society as we know it will crumble. OK, that's a bit dramatic, but you get my point.

We already have massive confusion now… and a glimpse into the future reveals a continual flow of major algorithms running in real time, each of which could pummel a site to the ground. And of course, with little or no sign of which algo actually caused the destruction. I don't know about you, but I just broke out in hives. :)

Actual example of what (near) real-time updates can do

After Panda 4.0, I saw some very strange Panda movement for sites impacted by recent updates. And it underscores the power of near real-time algo updates. As a quick example, temporary Panda recoveries can happen if you don't get far enough out of the gray area. And now that we are seeing Panda tremors almost weekly, you can experience potential turbulence several times per month.

Here is a screenshot from a site that recovered from Panda, didn't get far enough out of the gray area, and reentered the strike zone just five days later.

Holy cow, that was fast. I hope they didn't plan any expensive trips in the near future. This is exactly what can happen when major algorithms roam the web in real time. One week you're looking good and the next week you're in the dumps. Now, at least I knew this was Panda. The webmaster could tackle more content problems and get out of the gray area… But the ups and downs of a Panda roller coaster ride can drive a webmaster insane. It's one of the reasons I recommend making significant changes when you've been hit by Panda. Get as far out of the gray area as possible.

An "automatic action viewer" in Google Webmaster Tools could help (and it's actually being discussed internally by Google)

Based on webmaster confusion, many have asked Google to create an "automatic action viewer" in Google Webmaster Tools. It would be similar to the "manual actions viewer," but focused on algorithms that are demoting websites in the search results (versus penalties). Yes, there is a difference by the way.

The new viewer would help webmasters better understand which types of problems are being targeted by algorithms like Panda, Penguin, Pirate, Above the Fold, and others. Needless to say, this would be incredibly helpful to webmasters, business owners, and SEOs.

So, will we see that viewer any time soon? Google's John Mueller addressed this question during the November 3 webmaster hangout (at 34:54).

John explained they are trying to figure something out, but it's not easy. There are so many algorithms running that they don't want to provide feedback that is vague or misleading. But, John did say they are discussing the automatic action viewer internally. So you never know…

A quick note about Matt Cutts

As many of you know, Matt Cutts took an extended leave this past summer (through the end of October). Well, he announced on Halloween that he is extending his leave into 2015. I won't go crazy here talking about his decision overall, but I will focus on how this impacts webmasters as it relates to algorithm updates and webspam.

Matt does a lot more than just announce major algo updates… He actually gets involved when collateral damage rears its ugly head. And there's not a faster way to rectify a flawed algo update than to have Mr. Cutts involved. So before you dismiss Matt's extended leave as uneventful, take a look at the trending below:

Notice the temporary drop off a cliff, then 14 days of hell, only to see that traffic return? That's because Matt got involved. That's the movie blog fiasco from early 2014 that I heavily analyzed. If Matt had not been notified of the drop via Twitter, and had not taken action, I'm not sure the movie blogs that got hit would be around today. I told Peter from SlashFilm that his fellow movie blog owners should all pay him a bonus this year. He's the one who pinged Matt via Twitter and got the ball rolling.

It's just one example of how having someone with power out front can nip potential problems in the bud. Sure, the sites experienced two weeks of utter horror, but traffic returned once Google rectified the problem. Now that Matt isn't actively helping or engaged, who will step up and be that guy? Will it be John Mueller, Pierre Far, or someone else? John and Pierre are greatly helpful, but will they go to bat for a niche that just got destroyed? Will they push changes through so sites can turn around? And even at its most basic level, will they even be aware the problem exists?

These are all great questions, and I don't want to bog down this post (it's already incredibly long). But don't laugh off Matt Cutts taking an extended leave. If he's gone for good, you might only realize how important he was to the SEO community after he's gone. And hopefully it's not because your site just tanked as collateral damage during an algorithm update. Matt might be running a marathon or trying on new Halloween costumes. Then where will you be?

Recommendations moving forward:

So where does this leave us? How can you prepare for the approaching storm of crossing algorithms? Below, I have provided several key bullets that I think every webmaster should consider. I recommend taking a hard look at your site now, before major algos are running in near-real time.

  • Truly understand the weaknesses with your website. Google will continue crafting external algos that can be injected into the real-time algorithm. And they will go real-time at some point. Be ready by cleaning up your site now.
  • Document all changes and fluctuations the best you can. Use annotations in Google Analytics and keep a spreadsheet updated with detailed information.
  • Along the same lines, download your Google Webmaster Tools data monthly (at least). After helping many companies with algorithm hits, that information is incredibly valuable, and can help lead you down the right recovery path.
  • Use a mix of audits and focus groups to truly understand the quality of your site. I mentioned in my post about aggressive advertising and Panda that human focus groups are worth their weight in gold (for surfacing Panda-related problems). Most business owners are too close to their own content and websites to accurately measure quality. Bias can be a nasty problem and can quickly lead to bamboo-overflow on a website.
  • Beyond on-site analysis, make sure you tackle your link profile as well. I recommend heavily analyzing your inbound links and weeding out unnatural links. And use the disavow tool for links you can't remove. The combination of enhancing the quality of your content, boosting engagement, knocking down usability obstacles, and cleaning up your link profile can help you achieve long-term SEO success. Don't tackle one quarter of your SEO problems. Address all of them.
  • Remove barriers that inhibit change and action. You need to move fast. You need to be decisive. And you need to remove red tape that can bog down the cycle of getting changes implemented. Don't water down your efforts because there are too many chefs in the kitchen. Understand the changes that need to be implemented, and take action. That's how you win SEO-wise.
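
To make the "document everything" advice a bit more concrete, here is a minimal Python sketch of how you might line up daily organic sessions against a log of update dates. It is only an illustration and not a tool I'm describing from the post: the function names, the three-day window, the 20% threshold, and the sample traffic numbers are all hypothetical assumptions. The only real data points are the two dates, which come from this post.

```python
from datetime import date, timedelta
from statistics import mean

# Update dates mentioned in this post. The labels are illustrative.
KNOWN_EVENTS = {
    date(2014, 10, 17): "Penguin 3.0 launch",
    date(2014, 10, 24): "unconfirmed Panda update",
}


def percent_shift(sessions, event_day, window=3):
    """Percent change in mean daily sessions around event_day.

    Compares the `window` days starting on event_day with the
    `window` days immediately before it.
    """
    before = mean(sessions[event_day - timedelta(days=i)] for i in range(1, window + 1))
    after = mean(sessions[event_day + timedelta(days=i)] for i in range(window))
    return (after - before) * 100 / before


def flag_events(sessions, events, threshold=20.0, window=3):
    """Return {label: shift} for events whose shift meets the threshold."""
    flagged = {}
    for day, label in sorted(events.items()):
        shift = percent_shift(sessions, day, window)
        if abs(shift) >= threshold:
            flagged[label] = round(shift, 1)
    return flagged


# Hypothetical sample data (NOT real traffic): 1,000 sessions per day
# from October 14 through October 23, then 600 per day through October 27.
start = date(2014, 10, 14)
sample_sessions = {}
for offset in range(14):
    day = start + timedelta(days=offset)
    sample_sessions[day] = 600 if day >= date(2014, 10, 24) else 1000

result = flag_events(sample_sessions, KNOWN_EVENTS)
print(result)
```

With the made-up numbers above, only the October 24 date gets flagged. Keep in mind that a date lining up is a clue and not proof. A real comparison would also need to account for weekday versus weekend patterns, and it's exactly the kind of analysis that gets harder when several rollouts overlap.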

Summary: Are you ready for the approaching storm?

SEO is continually moving and evolving, and it's important that webmasters adapt quickly. Over the past few years, Google's brilliant object-oriented approach to fighting webspam and low quality content has yielded algorithms like Panda, Penguin, Pirate, and Above the Fold. And more are on their way. My advice is to get your situation in order now, before crossing algorithms blend a recipe of confusion that makes it exponentially harder to identify, and then fix, problems riddling your website.

Now excuse me while I try to build a flux capacitor. :)

