Archives for seo

How to Recover Lost Pageviews in pushState Experiences

Posted by GeoffKenyon

PushState and AJAX can be used in tandem to deliver content without requiring the entire page to refresh, providing a better user experience. The other week, Richard Baxter dove into the implications of pushState for SEO on Builtvisible. If you’re not familiar with pushState, it’s worth spending some time reading through his post.

To see this approach in action, check out these sites that use pushState and AJAX to deliver content:

Time: When you scroll to the bottom of the article, a new article loads and the URL changes
Halcyon: When you click on a navigation link, the left-hand panel doesn’t refresh

While pushState is really cool and great for UX, it does introduce some analytics issues.

When the content on a page and the URL are updated using AJAX and pushState, in most cases the _trackPageview beacon is not fired and the pageview is not tracked. This artificially inflates your bounce rate while reducing your pages per visit, time on site, and total pageviews, along with other metrics associated with pageviews.
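To make the issue concrete, here is a minimal sketch of the pattern in JavaScript. The loadArticle function, the #content selector, and the example path are hypothetical, and the ga() calls assume the Universal Analytics (analytics.js) snippet is already on the page; without the last two lines, the URL changes but no pageview beacon is sent.

```javascript
// Hypothetical example: load new content via AJAX and update the URL with pushState.
function loadArticle(path) {
  fetch(path)
    .then(function (response) { return response.text(); })
    .then(function (html) {
      // Swap in the new content without a full page refresh
      document.querySelector('#content').innerHTML = html;

      // Update the address bar; this does NOT send a pageview on its own
      history.pushState({}, '', path);

      // Manual fix if you aren't using GTM: tell analytics.js about the new page
      ga('set', 'page', path);
      ga('send', 'pageview');
    });
}
```

If you run your tags through Google Tag Manager, the GTM setup described below achieves the same result without touching your site’s code.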

How to tell if you’re having tracking problems

If you have a very high bounce rate or are generally curious to check whether this is a problem for you, start by installing the GA Debugger extension for Chrome. Then go to the URL you want to investigate and open up the console (Windows: Ctrl + Shift + J; Mac: Cmd + Option + J). Now clear the console using the button at the left, and refresh the URL.

Once you refresh the page, you should see GA debugging output in the console. To confirm that the initial pageview is being tracked, look for a “sent beacon” entry for a pageview.

Once you’ve established that the initial pageview is tracked, click a link to load another page. If GA is properly tracking pageviews, you should see another pageview beacon being sent. If you don’t see one, you have a problem.

Capturing these pageviews with GTM

The good news is that even though this is a huge problem, it can easily be fixed with Google Analytics and Google Tag Manager.

Start by creating a new “History Listener” tag. Set its firing rule to all pages and hit save. This tag simply listens for changes to the URL.

Next, we’ll need a separate tag that fires a pageview whenever the URL History Listener fires. To do this, create a new GA tag.

If you already run Google Analytics from GTM, you’ll simply need to modify your existing tag. This tag should, by default, be set to track pageviews. 

At this point we’ll need to set the firing rules. First, we should make sure the tag is firing on all of our pages for our basic GA installation.

The firing rule for all pages should be a default option.

If you are already running GA via GTM, you’ll already have this set up. You’ll need to create a subsequent firing rule to fire a pageview for this URL History Listener.

To do this, click to add a new firing rule and then select “create new rule.” Name the rule, and then move on to conditions. The default rule should be [url] [contains]; we need to change this to [event] [equals]. Then we’ll set the condition to gtm.historyChange. Now click save.
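For context, here is roughly what the History Listener does under the hood: when the URL changes, it pushes an event named gtm.historyChange into the dataLayer, which is exactly what the [event] [equals] gtm.historyChange rule matches. The sketch below is an illustrative approximation of that behavior, not GTM’s actual code.

```javascript
// Illustrative approximation of a history listener (not GTM's actual implementation).
window.dataLayer = window.dataLayer || [];

var originalPushState = history.pushState;
history.pushState = function () {
  originalPushState.apply(history, arguments);
  // The [event] [equals] gtm.historyChange firing rule matches this push
  window.dataLayer.push({ event: 'gtm.historyChange' });
};

window.addEventListener('popstate', function () {
  window.dataLayer.push({ event: 'gtm.historyChange' });
});
```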

Now you should be all set to hit publish on your updated tag container. Overnight, you should see a change in your pageviews and related metrics.




Is It Possible to Have Good SEO Simply by Having Great Content – Whiteboard Friday

Posted by randfish

This question, posed by Alex Moravek in our Q&A section, has a somewhat complicated answer. In today’s Whiteboard Friday, Rand discusses how organizations might perform well in search rankings without doing any link building at all, relying instead on the strength of their content to be deemed relevant and important by Google.

For reference, here’s a still of this week’s whiteboard!

Video transcription

Howdy Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re chatting about whether it’s possible to have good SEO simply by focusing on great content, to the exclusion of link building.

This question was posed in the Moz Q&A Forum, which I deeply love, by Alex Moravek — I might not be saying your name right, Alex, and for that I apologize — from SEO Agencias in Madrid. My Spanish is poor, but my love for churros is so strong.

Alex, I think this is a great question. In fact, we get asked this all the time by all sorts of folks, particularly people in the blogging world and people with small and medium businesses who hear about SEO and go, “Okay, I think I can make my website accessible, and yes, I can produce great content, but I just either don’t feel comfortable, don’t have time and energy, don’t understand, or just don’t feel okay with doing link building.” Link acquisition through an outreach and a manual process is beyond the scope of what they can fit into their marketing activities.

In fact, it is possible, kind of, sort of. It is possible, but what you desperately need in order for this strategy to work are really two things. One is content exposure, and two, you need time. I’ll explain why you need both of these things.

I’m going to dramatically simplify Google’s ranking algorithm. In fact, I’m going to simplify it so much that those of you who are SEO professionals are going to be like, “Oh God, Rand, you’re killing me.” I apologize in advance. Just bear with me a second.

We basically have keywords and on-page stuff, topical relevance, etc. All your topic modeling stuff might go in there. There’s content quality, all the factors that Google and Bing might measure around a content’s quality. There’s domain authority. There’s link-based authority based on the links that point to all the pages on a given domain that tell Google or Bing how important pages on this particular domain are.

There are probably some topical relevance elements in there, too. There’s page level authority. These could be all the algorithms you’ve heard of like PageRank and TrustRank, etc., and all the much more modern ones of those.

I’m not specifically talking about Moz scores here, the Moz scores DA and PA. Those are rough interpretations of these much more sophisticated formulas that the engines have.

There’s user and usage data, which we know the engines are using. They’ve talked about using that. There’s spam analysis.

Super simplistic. There are these six things, six broad categories of ranking elements. If you have just these four — keywords, on-page content quality, user and usage data, spam analysis, you’re not spammy — without these, without any domain authority or any page authority, it’s next to impossible to rank for competitive terms and very challenging and very unlikely to rank even for stuff in the chunky middle and long tail. Long tail you might rank for a few things if it’s very, very long tail. But these things taken together give you a sense of ranking ability.

Here’s what some marketers, some bloggers, some folks who invest in content nearly to the exclusion of links have found. They have had success with this strategy. They’ve basically elected to entirely ignore link building and let links come to them.

Instead of focusing on link building, they’re going to focus on product quality, press and public relations, social media, offline marketing, word of mouth, content strategy, email marketing, these other channels that can potentially earn them things. Advertising as well potentially could be in here.

What they rely on is that people find them through these other channels. They find them through social, through ads, through offline, through blogs, through very long tail search, through their content, maybe their email marketing list, word of mouth, press. All of these things are discovery mechanisms that are not search.

Once people get to the site, then these websites rely on the fact that, because of the experience people have, the quality of their products, of their content, because all of that stuff is so good, they’re going to earn links naturally.

This is a leap. In fact, for many SEOs, this is kind of a crazy leap to make, because there are so many things that you can do that will nudge people in this link earning direction. We’ve talked about a number of those at Moz. Of course, if you visit the link building section of our blog, there are hundreds if not thousands of great strategies around this.

These folks have elected to ignore all that link building stuff, let the links come to them, and these signals, these people who visit via other channels eventually lead to links which lead to DA, PA ranking ability. I don’t think this strategy is for everyone, but it is possible.

I think in the utopia that Larry Page and Sergey Brin from Google imagined when they were building their first search engine this is, in fact, how they hoped that the web would work. They hoped that people wouldn’t be out actively gaming and manipulating the web’s link graph, but rather that all the links would be earned naturally and editorially.

I think that’s a very, very optimistic and almost naive way of thinking about it. Remember, they were college students at the time. Maybe they were eating their granola, and dancing around, and hoping that everyone on the web would link only for editorial reasons. Not to make fun of granola. I love granola, especially, oh man, with those acai berries. Bowls of those things are great.

This is a potential strategy if you are very uncomfortable with link building and you feel like you can optimize this process. You have all of these channels going on.

For SEOs who are thinking, “Rand, I’m never going to ignore link building,” you can still get a tremendous amount out of thinking about how you optimize the return on investment and especially the exposure that you receive from these and how that might translate naturally into links.

I find looking at websites that accomplish SEO without active link building fascinating, because they have editorially earned those links through very little intentional effort on their own. I think there’s a tremendous amount that we can take away from that process and optimize around this.

Alex, yes, this is possible. Would I recommend it? Only in a very few instances. I think that there’s a ton that SEOs can do to optimize and nudge and create intelligent, non-manipulative ways of earning links that are a little more powerful than just sitting back and waiting, but it is possible.

All right, everyone. Thanks for joining us, and we’ll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com



Announcing LocalUp Advanced: Our New Local SEO Conference (and Early Bird Tickets!)

Posted by EricaMcGillivray

That’s right, Moz fans, we’re diving into the local SEO conference space. Join us Saturday, February 7th in Seattle as we team up with Local U to present LocalUp Advanced, an all-day intensive local SEO conference. You’ll learn next-level tactics for everything from getting reviews and content creation to mobile optimization and local ranking factors. You’ll also have opportunities to attend workshops and meet other people who love local SEO just as much as you do.

Don’t miss the early bird deal! The first 25 tickets receive $200 off registration.

Update: Early bird is sold out!

Moz or Local U Subscribers: $699 ($499 early-bird)
General Admission: $999 ($799 early-bird)

Get your LocalUp Advanced early bird ticket today

Also, to get the best pricing, take a 30-day free trial of Moz Pro or sign up for Local U’s forum.


Who’s speaking at LocalUp Advanced?

Aaron Weiche

Spyder Trap

Aaron Weiche is a digital marketing geek focused on web design, mobile, and search marketing. Aaron is the COO of Spyder Trap in Minneapolis, Local U faculty member, founding board member of MnSearch, and a Local Search Ranking Factors Contributor since 2010.


Cindy Krum

Cindy Krum is the CEO and Founder of MobileMoxie, LLC, a mobile marketing consultancy and host of the most cutting-edge online mobile marketing toolset available today. Cindy is the author of Mobile Marketing: Finding Your Customers No Matter Where They Are, published by Que Publishing.

Dana DiTomaso

Kick Point

Whether at a conference, on the radio, or in a meeting, Dana DiTomaso likes to impart wisdom to help you turn a lot of marketing BS into real strategies to grow your business. After 10+ years and with a focus on local SMBs, she’s seen (almost) everything. In her spare time, Dana drinks tea and yells at the Hamilton Tiger-Cats.


Darren Shaw

Whitespark

Darren Shaw is the President and Founder of Whitespark, a company that builds software and provides services to help businesses with local search. He’s widely regarded in the local SEO community as an innovator, one whose years of experience working with massive local data sets have given him uncommon insights into the inner workings of the world of citation-building and local search marketing. Darren has been working on the web for over 16 years and loves everything about local SEO.


David Mihm

Moz

David Mihm is one of the world’s leading practitioners of local search engine marketing. He has created and promoted search-friendly websites for clients of all sizes since the early 2000s. David co-founded GetListed.org, which he sold to Moz in November 2012. Since then, he’s served as our Director of Local Search Marketing, imparting his wisdom everywhere!


Ed Reese

Ed Reese leads a talented analytics and usability team at his firm Sixth Man Marketing, is a co-founder of Local U, and an adjunct professor of digital marketing at Gonzaga University. In his free time, he optimizes his foosball and disc golf technique and spends time with his wife and two boys.

Jade Wang

Google

If you’ve gone to the Google and Your Business Forum for help (and, of course, you have!), then you know how quickly an answer from Google staffer Jade Wang can clear up even the toughest problems. She has been helping business owners get their information listed on Google since joining the team in 2012. 


Mary Bowling

Local U

Mary Bowling’s been specializing in SEO and local search since 2003. She works as a consultant at Optimized!, is a partner at a small agency called Ignitor Digital, is a partner in Local U, and is also a trainer and writer for Search Engine News. Mary spends her days interacting directly with local business owners and understands holistic local needs.


Mike Blumenthal

Local U

If you’re in Local, then you know Mike Blumenthal, and here is your chance to learn from this pioneer in local SEO, whose years of industry research and documentation have earned him the fond and respectful nickname ‘Professor Maps.’  Mike’s blog has been the go-to spot for local SEOs since the early days of Google Maps. It’s safe to say that there are few people on the planet who know more about this area of marketing than Mike. He’s also the co-founder of GetFiveStars, an innovative review and testimonial software. Additionally, Mike loves biking, x-country skiing, and home cooking.


Mike Ramsey

Nifty Marketing

Mike Ramsey is the president of Nifty Marketing with offices in Burley and Boise, Idaho. He is also a Partner at Local U and many other ventures. Mike has an awesome wife and three kids who put up with all his talk about search.


Dr. Pete Meyers

Moz

Dr. Pete Meyers is the Marketing Scientist for Moz, where he works with the marketing and data science teams on product research and data-driven content. He’s spent the past two years building research tools to monitor Google, including the MozCast project, and he curates the Google Algorithm History.


Rand Fishkin

Moz

Rand Fishkin is the founder of Moz. Traveler, blogger, social media addict, feminist, and husband.


Will Scott

Search Influence

Helping small businesses succeed online since 1994, Will Scott has led teams responsible for thousands of websites, hundreds of thousands of pages in online directories, and millions of visits from search. Today, Will leads nearly 100 professionals at Search Influence putting results first and helping customers successfully market online.

Why should I attend LocalUp Advanced?

Do you work in local SEO, or have an interest in it? If so, then yes, you should definitely join us on February 7th. We believe LocalUp Advanced will be extremely valuable for marketers who are:

  • In-house and spending 25% or more of their time on local SEO
  • Agencies or consultants serving brick-and-mortar businesses
  • Yellow Pages publishers

In addition to keynote-style talks, we’ll have intensive Q&A sessions with our speakers and workshops for you to get direct, one-to-one advice for your business. And as with all Moz events, there will be breakfast, lunch, two snacks, and an after party (details coming soon!) included in your ticket cost. Plus, LocalUp Advanced will take place at the MozPlex in the heart of downtown Seattle; you’ll get to check out Roger’s home!

Get your LocalUp Advanced early bird ticket today

See you in February!



How Big Was Penguin 3.0?

Posted by Dr-Pete

Sometime in the last week, the first Penguin update in over a year began to roll out (Penguin 2.1 hit around October 4, 2013). After a year, emotions were high, and expectations were higher. So, naturally, people were confused when MozCast showed the following data:

The purple bar is Friday, October 17th, the day Google originally said Penguin 3.0 rolled out. Keep in mind that MozCast is tuned to an average temperature of roughly 70°F. Friday’s temperature was slightly above average (73.6°), but nothing in the last few days indicates a change on the scale of the original Penguin update. For reference, Penguin 1.0 measured a scorching 93°F.

So, what happened? I’m going to attempt to answer that question as honestly as possible. Fair warning – this post is going to dive very deep into the MozCast data. I’m going to start with the broad strokes, and paint the finer details as I go, so that anyone with a casual interest in Penguin can quit when they’ve seen enough of the picture.

What’s in a name?

We think that naming something gives us power over it, but I suspect the enchantment works both ways – the name imbues the update with a certain power. When Google or the community names an algorithm update, we naturally assume that update is a large one. What I’ve seen across many updates, such as the 27 named Panda iterations to date, is that this simply isn’t the case. Panda and Penguin are classifiers, not indicators of scope. Some updates are large, and some are small – updates that share a name share a common ideology and code-base, but they aren’t all equal.

Versioning complicates things even more – if Barry Schwartz or Danny Sullivan name the latest update “3.0”, it’s mostly a reflection that we’ve waited a year and we all assume this is a major update. That feels reasonable to most of us. That doesn’t necessarily mean that this is an entirely new version of the algorithm. When a software company creates a new version, they know exactly what changed. When Google refreshes Panda or Penguin, we can only guess at how the code changed. Collectively, we do our best, but we shouldn’t read too much into the name.

Was this Penguin just small?

Another problem with Penguin 3.0 is that our expectations are incredibly high. We assume that, after waiting more than a year, the latest Penguin update will hit hard and will include both a data refresh and an algorithm update. That’s just an assumption, though. I firmly believe that Penguin 1.0 had a much broader, and possibly much more negative, impact on SERPs than Google believed it would, and I think they’ve genuinely struggled to fix and update the Penguin algorithm effectively.

My beliefs aside, Pierre Far tried to clarify Penguin 3.0’s impact on Oct 21, saying that it affected less than 1% of US/English queries, and that it is a “slow, worldwide rollout”. Interpreting Google’s definition of “percent of queries” is tough, but the original Penguin (1.0) was clocked by Google as impacting 3.1% of US/English queries. Pierre also implied that Penguin 3.0 was a data “refresh”, and possibly not an algorithm change, but, as always, his precise meaning is open to interpretation.

So, it’s possible that the graph above is correct, and either the impact was relatively small, or that impact has been spread out across many days (we’ll discuss that later). Of course, many reputable people and agencies are reporting Penguin hits and recoveries, so that raises the question – why doesn’t their data match ours?

Is the data just too noisy?

MozCast has shown me with alarming clarity exactly how messy search results can be, and how dynamic they are even without major algorithm updates. Separating the signal from the noise can be extremely difficult – many SERPs change every day, sometimes multiple times per day.

More and more, we see algorithm updates where a small set of sites are hit hard, but the impact over a larger data set is tough to detect. Consider the following two hypothetical situations:

The data points on the left have an average temperature of 70°, with one data point skyrocketing to 110°. The data points on the right have an average temperature of 80°, and all of them vary between about 75-85°. So, which one is the update? A tool like MozCast looks at the aggregate data, and would say it’s the one on the right. On average, the temperature was hotter. It’s possible, though, that the graph on the left represents a legitimate update that impacted just a few sites, but hit those sites hard.

Your truth is your truth. If you were the red bar on the left, then that change to you is more real than any number I can put on a graph. If the unemployment rate drops from 6% to 5%, the reality for you is still either that you have a job or don’t have a job. Averages are useful for understanding the big picture, but they break down when you try to apply them to any one individual case.

The purpose of a tool like MozCast, in my opinion, is to answer the question “Was it just me?” We’re not trying to tell you if you were hit by an update – we’re trying to help you determine if, when you are hit, you’re the exception or the rule.

Is the slow rollout adding noise?

MozCast is built around a 24-hour cycle – it is designed to detect day-over-day changes. What if an algorithm update rolls out over a couple of days, though, or even a week? Is it possible that a relatively large change could be spread thin enough to be undetectable? Yes, it’s definitely possible, and we believe Google is doing this more often. To be fair, I don’t believe their primary goal is to obfuscate updates – I suspect that gradual rollouts are just safer and allow more time to address problems if and when things go wrong.

While MozCast measures in 24-hour increments, the reality is that there’s nothing about the system limiting it to that time period. We can just as easily look at the rate of change over a multi-day window. First, let’s stretch the MozCast temperature graph from the beginning of this post out to 60 days:

For reference, the average temperature for this time period was 68.5°. Please note that I’ve artificially constrained the temperature axis from 50-100° – this will help with comparisons over the next couple of graphs. Now, let’s measure the “daily” temperature again, but this time we’ll do it over a 48-hour (2-day) period. The red line shows the 48-hour flux:

It’s important to note that 48-hour flux is naturally higher than 24-hour flux – the average of the 48-hour flux for these 60 days is 80.3°. In general, though, you’ll see that the pattern of flux is similar. A longer window tends to create a smoothing effect, but the peaks and valleys are roughly similar for the two lines. So, let’s look at 72-hour (3-day) flux:

The average 72-hour flux is 87.7° over the 60 days. Again, except for some smoothing, there’s not a huge difference in the peaks and valleys – at least nothing that would clearly indicate the past week has been dramatically different from the past 60 days. So, let’s take this all the way and look at a full 7-day flux calculation:

I had to bump the Y-axis up to 120°, and you’ll see that smoothing is in full force – making the window any larger is probably going to risk over-smoothing. While the peaks and valleys start to time-shift a bit here, we’re still not seeing any obvious climb during the presumed Penguin 3.0 timeline.

Could Penguin 3.0 be spread out over weeks or a month? Theoretically, it’s possible, but I think it’s unlikely given what we know from past Google updates. Practically, this would make anything but a massive update very difficult to detect. Too much can change in 30 days, and that base rate of change, plus whatever smaller updates Google launched, would probably dwarf Penguin.

What if our keywords are wrong?

Is it possible that we’re not seeing Penguin in action because of sampling error? In other words, what if we’re just tracking the wrong keywords? This is a surprisingly tough question to answer, because we don’t know what the population of all searches looks like. We know what the population of Earth looks like – we can’t ask seven billion people to take our survey or participate in our experiment, but we at least know the group that we’re sampling. With queries, only Google has that data.

The original MozCast was publicly launched with a fixed set of 1,000 keywords sampled from Google AdWords data. We felt that a fixed data set would help reduce day-over-day change (unlike using customer keywords, which could be added and deleted), and we tried to select a range of phrases by volume and length. Ultimately, that data set did skew a bit toward commercial terms and tended to contain more head and mid-tail terms than very long-tail terms.

Since then, MozCast has grown to what is essentially 11 weather stations of 1,000 different keywords each, split into two sets for analysis of 1K and 10K keywords. The 10K set is further split in half, with 5K keywords targeted to the US (delocalized) and 5K targeted to 5 cities. While the public temperature still usually comes from the 1K set, we use the 10K set to power the Feature Graph and as a consistency check and analysis tool. So, at any given time, we have multiple samples to compare.

So, how did the 10K data set (actually, 5K delocalized keywords, since local searches tend to have more flux) compare to the 1K data set? Here’s the 60-day graph:

While there are some differences in the two data sets, you can see that they generally move together, share most of the same peaks and valleys, and vary within roughly the same range. Neither set shows clear signs of large-scale flux during the Penguin 3.0 timeline.

Naturally, there are going to be individual SEOs and agencies that are more likely to track clients impacted by Penguin (who are more likely to seek SEO help, presumably). Even self-service SEO tools have a certain degree of self-selection – people with SEO needs and issues are more likely to use them and to select problem keywords for tracking. So, it’s entirely possible that someone else’s data set could show a more pronounced Penguin impact. Are they wrong or are we? I think it’s fair to say that these are just multiple points of view. We do our best to make our sample somewhat random, but it’s still a sample and it is a small and imperfect representation of the entire world of Google.

Did Penguin 3.0 target a niche?

Insofar as every algorithm update only targets a select set of sites, pages, or queries, then yes – every update is a “niche” update. The only question we can pose to our data is whether Penguin 3.0 targeted a specific industry category/vertical. The 10K MozCast data set is split evenly into 20 industry categories. Here’s the data from October 17th, the supposed date of the main rollout:

Keep in mind that, split 20 ways, the category data for any given day is a pretty small set. Also, categories naturally stray a bit from the overall average. All of the 20 categories recorded temperatures between 61.7-78.2°. The “Internet & Telecom” category, at the top of the one-day readings, usually runs a bit above average, so it’s tough to say, given the small data set, if this temperature is meaningful. My gut feeling is that we’re not seeing a clear, single-industry focus for the latest Penguin update. That’s not to say that the impact didn’t ultimately hit some industries harder than others.

What if our metrics are wrong?

If the sample is fundamentally flawed, then the way we measure our data may not matter that much, but let’s assume that our sample is at least a reasonable window into Google’s world. Even with a representative sample, there are many, many ways to measure flux, and all of them have pros and cons.

MozCast still operates on a relatively simple metric, which essentially looks at how much the top 10 rankings on any given day change compared to the previous day. This metric is position- and direction-agnostic, which is to say that a move from #1 to #3 is the same as a move from #9 to #7 (they’re both +2). Any keyword that drops off the rankings is a +10 (regardless of position), and any given keyword can score a change from 0-100. This metric, which I call “Delta100”, is roughly linearly transformed by taking the square root, resulting in a metric called “Delta10”. That value is then multiplied by a constant based on an average temperature of 70°. The transformations involve a little more math, but the core metric is pretty simplistic.
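For readers who think better in code, here is a rough JavaScript sketch of that calculation. The function names and the final scaling constant are illustrative assumptions (the actual constant isn’t given above); the logic follows the description of Delta100 and Delta10.

```javascript
// Rough sketch of Delta100 / Delta10 as described above (illustrative only).
// TEMP_SCALE is a placeholder assumption, not MozCast's published constant.
var TEMP_SCALE = 28;

function delta100(yesterdayTop10, todayTop10) {
  // Both arguments are arrays of URLs in rank order (index 0 = position #1).
  var score = 0;
  yesterdayTop10.forEach(function (url, i) {
    var newPos = todayTop10.indexOf(url);
    if (newPos === -1) {
      score += 10; // dropped out of the top 10: +10, regardless of old position
    } else {
      score += Math.abs(newPos - i); // direction-agnostic: #1 to #3 and #9 to #7 are both +2
    }
  });
  return score; // 0-100 per keyword
}

function dailyTemperature(keywordPairs) {
  // keywordPairs: [{ yesterday: [...urls], today: [...urls] }, ...]
  var meanDelta10 = keywordPairs
    .map(function (pair) { return Math.sqrt(delta100(pair.yesterday, pair.today)); })
    .reduce(function (sum, d) { return sum + d; }, 0) / keywordPairs.length;
  return meanDelta10 * TEMP_SCALE; // scaled so the long-run average lands near 70 degrees
}
```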

This simplicity may lead people to believe that we haven’t developed more sophisticated approaches. The reality is that we’ve tried many metrics, and they tend to all produce similar temperature patterns over time. So, in the end, we’ve kept it simple.

For the sake of this analysis, though, I’m going to dig into a couple of those other metrics. One metric that we calculate across the 10K keyword set uses a scoring system based on a simple CTR curve. A change from, say #1 to #3 has a much higher impact than a change lower in the top 10, and, similarly, a drop from the top of page one has a higher impact than a drop from the bottom. This metric (which I call “DeltaX”) goes a step farther, though…
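The exact DeltaX formula isn’t spelled out here, but one way to picture a CTR-weighted score is to weight each URL’s movement by the click-through rate of the position it held; the CTR curve in this sketch is purely illustrative, not MozCast’s actual curve.

```javascript
// Illustrative CTR-weighted flux score in the spirit of DeltaX (assumed curve).
var CTR_CURVE = [0.30, 0.15, 0.10, 0.07, 0.05, 0.04, 0.03, 0.025, 0.02, 0.015];

function deltaX(yesterdayTop10, todayTop10) {
  var score = 0;
  yesterdayTop10.forEach(function (url, i) {
    var newPos = todayTop10.indexOf(url);
    var newCtr = newPos === -1 ? 0 : CTR_CURVE[newPos];
    // A move from #1 to #3 (or a drop off the top of page one) costs far more
    // CTR than the same-sized move near the bottom of the page.
    score += Math.abs(CTR_CURVE[i] - newCtr);
  });
  return score;
}
```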


If you’re still riding this train and you have any math phobia at all, this may be the time to disembark. We’ll pause to make a brief stop at the station to let you off. Grab your luggage, and we’ll even give you a couple of drink vouchers – no hard feelings.


If you’re still on board, here’s where the ride gets bumpy. So far, all of our metrics are based on taking the average (mean) temperature across the set of SERPs in question (whether 1K or 10K). The problem is that, as familiar as we all are with averages, they generally rely on certain assumptions, including data that is roughly normally distributed.

Core flux, for lack of a better word, is not remotely normally distributed. Our main Delta100 metric falls roughly on an exponential curve. Here’s the 1K data for October 21st:

The 10K data looks smoother, and the DeltaX data is smoother yet, but the shape is the same. A few SERPs/keywords show high flux, they quickly drop into mid-range flux, and then it all levels out. So, how do we take an average of this? Put simply, we cheat. We tested a number of transformations and found that the square root of this value helped create something a bit closer to a normal distribution. That value (Delta10) looks like this:

If you have any idea what a normal distribution is supposed to look like, you’re getting pretty itchy right about now. As I said, it’s a cheat. It’s the best cheat we’ve found without resorting to some really hairy math or entirely redefining the mean based on an exponential function. This cheat is based on an established methodology – Box-Cox transformations – but the outcome is admittedly not ideal. We use it because, all else being equal, it works about as well as other, more complicated solutions. The square root also handily reduces our data to a range of 0-10, which nicely matches a 10-result SERP (let’s not talk about 7-result SERPs… I SAID I DON’T WANT TO TALK ABOUT IT!).

What about the variance? Could we see how the standard deviation changes from day-to-day instead? This gets a little strange, because we’re essentially looking for the variance of the variance. Also, noting the transformed curve above, the standard deviation is pretty unreliable for our methodology – the variance on any given day is very high. Still, let’s look at it, transformed to the same temperature scale as the mean/average (on the 1K data set):

While the variance definitely moves along a different pattern than the mean, it moves within a much smaller range. This pattern doesn’t seem to match the pattern of known updates well. In theory, I think tracking the variance could be interesting. In practice, we need a measure of variance that’s based on an exponential function and not our transformed data. Unfortunately, such a metric is computationally expensive and would be very hard to explain to people.

Do we have to use mean-based statistics at all? When I experimented with different approaches to DeltaX, I tried using a median-based approach. It turns out that the median flux for any given day is occasionally zero, so that didn’t work very well, but there’s no reason – at least in theory – that the median has to be measured at the 50th percentile.

This is where you’re probably thinking “No, that’s *exactly* what the median has to measure – that’s the very definition of the median!” Ok, you got me, but this definition only matters if you’re measuring central tendency. We don’t actually care what the middle value is for any given day. What we want is a metric that will allow us to best distinguish differences across days. So, I experimented with measuring a modified median at the 75th percentile (I call it “M75” – you’ve probably noticed I enjoy codenames) across the more sophisticated DeltaX metric.
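In code terms, the idea is roughly this; a minimal sketch, assuming deltaXScores is the array of per-keyword DeltaX values for a single day.

```javascript
// Minimal sketch of the "M75" idea: the flux value at the 75th percentile,
// rather than the mean or the conventional 50th-percentile median.
function m75(deltaXScores) {
  var sorted = deltaXScores.slice().sort(function (a, b) { return a - b; });
  var index = Math.floor(sorted.length * 0.75); // rank-based 75th percentile
  return sorted[index];
}
```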

That probably didn’t make a lot of sense. Even in my head, it’s a bit fuzzy. So, let’s look at the full DeltaX data for October 21st:

The larger data set and more sophisticated metric makes for a smoother curve, and a much clearer exponential function. Since you probably can’t see the 1,250th data point from the left, I’ve labelled the M75. This is a fairly arbitrary point, but we’re looking for a place where the curve isn’t too steep or too shallow, as a marker to potentially tell this curve apart from the curves measured on other days.

So, if we take all of the DeltaX-based M75’s from the 10K data set over the last 60 days, what does that look like, and how does it compare to the mean/average of Delta10s for that same time period?

Perhaps now you feel my pain. All of that glorious math and even a few trips to the edge of sanity and back, and my wonderfully complicated metric looks just about the same as the average of the simple metric. Some of the peaks are a bit peakier and some a bit less peakish, but the pattern is very similar. There’s still no clear sign of a Penguin 3.0 spike.

Are you still here?

Dear God, why? I mean, seriously, don’t you people have jobs, or at least a hobby? I hope now you understand the complexity of the task. Nothing in our data suggests that Penguin 3.0 was a major update, but our data is just one window on the world. If you were hit by Penguin 3.0 (or if you received good news and recovered) then nothing I can say matters, and it shouldn’t. MozCast is a reference point to use when you’re trying to figure out whether the whole world felt an earthquake or there was just construction outside your window. 

