Archives for: seo

Hummingbird Unleashed

Posted by gfiorelli1

Sometimes I think that we SEOs could be wonderful characters for a Woody Allen movie: we are stressed, nervous, and paranoid, with a tendency for sudden mood swings… okay, maybe I am exaggerating a little, but that is how we tend to (over)react whenever Google announces something.

Cases like this webmaster, who is desperately convinced he was penalized by Hummingbird, are not uncommon.

One thing that doesn’t help is the lack of clarity coming from Google, which not only never mentions Hummingbird in any official document (for example, in the post for its 15th anniversary), but has also shied away from the details of this epochal update in Amit Singhal’s “off-the-record” statements. In fact, those statements in some ways contributed to the confusion.

When Google announces an update—especially one like Hummingbird—the best thing to do is to avoid trying to understand immediately what it really is based on intuition alone. It is better to wait until the dust settles, recover the original documents, examine those related to them (and any variants), take the time to see the update in action, investigate calmly, and only then try to find the most plausible answers.

This method is not scientific but philological (and therefore the answers can’t be called “surely correct”), yet when it comes to Google and its updates, I consider it a great method to use.

The original documents are the press coverage of the event at which Google announced Hummingbird, and the FAQ that Danny Sullivan published immediately afterward, which refers directly to what Amit Singhal said.

Related documents are the patents that probably underlie Hummingbird, and the observations that experts like Bill Slawski, Ammon Johns, Rand Fishkin, Aaron Bradley, and others have derived from them.

This post is the result of my study of those documents and field observations.

Why did Amit Singhal mix apples with oranges?

When announcing Hummingbird, Amit Singhal said that the Google algorithm had not been updated so deeply since Caffeine in 2010.

The problem is that Caffeine wasn’t an algorithmic change; it was an infrastructural change.

Caffeine’s purpose, in fact, was to optimize the indexation of the billions of Internet documents Google crawls, presenting a richer, bigger, and fresher pool of results to the users.

Hummingbird’s objective, instead, is not a further optimization of the indexation process, but a better understanding of users’ intent when searching, so as to offer them the most relevant results.

Nevertheless, we can affirm that Hummingbird is also an infrastructural update, as it governs the more than 200 elements that make up Google’s algorithm.

The (maybe unconscious) association Amit Singhal created between Caffeine and Hummingbird should tell us:

  • That Hummingbird would not exist if Caffeine had not been deployed in 2010, and hence it should be considered an evolution of Google Search, not a revolution.
  • Moreover, that Hummingbird should be considered Google’s most ambitious attempt to solve the algorithmic issues that Caffeine caused.

Let me explain this last point.

Caffeine, by retiring the so-called “sandbox,” caused the SERPs to be flooded with poor-quality results.

Google reacted by creating “patches” like Panda, Penguin, and the exact-match domain (EMD) updates, among others.

But these updates, so effective for what we define as head and mid-tail queries, were not so effective for a type of query that—mainly because of users’ fast adoption of mobile search—more and more people have begun to use: conversational long-tail queries, or what Amit Singhal has called “verbose queries.”

The evolution of Google’s natural-language recognition, its improved ability to disambiguate entities and concepts through technology inherited from Metaweb and enhanced with the Knowledge Graph, and the huge improvements made in SERP personalization have given Google the theoretical and practical tools not only to solve the problem of long-tail queries, but also to give a fresh start to the evolution of Google Search.

That is the backstory that explains what Amit Singhal said about Hummingbird, paraphrased here by Danny Sullivan:

[Hummingbird] gave us an opportunity […] to take synonyms and knowledge graph and other things Google has been doing to understand meaning to rethink how we can use the power of all these things to combine meaning and predict how to match your query to the document in terms of what the query is really wanting and are the connections available in the documents, and not just random coincidence that could be the case in early search engines.

How does Hummingbird work?

“To take synonyms and knowledge graph and other things…”

Google has been working with synonyms for a long time. If we look at the timeline Google itself shared in its 15th anniversary post, it has used them since 2002, and we can also see that disambiguation (in the sense of orthographic analysis of queries) has been applied since 2001.

Image from “Fifteen years on—and we’re just getting started” by Amit Singhal on the Inside Search blog.

Last year Vanessa Fox wrote “Is Google’s Synonym Matching Increasing?…” on Search Engine Land.

Reading that post and seeing the examples presented, it is clear that synonyms were already used by Google—in connection with the user intent underlying the query—in order to broaden the query and rewrite it to offer the best results to the users.

That same post, though, shows us why relying only on a thesaurus of synonyms, or on knowledge of the most highly ranked queries, was not enough to ensure relevant SERPs (see how Vanessa points out that Google doesn’t consider “dogs” pets in the query “pet adoption,” but does consider “cats”).

Amit Singhal, in this old patent, was also aware that relying on synonyms alone was not a perfect solution, because two words may or may not be synonyms depending on the context in which they are used (e.g., “coche” and “automóvil” both mean “car” in Spanish, but “carro” means “car” only in Latin American Spanish; in Spain it means “wagon”).

Therefore, in order to deliver the best results possible using semantic search, what Google needed to understand better, more easily, and faster was context. Hummingbird is how Google solved that need.

Image from "The Google Hummingbird Patent?" by Bill Slawki on SEO by the Sea.

Synonyms remain essential; Amit Singhal confirmed that in the post-event talk with Danny Sullivan. How they are used now has been described by Bill Slawski in this post, where he dissects the “Synonym identification based on co-occurring terms” patent.
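To make the intuition concrete, here is a deliberately naive Python sketch of co-occurrence-based synonym validation. Everything in it is invented for illustration (the counts, the threshold, the tiny dictionary); Google’s real system obviously works from corpus-scale statistics, not a hand-built table.

# A toy sketch, NOT Google's actual code: the idea behind the
# "Synonym identification based on co-occurring terms" patent is that
# a candidate substitute counts as a synonym only if it co-occurs with
# the other terms of the query often enough. All numbers are invented.

COOCCURRENCE = {
    ("adoption", "pets"): 31000,
    ("adoption", "cats"): 12500,
    ("adoption", "dogs"): 8700,
}

def cooccurs(a: str, b: str) -> int:
    """Symmetric lookup into the (hypothetical) co-occurrence counts."""
    return COOCCURRENCE.get((a, b)) or COOCCURRENCE.get((b, a)) or 0

def is_contextual_synonym(candidate: str, context_terms: list,
                          threshold: int = 10000) -> bool:
    """Accept `candidate` as a query expansion only when it co-occurs
    with every other query term above a minimum threshold."""
    return all(cooccurs(candidate, term) >= threshold
               for term in context_terms)

# For the query "pet adoption": "cats" clears the bar, "dogs" doesn't,
# mirroring the asymmetry Vanessa Fox observed in live SERPs.
print(is_contextual_synonym("cats", ["adoption"]))  # True
print(is_contextual_synonym("dogs", ["adoption"]))  # False

Note how this toy reproduces the “pet adoption” asymmetry above: a substitute can be a perfectly good synonym in general and still fail in a specific query context.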

That patent, then, is also based on the concept of “search entities,” which I described in my last post here on Moz, when talking about personalized search.

Speaking literally, words are not “things” themselves but the verbal representation of things, and search entities are how Google objectifies words into concepts. An object may have a relationship with others that may change depending on the context in which they are used together. In this sense, words are treated like people, cities, books, and all the other named entities usually related to the Knowledge Graph.

The mechanisms Google uses in identifying search entities are especially important in disambiguating the different potential meanings of a word, and thereby refining information retrieval according to a “probability score.”

This technique is not so different from what the Knowledge Graph does when disambiguating, for instance, Saint Peter the Apostle from Saint Peter the Basilica or Saint Peter the city in Minnesota.
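As a rough illustration of that idea (a conceptual sketch, not Google’s implementation), disambiguation can be pictured as choosing the candidate entity whose known context terms best cover the rest of the query. The entities and the naive “probability score” below are made up for the example.

# A conceptual sketch of entity disambiguation, with invented data:
# each candidate "thing" behind the string "saint peter" carries the
# terms it is known to co-occur with, and the query's other words vote.

CANDIDATES = {
    "Saint Peter (apostle)":  {"apostle", "gospel", "rome", "church"},
    "St. Peter's Basilica":   {"basilica", "vatican", "rome", "dome"},
    "Saint Peter, Minnesota": {"minnesota", "city", "county"},
}

def disambiguate(query_terms: set) -> list:
    """Rank candidates by a naive 'probability score': the fraction
    of the query's remaining terms each candidate's context explains."""
    scores = []
    for entity, context in CANDIDATES.items():
        overlap = len(query_terms & context)
        scores.append((entity, overlap / max(len(query_terms), 1)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# "saint peter dome vatican" should resolve to the basilica.
for entity, score in disambiguate({"dome", "vatican"}):
    print(f"{score:.2f}  {entity}")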

Finally, there is a third concept playing an explicit role in what could be the “Hummingbird patent”: co-occurrences.

Integrating these three elements, Google is now (in theory) able:

  1. To better understand the intent of a query;
  2. To broaden the pool of web documents that may answer that query;
  3. To simplify how it delivers information, because if query A, query B, and query C substantively mean the same thing, Google doesn’t need to propose three different SERPs, but just one (see the sketch after this list);
  4. To offer a better search experience, because by expanding the query and better understanding the relationships between search entities (also based on direct/indirect personalization elements), Google can now offer results with a higher probability of satisfying the user’s needs;
  5. As a consequence, to present better SERPs also in terms of ads, because before Hummingbird, verbose queries showed no ads in their SERPs in 99% of cases.
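Point 3 above is essentially query canonicalization. A minimal Python sketch of the idea, with invented queries and rewrite rules (nothing here is Google’s actual data or code): distinct surface queries are normalized onto one canonical intent, so a single SERP can serve all of them.

import re

# Hypothetical rewrite rules: filler words to drop, synonyms to fold.
STOPWORDS = {"what", "is", "the", "best", "to"}
SYNONYMS = {"pizzeria": "pizza restaurant", "close": "near"}

def canonicalize(query: str) -> str:
    """Collapse a verbose query onto a canonical form."""
    words = re.findall(r"[a-z]+", query.lower())
    kept = [SYNONYMS.get(w, w) for w in words if w not in STOPWORDS]
    return " ".join(kept)

queries = [
    "What is the best pizzeria close to Piazza del Popolo?",
    "best pizza restaurant near piazza del popolo",
    "pizzeria near Piazza del Popolo",
]

# All three verbose variants land on the same canonical intent,
# so a single SERP can be built and served for all of them.
for q in queries:
    print(canonicalize(q))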

Maybe Hummingbird could have solved Fred Astaire and Ginger Rogers’ communication issues…

90% of the queries affected, seriously?

Many SEOs have questioned the claim that Hummingbird affected 90% of all queries, for the simple reason that they didn’t notice any change in traffic or rankings.

Apart from the fact that the SERPs were in constant turmoil between the end of August and the first half of September, when Hummingbird first saw the light (though that could just be a coincidence, quite an opportune one indeed), the typical query Hummingbird targets is the conversational one (e.g., “What is the best pizzeria to eat at close to Piazza del Popolo and Via del Corso?”), a type of query that we SEOs usually do not track (well, apart from Dr. Pete, maybe).

Moreover, Hummingbird is about queries, not keywords (much less long-tail ones), as Ammon Johns explained so well in his post “Hummingbird – The opposite of long-tail search.” For that reason, tracking long-tail rankings as a metric of Hummingbird’s impact is simply wrong.

Finally, Hummingbird has not meant the extinction of the classic ranking factors; it is instead a new framework set upon them. If a site was both authoritative and relevant for a query, it will still rank as well as it did before Hummingbird.

So, which sites got hit? Probably those that were relying on very long-tail, keyword-optimized pages but had little or no authority. Therefore, as Rand said in his latest Whiteboard Friday, it is now far more worthwhile to create better linkable/shareable content that also relates semantically to long-tail keywords than to create thousands of long-tail-based pages with little or no quality or utility.

If Hummingbird is a shift to semantic SEO, does that mean that using Schema.org will make my site rank better?

One of the myths that spread very fast when Hummingbird was announced was that it heavily uses structured data as a main factor.

Although it is true that for some months now Google has stressed the importance of structured data (for example, by dedicating a section to it in Google Webmaster Tools), considering Schema.org the magic solution is not correct. It is an example of how we SEOs sometimes confuse the means with the purpose.

Google Data Highlighter is a good alternative to Schema.org, even though it is not as powerful.

What we need to do is offer Google easily understandable context for the topics around which we have created a page, and structured data are helpful in this respect. By themselves, however, they are not enough. As mentioned before, if a page is not considered authoritative (thanks to external links and mentions), it most likely will not have enough strength for ranking well, especially now that long-tail queries are simplified by Hummingbird.
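For instance, marking up a page’s main entity with the Schema.org vocabulary is one common way to hand that context to a crawler. Below is a minimal illustrative snippet that builds a JSON-LD block in Python; the headline, author, and date are placeholders invented for the example, while the types and properties are standard Schema.org vocabulary.

import json

# Minimal illustration: emitting a Schema.org Article as JSON-LD.
# The values are placeholders; the types and properties are standard.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example: a Hummingbird-era guide to topical SEO",
    "author": {"@type": "Person", "name": "Jane Example"},
    "about": ["semantic search", "Google Hummingbird"],
    "datePublished": "2013-11-01",
}

# This string is what would go inside a
# <script type="application/ld+json"> tag in the page's HTML.
print(json.dumps(article, indent=2))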

Is Hummingbird related to the increased presence of the Knowledge Graph and Answers Cards?

Many people came up with the idea that Hummingbird is the translation of the Knowledge Graph to classic Google Search, and that it has a direct connection with the proliferation of Answer Cards. This theory led to some very angry posts ranting against the “scraper” nature of Google.

This is most likely due to the fact that Hummingbird was announced alongside new features of Knowledge Graph, but there is no evident relationship between Hummingbird and Knowledge Graph.

What many have thought of as causation (Hummingbird causing more Knowledge Graph and Answer Cards, hence being the same thing) is most probably simple correlation.

Hummingbird substantially simplified verbose queries into less verbose ones, the latter of which are sometimes complemented with the constantly expanding Knowledge Graph. For that reason, we see a greater number of SERPs presenting Knowledge Graph elements and Answer Cards.

That said, the philosophy behind Hummingbird and the Knowledge Graph is the same, moving from strings to things.

Is Hummingbird strongly based on the Knowledge Base?

The Knowledge Base is potent and pervasive in how Google works, but reducing Hummingbird to just the Knowledge Base would be simplistic.

As we saw, Hummingbird relies on several elements, the Knowledge Base being one of them, especially in queries involving personalization (which should be considered a pervasive layer that affects the whole algorithm).

If Hummingbird relied heavily on the Knowledge Base without complementing it with other factors, it would run into the same issues Amit Singhal was struggling with in the earlier patent about synonyms.

Does Hummingbird mean the end of the link graph?

No. PageRank and link-related elements of the algorithm are still alive and kicking. I would also dare to say that links are even more important now.

In fact, without the authority a good link profile grants to a site, a web page will have even more difficulty ranking now (see what I wrote just above about the fate of low-authority pages).

What is even more important now is the context in which a link is present. We already learned this with Penguin, and Hummingbird reaffirms that inbound links from topically irrelevant contexts are bad links.

That said, Google still has to improve on the link front, as Danny Sullivan said well in this tweet:

Links are the fossil fuel of search relevancy signals. Polluted. Not getting better. And yet, that’s what Google Hummingbird drinks most.
— Danny Sullivan (@dannysullivan) October 18, 2013

At the same time, though (again because of context and entity recognition), brand co-occurrences and co-citations assume an even more important role with Hummingbird.

Is Hummingbird related to 100% (not provided)?

The fact that Hummingbird and 100% (not provided) were rolled out at almost the same time seems to be more than just a coincidence.

If Hummingbird is more about search entities, better information retrieval, and query expansion—an update where keywords by themselves have lost part of the omnipresent value they had—then relying on keyword data alone is not enough anymore.

We should stop focusing only on keyword optimization and start thinking about topical optimization.

This obliges us to think about great content, and not just about “content.” Things like “SEO copywriting” will end up being the same as “amazing copywriting.”

For that reason, as SEOs, we should start understanding how search entities work, and not simply become human thesauruses of synonyms.

If Hummingbird is a giant step toward semantic SEO, then as SEOs our job “is not about optimizing for strings, or for things, but for the connections between things,” as Aaron Bradley brilliantly puts it in this post and deck for SMX East.

Slide deck: “Semantic SEO – The Shift From Strings To Things” by Aaron Bradley, from Search Marketing Expo (SMX).

What must we do to be Hummingbird-friendly?

Let me ask you a few questions, and try to answer them honestly:

  1. When creating/optimizing a site, are you doing it with a clear audience in mind?
  2. When performing on-page optimization for your site, are you following at least these SEO best practices?
    1. Using a clear and not overly complex information architecture;
    2. Avoiding canonicalization issues;
    3. Avoiding thin-content issues;
    4. Creating a semantic content model;
    5. Topically optimizing the content of the site on a page-by-page basis, using natural and semantically rich language and with a landing page-centric strategy in mind;
    6. Creating useful content in several formats, content that you yourself would like to share with your friends and link to;
    7. Implementing Schema.org, Open Graph, and other semantic markup.
  3. Are your link-building objectives:
    1. Better brand visibility?
    2. Gaining referral traffic?
    3. Enhancing the sense of thought leadership of your brand?
    4. Earning links from topically related sites and/or topically related sections of more generalist sites (e.g., news sites)?
  4. As an SEO, is social media offering you these advantages?
    1. Wider brand visibility;
    2. Social echo;
    3. Increased mentions/links in the form of derivatives, co-occurrences, and co-citations on others’ websites;
    4. Organic traffic and growth in brand ambassadors.
If you answered yes to all these questions, you don’t have to do anything but keep up the good work, refine it, and stay creative and engaging. You were likely already seeing your site rank well and gain traffic thanks to the holistic vision of SEO you have.

If you answered no to a few of them, then you just have to correct the things you’re doing wrong and follow the so-called SEO best practices (the 2013 Moz Ranking Factors are a good list of them).

If you honestly answered no to many of them, then you had problems even before Hummingbird was unleashed, and things won’t get better if you don’t radically change your mindset.

Hummingbird is not asking us to rethink SEO or to reinvent the wheel. It is simply asking us to not do crappy SEO… but that is something we should know already, shouldn’t we?



Take the 2013 Moz Industry Survey: Share Your Voice!

Posted by Cyrus-Shepard

We’re very excited to announce the 2013 Moz Industry Survey is ready to take. This is the fourth edition of the survey, which started in 2008 as the SEO Industry Survey and only ran every two years. So much has changed since the last survey that we thought it was important to run it anew in order to gain fresh insights. Some of what we hope to learn and share:

  • Who works in inbound marketing and SEO today?
  • What tactics and tools are most popular?
  • Where are marketers spending their dollars?
  • What does the future of the industry hold?

This year’s survey was redesigned to be easier and only take 5-10 minutes. When the results are in we’ll share the data freely with you, our partners, and the rest of the world.

Prizes

It wouldn’t be the Industry Survey without a few excellent prizes thrown in as an added incentive.

This year we’ve upped the game with prizes we feel are both exciting and perfect for the busy inbound marketer. To see the full sweepstakes terms and rules, go to our sweepstakes rules page. The winners will be announced by June 4th. Follow us on Twitter to stay up to date.

Grand Prize: Attend MozCon 2014 in Seattle

+ Flight
+ Hotel
+ Lunch with an industry expert

Come see us Mozzers in Seattle! The Grand Prize includes one ticket to MozCon 2014 plus airfare and accommodations. We’ll also arrange a one-on-one lunch for you with an industry expert.

2 First Prizes: iPad 2

We’re giving away two separate iPad 2s.

10 Second Prizes: $100 Amazon.com gift cards

Yep, 10 lucky people will win $100 Amazon.com gift cards. Why not buy yourself a nice book?

Why the survey is important

By comparing answers and predictions from one year to the next, we can spot trends and gain insight not easily reported through any other source. This is our best chance to understand exactly where the future of our industry is headed. Some of the things we hope to learn:

  • Demographics: Who is practicing inbound marketing and SEO today? Where do we work and live?
  • Agencies vs. in-house vs. other: How are agencies growing? What’s the average size? Who is doing inbound marketing on their own?
  • Tactics and strategies: What’s working for people today? How have strategies and tactics evolved?
  • Tools and technology: What are marketers using to discover opportunities, promote themselves, and measure the results?
  • Budget and spending: What tools and platforms are marketers investing in?

Every year the Industry Survey delivers new insights and surprises. For example, the chart below (from the 2012 survey) lists average reported salary by role. Will it change in 2013?

2012 SEO Industry Survey

Thanks to our partners

Huge thanks to our partners, who are helping to spread the word and encouraging their audiences to participate in the survey. We’d especially like to give special recognition to Search Engine Land, Buffer, aimClear, SEOverflow, CopyBlogger, Econsultancy, Content Marketing Institute, TopRank Marketing, MarketingProfs, HootSuite, Entrepreneur.com, Distilled, and HubSpot.

Sharing is caring

The number of people who take the survey is very important! The more people who take the survey, the better and more accurate the data will be, and the more insight we can share with the industry.

So please share with your co-workers. Share on social media. Share with your email lists. You can use the buttons below this post to get you started, but remember to take the survey first!




Say Hello to Fresh Alerts: New Mentions and Link Notifications in Your Inbox

Posted by Cyrus-Shepard

Imagine a product similar to Google Alerts, only much better. It’s built specifically for marketers and SEOs. This product not only finds mentions of your keywords and brand, but also reports new links to any website or URL you choose. It comes equipped with advanced search operators to discover new opportunities, and its exportable metrics are sortable by both date and Feed Authority.

To top it all off, it now alerts you via email whenever it finds something new.

Announcing Fresh Alerts from Fresh Web Explorer

For the past few months we’ve enjoyed using Fresh Web Explorer, which has quickly become my favorite new marketing tool. Since its launch, our engineers and developers have been working to add email alerts to the mix and vastly improve its value.

Starting today, when you use Fresh Web Explorer, you can set up alerts for up to 10 queries of your choice. Emails are sent daily whenever anything new is discovered, and because Fresh Web Explorer refreshes its index every 8 hours, you can be notified of new links and mentions literally within hours of their appearing on the web.

When you run a query in Fresh Web Explorer, you have the opportunity to create an alert based on that search.

One key feature is the ability to set your timezone. This helps tailor the reporting specific to your area of the world, so the alerts are more relevant to you.

Fresh Alerts for SEO and inbound marketing

I’ve had the opportunity to beta-test Fresh Alerts for two months, and I can say without hesitation that it’s literally changed the way I do SEO and inbound marketing. We also tested the product with 1,000 Moz beta users, and the feedback has showcased the variety of ways folks are using Fresh Alerts.

1. Link building

While we built Fresh Alerts as a mentions tool, it does a remarkably good job at helping to build links through the process of link reclamation. By using the built-in search operators, you can set your alerts to find non-linking mentions of your brand or keywords on the web.

For example, if I want to search for folks who mention MozRank (a Moz branded term) but don’t include a link to Moz, I’d set up my Fresh Alert like this:

mozrank -rd:moz.com (mentions of MozRank that don’t link to the root domain moz.com)

With this alert set, every day I would get a new Fresh Alert in my inbox with a list of mentions. Looking at the number of non-linking mentions above, I’d better get link building!
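Conceptually, link reclamation is just a filter over fresh mentions. Here is a hypothetical Python sketch of that filter; the Mention structure, the helper, and the data are invented stand-ins, not the Fresh Web Explorer API.

from dataclasses import dataclass

# Hypothetical sketch of the link-reclamation filter: keep fresh
# mentions of a brand term that do NOT already link to our domain.
# `Mention` and the sample data are invented, not Moz's actual API.

@dataclass
class Mention:
    url: str             # page where the brand term appeared
    text: str            # surrounding text
    outbound_links: set  # root domains that page links out to

def reclamation_candidates(mentions: list, our_domain: str = "moz.com") -> list:
    """Mentions worth an outreach email: they talk about us but send
    no link our way (the `-rd:` part of the query above)."""
    return [m for m in mentions if our_domain not in m.outbound_links]

mentions = [
    Mention("http://blog.example/a", "MozRank is handy", {"moz.com"}),
    Mention("http://blog.example/b", "what is MozRank?", {"twitter.com"}),
]
for m in reclamation_candidates(mentions):
    print("Outreach target:", m.url)  # only .../b qualifies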

2. Reputation management

Using Fresh Alerts, you can easily be notified whenever anyone mentions your name or brand on the web. Hopefully the information is positive, which gives you the opportunity to open a relationship or simply stay on top of the conversation. If it’s negative, you can reach out and try to mitigate the damage.

Here’s a Fresh Alert email set up for mentions of Rand Fishkin. (In this case, only mentions that don’t link to moz.com are included.)

You could also use reputation-based alerts to send daily emails to your clients and monitor the conversation about your brand across the web.

3. Competitive intelligence

You can easily set up Fresh Alerts to notify you when your competition is in the news. Better yet, use the search operators to notify you when specific media outlets mention your competition.

In the example below, FWE shows us whenever “Amazon” is mentioned specifically on TechCrunch.

You can also monitor when and where your competitors earn new links. For example, if you wanted to set up a link alert for yourcompetition.com, simply use the Root Domain search operator, like so:

rd:yourcompetition.com (alerts for all new links to the root domain)

By understanding how your competitors earn links and mentions, you may discover new opportunities that are easy to replicate.

4. Reporting and content performance

This is a tip I don’t hear people talk about, so I thought I’d share it. Whenever we publish a big piece of content here at Moz, I set up a Fresh Alert to notify me whenever someone mentions it.

For example, we recently published the 2013 Search Engine Ranking Factors. Because this was an important piece of content for us, I set up two different Fresh Alerts:

  • One Fresh Alert notified me whenever people mentioned “Search Engine Ranking Factors” but didn’t link to Moz
  • Another told me when people linked to the report itself

In the first example, I can reach out to those people who mentioned us without linking to see if I can start a relationship and possibly earn a link.

In the second example, as seen in the graph below, I can monitor our link-building efforts.

5. Discover publishing and guest-post opportunities

Fresh Alerts has to be one of the easiest ways to find distribution, publishing, and guest-post opportunities for your content. Yes, high-quality guest posting, when combined with quality content and smart placement, remains a powerful tactic when integrated with other marketing efforts.

For example, let’s say your subject is “dragons” and you want to find blogs that have posted guest posts in the past few days. You can simply create an alert for “dragons” AND “guest post”.

This alert will notify you whenever a new post is published mentioning both “guest post” and “dragons”.

This technique isn’t limited to guest posting, either. Getting creative, you could find other publishing opportunities for your specific niche.

The details

Starting now, we’ve made Fresh Alerts available to subscribers of Moz Analytics. If you’re not a PRO member, you can sign up for a 30-day trial to give them a spin; the trial also includes access to our new Moz Analytics and full suite of inbound marketing tools.

Here’s what you need to know about Fresh Alerts:

  • Activate up to 10 Alerts per Moz Analytics account
  • When Fresh Web Explorer finds new mentions or links, you receive an email within 24 hours
  • Alerts are sorted by Feed Authority, our new metric created specifically for FWE
  • All advanced operators used by Fresh Web Explorer are available with Fresh Alerts

Have you tried Fresh Web Explorer already? If so, let us know your best ideas for Fresh Alerts in the comments below.



A [Poorly] Illustrated Guide to Google’s Algorithm

Posted by Dr-Pete

Like all great literature, this post started as a bad joke on Twitter on a Friday night:

If you know me, then this kind of behavior hardly surprises you (and I probably owe you an apology or two). What’s surprising is that Google’s Matt Cutts replied, and fairly seriously:

Matt’s concern that even my painfully stupid joke could be misinterpreted demonstrates just how confused many people are about the algorithm. This tweet actually led to a handful of very productive conversations, including one with Danny Sullivan about the nature of Google’s “Hummingbird” update.

These conversations got me thinking about how much we oversimplify what “the algorithm” really is. This post is a journey in pictures, from the most basic conception of the algorithm to something that I hope reflects the major concepts Google is built on as we head into 2014.

The Google algorithm

There’s really no such thing as “the” algorithm, but that’s how we think about it—as some kind of monolithic block of code that Google occasionally tweaks. In our collective SEO consciousness, it looks something like this:

So, naturally, when Google announces an “update”, all we see are shades of blue. We hear about a major algorithm update every month or two, and yet Google confirmed 665 updates (technically, they used the word “launches”) in 2012—obviously, there’s something more going on here than just changing a few lines of code in some mega-program.

Inputs and outputs

Of course, the algorithm has to do something, so we need inputs and outputs. In the case of search, the most fundamental input is Google’s index of the worldwide web, and the output is search engine result pages (SERPs):

Simple enough, right? Web pages go in, [something happens], search results come out. Well, maybe it’s not quite that simple. Obviously, the algorithm itself is incredibly complicated (and we’ll get to that in a minute), but even the inputs aren’t as straightforward as you might imagine.

First of all, the index is actually roughly a dozen data centers distributed across the world, each one a miniature city unto itself, linked by one of the most impressive global fiber-optic networks ever built. So, let’s at least add some color and say it looks something more like this:

Each block in that index illustration is a cloud of thousands of machines and an incredible array of hardware, software and people, but if we dive deep into that, this post will never end. It’s important to realize, though, that the index isn’t the only major input into the algorithm. To oversimplify, the system probably looks more like this:

The link graph, local and maps data, the social graph (predominantly Google+), and the Knowledge Graph—essentially a collection of entity databases—all constitute major inputs that exist beyond Google’s core index of the worldwide web. Again, this is just a conceptualization (I don’t claim to know how each of these is actually structured as physical data), but each of these inputs is a unique and important piece of the search puzzle.

For the purposes of this post, I’m going to leave out personalization, which has its own inputs (like your search history and location). Personalization is undoubtedly important, but it impacts many areas of this illustration and is more of a layer than a single piece of the puzzle.

Relevance, ranking and re-ranking

As SEOs, we’re mostly concerned (i.e. obsessed) with ranking, but we forget that ranking is really only part of the algorithm’s job. I think it’s useful to split the process into two steps: (1) relevance, and (2) ranking. For a page to rank in Google, it first has to make the cut and be included in the list. Let’s draw it something like this:

In other words, first Google has to pick which pages match the search, and then they pick which order those pages are displayed in. Step (1) relies on relevance—a page can have all the links, +1s, and citations in the world, but if it’s not a match to the query, it’s not going to rank. The Wikipedia page for Millard Fillmore is never going to rank for “best iPhone cases,” no matter how much authority Wikipedia has. Once Wikipedia clears the relevance bar, though, that authority kicks in and the page will often rank well.
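As a toy model of that two-step process (with invented pages, topics, and scores, not Google’s actual factors), you could picture it like this: first a relevance gate, then an authority sort over the survivors.

# Toy model of "relevance, then ranking" with invented data:
# step 1 keeps only pages that match the query topic at all;
# step 2 orders the survivors by an authority-style score.

PAGES = [
    {"url": "wikipedia.org/wiki/Millard_Fillmore",
     "topics": {"presidents"}, "authority": 98},
    {"url": "caseshop.example/best-iphone-cases",
     "topics": {"iphone", "cases"}, "authority": 35},
    {"url": "bigreviews.example/iphone-cases",
     "topics": {"iphone", "cases"}, "authority": 71},
]

def serp(query_topics: set) -> list:
    # Step 1: the relevance gate. All the authority in the world can't
    # help a page that simply isn't about the query.
    relevant = [p for p in PAGES if query_topics & p["topics"]]
    # Step 2: ranking among the pages that made the cut.
    return sorted(relevant, key=lambda p: p["authority"], reverse=True)

for page in serp({"iphone", "cases"}):
    print(page["authority"], page["url"])  # Wikipedia never appears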

Interestingly, this is one reason that our large-scale correlation studies show fairly low correlations for on-page factors. Our correlation studies only measure how well a page ranks once it’s passed the relevance threshold. In 2013, it’s likely that on-page factors are still necessary for relevance, but they’re not sufficient for top rankings. In other words, your page has to clearly be about a topic to show up in results, but just being about that topic doesn’t mean that it’s going to rank well.

Even ranking isn’t a single process. I’m going to try to cover an incredibly complicated topic in just a few sentences, a topic that I’ll call “re-ranking.” Essentially, Google determines a core ranking and what we might call a “pure” organic result. Then, secondary ranking algorithms kick in—these include local results, social results, and vertical results (like news and images). These secondary algorithms rewrite or re-rank the original results:

To see this in action, check out my post on how Google counts local results. Using the methodology in that post, you can clearly see how Google determines a base set of rankings, and then the local algorithm kicks in and not only adds new features but re-ranks the original results. This diagram is only the tip of the iceberg—Bill Slawski has an excellent three-part series on re-ranking that covers 40 different ways Google may re-rank results.
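Continuing the same toy model (still invented data and a deliberately crude rule), a secondary algorithm takes the “pure” organic list and rewrites it, for example by boosting and inserting local results:

# Continuing the toy model: a secondary (local) algorithm re-ranks the
# core organic results, promoting some and inserting its own entries.

def rerank_local(organic: list, local_pack: list, boost: int = 20) -> list:
    """Naive re-ranking: local results get a fixed boost, then the two
    lists are merged and re-sorted into the final SERP."""
    boosted = [{**p, "authority": p["authority"] + boost}
               for p in local_pack]
    return sorted(organic + boosted,
                  key=lambda p: p["authority"], reverse=True)

organic = [{"url": "bigreviews.example", "authority": 71},
           {"url": "caseshop.example", "authority": 35}]
local_pack = [{"url": "maps/local-phone-shop", "authority": 60}]

for p in rerank_local(organic, local_pack):
    print(p["authority"], p["url"])  # the local entry jumps to #1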

Special inputs: penalties and disavowals

There are also special inputs (for lack of a better term). For example, if Google issues a manual penalty against a site, that has to be flagged somewhere and fed into the system. This may be part of the index, but since this process is managed manually and tied to Google Webmaster Tools, I think it’s useful to view it as a separate concept.

Likewise, Google’s disavow tool is a separate input, in this case one partially controlled by webmasters. This data must be periodically processed and then fed back into the algorithm and/or link graph. Presumably, there’s a semi-automated editorial process involved to verify and clean this user-submitted data. So, that gives us something like this:

Of course, there are many inputs that feed other parts of the system. For example, XML sitemaps in Google Webmaster Tools help shape the index. My goal is to give you a flavor of the major concepts. As you can see, even the “simple” version is quickly getting complicated.

Updates: Panda, Penguin and Hummingbird

Finally, we have the algorithm updates we all know and love. In many cases, an update really is just a change or addition to some small part of Google’s code. In the past couple of years, though, algorithm updates have gotten a bit more tricky.

Let’s start with Panda, originally launched in February of 2011. The Panda update was more than just a tweak to the code—it was (and probably still is) a sub-algorithm with its own data structures, living outside of the core algorithm (conceptually speaking). Every month or so, the Panda algorithm would be re-run, Panda data would be updated, and that data would feed what you might call a Panda ranking factor back into the core algorithm. It’s likely that Penguin operates similarly, in that it’s a sub-algorithm and separate data set. We’ll put them outside of the big, blue oval:

I don’t mean to imply that Panda and Penguin are the same—they operate in very different ways. I’m simply suggesting that both of these algorithm updates rely on their own code and data sources and are only periodically fed back into the system.

Why didn’t Google just re-write the algorithm to account for the Panda and/or Penguin intent? Part of it is computational—the resources required to process this data are beyond what the real-time infrastructure can probably handle. As Google gets faster and more powerful, these sub-algorithms may become fully integrated (and Panda is probably more integrated than it once was). The other reason may involve testing and mitigating impact. It’s likely that Google only updates Penguin periodically because of the large impact that the first Penguin update had. This may not be a process that they simply want to let loose in real-time.
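In pseudo-Python (purely conceptual, with invented names and numbers), the periodic sub-algorithm pattern described above looks roughly like this: an expensive batch job occasionally refreshes a cached per-site factor, and the real-time core only reads the cache.

# Conceptual sketch of a periodic sub-algorithm like Panda: an
# expensive batch job refreshes a cached per-site quality factor,
# and the real-time core algorithm only reads the cached value.
# All names and numbers are invented for illustration.

PANDA_CACHE = {}   # site -> cached quality factor

def expensive_quality_analysis(site: str) -> float:
    """Stand-in for the costly content-quality analysis."""
    return 0.4 if "thin" in site else 1.0

def run_panda_batch(sites: list) -> None:
    """Far too expensive to run per query; executed every few weeks."""
    for site in sites:
        PANDA_CACHE[site] = expensive_quality_analysis(site)

def core_score(site: str, base_score: float) -> float:
    """Real-time scoring: a cheap lookup of the precomputed factor."""
    return base_score * PANDA_CACHE.get(site, 1.0)

run_panda_batch(["thin-content-farm.example", "solid-guide.example"])
print(core_score("thin-content-farm.example", 50))  # 20.0
print(core_score("solid-guide.example", 50))        # 50.0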

So, what about the recent Hummingbird update? There’s still a lot we don’t know, but Google has made it pretty clear that Hummingbird is a fundamental rewrite of how the core algorithm works. I don’t think we’ve seen the full impact of Hummingbird yet, personally, and the potential of this new code may be realized over months or even years, but now we’re talking about the core algorithm(s). That leads us to our final image:

Image credit for hummingbird silhouette: Michele Tobias at Experimental Craft.

The end result surprised even me as I created it. This was the most basic illustration I could make that didn’t feel misleading or simplistic. The reality of Google today far surpasses this diagram—every piece is dozens of smaller pieces. I hope, though, that this gives you a sense for what the algorithm really is and does.

Additional resources

If you’re new to the algorithm and would like to learn more, Google’s own “How Search Works” resource is actually pretty interesting (check out the sub-sections, not just the scroller). I’d also highly recommend Chapter 1 of our Beginner’s Guide: “How Search Engines Operate.” If you just want to know more about how Google operates, Steven Levy’s book “In The Plex” is an amazing read.

Special bonus nonsense!

While writing this post, the team and I kept thinking there must be some way to make it more dynamic, but all of our attempts ended badly. Finally, I just gave up and turned the post into an animated GIF. If you like that sort of thing, then here you go…

