It’s Penguin-Hunting Season: How to Be the Predator and Not the Prey

Posted by russvirante

Penguin changed everything. Most search engine optimizers like myself, especially those of us who operate in the gray areas of optimization, had long grown comfortable using “ratios” and “percentages” as simple litmus tests to protect ourselves against the wrath of Google. I can’t tell you how many times I both participated in and was questioned about what our current “anchor text ratio” was. Many of you probably remember having the same types of discussions back in the keyword-stuffing days.

We now know unequivocally that Google has used and continues to use statistical tools far more advanced than simply looking at where an individual ranking factor sits on a dial. (We certainly have more than enough Remove’em users to prove that.) My understanding of Penguin and its content-focused predecessor Panda is that Google now employs machine-learning techniques across large data sets to uncover patterns of over-optimization that aren’t easily discerned by the human eye or the crude algorithms of the past. It is with this understanding that I and my company, Virante, Inc., undertook the Open Penguin Data project, and ultimately formed our Penguin Vulnerability Score.

The Open Penguin Data Project

Matt Cutts occasionally gives us a heads-up about future updates, and in the spring of 2013 we were informed that within a few weeks Penguin 2.0 would roll out. I remember exactly when the idea hit me. I was reading “How is Big Data Different from Previous Data” by Bryan Eisenberg, and it occurred to me that the kind of stuff we were doing at Remove’em to detect bad links just didn’t pass muster against the sophistication of the “big data” analysis Google was using at the time. So Virante went to work. We started monitoring a huge number of keywords so that when Penguin 2.0 hit, we could catch winners and losers. In the end, we used data from three awesome providers: Authority Labs (for the initial data set), Stat Search Analytics (for cross-validation), and SerpMetrics (for determining that we weren’t just picking up manual penalties). We identified around 600 losing URL/keyword pairs and matched them with competitors who did not lose rankings.

We then opened the data up to the community at the Open Penguin Data project website and asked members of the community to contribute their ideas for factors that might influence the Penguin algorithm. You can go there right now and download the latest data set, although at present I know there is a bug in the mozRank and mozTrust columns that needs to be fixed. We have identified over 70 factors that may influence Penguin and are still building upon them, with the latest variable update being October 14th. Unfortunately, only certain variables can be added at this point, as freshly gathered data no longer reflects the state of the web when Penguin hit. The data behind the factors came from a large number of sources, beginning with Moz of course, and including Majestic SEO, Ahrefs, GrepWords, and Archive.org.

We then began to analyze the data in a number of ways. The first was through standard correlation coefficients to help determine direction of influence (assuming there was any influence at all). It is important that I deal with the issue of correlation vs. causation here, because I am sure one of you will bring it up.

Correlation vs. causation

The purpose of the Open Penguin Data Project was not and is not to determine which factors cause a Penguin penalty. Rather, we want to determine which factors predict a Penguin penalty so that we can build a reasonable model of vulnerability. Once we know a website’s vulnerability to Penguin, we can start applying different techniques to lower that vulnerability that fall closer to the realm of causal factors.

For example, we will talk about the difference between mozTrust and mozRank being a fairly good predictor of Penguin. No one in their right mind believes that Google consumes Moz’s data to determine whom to penalize and whom not to. However, once we know that a site is likely to be penalized (because we know the mozTrust and mozRank differential), we can start to apply tactics that will likely counter Penguin, such as using the disavow tool or removing spammy links. We aren’t talking about causation; we are talking about prediction.

The analysis of the risk factors

We then began analyzing the data using a couple of methods. First, we used standard mean Spearman correlations to give us an idea of the lay of the land. This allowed us to also build a crude regression model that actually works quite well without much tweaking. This model essentially comes from adding up the correlation coefficients for each of the factors. Obviously, more sophisticated modeling is better than this, but to build a crude overview, this works quite nicely and can be done on the fly. The real magic happens, though, when we apply the same sorts of machine-learning techniques to the data set that Google uses in building models like Penguin.
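
To make that concrete, here is a minimal sketch of such a crude correlation-sum model. The file name and column names are hypothetical stand-ins for the Open Penguin data set; this is not Virante’s actual code.

```python
# A minimal sketch of the crude correlation-sum model described above.
# File and column names are hypothetical stand-ins for the data set.
import pandas as pd
from scipy.stats import spearmanr

df = pd.read_csv("open_penguin_data.csv")   # one row per URL/keyword pair
factors = [c for c in df.columns if c != "penalized"]

# Spearman correlation of each factor with the 0/1 "penalized" label
weights = {f: spearmanr(df[f], df["penalized"])[0] for f in factors}

# Score each row by summing correlation-weighted, rank-normalized factors;
# higher scores indicate a more Penguin-like profile
ranks = df[factors].rank(pct=True)
df["crude_vulnerability"] = sum(weights[f] * ranks[f] for f in factors)
```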

Let me be clear: I do not presume to know what statistical techniques Google used to build their model. However, there are certain types of techniques that are regularly used to answer these types of multivariate classification problems, and I chose to use them. In particular, I chose a gradient boosting algorithm. You can read up on the methodology or the specific implementation we used via scikit-learn, but I’ll save you the headache and tell you what you need to know.
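
For the curious, a bare-bones version of that kind of model looks something like the sketch below. It uses scikit-learn’s GradientBoostingClassifier (the implementation mentioned above); the file name, column names, and parameters are hypothetical and illustrative.

```python
# Bare-bones gradient boosting over the factor set, cross-validated.
# File and column names are hypothetical; parameters are illustrative.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("open_penguin_data.csv")   # numeric factor columns only
y = df.pop("penalized").values              # 1 = lost rankings at the rollout
X = df.values

model = GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"Mean cross-validated AUC: {scores.mean():.3f}")
```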

Most of us think about statistical analysis as putting some variables in Excel and making a nice graph with a linear regression that shows an upward or downward trend. You can see this below. Unfortunately, this grossly over-simplifies complex problems and often produces a crude result where everything above the line is considered different from that below the line, when clearly they are not. As you see in the example graph below, there are plenty of penalized sites that get missed by falling below the line and completely decent sites that are above the line that get hit.

Classification systems work differently. We aren’t necessarily concerned with higher or lower numbers; we are concerned with patterns that might predict something. In this case, we know which sites were hit by Penguin, so we take a whole bunch of factors and see how the patterns among them might accurately predict those hits. We don’t need to draw an arbitrary line; we can individually analyze the points using machine learning, as you see in the example graph below.

The hard part is that machine learning tells us a lot about prediction, but not a lot about how we came to that prediction. That is where some extra work comes into play. With the Open Penguin Data project, we grouped some of the factors by common characteristics and measured the effectiveness of their predictions in isolation from the other factors. For example, we grouped trust metrics together and anchor text metrics together. We then grouped them in combinations as well. This then gave us a model we could use to determine not only increased Penguin vulnerability, but also what factors contributed to that vulnerability and to what degree.
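
As an illustration of that isolation step, the sketch below trains the same classifier on each feature group alone and compares predictive power. The group memberships shown are invented examples, not the project’s actual groupings.

```python
# Measure each feature group's predictive power in isolation by training
# on that group's columns alone. Group contents here are invented examples.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("open_penguin_data.csv")   # hypothetical export, as above
groups = {
    "trust":  ["mozrank_gt_moztrust", "citation_gt_trust_flow"],
    "anchor": ["exact_match_most_common", "phrase_match_any"],
}
for name, cols in groups.items():
    auc = cross_val_score(GradientBoostingClassifier(), df[cols],
                          df["penalized"], cv=5, scoring="roc_auc").mean()
    print(f"{name}: AUC = {auc:.3f}")
```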

So, let’s talk through some of them here.

Anchor text

By now, everyone and their paid search guy knows that manipulated commercial anchor text is a risk factor for both algorithmic and manual penalties. So, of course, we looked at this closely from the start. We actually broke down the anchor text into three subcategories: exact-match anchor text (meaning the keyword is exactly the keyword for which you would like to rank), phrase-match anchor text (meaning the keyword for which you would like to rank occurs somewhere within the anchor text) and commercial anchor text (the anchor text has a high CPC value).

Exact-match anchor text

We broke exact-match anchor text down into six metrics:

  1. The most common anchor to the page is exact match
  2. The highest mozRank passed anchor to the page is exact match
  3. There is at least one exact match anchor to the page
  4. The most common anchor to the domain is exact match
  5. The highest mozRank passed anchor to the domain is exact match
  6. There is at least one exact match anchor to the domain
Across the board, every single metric related to anchor text provided some positive predictive power except for highest mozRank passed anchor to the domain. Importantly, no single factor had a particularly strong mean Spearman correlation coefficient. For example, the highest was merely whether the domain had a single link with the exact match anchor text (.11 correlation coefficient). This is a very weak signal, but our analysis looks for patterns across these weak signals, so we are not hindered by the fact that no single measurement is strongly predictive on its own.
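
To make the six flags concrete, here is a rough sketch of how they could be derived from a backlink export. The DataFrame layout and column names are hypothetical; this is illustrative, not the project’s actual pipeline.

```python
# Rough sketch: derive the six exact-match flags from a backlink table with
# "anchor", "mozrank_passed", and "target" ("page" or "domain") columns.
# Column names and layout are hypothetical.
from collections import Counter
import pandas as pd

def exact_match_flags(links: pd.DataFrame, keyword: str) -> dict:
    kw = keyword.lower().strip()
    flags = {}
    for scope in ("page", "domain"):
        sub = links[links["target"] == scope]
        anchors = sub["anchor"].str.lower().str.strip()
        top = Counter(anchors).most_common(1)           # most common anchor
        best = (anchors.loc[sub["mozrank_passed"].idxmax()]
                if len(sub) else "")                    # highest-mozRank anchor
        flags[f"{scope}_most_common_exact"] = bool(top) and top[0][0] == kw
        flags[f"{scope}_top_mozrank_exact"] = best == kw
        flags[f"{scope}_any_exact"] = bool((anchors == kw).any())
    return flags
```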

For the biggest victims of Penguin, we often see that exact match anchor text is the second- or third-largest predictor. For example, the predictive vulnerability score of the webmaster below could be lowered by 50% simply by addressing exact match anchor text links. For this particular webmaster, the anchor text hit most of the positive anchor-text signals we measure.

Now let me say it one more time: I am not saying that Google is using anchor text to determine who to penalize, rather that it is a strong predictor. Prediction is not causation. However, we can say that the groupings of exact-match anchor text metrics allow us to detect Penguin vulnerability quite well.

Phrase-match anchor text

We broke down phrase-match anchor text in the exact same fashion. This was one of the more surprising features we noticed. In many cases, phrase-match anchor text metrics appeared to be more predictive than exact-match anchor text. Many SEOs, myself included, have long depended on what we call “brand blend” to protect against over-optimization penalties. Instead of just building links for the keyword “SEO”, we might build links for “Virante SEO” or “SEO by Virante”. This may have insulated us against manual anchor text over-optimization penalties, but it does not appear to be the case with Penguin.

In the example I mentioned above, the webmaster hit nearly every exact match anchor text metric. They hit every phrase match metric as well. The combination of these factors increased their predicted likelihood of being impacted by Penguin by a full 100%.

Shoving your high-value keywords inside other phrases doesn’t guarantee you any protection. Now, there are a lot of potential takeaways from this. It could be an artifact of merely doubling the exact match influence (i.e., if you score high on exact match, you will also score high on phrase match). We do see some of this occurring, but it doesn’t appear to explain all of the additional predictive power. It could be that these sites are targeting other related keywords, thereby increasing their exposure to other parts of the Penguin algorithm. All we know is that the predictive power of the model increases greatly when we take phrase-match anchor text into account. Nothing more, nothing less.

Commercial anchor text

This is my favorite measure of all, as it shows how Google can use one of its most powerful ancillary data sets, bid prices for keywords, to detect manipulation of the link graph. We built four metrics around commercial anchor text:

  1. The page has a high-value anchor in a single link
  2. The majority of the anchors are valuable
  3. The majority of links are very high-value anchors
  4. The domain has a high CPC site-wide
Both having high-value anchors and having very high-value anchors were strong predictors of Penguin vulnerability. In keeping with the example we have been using so far, you can see that removing commercial anchor text would have a profound impact on our prediction as to whether or not the site will be impacted by Penguin.
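
Here is a sketch of those four flags, assuming a CPC lookup keyed by anchor text (the kind of keyword-value data a provider like GrepWords supplies). The dollar thresholds are illustrative guesses, not the actual cut-offs used in the project.

```python
# Sketch of the four commercial-anchor flags. The CPC lookup and dollar
# thresholds are illustrative, not the project's actual cut-offs.
import pandas as pd

HIGH_CPC, VERY_HIGH_CPC = 1.00, 5.00   # hypothetical thresholds, in dollars

def commercial_anchor_flags(links: pd.DataFrame, cpc: dict, domain_cpc: float) -> dict:
    # Look up an estimated CPC for each anchor; unknown anchors count as $0
    values = links["anchor"].str.lower().map(lambda a: cpc.get(a, 0.0))
    return {
        "any_high_value_anchor":    bool((values >= HIGH_CPC).any()),
        "majority_valuable":        (values > 0).mean() > 0.5,
        "majority_very_high_value": (values >= VERY_HIGH_CPC).mean() > 0.5,
        "high_cpc_sitewide":        domain_cpc >= HIGH_CPC,
    }
```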

If you’ve been paying close attention, you may have noticed that a lot of these are related. Having exact-match and phrase-match anchor text likely means you have highly commercial anchors. All of these metrics are related to one another and it is their combined weak signals that make it easier to detect Penguin vulnerability.

Link sources

The next issue we tried to target was the quality of link sources. The most obvious step was trying to detect commonly spammed link sources: directories, forums, guestbooks, press releases, articles, and comments. Using a set of footprints to identify these types of links and spidering all of the backlinks of the training set, we were able to build a few metrics identifying sites that either simply had these types of links or had a preponderance of these types of links.

First, it was interesting that every type of link was positively correlated, but only very weakly. You can’t just look at a bunch of article directory submissions and assume they are the cause of a Penguin penalty. However, the combination—that is, a site that relies on four or five of these types of techniques for nearly all of its PageRank—appears to carry a much greater risk.
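
The footprints themselves can be as simple as URL patterns. Below is a toy version of that kind of classifier; the patterns shown are common examples, not the footprint set we actually used.

```python
# Toy footprint classifier: flag commonly spammed link-source types by URL
# pattern. The patterns are common examples, not the project's actual set.
import re

FOOTPRINTS = {
    "directory":     re.compile(r"/directory/|/listings?/", re.I),
    "forum":         re.compile(r"/forum|viewtopic\.php|showthread\.php", re.I),
    "guestbook":     re.compile(r"guestbook", re.I),
    "press_release": re.compile(r"press-?release", re.I),
    "article":       re.compile(r"article-?directory|ezinearticles", re.I),
    "comment":       re.compile(r"#comment-|/comments?/", re.I),
}

def classify_link_source(url: str) -> list:
    return [kind for kind, pattern in FOOTPRINTS.items() if pattern.search(url)]

print(classify_link_source("http://example.com/forum/viewtopic.php?t=123"))
# -> ['forum']
```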

At this point, I want to stop and draw attention to something: Each of these groupings of factors appears to have some good predictive value, but none of them comes even close to explaining the whole vulnerability. Fixing your exact-match anchor text links, or phrase-match links, or commercial anchor links, or poor link sources by themselves will not insulate you from detection. It is the combination of these factors that appears to increase vulnerability to Penguin. Most sites that we see hit by Penguin have vulnerability scores of 250%+, although in Penguin 2.1 we saw them as low as 150%. To reach these levels you have to trip a wide variety of factors, but you don’t have to be egregiously abusing any single SEO tactic.

Site-wides

This was one of the most disappointing features we used. I was certain, as were many, that site-wide links would be the nail in the coffin. Clearly site-wide links are the culprit behind the Penguin penalty, right? Well, the data just doesn’t bear that out.

Site-wides are just too common. The best sites on the web enjoy tons of site-wide links, often in the form of blogrolls. In fact, high site-wide rates correlate negatively with Penguin penalties. Certainly this doesn’t mean you should run out and try to get a bunch of site-wide links, but it does raise the question: Are site-wides really all that bad?

Here is where we find the real difference: anchor text. Commercial anchor text site-wides positively correlate with Penguin penalties. While we cannot say they cause them, there is definitely a predictive leap between just any old site-wide link and a site-wide link with specific, commercially valuable anchor text.

This also helps illustrate another issue we SEOs often run into: anecdotal evidence. It is really easy to look at a link profile, see that site-wide, and immediately assume it is the culprit. It is then seemingly reinforced when we scratch the surface with too simple an analysis like looking at the preponderance of that feature among sites that are penalized. It can and does often lead us down the wrong path.

Trust, trust, trust

Of all the eye-opening, mind-blowing discoveries revealed by the Open Penguin Data project, this one was the biggest. At minimum, we all need to tip our hats to the folks at Moz and Majestic for providing us with great link statistics. Two of the strongest metrics we found for predicting Penguin vulnerability were mozRank greater than mozTrust (Moz) and Domain Citation Flow greater than Domain Trust Flow (Majestic).

Both Moz and Majestic give us statistics that mimic, to a certain degree, the raw flow of PageRank (mozRank and Citation Flow) and an alternative often referred to as TrustRank (mozTrust and Trust Flow). The trust metrics are computed essentially the same way, except that they start with a seed set of trusted URLs like .govs and .edus and give extra value to sites that get links from these trusted sources. These metrics by themselves, while useful in other endeavors, don’t really give us much information about Penguin.

However, if we flag URLs and domains where the trust metrics are lower than the raw link metrics, we score some of the highest correlations of all the factors tested. Even cruder metrics, like whether or not the domain has a single .gov link, help predict Penguin vulnerability. While it would be insane to conclude that Google has a subscription to Moz and Majestic and uses them to build its Penguin algorithm, this much appears to be true: In the aggregate, cheap, low-quality links are a Penguin risk factor.
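
In code, the trust-differential flags are as simple as a comparison. A minimal sketch, assuming domain-level scores pulled from the Moz and Majestic APIs with hypothetical column names:

```python
# Minimal sketch of the trust-differential flags. Column names are
# hypothetical; scores would come from the Moz and Majestic APIs.
import pandas as pd

def trust_flags(domain: pd.Series) -> dict:
    return {
        "mozrank_over_moztrust":    domain["mozrank"] > domain["moztrust"],
        "citation_over_trust_flow": domain["citation_flow"] > domain["trust_flow"],
        "has_gov_link":             domain["gov_linking_domains"] > 0,
    }
```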

What we should learn

There are some really amazing takeaways we can build from this kind of analysis—the kind that, for those of you who are not yet seasoned professionals, should change your understanding of Penguin and Google’s algorithm. So let’s dive in…

Penguin isn’t spam detection, it’s you detection

Try this fact on for size: If you hit every anchor text trigger in the Open Penguin data set, our predictive model actually DROPS in effectiveness. At first glance this seems counter-intuitive. Certainly Google should catch these extreme spammers. The reality, though, is that cruder algorithms generally clear out this type of search spam. If you have done any traditional off-site SEO in the last three years, it will probably create additional Penguin vulnerability. The Penguin update is targeted at catching patterns of optimization that aren’t so easily detected. The most egregious offenders are more likely to be caught by other algorithms than by Penguin. So when the next Penguin update comes out and you hear people complain about how some spam site wasn’t affected, you can be confident that this isn’t a flaw in Penguin, but rather a deliberate choice on Google’s part to create separate algorithms targeting different types of over-optimization.

The rise of the link assassin

It was Ian Curl, a former Virante employee and now head of Link Assassins, who first pointed out to me the clear future of SEO: pruning the link graph. Google has essentially given us the tools via GWT to both view our links and disavow them. A new class of link removal and disavow professionals has grown over the last year: SEOs who can spot a toxic link and guide you through the process of not just cleaning up a penalty but proactively managing your link profile to avoid penalties in the first place. These “link assassins” will play a vital role in the future of SEO, in just the same way that one would expect a professional gardener to prune back excessive growth.

The demise of cheap, scalable white-hat link building

Let me be clear: If it works, Google wants to stop it. We have already heard shots across the bow from Matt Cutts for lily-white link building techniques like guest posting. Right now, the only hold-out I see left is broken link building, which is only scalable under certain circumstances. Google is doing its best to identify the exact same footprints you use to link-build and add them to its own link pattern detection. It isn’t an easy task, which is why Penguin only rolls out every few months, but it appears to be one to which Google is committed.

The growth of integrated SEO

There is no way around it. If you are interested in long-term, effective, white-hat SEO, you are going to have to build integrated campaigns largely focused around content marketing that include multiple forms of advertising. There is a great write-up on this by Chris Boggs over at Internet Marketing Ninjas on Integrating Content Marketing into Traditional Advertising Campaigns. As Google continues to get better at detecting unnatural patterns, it will be harder and harder to get away with turning one dial at a time.

Next steps

The average webmaster or SEO needs to step back and take an honest account of their current SEO footprint. I don’t mean to be fear-mongering; only a fraction of a percent of all websites will ever get hit by Penguin. But 75% of adult males who smoke a pack a day will never get lung cancer, and that doesn’t mean you should keep on smoking because the odds are in your favor. While the odds are greatly in your favor that Penguin will never strike your site, there is no reason not to take simple precautions to determine whether your tactics are putting your site at risk.



How Google is Changing Long-Tail Search with Efforts Like Hummingbird – Whiteboard Friday

Posted by randfish

The Hummingbird update was different from the major algorithm updates like Penguin and Panda, revising core aspects of how Google understands what it finds on the pages it crawls. In today’s Whiteboard Friday, Rand explains what effect that has on long-tail searches, and how those continue to evolve.


Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week I wanted to talk a little bit about Google Hummingbird, and more broadly about how Google has been making many efforts over the years to change how they deal with long-tail search.

Now long tail, if you’re not familiar already, is those queries that are usually lengthier in terms of number of words in the phrase, and refer to more specific kinds of queries than the head of the demand curve, which would be shorter queries, many more people performing them, and, generally speaking, the ones that we in our profession, especially in the SEO world, tend to care about. So those are the shorter phrases, the head of the demand curve, or the chunky middle of the demand curve, versus the long tail.

Long tail, as Google has often mentioned, is a very big proportion of Web search traffic. Anywhere from 20% to maybe 40% or even 50% of all the queries on the Web are in that long tail, in that bucket of fewer than maybe 10 to 50 searches per month. Somewhere around 18% or 20% of all searches, Google says, are extremely long tail, meaning they’ve never seen them before: extremely unique kinds of searches.

I think Google struggles with this a little bit. They struggle from an advertising perspective because they’d like to be able to serve up great ads targeting those long-tail phrases, but inside of AdWords, Google’s Keyword Tool, for self-service advertising, it’s tough to choose those. Google doesn’t often show volume around them. Google themselves might have a tough time figuring out, “hey, is this query relevant to these types of results,” especially if it’s in a long tail.

So we’ve seen them get more and more sophisticated with content, context, and textual analysis over the years, culminating in the release of Hummingbird, which rolled out in August according to Google and which was an infrastructure update more so than an algorithmic update. You can think of Penguin or Panda as algorithmic-style updates, while Google Caffeine, which upgraded their speed, and Hummingbird, which they say upgrades their text processing and their content and context understanding mechanisms, are infrastructure changes that are affecting things today.

I’ll try and illustrate this with an example. Let’s say Google gets two search queries, “best restaurants SEA,” Seattle’s airport, that’s the airport code, the three-letter code, and “where to eat at Sea-Tac Airport in Terminal C.” Let’s say then that we’ve got a page here that’s been produced by someone who has listed the best restaurants at Sea-Tac, and they’ve ordered them by terminals.

So if you’re in Terminal A, Terminal B, or Terminal C, it’s actually easy to walk between most of them, except for N and S. I hope you never have to go to N. It’s just a pain. S is even more of a pain. But in Terminal C they’ve got a Beecher’s Cheese, which I assume would be the best pick, because that place is incredible. It just opened. It’s super good. So they’ve got a listing for this.

A smart Google, an intelligent engineer at Google would go, “Man, you know, I’d really like to be able to serve up this page for this result. But it doesn’t target the words ‘where to eat’ or ‘Terminal C’ specifically, especially not in the title or the headline, the page title. How am I going to figure that out?” Well, with upgrades like what we’ve seen with Hummingbird, Google may be able to do more of this. So they essentially say, “I want to understand that this page can satisfy both of these kinds of results.”

This has some implications for the SEO world. On top of this, we’re also getting kind of biased away from long-tail search, because keyword (not provided) means it’s harder for an individual marketer to say: “Oh, are people searching for this? Are people searching for that? Is this bringing me traffic? Maybe I can optimize my page more towards it, optimize my content for it.”

So this kind of combination and this direction that we’re feeling from Google has a few impacts. Those include more traffic opportunities for great content that isn’t necessarily doing a fantastic job of specific keyword targeting.

So this is kind of interesting from an SEO perspective, because we’re not saying, and I’m definitely not saying, stop doing keyword targeting, stop putting good keywords in your titles and making your pages contextually relevant to search queries. But I am saying if you do a good job of targeting this, best restaurants at SEA or best restaurants Sea-Tac, you might find yourself getting a lot more traffic for things like this. So there’s almost an increased benefit to producing that great content around this and serving, satisfying a number of needs that a search query’s intent might have.

Unfortunately, for some of us in the SEO world, it could get rougher for sites that are targeting a lot of mid and long-tail queries through keyword targeting that aren’t necessarily doing a fantastic job from a content perspective or from other algorithmic inputs. So if it’s the case that I just have to be ranking for a lot of long-tail phrases like this, but I don’t have a lot of the brand signals, link signals, social signals, user usage signals, I just have strong keyword signals, well, Google might be trying to say, “Hey, strong keyword signals doesn’t mean as much to us anymore because now we can take pages that we previously couldn’t connect to that query and connect them up.”

In general, what we’re talking about is Google rewarding better content over more content, and that’s kind of the way that things are trending in the SEO world today.

So I’m sure there’s going to be some great discussion. I really appreciate the input of people who have done extensive analysis on top of Hummingbird. Those include Dr. Pete, of course, from Moz; Bill Slawski from SEO by the Sea; and Ammon Johns, who wrote a great post about this. I think there’ll be more great discussion in the comments. I look forward to joining you there. Take care.

Video transcription by Speechpad.com



A Guide to Spanish Content Marketing

Posted by ZephSnapp

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of Moz, Inc.

If you prefer to read this post in Spanish, you can find it on the Altura Interactive blog.

Just like the rest of the SEO/inbound/internet marketing world, we have spent the last year learning how to shift from link building to link earning, and despite the fact that this stuff is really, really hard, we’ve found some success by building out processes. One challenge (advantage?) that we have is that we work exclusively on Spanish-language projects. This means that while many of the strategies are the same, some of the tactics vary. This post is primarily meant for marketers interested in targeting the Spanish-speaking world, but should also be helpful to full-stack marketers no matter the language.

Are you ready for Spanish content marketing?

There are a ton of great reasons to get started on Spanish-language content marketing. The Hispanic community in the US grew 67% from 2000 to 2011 according to Pew Hispanic, and cleared 50 million people for the first time (although reaching them does not necessarily mean you need to start marketing in Spanish). Also, while growth has slowed in Latin American countries over the past couple of years, their economies are stable enough that they aren’t as affected by downturns in the US economy as they once were. The fact that Hispanic marketing is hot, though, is not by itself a good reason for your business to invest time, money, and sweat equity in marketing to Spanish speakers. You need to validate the concept and ensure it’s the right move for you.

First, translate your main keywords. In some cases this can be fairly straightforward, but there are some terms that shouldn’t be translated, since the English term stands on its own. A great example is “e-commerce”: while there are ways to translate this term, most of the time we leave it in English. But please, a word of advice: Don’t use a machine translation. Get a human being to translate your terms for you, then have someone else check their translations. It is of paramount importance that your terms preserve the same query intent; otherwise, any work on keyword research will be wasted.

Next, make sure that your website is in order, and that you have decided on an international strategy. If you need more help on that front, check out Aleyda’s Whiteboard Friday about International SEO Do’s and Dont’s and her International SEO Checklist. They are both excellent resources if you are thinking about taking your business abroad.

The research phase

We believe in doing persona-based marketing at all times. There is no reason to belabor the point of how to build personas, since this topic has been written about extensively. Suffice it to say, we follow the process explained by Mike King almost to a T. The main difference in our technique is that, in addition to this process, we have to think about the country/region we will be targeting with the content. This informs the type of data we should use for a given piece of content. For example, if you are going after US-based Hispanics, you may not even need to create the content in Spanish!

Armed with these personas, we find actual people who are active on social media and see what type of content they are sharing. Followerwonk is a great way to do this. These are not necessarily prospects, but it’s absolutely necessary to drill down as much as possible; otherwise your outreach will not be nearly as effective.

Arm yourself with information

If you are going to create interesting content for Latin American audiences, you are going to need data. Lots of it. Luckily for you, we’ve gathered a ton of data resources from all over Latin America. Some of them are country specific, but others look at the region as a whole. The information is in Spanish, but as we say in Mexico, “gajes del oficio” (comes with the territory). At least we’ve translated the description of the databases so you’ll be able to find what you are looking for. It is also a living document. As we find more data sets, they will be added (and if you have any suggestions, please put them in the comments, either here or on that post).

Since you already have your personas built, you can easily decide the data that makes the most sense for your project, and then move on to another important step:

Building the content

If you are a data-driven marketer (the best kind, in my opinion), your aim when diving into the data has to be to understand the story that the data is telling you, and how you can use it to promote your client. Once we have the story in place, we start thinking about how best to present the data. In many cases, a great blog post will do the trick. In those cases, we have one person start writing titles. We write a minimum of five, because we want to stimulate creative thought—it is rare that the first idea is the best.

Our lead editor reviews the proposals with the author, and together they decide which best fits the subject, as well as the websites/people the post will be targeting. The post is then written, reviewed by the editor, and then reviewed by another content creator to ensure that the piece is focused, creative, and grammatically sound.

In many cases, users will respond more favorably to a visualization than to text. This is especially true if you are explaining a process or giving instructions. We’ve found that video can be an awesome way to get through to these people. If you don’t have the budget or the ability to shoot a video yourself (although you should—as Phil Nottingham explained at MozCon, good video can be created pretty cheaply), PowToon allows you to create an animated explanation video, even if you don’t have incredible design chops.

If you must create an infographic, at least try to be original in how you present it. We’ve used Piktochart and Visual.ly just like everyone else, but there are a ton of other ways to present data. We’ve created a list of data visualization resources that includes some very unusual ways of presenting data. In many cases, the main investment is in learning how to use the platform.

Shameless Plug: In my Mozinar next Tuesday I’ll be sharing the easiest way to build resources with outreach prospects built in. It’s seriously awesome. You should sign up now. ¡Por favor!

Prospecting for outreach

Generally speaking, we are looking for:

People

Usually the best way to find experts in a given vertical is to look at Twitter, and the best way to qualify them is via Followerwonk. Enough blog posts have been written about this already, so there is no need for us to get into that here.

Websites

If you are really strapped for cash, all you need is a list of keywords for your vertical and Google’s advanced operators. We use these on occasion, but most of the time, it is faster and more efficient to lean on tools built by others.
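
If you do go the manual route, generating those operator queries is trivial to script. A quick sketch follows; the operator patterns and keywords are common examples, not a definitive footprint list.

```python
# Generate advanced-operator prospecting queries from a keyword list.
# Patterns are common examples, including Spanish-language footprints.
patterns = [
    'intitle:"{kw}" inurl:blog',
    '"{kw}" "escribe para nosotros"',   # "write for us"
    '"{kw}" intitle:recursos',          # resource pages
]
keywords = ["marketing de contenidos", "comercio electrónico"]  # hypothetical

for kw in keywords:
    for pattern in patterns:
        print(pattern.format(kw=kw))
```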

Link Prospector supports multilingual queries, and if you want to get a great list of prospects quickly, this is a great way to find them. (Full disclosure: We helped build the multilingual tool, and while we didn’t profit from it, we do get to use it for free. Still, if you told me I could only use Moz and one other tool, this would be it).

Buzzstream is an awesome tool that also supports multilingual queries, and doubles as a way to track which prospects are in which stage of a relationship. We have found that the contact information the tool pulls is not particularly accurate for websites in Spanish, so if you are using this tool, don’t depend on it—go get the information yourself. Another platform we’ve been using that has proven helpful is GroupHigh. Their platform is pricey, but the prospects you can get from it are excellent, especially if you are doing a bilingual English/Spanish outreach campaign. The metrics they provide are based on Moz’s stats as well as social shares, but they don’t always coincide with what we find when we check sites by hand.

To be sure, we prequalify every single website we are going to do outreach to, and we craft every single pitch individually to ensure that it is more likely to be looked upon favorably by our prospective partners.

Once we have our prospects, we separate them into tiers. The top tier is of the most important people and websites in a sphere. We know that getting in touch with and convincing these targets to share our content will be extraordinarily hard, simply because they are pitched to so often. The advantage we have is that most of the pitches they receive totally suck. Knowing how to approach each influencer can make or break your outreach efforts, which leads to our next point:

Outreach to influencers

The goal of any outreach campaign is to get the person/website you’ve targeted to share your content piece, right? In most cases, no matter the quality of your pitch, it will be ignored. This is because some websites are abandoned, the webmaster might be too busy with other work (like a day job), or they simply might not care enough to respond. These are the facts.

And then there is the question of culture and language. We’ve used templates developed by some of the best link builders in the US and seen zero or even negative response. So, it is crucial to localize not just the content, but also the approach. By following our process, you can increase your engagement rate when doing outreach, especially when it is for a piece of content you have created. Here are a few tips that we’ve found to be effective when doing outreach to Spanish-speaking webmasters, bloggers, and journalists:

1) Write it in Spanish

I know that this might seem obvious, but my friends who are bloggers—including for the oldest blog in Mexico—receive dozens of pitches from professional PR companies IN ENGLISH. Unforgivable.

2) Make it relevant

Even if the piece of content that you are promoting is only loosely related to the target site, make sure that you make an argument for why it would be interesting to the readership of that site. Yes, this means you can’t just blast emails. Too bad.

3) Keep it short

In Spanish, we have a tendency to be a bit verbose. In fact, we use more words to explain something than people usually do in English. That being said, it is still better to be concise.

4) Have a hook

Whenever you are doing outreach, the goal is to provide value to your client or company. Keep in mind, however, that webmasters don’t care about how great it will be for you if they share your latest infographic about dog food. They care about their readers and community, so make sure that your pitch addresses the benefits for them, not for you.

5) Address the webmaster how (s)he addresses users

In Spanish, you can address readers either formally or informally. By making your outreach consistent with how they address their readers, you can be sure that your pitch fits their style.

6) Be legit, be honest

Despite what I’ve heard about other markets, we’ve found that being TAGFEE is the best way to get results from an outreach campaign. That doesn’t mean that you can’t sugarcoat your outreach (“Links, Please” is probably not the best subject line), but we send emails from our own domain, and own up to working on behalf of a client. We even link back to our profile pages in our outreach emails.

7) Prioritize outreach method

The best method for outreach depends on who you are reaching out to. This is our priority list when reaching out to bloggers, for example:

  1. Contact form
  2. Facebook
  3. Email
  4. Twitter
In our experience, the first two methods are easily the most effective. This is another place where being open and honest works to our advantage. Since we are using our own Facebook profiles to conduct outreach, prospects can look at our pictures, read our updates, and see that we are human beings, just like them. They are far less likely to say no to someone who likes the same band as them, right?

Of course, if you are reaching out to a journalist (or even a web-based magazine), it is probably going to be best to reach out via phone. Having a prioritized list of methods makes the outreach specialist’s work easier.

There is obviously a lot more that goes into outstanding Spanish content marketing, but this guide is here to give you the basics. If you want to dig deeper into our Spanish digital marketing processes, please sign up for my Mozinar. ¡Muchas Gracias!

If you would prefer to read this post in Spanish, check it out on the Altura Interactive blog.




Growing Sales Without a Sales Team: The Power of Distribution

Posted by AndrewDumont

Early on, we made the decision not to grow a sales team at Moz. We’re not anti-sales, per se; it’s just that it doesn’t fit our culture. We believe in practicing what we preach—inbound marketing—not interruption selling.

Consequently, that poses a bit of a problem for a B2B SaaS business like Moz. Growing through traditional inbound channels is immensely powerful, but at a certain scale, maintaining linearity in growth through content, social media, and search becomes difficult. In my Business Development role at Moz, it’s my job to find the channels that will introduce growth at scale in a predictable way.

Which brings me to distribution.

Just over two years ago when I started at Moz, we began to ponder a simple question: If we offered an extended free trial on Moz (45, 60, 90, or 120 days) to select partners, could it move the needle on growth?

Before we’d be able to answer that question, however, we needed a few assets. The first was what we internally call a “partner page”: a lander that factors in a coupon code at check-out and offers a soft entry point from a third-party site. An example, from a recent partnership we launched with Get Startup Tools, can be found below. It should be said that this is not an ideal partner page; there’s much to be tested in the way of alternate text length and incorporating partner logos, which have proven to bump conversion in relevant studies.

Next, we needed something to provide our partners with value beyond the extended trial period they’d be offering to their community through the distribution. This brought us to the concept of a “perks page”: a collection of top web services that we could offer to Moz subscribers at a discounted price, in a sense offering what we were looking for in return. With these two assets in place, we were ready to go.

Moz Perks

Which brings me back to the question I teed up for myself originally. Yes, it could move the needle on growth, but how much? Let’s take a look. Below is a breakdown of free trials and paid conversions that have come directly from the distribution channel since January of 2012.


Looking at these numbers, however unsophisticated the graph may be, raises another question: Sure, the numbers are growing, but do they perform as well as organic free trials, or do they churn out at a higher rate? Below is an analysis of just that, comparing conversion rates and churn of organic trials versus distribution trials, broken down by month of the subscription.


As you’ll notice, churn in months 0 and 1 is much higher than organic, but it then levels out to something more manageable, a rate very similar to that of organic. Oddly enough, when we looked at trial length and the corresponding conversion rate, it didn’t increase with length. Out of a fairly large sample set of 45, 60, 75, 90, and 120-day trials, the 60-day trial performed best by far. Counterintuitive, to say the least. From a holistic view, the conversion rate was lower, but not by an insane amount.
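
For anyone wanting to run the same comparison, the analysis boils down to conversion rate and retention by acquisition channel. A rough sketch follows, with a hypothetical export standing in for our internal subscription data.

```python
# Rough sketch of the organic-vs-distribution comparison: conversion rate
# and retention by channel. File and column names are hypothetical.
import pandas as pd

subs = pd.read_csv("trials.csv")  # per trial: channel, converted, months_retained

summary = subs.groupby("channel").agg(
    trials=("converted", "size"),
    conversion_rate=("converted", "mean"),
)

# How long converted subscribers stuck around, as a share per channel
retention = (subs[subs["converted"] == 1]
             .groupby("channel")["months_retained"]
             .value_counts(normalize=True)
             .unstack(fill_value=0))

print(summary, retention, sep="\n\n")
```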


Now, back to that needle. How much revenue—real money—have distributions brought in? As of August 1st, 2013, we generated roughly $139,788 in revenue on a monthly basis through the distribution channel, or $2.3M in revenue since the channel was created in January of 2012.


Not bad. But I haven’t even brought up the most amazing part of distribution: acquisition cost. Every one of these users who came through our partners via distribution came to us at $0 in acquisition cost, which is why step two of the legwork I mentioned was so darned important. By offering value back to our partners through their inclusion in our perks page, all of the numbers listed above were acquired without a rev-share or acquisition cost. The only spend was the increased operating cost of the extended trial. The hottest of damns.

That’s all well and good for Moz, but how can you apply it to your business? Regardless of your business, distribution is worth adding to your tool belt as one of your 21 tactics, but it’s typically best suited for SaaS businesses like Moz.

If you’re a service provider, you’ll likely have to get a little bit more creative. Though it’s not a direct corollary, the closest comparable in the service world would be a partnership with a software product like Moz, wherein Moz becomes a recommendation engine for new clients. You can see this in practice through our partnership with Distilled. For most software companies, they don’t want to derail focus from the product through consulting work, so there’s a lot of value to be added in becoming that missing consultation piece.

Regardless, the same concepts apply. Provide value and receive value; that’s the nature of any partnership. Yet for some reason, partnerships typically aren’t thought of as a growth channel in the inbound marketing mix, when they can clearly have an impact.

Hell, in this example, they even build links, if you’re into that sort of thing.

A huge thanks to Alyson and Kurtis for making all of this data possible; both for our internal analysis and for the sake of this post, they employed some serious MySQL-fu.

