Posted by rjonesx.
Moz’s Domain Authority is requested over 1,000,000,000 times per year, it’s referenced millions of times on the web, and it has become a veritable household name among search engine optimizers for a variety of use cases, from determining the success of a link building campaign to qualifying domains for purchase. With the launch of Moz’s entirely new, improved, and much larger link index, we recognized the opportunity to revisit Domain Authority with the same rigor as we did keyword volume years ago (which ushered in the era of clickstream-modeled keyword data).
What follows is a rigorous treatment of the new Domain Authority metric. What I will not do in this piece is rehash the debate over whether Domain Authority matters or what its proper use cases are. I have and will address those at length in a later post. Rather, I intend to spend the following paragraphs addressing the new Domain Authority metric from multiple directions.
Correlations between DA and SERP rankings
The most important component of Domain Authority is how well it correlates with search results. But first, let’s get the correlation-versus-causation objection out of the way: Domain Authority does not cause search rankings. It is not a ranking factor. Domain Authority predicts the likelihood that one domain will outrank another. That being said, its usefulness as a metric is tied in large part to this value. The stronger the correlation, the more valuable Domain Authority is for predicting rankings.
Determining the “correlation” between a metric and SERP rankings has been accomplished in many different ways over the years. Should we compare against the “true first page,” top 10, top 20, top 50 or top 100? How many SERPs do we need to collect in order for our results to be statistically significant? It’s important that I outline the methodology for reproducibility and for any comments or concerns on the techniques used. For the purposes of this study, I chose to use the “true first page.” This means that the SERPs were collected using only the keyword with no additional parameters. I chose to use this particular data set for a number of reasons:
- The true first page is what most users experience, thus the predictive power of Domain Authority will be focused on what users see.
- By not using any special parameters, we’re likely to get Google’s typical results.
- By not extending beyond the true first page, we’re likely to avoid manually penalized sites (which can impact the correlations with links.)
- We did NOT use the same training set or training set size as we did for this correlation study. That is to say, we trained on the top 10 but are reporting correlations on the true first page. This prevents us from the potential of having a result overly biased towards our model.
I randomly selected 16,000 keywords from the United States keyword corpus for Keyword Explorer. I then collected the true first page for all of these keywords (completely different from those used in the training set.) I extracted the URLs but I also chose to remove duplicate domains (ie: if the same domain occurred, one after another.) For a length of time, Google used to cluster domains together in the SERPs under certain circumstances. It was easy to spot these clusters, as the second and later listings were indented. No such indentations are present any longer, but we can’t be certain that Google never groups domains. If they do group domains, it would throw off the correlation because it’s the grouping and not the traditional link-based algorithm doing the work.
I collected the Domain Authority (Moz), Citation Flow and Trust Flow (Majestic), and Domain Rank (Ahrefs) for each domain and calculated the mean Spearman correlation coefficient for each SERP. I then averaged the coefficients for each metric.
Moz’s new Domain Authority has the strongest correlations with SERPs of the competing strength-of-domain link-based metrics in the industry. The sign (-/+) has been inverted in the graph for readability, although the actual coefficients are negative (and should be).
Moz’s Domain Authority scored a ~.12, or roughly 6% stronger than the next best competitor (Domain Rank by Ahrefs.) Domain Authority performed 35% better than CitationFlow and 18% better than TrustFlow. This isn’t surprising, in that Domain Authority is trained to predict rankings while our competitor’s strength-of-domain metrics are not. It shouldn’t be taken as a negative that our competitors strength-of-domain metrics don’t correlate as strongly as Moz’s Domain Authority — rather, it’s simply exemplary of the intrinsic differences between the metrics. That being said, if you want a metric that best predicts rankings at the domain level, Domain Authority is that metric.
Note: At first blush, Domain Authority’s improvements over the competition are, frankly, underwhelming. The truth is that we could quite easily increase the correlation further, but doing so would risk over-fitting and compromising a secondary goal of Domain Authority…
Handling link manipulation
Historically, Domain Authority has focused on only one single feature: maximizing the predictive capacity of the metric. All we wanted were the highest correlations. However, Domain Authority has become, for better or worse, synonymous with “domain value” in many sectors, such as among link buyers and domainers. Subsequently, as bizarre as it may sound, Domain Authority has itself been targeted for spam in order to bolster the score and sell at a higher price. While these crude link manipulation techniques didn’t work so well in Google, they were sufficient to increase Domain Authority. We decided to rein that in.
The first thing we did was compile a series off data sets that corresponded with industries we wished to impact, knowing that Domain Authority was regularly manipulated in these circles.
- Random domains
- Moz customers
- Blog comment spam
- Low-quality auction domains
- Mid-quality auction domains
- High-quality auction domains
- Known link sellers
- Known link buyers
- Domainer network
- Link network
While it would be my preference to release all the data sets, I’ve chosen not to in order to not “out” any website in particular. Instead, I opted to provide these data sets to a number of search engine marketers for validation. The only data set not offered for outside validation was Moz customers, for obvious reasons.
For each of the above data sets, I collected both the old and new Domain Authority scores. This was conducted all on February 28th in order to have parity for all tests. I then calculated the relative difference between the old DA and new DA within each group. Finally, I compared the various data set results against one another to confirm that the model addresses the various methods of inflating Domain Authority.
In the above graph, blue represents the Old Average Domain Authority for that data set and orange represents the New Average Domain Authority for that same data set. One immediately noticeable feature is that every category drops. Even random domains drops. This is a re-centering of the Domain Authority score and should cause no alarm to webmasters. There is, on average, a 6% reduction in Domain Authority for randomly selected domains from the web. Thus, if your Domain Authority drops a few points, you are well within the range of normal. Now, let’s look at the various data sets individually.
Random domains: -6.1%
Using the same methodology of finding random domains which we use for collecting comparative link statistics, I selected 1,000 domains, we were able to determine that there is, on average, a 6.1% drop in Domain Authority. It’s important that webmasters recognize this, as the shift is likely to affect most sites and is nothing to worry about.
Moz customers: -7.4%
Of immediate interest to Moz is how our own customers perform in relation to the random set of domains. On average, the Domain Authority of Moz customers lowered by 7.4%. This is very close to the random set of URLs and indicates that most Moz customers are likely not using techniques to manipulate DA to any large degree.
Link buyers: -15.9%
Surprisingly, link buyers only lost 15.9% of their Domain Authority. In retrospect, this seems reasonable. First, we looked specifically at link buyers from blog networks, which aren’t as spammy as many other techniques. Second, most of the sites paying for links are also optimizing their site’s content, which means the sites do rank, sometimes quite well, in Google. Because Domain Authority trains against actual rankings, it’s reasonable to expect that the link buyers data set would not be impacted as highly as other techniques because the neural network learns that some link buying patterns actually work.
Comment spammers: -34%
Here’s where the fun starts. The neural network behind Domain Authority was able to drop comment spammers’ average DA by 34%. I was particularly pleased with this one because of all the types of link manipulation addressed by Domain Authority, comment spam is, in my honest opinion, no better than vandalism. Hopefully this will have a positive impact on decreasing comment spam — every little bit counts.
Link sellers: -56%
I was actually quite surprised, at first, that link sellers on average dropped 56% in Domain Authority. I knew that link sellers often participated in link schemes (normally interlinking their own blog networks to build up DA) so that they can charge higher prices. However, it didn’t occur to me that link sellers would be easier to pick out because they explicitly do not optimize their own sites beyond links. Subsequently, link sellers tend to have inflated, bogus link profiles and flimsy content, which means they tend to not rank in Google. If they don’t rank, then the neural network behind Domain Authority is likely to pick up on the trend. It will be interesting to see how the market responds to such a dramatic change in Domain Authority.
High-quality auction domains: -61%
One of the features that I’m most proud of in regards to Domain Authority is that it effectively addressed link manipulation in order of our intuition regarding quality. I created three different data sets out of one larger data set (auction domains), where I used certain qualifiers like price, TLD, and archive.org status to label each domain as high-quality, mid-quality, and low-quality. In theory, if the neural network does its job correctly, we should see the high-quality domains impacted the least and the low-quality domains impacted the most. This is the exact pattern which was rendered by the new model. High-quality auction domains dropped an average of 61% in Domain Authority. That seems really high for “high-quality” auction domains, but even a cursory glance at the backlink profiles of domains that are up for sale in the $10K+ range shows clear link manipulation. The domainer industry, especially the domainer-for-SEO industry, is rife with spam.
Link network: -79%
There is one network on the web that troubles me more than any other. I won’t name it, but it’s particularly pernicious because the sites in this network all link to the top 1,000,000 sites on the web. If your site is in the top 1,000,000 on the web, you’ll likely see hundreds of root linking domains from this network no matter which link index you look at (Moz, Majestic, or Ahrefs). You can imagine my delight to see that it drops roughly 79% in Domain Authority, and rightfully so, as the vast majority of these sites have been banned by Google.
Mid-quality auction domains: -95%
Continuing with the pattern regarding the quality of auction domains, you can see that “mid-quality” auction domains dropped nearly 95% in Domain Authority. This is huge. Bear in mind that these drastic drops are not combined with losses in correlation with SERPs; rather, the neural network is learning to distinguish between backlink profiles far more effectively, separating the wheat from the chaff.
Domainer networks: -97%
If you spend any time looking at dropped domains, you have probably come upon a domainer network where there are a series of sites enumerated and all linking to one another. For example, the first site might be sbt001.com, then sbt002.com, and so on and so forth for thousands of domains. While it’s obvious for humans to look at this and see a pattern, Domain Authority needed to learn that these techniques do not correlate with rankings. The new Domain Authority does just that, dropping the domainer networks we analyzed on average by 97%.
Low-quality auction domains: -98%
Finally, the worst offenders — low-quality auction domains — dropped 98% on average. Domain Authority just can’t be fooled in the way it has in the past. You have to acquire good links in the right proportions (in accordance with a natural model and sites that already rank) if you wish to have a strong Domain Authority score.
What does this mean?
For most webmasters, this means very little. Your Domain Authority might drop a little bit, but so will your competitors’. For search engine optimizers, especially consultants and agencies, it means quite a bit. The inventories of known link sellers will probably diminish dramatically overnight. High DA links will become far more rare. The same is true of those trying to construct private blog networks (PBNs). Of course, Domain Authority doesn’t cause rankings so it won’t impact your current rank, but it should give consultants and agencies a much smarter metric for assessing quality.
What are the best use cases for DA?
- Compare changes in your Domain Authority with your competitors. If you drop significantly more, or increase significantly more, it could indicate that there are important differences in your link profile.
- Compare changes in your Domain Authority over time. The new Domain Authority will update historically as well, so you can track your DA. If your DA is decreasing over time, especially relative to your competitors, you probably need to get started on outreach.
- Assess link quality when looking to acquire dropped or auction domains. Those looking to acquire dropped or auction domains now have a much more powerful tool in their hands for assessing quality. Of course, DA should not be the primary metric for assessing the quality of a link or a domain, but it certainly should be in every webmaster’s toolkit.
What should we expect going forward?
We aren’t going to rest. An important philosophical shift has taken place at Moz with regards to Domain Authority. In the past, we believed it was best to keep Domain Authority static, rarely updating the model, in order to give users an apples-to-apples comparison. Over time, though, this meant that Domain Authority would become less relevant. Given the rapidity with which Google updates its results and algorithms, the new Domain Authority will be far more agile as we give it new features, retrain it more frequently, and respond to algorithmic changes from Google. We hope you like it.
Be sure to join us on Thursday, March 14th at 10am PT at our upcoming webinar discussing strategies & use cases for the new Domain Authority:
Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!