Tuesday 10 December 2013

[Build Backlinks Online] 4 Lessons From a Year of MozCast Data

Posted by BenMorel86

This post was originally in YouMoz, and was promoted to the
main blog because it provides great value and interest to our community. The
author's views are entirely his or her own and may not reflect the views of Moz,
Inc.

We all know that over the past year, there have been some big updates to
Google's algorithms, and we have felt what it has been like to be in the middle
of those updates. I wanted to take a big step back and analyse the cumulative
effects of Google's updates. To do that, I asked four questions and analysed a
year of MozCast data to find the answers.




Looking back over the last year - or more precisely the last 15 months
through 1st September 2013 - I aimed to answer four questions I felt are
really important to SEOs and inbound marketers. These questions were:

Are there really more turbulent days in the SERPs than we should expect, or are
all SEOs British at heart and enjoy complaining about the weather?
If it's warmer today than yesterday, will it cool down tomorrow or get even
warmer?
It sometimes feels like big domains are taking over the SERPs; is this true, or
just me being paranoid?
What effects have Google's spam-fighting had on exact and partial domain
matches in SERPs?

Before We Start


First, thanks to Dr. Pete for sending me the dataset, and for checking this
post over before submission to make sure all the maths made sense.


Second, as has been discussed many times before on Moz, there is a big caveat
whenever we talk about statistics: correlation does not imply causation. It is
important not to reverse engineer a cause from an effect and get things muddled
up. In addition, Dr. Pete had a big caveat about this particular dataset:


"One major warning - I don't always correct metrics data past 90 days, so
sometimes there are issues with that data on the past. Notably, there was a
problem with how we counted YouTube results in November/December, so some
metrics like "Big 10" and diversity were out of whack during those months. In
the case of temperatures, we actively correct bad data, but we didn't catch this
problem early enough...


All that's to say that I can't actually verify that any given piece of past
data is completely accurate, outside of the temperatures (and a couple of those
days have been adjusted). So, proceed with caution."


So, with that warning, let's have a look at the data and see if we can start
to answer those questions.




Analysis: MozCast gives us a metric for turbulence straight away: temperature.
That makes this one of the easier questions to answer. All we need to do is to
take the temperature's mean, standard deviation, skew (to see whether the graph
is symmetric or not), and kurtosis (to see how "fat" the tails of the curve
are). Do that, and we get the following:



Mean: 68.10F
Standard Deviation: 10.68F
Skew: 1.31
Kurtosis: 2.60
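
For anyone who wants to repeat this on their own numbers, here is a minimal sketch of how the four summary statistics can be computed. It assumes the population (divide-by-n) formulas and "excess" kurtosis, where a normal distribution scores 0; the post does not say which variants were used, and the sample temperatures are made up rather than taken from MozCast.

```python
import math

def summary_stats(values):
    """Return mean, population standard deviation, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    skew = m3 / m2 ** 1.5                  # > 0 means a longer right-hand tail
    excess_kurtosis = m4 / m2 ** 2 - 3.0   # 0 for a normal distribution
    return mean, std, skew, excess_kurtosis

# Made-up temperatures (NOT MozCast data): mostly mild, with two very hot days.
sample = [62, 64, 65, 66, 67, 68, 68, 69, 70, 72, 95, 104]
mean, std, skew, kurt = summary_stats(sample)
print(f"mean={mean:.2f}F std={std:.2f}F skew={skew:.2f} kurtosis={kurt:.2f}")
```

A couple of hot outliers are enough to push both skew and kurtosis above zero, which is the pattern described for the real data.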



What does all this mean? Well:

A normal day should feel pretty mild (to the Brits out there, 68F is 20C). If
temperatures were normally distributed, the standard deviation tells us that
about 95% of all days should be between 46F and 90F (8C and 32C), which is a
nicely temperate range.
However, the positive skew means that the distribution has a longer tail on the
warm side of 68F than on the cool side: unusually hot days are more common than
unusually cold ones.
On top of this, the positive kurtosis means we actually experience more days
above 90F than we would expect.

You can see all of this in the graph below, with its big, fat tail to the
right of the mean.




Graph showing the frequency of recorded temperatures (columns) and how a normal
distribution of temperatures would look (line).


As you can see from the graph, there have definitely been more warm days than
we would expect, and more days of extreme heat. In fact, while the normal
distribution tells us we should see temperatures over 100F (38C) about once a
year, we have actually seen 14 of them. That's two full weeks of the year! Seven
of those were in June of this year (the 10th, 14th, 18th, 19th, 26th, 28th and
29th, to be precise, coinciding with the multi-week update that Dr. Pete wrote
about).
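
As a rough check on that expectation, the sketch below works out how often a normal distribution with the mean and standard deviation quoted earlier would exceed 100F. It uses only the two published figures (68.10F and 10.68F), not the underlying dataset, and the function name is my own.

```python
import math

def normal_tail_probability(threshold, mean, std):
    """P(X > threshold) for a normal distribution with the given mean and std."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))

MEAN_F = 68.10  # mean temperature quoted in the post
STD_F = 10.68   # standard deviation quoted in the post

p_hot = normal_tail_probability(100, MEAN_F, STD_F)
expected_per_year = p_hot * 365
print(f"P(temp > 100F) = {p_hot:.4f}, about {expected_per_year:.2f} days per year")
```

Under these assumptions a 100F day is a roughly three-standard-deviation event, so a normal model predicts on the order of one such day a year or fewer, far short of the 14 actually recorded.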


And it looks like we've had it especially bad over the last few months. If we
take data up to the end of May, the average is only 66F (19C), so the average
temperature over the last three months has actually been a toasty 73F (23C).


Answer: The short answer to the question is "pretty turbulent, especially
recently". The high temperatures this summer indicate a lot of turbulence, while
the big fat tail on the temperature graph tells us that it has regularly been
warmer than we might expect throughout the last 15 months. We have had a number
of days of unusually high turbulence, and there are no truly calm days. So, it
looks like SEOs haven't just been griping about the unpredictable SERPs they've
had to deal with; they've been right.




Analysis: The real value of knowing about the weather is in being able to make
predictions with that knowledge. So, if today's MozCast is warmer than
yesterday's, it would be useful to know whether it will be warmer again tomorrow
or colder.


To find out, I turned to something called the Hurst exponent, H. If you want
the full explanation, which involves autocorrelations, rescaled ranges, and
partial time series, then head over to Wikipedia. If not, all you need to know
is that:

If H<0.5 then the data is anti-persistent (an up-swing today means that there
is likely to be a down-swing tomorrow)
If H>0.5 the data is persistent (an increase is likely to be followed by
another increase)
If H=0.5 then today's data has no effect on tomorrow's

The closer H is to 0 or 1 the longer the influence of a single day exists
through the data.
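
For the curious, here is a minimal sketch of one common way to estimate H, the rescaled-range (R/S) method. I am not claiming this is exactly the calculation behind the numbers in this post; the chunk sizes, the function names and the random test series are all assumptions made for illustration.

```python
import math
import random

def rescaled_range(chunk):
    """R/S statistic of one chunk: range of cumulative deviations divided by std."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for v in chunk:
        running += v - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else None

def hurst_exponent(series, min_chunk=8):
    """Estimate H as the slope of log(R/S) against log(chunk size)."""
    points = []
    size = min_chunk
    while size <= len(series) // 2:
        ratios = []
        for start in range(0, len(series) - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs:
                ratios.append(rs)
        if ratios:
            points.append((math.log(size), math.log(sum(ratios) / len(ratios))))
        size *= 2  # chunk sizes 8, 16, 32, ... (an arbitrary choice)
    if len(points) < 2:
        raise ValueError("series too short to estimate H")
    # Ordinary least-squares slope through the (log size, log R/S) points.
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    slope_num = sum((x - mx) * (y - my) for x, y in points)
    slope_den = sum((x - mx) ** 2 for x, _ in points)
    return slope_num / slope_den

# Synthetic series, NOT MozCast data: independent noise and its running total.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(1024)]
walk, total = [], 0.0
for step in noise:
    total += step
    walk.append(total)

h_noise = hurst_exponent(noise)
h_walk = hurst_exponent(walk)
print(f"independent noise: H={h_noise:.2f}, running total: H={h_walk:.2f}")
```

Independent noise should land somewhere near 0.5 (short series tend to read a little high with this method), while the running total, where each day builds on the last, comes out strongly persistent.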


A series of independent values drawn from a normal distribution - like the red
bell curve in the graph above - has a Hurst exponent of H=0.5. Since we know
the distribution of temperatures, with its definite lean and fat tails, is not
normal, we can guess that its Hurst exponent probably won't be 0.5. So, is the
data persistent or anti-persistent?


Well, as of 4th September that answer is persistent: H=0.68. But if you'd
asked on 16th July - just after Google's Multi-week Update but before The
Day The Knowledge Graph Exploded - the answer would have been "H=0.48, so
neither": it seems that one effect of that multi-week update was to reduce the
long-term predictability of search result changes. But back in May, before that
update, the answer would again have been "H=0.65, so the data is persistent".


Answer: With the current data, I am pretty confident in saying that if the last
few days have got steadily warmer, it's likely to get warmer again tomorrow. If
Google launches another major algorithm change, we might have to revisit that
conclusion. The good news is that the apparent persistence of temperature
changes should give us a few days' warning of that algo change.




Analysis: We've all felt at some point like Wikipedia and About.com have taken
over the SERPs, and that we're never going to beat Target or Tesco despite the
fact that they never seem to produce any interesting content. Again, MozCast
supplies us with a couple of ready-made metrics to analyse whether or not this
is true: Big 10 and Domain Diversity.


First, domain diversity. Plotting each day's domain diversity for the last 15
months gives you the graph below (I've taken a five-day moving average to reduce
noise and make trends clearer).
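
For reference, a five-day moving average can be sketched as below. This version is a trailing average (each point is the mean of that day and the four before it), which is an assumption on my part, and the diversity figures are made up for the example.

```python
def moving_average(values, window=5):
    """Trailing moving average; the first window-1 days produce no value."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the day that left the window
        if i >= window - 1:
            averages.append(running / window)
    return averages

# Made-up daily domain-diversity percentages, NOT MozCast data.
diversity = [57, 56, 58, 55, 54, 56, 53, 52]
smoothed = moving_average(diversity)
print(smoothed)
```

Eight daily values give four smoothed points, each much less jumpy than the raw series.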




Trends in domain diversity, showing a clear drop in the number of domains in
the SERPs used for the MozCast.


As you can see, domain diversity has dropped quite a lot. It fell from 57% in
June 2012 to 48% in August 2013, a relative drop of 16%. There were a few big
dips in domain diversity - 6th May 2012, 29th September 2012, and 31st January
2013 - but really this seems like a definite trend, not the result of a
few jumps.


Meanwhile, if we plot the proportion of the SERPs being taken over by the Big
10, we see a big increase over the same period, from 14.3% to 15.4%. That's a
relative increase of 8%.
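
Both percentages are relative changes, measured against the starting value rather than in percentage points. A quick sketch of the arithmetic, using only the figures quoted in this post (the function name is my own):

```python
def relative_change(old, new):
    """Change from old to new, as a percentage of the old value."""
    return (new - old) / old * 100

diversity_change = relative_change(57, 48)   # domain diversity, in %
big10_change = relative_change(14.3, 15.4)   # Big 10 share of SERPs, in %
print(f"Domain diversity: {diversity_change:+.1f}%  Big 10: {big10_change:+.1f}%")
```

So a fall of nine percentage points in diversity is a relative drop of about 16%, and a rise of 1.1 points in Big 10 share is a relative gain of about 8%.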




Trends in the five-day moving average of the proportion of SERPs used in the
MozCast dataset taken up by the daily Big 10 domains.


Answer: The diversity of domains is almost certainly going down, and big
domains are taking over at least a portion of the space those smaller domains
leave behind. Whether this is a good or bad thing almost certainly depends on
personal opinion: somebody who owns one of the domains that have disappeared
from the listings would probably say it's a bad thing, while Mr. Cutts would
probably say that a lot of the domains that have gone were spammy or full of
thin content, so it's a good thing. Either way, it highlights the importance of
building a brand.




Analysis: Keyword-matched domains are a rather interesting subject. Looking
purely at the trends, the proportion of listings with exact (EMD) and partial
(PMD) matched domains is definitely going down. A few updates in particular have
had an effect: one huge jolt in December 2012 had a particularly long-lasting
effect, knocking 10% of EMDs and 10% of PMDs out of the listings; Matt Cutts
himself announced the bump in September 2012; and the multi-week update that
caused the temperature highs in June also bumped down the influence of PMDs.




Trends in the five-day moving averages of Exact and Partial Matched Domain
(EMD and PMD) influence in the SERPs used in the MozCast dataset.


Not surprisingly, there is a strong correlation (0.86) between changes in the
proportion of EMDs and PMDs in the SERPs. What is more interesting is that there
is also a correlation (0.63) between their 10-day volatilities, that is, the
standard deviation of each metric's values over the last 10 days. Taken
together, this implies that when one metric sees a big swing it is likely that
the other will see a big swing in the same direction - mostly down, according
to the graph. This supports the statements Google have made about various
updates tackling low-quality keyword-matched domains.
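
To make those two correlations concrete, here is a rough sketch of how they can be calculated: a Pearson correlation between day-to-day changes, and another between 10-day rolling standard deviations. The EMD and PMD series are synthetic, generated so that both respond to the same daily shock; they are not MozCast data, and the helper names are my own.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def daily_changes(values):
    """Difference between each day and the day before."""
    return [b - a for a, b in zip(values, values[1:])]

def rolling_volatility(values, window=10):
    """Standard deviation of each run of `window` consecutive values."""
    result = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        mean = sum(chunk) / window
        result.append(math.sqrt(sum((v - mean) ** 2 for v in chunk) / window))
    return result

# Synthetic EMD/PMD percentages driven by a shared daily shock (NOT MozCast data).
random.seed(1)
emd, pmd = [4.0], [16.0]
for _ in range(60):
    shock = random.gauss(0, 0.1)
    emd.append(emd[-1] + shock + random.gauss(0, 0.03))
    pmd.append(pmd[-1] + 2 * shock + random.gauss(0, 0.06))

change_corr = pearson(daily_changes(emd), daily_changes(pmd))
vol_corr = pearson(rolling_volatility(emd), rolling_volatility(pmd))
print(f"correlation of changes: {change_corr:.2f}")
print(f"correlation of 10-day volatilities: {vol_corr:.2f}")
```

Note that the volatility correlation on its own says nothing about direction; it is the correlation between the changes themselves that tells you the swings go the same way.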


Something else rather interesting that is linked to our previous question is
the very strong correlation between the proportion of PMDs in the SERPs
and domain diversity. This is a whopping 0.94, meaning that a move up or down in
domain diversity is almost always accompanied by a swing the same way for the
proportion of SERP space occupied by PMDs, and vice versa.


All of this would seem to indicate that keyword-matched domains are becoming
less important in the search engines' eyes. But hold your conclusions-drawing
horses: this year's Moz ranking factors study tells us that "In our data
collected in early June (before the June 25 update), we found EMD correlations
to be relatively high at 0.17 - just about on par with the value from our
2011 study". So, how can the correlation stay the same but the number of results
go down? Well, I would tend to agree with Matt Peters' hypothesis in that post
that it could be due to "Google removing lower quality EMDs". There is also the
fact that keyword matches do tend to have some relevance to searches: if I'm
looking for pizzas and I see benspizzzas.com in the listings I'm quite likely to
think "they sound like they do pizzas - I'll take a look at them". So
domain matches are still relevant to search queries, as long as they are
supported by relevant content.


How exactly can the correlation stay the same while the number of results
drops? Well, the ranking factors report looks at how well sites rank once they
already rank. If only a few websites with EMDs rank but they rank very highly,
the correlation between rankings and domain matching might be the same as if a
number of such websites rank way down the list. So if lower quality EMDs have
been removed from the rankings - as Dr. Matt and Dr. Pete speculate - but the
ones remaining rank higher than they used to, the correlation coefficient we
measure today could be the same as in 2011.


Answer: The number of exact and partial matches is definitely going down, but
domain matches are still relevant to search queries - as long as they are
supported by relevant content. We know about this relevance because brands
constantly put their major services into their names: look at SEOmoz (before it
changed), or British Gas, or HSBC (Hong Kong-Shanghai Banking Corporation).
Brands do this because it means their customers can instantly see what they do
- and the same goes for domains.


So, if you plan on creating useful, interesting content for your industry then
go ahead and buy a domain with a keyword or two in it. You could even buy the
exact match domain, even if that doesn't match your brand (although this might
give people trust issues, which is a whole different story). But if you don't
plan on creating that content, buying a keyword-matched domain looks unlikely to
help you, and you could even be in for a rockier ride in the future than if you
stick to your branded domain.




Whew, that was a long post. So what conclusions can we draw from all of this?


Well, in short:

Although the "average" day is relatively uneventful, there are more hot, stormy
days than we would hope for
Keyword-matched domains, whether exact or partial, have seen a huge decline in
influence over the last 15 months - and if you own one, you've probably
seen some big drops in a short space of time
The SERPs are less diverse than they were a year ago, and the big brands have
extended their influence
When EMD/PMD influence drops, SERP diversity also drops. Could the two be
connected?
If today is warmer than yesterday, it's likely that tomorrow will be warmer
still

What are your thoughts on the past year? Does this analysis answer any
questions you had - or make you want to ask more? Let me know in the
comments below (if it does make you ask more questions I'll try to do some more
digging and answer them).



You may view the latest post at
http://feedproxy.google.com/~r/seomoz/~3/7BrC1p1LrGU/4-lessons-from-a-year-of-mozcast-data
