Sunday, July 31, 2011

Better Understanding Link-based Spam Analysis Techniques

One frustrating aspect of link building is not knowing the value of a link. Although experience, and some data, can make you better at link valuation, it is impossible to know to what degree a link may be helping you, or whether it is helping at all. Search engines do not count all links; they reduce the value of many that they do count, and they use factors related to your links to further suppress the value that's left over. This is all done to improve relevancy and spam detection.

Understanding the basics of link-based spam detection can improve your understanding of link valuation and help you understand how search engines approach the problem of spam detection, which can lead to better link building practices.
I’d like to talk about a few interesting link spam analysis concepts that search engines may use to evaluate your backlink profile.
Disclaimer:
I don’t work at a search engine, so I can make no concrete claims about how search engines evaluate links. Engines may use some, or none, of the techniques in this post. They also certainly use more (and more sophisticated) techniques than I can cover in this post. However, I spend a lot of time reading through papers and patents, so I thought it'd be worth sharing some of the interesting techniques.

#1 Truncated PageRank

The basics of Truncated PageRank are covered in the paper Link-Based Characterization and Detection of Web Spam. Truncated PageRank is a calculation that removes the direct "link juice" contribution provided by the first level(s) of links. A page boosted by naïve methods (such as article marketing) receives a large portion of its PageRank value directly from that first layer. A well-linked-to page, by contrast, also receives "link juice" contributions from additional levels. Spam pages will likely show a Truncated PageRank that is significantly lower than their PageRank, so the ratio of Truncated PageRank to PageRank can serve as a signal of the spamminess of a link profile.
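As a rough sketch (not the paper's exact formulation), PageRank can be written as a damped sum of contributions over path lengths, and Truncated PageRank simply drops the first few terms of that sum. A toy version:

```python
import numpy as np

def truncated_pagerank(adj, damping=0.85, truncate=2, iters=50):
    """adj[i][j] = 1 if page i links to page j.  Returns (PageRank,
    Truncated PageRank), where the truncated score ignores value
    arriving via the first `truncate` levels of links."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-stochastic transition matrix (dangling pages jump uniformly)
    M = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg, 1)[:, None],
                 1.0 / n)
    contrib = np.full(n, (1 - damping) / n)  # damped contribution of paths of length t
    pr = np.zeros(n)
    tpr = np.zeros(n)
    for t in range(iters):
        pr += contrib
        if t > truncate:                     # drop the first levels for the truncated score
            tpr += contrib
        contrib = damping * (contrib @ M)
    return pr, tpr
```

A low ratio of the truncated score to the full score suggests a page propped up almost entirely by its first layer of links.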

#2 Owned / Accessible Contributions

Links can be grouped into three general buckets.
  1. Links from owned content – links from pages where search engines have determined some level of shared ownership (well-connected co-citation, IP, whois, etc.)
  2. Links from accessible content – links from non-owned content that is easily accessible for adding links (blogs, forums, article directories, guest books, etc.)
  3. Links from inaccessible content – links from independent sources.
A link from any one of these sources is neither inherently good nor bad. Links from owned content, via networks and relationships, are perfectly natural. Likewise, a link from inaccessible content could still be a paid link, so that bucket isn't automatically good either. However, knowing which bucket a link falls into can change its valuation.
Owned Contribution
Applying this type of analysis to two sites can show a distinct difference in their link profiles, all other factors being equal. The first site is primarily supported by links from content it directly controls or can gain access to, while the second site has earned links from a substantially larger percentage of unique, independent sources. All things being equal, the second site is less likely to be spam.
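To make the bucket idea concrete, here's a hypothetical sketch. In practice the bucket labels would be inferred by the engine from ownership signals (co-citation, IP, whois); here they are simply given as input:

```python
from collections import Counter

def contribution_share(links):
    """links: iterable of (source_domain, bucket) pairs, where bucket is
    'owned', 'accessible', or 'independent'.  Returns each bucket's share
    of the overall link profile."""
    counts = Counter(bucket for _domain, bucket in links)
    total = sum(counts.values()) or 1
    return {b: counts[b] / total
            for b in ("owned", "accessible", "independent")}
```

A profile where "independent" is a sliver while "owned" and "accessible" dominate looks more like the first site described above.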

#3 Relative Mass

Relative Mass accounts for the percentage distribution of a link profile across certain types of links. The pie charts above demonstrate the concept of relative mass.
Relative Mass
Relative Mass is discussed more broadly in the paper Link Spam Detection Based on Mass Estimation. Relative Mass analysis can define a threshold at which a page is determined “spam”. In the image above, the red circles have been identified as spam. The target page now has a portion of value attributed to it via “spam” sites. If this value of contribution exceeds a potential threshold, this page could have its rankings suppressed or the value passed through these links minimized. The example above is fairly binary, but there is often a large gradient between not spam and spam.
This type of analysis can be applied to tactics as well, such as distribution of links from comments, directories, articles, hijacked sources, owned pages, paid links, etc. The algorithm may provide a certain degree of “forgiveness” before its relative mass contribution exceeds an acceptable level.
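A minimal sketch of the relative-mass idea (the spam labels and any threshold are made up for illustration; the paper estimates spam mass from PageRank contributions):

```python
def relative_spam_mass(contributions, spam_nodes):
    """contributions: dict mapping each supporting node to the amount of
    value it passes to the target; spam_nodes: nodes already judged spam.
    Returns the fraction of the target's incoming value that arrives
    through spam nodes."""
    total = sum(contributions.values())
    spam = sum(v for node, v in contributions.items() if node in spam_nodes)
    return spam / total if total else 0.0
```

A page whose relative spam mass crosses some threshold might have its rankings suppressed or those links devalued.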

#4 Counting Supporters / Speeds to Nodes

Another method of valuing links is counting supporters and the speed at which those supporting nodes are discovered (and the point at which discovery peaks).
counting supporters
A histogram distribution of supporting nodes by hops can demonstrate the differences between spam and high quality sites.
supporters histogram
Well-connected sites will grow in supporters more rapidly than spam sites and spam sites are likely to peak earlier. Spam sites will grow rapidly and decay quickly as you move away from the target node. This distribution can help signify that a site is using spammy link building practices. Because spam networks have higher degrees of clustering, domains will repeat upon hops, which makes spam profiles bottleneck faster than non-spam profiles.
Protip: I think this is one reason that domain diversity and unique linking root domains are well correlated with rankings. I don't think the relationship is as naïve as counting linking domains, but analyses like supporter counting, as well as Truncated PageRank, would make links from a larger, more diverse set of domains correlate better with rankings.
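The supporter-counting idea is essentially a breadth-first search over the reversed link graph, recording how many new supporters appear at each hop. A toy sketch:

```python
def supporters_by_hop(inlinks, target, max_hops=6):
    """inlinks: dict mapping a node to the list of nodes linking to it.
    Returns a histogram of NEW supporters discovered at each hop away
    from the target; tightly clustered spam networks bottleneck early."""
    seen = {target}
    frontier = [target]
    histogram = []
    for _ in range(max_hops):
        nxt = []
        for node in frontier:
            for src in inlinks.get(node, []):
                if src not in seen:
                    seen.add(src)
                    nxt.append(src)
        histogram.append(len(nxt))
        frontier = nxt
    return histogram
```

Because supporters are only counted once, a clustered network of repeating domains produces a histogram that decays quickly, while a genuinely well-connected site keeps discovering new supporters for more hops.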

#5 TrustRank, Anti-TrustRank, SpamRank, etc.

The model of TrustRank has been written about several times before and is the basis of metrics like mozTrust. The basic premise is that seed nodes can have both Trust and Spam scores, which can be passed through links. The closer you are to the seed set, the more likely you are to share its label: being close to spam makes you more likely to be spam, and being close to trusted sites makes you more likely to be trusted. These values can be judged on both inbound and outbound links.
I won’t go into much more detail than that, because you can read about it in previous posts, but it comes down to four simple rules.
  • Get links from trusted content.
  • Don’t get links from spam content.
  • Link to trusted content.
  • Don’t link to spam content.
This type of analysis has also been used to turn SEO forums against spammers. A search engine can crawl links from top SEO forums to create a seed set of domains on which to perform analysis. Tinfoil hat time....
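TrustRank is often described as a biased (personalized) PageRank whose random jump lands only on trusted seed pages, so trust decays with distance from the seed set. A hypothetical sketch:

```python
import numpy as np

def trustrank(adj, seeds, damping=0.85, iters=50):
    """adj[i][j] = 1 if page i links to page j; seeds: indices of
    hand-vetted trusted pages.  The teleport vector jumps only to the
    seeds, so scores fall off with distance from the trusted set."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    M = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg, 1)[:, None], 0.0)
    teleport = np.zeros(n)
    teleport[list(seeds)] = 1.0 / len(seeds)
    t = teleport.copy()
    for _ in range(iters):
        t = (1 - damping) * teleport + damping * (t @ M)
    return t
```

Swap the trusted seeds for a spam seed set (say, domains harvested from SEO forums) and the same propagation yields an Anti-TrustRank-style score.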

#6 Anchor Text vs. Time

Monitoring anchor text over time can give interesting insights that could detect potential manipulation. Let’s look at an example of how a preowned domain that was purchased for link value (and spam) might appear with this type of analysis.
anchor text over time
This domain has a historical record of acquiring anchor text that includes both branded and non-branded targeted terms. Then that rate suddenly drops, and after a time a new influx of anchor text, never seen before, starts to come in. This type of anchor text analysis, in combination with orthogonal spam detection approaches, can help detect the point at which ownership changed. Links prior to this point can then be evaluated differently.
This type of analysis, plus some other very interesting stuff, is discussed in the Google patent Document Scoring Based on Link-Based Criteria.
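One hedged way to sketch this: compare the anchor-text distributions of consecutive time windows, and flag windows where the divergence spikes. (The Jensen-Shannon divergence here is my choice for illustration, not anything from the patent.)

```python
import math
from collections import Counter

def anchor_shift(window_a, window_b):
    """Jensen-Shannon divergence (in bits) between the anchor-text
    distributions of two time windows: ~0 means a stable profile,
    1.0 means the new anchors share nothing with the old ones."""
    ca, cb = Counter(window_a), Counter(window_b)
    pa = {w: c / sum(ca.values()) for w, c in ca.items()}
    pb = {w: c / sum(cb.values()) for w, c in cb.items()}
    vocab = set(pa) | set(pb)
    m = {w: (pa.get(w, 0) + pb.get(w, 0)) / 2 for w in vocab}

    def kl(p):
        return sum(p[w] * math.log2(p[w] / m[w]) for w in p if p[w] > 0)

    return (kl(pa) + kl(pb)) / 2
```

A sudden jump from near 0 to near 1 between adjacent windows marks the kind of never-seen-before anchor influx described above.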

#7 Link Growth Thresholds

Search engines can dampen the impact of rapid link growth by applying a threshold to the value that can be gained within a unit of time. Corroborating signals can help determine whether a spike comes from a real event or viral content, as opposed to link manipulation.
link growth thresholds
Links acquired beyond the assigned threshold can have their value discounted. A more paced, natural growth profile is less likely to break the threshold. You can find more information about historical analysis in the patent Information Retrieval Based on Historical Data.
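A crude sketch of a growth threshold (the cap of 50 links per day is invented purely for illustration):

```python
def credited_link_value(new_links_per_day, daily_cap=50):
    """Each day's new links count toward value only up to a cap; links
    beyond the cap in a single day earn nothing."""
    return sum(min(n, daily_cap) for n in new_links_per_day)
```

Two sites can acquire the same total number of links, but the steady profile keeps all of its value while the one-day spike loses everything above the cap.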

#8 Robust PageRank

Robust PageRank works by calculating PageRank without the highest contributing nodes.
robust pagerank
In the image above, the two strongest links were turned off, effectively reducing the PageRank of the node. Strong sites tend to have robust profiles and do not depend heavily on a few strong sources (such as links from link farms) to maintain a high PageRank. Calculating Robust PageRank is one way to reduce the impact of over-influential nodes. You can read more about it in the paper Robust PageRank and Locally Computable Spam Detection Features.
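A hypothetical sketch of the idea: discount a node's top-k contributors and see how much of its score survives.

```python
def robust_share(inlink_contributions, k=2):
    """inlink_contributions: the PageRank-style value passed by each
    inbound link.  Returns the fraction of total value that survives
    after discounting the k strongest contributors; a score that
    collapses without its top links suggests dependence on a few
    over-influential sources."""
    s = sorted(inlink_contributions, reverse=True)
    total = sum(s)
    return sum(s[k:]) / total if total else 0.0
```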

#9 PageRank Variance

The uniformity of PageRank contribution to a node can be used to evaluate spam. Natural link profiles are likely to have a stronger variance in PageRank contribution. Spam profiles tend to be more uniform.
pagerank variance
So if you use a tool, marketplace, or service to order 15 PR 4 links with a specific anchor text, those links will have a low variance in PageRank, which makes this sort of practice easy to detect.
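A toy version of the check (the 0.1 variance cutoff is invented for illustration):

```python
from statistics import pvariance

def looks_bought_in_bulk(inlink_pageranks, min_variance=0.1):
    """Links ordered in bulk at a fixed PageRank cluster tightly around
    one value; natural profiles spread out."""
    return pvariance(inlink_pageranks) < min_variance
```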

#10 Diminishing Returns

One way to minimize the value of a tactic is to create diminishing marginal returns on specific types of links. This is easiest to see with sitewide links, such as blogroll links or paid footer links. At one time, link popularity, in volume, was a strong factor, which led to sitewides carrying a disproportionate amount of value.
link building diminishing returns
The first link from a domain carries the most weight, and additional links from that domain continue to increase the total value, but only to a point. Eventually, additional links from the same domain experience diminishing returns: going from 1 link to 3 links from a domain has more of an effect than going from 101 links to 103.
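The shape of that curve might be sketched with something logarithmic (the actual function is unknown; this is purely illustrative):

```python
import math

def domain_link_value(links_from_domain):
    """Total value earned from one domain grows with each additional
    link, but each link is worth less than the one before."""
    return math.log2(1 + links_from_domain)
```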
Protip: Although it's easiest to see this with sitewide links, I think of most link building tactics in this fashion. In addition to ideas like relative mass, where you don't want one thing to dominate, I feel tactics lose traction over time. It is not likely you can earn strong rankings with a limited number of tactics, because many manual tactics tend to hit a point of diminishing returns (sometimes it may be algorithmic; other times it may be due to diminishing returns in the competitive advantage). It's best to avoid one-dimensional link building.

Link Spam Algorithms

All spam analysis algorithms have some percentage of accuracy and some level of false positives. Through the combination of these detection methods, search engines can maximize the accuracy and minimize the false positives.
Web spam analysis can tolerate more false positives than email spam detection, because there are often multiple alternatives to replace a pushed-down result; email spam detection, by contrast, is binary in nature (inbox or spam box). In addition, search engines don't have to create binary labels of "spam" or "not spam" to effectively improve search results. Using analyses such as those discussed in this post, search engines can simply dampen rankings and minimize effects.
These analysis techniques are also designed to decrease the ROI of specific tactics, which makes spamming harder and more expensive. The goal of this post is not to stress about which links work and which don't, because it's hard to know. The goal is to demonstrate some of the problem-solving approaches used by search engines and how they impact your tactics.

Thursday, July 28, 2011

5 Tips for Meeting Online Friends IRL

Dr. Pete and Gianluca

Social media is a bit of a paradox – we have more “friends” than ever, but our relationships feel more and more superficial. When we retreat to the comfort of the internet, we introverts have even less incentive to get to know people IRL (In Real Life, for those who don't spend all day on the internet). If you know me online, it may surprise you to hear that I consider myself a recovering introvert. I’m also a work-at-home father of a 1-year-old, so I’m lucky to hit one SEO conference a year.

In honor of being in Seattle for Mozcon this week, I’d like to share 5 tips for how I’ve managed to make social media count and turn online relationships into real, offline friendships and business partnerships. Just to illustrate the point, that’s a picture of me with SEOmoz enthusiast and fellow proud dad Gianluca Fiorelli, who I finally got to meet in person today (thanks to Rudy Lopez for snapping the picture).

1. Get to Know People

If you only see your online friends as a way to get more Likes and +1s or water your Farmville crops when you’re out of town, you’ll never develop a real-life connection. Building any lasting relationship starts with sincerity. I think that 80% of my own success comes from the fact that I genuinely like people. Social media blurs the lines between work and personal life, and it’s a tremendous opportunity to get to know more about people’s lives outside of work.

2. Be a White-hat Stalker

Social media is also an amazing way to keep track of people, especially with real-time information like Twitter and FourSquare. Sometimes, all it takes is paying attention and knowing when you and your online friends will be in the same place at the same time. A couple of years ago, I was on Twitter and noticed that an industry friend was visiting the Google office in Chicago, just a few blocks from my condo. I pinged him, and two hours later we were having a beer together.

I’m not suggesting that you actually stalk people and show up uninvited to wherever they check in. White-hat stalking is about finding opportunity in the fact that many people in our industry spend a lot of time on the road. Sometimes, an online friend from across the country or even the other side of the globe just happens to be in town. Sometimes, you’re going to the same event, and may not even realize it. It’s all about paying attention.

3. Pre-arrange a Meetup

If you are going to an event, especially a large conference, it’s easy to assume that meeting people will just naturally happen. Conferences are big events and 2-4 days can go by in a flash. If you’re going to be at an event, let people know. It may feel self-indulgent, but announce online that you’re going. If you leave meeting up to chance, you’re going to miss a lot of people. Arrange a meetup – it could be dinner the night before the event, or it could just be making sure you find each other at the after-party. Don’t overthink it – a simple “Hey, I’m in Session A3 – where are you?” on Twitter works wonders.

4. Don’t Miss a Chance

When an opportunity does come along to meet someone IRL, don’t pass it up. Not to keep picking on Gianluca, but when he arrived at the hotel yesterday he tweeted that he was down in the lobby. At a relatively small, 3-day conference, it’s easy to assume that we’d have plenty of chances to meet up, but instead I told him to wait a minute, grabbed my room key, and jumped in the elevator. I can’t count the number of times I saw someone I wanted to meet, thought “They look busy, I’m sure I’ll see them later” and then didn’t. Don’t miss your chance.

5. Act Like an Extrovert

I hate the phrase “Fake it ‘til you make it” because of that one word – fake. It’s taken me a long time to accept that there’s a huge difference between deliberately being fake and acting the way you’d like to act, even if it’s a bit out of character. If you’re outgoing online, you’d probably like to be a little more outgoing IRL. So, why not try it on for size? No one online knows that you’re secretly terrified of your own shadow. These days, when I recognize an online friend, I approach them like we’ve known each other forever. It’s amazing what a difference that makes.

To the introverts out there, I’d just like to end by saying that many of the people in this industry that you think are social animals are closet introverts themselves. One of my favorite industry posts of all time is Lisa Barone’s introvert confession back in 2008. Even social media professionals struggle with actually being social IRL. If you're at Mozcon, don't be afraid to say “hi” – I only bite when I haven't been fed.

Replicate Google's Panda Questionnaire - Whiteboard Friday

Want to avoid the next Panda update and improve your website's quality? This week Will Critchlow from Distilled joins Rand to discuss an amazing idea of Will's to help those who are having problems with Panda and others who want to avoid future updates. Feel free to leave your thoughts on his idea and anything you might do to avoid Panda.

Video Transcription

Rand: Howdy, SEOmoz fans. Welcome to a very special edition of Whiteboard Friday. I am joined today by Will Critchlow, founder and Director of Distilled, now in three cities - New York, Seattle, London. My God, 36 or 37 people at Distilled?

Will: That's right. Yeah, it's very exciting.

Rand: Absolutely amazing. Congratulations on all the success.

Will: Thank you.

Rand: Will, despite the success that Distilled is having, there are a lot of people on the Web who have been suffering lately.

Will: It's been painful.

Rand: Yeah. What we're talking about today is this brilliant idea that you came up with, which is essentially to replicate Google's Panda questionnaire, send it out to people, and help them essentially improve their sites, make suggestions for management, for content producers, content creators, for people on the Web to improve their sites through the same sort of search signals that Panda's getting.

Will: That's right. I would say actually the core thing of this, what I was trying to do, is persuade management. This isn't necessarily about things that we as Internet marketers don't know. We could just look at the site and tell people this, but that doesn't persuade a boss or a client necessarily. So a big part of this was about persuasion as well.

So, background, I guess, people probably know, but Google gave this questionnaire to a bunch, I think they used students mainly, to assess a bunch of websites, then ran machine learning algorithms over the top of that so that they could algorithmically determine the answer.

Rand: Take a bunch of metrics from maybe user and usage data, from possibly link data, although it doesn't feel like link data, but certainly onsite analysis, social signals, whatever they've got. Run these over these pages that had been marked as good or bad, classified in some way by Panda questionnaire takers, and then produce results that would push down the bad ones, push up the good ones, and we have Panda, which changed 12% of search results in the U.S.

Will: Yeah, something like that.

Rand: And possibly more.

Will: And repeatedly now, right? Panda two point whatever and so forth. So, yeah, and of course, we don't know exactly what questions Google asked, but . . .

Rand: Did you try to find out?

Will: Obviously. No luck yet. I'll let you know if I do. But there's a load of hints. In fact, Google themselves have released a lot of these questions.

Rand: That's true. They talked about it in the Wired article.

Will: They did. There have been some that have come out on Search Engine Land I think as well. There have been some that have come out on Twitter. People have referred to different kinds of questions.

Rand: Interesting. So you took these and aggregated them.

Will: Yeah. So I just tried to pull . . . I actually ignored quite a chunk that I found because they were hard to turn into questions that I could phrase well for the kinds of people I knew I was going to be sending this questionnaire to. Maybe I'll write some more about that in the accompanying notes.

Rand: Okay.

Will: I basically ended up with some of these questions that were easy to have yes/no answers for anybody. I could just send it to a URL and say, "Yes or no?"

Rand: Huh, interesting. So, basically, I have a list of page level and domain level questions that I ask my survey takers here. I put this into a survey, and I send people through some sort of system. We'll talk about Mechanical Turk in a second. Then, essentially, they'll grade my pages for me. I can have dozens of people do this, and then I can show it to management and say, "See, people don't think this is high enough quality. This isn't going to get past the Panda filter. You're in jeopardy."

Will: That's right. The first time I actually did this, because I wasn't really sure whether this was going to be persuasive or useful even, so I did it through a questionnaire I got together and sent it to a small number of people and got really high agreement. Out of the 20 people I sent the questionnaire to, for most questions you'd either see complete disagreement, complete disarray, basically people saying don't know, or you'd see 18 out of 20 saying yes or 18 out of 20 saying no.

Rand: Wow.

Will: With those kind of numbers, you don't need to ask 100 people or 1,000 people.

Rand: Right. That's statistically valid.

Will: This is looking like people think this.

Rand: People think this article contains obvious errors.

Will: Right. Exactly. So I felt like straight away that was quite compelling to me. So I just put it into a couple of charts in a deck, took it into the client meeting, and they practically redesigned that "catch me" page in that meeting because the head of marketing and the CEO were like okay, yeah.

Rand: That's fantastic. So let's share with people some of these questions.

Will: And they're simple, right, dead simple.

Rand: So what are the page level ones?

Will: Page level, what I would do is typically find a page of content, a decent, good page of content on the site, and Google may well have done this differently, but all I did was say find a recent, good, well presented, nothing desperately wrong with it versus the rest of the content on the site. So I'm not trying to find a broken page. I'm just trying to say here's a page.

Rand: Give me something average and representative.

Will: Right. So, from SEOmoz, I would pick a recent blog post, for example.

Rand: Okay, great.

Will: Then I would ask these questions. The answers were: yes, no, don't know.

Rand: Gotcha.

Will: That's what I gave people. Would you trust the information presented here?

Rand: Makes tons of sense.

Will: It's straightforward.

Rand: Easy.

Will: Is this article written by an expert? That is deliberately, vaguely worded, I think, because it's not saying are you certain this article's written by an expert? But equally, it doesn't say do you think this article . . . people can interpret that in different ways, but what was interesting was, again, high agreement.

Rand: Wow.

Will: So people would either say yes, I think it is. Or if there's no avatar, there's no name, there's no . . . they're like I don't know.

Rand: I don't know.

Will: And we'd see that a lot.

Rand: Interesting.

Will: Does this article have obvious errors? And I actually haven't found very many things where people say yes to this.

Rand: Gotcha. And this doesn't necessarily mean grammatical errors, logical errors.

Will: Again, it's open to interpretation. As I understand it, so was Google's. There are some of these that could be very easily detected algorithmically. If you're talking spelling mistakes, obviously, they can catch those. But here, where we're talking about they're going to run machine learning, it could be much broader. It could be formatting mistakes. It could be . . .

Rand: Or this could be used in concert with other questions where they say, boy, it's on the verge and they said obvious errors. It's a bad one.

Will: Exactly.

Rand: Okay.

Will: Does the article provide original content or information? A very similar one. Now, as SEOs, we might interpret this as content, right?

Rand: But a normal survey taker is probably going to think to themselves, are they saying something that no one has said before on this topic?

Will: Yeah, or even just, "Do I get the sense that this has been written for this site rather than just cribbed from somewhere?"

Rand: Right.

Will: And that may just be a gut feel.

Rand: So this is really going to hurt the Mahalos out there who just aggregate information.

Will: You would hope so, yeah. Does this article contain insightful analysis? Again, quite vague, quite open, but quite a lot of agreement on it. Would you consider bookmarking this page? I think this is a fascinating question.

Rand: That's a beautiful one.

Will: Obviously, again, here I was sending these to a random set of people, again which, as I understand it, is very similar to what Google did. They didn't take domain experts.

Rand: Ah, okay.

Will: As I understand it. They took students, so smart people, I guess.

Rand: Right, right.

Will: But if it's a medical site, these weren't doctors. They weren't whatever. I guess some people would answer no to this question because they're just not interested in it.

Rand: Sure.

Will: You send an SEOmoz page to somebody who's just not . . .

Rand: But if no one considers bookmarking a page, not even consider it, that's . . .

Will: Again, I think the consider phrasing is quite useful here, and people did seem to get the gist, because they've answered all of the questions by this point. I would send the whole set to one person as well. They kind of get what we're asking. Are there excessive adverts on this page? I love this question.

Tom actually was one of the guys, he was speculating early on that this was one of the factors. He built a custom search engine, I think, of domains that had been hit by the first Panda update, and then was like, "These guys are all loaded with adverts. Is that maybe a signal?" We believe it is, and this is one of the ones that management just . . . so this was the one where I presented a thing that said 90% of people who see your site trust it. They believe that it's written by experts, it's quality content, but then I showed 75% of people who hit your category pages think there are too many adverts, too much advertising.

Rand: It's a phenomenal way to get someone to buy in when they say, "Hey, our site is just fine. It's not excessive. There's tons of websites on the Internet that do this."

Will: Yeah.

Rand: And you can say, "Let's not argue about opinions."

Will: Yes.

Rand: "Let's look at the data."

Will: Exactly. And finally, would you expect to see this article in print?

Rand: This is my absolute favorite question, I've got to say, on this list. Just brilliant. I wish everyone would ask that of everything that they put on the Internet.

Will: So you have a chart that you published recently that was the excessive returns from exceptional content.

Rand: Yeah, yeah.

Will: Good content is . . .

Rand: Mediocre at this point in terms of value.

Will: And good is good, but exceptional actually is exponential. I think that's a question that really gets it.

Rand: What's great about this is that all of the things that Google hates about content farms, all of the things that users hate about not just content farms but content producers who are low quality, who are thin, who aren't adding value, you would never say yes to that.

Will: What magazine is going to go through this effort?

Rand: Forget it. Yeah. But you can also imagine that lots of great pieces, lots of authentic, good blog posts, good visuals, yeah, that could totally be in a magazine.

Will: Absolutely. I should mention that I think there's some caveats in here. You shouldn't just take this blindly and say, "I want to score 8 out of 8 on this." There's no reason to think that a category page should necessarily be capable of appearing in print.

Rand: Or bookmarked where the . . .

Will: Yes, exactly. Understand what you're trying to get out of this, which is data to persuade people with, typically, I think.

Rand: Love it, love it. So, last set of questions here. We've got some at the domain level, just a few.

Will: Which are similar and again, so the process, sometimes I would send people to the home page and ask them these questions. Sometimes I would send them to the same page as here. Sometimes it would be a category page or just kind of a normal page on the site.

Rand: Right, to give them a sense of the site.

Will: Yeah. Obviously, they can browse around. So the instructions for this are answer if you have an immediate impression or if you need to take some time and look around the site.

Rand: Go do that.

Will: Yeah. Would you give this site your credit card details? Obviously, there are some kinds of sites this doesn't apply to, but if you're trying to take payment, then it's kind of important.

Rand: A little bit, a little bit, just a touch.

Will: There's obvious overlaps with all of this, with conversion rate optimization, right? This specific example, "Would you trust medical information from this site," is one that I've seen Google refer to.

Rand: Yeah, I saw that.

Will: They talk about it a lot because I think it's the classic rebuttal to bad content. Would you want bad medical content around you? Yeah, okay. Obviously, again only applies if you're . . .

Rand: You can swap out medical information with whatever type is . . .

Will: Actually, I would just say, "Would you trust information from this site?" And just say, "Would you trust it?"

Rand: If we were using it on moz, we might say, "Would you trust web marketing information? Would you trust SEO information? Would you trust analytics information?"

Will: Are these guys domain experts in your opinion? This is almost the same thing. Would you recognize this site as an authority? This again has so much in it, because if you send somebody to Nike.com, no matter what the website is, they're probably going to say yes because of the brand.

Rand: Right.

Will: If you send somebody to a website they've never heard of, a lot of this comes down to design.

Rand: Yes. Well, I think this one comes down to . . .

Will: I think an awful lot of it does.

Rand: A lot of this comes down to design, and authority is really branding familiarity. Have I heard of this site? Does it seem legitimate? So I might get to a great blog like StuntDouble.com, and I might think to myself, I'm not very familiar with the world of web marketing. I haven't heard of StuntDouble, so I don't recognize him as an authority, but yeah, I would probably trust SEO information from this site. It looks good, seems authentic, the provider's decent.

Will: Yeah.

Rand: So there's kind of that balance.

Will: Again, it's very hard to know what people are thinking when they're answering these questions, but the degree of agreement is . . .

Rand: Is where you get something. So let's talk about Mechanical Turk, just to end this up. You take these questions and put them through a process using Mechanical Turk.

Will: So I actually used something called SmartSheet.com, which is essentially a little bit like Google Doc spreadsheets. It's very similar to Google Doc spreadsheets, but it has an interface with Mechanical Turk. So you can just literally put the column headings as the questions. Then, each row you have the page that you want somebody to go to, the input, if you like.

Rand: The URL field.

Will: So SEOmoz.org/blog/whatever, and then you select how many rows you want, click submit to Mechanical Turk, and it creates a task on Mechanical Turk for each row independently.

Rand: Wow. So it's just easy as pie.

Will: Yeah, it's dead simple. This whole thing, putting together the questionnaire and gathering it the first time, took me 20 minutes.

Rand: Wow.

Will: I paid $0.50 an answer, which is probably slightly more than I would have had to, but I wanted answers quickly. I said, "I need them returned in an hour," and I said, "I want you to maybe have a quick look around the website, not just gut feel. Have a quick look around." I did it for 20, got it back in an hour, cost me 10 bucks.

Rand: My God, this is the most dirt cheap form of market research for improving your website that I can think of.

Will: It's simple but it's effective.

Rand: It's amazing, absolutely amazing. Wow. I hope lots of people adopt this philosophy. I hope, Will, you'll jump into the Q&A if people have questions about this process.

Will: I will. I will post some extra information, yeah, definitely.

Rand: Excellent. And thank you so much for joining us.

Will: Anytime.

Rand: And thanks to all of you. We'll see you again next week for another edition of Whiteboard Friday. Take care.

Will: Bye.

Wednesday, July 27, 2011

Brand New Open Site Explorer is Here (and Linkscape's Updated, too)

This morning at Mozcon, I announced the launch of Open Site Explorer v3, a long-awaited upgrade to one of the most popular marketing tools on the web. I'm more than a little excited about all the progress, hard work and remarkable features that are included in this upgrade, so let's get right to them.

The first thing you'll notice is the new design (of which I'm a huge fan):

Open Site Explorer Homepage

This continues into the top view of link data and now, social metrics. I've always wanted these to be side-by-side, and it's great to finally be able to see both at the same time.

Open Site Explorer Social + Link Metrics

The menus of filters have improved, and there's now a new visualization to show links as groups in domains or as separate links (like the classic Yahoo! Site Explorer view).

Open Site Explorer Filters

Social metrics are also included in the Top Pages reports, so you can see how the most-linked-to content has performed on the social web. This is particularly cool for popular blogs.

Open Site Explorer Top Pages

The anchor text and linking domains tabs have a new feature that lets you see a sample of the links that come from that domain (or with that anchor text). Beware that right now, there's a small bug where we're sorting those links we do show in some odd ways. This should be fixed in the next Linkscape update.

Open Site Explorer Anchor Text Drilldown

Comparison reports have also taken a nice step forward, and feature the ability to side-by-side compare metrics for pages, subdomains and root domains on up to 5 sites simultaneously. They match the metrics you can get in the PRO web app, as well, which is very cool.

Open Site Explorer Site/Page Comparison

And last, but not least, the new advanced reports tab lets you query like a SQL master! Without having to write any complex logic against our API (though you can still do lots of awesome stuff with that), you can grab any combination of link sorts, filters and keywords you'd like (and exclude data you don't want). This is particularly excellent for link builders looking at competitive or industry-related sites' link profiles, and I expect we'll see a number of blog posts in the near future with strategies on how to employ this tool.

Open Site Explorer Advanced Reports

In addition to all the amazing new features in Open Site Explorer, Linkscape's index just updated using a new infrastructure that's allowed us to crawl much deeper on large, important sites. For many pages/domains, this will mean an increase in the total number of links we report, but likely a lower count of linking domains (unless you've gained a lot of links in late June/July) since we're excluding many domains that are low-quality/not-well-linked-to. We'd love your feedback on this index, as it's the first one of its kind, and will continue to see tweaks/improvements over the next few updates.

  • 58,273,105,508 (58.2 billion) URLs +47% from June (our largest index growth ever from one month to another!)
  • 637,828,397 (637 million) Subdomains +71% (it appears the domains we're crawling have more subdomains)
  • 91,013,438 (91 million) Root Domains -23% (due to the depth vs. breadth focus of this crawl)
  • 456,474,577,597 (456 billion) Links +14%
  • Followed vs. Nofollowed
    • 2.28% of all links found were nofollowed +5%
    • 60.44% of nofollowed links are internal, 39.56% are external
  • Rel Canonical - 9.50% of all pages now employ a rel=canonical tag +20% (my guess is higher quality domains are more likely to employ rel=canonical)
  • The average page has 78.64 links on it (+30% from 60.67 last index)
    • 65.33 internal links on average
    • 13.32 external links on average

We're looking forward to your feedback on the new features and the new index (which we plan to continue iterating upon). There are actually even more new features coming in September, so stay tuned, and thanks so much for all the support and use of OSE; it's run more than a million reports, and we hope the next million are just around the corner.

Tuesday, July 26, 2011

Leveraging your SEO for Search Retargeting

Here at Moz we work hard to break down those silly silo things (frankly they scare us). We believe that the different pieces of marketing should constantly be communicating with each other. Cyrus (our SEO lead) and I try to communicate on what we are seeing, where we might be overlapping, dropping the proverbial ball and so on and so forth. We know that leveraging each person's daily activities for maximum impact is the key to any company's success.

In the past, on the blog, we've talked about ways to leverage one of your tasks for related gains. Just a few days ago we talked about utilizing your analysis in GA for eCommerce SEO, and a few months ago we talked about how you can repurpose your on page SEO techniques for off page SEO success. These are great examples of how we should all be looking at the work we do daily and ask ourselves, "who else could use this?" and "how can I leverage this information for more gains?" I've never liked the phrase "kill two birds with one stone" (cuz why are we all so cool with killing birds?) so instead I'm coining the phrase "eating two cupcakes with one fork" (cuz we all love cupcakes). Working off that approach, today I'm going to talk about another way you can leverage your SEO duties for marketing success.

cupcakes and a fork
"It's like killing two cupcakes with one fork"
(just go with it)
angry fork photo credit

SEO & Search Retargeting: A Perfect Pair

Specifically, I want to outline a few ways we can take all of the data mining and reports we work on and extend their value by using them for search retargeting. But wait, what the hell is search retargeting? Good question, my friend. To understand search retargeting, we first need to understand retargeting. I wrote a post a while back that defined "retargeting" as "a form of marketing in which you target users who have previously visited your website with banner ads on display networks across the web."

Search retargeting is a subset of retargeting, and takes it one step further. It is a paid acquisition channel that allows advertisers to reach back out to users who have previously searched for their brand name or target keywords.

The difference between the two makes for a huge opportunity. A visitor doesn't have to have visited your site to be added to the audience you target with ads. For those of us who aren't ranking #1 for every word we want, and are possibly losing visits to our competitors, you can target those lost visitors simply by going after people who searched for words in a category, industry, service, etc. You can quickly see why it would be beneficial for me to know (when setting up these campaigns and my targeting) what Cyrus is up to, and what he has been working on in regards to keyword targeting, our rankings, and more.

In fact, let me show you some fun stats to really sell you on the value of search retargeting. Did you know that "retargeted consumers are nearly 70% more likely to complete a purchase as compared to non-retargeted customers"? Couple that with the fact that a number of reports have come out saying that retargeted customers also spend close to 50% more than those who weren't retargeted, and you've got yourself a hot little thing happening.

Ways to Recycle Those Hours of SEO Work

Okay, now that we've shown off just how effective search retargeting can be, let's talk about how we can repurpose some of that hard work we SEOs do to help our search retargeting efforts succeed.

#1 Ranking Reports (the "obvious" candidate)

How much time do you spend looking at ranking tools (possibly even ours) to gauge the performance of your target keywords? Hours upon hours are spent by SEOs looking at their rankings, or lack thereof. This information helps us all understand where the actual visits to our site are coming from, and subsequently what keywords are driving conversions. But what about the rankings you can't seem to conquer? For a second, let's focus on the words you simply haven't been able to make any headway on. Those are prime candidates for a search retargeting campaign.

What if you passed that list of words off to your paid marketer counterpart and told them to focus their energy (and budget) on targeting those people with highly targeted ads? That would not only supplement your SEO efforts nicely, but you would be spending your retargeting budget on a prequalified audience. Paid marketers often spend a great deal of budget trying to isolate a solid audience to go after; you'd be saving them time and money. Much like passing those words off to your PPC manager, you can quickly gain visits from these high-converting, targeted keywords you are having a hard time ranking for.

Example time: The phrase "free seo tools" results in a lot of conversions for us, but as you can see below, we don't rank in the top five for it.

free seo tools query
Our efforts in increasing our rankings here have been slow to respond. While we continue to work on SEO efforts here, we can supplement with highly targeted ads.

Target: People who search for "free seo tools," "seo tools," "cheap seo tools," etc. with ads like this:

They speak directly to the searcher's intent, "free seo tools," and would likely produce both a high CTR for us and increased conversions. This can help us grow our free trial numbers while we figure some things out on the SEO front and get our rankings up for "free seo tools."

#2 Second Tier Keywords (a "little less obvious" candidate)

Oh keyword research, how we love thee. Okay, maybe some of us don't loveeee it, but it's a huge part of the process. SEOs spend hours pulling likely keyword targets, traffic data, and competitive data to help them decide what to go after next.

In that gold mine of keyword data are dozens of likely search retargeting candidates. SEOs know that not every word they deem valuable can be a priority right now for their company or their clients. These get pushed into some second-tier keyword bucket that often doesn't get as much content, link building, or other resources allocated to it. My advice? Send that list on over to your paid marketer. Ask them to target these topics, site categories, etc. with their search retargeting ads until you have some more time and resources available to go after them.

Example time: Let's say we are ranking well for "seo tools" and "seo software," but we don't have the time to build an SEO campaign around the idea of "SEO resources." We know there are a number of people searching in this niche (SEO newbies, SEO students, etc.), and we know we have a ton of valuable content around the topic. So how can we help people find our resources, and associate us with being an SEO resource, if they have never heard of SEOmoz or visited us before?

Target: users that visit {seo blogs, seo training sites, and seo tool providers} with the below ad:

By using the word "resources" we are speaking directly to this user's need for more SEO learning material. We also get the added benefit of lining our logo up with this type of value add, which hopefully, down the road could result in a visit to our site and possibly a free trial signup.

#3 Competitive Research (a "no one out there is really doing this, so go kick some butt" candidate)

My favorite part. I don't know what it is about competitive research that has us all thinking the information we gather is channel-specific, but we do. I remember the first time I mentioned to a PPC colleague that she should look at SEMrush's SEO results for one of our competitors to build out our PPC campaigns; you would have thought I had just smacked a puppy. She was in shock.

The truth is, we are all playing on the same field here, guys. A lot can be gained by studying your competitors' total efforts, not just their paid or organic ones. Next time you spend time spying on your competitor's organic efforts, pass that information off to your paid marketer and ask them to build a search retargeting campaign around it. Because let's be real: we all have limited time and resources, and some of their targets will never make it onto your prioritization list. Plus, you will often see brand association start to shift through retargeting, which will reciprocally help your SEO efforts. Whoa, cool huh?

Example time: Look at the results below from when I used SEMrush to view some of my competitors' top keyword rankings. While we perform well organically for "seo software," "best seo software," etc., we don't necessarily have many SEO campaigns around the concepts of "powerful" and "easy/simple."

rankings for SEMrush

This tool is showing us that our competitors are cleaning up here, and we know we need to at least be building some brand sentiment around these adjectives. Search retargeting can help.

Target: We can set up search retargeting ads for people who are searching online for these terms, and then target them with ads that speak directly to this. Below you can see we have incorporated these words to help build our brand association with them.

Getting Stuff Done by Video

Hi SEOmozzers! I’m Phil Nottingham and I've recently joined Distilled as an in-house pirate. This is my first SEOmoz post and I look forward to hearing your thoughts in the comments!

The Conundrum

Creating detailed and actionable client reports has become a vitally important skill for any agency SEO to hone. Often we’ll spend 20-30 hours composing a veritable treat of a read for our clients, a hand-crafted sluice for a torrent of brilliant ideas, delegations, and requests that will certainly lead to better performance in the SERPs once put into practice... but, as we’re all painfully aware, sending over these floods of text and screenshots often fails to get stuff done. These reports tend to get stuck for weeks in the quagmire of uncompleted items lurking at the bottom of our clients’ inboxes, competing with a perpetual inundation of other requests that are constantly clamouring for attention and requiring immediate action. It’s no surprise that they often fail to make the impact we have in mind for them.
Consider a typical reading list for a web marketing type on a Monday morning. It’s probably going to look something like this:
  • Emails
  • Blog feeds
  • Google Reader – News & Articles.
  • Twitter
  • Facebook
If you’re anything like me, this list is going to feature well in excess of a hundred items, the vast majority of which you will only skim-read and deal with quickly. As technology thunders on, accelerating global connectivity and productivity on an exponential scale, this brevity and superficiality of attention is only likely to grow, threatening the practical viability of our beautifully crafted and detailed client reports.

How Can We Communicate Detailed Concepts and Suggestions to our Clients More Effectively?

The obvious answer is to do more phone calls, lunches, video conferences, and direct face-to-face communication with the client, so you can explain things and answer questions when you have their full attention. However, most clients are busy, overloaded people like us, sometimes based in different time zones, making this approach rarely feasible.
At Distilled, client reports were taken to the shearers a while back. It’s now company-wide policy to send out succinct, simple, bare-bones reports a maximum of 3-4 pages long, which focus purely on the actionable and achievable aspects of all the findings from our 20-30 hours of research.
But just recently, we’ve also started trying out a more creative method of communicating complicated tasks and ideas to our clients and colleagues – demonstrating our thoughts and suggestions through recorded video.
Without going into too much detail, below is an eloquent summary of our findings so far from my colleague Tom Anthony:
Written reports - 20-30 pages = very little shit gets done
Distilled reports - 3-4 pages of actions = lots of shit gets done
Video report - video(s) + 1 page summary w/checklist = masses of shit gets completely annihilated

Why Go To The Effort?

There are some unique benefits of using video to communicate with clients as a supplement to email and telephone calls....

1. It’s different and fun

Video doesn’t feel like as much of a chore to plough through as emails or reports and this helps it to stick out from the remaining mass of inbox clutter and generate interest.

2. It’s a great teaching environment

If your client is not particularly SEO-savvy, video is an efficient and easy way to practically explain some of the basic principles driving the ranking factors.

3. Clients can’t skim read a video

You cannot skip through a video as innocuously as you can skim through an email or document; avoiding it requires conscious effort.

4. It's easy and quick to make

If you become practiced and efficient at making videos, it can be an extremely fast process and take less time than composing a long email.

5. You can demonstrate complicated technical issues as if explaining them in person

It can be easier to explain complicated design and technical considerations with screencasts and diagrams, rather than through extensive writing and annotated screenshots. Problems with UI and design are often better looked at than talked about.

6. It can be edited

As with an email, but unlike a phone call or video conference, a video allows you time to consider your response and suggestions before sending it.

7. It lives on after it’s been created

Unlike a phone call or video conference, videos can be watched back by multiple people at their leisure. This can be a great way to help clients, as they can keep the video for future reference, as well as quickly share it with colleagues.

8. It can be rapport building

Videos can also be a fantastic tool for building rapport with your client. If they live far away in a different time zone and you’ve never met, allowing them to see your face and hear your voice on a regular basis is a great way of building trust and mutual understanding. You can also convey emotion through video where you would struggle in the formalised written word.

9. It's not Rocket Science

While it’s fantastic to have a top-of-the-range camera and microphone to work with, you can still create relatively high-quality videos with modest resources.
This...
Was recorded and uploaded using this...

Common Pitfalls When Making Videos

Although videos can be an incredibly useful resource when integrated into a holistic approach to communication, it is also remarkably easy to undo the potential benefits they offer...
1. Thinking Video Can Work for Everything
I'm not suggesting here that video is an all-out solution for all communication, but rather that it works when included in a holistic approach encompassing email, phone calls, and traditional reports. Video is particularly valuable when you don't have the opportunity to meet with your client and explain things in the flesh, such as with international SEO, but it doesn't replace traditional methods of communication.

2. Lack of Clarity

The best thing about emails and reports is that they can be edited down to succinct action points, cutting out the prognostication and deliberation populating everyday phone calls and conversations. To make effective instructional and informational videos, always stick to the point at hand and avoid meandering tangents. Videos are only valuable in as much as they maintain an audience’s interest.

3. Inability to home in on specific points

If you’re going to end up putting your video on YouTube, then an interactive transcript can be used to allow your client to skip to relevant points within the video. If not, then creating a contents list with corresponding time-codes for your video can be a great aid for efficient viewing.

4. Low Quality

Having good picture quality and clear audio is essential when producing a video. Especially when discussing complicated technical processes, there cannot be any compromise on this. Ensure you record all content at high resolution and avoid microphone interference.

5. Difficult to work out actionable tips

Clients aren’t going to want to watch your videos multiple times and transcribe the points you make in order to ascertain appropriate action points. Whenever you send a video, ensure it comes complete with a list of jobs to be undertaken, which your client can study while watching your presentation. This will focus their minds on the practical essentials of what you are trying to say and ensure stuff gets done.

How to Convert a Written Report into a Video Report

  1. Decide the appropriate form for the different parts of your report – which bits are best shown through a screencast, and which bits would work best with a Whiteboard Friday-style talking-head presentation?
  2. Convert your report into a script, removing any descriptive passages which can be displayed visually – if it makes sense within the context of your report, write a script for multiple videos, covering a single subject in each one. Six 5-minute videos are easier to digest than one 30-minute video.
  3. Practice speaking through your script in time with your screencast a couple of times before recording, ensuring you cut out any “umms” or “likes,” opting for pauses any time you are unsure what to say.
  4. When recording, always talk slightly slower than you would in everyday conversation, as the nuances of corporeal expression are inevitably lost through the cables of a microphone -- Speak at the speed where it just starts to feel uncomfortably slow. In most cases, when you listen back to your recording, you’ll be surprised how slow it doesn’t feel.
  5. For any talking head passages of your recording, always look straight into the lens of the camera.
  6. After recording, trim out any sections which lag or feel unnecessary to make the overall points.
  7. Add zooms, markers and annotations where necessary.
  8. Export your content to video and upload to a cloud hosting service if necessary.
  9. Create an executive summary of the key points in text, and create a contents checklist your client can use to navigate to the relevant points in the video(s).

Examples:

Last week Rand published a post titled The Best Kept Secret in the SEOmoz Toolset, which explains how to access the new SERPs analysis tool. I've taken the process he explained through text and diagrams and put together a tutorial video that works towards the same purpose. Using Rand's post as a script, this video took roughly 10 minutes to make using Camtasia for Mac.
Do you find it easier to watch through the video or read through the post? Please let me know your thoughts!

If anyone would like to see further practical demonstrations of turning a written report into a movie, then I invite you to email me your content (phil.nottingham@distilled.net) and I will use the above formula to convert it into a video and share it in the comments section.
If you’re interested to know more about the practical process for making awesome videos, please check out a post I wrote on the Distilled blog last month – Creating Awesome Videos for SEO.

Our Latest Chrome Toolbar Update is Live! (and one more cool thing)

Hi, I'm Karen, a new product manager at SEOmoz. On the heels of our Firefox toolbar launch in May, I’m happy to announce that we’ve launched our MozBar for Chrome. With this update, you’ll be able to research sites in your favorite browser--Chrome or Firefox--using a powerful toolbar that gets you quickly to the data you need most.

We’ve made a number of useful improvements, most suggested by you! Let’s take a look at what you can do with the new MozBar.

1. Redesign for better integration with the Chrome user interface

You can now access all functions, menus and tools in Chrome from an icon to the right of your address bar. This update incorporates the toolbar into the native design of Chrome, which gives you access to extension menus and toolbars via icons tucked into your address bar to remove “clutter” from the browser window.

Chrome Toolbar button and menu

The “toolbar” has become your analytics bar. You can move it to the top, bottom, or right side of your browser, or close it, easily at any time.

Analytics Bar

We realize that while this design might be less intrusive, it also creates an extra click to get to some functionality and tools. That’s why we’ve rearranged the toolbar features to give you Page Analysis and country info on launch of the toolbar window. You’ll find all function buttons (Page Analysis, Highlighting, and Country info) positioned to the left in the menu. Tools, settings, SEOmoz quick links, and help menus are placed to the right.


2. More highlighting options for links and keywords

With yesterday’s toolbar, you could easily highlight no-followed links. Now, you can also highlight followed, external or internal links, as well as keywords.


3. Define custom searches by search engine, country, and region/city

Let’s say you own three Zum Uerige alt-bier pubs in Nordrhein-Westfalen in Germany (you lucky duck), and you want to see how they perform in search results for those three areas. You can set up one or more search profiles (up to 10 total) for each area to monitor how they rank:

Adding A Custom Search Profile

Then, you can use the profiles to monitor and compare results between areas or compare their rankings between the major search engines:

Three Custom Search Profiles, Defined


4. Country flag/name and IP address at a glance

You can view the country flag, and on mouse-over, country name and IP address. When you click the flag, you’ll be directed to full details for the first IP address listed for the site.

Country Flag in Page Analysis window

and in the main menu, for at-a-glance access when you need it.

Country Flag Info in Toolbar Menu


5. Subdomain metrics, plus one-click access to Open Site Explorer

We've added a subdomain metrics display alongside domain metrics in the analytics bar.

6. Run Keyword Analysis reports quickly

You have one-click access to keyword difficulty reports for your search terms from a link in the SERP overlay.

Thanks again for your feedback and suggestions for improvements, and for helping us build this toolbar, one great idea at a time! And feel free to head over to our feature request forum and tell us how we can make the toolbar even better.

Get the MozBar

But wait, there's more!

Adam just stopped by my desk and asked me to tell you about some updates to the Keyword Analysis report. By popular request, we've added two new features to the SERP Analysis:

1. On-page grades for each URL. Now with each report, we will analyze how well-targeted each page is for the selected keyword, and provide each with a letter grade.

2. Competitive URL. You can now add a URL that you want to compare to the top-10 ranking URLs for a SERP.

Be sure to check out the new keyword analysis report.

Monday, July 25, 2011

Mission ImposSERPble: Establishing Click-through Rates

Google and its user experience are ever changing. For a company that has more than 60% of the search market, it's common to hear the question, “How many visitors can we expect if we rank [x]?” It’s a fair question. It's also impossible to predict, which is a fair answer. But, as my father says, “If you want fair, go to the Puyallup.” So we inevitably hear, “Well, can you take a guess? Or give us an estimate? Anything?”

To answer that question, we turned to major studies about click-through rates, including Optify, Enquiro, and the studies released using the leaked AOL data of 2006. But those studies are old; this study is new. Ladies, gentlemen, and Mozbot, it is our immense pleasure to present to you…

The Slingshot SEO Google CTR Study Whitepaper: Mission ImposSERPble

There have been a number of changes to the Google user experience since those studies/surveys were published years ago. There's a new algorithm, a new user interface, increased mobile search, and social signals. On top of that, the blended SERP is riddled with videos, news, places, images, and even shopping results.

We made this study super transparent. You can review our step-by-step process to see how we arrived at our results. This study is an ongoing project that will be compared with future SERPs and other CTR studies. Share your thoughts on the study and the research process to help us include additional factors and methods in the future.

Our client databank is made up of more than 200 major retailers and enterprise groups, and our sample set was chosen from thousands of keywords based on very strict criteria to ensure the accuracy and quality of the study results.

The study qualification criteria are as follows:

  • A keyword phrase must rank in the top 10 (positions 1 to 10)
  • The position must be stable for 30 days

Each keyword that we track at Slingshot was considered, and every keyword that matched our strict criteria was included. This method generated a sample set of exactly 324 keywords, with at least 30 in each of the top 10 ranking positions.

We are confident in the validity of this CTR data as a baseline model, since the data was generated using more than 170,000 actual user visits across 324 keywords over a 6-month period.

Data-Gathering Process

Authority Labs: Finding Stable Keywords

We currently use Authority Labs to track 10,646 keywords' daily positions in SERPs. From this, we were able to identify which keywords had stable positions for 30 days. For example, for the keyword “cars,” we observed a stable rank at position 2 for June 2011.

Stable 30 day ranking - ImposSERPble
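The stability filter described above can be sketched in a few lines. This is a minimal sketch of the rule as stated in the post (same position every day for 30 days); the function name and the rank series are illustrative, not Authority Labs' actual API or data.

```python
# Hedged sketch: flag a keyword as "stable" only if its daily rank held
# the same position for the full 30-day window; unstable keywords are
# excluded from the sample (returned as None).

def stable_position(daily_ranks, window=30):
    """Return the rank if the last `window` observations are identical,
    otherwise None."""
    if len(daily_ranks) < window:
        return None  # not enough history to qualify
    recent = daily_ranks[-window:]
    return recent[0] if all(r == recent[0] for r in recent) else None

june_ranks = [2] * 30                     # e.g. "cars" held position 2 all June
print(stable_position(june_ranks))        # -> 2 (qualifies)
print(stable_position([2] * 29 + [3]))    # -> None (one wobble disqualifies)
```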

Google Adwords Keyword Tool: All Months Are Not Created Equal

We found the number of [Exact] and “Phrase” local monthly searches using the Google Adwords keyword tool. It is important to note that all keywords have different monthly trends. For example, a keyword like “LCD TV” would typically spike in November, just before the holiday season. If you’re looking at searches for that keyword in May, when the search volume is not as high, the 12-month average will overstate the actual May volume. So we downloaded the .csv file from Adwords, which separates the search data by month, for more accuracy.

Google keyword tool csv download - ImposSERPble

By doing this, we were able to calculate our long-tail searches for that keyword. “Phrase” – [Exact] = Long-tail.

Google Analytics: Exact and Long-Tail Visits

Under Keywords in Google Analytics, we specified the date range of each keyword’s stable position. In this case, “cars” was stable in June 2011. We also needed to specify “non-paid” visits, so that we were only including organic results.

Google analytics non paid - ImposSERPble

Next, we needed to limit our filter to visits from Google in the United States only. This was important since we were using Local Monthly Searches in Adwords, which is specific to U.S. searches.

Google analytics phrase and exact - ImposSERPble

After applying the filter, we were given our exact visits for the word “cars” and phrase visits, which included the word “cars” and every long-tail variation. Again, to get the number of long-tail visits, we simply used subtraction: Phrase – Exact = Long-Tail visits.

Calculations

We were then able to calculate the Exact and Long-Tail Click-through rate for our keyword.

EXACT CTR = Exact Visits from Google Analytics / [Exact] Local Monthly Searches from Adwords

LONG-TAIL CTR = (Phrase Visits – Exact Visits from Google Analytics) / (“Phrase” – [Exact] Local Monthly Searches from Adwords)
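The two formulas above can be sketched as follows; every input number here is a hypothetical placeholder, not a figure from the study:

```python
def exact_ctr(exact_visits, exact_searches):
    """Exact CTR = exact-match organic visits / [Exact] local monthly searches."""
    return exact_visits / exact_searches

def long_tail_ctr(phrase_visits, exact_visits, phrase_searches, exact_searches):
    """Long-tail CTR = (phrase - exact visits) / ("Phrase" - [Exact] searches)."""
    return (phrase_visits - exact_visits) / (phrase_searches - exact_searches)

# Hypothetical example: 5,460 exact visits against 30,000 exact-match searches
print(round(exact_ctr(5_460, 30_000), 4))                      # 0.182 (18.2%)
print(round(long_tail_ctr(6_200, 5_460, 45_000, 30_000), 4))   # 0.0493 (about 4.9%)
```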

Results

What was the observed CTR curve for organic U.S. results for positions #1-10 in the SERP?

Based on our sample set of 324 keywords, we observed the following curve for Exact CTR:

Google CTR curve - ImposSERPble

Our calculations revealed an 18.2% CTR for the No. 1 position and 10.05% for No. 2. CTR for each position below the fold (position 5 and beyond) is below 4%. An interesting implication of our curve is that, for any given SERP, only 52.32% of users click on one of the top 10 organic results. This is consistent with typical user behavior: many Google users window-shop the SERP and search again before clicking through to a domain.

Degrees of Difference

CTR study comparisons - ImposSERPble

The first thing we noticed was that our observed CTR curve is significantly lower than those of these two previous studies. There are several fundamental differences between the studies, so the curves should not be compared blindly; the differences themselves are worth noting.

Optify’s insightful and thorough study was conducted during the holiday season of December 2010. There are significant changes in Google’s rankings during the holiday season that many believe have a substantial impact on user behavior, as well as the inherent change in user intent.

The study published by Enquiro Search Solutions was conducted in 2007 using survey data and eye-tracking research. That study was the result of a business-to-business focused survey of 1,084 pre-researched and pre-selected participants. It was an interesting study because it looked directly at user behavior through eye-tracking and how attention drops off as users scroll down the page.

Long-Tail CTR: Volatile and Unpredictable

For each keyword, we found the percentage of click-through for all long-tail terms over the same period. For example, if “cars” ranks at position 2 for June 2011, how much traffic could that domain expect to receive from the keyword phrases “new cars,” “used cars,” or “affordable cars?” The reasoning is, if you rank second for “cars,” you are likely to drive traffic for those other keywords as well, even if those positions are unstable. We were hoping to find an elegant long-tail pattern, but we could not prove that long-tail CTR is directly dependent on the exact term’s position in the SERP. We did observe an average long-tail range of 1.17% to 5.80% for each position.

Google CTR data table - ImposSERPble

Blended SERPs: The “Universal” Effect

Starting in May 2007, news, video, local, and book results were blended into Google SERPs, which have since come to include images, videos, shopping, places, real-time, and social results. But do blended SERPs have lower CTRs? Since blended results often push high-ranking domains toward the bottom of the page, we predicted that CTR would indeed be lower on blended SERPs. A counter-intuitive hypothesis, however, is that because Google inserts blended results only into certain SERPs, users may view those results as more credible, making CTR higher. We analyzed our sample set and failed to find significant differences in user behavior between blended and non-blended results; the effect of blending on CTR remains an open question.

Google CTR blended data table - ImposSERPble

As previously mentioned, this study will serve as a baseline for comparison against future SERPs as the Slingshot SEO Research & Development team continues to track and analyze more keywords and collect additional CTR data. It is our hope that these findings will help organic SEOs make performance projections and consider multiple factors when selecting keywords. We look forward to additional studies on CTR, both yours and ours, and we encourage you to share your findings. With multiple recent and upcoming social releases, our research team will be examining the effects of social platforms on click-through rates and how the organic CTR curve changes over time.

Visit the Slingshot SEO website for the full Mission ImposSERPble: Google CTR Study Whitepaper. It’s free.

Sunday, July 24, 2011

How to Research Local Citations After Google Removed them from Places

Late last week, in a move that was apparently spurred by threats of an FTC investigation, Google removed third-party reviews and listings from the Places pages in its Local/Maps results. The change was intended to address complaints from sources like Yelp, TripAdvisor, and Citysearch, which claimed that Google unfairly used their content to make Places pages useful without providing compensation or traffic.

Below is a visual of the change via the WSJ:

Google Removes Citations from Local/Places Pages

Impact on Local/Maps/Places SEO

Unfortunately, this move has a strong negative consequence for SEOs, web marketers and local businesses trying to improve their rankings (or earn a listing) in Google Places results. In particular, the popular tactic of researching the citation sources of competitors and fellow business listees in a city/region via their Places pages is now defunct.

Since citations are like links for SEO/rankings in Google Places, this change is going to be tough on many citation researchers and local optimizers.

Other Options for Local Citation Discovery

Thankfully, there are other ways to find the sources Google may be using to resource their Places data.

#1: Identify Aggregators in the Standard Search Results

This is as basic as it sounds. Just perform a query and seek out the aggregators - those that rank in the top few pages of results that list multiple local businesses. Not only is this a useful activity for Places SEO, it can also help drive direct traffic and brand awareness (e.g. Getting a listing on Yelp isn't just good for Google SEO, it's a great idea because lots of people use Yelp to find local businesses).

Aggregators of Local Business Results in Google

In the screenshot above, I've pointed to several well-known aggregators that are likely good sources for a listing/citation if a business is targeting Seattle Ice Cream results.

#2: Perform Competitive Research Using Google's Standard Results

You don't need the citations listed in the Places pages to find where a business is earning listings/links/references. You can use good, old, regular Google results:

Molly Moon's Competitive Analysis

The screenshot above shows one way to do this: grab a listing from the Local/Places results and search for the combination of the business' phone number and name to see where it's mentioned on the web. This also works with any combination of address, business name, city name, etc. It's probably the simplest and most direct way to replace the old competitive citation analysis method.
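To make the tactic concrete, here is a small sketch that assembles those query strings. The business name, phone number, and domain used in the example are made-up placeholders; quoting forces exact-match, and `-site:` excludes the business's own domain so only third-party mentions remain:

```python
def citation_queries(name, phone=None, address=None, own_domain=None):
    """Build a Google query string for finding where a business is mentioned."""
    parts = [f'"{name}"']          # exact-match the business name
    if phone:
        parts.append(f'"{phone}"')
    if address:
        parts.append(f'"{address}"')
    query = " ".join(parts)
    if own_domain:
        query += f" -site:{own_domain}"  # exclude the business's own site
    return query

# Hypothetical example values (phone number and domain are placeholders)
print(citation_queries("Molly Moon's Ice Cream", phone="(206) 555-0100",
                       own_domain="mollymoon.com"))
# "Molly Moon's Ice Cream" "(206) 555-0100" -site:mollymoon.com
```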

#3 - Search for Multiple Businesses at Once (Co-Citation)

Another simple option is to query Google for several businesses at once in hopes of finding pages/sites that have listings for several places.

Multiple Business / Co-Citation Search

The example in the screenshot above is a very simplistic one - you may want to combine this with phone numbers/addresses to help identify more listing-focused sites.

#4 - The WhiteSpark Local Citation Finder Tool

Darren Shaw's great citation finding tool has long been a staple of Places SEO research, and since it uses a methodology similar to tactic #2 above, it's not affected by Google's change to the Places pages.

WhiteSpark Local Citation Finder

Just plug in a search as shown in the image above, and the tool will return a list of potential places to acquire a citation/listing. It takes a while to run (up to 24 hours in my experience), but is remarkably useful.


Undoubtedly, these four aren't the only options for local citation research. If you've got more suggestions/ideas for ways to do this, please leave them in the comments below!

p.s. Many thanks to David Mihm and Mike Blumenthal for their contributions and help in understanding this change and offering alternatives.

Thursday, July 21, 2011

Google's Negative Ranking Factors - Whiteboard Friday

By now you've heard about SEOmoz's study of Google ranking factors, but what about negative ranking factors? Sure, positive factors such as the correlations between social media shares and higher rankings earn a lot of attention - and they should. Smart SEOs look at all the factors, including those at the bottom of the list! Today we look at negative ranking factors - those SEO characteristics correlated with lower rankings - and how to avoid them.

Video Transcription

Howdy, SEOmoz! Welcome to another edition of Whiteboard Friday. Today we're going to be talking about negative ranking factors.

Now, we talk about ranking factors a lot here at SEOmoz. Every two years SEOmoz publishes a study called the "Ranking Factors." We just published one about a month and a half ago, two months ago. The positive factors get a lot of publicity. We find things that correlate to higher rankings, and we spend a lot of our time on those.

Some of the more famous positive ranking factors we talk about are things like page authority, which has a 0.28 correlation with higher rankings. Now, I know we say this a lot, but I need to give my disclaimer here: correlation does not equal causation. What this means is that when we see pages with high page authority, they are most likely associated with higher rankings. We look at thousands of search results across the web, we analyze those pages, and we try to find relationships between characteristics of those pages and their rankings. When we find a relationship, we say the two are positively correlated. Other elements with positive correlations include exact-match dot-com domains. So if your domain name is, say, Diamonds.com, you have a pretty good chance of ranking for "diamonds," that keyword. Also, linking root domains with partial-match anchor text have a 0.25 correlation. That just means that a broad diversity of domains linking to you with some sort of partial anchor text correlates pretty strongly with higher rankings. Now, this is what we talk about a lot.
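Mechanically, a figure like the 0.28 is a rank correlation between a factor and ranking position, computed across many SERPs. The sketch below is a minimal, self-contained illustration of Spearman rank correlation with made-up numbers; it is not SEOmoz's data or exact methodology (their study also handles ties and averages over thousands of SERPs):

```python
def rank(values):
    """Assign ranks (1 = smallest). Ties are not handled, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation via the sum-of-squared-rank-differences formula."""
    n = len(xs)
    rx, ry = rank(xs), rank(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical SERP: page authority for positions 1..5 (position 1 listed first).
# Higher authority at better (lower-numbered) positions gives a negative
# correlation with position number - i.e., a "positive" ranking factor.
page_authority = [72, 65, 60, 48, 41]
positions = [1, 2, 3, 4, 5]
print(spearman(page_authority, positions))  # -1.0 (perfect inverse rank order)
```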

What we don't talk a lot about is the opposite effect: negative correlation. There are certain factors we find associated with web pages that correlate with lower rankings. We don't pay a lot of attention to those, but they are in the ranking factors, all the way at the bottom, and they are worth paying attention to, because if we can avoid them, we might learn something that leads to better ranking models and better correlations.

Domain Name Length

Starting with a simple one, an obvious negative correlation is domain name length, at 0.07. This is kind of an obvious one. A domain like Shoes.com tends to rank better in search results than something like Buy-Cheap-Mens-Shoes.com. Now again, correlation does not equal causation, and we can think of a lot of reasons for this. Shoes.com is probably a much older domain name; it has probably been around for 10 years and has a lot of backlinks. Buy-Cheap-Mens-Shoes.com looks a little spammy and is probably not going to earn a lot of links. By the way, if you look at the correlation statistics, the number of hyphens in a domain name is another negative factor. That doesn't mean you can't use long domain names; it just means that, from what we observed, they tend not to do as well.

Response Time

Here's kind of a controversial one: response time, at 0.05. We love drawing small pictures of animals here on Whiteboard Friday, so here are our tortoise and our hare. Now, we don't really know what drives this. There is a lot of debate in the SEO world about whether slower web pages and slower servers cause lower rankings, and we don't have a lot of data or a definite answer. The correlation isn't huge, but we see that slower pages tend to rank a little lower than others. We do know that faster websites and faster response times make for a better user experience, so if you have a slow site, it is definitely worth looking into.

AdSense

Now here is a surprising one. A lot of people new to SEO think that if you use Google services, such as installing Google Analytics or putting AdSense on your site, Google tends to favor your site and you'll rank higher. The correlation data shows exactly the opposite: the number of Google AdSense slots is correlated with lower rankings, at 0.06. You've seen these pages - you click on them and they are filled with AdSense - and they tend not to rank as well as pages with fewer AdSense slots. The same goes for the number of pixels: not just the number of slots, but the sheer amount of space on your page taken up by AdSense is associated with lower rankings. Again, that doesn't necessarily mean AdSense causes it, but that's what we see. As a user, think about which page you would rather link to; all else being equal, I'd much rather link to the page with fewer ads. So it makes sense.

Percent of Followed Linking Pages

The most surprising result of this year's correlation data was the percent of followed linking pages. This requires a little explanation: if nearly all of the links pointing to your domain are followed, we tend to see those sites ranking a little lower. That doesn't seem to make sense at first - you'd think that if all your links were followed, your rankings would be great. But think about domain diversity. Sites that rank well tend to have a lot of sites linking to them, including sources like Wikipedia, whose links are nofollowed, and citations without followed links. In general, they have a diverse link profile, whereas spammier, smaller, newer sites go after followed links specifically. They work very hard for each one, and their diversity is not as great.

These are only a few of the negative ranking factors that you'll find in this year's 2011 SEOmoz Ranking Factors. We'll link to it at the bottom of this post so you can dig in and explore on your own. It's worth looking into all of them - you can learn so much about SEO. I'd love to hear your comments. Thanks, everybody. Have a great day.