Stone Temple Consulting Articles on SEO Topics
Articles, interviews, and case studies on a wide range of search engine optimization (SEO) topics.
Copyright: StoneTemple Consulting (STC)
  Tue, 30 Sep 2008 23:09:28 +0200
Published: September 29, 2008

Bruce has operated as an executive with several high-technology businesses, and comes from a long career as a technical executive with leading Silicon Valley firms, and since 1996 in the Internet Business Consulting arena.

Richard is Founder, President, and Chief Gooroo for AdGooroo, and is a long-time internet marketer with over 15 years of experience in technology and advertising management. He was previously a technology executive at Publicis Groupe/Leo Burnett.

He has a BS in Computer Engineering from the University of Illinois and an MBA in Entrepreneurship and Technology Management from the Kellogg Graduate School of Management at Northwestern University. Richard is a regular speaker on search marketing topics, is a certified expert in both email marketing and conversion optimization, and is the author of "Mastering Search Advertising - How the Top 3% of Search Advertisers Dominate Google AdWords".

Interview Transcript

Eric Enge: Tell us a little bit about the book and what motivated it.

Rich Stokes: The book started out as the result of a study that we did. We started in December, and we surveyed about eighteen different industries. What we were hoping for was just to compile a list of interesting statistics, and what advertisers were doing in the AdWords marketplace.

What we found really surprised me, that fewer than 3% of advertisers were getting 80% or more of the available traffic from Google. That is pretty much what motivated me to do this book.

Eric Enge: It seems to me that there are just a lot of people who structure and build their campaigns very inefficiently, and that's probably a substantive part of the explanation for why 3% of the advertisers get 80% of the clicks.

Rich Stokes: That's right.

Eric Enge: Okay great. So let's talk about some of the things from the book in sequential order. For example, how about some common keyword mistakes?

Rich Stokes: The biggest mistakes that I see these days are people just dumping in thousands of terms. They don't really pay too much attention to the relevancy of those terms. So, you can get dinged for that pretty big on Google these days. They are not only looking at your click through rate for any individual term, they are looking at your click through rate for the campaign as a whole.

If you have a lot of really off-target keywords, it doesn't penalize you too much, because they have poor quality scores and don't end up being shown very often, if at all. It's those mediocre keywords that kill you: the ones that don't do horribly, so Google doesn't have to be as active in raising their bids, but that are not really high enough to stand out above the industry average.

Taking your competitors and your own campaigns into account, that's the really important thing. So, the number one keyword rule is to focus on keywords that are extremely relevant to your campaign. You want to focus all your efforts on exploiting those keywords to the maximum extent possible.

Eric Enge: Right. So, the big issue that you highlighted essentially is that the mediocre keywords are probably going to get lower click through rates and drive up the cost of your click throughs.

Rich Stokes: Yes.

Eric Enge: Right. So, click through rate gets dragged down, and the cost per click goes up.

Rich Stokes: That's right.

Eric Enge: So, what's the right way to choose keywords then?

Rich Stokes: Well, you really have to take a disciplined approach to it. So, one of the things that I recommend in the book is to use a wide variety of tools. Now, everybody knows of the free tools out there, and there are good ones. I think the biggest free tool out there is Google's own keyword suggestion tool, which is built right into AdWords. It's a great tool, but it doesn't return a lot of results.

It will tend to be between two and three word phrases. They give you a lot of bogus ones, and today it's a computerized algorithm that's making the suggestions. So, that's just something you have to live with.

Wordtracker is really useful for building out your initial keyword list. One of the key differentiators in the marketplace is that there is good data out there. You have to pay for it, however, because that data is not free to the company that's providing it to you.

Fewer people are willing to pay for keyword research, and therefore you have fewer competitors that know about those keywords. So, you could win on two fronts there. Another real resource for keywords that requires a little bit of programming knowledge, is to go into the APIs of the different search engines.

That's something I haven't covered in the book because it's beyond the capabilities of the average search marketer. But, if you have programming resources at your disposal, getting direct feeds from the Google, Yahoo, and especially Microsoft Live Search APIs is a really smart thing to do.

Eric Enge: What data are you getting through those APIs that's helping with the keyword selection?

Rich Stokes: Well, keyword suggestions for one, but also keyword traffic estimates straight from the engines themselves. In Yahoo's and Microsoft's case I don't believe they expose the traffic estimates directly. I had heard that Google has started to expose them through the AdWords interface, but I haven't confirmed that myself.

Eric Enge: And, I think another interesting way to come up with keywords is when you have broad match type keywords: you obviously get a lot of clicks on things which aren't the exact phrase you bid on. You can analyze the search phrases that people are coming in on, then take some of those and put them in as separate keywords.

Rich Stokes: Exactly. That's a great practice for tuning your campaigns.

Eric Enge: Talk a little bit about the customer life cycle model.

Rich Stokes: What that model says is that regardless of your industry, you have at least three major constituencies of web traffic who are searching for you. One of those three groups is the browsers, those are people who are not really sure what they want, but generally are looking to get educated and for some more information on a particular subject.

You also have shoppers who may know something about the product. They probably have identified a need, but they don't know what their options are for filling that need. And then you have the buyers who have identified their need. They most likely have the product or the service identified, and they are looking for specific information. In particular, pricing is a very big one with that group.

So, the differences in these three groups really tell you a lot about how you structure your keywords, your keyword groups, as well as your ad campaigns and landing pages. Browsers are less likely to purchase, and they are looking to get educated. You don't want to spend a lot of money on that group. You also want to focus your marketing efforts on educating the customer, in some sense setting the bar for their expectations.

IBM did this. In fact, we put the book out there to educate the market heads about search advertising the way that the big boys are doing it. Obviously, the big boys always get customers. We hope to not only show people how to get better results from search marketing, but in the process they also begin to see the value of our tools. So, our book is a good example of something aimed at browsers; but shoppers and buyers are different stories. So, shoppers are typically looking for comparisons and reviews, whereas buyers are looking for specific models. They are looking for reviews on specific products, pricing, delivery options and all sorts of things. So, the traffic is there and I think there is a spectrum.

With the browsers you are doing the soft sell, but with the buyers you are doing the hard sell. You might want to put things like your pricing right in your ad, for instance. You might want to say 'hey, free delivery if you order today.' You don't want to spend a lot of money on browsers, but you probably want to spend top dollar for the buyers, because those are the people most likely to buy something from you, and shoppers are typically somewhere in between.

Eric Enge: What about setting up ad groups?

Rich Stokes: A naive way of structuring an ad campaign would be to take every single keyword or close grouping of keywords, and then create three ad groups for each one of them. One would be for the browse bucket, one would be for the shop bucket, and the third would be for the buy bucket. Then you would tailor your bids, your ad copy, and your landing pages for each one. It's a lot of work, but the book talks about how to shorten that process.

Most of the time keywords would only fall into one bucket. You can generally, though not always, tell which behavioral bucket a keyword falls in based on two things. One is the length of the keyword phrase in terms of the number of words. And, two is the specificity. So, if I ask you like what category "antivirus" falls into, that would obviously be...

Eric Enge: Pretty general.

Rich Stokes: Yes. And if I ask you what bucket a "Magnavox DVD 100 television, a nineteen inch television set" would be in?

Eric Enge: They would be buyers, when you specify a model number you are not a shopper, unless you add the word "reviews".

Rich Stokes: Exactly. That word "reviews" I think is a good case of where you are not sure which bucket the phrase might be in. If they add "reviews" they might be looking to buy, or they might just be looking to compare different models, in which case they are shoppers. The good news is you don't have to have three ad groups for every group of closely related keyword phrases in your campaign. You generally will need one, and in some cases you may need two; it comes down to trial and error. You can even put more scientific testing around that to figure out which is the best way.
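The bucketing rule described above (phrase length plus specificity) can be sketched in a few lines of Python. This is only a rough illustration, not the method from the book: the cue word lists, the "contains a digit means a model number" test, and the word-count thresholds are all assumptions made for the example.

```python
import re

# Hypothetical cue words. The interview only mentions "reviews" (shoppers)
# and pricing/delivery (buyers) as examples; the rest are assumptions.
SHOP_CUES = {"review", "reviews", "compare", "comparison"}
BUY_CUES = {"price", "pricing", "buy", "order", "delivery", "shipping"}


def classify_keyword(phrase: str) -> str:
    """Guess the customer life cycle bucket: 'browse', 'shop', or 'buy'."""
    words = phrase.lower().split()

    # Comparison-style words suggest a shopper, even with a model number.
    if any(w in SHOP_CUES for w in words):
        return "shop"

    # Specificity: a digit is treated as a model number (an assumption).
    has_model_number = any(re.search(r"\d", w) for w in words)
    if has_model_number or any(w in BUY_CUES for w in words):
        return "buy"

    # Otherwise fall back on length; these thresholds are assumptions.
    if len(words) <= 2:
        return "browse"
    if len(words) == 3:
        return "shop"
    return "buy"


for kw in ["antivirus",
           "Magnavox DVD 100 television",
           "Magnavox DVD 100 television reviews"]:
    print(kw, "->", classify_keyword(kw))
```

A heuristic like this only gives a starting guess for each keyword. As noted in the interview, ambiguous phrases still need trial and error or proper testing.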

Eric Enge: Let's talk about power keyword matching.

Rich Stokes: Keyword matching is a great thing; most people use broad matching only. And, as far as it goes, that's the easiest, lowest effort thing that you can do. But, adding in different match types is almost as little effort, and the payoff is absolutely huge. Let's say we take the Magnavox television as our keyword phrase.

Our naive approach would be to put this phrase, and related phrases, in their own ad group, and create some ad copy and a landing page for that particular keyword. With two little tweaks you can dramatically increase the click through rate, while simultaneously lowering the cost per click of that group. And, that is simply adding the two additional versions of the keyword phrase of Magnavox television using exact matching, and phrase matching.

It doesn't always work for every keyword, but generally for two or three word combos where you have a pretty good chance of getting exact match, your click through rate can absolutely skyrocket. In those cases where you get an exact match you stand a very good chance of getting your ad pushed into the premium positions on the search results page, and getting a correspondingly high click through rate.

Setting this up takes a very short amount of time. I personally have an Excel spreadsheet that I have rigged up that does it for me. I have the broad match version in the first column, and then my second column adds brackets around it for exact match. Then, my third column has quotes around it for phrase match. I can take that and paste it into AdWords, and I get the job done very, very quickly.
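The three-column spreadsheet trick is easy to reproduce in code. The sketch below is not Rich's actual spreadsheet; the function name `expand_match_types` is made up for illustration. It simply emits the broad, exact (bracketed), and phrase (quoted) versions of each keyword, ready to paste.

```python
def expand_match_types(keyword: str) -> list[str]:
    """Return the broad, exact, and phrase match versions of a keyword."""
    kw = " ".join(keyword.split())  # collapse stray whitespace
    return [
        kw,          # broad match: the plain phrase
        f"[{kw}]",   # exact match: brackets around the phrase
        f'"{kw}"',   # phrase match: quotes around the phrase
    ]


for line in expand_match_types("magnavox television"):
    print(line)
```

Running it over a whole keyword list gives one line per keyword and match type.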

Eric Enge: Right. And, it's basically as you've explained, that the gain you get out of that is that exact matched phrase effectively improves your position, because when the exact matched keyword is typed in, it's a stronger match.

Then, correspondingly with the phrase match it has the phrase in it. It will do better than a broad match that matches up with the same thing.

Rich Stokes: You got it, exactly. The other type of matching that I'll touch on is negative exact match. If you have broad keywords that generate a lot of traffic and you can't bear the thought of losing that traffic, then the negative exact match is really a great way to go. I have used this in the past to target keywords like Spyware. What the negative exact match does is show my ad on any longer phrase containing Spyware or anti-Spyware, but doesn't show it if somebody just types in that one keyword. That ends up filtering out a tremendous amount of traffic.

90% to 95% of people that type the keyword Spyware into Google do not buy anything. So this effectively filters out the traffic you don't want, but lets you get the traffic you do want. Those phrases will still end up showing up in your server logs, so you can mine the logs for the longer phrases that people are typing, like "how to eliminate Spyware". Find those long phrases in your server logs, and then add them into your campaign.
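To make the effect of a negative exact match concrete, here is a toy simulation. It is a deliberate simplification: real broad matching is far looser than the word-containment test used here, and `ad_is_eligible` is a hypothetical helper, not part of any AdWords API.

```python
import re


def ad_is_eligible(query: str, broad_keyword: str,
                   negative_exact: set[str]) -> bool:
    """Toy model: broad keyword matches any query containing it as a word,
    unless the whole query equals a negative exact match term."""
    normalized = " ".join(query.lower().split())

    # Negative exact match: block only when the entire query is the term.
    if normalized in negative_exact:
        return False

    # Simplified broad match: the keyword appears as a word in the query.
    # Splitting on non-alphanumerics lets "anti-spyware" count as a match.
    tokens = re.findall(r"[a-z0-9]+", normalized)
    return broad_keyword.lower() in tokens


negatives = {"spyware"}
for q in ["spyware", "how to eliminate spyware", "anti-spyware software"]:
    print(q, "->", ad_is_eligible(q, "spyware", negatives))
```

The bare one-word query is filtered out, while the longer phrases that contain it still trigger the ad.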

Eric Enge: Right, like when you have very, very general keywords that have potentially multiple meanings. For example, Jaguar, which could be a car, an animal or a guitar.

If you are selling the guitar, you know if you go broad match your conversion will be poor. You might want to use exact match on some 2 and 3 word phrases to keep yourself from showing up on animal or car related terms.

Rich Stokes: You can also negative match on the car and animal related terms.

Eric Enge: Right. Let's talk about the cost of low coverage.

Rich Stokes: So, there is a common belief that if you bid more than anybody, your ad will appear all the time. This is not true. There are people bidding on keywords where they are bidding higher than every last one of their competitors, and their ads are showing 5% of the time, and 95% of the traffic is left on the table.

Fewer than 3% of advertisers have greater than 20% coverage, so 97% of advertisers can increase their relevant traffic by five times by fixing their coverage problems. Coverage means that your ads are only showing up a certain percentage of the time for the terms that you are targeting. So if I type in Magnavox television 100 times, and the ad only shows up 50 times, then the coverage for that ad is 50%.
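The coverage arithmetic can be written out as a quick check. The function names are made up for illustration, and the multiplier assumes that traffic scales in direct proportion to coverage, which is the simplification implied by the "five times" figure.

```python
def coverage(times_shown: int, searches: int) -> float:
    """Fraction of searches on a term for which the ad was displayed."""
    if searches <= 0:
        raise ValueError("searches must be positive")
    return times_shown / searches


def max_traffic_multiplier(current_coverage: float) -> float:
    """Upside from going to 100% coverage, assuming traffic is
    proportional to coverage (an assumption, not a guarantee)."""
    if not 0 < current_coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    return 1.0 / current_coverage


print(coverage(50, 100))             # ad shown 50 times out of 100 searches
print(max_traffic_multiplier(0.20))  # an advertiser sitting at 20% coverage
```

An ad shown on 50 of 100 searches has coverage 0.5, and an advertiser at 20% coverage has at most a fivefold upside under this assumption.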

The goal here is to get your coverage to 100%. You always want to be there, and there are a lot of reasons for that. If you have a brand, or something that could potentially be a brand, you want to appear every time that somebody searches on your keywords, because the brand equity that the impressions generate is very valuable. Secondly, there is also an effect on brand recall when people click through and continue to see the same brand over and over again consistently.

It generates the perception that you are a company of importance, or a brand of importance; whereas if you see a company once every 10 or 20 times, you might NOT consider them seriously. So, there is that perception as well. In addition, most people who have low coverage are generally overpaying for the traffic they get. The way I communicate this to people is to say: if your advertising budget for today is one dollar, would you rather spend the entire dollar getting one visitor to your website, or getting ten visitors to your website, meaning that they only cost you ten cents each?

You almost certainly have higher bid prices in cases with low coverage, and you'll almost certainly benefit by lowering your bids. Obviously, you get more bang for your buck; you are paying less for your ads, but also there tends to be less competition. You have a much greater chance of getting higher coverage, keeping all other factors equal, such as ad copy.

Eric Enge: The idea is to manage your budget in a smart way to get the maximum amount of traffic for your investment.

Rich Stokes: Right, and maximizing your reach.

Eric Enge: Indeed. Alright, so what about a sweet spot for bidding; is there one?

Rich Stokes: Of course, there are sweet spots for bidding. So, we really used to report on this and go on at great length about how click through rates and cost per click change on Google AdWords depending on what position you are at. So, I think pretty much anybody in search advertising is comfortable with the idea that you are going to pay less at the bottom of the page, and your click through rate is going to be lower.

It's probably one of the most well studied and well verified relationships in advertising. What was not previously done is a study to show the way keyword length factors into the equation. So, we figured that out, and then we broke that out according to the length of the keyword phrase. So, broad keywords have a different curve than there is for specific keywords. Broad keywords are well served at the bottom of the page, paying as little as possible. It's typically only the brand name advertisers, who have deep pockets and a business model where they're not worried so much about direct sales, that can afford to be in the top positions.

Orbitz is a great example; every time you see an Orbitz ad, they are cementing your perception of Orbitz in your mind, so they pay to be up there. When it does come time for you to buy a trip, you become more likely to go to Orbitz, or maybe one of the other two or three big name travel sites who have managed to build some top of mind awareness with you.

Now, to get into specifics: with a broad keyword, what we found with our mathematical model is that in most cases you are going to be best served by having your ad in position 7. Any higher than that and the cost starts going way up in proportion to your revenues, and so the point of maximum profit in most cases is position 7.

Now, as the keywords get longer, you start getting phrases of at least 3, 4, or 5 words. You want to bring your bids up correspondingly, and what we've found is that the sweet spot for these longer keywords is generally position 2. There is a real dramatic drop-off in many cases between position 2 and position 1. So, in position 2 you might be wildly profitable; in position 1 you may be losing an arm and a leg. So, it's really important that you find the right position.
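The "sweet spot" idea amounts to picking the ad position where profit, not traffic, peaks. The sketch below is not AdGooroo's mathematical model, which the interview does not describe. It assumes a very simple profit formula (clicks times the margin between the value of a click and its cost), and every number in it is made up purely to show the calculation.

```python
def profit(clicks: float, cost_per_click: float, value_per_click: float) -> float:
    """Simplified profit: each click earns its value minus what it cost."""
    return clicks * (value_per_click - cost_per_click)


def best_position(positions: dict[int, tuple[float, float]],
                  value_per_click: float) -> int:
    """Return the position with the highest profit.

    `positions` maps ad position -> (expected clicks, cost per click).
    """
    return max(
        positions,
        key=lambda p: profit(positions[p][0], positions[p][1], value_per_click),
    )


# Hypothetical figures for illustration only, not data from the study.
example = {
    1: (100, 2.00),  # most clicks, but each click costs more than it earns
    2: (80, 1.00),
    3: (50, 0.80),
}
print(best_position(example, value_per_click=1.50))
```

With these invented inputs, position 1 loses money on every click while position 2 yields the highest profit, which mirrors the drop-off between the top two positions described above.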

Eric Enge: It's interesting that the whole thing is so dependent on keyword length, but when you have a one word keyword, for example, everybody in that space knows about that keyword and they are going to bid on it. You don't have to do any homework to find it; if you asked the CEO to describe the space you are in with a dozen words, it's probably one of those words.

So, these terms inherently have brand value, and this is what drives the pricing up, and you certainly have people that are willing to lose money on the term. Even though in a direct response model they are losing money they are achieving their objectives.

Rich Stokes: That's exactly right. For them, this may provide very, very cheap branding benefits. Even if you are buying those premium positions, the price is still going to be less expensive than a television campaign.

Eric Enge: Right. So, it would be interesting to see how that drives the pay per click industry in the future as more and more of those brands come in and do that bidding. So let's talk about maximizing relevance.

Rich Stokes: We've picked keywords; we are working on getting those click through rates high, optimizing our bids and so on. So, you have three areas where you can maximize your relevance: #1 is your ad copy, #2 is the fit between the keyword phrase and your ad, maybe even your entire business value proposition, and #3 is the actual landing page itself.

What you want to do is get those things aligned, and I think Google has really done everybody in the industry a great service by formalizing the idea that these three things have to align with each other in some way. Not everybody likes it, and the algorithm is not perfect by any means, but I definitely think it was a smart move.

I think it should be obvious that if you are selling bikes you don't want to advertise on the keyword Mexico. So, let's talk about ad copy. Ad copy is an amazing thing; one single character in an ad can change your click through rate by two to three times very easily.

There are a number of these things, and we've included a number in the book on that. One great example is exclamation points. Google does not allow exclamation points in the title of your copy, but not a lot of people know that they will allow you to put one in line one or line two.

We've conducted a pretty large study of ad copy this year, and we found that a surprisingly high percentage of ads with high click through rates incorporate that exclamation point trick.

Eric Enge: Just make it look different.

Rich Stokes: Yes, exactly. There are a lot of different best practices that you can follow. Some of them we'll be releasing out later in the year, but the idea is you want to continually click test your ads, find out which ads are the best. And, you have a couple of different models for informing you about how to do that.

So, the first tool is obviously the customer life cycle model; what keyword are you targeting? If you are targeting a browse keyword then don't sit there and try to sell somebody something in your ad. It doesn't make sense to put the price of your product in your ad, because very few people are going to be interested. Conversely, if you are targeting the Magnavox television; putting a call to action phrase like learn more might be mediocre at best.

Because they may be looking to learn more, but they might be looking for a good deal. So, a better line there might be "free shipping today only"; that would be a huge winner. And, we see a lot of high impact ads with that call to action model appropriately targeted. So, testing the ad copy is a real big thing.

The next thing is landing page optimization. This is absolutely huge, and very few people know how to do it properly. It's part science; it's part art. I have optimized landing pages for some of the largest companies in the world. I'll give you an example, though I can't tell you the company. Within two months of landing page optimization we increased their online sales from 20,000,000 a year to 80,000,000 a year, and that was all web. So, landing page optimization is a very big thing.

Eric Enge: Excellent. The next thing really was Killer Ad Copy, and you've already started to talk about that a little bit. The process by which you invent your first few examples of ads and try them out, and enhance that over time until you get to the killer copy.

Rich Stokes: We just released a tool. Basically, we devised a way to figure out which ads were performing the best for any given keyword or even an entire industry. And we did that without having access to the individual advertisers' click through rate data. It turned out to be so effective that we put it in a new report. So, anybody who is with the AdGooroo service will put their keywords in. Thirty days after they have their group set up, when we have enough data, we can analyze the impression behavior of the ads in question for each of these keywords.

We can tell them, with a high probability of success, which ads are currently doing the best in their industry. So, that's a really nice way to shortcut the whole learning process.

Eric Enge: In addition to that of course it also makes sense to just keep testing and let Google do a lot of work for you, right?

Rich Stokes: Oh, absolutely. What we do is we start with the ad copy report. For example, when we started we were at 0.79% CTR and we increased it to over 1.5%, and that was virtually overnight. From there we kept testing, and we took the ad copy which was then the top ad copy, and then we turned it into something new by tweaking it.

We kept pushing the bar even higher, so not only do we stand out above the crowd and get lower cost per click and higher click through rates, we also make it much, much harder for any competitors coming into our space.

Eric Enge: That's great. Let's wrap up with a beauty, Google Quality Score.

Rich Stokes: First of all there is a little bit of confusion about it. Google has two bots that come by and look at your site. One is the Googlebot, which indexes your site for web search. The other bot is the Ads bot, and that's the bot that attempts to figure out how well aligned, how relevant your page is for the keyword in question.

So, each bot works by different rules. You know, people try to game the Googlebot all the time, and they are succeeding less and less these days. The Ads bot, however, is a different story; it's much simpler, and I believe it works in a very similar manner to how the Googlebot did in about 2002.

What we find with the Quality Score is that Google is now assigning a Quality Score from 1 to 10; the higher the Quality Score, the better the match with the ad text, the better the match with the keyword, and the lower your minimum bid. So, if they decide that you've got a fantastic match, your minimum bid might be three cents or four cents. If they decide you are a horrible match, your minimum bid might be $10. Very few people can afford to pay $10 a click for anything. And, even if you could, your coverage would be so low that your ads would hardly ever show.

So, it makes very little sense to try to use a brute force approach to the Google Quality Score. Rather, what you have to do is investigate your page and figure out which factors are causing you to have a low quality score. What we found in our research was that many of these factors, though not all of them, were onsite factors, things that were 100% within your control.

Having the keywords in question on your page is a factor. It's not a necessary factor, as the overall genre of your site may be consistent with the phrase they are targeting. But, most of the time it makes a lot of sense to have that word, that phrase on your page. Another very big thing; this is one thing that they have tweaked over the last six months is the page loading time.

What we found was that pages written in certain technologies tended to have lower quality scores. This happens with technologies that insert large binary strings in the code, or otherwise bloat the page code. This slows the page load time and harms the quality score.

Eric Enge: Page load time is certainly a very interesting dimension, but you can see how it relates to the notion of quality. I mean, if it takes 5 seconds for your page to load, and the competitor's page takes one, well, which is the better user experience?

Rich Stokes: Right, exactly.

Eric Enge: So, that is something that really deserves some real consideration.

Rich Stokes: The great thing about this is that you can test it in real time. It's something a lot of people don't realize. Google has recently said that they are going to be updating landing page Quality Scores a lot faster, if not in real time. I don't know if that's true or not, but that was three or four months ago.

In the past, you had to wait three weeks to get an update on your Quality Score. So, we thought of a way around that. If you want to test two different landing pages, what you do is create two different ad groups. These ad groups are identical; they have the same keywords, same matching, everything is the same. The only difference is you give them different landing pages.

You're doing an A/B Split test. Then, what you do is you take the handy-dandy Google AdWords editor which is a Desktop Application that they give away for free on the site. And then, you load up your campaign in that. That gives you a way to look at all of the minimum bids required for each one of your keywords.

Let's say you've got 100 or 150 keywords in your group that you are testing. You may find that five of them have a $10 minimum bid; maybe 30 of them have a $5 minimum bid. Ten have got a $1 minimum bid and so on; you can group those up and you could see how many keywords got cheaper, and how many more keywords got more expensive with both versions of your page.

Using that, you can very, very quickly figure out how to optimize your page to get the lowest average minimum bid on Google. You can do this in less than a couple of hours, which provides an incredible return.
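The comparison step of this A/B test can be sketched as follows. It assumes you have already copied the minimum bid for each keyword out of AdWords Editor into two dictionaries, one per ad group; no AdWords API is used, and `compare_min_bids` is a hypothetical helper written for this example.

```python
from statistics import mean


def compare_min_bids(page_a: dict[str, float],
                     page_b: dict[str, float]) -> dict[str, float]:
    """Compare per-keyword minimum bids for two landing page variants.

    Only keywords present in both ad groups are compared.
    """
    shared = sorted(page_a.keys() & page_b.keys())
    cheaper_on_b = [k for k in shared if page_b[k] < page_a[k]]
    pricier_on_b = [k for k in shared if page_b[k] > page_a[k]]
    return {
        "cheaper_on_b": len(cheaper_on_b),
        "pricier_on_b": len(pricier_on_b),
        "avg_min_bid_a": mean(page_a[k] for k in shared),
        "avg_min_bid_b": mean(page_b[k] for k in shared),
    }


# Made-up minimum bids for illustration only.
bids_a = {"spyware removal": 10.00, "remove spyware": 5.00, "anti spyware": 1.00}
bids_b = {"spyware removal": 5.00, "remove spyware": 5.00, "anti spyware": 2.00}
print(compare_min_bids(bids_a, bids_b))
```

The landing page with the lower average minimum bid, and with more keywords getting cheaper than more expensive, is the better candidate to keep.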

Eric Enge: Thanks Rich!

Rich Stokes: Thank you! The pleasure is always mine.

About the Author

Eric Enge is the President of Stone Temple Consulting. Eric is also a founder in Moving Traffic Incorporated, the publisher of Custom Search Guide, a directory of Google Custom Search Engines, and City Town Info, a site that provides information on 20,000 US Cities and Towns.

Stone Temple Consulting (STC) offers search engine optimization and search engine marketing services, and its web site can be found at: http://www.stonetemple.com.

For more information on Web Marketing Services, contact us at:

Stone Temple Consulting
(508) 485-7751 (phone)
(603) 676-0378 (fax)
info@stonetemple.com
  Mon, 22 Sep 2008 21:42:16 +0200
Published: September 22, 2008

Bruce has operated as an executive with several high-technology businesses, and comes from a long career as a technical executive with leading Silicon Valley firms, and since 1996 in the Internet Business Consulting arena.

Bruce holds a BS in Math and Computer Science and also has his MBA from Pepperdine University, has had many articles published, has been a speaker at over 100 sessions including Search Engine Strategies, WebmasterWorld, ad:Tech, Search Marketing Expo Advanced, and many more, and has been quoted in the Wall Street Journal, USA Today, PC Week, Wired Magazine, Smart Money, several books, and many other publications.

He has also been featured on many podcasts and WebmasterRadio shows, as well as appearing on the NHK TV special "Google's Deep Impact". He has personally authored many advanced search engine optimization tools that are available from the company web sites.

He is recognized as a Wikipedia Notable SEO.

Interview Transcript

Eric Enge: Let's talk a bit about white hat and black hat approaches to SEO.

Bruce Clay: I think it is appropriate to consider the origins of white hat and black hat. Twelve years ago when I started in the industry, the search engines really didn't have any stringent spam rules. They basically said don't put white on white, but they didn't even have a way of enforcing it. And, the spam filters were really, really pitiful. So, back then a lot of the orientation was: my client is paying me to get them ranked; the search engines aren't even enforcing their own rules.

So, getting them ranked was your job. I would say everything was white hat because you were working without spam rules. You had a fiduciary responsibility to help your clients achieve their goals. Most of the SEO companies did it that way. When I started, I felt that the web was going to be like the Gold Rush, and we all know that the people that made money during the California Gold Rush were people selling picks and shovels to the miners, while most of the miners went broke.

So, for every 100 sites that I would likely see, 99 of them would die off eventually. And, most new sites were at risk because most of them were under-funded. So, I thought that was the prudent way to do it; I am selling picks and shovels; Levi Strauss is still here for a reason. So, that was my approach, and I felt at the beginning that it wasn't ethical to really color too far out of the stated spam boundaries even if not enforced. You don't want to burn your client, ever.

SEO ethics is fundamentally about doing no harm to your client. I played within those bounds, but the only way that you know where the middle of the safe zone is, is if you know where the edges are. And, as a result everybody doing SEO work twelve years ago was doing experiments to know where the acceptable zone was and to keep out of trouble. They were experimenting on their own test sites to learn what worked and what was spam to the search engines.

And, that's where a lot of the spammers really started coming in, they were running 30 sites, 40 sites, 50 sites, or even more pushing Pills, Porn and Casinos, or something else that was easy money.

The spam influx occurred because it was big money back then, and that was really what forced the search engines to come up with stronger spam rules and put some teeth behind them. I think that there was no white hat or black hat until the rules were better defined and had teeth; until then you worked for your client and not the search engines. Even if the client was yourself.

Yes, there were people who colored outside of the lines, and they did get slapped a little bit. And then, they went on and did other things, and then they got slapped again, and then the rules became rules. Google actually started enforcing their spam rules and everyone started paying attention.

It was in the '98 to '99 period when the spam filters showed up. 2000 is when they had teeth; it wasn't until Google really said, "I have these best practices, and we are enforcing them," that a lot of people felt that they had to play by rules. Until then the rules were Wild West rules.

SEO was too often whatever it took to get ranking; my client is paying me -- not the search engines, so I owe ranking to them at any cost. I am happy to have the search engines around so I can make money, but the search engines are not paying me and I owe them nothing.

I think the actual term (black hat) was coined by Mike Grehan. In the cowboy era there were white hats and black hats and the good guys always wore white hats. And, that does actually describe the SEO in the Wild West era. And that does relate to how people treat their SEO approach even today.

When the SEO has an objective to get ranked at any cost, including violating best practices, and the kinds of things they do are deceptive, and they do things the search engines are known to be fighting, and they do it deliberately, that becomes black hat.

I think there are only a few people who are really good at black hat. I have to give them some credit for being black hat and being nimble and smart enough to be here today. The black hats have spent their entire lives figuring out where the acceptable spam boundaries are.
They know what the boundaries are, because they play right at the edge. The people that are white hat do not often play at the edge, but they do pay attention to black hat behavior and thus also observe where the boundaries are. Generally, the white hats choose to play in the middle of the acceptable area. And, fundamentally that's my take on how the white hat and black hat mentality differs. The white hats are going to play by the rules, and they are not going to go near the edge and get hit or hurt. And, the black hats recognize that if you can play there and get away with it, there is a lot of money to be made.

The ethics difference between white hat and black hat I think is clear. Ethics being a statement of "do no harm, do not put your clients in harm's way; do not do something to your client's website that will cause it to be banned". That is really where ethics is involved in SEO; it is unethical to harm your client. And, if I know that doing something will harm the client, it's unethical for me to do it.

Eric Enge: Right. Of course there are all kinds of shades of gray right, which is a class of things that you can do, which probably won't harm your client. But, they are not without some risk. In which case, you've got to disclose the nature of the risk, right?

Bruce Clay: Well, is it ethical to tell your client who may not understand SEO at all that this is something that's going to generate a lot of traffic? And that you think you can do it in a way that won't hurt them when it really could. Is that ethical? I would contend that in many cases the client doesn't have a clue what you are talking about, they have to trust the black hat, and they are being led down a path of doom.

It is clear to me that that's not an ethical act. Look what happened to BMW. Somebody made a decision that they were going to do something, thinking, "We are BMW, we are immune," and they were wrong. So, who pays the price?

Eric Enge: The client pays the price.

Bruce Clay: So, I would contend that even with their client's uneducated permission, it's still an unethical act. But it does get really fuzzy, you are right.

There is white hat that plays in the middle and observes where the boundaries are. There is black hat that plays at the edge and knows where the boundaries are. And then, there are the gray hats that are in the middle. I contend that maybe 80% of the people who are gray hats are just undereducated in the white hat way of doing SEO. They don't know where the boundaries are, but if they see somebody else getting away with it then they assume it's OK.

Eric Enge: For example, a lot of small webmasters want to go out and do some advertising, and they buy some links for traffic, and they may not even realize that there is an SEO benefit to all of that.

Bruce Clay: And, they take as gospel the statements made by the people selling these services. People believe that what they are doing is okay. There was a little company in Vegas named Traffic Power, and they had a version of their client's page which was visible to end-users and they had a different version of the page which appeared until the mouse was moved that was stuffed with keywords.

They started selling that because they could show that they were able to get ranked with it. And, there were many, many clients that had this technology installed. At least they did until Google rolled over on them and basically put Traffic Power out of business. Essentially they went out of business because they were spamming. The clients should have known they were spamming but did not, and the company stated that it wasn't spam, that their technology was totally cool and it worked. And, people bought it. Snake oil salesmen are still out there preying on the uneducated.

"Buy my link and you will never get caught" is said way too often. Buying ads is not the same as a testimonial grade link.

The best way to succeed on the web is to do things naturally. Do not do deceptive things; and do not try to fool the search engines. You have to play by the rules; and you need to know where the edges are. You need to know this technique is out of bounds and this other technique isn't. And, you need to pay attention to what the search engines are allowing versus disallowing.

There is a request from Matt Cutts at Google that if you detect a spammer, report them. Report all the spam you find; everybody should report spam. Google wants to catch it all.

Eric Enge: You will be interested to know that I heard from a Googler at the Google Dance that they receive one million spam reports every day.

Bruce Clay: Wow. From my point of view with a million a day, nobody has enough energy to go penalize sites one at a time.

What they have to do is categorize them into the types of spam they are, and assign it to a programmer that will find a million sites at once. Let the spam filters do their job.

It's been reported by people who were black hat that have now switched to the white side of the force, that in their beginning 10 years ago it was taking approximately 9 months for Google to catch a new spam tactic, and now it's in the 3 week to 4 week range. Google has become much more efficient at catching these spam techniques.

Eric Enge: Right. WordPress templates are, for example, one of the latest ones.

Bruce Clay: Right. Well, you spam; you get away with it for 3 weeks, then you die. It isn't worth doing. But, when it took 9 months, or if you did have the benefit and were somehow just flying under the radar, this was when clients saw success and said oh, they are getting away with it; it must be legal. And, that's where the gray players come from.

They are not as well educated about what white hat is and how it really should be played. Those are the ones that end up getting burnt more often. They are also the ones that are hurting the industry. Most of the black hats in my opinion are doing their own personal websites.

SMX Advanced in Seattle in 2008 received a black eye. In order to cover advanced topics, SMX Advanced inadvertently had a fair number of speakers who were black hat. That makes sense, because the people that are playing at the edge (speakers advocating black hat techniques) are the ones that are trying to obtain colleague approval - which did not happen.

Where it went wrong at SMX Advanced is they had black hat speakers presenting tactics where the speakers were running their own websites, not corporate websites. But, the audience was full of corporate webmasters, and the audience heard over and over and over again statements like, don't pay any attention to Google, they are just trying to keep you from being ranked. Do it this way and you'll get the traffic; and Matt Cutts was in the audience writing it all down.

Eric Enge: There were a bunch of other Googlers. I also think there was another dynamic at the conference. It's much, much easier to have a presentation well received if it's funny and entertaining. And, when someone talks about, or makes a rash statement, or talks about a real well done black hat tactic, then everybody is laughing.

They are entertained, and even though the reason why people are sent to conferences is to learn something that goes back to the ROI of the business sending them there, I think there is a certain amount of pressure to entertain the audience as opposed to simply educate. That's just easier to do if you say something crazy.

Bruce Clay: Well, at SES there was the white hat, black hat session, and I was on the panel with Todd, Greg, Jill and Dave. Halfway through the session Matt Cutts requested a microphone in order to respond to a couple of statements that were made on the podium. His response was to a specific statement that large companies get away with spam and that Google looks the other way. And, his response was most large companies do not get away with murder, Google just does not publicize it, and neither do those large companies getting caught.

In many cases the large companies don't even know they have been penalized because they are not smart enough to figure out that they just lost all this ranking.

When things like that happen I think that you have to know that Matt takes it personally. He really, really, really lives anti-spam. And, as he is one of the early Google employees I am sure he is worth a little bit of money, and he can sit there and not take it personally, but the fact is he does. He has earned my respect for his caring.

In addition, a lot of his staff takes it personally. And, these are people that work for a company with billions of dollars in revenue with one mission: protect our property. You've got to think that the areas where you can color outside of the line and get away with it are shrinking exponentially as Google gains momentum, and as they become smarter, wiser at spam tactics and how to fight them. In essence, know thy enemy, then beat them.

You've got to think that the longevity of a black hat is going to shrink, they know it, and it's just a matter of time before they turn to white hat type of tactics. One more thing worth mentioning, and this came out in a panel, a vast majority of the black hat folks are only black hat when they are running their experiments, but they are white hat when they are dealing with clients.

That is an entirely different scenario than existed 10 years ago. So, I would have to think that even the people who are openly violating the rules are sometimes just determining or validating the rules. They are finding out if there are holes in the rules; they are finding out what they can get away with. But, they are not necessarily doing that to their clients, and everyone on that panel wholeheartedly supported my statement that it is unethical to harm your client.

As I mentioned earlier, I think that a lot of the "spammers" are just uneducated. They don't understand what the rules are, and what the constantly shifting boundaries are, or what you can and cannot do. They are home-grown SEOs, and I think the conference audience generally agreed with the idea that what we really need is more education in the market. My company does face-to-face training; there are a lot of companies that are now getting into the seminar business, and collectively we are doing a lot to introduce SEO to the masses.

Network Solutions is running a seminar series in something like 40 cities this year to audiences that are about a hundred people each. They took my course both onsite and at our facility several times and learned how to do white hat and that's what they are preaching. After their seminars, I usually get two or three phone calls saying they talked about me, so they are obviously teaching proper SEO. I think that at the end of the day Network Solutions is doing a great service to our industry. And, as an extension to the education of Danny Sullivan's SMX conference, I am personally doing a two-day training course at the end of SMX in New York.

Eric Enge: At the end of the day the time that you spend in a white hat strategy gets paid back in a stable business. A black hat strategy is not stable; it's very transient as you point out. One of the things that I remember from the panel at SES is the statement that a black hat can do in 30 days something that will take, I think it was, a year or two years for a white hat to do.

Bruce Clay: When I said that I used the example of three and a half years, but yes that's true. It obviously depends upon the industry. I think that a white hat in a local market with few competitors can get them ranked quickly, but most projects are more complex than that. If the keyword is worth having, you can bet somebody is competing. And, if they are competing they are probably doing some form of professional service provider SEO which means that it will become even more competitive; hence it may take two years.

Eric Enge: The timeframe just seems a little long to me. You know we do things in a purest white hat way here at Stone Temple Consulting. And, it just doesn't seem to take three and a half years to accomplish our goals. If the goal is to take an established site and double its traffic; gosh it seems like it's doable in a lot less time than that.

Bruce Clay: If your site has 50 to 100 million pages, and you are going to double traffic, you are talking about probably needing an architectural change. Maybe a redesign of the site; maybe moving servers, you are looking at things that won't cause your servers to melt down into a silver puddle. That's going to take a lot more. My three and a half year example resulted in a 900% traffic increase for a major site, something well worth waiting for.

But it's still true that on average, like you, we have done quite well in 6 months (plus or minus) for many clients. There is often low hanging fruit that you might get in the first few months, but that's generally not the big win. It's somewhere between 5 and 9 months for most sites. Impossible words take longer.

Eric Enge: Then, of course there are plenty of sites where there aren't enough relatively easy wins to get you there that will take quite a bit longer. I agree with that; so I guess there is a spectrum here as well.

Bruce Clay: If I wanted to rank well for "car dealership", how many SEO-savvy dealers would I be facing in any one city? The answer is "not many". On a local level you would think that if I just did my job right it would take a few months, not a year to get ranked. If I want to rank for "cars", that will take a little longer.

So, it really depends upon the business. In a more complex environment, you are also dealing with other things that impact the project: the complexity of the client's CMS, multiple release cycles, distributed servers, things like that really change the project. Small websites like we saw on average 10 years ago, HTML and seldom dynamic content, were easy. Those were the days.

Now with all the dynamic content, the restrictions of many CMS systems, web designers who believe they are designing for people instead of search engines, those kinds of things slow down SEO projects and they do take longer. The more complex the environment, the longer it takes.

Eric Enge: Let's shift gears a little bit, because I know we wanted to talk about some of the big changes coming in search. In particular, some of the things like behavioral search and intent-based search are going to change the landscape of the way things work. Can you talk about that at an overview level for a little bit first?

Bruce Clay: There is so much money to be made by properly targeting ads to queries, and by properly targeting ads to the established behavior known for the person doing the query. Behavioral search and intent-based search are going to change the face of SEO. I will give you an example, and I think it will help you understand how this works.

If you had a room of people and they all searched for the word Java, some people would be looking for coffee, some people might be looking for a programming language, and some people might be interested in travel. So, today you do that search and you get back generic results that might include all three forms of those sites.

Eric Enge: Right. Today search engines deal with the ambiguity by mixing and matching results basically.

Bruce Clay: Right, especially if they don't know which you wanted. With behavioral search, they are able to determine your past behavior. So, if you visit a lot of travel sites, they would bias the search results towards travel information. You might get one programming site, one coffee site, and eight travel sites. They might also determine, based upon your prior behavior that you like snorkeling, or you like golf, or you like deep sea fishing.

So, even the people that did the query for Java that were interested in travel would get different results. Now, at the end of the day you may have fifty people doing the same query and all fifty might get a different sequence of results.
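The biasing Bruce describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any search engine actually works: the function name `rerank`, the example URLs, the topic labels, the base scores, and the boost formula (base score multiplied by one plus the share of past visits on that topic) are all hypothetical.

```python
from collections import Counter

def rerank(results, visited_topics):
    """Order results so topics the searcher visits most often rank higher.

    results: list of (url, topic, base_score) tuples (hypothetical data).
    visited_topics: topic labels taken from the searcher's past visits.
    """
    counts = Counter(visited_topics)
    total = sum(counts.values()) or 1  # avoid dividing by zero for new users

    def personalized(item):
        url, topic, base_score = item
        affinity = counts[topic] / total  # share of past visits on this topic
        return base_score * (1.0 + affinity)  # assumed boost formula

    return sorted(results, key=personalized, reverse=True)

# Three generic results for the ambiguous query "Java".
results = [
    ("java-language.example", "programming", 0.90),
    ("java-coffee.example", "coffee", 0.85),
    ("java-island.example", "travel", 0.80),
]

traveler = ["travel", "travel", "travel", "coffee"]
programmer = ["programming", "programming", "travel"]

print([url for url, _, _ in rerank(results, traveler)])
print([url for url, _, _ in rerank(results, programmer)])
```

With these made-up numbers, the searcher with a travel history sees the travel result first, while the searcher with a programming history sees the programming result first, even though both typed the same query.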

Eric Enge: So, the intriguing thing there is what's involved in collecting enough data to be able to make those judgments.

Bruce Clay: You are right that data collection is hard. I think that what we are going to see is that the design world is going to change a little bit. You are going to have to be more focused on the community of your customer, and not just the keywords.

Keyword research is going to change; the old approach of this is the right way to do keywords, don't pick keywords that don't convert, and know what to throw away and which ones to keep may be absolutely wrong in a behavioral world.

What I think we are going to need to do is change the way we view keywords. And, we are going to have to change the view we have of ranking reports. If I do a ranking report I may say you are #2 where a client will walk up to his computer, type it in and he is #8.

Eric Enge: That can happen right now.

Bruce Clay: It can happen right now, but with behavioral it's pretty much going to happen a lot.

Simply put, the measurement of the success of an SEO program is going to be based on how much traffic it can generate and not based upon just ranking. And, the traffic that SEO generates includes all the long-tail keywords, not just the few keywords selected for a project. Searchers type in combinations of words that you and I would never dream of using together, and if done properly it results in traffic.

People type in "seo ppc design analytics" in any order and we show in the results. I wouldn't type it in, but some searchers have. If you are in the results then you are going to get the traffic. It may not be your targeted traffic, but if you did rank then you are getting some long-tail traffic.

What I think we are going to see is that the way we perform keyword research is going to change. We have already changed some here at my company. I think that the approach to how we measure success is changing. We have already changed it; we actually became an Omniture agency just so we can report on analytics and integrate it into our tools.

We are reengineering our entire SEOToolSet around what is in the market as well as what will be the market a year from now. We are preparing for the needs of the future market that I think that we are going to need worldwide.

Eric Enge: Yes. I have written on our blog about the kinds of the things that you should be monitoring if you are an SEO. I am just not at all a fan of ranking reports and haven't been for some time. I certainly go on a website I am working on, and I'll go poke at a term or two occasionally. But, if it doesn't change what you are going to do today, why are you looking at it?

What you are supposed to be doing is building some great content and promoting it to get links, and being smart about how the content is targeted and matches up with conversions. Well, none of that involves a lot of the things that people end up spending a lot of time looking at.

The more hilarious thing is people that were obsessed with different data centers. And, the difference between ranking reports and data centers oh my goodness; you could spend hours on this stuff, and you are doing that instead of work.

Bruce Clay: Yes. Traffic rules.

Links are another great traffic source. The best thing you can do is to behave in a natural way to develop websites that everybody will want to link to. We refer to these as link magnets.

Eric Enge: My name is Link Nirvana. The links come without effort.

Bruce Clay: You got it, Link. I can spend a hundred hours designing a page and get a thousand links. Or I could spend a hundred hours begging for links and wish I had spent it elsewhere.

In summary, you have people focused on ranking instead of traffic, which is what we think is important, and we find too many people begging for links instead of building content worth linking to. I think that you have SEOs going in the wrong direction. There are a lot of people out there that are teaching classes and I think they are teaching the wrong stuff.

It is not helping people understand that traffic is king. It is not helping them understand how to build things that people will link to, and it's not telling them how to understand the relationship between natural behavior in an index and the content they just wrote.

I will tell you right now. Every journalist on the planet thinks they are writing good content; and they are trained to write content. And most of it doesn't have a chance of ranking.

Eric Enge: Right. It gets a little deeper than just writing good content, because if thirty other people have written that article, I mean not exact same words of course, but, essentially the same content, then it has no draw. So, how do you bring unique new value to the picture becomes a big question I think.

Bruce Clay: Yes. And, I think that changing to these things is going to ripple throughout the entire system. I think we are going to see a lot of people who have tools that are centered on keywords, or ranking, or things like that have diminishing impact on our industry. They need to evolve. SEOs who haven't really paid attention to industry changes are just going along day by day and one day they will be out of business.

The SEO tool is a tool to provide data for humans to consider. There is a big difference between having data and having wisdom. The tool gives you data; you apply the wisdom.

Assuming you know how to do SEO, and assuming you know what your community is, and how to behave in a natural way within that community, and how to create things that people will link to, you will then have one of the many building blocks for SEO.

Eric Enge: You talked a little bit earlier about localized search.

Bruce Clay: I think that that has to do in part with intent-based search. If I am looking for general information like how do you make a pizza, I think that it's natural for the sites that come up to not be specifically local to where I am.

If I say pizza parlor address, I would expect the search engine, if they can, to figure out what community I am in, and to give me pizza places that are near me.

Now, I think this is going to be driven by mobile a little bit more, but that is what I would expect. That's really where local is going to play. Local is going to be tied to the intent of the query; if the intent of the query is to find a restaurant then the search engine will give you local results automatically.

If instead of a commerce oriented search it is a research oriented search then the location of the site that answers that question has nothing to do with local. From a standpoint of what we are trying to do at a local level, I think local is a great concept that needs a catalyst. And I think that intent-based search will be that catalyst. I think that if we can understand the kinds of words that people are using to query for us, then we can optimize for that.

Then, we can come out and say okay, this is a keyword that people will use to find me within my community, and the intent of the keyword will trigger a local search. Then, I think everybody will understand that's the way they have to build their content. And, I think that's where SEO is going to come in.

Eric Enge: Another big influence I think, is that more and more people are beginning to understand the branding impact, right? I've had it happen many times in dealing with major brands that they are getting very emotional about their rankings on certain kinds of things. I had a situation like this the other day where somebody was ranking fine for their brand name themselves, but three places below them was a rip off report.

That was just lambasting the brand, and that's the reputation management aspect. And then, there are perhaps the core keywords for their product space, maybe it's used cars, or something like that. And, they want to rank for those too, and they are not even so much concerned about the ROI as with the brand issue.

Bruce Clay: I have been paying attention to brand name sites and we see that 80% of their traffic from organic search contains their brand name in the query. They are not getting traffic at all from generic search. And because they get traffic from only those already brand aware and not new customers, they are really losing brand recognition and future revenues.

I am getting a lot of calls from people who have tried other SEO companies, even very large SEO companies, and who ended the relationship with the SEO the day they actually saw their analytics data. They are realizing that all that happened is they got ranked for terms that they couldn't lose anyhow. It's a shame.

Eric Enge: Yeah, I know. When we talk with people like that, we always start with okay, what's your current search traffic, and get that answer, and say okay, now let's remove the branded search traffic. Okay, so that's the piece that we can grow, right? We are not going to grow their branded search traffic unless their TV campaign drives more interest in their brand, right? The only thing we can do is take the other part of it, and bring that up.
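The split Eric describes, removing branded queries to see the traffic that SEO can actually grow, can be sketched as follows. This is a minimal illustration only: the function name `split_branded`, the brand term "acme", and the query data are all hypothetical and were chosen so that the branded share matches the 80% figure Bruce mentions. Real analytics exports will need more careful brand matching (misspellings, product names, and so on).

```python
def split_branded(query_visits, brand_terms):
    """Split organic search visits into branded and non-branded totals.

    query_visits: dict mapping search query -> visits (hypothetical data).
    brand_terms: substrings that mark a query as branded.
    """
    brand_terms = [term.lower() for term in brand_terms]
    branded = 0
    generic = 0
    for query, visits in query_visits.items():
        if any(term in query.lower() for term in brand_terms):
            branded += visits
        else:
            generic += visits
    return branded, generic

# Made-up organic search data for a fictional brand called "Acme".
sample = {
    "acme motors": 600,
    "acme used cars": 200,
    "used cars boston": 150,
    "car loan rates": 50,
}

branded, generic = split_branded(sample, ["acme"])
branded_share = branded / (branded + generic)
print(f"branded: {branded}, non-branded: {generic}, branded share: {branded_share:.0%}")
```

In this made-up sample, only the 200 non-branded visits represent the portion an SEO program can reasonably be expected to grow.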

Bruce Clay: You are so right.

Eric Enge: That's the way we always go over it. It just ends up being common sense, right? I think it takes a little while to develop enough knowledge of the industry to see it that way.

Bruce Clay: Yes.

Eric Enge: So, that should be a pretty significant change as well.

Bruce Clay: I think it will be; I think that we are going to see local search come as a result of intent-based search.

I think we are going to see behavioral search play a big role, and organic search results are going to be considered volatile when in fact they are almost predictable by the community that you are in.

There are going to be problems even in behavioral. If I spent a lot of time looking for a gift for somebody who just had a baby, I don't want to go through life having search engines think I just had a kid.

But, once these things are cleaned up and are running; we are going to see behavioral be a big play or intent based search be a big play. They are already both being focused on by the search engines and built in to the search engine process today.

Traffic is going to be the way we get measured. If what you are doing is selling them first page rankings, what are you selling?

There are Snake Oil Salesmen who are guaranteeing rankings. The best we can do is help the client build their business by helping them get into search, thus getting into the face of the people in the community that they serve.

Eric Enge: Thanks Bruce!

Bruce Clay: Thank you!

About the Author

Eric Enge is the President of Stone Temple Consulting. Eric is also a founder in Moving Traffic Incorporated, the publisher of Custom Search Guide, a directory of Google Custom Search Engines, and City Town Info, a site that provides information on 20,000 US Cities and Towns.

Stone Temple Consulting (STC) offers search engine optimization and search engine marketing services, and its web site can be found at: http://www.stonetemple.com.

For more information on Web Marketing Services, contact us at:

Stone Temple Consulting
(508) 485-7751 (phone)
(603) 676-0378 (fax)
info@stonetemple.com
  Tue, 16 Sep 2008 16:19:52 +0200
Published: September 15, 2008

Nathan Buggia is the Lead Program Manager for the Live Search Webmaster Center, Microsoft's suite of tools designed to help web publishers get better results from Live Search. Buggia has overall responsibility for all the Webmaster Center Tools, Community and evangelism within the search marketing industry.

Previously, Buggia spent six years in Microsoft's Server and Tools division, most recently as business manager for the Solution Accelerator Group. The division builds end-to-end IT solutions for enterprise customers in Security, Management, Systems Architecture and Interoperability.

Nathan has been working in various aspects of web technology since his first web dev job in 1997, working in Perl and CGI. Since then he has worked in Java, C++/CGI, ASP.NET, Systems Administration, Systems Architecture, and Bio-informatics.

Interview Transcript

Eric Enge: Can you start with an overview of what's new in Webmaster Tools?

Nathan Buggia: The short answer to what's new in the Webmaster Tools is pretty much everything. This is a significant update to what we shipped last November that provides a lot of really interesting data and resources for webmasters. The data and feedback we provide them is generally search engine agnostic so it should be applicable to all the major search engines.

Within the Webmaster Center, we have two features. The first one is an online community, and the online community is a set of blogs and forums where we provide direct technical support to publishers as well as provide best practices and news around the community.

The second is the Webmaster Tools, and the goal of the Webmaster Tools is to provide a self-service toolset to all seventy-two million active publishers to help them understand how Live Search is crawling and indexing their website as well as how their website might be ranking.

Eric Enge: Certainly your goal is to help webmasters do better in Live Search, but people can use the information in Webmaster Tools to help them with all search engines.

Nathan Buggia: Yes, that's correct. Our goal is to help provide support to publishers. We realize that search is becoming mission critical to publishers these days, both from the standpoint of just traffic going to their website as well as brand recognition. So how people are starting to navigate the web is changing based on search.

We want to provide support to publishers to make sure that if there is an issue that their site is having with a search engine, we can help them fix it. That way we get the best index of their site possible, so they get the best results possible.

Eric Enge: Can you talk a little bit about what the Page Score is that you show for the web pages listed inside of Webmaster Tools?

Nathan Buggia: Page score is a measurement for publishers to see generally how authoritative Live Search believes your content is. It's not an exact metric, it's a general metric. The way a webmaster should look at it is, if they have five green boxes that means they are probably doing well, they are probably in the top echelon somewhere. It doesn't necessarily mean you are at the level of Amazon's or eBay's homepage, but you are generally doing pretty well and you should feel good about that. If your score is low, maybe five empty boxes, then that page or that domain has more work to do to gain authority.

Eric Enge: Right. Now, if a site gets penalized for some reason, does that affect its Page Score or is that independent?

Nathan Buggia: It's possible. There are a lot of different penalties that can happen between a website and a search engine. Penalties are another area where we try to provide some transparency to the publishers as well. If you notice on the summary page of the Webmaster Tools, we have a feature called Blocked. What we are surfacing here are quality-based penalties, at both the domain and page levels. So, if your website has experienced any of these, you will see a "yes" there, or in the table below you may see a "yes" aligned with specific pages. And if you do have a penalty, you can just click the hyperlink and it will take you to a form to request reevaluation. However, before you request reevaluation we highly recommend you go and take a look at our Live Search Webmaster Guidelines, which are linked off of webmasters.live.com, and make sure that everything you are doing in your site adheres to those guidelines.

Eric Enge: Right. The blocked indicator is actually quite interesting because the one above the list of URLs, I assume is for the site overall and then over here next to each URL is just identifying on a URL by URL basis.

Nathan Buggia: Yes, that's correct.

Eric Enge: I assume that there is probably some latency when a new web page is created, you don't know what its Page Score is yet until you have crawled it? Or is it something that you can determine as soon as you encounter the page?

Nathan Buggia: We hear a lot of that from webmasters. They'll come and ask us "Hey, why is my page ranking so low?" And then, we go out and take a look at it, and it's a brand new website that doesn't have a whole lot of traffic coming to it yet. It may not have a lot of links yet, or it may not have a full set of content yet. Those are all things that we take a look at for the authority score.

The domain or Page Score is really based on some other factors. If it looks really small initially, then what you want to do is go out and market the site, go out and build the best content, talk to different people in the industry, and figure out what they need, what would get them to link to your website. And just make sure you have good, unique content.

Eric Enge: Let's talk a little bit about the crawl issues.

Nathan Buggia: So the crawl issues feature is one of my favorite features, and one that we've spent quite a bit of time working on. What this feature does is it gives you access to a set of reports that show a set of issues that Live Search may have encountered while crawling your website. So, the first issue that we provide information on is 404 errors.

Anytime we encounter a page with an HTTP status code of 404 we stop indexing the page, we don't look at the content on this page. This can happen for a variety of reasons. Someone could have misspelled the URL on their blog when they were linking to your website, you could have a broken link in your website, or you could have a content management system that is not returning the correct status code.

This has actually happened on microsoft.com on our MVP profile section. And we were able to use this tool to find the issue and resolve it. But there are a lot of reasons why this happens, and the downloadable reports allow you to hand it off to your IT department or your technical folks, and help them scope the problems so they know where to start.

Eric Enge: I really like the 404 report, because in the scenario you talked about, you can use that report and then create a 301 redirect from the broken URL, the one returning the 404, to the page that was intended.

That will pick you up a link, because the link to the 404 page doesn't bring any value to you. And I am looking at one here right now from my site and evidently someone has linked to the analytics study we did in '07, and it's a bad URL. So a simple 301 redirect could pick up the link.

Nathan Buggia: That's exactly right, all those customers that might be coming to that link, wherever it was linked from, will now get the right article and you'll get the credit and everything is great.
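The redirect approach discussed above can be sketched in a few lines of Python. This is only an illustration: the paths and the `REDIRECTS` table are hypothetical, and a real site would more likely configure such rules in its web server or content management system.

```python
# Hypothetical map from broken paths (as listed in a 404 report)
# to the pages that the linking sites most likely intended.
REDIRECTS = {
    "/articles/analytics-study-07.shtml": "/articles/analytics-study-2007.shtml",
}


def resolve_missing(path):
    """Decide how to answer a request for a path that has no page.

    A path found in REDIRECTS gets a permanent (301) redirect, so that
    visitors and crawlers following the bad link reach the real page.
    Anything else remains a plain 404.
    """
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 404, None


print(resolve_missing("/articles/analytics-study-07.shtml"))
```

Each entry added to the table recovers one inbound link that would otherwise point at an error page.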

The next report is about URLs blocked by the Robots Exclusion Protocol. Now, the Robots Exclusion Protocol (REP), as you know, is not a standard that was designed from the ground up; it is something that has really evolved over the past ten years. It is complicated, and not always well understood by the industry. The most comprehensive article I have seen is out on Jane and Robot.

It really gives a good overview of all of the different functionality that the REP provides and then how to do it, and what the support is on the different engines. The Robots Exclusion Protocol is very complicated. So what we have done here is provided publishers a list of all the URLs on their website that are blocked to us crawling based on the protocol.

What this does is it allows publishers to go and do an audit of their website. They can take a look and make sure that all the content that they want to be indexed is being indexed, and all the content that they don't want to be indexed is appropriately blocked.

Eric Enge: Right, so the scenario here is, they have a section of their site that they don't want crawled, so they use robots.txt to tell the crawler not to crawl that section. But perhaps the way they specify the rule actually ends up shutting down access to pages that they do want crawled. By seeing the itemized list, that will become apparent to them very quickly.

Nathan Buggia: Yes, exactly.
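A publisher can run a similar audit locally. The sketch below uses Python's standard `urllib.robotparser` module; the robots.txt rule and the URLs are hypothetical, chosen to show how a rule meant for one directory can also block pages that merely share its prefix.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the owner only meant to block /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private
"""

# Hypothetical URLs the owner wants to audit.
URLS = [
    "http://www.example.com/private/drafts.html",
    "http://www.example.com/private-sale/shoes.html",
    "http://www.example.com/products/shoes.html",
]


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


for url in blocked_urls(ROBOTS_TXT, URLS):
    print("blocked:", url)
```

Because the rule lacks a trailing slash, the sale page is blocked along with the private directory, which is the kind of accident an itemized list makes visible.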

Eric Enge: Alright, excellent; so let's talk about long, dynamic URLs.

Nathan Buggia: This is also one of my favorite ones, and this is a report that no other engine offers. And really what long, dynamic URLs are, are all the URLs we have identified on your website that have too many parameters.

Eric Enge: And how do you define too many parameters?

Nathan Buggia: That definition may change over time. We change our algorithms, we may expand that out, or if we see some examples that we have too many numbers we may dial back.

What this report does is it allows you to always know what we think is too many parameters, without hard coding a rule into your system. So, the problem with long, dynamic URLs is that when you get all of those different parameters on the URL string, the number of different combinations of those parameters tends to be large.

For example, if you have an ecommerce site, and that ecommerce site uses parameters to determine what the sort order of products is on the page, those could appear in any order on the URL, and still produce the same valid page. That could create a potentially infinite number of pages or infinite number of URLs that all resolve to valid pages.

That is a dangerous thing for a search engine crawler, because that could have us spending a lot of time crawling the exact same page with different URLs in your website. And that's something you'd want to avoid.

Eric Enge: Right. And then there is the inverse of that, because if you are not going to let that happen to your crawler, it means you might not be crawled that well if you have these problems.

Nathan Buggia: Exactly. That is exactly how we recommend using this report, which is to just identify the URLs in your site that you might want to go take a look at and see if you can find a shorter, more economical version of the URL.
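To make the parameter-ordering problem concrete, here is a minimal Python sketch that collapses URLs differing only in the order of their query parameters. The URLs are hypothetical, and sorting parameters is an assumption that only holds for sites where parameter order does not change the page; it is not something the interview prescribes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonicalize(url):
    """Return the URL with its query parameters sorted by name and value."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )


# Hypothetical URLs that differ only in parameter order.
variants = [
    "http://shop.example.com/list?cat=cameras&sort=price&page=2",
    "http://shop.example.com/list?sort=price&page=2&cat=cameras",
    "http://shop.example.com/list?page=2&cat=cameras&sort=price",
]

unique = {canonicalize(u) for u in variants}
print(len(variants), "URLs collapse to", len(unique))
```

With three parameters there are already six possible orderings of the same page, and each extra parameter multiplies that count, which is why a crawler can end up fetching the same content many times.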

Eric Enge: What about unsupported content types?

Nathan Buggia: So Live Search has a wide list of content types that we will expose to users. And a content type is defined in the HTTP header of every page that gets downloaded by a search engine. What that content type does is it tells a search engine what is on the page. It will tell if there is just text HTML, or if it is a Flash page, or a binary application that's being downloaded.

So, in the interest of providing our users exactly what they are looking for, we generally only want to provide them things that they would expect to get from clicking on a link in search results. Unlike web pages and images, we don't want to link directly to applications, Flash files, or things like that.

What this will do is give you a list of all of the pages in your website that either don't specify a content type explicitly or specify a content type that we don't support in our search results. Both of those scenarios will mean that we are not indexing the page. So, it is another potential audit for webmasters to just go and take a look and see if there is anything funny going on.

Eric Enge: What are some of the more common things that you run into that you are not supporting that people are implementing?

Nathan Buggia: So, a great example is, I was just doing some research on microsoft.com, and there is this one team that built a little dynamic image generating tool - I think they were building charts and graphs. On that tool they forgot to specify the content type of the image, and it just worked in their browsers. It worked in Firefox, it worked in IE and in Opera.

So they thought that they were done. But the problem is, search engines weren't able to crawl those charts and graphs, and index them in their results. They weren't specifying the content type in the header, so we didn't know what we were downloading, so we couldn't index it. So, that is a great example of how you could use it to identify potential problems.
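The two failure cases described above, a missing content type and an unsupported one, can be checked with a short Python sketch. The `SUPPORTED_TYPES` set is an assumption made for illustration; the interview does not list the content types Live Search actually supports.

```python
# Assumed, illustrative list; the content types Live Search actually
# supports are not given in the interview.
SUPPORTED_TYPES = {"text/html", "text/plain", "image/png", "image/jpeg", "image/gif"}


def classify(headers):
    """Classify a response by its Content-Type header.

    Returns "missing" when no content type is declared, "unsupported"
    when the declared type is not in SUPPORTED_TYPES, and "ok" otherwise.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("content-type", "").strip()
    if not raw:
        return "missing"
    # Drop parameters such as "; charset=utf-8".
    media_type = raw.split(";", 1)[0].strip().lower()
    return "ok" if media_type in SUPPORTED_TYPES else "unsupported"


print(classify({"Content-Type": "text/html; charset=utf-8"}))
print(classify({}))
```

The chart generator in the example above would fall into the "missing" case: browsers guessed the type and rendered the image, but a crawler that relies on the header had nothing to go on.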

Eric Enge: Let's talk about back links.

Nathan Buggia: About a year and a half ago we removed the LinkDomain operator, much to the chagrin of the webmaster community. We promised we would bring it back. Well, a couple of weeks ago, we actually did bring it back in most of its full functionality.

So, what you can do is see all of the sites linking into your website, and then filter that based on the incoming domain. And that lets you slice and dice those inbound links to get a better understanding of linking patterns, such as who is linking to you, and that can be useful in defining your future link building campaigns.

Eric Enge: There are a few ways I could see that being useful. One is certainly just allowing you to see who is linking to you more efficiently. But also if you discover that data and you are able to leverage it. You can communicate with others and say, "Hey, do you know the New York Times linked to us?" It is very beneficial to be able to say that.

Nathan Buggia: Yes. I was talking to a publisher at SES and they were mentioning that. What's interesting about this is, they found it wasn't just the New York Times that was linking to them. They looked all the way at the end of the links, exploring it all using the filtering tool, and they discovered the different sections of the New York Times that were linking to them.

It turned out, it was just a couple of the bloggers on the New York Times really seemed to like them and linked to them quite a bit. And that can be some really valuable information, because if you know who your fans are on the Internet, you can use that to garner more future links.

So, if you are going to give somebody an exclusive story, you might want to go through and find the people who have written about you well in the past and provide them an exclusive, knowing that you are more likely to get the good links back.

Eric Enge: Right. Now, if I want to download this, I can do that, but I think the download is limited to a thousand links?

Nathan Buggia: Yes. With the Webmaster Tools, we internally use an API that limits the result sets that we can get to a thousand. What we have done to work around this is, we have implemented the advanced filtering functionality that you see on pretty much all of our reports. What that filtering functionality does is let you zoom in to just the pages that you want within your website, and then download up to the first thousand of those pages.

Then you can analyze them and look at them in Excel or whatever data management system you might have built. That filtering functionality supports two levels of sub domains and two levels of the sub folders. So, between those four different levels, there is quite a bit of depth that you can scan into within your website.

The back link feature, and the out-bound link feature, filter the domains that are linking in, or that you are linking out to. For example, if you are Microsoft.com, you could go and take a look at digg.com and delicious.com, and see which links are pointing back to you from those different sites.

Or you could even just look at dot com, dot mil, dot gov, co.uk, dot fr, and all the way down to the very top-level domains. For the crawl issues report, the filtering allows you to zoom in on a specific portion of your website to pull the errors just for that portion of your website.
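The filter-then-download workflow described above can be approximated offline with a short Python sketch. The link list is hypothetical and the function is not the Webmaster Tools API; it only mirrors the idea of narrowing a result set by linking domain (or top-level domain) before applying the thousand-row cap.

```python
from urllib.parse import urlsplit

DOWNLOAD_LIMIT = 1000  # the per-download cap mentioned in the interview


def filter_by_domain(links, domain, limit=DOWNLOAD_LIMIT):
    """Return up to `limit` links whose host is `domain` or falls under it."""
    domain = domain.lower().lstrip(".")
    matches = []
    for link in links:
        host = (urlsplit(link).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            matches.append(link)
    return matches[:limit]


# Hypothetical inbound links.
links = [
    "http://digg.com/tech/story-1",
    "http://blogs.digg.com/story-2",
    "http://notdigg.com/story-3",
    "http://www.example.gov/report",
]

print(filter_by_domain(links, "digg.com"))
print(filter_by_domain(links, "gov"))
```

Matching on a leading dot keeps `notdigg.com` out of the `digg.com` results while still allowing a bare suffix such as `gov` to select a whole top-level domain.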

Eric Enge: Is there a plan to remove the thousand item limitation in the future?

Nathan Buggia: We are always working on providing more, deeper access to the information. We can't say exactly when we will be able to go beyond that limitation, but it is definitely something that we think about quite a bit.

Eric Enge: Can you comment more on outbound links and how publishers should use that as a weapon in their arsenal?

Nathan Buggia: The outbound links are really taking advantage of the link-from-domain operator that we used to support in Live Search, and giving you access to all of the URLs on your website that we found you are linking out to. Web publishers can just take a look at this and do a basic audit and say okay, are there sites here that I wouldn't want to link to, or is this representative of my website or not?

Eric Enge: Well, one thing that strikes me that they would be able to do is look at it and try to find all those things with a blank page score they are linking to, to see if they are linking to something they don't want to be linking to.

Nathan Buggia: Webmasters would definitely take a look at their outbound links and make sure that they are representative of the content on their website. For example, if you were peta.org (People for the Ethical Treatment of Animals), you may want to take a look and make sure that in any of your UGC (user generated content) areas, people aren't linking out to NRA.org or other sites that your users might not be excited about.

Eric Enge: Right. Let's talk a little bit about keywords.

Nathan Buggia: So, the keyword tool is a really simple tool that allows you to, for any given keyword, find out which pages in your website rank highest for that keyword. MSN.com, for example, is an enormous website that has everything underneath it from a full news magazine, to shopping, to Hotmail, to a portal and all that custom information. If you were to type in a term like digital camera, you would probably want the top pages ranking on your website for digital camera to be within the shopping portion of your website, not within the news portion of your website. So, what this allows you to do is to see within your own content, which pages are best performing for certain keywords within Live Search.

Eric Enge: That's another example of a feature that could provide a lot of insight, which is search engine agnostic. Crawlers obviously vary from engine-to-engine, but there has got to be lot in common at how they look at pages.

Nathan Buggia: Right. What this really does is gives you insight into our dynamic ranking. How we translate an expressed customer need such as digital camera into a page that we think will satisfy that need. It gives you access to our information about a whole different type of ranking from Live Search.

In addition, if you take a look at the result, we also give you a good amount of metadata. So, when was the last date and time that we crawled that page, what is the relative score of that page, and is that page blocked, or are there any penalties levied against that page?

Eric Enge: Right. So, in this context, is the Page Score that we see on the keyword page relative to that keyword search or is it just a generic score?

Nathan Buggia: That is a base authority score that we give to every page in our index. And, it does not change based on whatever you type in the query box.

Eric Enge: Right. So you address the relevance type issues at another level?

Nathan Buggia: Yes.

Eric Enge: Can you talk a little bit about any example you may have about really interesting and novel ways that people have used Webmaster Tools.

Nathan Buggia: Okay. The first reason we see people using our tools is because they seem to be interested in the ranking features. Most of the people will go in and try and understand how their site is valued, or where we think authority is within their website.

The second thing that we see people do with our tools is drill into the information we give around crawling to try and understand the different issues with their websites. Now, all major websites have issues, whether they are 404 issues, or a page is blocked by the REP. So really what people want to do is get a handle on understanding what the issues are across their website.

If you look at microsoft.com for example, they claim to have one billion unique pages. So if there are about twenty billion pages in our index, they have one billion and they want us to index them all and that's a lot of pages. Of course we don't index that much, but we do index a couple of hundred million of their pages.

So just going through and understanding any issues with a couple of hundred million pages is pretty onerous. So they will go through and use the different filtering functionality to go and scan their most valuable sub-domains, support.microsoft.com, or TechNet or MSDN, for example.

They will use that information to uncover any big issues with their most valuable sections, and then they will download that data into Excel and they will use that as a scorecard, month to month. So, they will get a list every month of the 404 errors, the different REP issues, and the different long, dynamic URLs. They use that as a way to understand the progress that they are making in addressing those issues.
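The month-to-month scorecard described above amounts to comparing two downloaded lists of problem URLs. A minimal Python sketch, with hypothetical URL lists standing in for two monthly 404 reports:

```python
def scorecard(previous, current):
    """Compare two monthly collections of problem URLs.

    Returns which issues were fixed, which are new, and which remain.
    """
    previous, current = set(previous), set(current)
    return {
        "fixed": sorted(previous - current),
        "new": sorted(current - previous),
        "remaining": sorted(previous & current),
    }


# Hypothetical 404 reports for two consecutive months.
last_month = ["/kb/100", "/kb/200", "/downloads/old-tool"]
this_month = ["/kb/200", "/library/missing-page"]

for status, urls in scorecard(last_month, this_month).items():
    print(status, len(urls), urls)
```

The same comparison works for any of the crawl issue reports, since each one is just a list of URLs.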

Eric Enge: Right. They probably uncover one layer of problems, they deal with it, and then next time around they get to the next layer.

Nathan Buggia: Yes.

Eric Enge: Right. So, is there anything you can say about upcoming plans?

Nathan Buggia: Yes, absolutely. We are continuing to ship new features. You'll probably see us ship more frequently than we have in the past as we build momentum. There are a couple of themes that we are going to continue building features against.

The first theme is really about providing transparency to how we rank and crawl our customers' websites. So, you will see us continue to add functionality that helps you understand even more of the issues that we know about on customers' websites. We will expose that in a way that is actionable for publishers so they know exactly what to do to make their website better.

The next theme that we are going to work on is around content management. One of the pillars of the Webmaster Tools is to help empower publishers to really manage their content within Live Search. So, we are going to continue the work on different features that allow them to better understand what content we have indexed.

It will allow them to automatically have content removed that is either copyright infringing or that they had not intended to be indexed, as well as giving them more ability to provide structured data and manage that data within our system. We have a couple of other themes, but they are top secret projects at this point that we are not ready to talk about. You will hear more about that in the future, I am sure.

Eric Enge: That's great. And what is the best way for people to make suggestions for Webmaster Tools?

Nathan Buggia: We created a feedback forum about a month ago that customers can use to submit feature requests, and we review those every week. We respond to every single comment that gets posted, and take the feedback seriously. That is very important to us.

Eric Enge: Right. Then you can see what seems to be most in demand and consider that against what you think is most useful, and how difficult it is to implement.

Nathan Buggia: Exactly.

Eric Enge: It seems to me that the focus on Webmaster Tools has grown quite a bit within the Live Search team of late. Is that a fair assessment?

Nathan Buggia: I think what you have seen is the Webmaster team continuing to build momentum. Back in November, we were just assembling the team and getting a Version 1 out, and what you've seen since then is us building a lot of infrastructure behind the scenes. We are very close to being fully staffed now, and we have just built the momentum of a development cycle. So, you will probably see faster innovation, and you will see more features continue to come out.

Eric Enge: Excellent. What are your thoughts about an authenticated way to report spam for Webmaster Tools?

Nathan Buggia: That is something that we have on our backlog of features that we would like to implement; we are very interested in doing that. So yes, you will probably see that at some point, although I can't comment on exactly when.

We definitely want to make it easier and easier for the community to provide us feedback, and provide us information. We are also at most of the big tradeshows. We are currently at most all of the ones in the US, and over the course of the next year we are really pushing hard to be at all of the tradeshows worldwide.

If you don't want to provide feedback on our forum, just come and grab one of us at one of the events. If you don't talk to me, it will get back to me. It is a very tight community inside Microsoft.

Eric Enge: Thanks Nate!

Nathan Buggia: My pleasure!

About the Author

Eric Enge is the President of Stone Temple Consulting. Eric is also a founder of Moving Traffic Incorporated, the publisher of Custom Search Guide, a directory of Google Custom Search Engines, and City Town Info, a site that provides information on 20,000 US cities and towns.

Stone Temple Consulting (STC) offers search engine optimization and search engine marketing services, and its web site can be found at: http://www.stonetemple.com.

  Mon, 08 Sep 2008 18:45:58 +0200
Published: September 8, 2008

Frazier Miller, General Manager for Yahoo! Local

Frazier Miller serves as the general manager for Yahoo! Local, one of the most visited local networks of sites online. Bringing a strong background in product management, Frazier is responsible for setting overall strategy and prioritization for Yahoo! Local properties. He currently oversees all of the products that are core to Yahoo!'s ongoing local initiatives, including Yahoo! Local, Upcoming and Yellow Pages.

Prior to his role with Yahoo! Local, Frazier ran product management for Yahoo! Messenger, a leading communications product. Here he was responsible for growing the Yahoo! Messenger business from 60 million users a month to over 100 million users. During his tenure leading the Yahoo! Messenger team, Frazier also oversaw its expansion into the VOIP market, growing both users and revenue from the ground up.

Prior to Yahoo!, Frazier served in a variety of product management roles with high-growth technology firms including director of product management for BEA Systems and Nimble Technology.

Frazier holds a bachelor's degree in history and English from Dartmouth College and a master of business administration from Harvard Business School. When not working, Frazier enjoys the outdoors, skiing, wind-surfing, and spending time with his two children.

Shailesh Bhat, Product Manager for Yahoo! Local

Shailesh Bhat serves as a senior product manager for Yahoo! Local - one of the most visited local search sites online. In this role, Shailesh is responsible for driving key monetization efforts and helping build and maintain features that increase the overall Yahoo! Local consumer experience.

Prior to working with Yahoo! Local, Shailesh worked at Yahoo! Bangalore as a product manager for the mobile products and market innovation group. In this position, he conceptualized and helped drive new products to-market for India's quickly growing mobile user base. Shailesh brings nearly a decade of experience in product management, sales, and software development to Yahoo!, having also held earlier positions with MitoKen Solutions and Quark.

Shailesh holds a master's degree in Business Administration from the Indian Institute of Management and a Bachelor of Engineering from the BMS College of Engineering, Bangalore.

Interview Transcript

Eric Enge: Can you talk about the types of data sources used in local search by Yahoo! to find the business information? I am specifically focused on business listings.

Frazier Miller: Right. There are a number of sources that we use to do this. It can be a confusing world and there are a lot of different data providers out there. But, in general we rely pretty heavily on several areas. One is the licensed feeds that we get through data providers like InfoUSA, Acxiom, and Localeze. We rely heavily on these to build our backbone.

But, we realize the data changes very quickly and there are better ways to update that data quickly, so we rely heavily on merchants and give them the ability to update the information themselves directly. We are also using our users more and more to help us update and correct information on the site as well. And so, we have all three of those different areas as ways of building out the database and making sure we have our information as accurate and up-to-date as possible.

Eric Enge: Right. Sources like InfoUSA, and Acxiom, and Localeze, provide authenticated feeds, right? In other words, the data provider has gone through some level of trouble to validate that the location information for a business is correct.

Frazier Miller: Yes. That is a lot of the service they provide and the reason why people like us pay them for their feeds. They have people who are calling on the businesses ensuring the freshness and accuracy of their information. This is important because statistics show that over 50 percent of businesses close in their first five years (U.S. Small Business Administration).

You also have far more businesses than that whose phone numbers or addresses are changing, so we find that it is not sufficient just to rely on these authenticated sources, and that we need to take other means to do that. We are relying on multiple authenticated sources, and some have strengths in one area and others have strengths in others.

So, we need to provide a layer of heuristics to select which of the data providers has strengths in given areas. We will then compare that data provider's information against these other sources like user input and user feeds, which can be good. But there are also some pitfalls because you can't rely solely on that. We will match and compare data sources of these different types to ensure we get the latest and most accurate information.

Eric Enge: Right. With user contributed data, for example, you have the risk that it is intentionally incorrect.

Frazier Miller: Right, there are fraudulent cases. You can imagine a case when someone comes into a competitor site and says, oh, here is their phone number, and they are providing the phone number to their own site, although we don't see this very often.

We do have human and manual moderation that goes on for changes, so consumer submissions all go through a moderation process where we look for patterns and we actually do validation of data to make sure it is accurate. There are a number of steps we use to try to keep fraud and spam in control.

Eric Enge: Right. A business owner is able to submit data directly to Yahoo as well.

Frazier Miller: Absolutely, there are a couple of different ways. They can add their information for free, as we give business owners the ability to do so, and they can correct and update information. But because we don't have credit cards and we don't have authentication against that business owner, it can stay open for other people to add or change as needed.

We don't lock the information down to that particular user ID or that business. When they become a paid business partner, they basically lock up their profile so that they control the majority of what is said in that profile. When they pay, we obviously have more ability to authenticate and make sure that it is the right business.

Eric Enge: Right. How do you handle discrepancies between providers?

Shailesh Bhat: What you are referring to is essentially how we handle the merge of all these entities. From any of those providers you could potentially get the same listing, and the value of getting the listing from multiple sources is essentially the enrichment that we would get. One source would give one data set for the same listing as compared to others.

So you are right, we need to have heuristic models in order to arrive at the best aggregated listing and the most comprehensive information for that particular merchant's listing.

Eric Enge: Right, so if the business pays, then they control the listing, within the constraints of editorial judgment, of course. If they say the address was 41 Temple Street, and InfoUSA says it is 39 Temple Street, you use 41 because you have an authenticated paid relationship, is that right?

Shailesh Bhat: That is right. But for the specific example that you used of address, we would also use other parameters like geo-coding. So for example, if that specific address did not geo-code to an actual location on a street, then we would handle that listing differently. It may even get rejected even if it is from the paid feed.

Eric Enge: So, you are calling out the example where they say it is 41, but 41 doesn't exist as an address?

Shailesh Bhat: Exactly.
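The precedence rule discussed above, where a paid listing wins unless its address fails to geo-code, can be sketched in Python. Everything here is hypothetical: the `KNOWN_ADDRESSES` set stands in for a real geocoder, and falling back to the licensed feed when the paid address is rejected is an assumption, since the interview only says the listing may be rejected.

```python
# Hypothetical stand-in for a geocoder: the set of addresses that
# resolve to a real location.
KNOWN_ADDRESSES = {"39 Temple Street", "41 Temple Street"}


def geocodes(address, known=KNOWN_ADDRESSES):
    """Return True if the address resolves to a known location."""
    return address in known


def choose_address(paid, licensed, known=KNOWN_ADDRESSES):
    """Pick an address for a listing.

    Prefer the paid (authenticated) merchant's address, but only if it
    geo-codes. Otherwise fall back to the licensed feed (an assumption),
    and return None if neither address can be located.
    """
    for candidate in (paid, licensed):
        if candidate and geocodes(candidate, known):
            return candidate
    return None


print(choose_address("41 Temple Street", "39 Temple Street"))
print(choose_address("41 Temple Street", "39 Temple Street", known={"39 Temple Street"}))
```

In the second call, number 41 does not exist on the street, so the paid value is not used even though it comes from the authenticated source.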

Eric Enge: And so, that brings up the whole larger point here, the huge problem is that businesses forget to update their data, or they might not even remember that it is necessary, and changes are happening all the time as you pointed out. You gave the example of 50 percent of businesses closing their doors within five years.

There are also businesses with two hundred locations, and they just added ten new ones, and four of them changed addresses and three others changed their phone number. It is an arduous administrative task for the business, right?

Frazier Miller: Yes, one of the things that we encourage our businesses to do is to update the backbone providers when they have a change in information. If they change their phone number, it is not just a matter of calling up the telephone company and saying my phone number is changed.

There needs to be awareness about all the other providers who are picking up their information and getting it out there. So it is no longer as simple as just the yellow pages, right? We find that paid businesses make more of an attempt to do this. It is not often that we have people who are just paying year-after-year who are not updating or managing their listings in an active way. So, there is certainly a reminder there. We do have tools for merchants when they have a change, but that falls under a paid relationship with the merchant.

We don't have anything for free where someone can just come and update a mass amount of data. Getting back to the problem of abuse, you could put out a lot of data in a pretty short period of time, so we generally make sure that there is a paid relationship with a merchant and then have the ability to update via feed.

Eric Enge: So bulk submission directly to Yahoo is not available in the free service?

Frazier Miller: That's right.

Eric Enge: Do you do any validation yourself in any of these scenarios?

Shailesh Bhat: Yes. So, for example, on the self-serve side, which is essentially listings.local.yahoo.com, that is where a merchant would go to submit a listing. All listings that get fed in through that mode go through manual moderation. A similar edit is possible from the details page of the listing itself.

There are links on the details page of a listing that would allow someone to clean that listing and submit changes. Even that would go through manual moderation.

Eric Enge: Right. So is there any scenario in which you send an automated fax, or email, or give a phone call to validate contact data?

Shailesh Bhat: In an automated fashion to all our listings, no, we do not do that. There may be specific instances where we would reach out, especially if there are multiple listings that have been provided from one particular ID or something like that.

Eric Enge: More of a spot-check type scenario?

Shailesh Bhat: No. When it is manually submitted, each change that has been made is viewed by someone, but it does not mean that we call each of them.

Frazier Miller: However, there are sometimes exceptions where we will reach out and call businesses, but that is not a standard practice. One of the standard practices we do have though, which is very important to our users, is the URL checks, especially where you get SEMs and others who are intermediaries for these businesses and are trying to get direct clicks into their representation of the business. It is fairly easy to do a more mass scale check via a URL address validation to make sure we are showing the business itself instead of an intermediary.
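A mass-scale URL check of this sort might look like the sketch below. The interview does not describe how Yahoo actually performs it, so everything here is assumed: the `INTERMEDIARY_DOMAINS` blocklist and the `is_direct_business_url` helper are hypothetical, and a real check would likely also follow redirects and compare the landing page against the business.

```python
from urllib.parse import urlparse

# Hypothetical list of intermediary domains, for illustration only.
INTERMEDIARY_DOMAINS = {"tracking.example.net", "leads.example.org"}


def is_direct_business_url(url):
    """Return True if the URL's host is not a known intermediary domain."""
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return False  # no usable host, so the URL fails validation
    # Match the intermediary domain itself and any of its subdomains.
    return not any(
        host == domain or host.endswith("." + domain)
        for domain in INTERMEDIARY_DOMAINS
    )
```

Because the check is a pure function of the URL string, it can be run over an entire feed without any manual review, which is the appeal described above.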

Eric Enge: Right. So, if you are a business and, for example, you have more than a hundred locations, what is the smartest way for them to manage this relatively confusing situation?

Frazier Miller: I think it comes down to making sure that the core data providers, like InfoUSA, Acxiom, and Localeze, are up-to-date.

They are super eager to work with you. Their business model is to charge folks like us, so they can handle these feeds for free for the business. And then, we get the feeds directly. For a lot of these businesses, it is not just about Yahoo, it is about yellowpages.com, and Superpages, and Google, and a whole set of directories.

So again, I think that is a smart way for businesses to go about keeping their data up to date, whether it is an individual business or a chain. And then, with regard to us entering into a paid relationship, we are certainly a leader in the marketplace and continue to command a lot of market share.

We have a couple of paid products. We offer enhanced listings which are basically $10 a month per listing, where they can enhance the data fields and write a description of the business. There are a couple of additional URL links that they can use and they have the ability to control. We also give them other tools for them to manage their listings.

Eric Enge: Now, if somebody has hundreds or thousands of listings, do they get better per listing price?

Frazier Miller: Bulk discounts?

Eric Enge: Yes.

Frazier Miller: We are always open to negotiation. It generally has to be a lot of listings. When you get into the thousands, like a Starbucks, they generally may also be making search marketing buys or display buys. So generally, they tend to be an account that we will manage more proactively, and we will absolutely look at how to get the deal done across a number of different advertising products.

Eric Enge: What is the best way for someone to engage with the paid listing product if that is what they want to do?

Frazier Miller: We have a self-serve portal where they can provision themselves and get up and running. It is a very easy five-step process. For the larger guys it would be to engage our inside sales team, which also is a relatively straightforward process. The best starting point for reaching them is at advertising.yahoo.com.

Eric Enge: Do micro-formats play a role, or is there a possibility of that occurring in the future, as a way for people to provide authenticated data directly from their website?

Shailesh Bhat: Right now, the biggest use of micro-formats and RDF is all in the context of the SearchMonkey application.

Eric Enge: So you are not using it as a way of extracting location information for Yahoo Local at this time?

Shailesh Bhat: Right now, on Local we are not. But we will keep it open as an approach.

Eric Enge: The next question is about cleaning up the old records, and I think we have probably discussed the importance of that, but the corresponding question is when you have conflicting records, does that essentially dilute the strength or the ranking of your listing?

Shailesh Bhat: Let us go back to the heuristic model that we were talking about earlier. Essentially, the aim is to have as few duplicates as possible. So if there are four or five listings of 39 x street and 41 x street and so on, the intent is to use other parameters to figure out whether they are the same listing, and at the end of the day, publish one single listing.

It does make it pretty complex because at 39 x street you may have two businesses running, right? Or it could be different addresses altogether. So we run the algorithm to try and merge listings. And in such cases, depending on the confidence we have in the source, we would potentially not accept some of the listings from a given source.
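One way to picture the duplicate check is a pairwise comparison that requires both the name and the address to be close, so that two different businesses at 39 x street are kept apart while 39 and 41 x street for the same name are merged. This is a minimal sketch under stated assumptions, not the algorithm described in the interview: the `same_listing` helper and both thresholds are hypothetical, and string similarity from Python's `difflib` is swapped in for whatever parameters are actually used.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_listing(a, b, name_threshold=0.85, address_threshold=0.8):
    """Guess whether two records describe one business.

    Both thresholds are assumed values. The name must match closely AND the
    address must be close: a shared address alone is not enough, because two
    businesses can operate at the same location.
    """
    names_match = similarity(a["name"], b["name"]) >= name_threshold
    addresses_match = similarity(a["address"], b["address"]) >= address_threshold
    return names_match and addresses_match


cafe_39 = {"name": "Temple Cafe", "address": "39 Temple Street"}
cafe_41 = {"name": "Temple Cafe", "address": "41 Temple Street"}
cleaner_39 = {"name": "Acme Dry Cleaning", "address": "39 Temple Street"}
```

Under these thresholds `cafe_39` and `cafe_41` are treated as one listing, while `cafe_39` and `cleaner_39` stay separate despite sharing an address.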

Eric Enge: Let us say you do accept it, and it passes the threshold of enough confidence to accept the record, you probably have some way of calculating your confidence level in the data. It would seem to me that it would be very logical for something with a lower confidence level to just be presented with a lower ranking in the results.

Shailesh Bhat: Right, so this one factor in itself may or may not be a tilting factor on a specific record. What I am saying is, it is quite possible that just because we have conflicting data from two sources for a record does not mean that it will not show up as the first result for a given query. There could be other parameters, like the details on that record, the type of query that is there, the keywords, or what exactly the query term is and so on. But, this can be one factor.

Eric Enge: Right, fair enough. So let us talk about the other ranking factors if we could. Just so that you know, I have drawn these largely off of a post by David Mihm in which he interviewed a bunch of local SEO gurus and this is more or less what they came up with as important local ranking factors.

Shailesh Bhat: The distance parameter in itself is definitely one factor, but I think it is a slightly overrated factor in many cases. Categorization, I think, is especially important because of the way queries get generated. Your question is: how is this verified?

When people claim a listing, they typically give websites. Also, the description that they provide is used to make sure that the categorization is correct when there is a submitted listing on the self-service side.

In cases where we do not have a self-serve listing, where the merchant has not provided any data, but we have data from other sources, we essentially look out for the degree of agreement between various sources. That is one heuristic element that helps us in figuring out the right category.
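The "degree of agreement between various sources" idea can be sketched as a simple majority vote. The `pick_category` function, the source names, and the `min_agreement` cutoff are all hypothetical; the interview says only that agreement is one heuristic element, not how it is computed.

```python
from collections import Counter


def pick_category(source_categories, min_agreement=0.5):
    """Return the category most sources agree on, or None if agreement is weak.

    source_categories maps a source name to the category it reports.
    min_agreement is an assumed cutoff: the winning category must be reported
    by more than this fraction of the sources that supplied one.
    """
    labels = [c.strip().lower() for c in source_categories.values() if c]
    if not labels:
        return None
    category, count = Counter(labels).most_common(1)[0]
    return category if count / len(labels) > min_agreement else None


category = pick_category(
    {"provider_a": "Restaurants", "provider_b": "restaurants", "provider_c": "Catering"}
)
```

Returning `None` when sources split evenly leaves room for the other signals mentioned here, such as the merchant's own description, to decide the category.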

Product keywords in the record are very important. Reviews and ratings matter as well; essentially, you could say that depth of content when a merchant submits a listing is a useful factor.

Eric Enge: Right. So along those lines, the depth of content emerges in two dimensions. One being the quality and the depth of the data you get. The second way of looking at it is the number of times it is referenced across the web on yellowpages.com and local.com and sites like that, but also on other websites. If a business is extremely frequently referenced on other web sites, is that likely to count for something?

Frazier Miller: Yes, it's a subtle difference, but one that I think is constructive. Web search looks a lot more at the web index and prevalence of things, whereas in the local context, which really is listing specific, we look at that a lot less. There are some things we do: we go out to the web to validate, we go to our web index to validate, but it doesn't play a huge role.

Some of that is back to reasons that we talked about before. InfoUSA has undue weight in the web index for a listing because they publish through so many people. And so, it could be that we've gotten a couple of comments from users that all corroborate with each other.

So, some of it is important because you've got these big content giants who unduly influence the web. But it's an interesting area and certainly there is more to explore with how we use the web index, but it doesn't play a huge role today.

Eric Enge: Right, I understand. Let's start switching gears and talk a little bit about mobile integration.

Frazier Miller: Local searching, local queries, maps, and directions are some of the top use cases, obviously, on mobile devices.

We also have a product called oneSearch. It assumes that you are searching across your email for a given user's name, or maybe it's a web search, or maybe it's a local search. And so, they have an algorithm at their level to basically differentiate different types of queries that go into oneSearch, and whether it's more likely to be a local query versus a web search query versus an email inbox query.

And then, the queries that come out to us use the same index and the same database that we provide to web search to present their results. In most cases, they will bias factors like distance, if they know the distance or the location of the user, in their algorithm, but generally it is the same dataset and similar ranking algorithms.

Eric Enge: What about advertising opportunities in mobile?

Frazier Miller: Yeah, advertising is really emerging in the mobile space and we as a company have found a lot of success there, but it has largely come from larger national advertisers who are pretty sophisticated about advertising. It also has a bias towards more display and brand advertising to date as opposed to search marketing or keyword terms.

I think that's just a product of sophistication of the market and the users, and there is certainly a lot of search inventory happening on mobile devices. But it's extending fairly quickly as more and more people are browsing the web and using the mobile phone for doing searches.

So, there is nothing unique or specific; there are no products that we provide to businesses like we do on the desktop to say, oh, here you can appear in this set of pixels for local queries on the web. The enhanced listings have certainly become part of the core index, so that factors in just as it would factor into the web listings, you know, the desktop listings, and it operates in much the same way on mobile.

Eric Enge: What about geo-targeting options?

Frazier Miller: We see a lot of geo-targeting happening with advertisers. In fact, at the keynote that I spoke at a few weeks ago at SMX Local, we released some data that we've seen a 200% increase in geo-targeting advertisers in the last twelve months at Yahoo, and so advertisers are definitely getting the importance of geo-targeting their message.

The context for that, I think, is trying to minimize marketing spend in a tough economic environment, and if they can bid only on keyword search terms or on display ads in a given DMA or a given geography, then they prefer doing that because it saves them on costs. We are very bullish on geo-targeted advertising and are continuing to add lots of tools and capabilities to allow ever more granular geo-targeting. As a company we are very excited about this.

Eric Enge: You are planning to increase the granularity in which you offer geo-targeting?

Fraz