My SEO Kung Fu Is More Powerful Than Your SEO Kung Fu

So, the headline is my analogy for a conversation I seem to have had so many times. An SEO does some competitive analysis and sees that their efforts seem (to them) to be better than the competitor’s. And yet, the competitor seems to have more visibility at search engines. The conclusion they so often arrive at is that there must be something missing in their SEO tactics, or that the competitor must be doing something sneaky.

However, the answer has, more often than not, to do with the way that search engines analyze end user behavior and fold that into the mix. There’s a whole lot more going on under the hood at search engines that can affect what ranks and what doesn’t. And more to the point, what frequently gets re-ranked as a result of end user intelligence.

Without getting too deep in the weeds, I want to take a little look under the hood to highlight some of the techniques that are pretty much standard in information retrieval terms, but rarely get a mention in SEO circles.

Did Google Just Mess Around With That Query?

Let’s start with the query itself. We imagine that the end user inputs a certain number of keywords and that a search engine then looks for documents that contain those keywords and ranks them accordingly. However, that’s not always the case. Frequently, documents in the corpus are more relevant to a query, even when they don’t contain the specific keywords submitted by the user.

That being the case, by understanding the “intent” behind a keyword or phrase, a search engine can actually expand the initial query. Query expansion techniques are usually based on an analysis of word or term co-occurrence, in either the entire document collection, a large collection of queries, or the top-ranked documents in a result list. This is not at all the same as simply running a general thesaurus check, which has proven to be quite ineffective at search engines.

The key to effective expansion is to choose words that are appropriate for the “context” or topic of the query. A good example of this would be where “aquarium” would be a good expansion for “tank” in the query “tropical fish tanks.” That would mean if you’re specifically targeting the term “fish tanks” but a page (resource) talking about “aquariums” proves to be more popular with end users, then that’s the one most likely to be served. And subjective as it is, what counts is the quality of the content end users are happy with, regardless of whether the actual words they typed appear in the content.
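To make the co-occurrence idea a little more concrete, here is a minimal Python sketch. The toy corpus and the function names are invented purely for illustration, and real systems work at a vastly larger scale with far more sophisticated weighting, so treat it as a rough picture rather than how any search engine actually does it.

```python
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration only.
DOCS = [
    "tropical fish tank aquarium heater",
    "aquarium tropical fish care",
    "fish tank aquarium filter",
    "army tank armor battle",
    "tropical fish aquarium plants",
]

def cooccurrence_counts(docs):
    """Count how often each unordered pair of words shares a document."""
    pair_counts = Counter()
    for doc in docs:
        words = sorted(set(doc.split()))
        for a, b in combinations(words, 2):
            pair_counts[(a, b)] += 1
    return pair_counts

def expand_query(query, docs, top_n=1):
    """Suggest the non-query words that co-occur most with the query words.

    Each candidate is scored by summing its co-occurrence counts with
    every word in the query, so a word that fits the whole context
    beats a word that only goes with one query term.
    """
    pair_counts = cooccurrence_counts(docs)
    query_words = set(query.split())
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a in query_words and b not in query_words:
            scores[b] += n
        elif b in query_words and a not in query_words:
            scores[a] += n
    return [word for word, _ in scores.most_common(top_n)]

print(expand_query("tropical fish tank", DOCS))
```

In this toy corpus “aquarium” wins because it appears alongside all three query words, while “army” only ever appears alongside “tank.” That is the “context” point in miniature, and it is something a general thesaurus lookup would miss.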

There are a number of different techniques for query expansion. But how does a search engine know that the expanded query provides more relevant results? The answer is “relevance feedback.”

Implicit data provided by the end user gives huge clues as to what are the most closely associated query terms. Early expansion techniques were focused on expansion of single words, but modern systems use full query terms. What this means is that semantically similar queries can be found by grouping them based on relevant documents that have a common theme, rather than (as already mentioned) the words used in the query.

This rich source of relevance data is then bolstered with click-through data. This means that every query is represented by using the set of pages that are actually clicked on by end users for that query, and then the similarity between that cluster of pages is further calculated.
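As a rough sketch of what “representing a query by its clicked pages” could look like, the Python below compares queries by the overlap of the pages clicked for them. The click data, the URLs, and the choice of Jaccard overlap as the similarity measure are all my assumptions for illustration; the column does not say which measure a search engine uses.

```python
# Hypothetical click data: each query maps to the set of pages
# that end users clicked on for it. All URLs are made up.
CLICKED_PAGES = {
    "tropical fish tanks": {"a.example/aquariums", "b.example/fish-care"},
    "home aquariums": {"a.example/aquariums", "b.example/fish-care", "c.example/plants"},
    "currency converter": {"d.example/convert"},
}

def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def query_similarity(q1, q2, clicks=CLICKED_PAGES):
    """Similarity of two queries based only on the pages clicked for them."""
    return jaccard(clicks.get(q1, set()), clicks.get(q2, set()))

print(query_similarity("tropical fish tanks", "home aquariums"))
print(query_similarity("tropical fish tanks", "currency converter"))
```

Notice that “tropical fish tanks” and “home aquariums” share no words at all, yet they come out as similar because end users click on the same pages for both.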

Techniques for relevance feedback are not new; you can trace them back to the early Sixties. However, in this new realm of “big data” what I have described above (in the most basic way possible to keep it simple) actually provides the “training data” (the identified relevant and non-relevant documents) for “machine learning” at search engines.

What The Heck Is “Machine Learning”?

It’s a subfield of artificial intelligence (AI) concerned with algorithms that allow computers to learn stuff. Keeping it simple, an algorithm is given a set of data and infers information about the properties of the data – and that information allows it to make predictions about other data that it might see in the future.

So, having mentioned click-through data above, let’s dig just a tiny bit deeper into the importance of “implicit end user feedback.”

Whenever an end user enters a query at a search engine and clicks on the link to a result, the search engine takes a record of that click. For a given query, the click-throughs from many users can be combined into a “click-through curve” showing the pattern of clicks for that query. Stereotypical click-through curves show that the number of clicks decreases with rank. Naturally, if you interpret a click-through as a positive preference on behalf of the end user, the shapes of those curves are as you might expect: higher-ranked results are more likely to be relevant and receive more clicks. Of course, with search engines having access to “user trails” via toolbar and browser data (literally being able to follow you around the web), they now have an even richer seam of data to analyze and match for ranking.
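A click-through curve is simple enough to sketch in a few lines of Python. The log format here (a list of query and clicked-rank pairs) and the numbers in it are hypothetical; it only shows how many individual clicks roll up into one curve for one query.

```python
from collections import Counter

def click_through_curve(click_log, query, max_rank=10):
    """Return the share of clicks at each rank (1..max_rank) for one query.

    click_log is an iterable of (query, clicked_rank) pairs, an
    assumed format used here only for illustration.
    """
    counts = Counter(
        rank for q, rank in click_log
        if q == query and 1 <= rank <= max_rank
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * max_rank
    return [counts[rank] / total for rank in range(1, max_rank + 1)]

# Made-up log: ten clicks for one query, one click for another.
LOG = (
    [("fish tanks", 1)] * 5
    + [("fish tanks", 2)] * 3
    + [("fish tanks", 3)] * 2
    + [("convert currency", 1)]
)

print(click_through_curve(LOG, "fish tanks", max_rank=3))
```

With this made-up log, half of the clicks land on the first result and the share falls away from there, which is the stereotypical shape described above.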

Learning From Clicks

Google and other search engines receive a constant stream of data around end user behavior. And this can immediately provide information about how much certain results in the SERPs are preferred over others (users choosing to click on a certain link or choosing not to click on a certain link). It’s no hard task for a search engine such as Google to design a click-tracking network by building an artificial neural network (more specifically, a multilayer perceptron (MLP) network). And this is a prime example of “machine learning” in action. No, I’m not going to explain the inner workings of an artificial neural network. There’s tons of data online if you’re so inclined to go find it.

But I do want to, in a simple fashion, explain how it can be used to better rank (and often re-rank) results to provide the end user with the most relevant results and best experience.

First the search engine builds a network of results around a given query (remember the query expansion process explained earlier) by analyzing end user behavior each time someone enters a query and then chooses a link to click on. For instance, someone searches for “foreign currency converter” and clicks on a specific link, then someone else searches for “convert currency” and clicks on exactly the same link, and so on. Each of those clicks strengthens the association of specific words and phrases with a specific URL. Basically, it means that a specific URL is a good resource for multiple queries on a given topic.

The beauty of this for a search engine is that, after a while, the neural network can begin to make reasonable guesses about results for queries it has never seen before, based on the similarity to other queries. I’m going to leave it there as it goes well beyond the scope (or should I say purpose) of this column to continue describing deeper levels of the process.
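For readers who would like something concrete anyway, here is a deliberately simplified Python sketch. To be clear, it is not a multilayer perceptron: I have swapped in a plain table of word-to-URL association weights, because it shows the same two ideas (clicks strengthen associations, and those associations allow a guess for an unseen query) without the neural network machinery. The class name, method names, and URLs are all hypothetical.

```python
from collections import defaultdict

class ClickAssociations:
    """Toy stand-in for a click-tracking network (NOT a real MLP).

    It keeps one weight per (query word, URL) pair and adds to that
    weight every time a click is recorded.
    """

    def __init__(self):
        self.weights = defaultdict(lambda: defaultdict(float))

    def record_click(self, query, url):
        """Strengthen the link between each word in the query and the URL."""
        for word in query.lower().split():
            self.weights[word][url] += 1.0

    def rank(self, query):
        """Rank known URLs for a query, including one never seen before."""
        scores = defaultdict(float)
        for word in query.lower().split():
            for url, weight in self.weights.get(word, {}).items():
                scores[url] += weight
        # Highest score first; ties broken alphabetically for stable output.
        return sorted(scores, key=lambda u: (-scores[u], u))

net = ClickAssociations()
net.record_click("foreign currency converter", "fx.example/convert")
net.record_click("convert currency", "fx.example/convert")
net.record_click("currency exchange rates", "rates.example/today")

# This exact query was never recorded, but its words were.
print(net.rank("currency convert online"))
```

The sketch matches whole words only, so “converter” and “convert” are treated as unrelated. A trained network, or even simple stemming, would do better there, which is part of the reason search engines reach for machine learning in the first place.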

There are many more ways that a search engine can determine the most relevant results for a specific query. In fact, they learn a huge amount from what are known as “query chains,” where an end user starts with one query and then reformulates it (taking out some words or adding some words). By monitoring this cognitive process, a search engine can preempt the end user. So the user types in one thing at the beginning of the chain and the search engine delivers the most relevant document that usually comes at the end of the chain.
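The query chain idea can be sketched in Python along these lines. The session format (a list of reformulated queries plus the URL finally clicked) and all of the data are hypothetical, and simply counting end-of-chain clicks is my own simplification of whatever a search engine really does.

```python
from collections import Counter, defaultdict

def learn_chain_endpoints(sessions):
    """Map the first query of each chain to the URLs clicked at its end.

    sessions is an iterable of (queries, final_click) pairs, where
    queries is the ordered list of reformulations in one session.
    """
    endpoints = defaultdict(Counter)
    for queries, final_click in sessions:
        if queries and final_click:
            endpoints[queries[0]][final_click] += 1
    return endpoints

def preempt(endpoints, first_query):
    """Return the URL that most often ends chains starting with first_query."""
    counts = endpoints.get(first_query)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Made-up sessions: two chains end on the same setup guide, one on a shop.
SESSIONS = [
    (["fish tank", "tropical fish tank", "tropical aquarium setup"], "a.example/setup"),
    (["fish tank", "fish tank setup guide"], "a.example/setup"),
    (["fish tank", "fish tank cheap"], "b.example/shop"),
]

endpoints = learn_chain_endpoints(SESSIONS)
print(preempt(endpoints, "fish tank"))
```

Here the next person who types “fish tank” could be shown the setup guide straight away, because that is where most chains starting with that query ended up.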

In short, search engines know a lot more about which media are consumed by end users and how, and which is deemed the most relevant (often most popular) result to serve given end user preferences. And it has nothing to do with which result had whatever amount of SEO work done on it.

I’ve written a lot over the years about “signals” to search engines, in particular the importance of end user data in ranking mechanisms. In fact, it’s coming up to ten years now (yes, ten years!) since I first wrote about this at ClickZ.

And, on a regular basis, I still see vendor and agency infographics suggesting what the strongest signals are to Google. Yet rarely do you see end user data highlighted as prominently as it should be. Sure, text and links send signals to Google. But if end users don’t click on those links or stay on a page long enough to suggest the content is interesting, what sort of signal does that send? A very strong (and negative) one I’d say.

So, going back to the headline of this column. Next time you scratch your head, comparing your “SEO Kung Fu” to the other guy, give some extra thought to what search engines know… And unfortunately you don’t.

“Take no thought of who is right or wrong or who is better than. Be not for or against.”

BRUCE LEE, MARTIAL ARTIST & ACTOR 1940 – 1973

Originally posted at Tech Marketing News

Develop Content Around Intent

As marketers continue to guess what consumers want, understanding the ‘intention economy’ could ensure less monetary waste on ads, more relevant content, and greater customer loyalty.

Intention is a state of mind. A driving force behind human behavior powered by belief, desire and goal. It has been studied much in such disciplines as philosophy and psychology. And now, intent computing is a growing area of research, particularly in the field of digital marketing. It is based on predicting the probability of a specific intention held by an end user by applying machine learning and data mining methods.

Search engines have the hugely difficult job of having to process end users’ information needs in the form of queries entered into an oblong box on a web page, by the billion. And a major part of the problem they have is that end users have no real idea how to solidly communicate their information needs. This is not a skill that is generally taught in school (outside a limited number of subjects, such as library and information science, law, and chemistry, perhaps). So this is why information systems have to be designed to elicit information from users rather than expecting the end user to volunteer it. End users simply have neither the knowledge nor the desire to devote much time or energy to this. And success in determining the intent behind a query is crucial to the success of search engines, such as Google and Bing.

Web queries, generally, are very short. Numbers differ from study to study, but consistently we discover that most queries are still around three terms long. And, of course, the topics of these queries cover the length and breadth of all human interests, such as health, entertainment, commerce, and the number one subject of them all… Take a guess anyone? Yep. Sex.

Given that I’ve already touched on the stumbling block of the search engine/end-user-as-a-dumbass situation when it comes to actually entering a query, we should also take into consideration that search engines, even while complaining about the problem of short queries, initially discouraged longer queries. I’ll write another column on how search engines have transitioned away from needing an exact match between the terms in the query and those in the link anchor text, so as not to exclude equally relevant results that may not actually contain the terms. Many times I see a result that is totally relevant to the query, and yet the query terms don’t actually appear anywhere on the page.

I’m very proud to have been one of the first people in the online marketing world to have interviewed Andrei Broder, distinguished scientist at Google (although he was chief scientist at Alta Vista at the time of the interview) following the publication of his seminal paper “A Taxonomy of Web Search” based around the “informational need” of the end user at search engines. He broke it down to three types of search behavior: Informational; Navigational; Transactional.

Broder’s original work has been expanded upon significantly, including a hierarchy of user goals. The “transactional” category is also known as the “resource” category. A local type of query also adds to understanding intent, and the stronger the “commercial” element, the stronger the indicated intent to purchase. This is a strong signal as to whether or not to show ads, of course.

The last study related to query volume that I had sight of suggested that as many as 80 percent of queries at search engines fall into the “informational” category, while the rest are split fairly equally between “navigational” and “transactional” queries.

Ultimately, click-through curves provide a huge signal to search engines when determining the intent behind a query. We already know that the top ten results get the most clicks. But this is much more complex than how many times a particular result gets clicked on. There are many more clues given about intent by studying the percentage of click-through for the top ten ranked results. Each click-through represents the first result clicked by a different user entering the query.
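One simple way to put a number on the shape of those click-through percentages is Shannon entropy, sketched below in Python. Choosing entropy as the measure is my assumption, and the click shares are made up. A value of zero means every first click went to a single result, and higher values mean the clicks were spread across several results. How a search engine actually maps a number like this onto intent is not something this column claims to know.

```python
import math

def click_entropy(click_shares):
    """Shannon entropy (in bits) of a click distribution over ranked results.

    click_shares holds the clicks (or percentages) for each of the
    top results; the values are normalized before use.
    """
    total = sum(click_shares)
    if total <= 0:
        return 0.0
    probabilities = [share / total for share in click_shares if share > 0]
    return -sum(p * math.log2(p) for p in probabilities)

# Made-up first-click percentages for the top ten results of two queries.
concentrated = [91, 3, 2, 1, 1, 1, 1, 0, 0, 0]
spread_out = [22, 18, 15, 12, 10, 8, 6, 4, 3, 2]

print(click_entropy(concentrated) < click_entropy(spread_out))
```

The two example queries have the same ten positions and the same total number of clicks, yet the distributions tell very different stories, which is the point about this being more complex than a raw click count.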

Like many others in the digital marketing world, I firmly believe that, gradually, we are moving away from our current “attention economy” into a newer “intention economy.” It is well documented that we have access to more information and more resources than ever before. At the same time, the capacity for producing information now vastly exceeds the human capacity for processing it. The need to examine large quantities of information in such short spaces of time affects our decision-making processes. We suffer from information overload: the difficulty, or impossibility, of processing much of it, the irrelevance or unimportance of most of it, and the lack of time to understand it. Plus, there are multiple sources containing the same information. Yes, this is a gift of the world wide web that just keeps giving.

It’s said that nearly everyone in the modern world is influenced, to some degree, by advertising and other forms of promotion. However, dramatic transformations in the way that we are served marketing media, and the way we consume media generally, are fundamentally changing the art and science of advertising and marketing. A recent survey quoted by Inc. Magazine stated that 70 percent of consumers want to learn about products through content as opposed to traditional ad methods.

We are moving forward into a whole new world of marketing communications. A world where one-way mass media advertising is being replaced by a multitude of two-way media channels. The marketer and the audience are both senders and receivers in what is becoming more of a digital marketing dialogue.

Audience intelligence, the real “big data” in the marketing world, is the fuel that will power what I’ve already referred to as the “intention economy.” The more we understand the end user (and the more they understand us, the marketer), the more relevant (and therefore more useful) we become to each other.

As Doc Searls, co-author of “The Cluetrain Manifesto” and author of the more recent “The Intention Economy,” states, this new economy will outperform the attention economy that has shaped marketing and sales since the dawn of advertising. Customer intentions, well expressed and understood, will improve marketing and sales, because both will work with better information, and both will be spared the cost and effort wasted on guesses about what customers might want.

Perhaps my favorite quote from his recent book is: “The Intention Economy grows around buyers, not sellers. It leverages the simple fact that buyers are the first source of money, and that they come ready-made. You don’t need advertising to make them… The Intention Economy is about buyers finding sellers, not sellers finding (or “capturing”) buyers.”

Once you understand end user intent, the likelihood is that you’ll develop the appropriate content to satisfy that informational need. Here’s something that kind of sums it up. During a chat with my friend Avinash Kaushik, digital marketing evangelist at Google (it’s recorded if you want to view it), he sort of dismissed advertisers’ use of conventional demographics to target audience segments.

The short story goes something like this: “Imagine you’re targeting a demographic group that includes 80-year-old females, and you’re trying to sell them wheelchairs. How do you know they want one? Now imagine if you knew something about their intent, like for instance, if you knew they had been online looking at the Apple store, you’d be selling them an iPad, not a wheelchair. That’s the difference: understanding INTENT!”

Originally posted at Tech Marketing News