I have been thinking a lot lately about topical keyword research and how this plays a role in SEO, content hierarchy and the data science approaches we use to accomplish these ideas. Let me back up…
Topical Keyword Research and Intent-based Search
Whether you’re an SEO or just someone who’s observed Google search results over time, it’s clear that over the years Google SERPs have become much more “semantic.” But what does that really mean?
In short, computers have used natural language processing (NLP) to better understand how human language works. That might be understanding synonyms, crafting results based on which device type you are using or any number of things. But the bottom line is that Google has moved away from showing results that are heavily keyword-based (returning pages that contain the exact phrase you typed or something very similar) to more of a semantic or intent-based approach where the results might contain the keyword you searched, but they are more concerned with showing results you intend to see and understanding if something related is a better result and does not contain your exact keywords — that’s okay.
How Does this Impact SEO?
In a big way. And this is not new, but we need to approach keyword research and craft content around intent and not specific 1:1 keywords. In other words, we should get a list of keywords, cluster them into intent groups and then build content based on intent groups instead of individual keywords. This is ultimately what the user wants — not a a bunch of slight variations of content that are more or less similar. And Google theoretically will rank this content well if it meets user intent and they can connect it to the query.
How is this Related to Data Science?
For one, clustering is a big topic in data science and can be executed in R. There are no doubt SEO tools out there that exist, but if you want more control you might consider supervised or unsupervised clustering in R.
I can see a bigger picture here as well. As we craft our content based on intent and clustering, we can almost take a testing approach to content and site information architecture in the future. Basically, one could build out their intent groups and with that list merge content that is part of one cluster or fill out any gaps. Over time analytics should show how users move through the funnel and if any steps are needed to provide an easy path for users (a path to whatever your goal happens to be).
But I think there is a paid media tie in here as well. Not often enough do we look at paid media performance from an SEO perspective and document which keywords drive conversions versus which are more informational. We should be using that information to learn how to build out information architecture as well. It should be an additional layer to better understand how to break up similar content throughout the user journey and confirm which keywords belong to which bucket.