If you want to rank your domain & pages higher on Google results, your backlink profile is one of the most important factors. It’s also one of the hardest & most expensive items to take on, especially at scale.
A simplified view of approaching your possible backlinks is:
- Quality: How authoritative is the domain & page of the source? Higher authority is obviously better.
- Relevancy: How close is the source link to your domain topic clusters, and specific target page topic clusters? The more relevant the better. For example, if your domain & page have content about fish tanks, getting a link from a page about airports would have less value for you, even if it’s high authority.
- Volume: If your backlink profile is healthy (quality & relevant links), and excluding possible penalty issues, generally the more inbound links the better.
It’s really hard to nail those 3 factors at the same time. Usually looking at link building approaches, you would either:
- Go for high volume, but lower quality and relevancy. Most link building agencies go for this approach. Essentially they build content for you, and they have a network of publishers whom you pay (directly or indirectly) to create links for your content.
- Go for higher quality & relevancy, but probably miss out on volume. This is the concept of Barnacle SEO, where you essentially use existing ranking pages for your ranking terms and try to acquire links.
Barnacle SEO at Scale (BSaS)
Barnacle SEO is great, since you target very high relevancy pages. The main drawback is that it involves a lot of manual work, and a limited amount of available opportunities to run at scale. This post has to do with a set of scripts & processes that attempt to alleviate those issues so you can expand your high relevancy links at a faster rate. We will call that Barnacle SEO at Scale (BSaS).
I will be doing a step-by-step walkthrough of the BSaS process for one of my favorite SaaS platforms as a case study — Optimizely. Optimizely is one of the leading AB testing / experimentation platforms, with worldwide reach.
We will be looking into their status in their backlink universe, and you will get a handy workbook to identify backlinking opportunities that they could work on. That doesn’t mean that this could only work on B2B — I recently ran the process with a B2C platform and we managed to get 70 super high relevancy backlinks in 2 months. Here’s a snapshot of the progress we documented:
Note: This is likely to be a long one, so if you are interested in running BSaS for your project or want to learn more, feel free to reach out at firstname.lastname@example.org
Step 1: Ranking terms
The first step of BSaS is to document the actual terms that we would like to rank for, or improve our existing positions for.
BSaS comes with a handy template like the one below, where you can specify the ranking term, location (in case you want to focus on a specific country & city), the language of the term and tags that might be useful contextually later. There is also an optional column to define your target ranking page, which is helpful for a use case we will see later on.
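As a rough illustration of that template, here is a minimal sketch of how the rows could be loaded into a typed structure. The column names (`term`, `location`, `language`, `tags`, `target_url`), the pipe-separated tag format and the sample rows are assumptions made for this example, not the actual BSaS template.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RankingTerm:
    term: str
    location: str = ""                # country or city; empty means no geo focus
    language: str = "en"
    tags: List[str] = field(default_factory=list)
    target_url: Optional[str] = None  # optional target ranking page


def load_terms(csv_text: str) -> List[RankingTerm]:
    """Parse template rows; tags are assumed to be pipe-separated."""
    terms = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        tags = [t.strip() for t in (row.get("tags") or "").split("|") if t.strip()]
        terms.append(RankingTerm(
            term=(row.get("term") or "").strip(),
            location=(row.get("location") or "").strip(),
            language=(row.get("language") or "").strip() or "en",
            tags=tags,
            target_url=(row.get("target_url") or "").strip() or None,
        ))
    return terms


# Hypothetical sample rows, for illustration only.
SAMPLE = """term,location,language,tags,target_url
ab testing,United States,en,core|product,https://example.com/ab-testing
split testing tools,United Kingdom,en,tools,
"""

terms = load_terms(SAMPLE)
for t in terms:
    print(t)
```

Keeping the target page optional means terms without a defined ranking page can still flow through the later steps.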
I went ahead and added some terms that are relevant to Optimizely, but the list will obviously scale to hundreds or thousands of terms when running the process in earnest.
Step 2: Search volumes for ranking terms
We use the TrafficEstimatorService of the AdWords API in order to get volume metadata of the ranking terms. This will be used to prioritize the opportunities later on.
Here’s what this looks like:
Step 3: SERP scrape
For each combination of ranking term + location + language, we crawl the top 30 results on Google through a localized crawler. There are a lot of APIs available on the web if you search for “serp scraping”, or you can of course build your own.
For each of the top 30 results we capture the URL, domain, position, and calculate a “volume score”, which is basically a (definitely not 100% accurate) estimation of the traffic each result would get. The reason this is called “volume score” and not “traffic” is, again, because it’s not accurate, but it’s good enough to give a relative difference between the options.
The volume score is based on CTR by position estimates, like this one.
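To make the calculation concrete, here is a minimal sketch that multiplies a term's search volume by an estimated click-through rate for the result's position. The CTR values below are placeholders invented for illustration, not the estimates used in the original process; swap in whichever CTR-by-position study you trust.

```python
# Hypothetical CTR-by-position estimates (share of searchers clicking each result).
CTR_BY_POSITION = {
    1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
    6: 0.04, 7: 0.03, 8: 0.025, 9: 0.02, 10: 0.015,
}
PAGE_2_CTR = 0.005  # assumed flat rate for positions 11-20
PAGE_3_CTR = 0.001  # assumed flat rate for positions 21-30


def estimated_ctr(position: int) -> float:
    """Return the assumed CTR for a 1-based SERP position."""
    if position < 1:
        raise ValueError("position starts at 1")
    if position in CTR_BY_POSITION:
        return CTR_BY_POSITION[position]
    return PAGE_2_CTR if position <= 20 else PAGE_3_CTR


def volume_score(search_volume: int, position: int) -> float:
    """Relative traffic estimate: search volume times CTR at that position."""
    return round(search_volume * estimated_ctr(position), 2)


for pos in (1, 5, 15, 30):
    print(pos, volume_score(1000, pos))
```

The absolute numbers matter less than the ordering they produce: a first-position result on a high-volume term should clearly outweigh a third-page result on a niche one.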
Here’s a snapshot output of the crawl, so you can understand the data structure.
Step 4: SERP results tagging
Even for the 25 ranking terms we selected for the Optimizely case, there will be more than 500 URLs in the crawl output. A lot of them would not be relevant for acquiring links, as they are either competitor results, our own brand, or non-approachable platforms. So these results need to be tagged and filtered. You can either create the rules manually, or in my case use competition identification software and an ever-expanding set of rules for the tagging of non-approachable domains.
The filters are applied first to the domain level and then exposed to the URL list as follows:
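A minimal sketch of that domain-first tagging, using hand-written rule sets. The flag names `is_brand`, `is_competitor` and `is_unapproachable` match the columns used later in this post; the `is_addressable` flag, the domain lists and the sample URLs are assumptions for illustration, not the actual rule set.

```python
from urllib.parse import urlparse

# Hypothetical rule sets; in practice these grow over time.
BRAND_DOMAINS = {"optimizely.com"}
COMPETITOR_DOMAINS = {"apptimize.com"}
UNAPPROACHABLE_DOMAINS = {"wikipedia.org", "youtube.com"}


def domain_of(url: str) -> str:
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def matches(domain: str, rules: set) -> bool:
    """True if the domain equals a rule or is a subdomain of it."""
    return any(domain == r or domain.endswith("." + r) for r in rules)


def tag_domain(domain: str) -> dict:
    return {
        "is_brand": int(matches(domain, BRAND_DOMAINS)),
        "is_competitor": int(matches(domain, COMPETITOR_DOMAINS)),
        "is_unapproachable": int(matches(domain, UNAPPROACHABLE_DOMAINS)),
    }


def tag_urls(urls: list) -> list:
    """Tag each domain once, then expose the tags on every URL of that domain."""
    domain_tags = {}
    rows = []
    for url in urls:
        domain = domain_of(url)
        if domain not in domain_tags:
            domain_tags[domain] = tag_domain(domain)
        tags = domain_tags[domain]
        rows.append({"url": url, "domain": domain, **tags,
                     "is_addressable": int(not any(tags.values()))})
    return rows


tagged = tag_urls([
    "https://www.optimizely.com/",
    "https://blog.apptimize.com/post",
    "https://en.wikipedia.org/wiki/A/B_testing",
    "https://example-blog.com/best-ab-testing-tools",
])
for row in tagged:
    print(row)
```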
Step 5: SERP results crawl
After filtering, we also gather interesting meta data from crawling the result URLs themselves and capturing information about the pages. Some of the data captured are:
- number of external links: the number of links on the URL that point to external domains. This is an indication of the likelihood that they would be willing to link to other sites.
- number of ad links: the number of links on the URL that are either advertising network links or affiliate links. This is also an indication of the likelihood that they would be willing to promote other services.
- has brand link: Binary field in case the URL already has a link pointing to our brand. This is useful for reporting purposes, so we can have scores about our “barnacle share”, and to filter options further.
- has brand mention: We check if the HTML of the page mentions “optimizely” on the page. This is handy for claiming easy links.
- number of competitor links: We capture the number of competitors that this URL has links pointing to. This is useful for reporting on competitor barnacle scores, and for prioritizing these URLs (if competitors have links, we could also potentially get them).
- is competitor link higher: An interesting data point, in case both competitors and we have links, but the competitor link is higher in the HTML.
- is brand link correct: For each ranking term, we added the ranking page earlier. In case we do have a brand link but it’s pointing to a different URL, this is flagged here.
- is error: This is a flag set when the SERP URL could not be crawled because it threw some kind of error. This happens more often than you would think.
- has 404 target: This is a flag that is true if any of the links that the SERP URL points to is an error page.
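Several of these fields can be derived from the page HTML alone. The sketch below uses only the Python standard library and covers the external, brand and competitor link fields. Ad-link detection and the error and 404 flags are left out because they need extra rules or network requests. The function names, the sample HTML and the target URL are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def host_of(url):
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def on_domain(url, domain):
    host = host_of(url)
    return host == domain or host.endswith("." + domain)


def page_features(page_url, html, brand_domain, brand_name,
                  competitor_domains, target_url=None):
    collector = LinkCollector()
    collector.feed(html)
    links = [urljoin(page_url, href) for href in collector.hrefs]
    page_host = host_of(page_url)
    # Keep links with a host that differs from the page's own host.
    external = [l for l in links if host_of(l) and host_of(l) != page_host]

    brand_idx = [i for i, l in enumerate(external) if on_domain(l, brand_domain)]
    comp_idx = [i for i, l in enumerate(external)
                if any(on_domain(l, c) for c in competitor_domains)]
    linked_competitors = {c for c in competitor_domains
                          if any(on_domain(l, c) for l in external)}

    is_brand_link_correct = None  # unknown without both a brand link and a target
    if brand_idx and target_url:
        is_brand_link_correct = any(
            external[i].rstrip("/") == target_url.rstrip("/") for i in brand_idx)

    return {
        "number_of_external_links": len(external),
        "has_brand_link": bool(brand_idx),
        "has_brand_mention": brand_name.lower() in html.lower(),
        "number_of_competitor_links": len(linked_competitors),
        # "Higher" here means earlier in the HTML source order.
        "is_competitor_link_higher": bool(brand_idx and comp_idx
                                          and min(comp_idx) < min(brand_idx)),
        "is_brand_link_correct": is_brand_link_correct,
    }


SAMPLE_HTML = """
<p>We compared <a href="https://apptimize.com/">Apptimize</a> and
<a href="https://www.optimizely.com/">Optimizely</a>.
See also <a href="/about">about us</a>.</p>
"""

features = page_features(
    "https://example-blog.com/ab-testing-tools", SAMPLE_HTML,
    brand_domain="optimizely.com", brand_name="Optimizely",
    competitor_domains=["apptimize.com"],
    target_url="https://www.optimizely.com/example-target-page",  # hypothetical
)
print(features)
```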
A part of the metadata of the SERP results crawl looks like this:
By this point we have a pretty good backlog of potential high relevancy link acquisition opportunities, and some interesting prioritization options and reach out cases to build scripts for.
In terms of possibilities, we are only half way there, as there is a further level to look into, and a whole set of enhancements. This is digital marketing today: To have an edge, you need to go the extra mile.
Before we go into the next steps, let’s have a look at how Optimizely is faring in terms of their barnacle backlink universe.
Quick view of the data so far:
From the 25 ranking terms we added as input, there are 390 unique domains ranking with 600 unique URLs on the search results.
The “addressable” URLs (excluding brand, competitor, unapproachable) are 512 from 356 domains.
A more insightful way to analyse the data is not in terms of counting URLs, but to use the “volume score” we calculated earlier, to add weight to the potential opportunity. The idea is that it’s better to get a link from a URL on the 1st position vs 30th position, and the higher the search volume of the ranking term, the better.
- addressable_volume_score: The sum of volume score for addressable URLs
- brand_won_volume_score: The sum of volume score for URLs that have links pointing to optimizely.com
- competitor_won_volume_score: The sum of volume score for URLs that have links to optimizely competitors
As you can see, Optimizely is doing pretty well in terms of backlinks on the SERP results, as it’s “capturing” about 15% of the addressable points, and it’s higher than all the other competitors combined.
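The three metrics are plain sums over the tagged rows. A minimal sketch, assuming each row carries the `volume_score`, an `is_addressable` flag and the link fields from the crawl step; the sample rows and the `brand_share` field are made up for illustration and do not reflect the Optimizely data.

```python
def barnacle_scores(rows):
    """Sum volume scores over addressable URLs, split by who has won a link."""
    addressable = [r for r in rows if r["is_addressable"]]
    total = sum(r["volume_score"] for r in addressable)
    brand_won = sum(r["volume_score"] for r in addressable if r["has_brand_link"])
    competitor_won = sum(r["volume_score"] for r in addressable
                         if r["number_of_competitor_links"] > 0)
    return {
        "addressable_volume_score": total,
        "brand_won_volume_score": brand_won,
        "competitor_won_volume_score": competitor_won,
        "brand_share": brand_won / total if total else 0.0,
    }


# Hypothetical rows for illustration.
rows = [
    {"volume_score": 300, "is_addressable": True, "has_brand_link": True, "number_of_competitor_links": 0},
    {"volume_score": 150, "is_addressable": True, "has_brand_link": False, "number_of_competitor_links": 2},
    {"volume_score": 50, "is_addressable": True, "has_brand_link": False, "number_of_competitor_links": 0},
    {"volume_score": 500, "is_addressable": False, "has_brand_link": True, "number_of_competitor_links": 0},
]

scores = barnacle_scores(rows)
print(scores)
```

A URL that links to both the brand and a competitor counts towards both totals, so the two "won" scores can add up to more than the addressable total.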
The same view broken down by ranking term:
By top domains:
Some other interesting data points are:
- brand_mention_not_won: URLs where “Optimizely” is mentioned, but there is no link
- brand_won_wrong_url: URLs where Optimizely has won a link, but it is different from the one defined as the “ranking URL”. This usually happens when we have a target ranking URL, but the backlink points to the homepage instead
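Both data points fall out of the crawl fields from Step 5. Here is a minimal sketch of a classifier over those fields; the labels `brand_won`, `competitor_won` and `open` are names invented for this example, and the order of the checks is one possible choice.

```python
def classify(row):
    """Bucket a crawled URL by its link status towards the brand."""
    if row["has_brand_link"]:
        # is_brand_link_correct is None when no target ranking page was defined.
        if row.get("is_brand_link_correct") is False:
            return "brand_won_wrong_url"
        return "brand_won"
    if row["has_brand_mention"]:
        return "brand_mention_not_won"
    if row["number_of_competitor_links"] > 0:
        return "competitor_won"
    return "open"


# Hypothetical rows for illustration.
examples = [
    {"has_brand_link": True, "is_brand_link_correct": False, "has_brand_mention": True, "number_of_competitor_links": 0},
    {"has_brand_link": False, "is_brand_link_correct": None, "has_brand_mention": True, "number_of_competitor_links": 1},
    {"has_brand_link": False, "is_brand_link_correct": None, "has_brand_mention": False, "number_of_competitor_links": 3},
]

labels = [classify(r) for r in examples]
print(labels)
```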
All of these can obviously be used as metrics to track across time. They all come from a single table, which you will be able to explore at the end of the post. We won’t go into hunting for opportunities yet, as we’ll save that for the end of the article.
Step 6: SERP results backlinks
So far we have used Google’s prioritised results as an indication of the relevancy and quality of the pages to approach. However, as we said initially, backlinks are a big part of making a page rank, so it is insightful to also uncover the source pages that pointed backlinks to the high ranking pages, and thus got them to rank higher.
Here’s some of the info that the Mozscape API gathers for each URL:
The second column is a whole new set of potential backlinks that helped the page rank, and that could potentially be easier to reach out to and get links from. To keep things sane, I tend to limit the backlinks per SERP result to the top 25 sorted by propensity. That means that for 25 keywords, you can get up to 18,750 backlink opportunities (25 terms × 30 results × 25 backlinks). In practice the number tends to be a lot lower due to duplication and filtering.
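The upper bound and the effect of deduplication can be sketched in a few lines. The example assumes each SERP result comes with a list of backlink source URLs that is already sorted best-first; the function name and the sample URLs are hypothetical.

```python
TERMS = 25
SERP_RESULTS_PER_TERM = 30
BACKLINKS_PER_RESULT = 25

# Theoretical maximum before deduplication and filtering.
upper_bound = TERMS * SERP_RESULTS_PER_TERM * BACKLINKS_PER_RESULT
print(upper_bound)


def collect_backlinks(backlinks_by_result, limit=BACKLINKS_PER_RESULT):
    """Keep the first `limit` backlinks per SERP result, then drop duplicates."""
    seen = set()
    unique = []
    for serp_url, backlinks in backlinks_by_result.items():
        for source_url in backlinks[:limit]:
            if source_url not in seen:
                seen.add(source_url)
                unique.append(source_url)
    return unique


# Hypothetical input: two SERP results sharing one backlink source.
sample = {
    "https://example-blog.com/a": ["https://news.example.org/1", "https://forum.example.net/2"],
    "https://example-blog.com/b": ["https://news.example.org/1", "https://wiki.example.com/3"],
}
opportunities = collect_backlinks(sample)
print(opportunities)
```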
Step 7: Link quality scoring
Now that we have a large number of URLs that we could potentially get links from, we’ll need a unified process of prioritizing them in a way that maximizes the delivered outcome, which is of course better rankings.
To do that, we run all the URLs (excluding brand, competitors, irrelevant) through the Mozscape API and get quantitative information about the link quality, like the page authority, domain authority, spam score, linking domains to the page etc.
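How these metrics get combined into one priority is a judgment call. One possible sketch blends page and domain authority and discounts the result by the spam score. The equal weights, the assumption that all three metrics are on a 0-100 scale, and the field names are illustrative choices, not the scoring used in the original process.

```python
def link_quality(page_authority, domain_authority, spam_score,
                 pa_weight=0.5, da_weight=0.5):
    """Blend authority metrics (0-100) and discount by spam score (assumed 0-100)."""
    authority = pa_weight * page_authority + da_weight * domain_authority
    return authority * (1 - spam_score / 100)


def prioritize(candidates):
    """Sort candidate URLs from best to worst link quality."""
    return sorted(
        candidates,
        key=lambda c: link_quality(c["page_authority"], c["domain_authority"], c["spam_score"]),
        reverse=True,
    )


# Hypothetical candidates for illustration.
candidates = [
    {"url": "https://spammy.example.net/x", "page_authority": 60, "domain_authority": 80, "spam_score": 50},
    {"url": "https://solid.example.org/y", "page_authority": 50, "domain_authority": 70, "spam_score": 0},
    {"url": "https://small.example.com/z", "page_authority": 20, "domain_authority": 30, "spam_score": 0},
]

ranked = prioritize(candidates)
for c in ranked:
    print(c["url"], link_quality(c["page_authority"], c["domain_authority"], c["spam_score"]))
```

Multiplying this score by the volume score from Step 3 would be one way to fold relevance and traffic potential into the same ranking.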
Step 8: Programmatic contact information sourcing
The main idea of BSaS is to be able to run Barnacle SEO activities for a higher volume output than a manual approach. In order to do that, we run all the “approachable” domains through contact information sourcing APIs like hunter.io, so that we can gather leads to reach out to on the outreach campaigns. I did not run this for Optimizely because there’s a cost associated, but here’s a snapshot of the potential output:
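A rough sketch of what the sourcing step could look like against Hunter's domain search endpoint. The response fields used here (`data.emails[].value`, `confidence`, `first_name`, `position`) reflect my understanding of the v2 API and should be checked against the current Hunter documentation; the confidence threshold, the function names and the sample payload are assumptions. The network call is defined but not executed in the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

HUNTER_DOMAIN_SEARCH = "https://api.hunter.io/v2/domain-search"


def domain_search_url(domain, api_key):
    return HUNTER_DOMAIN_SEARCH + "?" + urlencode({"domain": domain, "api_key": api_key})


def extract_contacts(payload, min_confidence=70):
    """Pull contact rows out of a domain search response (assumed shape)."""
    emails = (payload.get("data") or {}).get("emails") or []
    return [
        {
            "email": e.get("value"),
            "first_name": e.get("first_name"),
            "position": e.get("position"),
            "confidence": e.get("confidence"),
        }
        for e in emails
        if (e.get("confidence") or 0) >= min_confidence
    ]


def fetch_contacts(domain, api_key):
    """Call the API for one domain; each call consumes paid credits."""
    with urlopen(domain_search_url(domain, api_key), timeout=30) as response:
        return extract_contacts(json.load(response))


# Hypothetical payload, shaped like the assumed response.
sample_payload = {
    "data": {
        "domain": "example-blog.com",
        "emails": [
            {"value": "editor@example-blog.com", "first_name": "Sam", "position": "Editor", "confidence": 92},
            {"value": "info@example-blog.com", "first_name": None, "position": None, "confidence": 40},
        ],
    }
}

contacts = extract_contacts(sample_payload)
print(domain_search_url("example-blog.com", "YOUR_API_KEY"))
print(contacts)
```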
Below we will explore some interesting use cases that can be applied at scale with some lightweight automation. As promised you can explore the full datasets yourself here:
In total, between the 2 datasets, there are about 9,000 “addressable” link opportunities. As you start working with the output, the filters can become more sophisticated and extensive.
However, now that you have the data in a structured format, that’s when the real work starts. The data is just an enabler and a good groundwork for you to bring in the results you’re looking for. The post-setup time is where you get to apply your brand guidelines, persuasiveness and resourcefulness to try to win over valuable links that can have a big effect on your SEO and your bottom line.
Here are some use cases to start off with your reachout attempts:
Unclaimed Brand Mentions
- is_brand, is_competitor, is_unapproachable=0
Angle: “Looks like you gave a shout but did not include a link pointing to our website – wouldn’t your visitors benefit from being able to visit the target page you are referring to?”
Wrong Brand Links
- is_brand, is_competitor, is_unapproachable=0
Angle: “Thanks for the link! We actually have another page available that we believe might be more relevant for users clicking through to visit. Would you consider switching over to X instead?”
Competitor won links
- is_brand, is_competitor, is_unapproachable=0
Angle: “We noticed that in your article about user journeys, you have a link pointing to Apptimize’s case study about Hotel Tonight. It looks like that case study dates back to 2015. Have you had a look at our extensive resource of recent case studies of Optimizely clients improving conversion rates? We would love to have a chat about how a content synergy can help both of our audiences learn more about user acquisition & adoption.”
That’s it for this post – happy link building!