“First time I got outranked by a spin of my own content,” said Tonya Ugnich.

Expired domain, 8 million pages in the index, pure spam.

Found a footprint and other domains from their huge network.

Now reading the HTML code they fed to Googlebot.

Very helpful content.

As this spam network uses cloaking software, it’s difficult to see the pages they serve to Googlebot.

But I got it anyway and want to share what I found.

It’s important to mention that some of my findings might be irrelevant to ranking, or simply products of survivorship bias.

Correlation does not mean causation, and so on.

Regardless, this setup does work and is getting lots of traffic for low-competition keywords right now.

I looked at about a dozen pages on different domains, and all of them follow the same template: simple, clean HTML with Bootstrap CSS and Font Awesome icons.

There is a simple search form; I’m not sure if it actually works.
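To give a feel for that template, here is a rough sketch of how such a page could be put together; the CDN links, versions and the search endpoint are my placeholders, not taken from their code:

<!-- Placeholder stylesheet links; which CDN and versions they use is a guess -->
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.5.2/css/all.min.css">

<!-- A minimal Bootstrap search form with a Font Awesome icon -->
<form action="/search" method="get" class="input-group">
  <input type="text" name="q" class="form-control" placeholder="Search...">
  <button type="submit" class="btn btn-primary"><i class="fa fa-search"></i></button>
</form>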

Some pages have no images, others have one image hosted on Bing.

They use just one schema, in JSON-LD format: BreadcrumbList.
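For context, a minimal BreadcrumbList in JSON-LD looks roughly like this; the names and URLs below are placeholders, not copied from their pages:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example-domain.eu/" },
    { "@type": "ListItem", "position": 2, "name": "Category", "item": "https://example-domain.eu/category/" }
  ]
}
</script>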

On each page there are 15 internal links in a <ul>, with exact-match anchors.

And 8 outgoing links that use three kinds of anchors: the domain name, a very common word (like “news” or “en”), or short gibberish.
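Roughly, the two link blocks follow this pattern; the anchors and URLs here are invented for illustration:

<ul>
  <li><a href="/buy-blue-widgets-online/">buy blue widgets online</a></li>
  <li><a href="/cheap-blue-widgets/">cheap blue widgets</a></li>
  <!-- ...13 more internal links with exact-match anchors -->
</ul>

<a href="https://some-network-domain.net/">some-network-domain.net</a>
<a href="https://another-domain.org/">news</a>
<a href="https://third-domain.eu/">qzx</a>
<!-- ...5 more outgoing links in the same three styles -->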

They use the names of real, well-known people as page authors.

They use XML, HTML and RSS sitemaps simultaneously.
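I can’t tell exactly how they declare all three, but the usual way to expose the XML and RSS sitemaps at once is to list both in robots.txt (Google accepts RSS/Atom feeds as sitemaps), with the HTML sitemap simply linked as a normal page; for example, on a placeholder domain:

Sitemap: https://example-domain.eu/sitemap.xml
Sitemap: https://example-domain.eu/feed.rss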

Most pages have no H2-6, just one H1.

There is a <base> tag on the page.

They use some unusual meta tags: author, designer, owner.
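Both of those sit in the <head>; in sketch form, with placeholder values (I’m only guessing what they put into designer and owner):

<head>
  <base href="https://example-domain.eu/">
  <meta name="author" content="Famous Real Person">
  <meta name="designer" content="Famous Real Person">
  <meta name="owner" content="Famous Real Person">
  <!-- title, description, stylesheets, etc. -->
</head>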

No <strong>, <b> or other inline markup tags.

No tables, no lists (except for the internal links).

Around 1000 words per page.

The density of the primary keyword varies, depending on the keyword.
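(For reference, keyword density here means occurrences of the keyword divided by the total word count: for example, 15 occurrences on a 1000-word page is 1.5%.)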

Good choice of co-occurring words, as far as I can see.

If you want an example, just search for any long-tail keyword and scroll down to pages 3–4.

If you see a European gTLD, it’s likely one of their pages.
