Screaming Frog’s “Near Duplicates” report shows what percentage of a page’s content matches other pages on the same site.

This helps you find duplicate pages at scale.

When analyzing a website, checking for duplicate content can be difficult and time-consuming.

Often, if a page has a competing duplicate, it is not immediately obvious from a visual review.

PDFs, URL variations and other types of duplicate content are often hidden deep within a site.

Fortunately, you can check at scale by using this process:

1. Open up Screaming Frog.

2. Navigate to Configuration > Content > Duplicates.

3. Check “Enable Near Duplicates” and choose the Similarity Threshold.

4. Start your crawl and let it complete.

5. When the crawl is complete, go to Crawl Analysis > Configure and ensure that “Content” is checked.

6. Go to Crawl Analysis > Start.

7. When the analysis is complete, go to Content > Near Duplicates. The “Closest Similarity Match” column gives each URL’s % match with its closest near duplicate.

8. You can use the “Duplicate Details” tab to see the other pages that have a high % match.
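If you export the Near Duplicates report to CSV, a short script can pull out the URLs above a chosen match percentage. The following is a minimal Python sketch, not an official workflow: the column names “Address” and “Closest Similarity Match”, the sample rows and the `near_duplicates` helper are all assumptions for illustration, so adjust them to match your own export.

```python
import csv
import io

# Hypothetical export contents. The column names are assumptions;
# check the header row of your own file before relying on them.
SAMPLE_EXPORT = """Address,Closest Similarity Match
https://example.com/page-a,98.5
https://example.com/page-b,91
https://example.com/page-c,
"""


def near_duplicates(csv_file, min_match=95.0):
    """Return (url, match) pairs at or above min_match, highest match first."""
    rows = []
    for row in csv.DictReader(csv_file):
        raw = (row.get("Closest Similarity Match") or "").strip().rstrip("%")
        if not raw:
            continue  # no match value recorded for this URL
        match = float(raw)
        if match >= min_match:
            rows.append((row["Address"], match))
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


flagged = near_duplicates(io.StringIO(SAMPLE_EXPORT), min_match=95.0)
for url, match in flagged:
    print(f"{match:5.1f}%  {url}")
```

To run it against a real file, pass an open file handle instead of the `io.StringIO` sample, for example `open("near_duplicates.csv", newline="", encoding="utf-8")`, where the filename is again a placeholder.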

This process makes it much faster for you to identify duplicate content on your site and the groupings of URLs that search engines would consider similar.
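To get a feel for what a “% match” and a similarity threshold mean, here is a simplified Python sketch. It is not Screaming Frog’s own implementation: it assumes a plain Jaccard comparison of three-word sequences (shingles), and the function names, sample texts and the 50% threshold are illustrative only.

```python
def shingles(text, size=3):
    """Split text into a set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Percentage of shingles shared by two texts (Jaccard similarity)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 100.0  # two empty texts are treated as identical
    return 100.0 * len(sa & sb) / len(sa | sb)


page_a = "red widgets ship free on all orders over fifty dollars"
page_b = "red widgets ship free on all orders over sixty dollars"
page_c = "our returns policy covers every purchase for thirty days"

THRESHOLD = 50.0  # illustrative value, chosen for these very short texts

for name, other in (("page_b", page_b), ("page_c", page_c)):
    score = similarity(page_a, other)
    label = "near duplicate" if score >= THRESHOLD else "distinct"
    print(f"page_a vs {name}: {score:.0f}% ({label})")
```

Changing a single word in a ten-word text alters several shingles at once, so short snippets score lower than you might expect. Full-length pages that share most of their copy land much closer to 100%.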

 
