Screaming Frog’s “Near Duplicates” report shows, as a percentage, how closely each page’s content matches other pages on the site.
This helps you find duplicate pages at scale.
When analyzing a website, checking for duplicate content can be difficult and time-consuming.
If a page has a competing duplicate, it might not be immediately obvious from a visual review.
PDFs, URL variations and other types of duplicate content are often hidden deep within a site.
Fortunately, you can check at scale by using this process:
1. Open up Screaming Frog.
2. Navigate to Configuration > Content > Duplicates.
3. Check “Enable Near Duplicates” and choose the Similarity Threshold. A lower threshold flags more pages as near duplicates.
4. Start your crawl and let it complete.
5. When the crawl is complete, go to Crawl Analysis > Configure and make sure “Content” is checked.
6. Go to Crawl Analysis > Start.
7. When the analysis is complete, go to Content > Near Duplicates. The “Closest Similarity Match” column shows each page’s highest % match with another page.
8. You can use the “Duplicate Details” tab to see the other pages that have a high % match.
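To make the Similarity Threshold and the % match more concrete, here is a minimal Python sketch of one common way to score text similarity: the Jaccard overlap of word shingles. It is an illustration only, not Screaming Frog's implementation. The function names, the three-word shingle size and the 90% threshold are assumptions chosen for the example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity_percent(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, as a percentage."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 100.0  # two empty texts are treated as identical
    return 100.0 * len(set_a & set_b) / len(set_a | set_b)


def is_near_duplicate(text_a, text_b, threshold=90.0):
    """True when the similarity meets the (assumed) threshold."""
    return similarity_percent(text_a, text_b) >= threshold


if __name__ == "__main__":
    # Hypothetical page copy, used only to exercise the functions.
    page_a = "Our blue widget ships free within two business days"
    page_b = "Our red widget ships free within two business days"
    print(round(similarity_percent(page_a, page_b), 1))
    print(is_near_duplicate(page_a, page_b))
```

Lowering the `threshold` argument makes the check more permissive, in the same way that a lower Similarity Threshold flags more pages.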
This process makes it much faster for you to identify duplicate content on your site and the groupings of URLs that search engines would consider similar.
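If you export the Near Duplicates report to CSV, a short script can pull out the highest-scoring pages for review. The sketch below assumes the export has columns named "Address" and "Closest Similarity Match". Those headers, the 95% cutoff and the sample rows are assumptions, so check them against your own export.

```python
import csv
import io


def high_similarity_rows(csv_file, min_percent=95.0,
                         url_col="Address",
                         match_col="Closest Similarity Match"):
    """Return (url, score) pairs at or above min_percent, highest first.

    The column names are assumed; pass your export's headers if they differ.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        raw = (row.get(match_col) or "").strip().rstrip("%")
        try:
            score = float(raw)
        except ValueError:
            continue  # skip rows with a blank or non-numeric score
        if score >= min_percent:
            rows.append((row[url_col], score))
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data standing in for a real export file.
sample = io.StringIO(
    "Address,Closest Similarity Match\n"
    "https://example.com/a,98.5\n"
    "https://example.com/b,91\n"
    "https://example.com/c,\n"
)
result = high_similarity_rows(sample)
for url, score in result:
    print(url, score)
```

For a real export, replace the `io.StringIO` sample with `open("your_export.csv", newline="", encoding="utf-8")`, where the file name is a placeholder.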