A common myth about duplicate content is that it results in a Google penalty. As Google states in its Search Central guidelines, it doesn’t. Google acknowledges that most cases of duplicate content are benign. Even so, doing nothing about duplicates can hurt your site just as much as a penalty would: your search rankings can drop and your optimization efforts can be undermined.
Here is an example of what happens when you have several versions of the same content on your site or across the web:
• Search engines index each of them. Specifically, Google indexes what it considers non-malicious duplicate content. These could be product descriptions found through multiple distinct URLs (unique web addresses), regular and printer-friendly versions of web pages and syndicated content such as press releases and proprietary studies, among others.
• Now, search engines don’t want to list the same content multiple times for a single query. So they filter what they think is the best version to rank in the top results and satisfy user intent. Google does a great job at this.
• Duplicates can still confuse search engines. If they can’t tell which copy is the original, all versions will struggle to rank. And if search engines are forced to choose one duplicate copy over another, the visibility of the other versions diminishes. There’s no winning in either case. The content you prefer to rank for a particular keyword may be overlooked, and leaving things to chance isn’t a sound strategy for staying competitive in the search results.
Wouldn’t it be more sensible to eliminate duplicate content or minimize its impact on your website rankings and search engine optimization (SEO)? The good news is that duplicate content issues can be diagnosed, managed and avoided with key concept awareness and technical know-how.
In this blog post, we show you how it’s done. We cover:
• What is duplicate content?
• How does duplicate content affect SEO?
• What causes duplicate content?
• How to successfully avoid duplicate content
• What Google has to say about duplicates
• Practical tips to prevent duplicate content
What Is Duplicate Content?
Duplicate content completely matches or is substantially similar to another piece of content. Duplicates live in more than one location on the web. When we say location, we’re referring to the URL or unique web address, for example, http://www.example.com/index.html.
You can create dupes, as Google calls them, by accident or through no fault of your own. Here are the scenarios that lead to duplicate content on the same website or across websites:
• You have a webpage that can be accessed through multiple URLs.
• You published the same content on different pages.
• Another website scraped or copied content from your website.
How Does Duplicate Content Affect SEO?
Identical content can influence your search engine performance in several ways. But is it always bad for SEO, or are there exceptions?
“Duplicate content is bad for SEO. Although Google does not have a penalty for [this practice], it will still not help you outrank your competitors,” said Ronnel Viloria, Thrive’s Demand Generation Senior SEO Strategist. And for those who use black-hat tactics, he added: “Worse, your web pages might be removed from Google’s search results for trying to manipulate its search index.”
Let’s delve into the reasons it is doing your SEO more harm than good.
Loss of Traffic
Drawing more site traffic is why website owners strive to rank high on the search engine results pages (SERPs). The presence of wayward duplicate copy runs counter to this goal.
Consider the two scenarios:
1. When you have multiple URLs leading to the same page, Google may serve searchers the less appealing version, discouraging them from clicking on your link.
http://example.com/page-name
http://example.com/category/page-name
http://example.com/page-name?utm_source=google&utm_medium=cpc&utm_campaign=spring_sale
The third URL is unappealing because it contains strings that don’t provide any value to your target audience. On the other hand, the first one delivers succinctly everything the searcher needs to know about the page they’re about to visit.
2. It’s not like Google will hit you with a duplicate content penalty because you have three city pages containing the same information (if you’re a travel site, for instance). But, because excellent user experience is one of the most important ranking factors, the popular search engine favors “pages with distinct information”. It may have difficulty determining the original copy and be forced to filter the best version. Sometimes, the chosen version isn’t the one you prefer to rank for and drive visitors to.
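Returning to the first scenario, here is a minimal Python sketch of how tracking parameters could be stripped so that the campaign URL and the clean URL resolve to the same address. The function name strip_tracking_params and the assumption that every parameter starting with "utm_" is tracking-only are ours, for illustration; your site may use other parameters that need the same treatment.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters starting with "utm_" only track campaigns
# and never change what the page displays.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_params(url: str) -> str:
    """Return the URL with tracking-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


urls = [
    "http://example.com/page-name",
    "http://example.com/category/page-name",
    "http://example.com/page-name?utm_source=google&utm_medium=cpc&utm_campaign=spring_sale",
]
for url in urls:
    print(strip_tracking_params(url))
```

The first and third URLs come out identical, while the category URL stays distinct. A different path needs a canonical tag or a redirect rather than parameter cleanup.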
Ruined SEO Rankings
If you have a page that’s accessible through different URLs, each of them may receive external backlinks. This splits the link equity among them and effectively reduces your chances of ranking the most appropriate version on the SERPs.
On rare occasions, syndicated content or content republished on other sites with your permission can outperform you in the rankings. Be wary of this issue if you often send out articles, infographics, videos and press releases. You also have the choice to hire a team that specializes in SEO content writing services to avoid making costly mistakes.
What Causes Duplicate Content?
At this point, you’re probably already familiar with the kind of content to avoid, even without the aid of a website duplicate content checker or an expert content writing agency. Let’s cement that knowledge by helping you identify specific duplicate content issues.
1. HTTP/HTTPS, WWW/Non-WWW and the Trailing Slash
The first place to look for duplicate content is the URL. If your site can be accessed with or without the WWW prefix, i.e., www.example.com and example.com, you’ve got duplicate content on the same website. The same conclusion applies if both HTTP and HTTPS have live versions. Another variation to check is the appearance of a slash at the end of a URL, but not in the way you might think.
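John Mueller, Google Webmaster Trends Analyst, has cleared up the confusion about trailing slashes: a slash right after the hostname makes no difference, but a slash at the end of any other part of the URL creates a different address.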
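To make the URL variations concrete, here is a rough Python sketch that maps protocol and WWW variants to one preferred form. The function name preferred_url and the choice of HTTPS plus the non-WWW host are assumptions for illustration, not a recommendation for every site. In practice you would enforce the same rule with server-side redirects.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str) -> str:
    """Map protocol and WWW variants of a URL to one preferred form.

    Assumptions for illustration: HTTPS and the non-WWW host are preferred.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # An empty path and "/" right after the hostname point to the same
    # resource, so both become "/". Deeper paths are left untouched because
    # a trailing slash there can identify a different URL.
    path = parts.path or "/"
    return urlunsplit(("https", host, path, parts.query, parts.fragment))


variants = [
    "http://www.example.com",
    "http://example.com/",
    "https://www.example.com/",
    "https://example.com",
]
# All four homepage variants collapse into a single preferred URL.
print({preferred_url(url) for url in variants})
```

Note that the sketch deliberately keeps example.com/page and example.com/page/ apart. Which of the two you keep is your decision, and the other should redirect to it.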
2. Session IDs
Ecommerce sites allow visitors to store items in their cart while they browse other pages. This data is stored in a session ID, which is unique to each user. The session ID is usually added to the URL to create a new URL, which customers can use to access the site. Unfortunately, web crawlers identify these individual URLs as duplicate content.
The code appended to the URL in this example comprises the session ID:
http://yourdomain.com/index.jsp;jsessionid=07D3CCD4
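As a small illustration, the Python sketch below removes a session ID of this kind so that every visitor’s URL collapses back to the same address. It assumes the ID always appears as a ";jsessionid=" path parameter, as in the example above. The function name strip_session_id is hypothetical, and other platforms may carry the session ID in a different place.

```python
import re

# Matches a ";jsessionid=..." path parameter up to the next URL delimiter.
SESSION_ID_PATTERN = re.compile(r";jsessionid=[^/?#;]*", re.IGNORECASE)


def strip_session_id(url: str) -> str:
    """Return the URL without its ;jsessionid path parameter."""
    return SESSION_ID_PATTERN.sub("", url)


print(strip_session_id("http://yourdomain.com/index.jsp;jsessionid=07D3CCD4"))
```

A more durable fix is to keep session data in cookies, so the session ID never reaches the URL in the first place.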
3. Mobile, AMP and Printer-Friendly Versions
While the format changes, the content is the same. Thus, mobile, Accelerated Mobile Pages (AMP) and printer-friendly versions are considered duplicate copies.
• URL – example.com/page
• Mobile URL – m.example.com/page
• AMP URL – example.com/amp/page
• Printer-friendly URL – example.com/print/page
Note: With the inclusion of these seemingly harmless versions, you might be questioning how much duplicate content is acceptable. It’s not that easy to tell. To provide an accurate answer, professionals offering website content writing services must first perform a thorough site audit. At the same time, any content writing company worth its salt will not tolerate leaving these URLs lying around just because Google doesn’t serve a duplicate content penalty.
4. Comment Pagination
Depending on your content management system (CMS), pagination can be implemented to spread comments across multiple pages. However, this practice results in more duplicate content on the same website as the article URL adds one comments page after another.
example.com/post/
example.com/post/comment-page-2
example.com/post/comment-page-3
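Here is a short Python sketch showing how the paginated comment URLs above could be traced back to the article they belong to, for example when auditing a crawl export. It assumes the "comment-page-<number>" pattern from the example. Your CMS may use a different pattern, and the function name article_url is ours.

```python
import re

# Assumption: paginated comment URLs end in "comment-page-<number>",
# optionally followed by a slash, as in the example above.
COMMENT_PAGE_PATTERN = re.compile(r"comment-page-\d+/?$")


def article_url(url: str) -> str:
    """Return the article URL that a comment-page URL belongs to."""
    return COMMENT_PAGE_PATTERN.sub("", url)


crawled = [
    "example.com/post/",
    "example.com/post/comment-page-2",
    "example.com/post/comment-page-3",
]
# All three crawled URLs resolve to the same article.
print({article_url(url) for url in crawled})
```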
5. Localization
Multiple-location businesses sometimes find it hard to produce unique content and resort to using templates for the locales they serve. This leads to instances of near-duplicates or exact-match copy on their websites.
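To get a feel for how close templated pages can be, the Python sketch below compares texts by the share of three-word sequences (shingles) they have in common. This is a generic similarity measure chosen for illustration, not how Google evaluates duplicates, and the sample page copy is made up.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Return shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    union = set_a | set_b
    if not union:
        return 0.0  # not enough text to compare
    return len(set_a & set_b) / len(union)


# Hypothetical page copy: two templated city pages and one rewritten page.
austin = "Our plumbers in Austin offer fast, reliable repairs at fair prices. Call our Austin team today."
dallas = "Our plumbers in Dallas offer fast, reliable repairs at fair prices. Call our Dallas team today."
dallas_unique = "Hard water is common in Dallas, so we inspect water heaters for scale buildup during every visit."

print(jaccard_similarity(austin, dallas))
print(jaccard_similarity(austin, dallas_unique))
```

The templated pair scores far higher than the rewritten page, which is the gap that unique local details are meant to create.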
6. Scrapers
Everything else discussed so far involves internal duplicate content issues. But scraped content is an external problem that some web owners must deal with promptly because of the extent of its impact. It pertains not just to duplicates but also to third-party actors stealing content from your site and repurposing it. In some cases, the text is copied word for word.
Note: It’s also vital that you craft authentic content and take care not to violate Google’s quality standards or established rules against plagiarism. You can search Google for duplicate content checker options for guidance.
Alternatively, you may read this article enumerating the Top Duplicate Content Checkers For Website Content to find expert-recommended tools you can add to your arsenal.
How to Successfully Avoid Duplicate Content
It won’t be easy reversing the damage caused by duplicate content without a solid grasp of technical SEO. So it’s best to avoid duplicate content issues altogether. Knowing how much duplicate content is acceptable doesn’t change the fact that dupes are bad for SEO, so you had better get rid of them.
However, searching for and fixing dupes is a time-consuming process. So the next step is to have more experienced hands take charge. A reputable content writing company that provides SEO content writing services, such as Thrive, can do wonders for your SEO efforts. We can cut down the turnaround time, clean up your content without skipping a step and integrate technical SEO, local SEO and other solutions to future-proof your content and optimization strategy.
What Google Has to Say About Duplicates
Again, Google does not impose a duplicate content penalty because most instances do not have a deceptive origin. On their own, duplicates are not grounds for manual actions, site or page bans, or ranking demotions. However, Google favors sites that offer distinct information and filters duplicate URLs out of its results, which keeps them from ranking well.
So the takeaway is that site owners and webmasters should avoid producing identical content that could hurt their SEO.
Practical Tips to Prevent Duplicate Content
Instead of thinking about how much duplicate content is acceptable, channel your time and resources into preventing duplicates from ruining your SEO efforts. Having access to a website duplicate content checker when you write is a good place to start.
To take your content creation to the next level, you must apply the following tactics or have a content writing agency handle them for you:
• Canonical URLs. When Google finds duplicates on your site, it determines the best version based on ranking signals and crawls that version more often. By setting a canonical URL, you can consolidate link signals and tell Google which version you prefer. You can declare it in your sitemap or by sending a rel="canonical" HTTP header in your page response.
You can also add a rel=canonical <link> tag to all duplicate pages, pointing them to the canonical page.
Check out Google’s canonicalization guidelines here.
• 301 redirects. This permanent redirect forwards a good deal of the ranking signals to the original URL, allowing it to gain traction in search results.
• URL parameter handling. This set of canonicalization options tells Google which parameters to exclude rather than include. Each option will benefit your search performance differently, but in general, you can improve crawl efficiency and ranking value. Consulting a content writing company specializing in website content writing services can start you off on the right track.
• rel="alternate". For localized pages that have versions for different languages or regions, you can use rel="alternate" hreflang annotations to let Google know when to use the alternate pages.
• Website duplicate content checker. Leverage this automated tool to keep track of scrapers trying to mooch off your link juice. But do remember that not all Google duplicate content checker services are created equal.
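As a rough illustration of the canonical and alternate tactics above, the Python sketch below builds the markup and header you would add to your pages. The helper names and example URLs are hypothetical. How the output reaches your templates or server configuration depends on your CMS.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Build the <link> tag to place in the <head> of each duplicate page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


def canonical_link_header(canonical_url: str) -> str:
    """Build the HTTP header equivalent, useful for non-HTML files such as PDFs."""
    return f'Link: <{canonical_url}>; rel="canonical"'


def hreflang_link_tags(alternates: dict[str, str]) -> list[str]:
    """Build one rel="alternate" tag per language or region version."""
    return [
        f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
        f'href="{escape(url, quote=True)}">'
        for lang, url in alternates.items()
    ]


print(canonical_link_tag("https://example.com/page"))
print(canonical_link_header("https://example.com/page"))
for tag in hreflang_link_tags({
    "en-us": "https://example.com/page",
    "es-mx": "https://example.com/mx/page",
}):
    print(tag)
```

Every duplicate should point to the same canonical URL. Mixed signals, such as a canonical tag pointing one way and a 301 redirect pointing another, make it harder for Google to honor your preference.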
Avoid Duplicates and Future-Proof Your Content Optimization Strategy
We have tackled the question of what duplicate content is and how you can manage it without harming your content and SEO strategy. We have also dished out actionable tips on how to avoid producing duplicates altogether, from using a Google duplicate content checker to enlisting SEO content writing services.
Removing duplicate content correctly is essential to improving the search rankings of the pages you prefer to show your target audience. This includes communicating your preference to Google through canonicalization methods. If you need help in nailing the nitty-gritty of search fundamentals, get it from someone who prioritizes quality over quantity.
Thrive is a content writing agency offering website content writing services integrated with on-page and off-page optimization to ensure brands are compliant with Google’s webmaster guidelines. Start a conversation with us by filling out this form or calling 866-908-4748.