Hiding duplicate content from your site via robots.txt
Many blogs, including this one, generate duplicate content. For example, the archive pages duplicate the content of individual posts; they just present it differently (a couple of posts per page, as opposed to a single post per page).
That unfortunately clogs search engines. Perfectionist that I am, I want a search for e.g. "15minutes" (my simple timer application) to lead people to the individual blog posts about it, not to aggregate pages with random other content.
My robots.txt tells crawlers to skip those pages:
User-agent: *
Disallow: /page/
Disallow: /tag/
Disallow: /notes/
In my particular case, archive pages all start with
/tag/ is another namespace with duplicate content (each page shows a list of articles with a given tag).
For this technique to work, the URLs of duplicate pages have to follow a pattern, but that's easy enough to ensure, especially if you write your own blog software, like I do.
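One way to sanity-check rules like these before deploying them is to feed them to a robots.txt parser and ask which URLs a crawler may fetch. The sketch below uses Python's standard `urllib.robotparser`. The helper name `is_crawlable` and the sample paths (`/page/2`, `/tag/go`, `/article/example.html`) are made up for illustration and are not necessarily real URLs on this site.

```python
from urllib.robotparser import RobotFileParser

# The rules from this post, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page/
Disallow: /tag/
Disallow: /notes/
"""


def is_crawlable(path: str, user_agent: str = "*") -> bool:
    """Return True if robots.txt lets `user_agent` fetch `path`.

    Hypothetical helper for illustration: it parses the in-memory
    ROBOTS_TXT string instead of downloading a live robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths are assumptions, chosen to hit each rule once.
    for path in ["/page/2", "/tag/go", "/notes/", "/article/example.html"]:
        verdict = "allowed" if is_crawlable(path) else "blocked"
        print(f"{path}: {verdict}")
```

Disallow rules match by prefix, which is why the pattern requirement matters: any path that begins with `/page/`, `/tag/` or `/notes/` is blocked, and everything else stays crawlable.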