Duplicate content with sh404sef

kensay
Joomla! Apprentice
Posts: 42
Joined: Tue Jul 28, 2009 4:06 pm

Duplicate content with sh404sef

Post by kensay » Mon Jan 18, 2010 12:38 pm

Hi guys

I have installed the sh404sef component. My problem is that I get loads of different URLs pointing to the same page, and it's very confusing to me how to avoid this. I've blocked many of them with the robots.txt file so that only the SEF-friendly URLs get indexed. But take the following example:

http://www.azurex.dk/Referencer.html
http://www.azurex.dk/referencer.html

Both have the exact same content.
I originally used http://www.azurex.dk/Referencer.html, but then I changed the URL to lower case and purged all URLs. The link with the upper-case R still works, though. How do I avoid this using my current Joomla components?

Regards Mikael

Leftfield
Joomla! Virtuoso
Posts: 4439
Joined: Fri Dec 08, 2006 3:33 am

Re: Duplicate content with sh404sef

Post by Leftfield » Mon Jan 18, 2010 12:48 pm

Don't put it in the sitemap and don't link to it from anywhere :). That's all. If it is linked from somewhere, make a 301 redirect in .htaccess or a custom redirect in sh404sef.
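
For the single-URL case, a 301 redirect in .htaccess could look like the sketch below. It assumes Apache with mod_rewrite enabled and that the rule sits above the SEF rewrite rules in Joomla's .htaccess; the path is the example URL from the first post.

```apache
# Sketch only: assumes mod_rewrite is available and that this comes
# before Joomla's own SEF rewrite rules in the site-root .htaccess.
RewriteEngine On

# Matching is case-sensitive by default (no [NC] flag), so only the
# old mixed-case URL is redirected to the lowercase one.
RewriteRule ^Referencer\.html$ /referencer.html [R=301,L]
```

Each rule like this covers one known variant, so it does not scale to every possible capitalization.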
SEO & SEM Manager https://vujosevic.com/

kensay
Joomla! Apprentice
Posts: 42
Joined: Tue Jul 28, 2009 4:06 pm

Re: Duplicate content with sh404sef

Post by kensay » Mon Jan 18, 2010 1:01 pm

I have discovered that any capitalization of the URL gives the same result:

ReFErenCER
referenCER

etc. Making a canonical tag or a 301 redirect for all of these versions seems impossible. Isn't there a feature that converts all URLs to lower case automatically? There is such a function in sh404sef, which I've tested, but with no luck.

Leftfield
Joomla! Virtuoso
Posts: 4439
Joined: Fri Dec 08, 2006 3:33 am

Re: Duplicate content with sh404sef

Post by Leftfield » Mon Jan 18, 2010 1:07 pm

Leftfield wrote: Don't put it in the sitemap and don't link that kind of URL from anywhere :). That's all.
You can relax.

kensay
Joomla! Apprentice
Posts: 42
Joined: Tue Jul 28, 2009 4:06 pm

Re: Duplicate content with sh404sef

Post by kensay » Mon Jan 18, 2010 1:12 pm

OK, great, but could you explain to me why I can relax?

As I see it, I have the potential for a massive duplicate content problem if, say, someone links to me with an uppercase letter. Then I suppose Google will index that URL and BOOM :(

Regards Mikael

Leftfield
Joomla! Virtuoso
Posts: 4439
Joined: Fri Dec 08, 2006 3:33 am

Re: Duplicate content with sh404sef

Post by Leftfield » Mon Jan 18, 2010 2:06 pm

IMHO, as far as the SERPs are concerned, there will be no boom. Anyway, this is not duplicate content.

wmalcom
Joomla! Apprentice
Posts: 12
Joined: Wed Apr 21, 2010 7:48 pm
Location: Dallas, TX

Re: Duplicate content with sh404sef

Post by wmalcom » Fri Jul 02, 2010 7:20 pm

You must be on a Windows server. Windows servers do not distinguish between uppercase and lowercase in file/directory names; Linux servers do. Don't worry though, this is not considered duplicate content. :D

fmmarzoa
Joomla! Enthusiast
Posts: 123
Joined: Sun Nov 11, 2007 1:15 pm
Location: Spain

Re: Duplicate content with sh404sef

Post by fmmarzoa » Sun Apr 01, 2012 7:07 pm

Hi,

I am facing the same problem now. My site has had this problem for a long time, but I only noticed it today because I was working on other projects.

It clearly seems like a sh404sef bug to me: it serves content for SEF URLs that are in fact NOT DEFINED anywhere, instead of returning a 404 error.

It is not a problem of the underlying OS and has nothing to do with it: it is sh404sef that receives the URL and should handle it. That is pretty clear, but in any case my site runs on an Ubuntu Linux distribution with Apache, so it is also empirically tested.

You are right about that: it IS a potential duplicate content issue, since even if you do not use those invalid links yourself, one mistake in a single backlink is enough for Google to find them. I have exactly that problem with one of those URLs: in my site stats I have found visits that use the uppercase form and others that use the lowercase one.

Anyway, my site is on an old J!1.5 (this post is old too) and I am planning to migrate it to J!1.7, or maybe even to a custom CMS that I am using for other pages.

In the meantime I am thinking of a workaround: forcing all URLs to lowercase with mod_rewrite in .htaccess, so that Apache's mod_rewrite takes care of it before the silly sh404sef gets the URL.

It does not seem to be very difficult:

http://www.chrisabernethy.com/force-low ... d_rewrite/
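
A minimal sketch of that idea follows, under stated assumptions rather than as a tested setup. One caveat: the `RewriteMap` directive is only allowed in the server or virtual host configuration, not in .htaccess, so this version assumes you can edit the virtual host. The map name `lc` is an arbitrary label chosen here.

```apache
# Sketch only: goes in the <VirtualHost> block, not in .htaccess,
# because RewriteMap is not permitted in per-directory context.
RewriteEngine On

# "lc" is an arbitrary name bound to Apache's built-in lowercase function.
RewriteMap lc int:tolower

# Only act when the requested path contains an uppercase letter...
RewriteCond %{REQUEST_URI} [A-Z]
# ...and does not point at a real file (e.g. images with capitals in the name).
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
# Send a permanent redirect to the lowercased path.
RewriteRule ^(.*)$ ${lc:$1} [R=301,L]
```

Because the rule issues a 301 instead of rewriting internally, any capitalization of a URL ends up at the single lowercase address before sh404sef sees the request.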

This is probably not useful for you so long after your post, but I leave it here for the record. It may be useful for someone later, even for myself.

Best regards,

