Google Duplicate Content Problems - 16 URLS for 1 content item

serrbiz
Joomla! Enthusiast
Posts: 231
Joined: Mon Sep 18, 2006 3:48 pm
Location: Dallas, TX

Google Duplicate Content Problems - 16 URLS for 1 content item

Post by serrbiz » Sat Jun 30, 2007 12:31 am

There are a couple of posts floating about regarding Google and Joomla duplicate content.  There has been no proper resolution (that I've found), nor, unfortunately, will this post be one.

However, I want to address what I consider to be a misinformed opinion in this post: http://forum.joomla.org/index.php/topic,181605.0.html

Here, a member claims that Google is smarter than we think and the duplicate content issue is not really a problem.

I disagree.

Considering Google has problems with http://www.domain.com vs domain.com, how smart is Google when it comes to Joomla's duplicate content?

If you combine Google's www vs non-www issue with Joomla's duplicate content problems, we have just raised this issue to the second power.
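
As a rough sketch of how the two issues compound, the snippet below pairs the two hostnames mentioned above with a few internal URLs for one article. The three paths are only a sample, and strictly speaking the count multiplies rather than squares, but the effect on what a crawler sees is the same.

```python
from itertools import product

# The two hostnames from the post; both serve the same site.
hosts = ["www.domain.com", "domain.com"]

# A sample of internal Joomla URLs that all reach one content item.
paths = [
    "index.php?option=com_content&task=view&id=73",
    "index.php?option=com_content&task=view&id=73&Itemid=1",
    "index.php?option=com_content&task=view&id=73&Itemid=24",
]

# Every host/path pair is a distinct address as far as a crawler is concerned.
variants = [f"http://{host}/{path}" for host, path in product(hosts, paths)]
print(len(variants), "crawlable addresses for one content item")
```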

Now, more specifically about Google's opinion of duplicate content, read this from Google's white papers:

"Duplicate content generally refers to substantive blocks of content WITHIN or across domains that either completely match other content or are appreciably similar."

To Google's credit, they try to forgive non-manipulative duplicate content when it comes to blogs and CMSs, as they said here:

http://googlewebmastercentral.[URL banned]. ... ntent.html

They also make suggestions on how to deal with it: using noindex tags, sitemaps, etc.

Though this helps Google determine which page is the one you want indexed, Joomla's duplicate content issue is throwing roadblocks in front of Google.
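
One way to read the noindex suggestion, as a minimal sketch: pick one preferred query string per content item and mark every other variant noindex. The function name `robots_meta` and the choice of preferred variant are assumptions for illustration; this is not code from Joomla or from Google's guidelines.

```python
def robots_meta(requested_query: str, preferred_query: str) -> str:
    """Return a robots meta tag that indexes only the preferred variant."""
    if requested_query == preferred_query:
        return '<meta name="robots" content="index, follow">'
    # Any other variant stays crawlable but is kept out of the index.
    return '<meta name="robots" content="noindex, follow">'


# Assumed preferred variant for content item 73 (hypothetical choice).
preferred = "option=com_content&task=view&id=73&Itemid=24"

print(robots_meta(preferred, preferred))
print(robots_meta("option=com_content&task=view&id=73&Itemid=27", preferred))
```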

I consider this a SERIOUS problem with Joomla, especially when combining Joomla with 3rd party mambots and components.  On one site with 3 extra components and a site map mambot, Joomla produced 16 URLs for 1 content item!!! See the note below.

OpenSEF and Jpromoter can help with manual URL redirects and aliases. However, it's a SLOW and LABORIOUS process.  You can add half an hour of manual adjustments for EVERY page you create. 

This defeats the purpose of a CMS.

Don't get me wrong.  I think Joomla is a great CMS for its ease of use and scalability. Manual URL rewrites and SEO adjustments aside, I still use Joomla for my own site and clients' sites, but it's a problem I have only just discovered in depth. This needs to be resolved at a core system level, not with clunky components and extensive manual intervention.

While we wait for the Joomla team to fix this problem, I encourage all Joomla devotees concerned with SEO and organic ranking to continue this discussion and post their findings here.  Maybe we can influence the development team to address this problem sooner rather than later.


-- 16 URLs for one content item using Joomla --

index.php?option=com_content&task=view&id=73
index.php?option=com_content&task=view&id=73&Itemid=1
index.php?option=com_content&task=view&id=73&Itemid=24
index.php?option=com_content&task=view&id=73&Itemid=27
index.php?option=com_content&task=view&id=73&Itemid=3
index.php?option=com_content&task=view&id=73&Itemid=30
index.php?option=com_content&task=view&id=73&Itemid=31
index.php?option=com_content&task=view&id=73&Itemid=33
index.php?option=com_content&task=view&id=73&Itemid=34
index.php?option=com_content&task=view&id=73&Itemid=41
index.php?option=com_content&task=view&id=73&Itemid=49
index.php?option=com_content&task=view&id=73&Itemid=56
index.php?option=com_content&task=view&id=73&Itemid=57
index.php?option=com_content&task=view&id=73&Itemid=59
index.php?option=com_content&task=view&id=73&Itemid=9
index.php?option=com_content&task=view&id=73&Itemid=99999999

These URLs were produced using Joomla 1.0.12. Add-ons include:

Social Bookmarker Mambot
RD Site Map
DS Syndicate
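
To make the pattern in the list above explicit, here is a minimal sketch that rebuilds the 16 URLs and strips the Itemid parameter from each. The helper name `canonical_key` is hypothetical; it only illustrates that all 16 addresses differ in nothing but Itemid.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def canonical_key(url: str) -> str:
    """Drop Itemid so every menu-context variant maps to the same key."""
    query = parse_qsl(urlsplit(url).query)
    kept = [(name, value) for name, value in query if name != "Itemid"]
    return urlencode(kept)


# The Itemid values from the list above.
itemids = [1, 24, 27, 3, 30, 31, 33, 34, 41, 49, 56, 57, 59, 9, 99999999]
base = "index.php?option=com_content&task=view&id=73"
urls = [base] + [f"{base}&Itemid={itemid}" for itemid in itemids]

keys = {canonical_key(url) for url in urls}
print(len(urls), "URLs ->", len(keys), "content item")
```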

ranwilli
Joomla! Master
Posts: 19203
Joined: Sun Feb 19, 2006 6:47 pm
Location: Toledo, OH

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by ranwilli » Sat Jun 30, 2007 3:38 am

Are you aware of what the Itemids are? I suspect you have a BUNCH of menus made up that you're not using, or somehow have 16 menu entries pointing to that article ID. Are they perhaps in the trash?

Also you may want to look carefully at this extension - at least read the article at the developer's site for some more thorough understanding of the issues:
http://extensions.joomla.org/component/ ... Itemid,35/

Best of Luck!
Don't HACK the Joomla! core, Instead "Extend" and/or "Override."
Stay ON the update path.
https://harpervance.com

serrbiz
Joomla! Enthusiast
Posts: 231
Joined: Mon Sep 18, 2006 3:48 pm
Location: Dallas, TX

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by serrbiz » Sat Jun 30, 2007 1:59 pm

Thanks, but the link goes to SEF Patch, which only addresses metadata issues.  It does not resolve the duplicate ID issue.

I found this link about the new Joomla 1.5, but it doesn't help for Joomla 1.0.12.  Also, I don't want to start using 1.5 until it's stable and there are components for it.

http://www.joomla.org/component/option, ... ,33/p,286/

ranwilli
Joomla! Master
Posts: 19203
Joined: Sun Feb 19, 2006 6:47 pm
Location: Toledo, OH

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by ranwilli » Sat Jun 30, 2007 2:50 pm

The point is: those are NOT duplicate articles; the article ID is always 73. You are calling out 16 distinct menu listings (Itemid=x). Did you eliminate unused menu links?

kenmcd
Joomla! Champion
Posts: 5672
Joined: Thu Aug 18, 2005 2:09 am
Location: California

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by kenmcd » Sat Jun 30, 2007 7:16 pm

Large numbers of ItemIDs usually come from a misbehaving module.
I have seen related-links, popular, and other modules which create a new link ItemID for every page upon which the module appears.
This can quickly multiply the number of ItemIDs pointing to one content item.

Some more advanced modules enable users to set the ItemID to be used thus avoiding the multiples.

Check your pages to see where these links appear to find the source.
██ LibreTraining

DeanMarshall
Joomla! Hero
Posts: 2352
Joined: Fri Aug 19, 2005 2:26 am
Location: Lancaster, Lancashire, United Kingdom

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by DeanMarshall » Sat Jun 30, 2007 7:32 pm

Is that misbehaving, Ken - or is that the way we are supposed to make modules nowadays?
Wasn't the propagation of the current page's Itemid the biggest change in J 1.0.12?


Dean
Dean Marshall Consultancy - six Joomla experts - http://www.deanmarshall.co.uk/

Joomla Experts - Joomla Support http://www.deanmarshall.co.uk/joomla-se ... pport.html

kenmcd
Joomla! Champion
Posts: 5672
Joined: Thu Aug 18, 2005 2:09 am
Location: California

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by kenmcd » Sat Jun 30, 2007 9:20 pm

DeanMarshall wrote:
Is that misbehaving, Ken - or is that the way we are supposed to make modules nowadays?
Wasn't the propagation of the current page's Itemid the biggest change in J 1.0.12?

Dean
Dean,

Not an attack. Please relax.
Users usually blame Joomla or the SEF component for multiple URLs.
More often the source is components or modules they have installed.
That is where it needs to be addressed to "fix" the issue.
For sites without an advanced SEF component the issue often does not arise because it simply is not brought to their attention as it is with an advanced SEF component.
No advanced SEF component can "fix" this issue.
And I mean "fix" from the user's perspective.

As far as "misbehaving" - my apologies if you have an issue with that word.
That is again from the user's perspective.
Right or wrong they see these mass multiple URLs as a problem.
And yes I know this is an ongoing Joomla issue, etc, etc. - not the problem here.
Users are trying to manage a site now with advanced SEF FURLs.

And yes, perhaps if advanced SEF were incorporated in the core, all module developers would take this issue into account when making components or modules. Obviously this particular multiple ItemID issue arises again-and-again-and-again because advanced SEF is not taken into account when the component or module is made.
The average user does not understand this . . . which is why the same questions arise over-and-over.
"Misbehaving" is just a matter of perspective.
The solution many times is to simply get rid of the source.

I have seen some modules which are designed to deal with the ItemID issues.
The developer took this issue into account.

Sorry if you do not like the word.
How about this instead?  -  a multiple-ItemID-producing component or module

"Joomla user, find the multiple-ItemID-producing component or module and delete it, and your issue may be solved"

KM

AmyStephen
Joomla! Champion
Posts: 7018
Joined: Wed Nov 22, 2006 3:35 pm
Location: Nebraska

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by AmyStephen » Sat Jun 30, 2007 9:32 pm

kenmcd wrote:
Large numbers of ItemIDs usually come from a misbehaving module.
I have seen related-links, popular, and other modules which create a new link ItemID for every page upon which the module appears.
This can quickly multiply the number of ItemIDs pointing to one content item.

Some more advanced modules enable users to set the ItemID to be used thus avoiding the multiples.

Check your pages to see where these links appear to find the source.
Ah-ha! That explains a lot - thanks!

I'll have to look at v 1.5 more closely now that I understand this issue better. I know the frontpage issue started in 1.0.12 - did the modules issue start then, too? Does anyone know what 1.0.13 does related to modules? I heard from Ken there is a switch for Frontpage.

Again, thanks. That really helped.
Amy :)

kenmcd
Joomla! Champion
Posts: 5672
Joined: Thu Aug 18, 2005 2:09 am
Location: California

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by kenmcd » Sat Jun 30, 2007 9:56 pm

AmyStephen wrote: I know the frontpage issue started in 1.0.12 - did the modules issue start then, too? Does anyone know what 1.0.13 does related to modules? I heard from Ken there is a switch for Frontpage.

Again, thanks. That really helped.
Amy :)
No, this issue did not start with J 1.0.12.
I found this when looking at various sites for OpenSEF users.
OpenSEF users asked why OpenSEF was creating so many multiple FURLs.
Answer: OpenSEF is not creating the multiple URLs - Joomla, or a component, or a module is creating the multiple ItemIDs.
OpenSEF and other advanced SEF components simply make it easy to see that this is happening and focus attention on the issue.

The current J 1.0.13 RC2 from the SVN merely allows one to switch the behavior
- J 1.0.11 or earlier ItemID behavior
- J 1.0.12 or later ItemID behavior

Not sure how or if this affects the other modules.
The core Popular Links module would be one to test.
It did exhibit this multiple ItemID behavior in the past.
Do not remember/know if this has been fixed/changed/modified in current Joomla 1.0.12.

Testing - make the Popular module appear on multiple pages and then look at the links.
If it still works like it did, you will see different ItemIDs for the same content item depending on which page the module appears on.
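
A minimal sketch of that check, assuming you have already collected the link hrefs from each page where the module appears. The sample links are made up and the helper name `itemids_by_article` is hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit


def itemids_by_article(links):
    """Map each content id to the set of Itemid values seen in links to it."""
    seen = defaultdict(set)
    for link in links:
        params = parse_qs(urlsplit(link).query)
        if params.get("option") != ["com_content"] or "id" not in params:
            continue  # not a link to a content item
        seen[params["id"][0]].add(params.get("Itemid", ["(none)"])[0])
    return dict(seen)


# Hypothetical links collected from the same module on different pages.
links = [
    "index.php?option=com_content&task=view&id=73&Itemid=24",
    "index.php?option=com_content&task=view&id=73&Itemid=27",
    "index.php?option=com_content&task=view&id=73&Itemid=24",
    "index.php?option=com_content&task=view&id=80&Itemid=27",
]

for article, ids in itemids_by_article(links).items():
    if len(ids) > 1:
        print(f"content item {article} is linked with Itemids {sorted(ids)}")
```

Any content item reported here is being linked under more than one ItemID, which points at the module (or menu setup) producing those links.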

As I said before, there were other 3PD modules which did this too.

The workaround in OpenSEF is to un-check the used button for the undesired internal URLs.
The desired internal URL (the one left) will be the one used.

I know this also affects sites not using any advanced SEF component.

AmyStephen
Joomla! Champion
Posts: 7018
Joined: Wed Nov 22, 2006 3:35 pm
Location: Nebraska

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by AmyStephen » Sun Jul 01, 2007 3:03 am

kenmcd wrote: Testing - make the Popular module appear on multiple pages and then look at the links.
OK, I gave it a test - put the Popular module on every page in a J! v 1.5 beta 2 install (SVN updated.)

It looks good. We have *the same* ItemID for articles regardless of the "page" the Popular module is on. (Of course, with SEF URLs *on*, we're good, too.)

Good work to the core devs!

Tried checking out the Frontpage issue - but there appears to be a linking error on the non-frontpage "read more" links in the blog categories. (It also appears that the URL might be different.)

I'm heading on vacation for a while - but I am glad I stumbled on this conversation. I didn't understand the module issue until reading Ken's post. I am glad to see that resolved in v 1.5.

Thanks, again.
Amy :)

kenmcd
Joomla! Champion
Posts: 5672
Joined: Thu Aug 18, 2005 2:09 am
Location: California

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by kenmcd » Sun Jul 01, 2007 4:02 am

Keep in mind J 1.5 is the same as J 1.0.12+.
This will have the same behavior so the issues discussed in "that long thread" will apply.
Perhaps that change also affects the modules.

Have to test other modules.

AmyStephen
Joomla! Champion
Posts: 7018
Joined: Wed Nov 22, 2006 3:35 pm
Location: Nebraska

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by AmyStephen » Sun Jul 01, 2007 10:27 am

One more comment on this before vacation...and Ken - this isn't "for you" - you are an expert at ItemID. Just kind of thinking out loud.

The thing is this - Joomla! is not a simple blogger.

In WP, a blog entry has a permanent URL. ALL blog pages look the same.

In Joomla!, the article is but one piece of the page - it can be "decorated" using any combination of modules. It can have an entirely unique template. J! is so flexible, it *allows you* to put the same article into many sections of your site. So, the announcement made by the company president can be shared in the President's blog *and* the product offering sections of the site.

If the article is purposefully located in different places, it must have a different URL.

For that reason, when an article appears on the frontpage, it is not necessarily obvious where the "read more" link should "drill down" to. Given that possible ambiguity, frontpage articles are now logically given another unique URL. I get why this was decided; that is my point.

I also get why this causes people heartache in the day of trackbacks and Digg and delicious bookmarking. (Let alone Google penalties - if existing - that's debated by some - I have no idea.)

Back to the original question, you *can* configure Joomla! to use one URL per article. You can configure Joomla! to behave very simply - just like WP. You don't have to have multiple URLs per article (module issue previously discussed, aside.)

It's important to understand that and to knowingly make your choices. If you must have a permanent, unique URL per article, do not put the article in multiple places on your site. Remove the print, email and pdf capability. And, use the pre-v 1.0.12 ItemID for frontpage logic.

I realize v 1.5 is like v 1.0.12 for the Frontpage. I believe the Frontpage concept still needs work - I understand why the devs are doing what they are doing. I understand the problem is not easy because of J!'s flexibility. But, IMO, it's not there, yet. I keep thinking we need a default "read more" option at the article level. Or, the article page needs to be like WP - where it looks the same for every article. Regardless, we need a Frontpage *and* we need a permanent URL per article. Somehow. 

Flexibility always spells trouble. Simple is simple. The beauty and the curse of Joomla!, I guess.
Amy :)

DeanMarshall
Joomla! Hero
Posts: 2352
Joined: Fri Aug 19, 2005 2:26 am
Location: Lancaster, Lancashire, United Kingdom

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by DeanMarshall » Sun Jul 01, 2007 10:30 am

kenmcd wrote:
Dean,

Not an attack. Please relax.
I'm on your side mate - sorry if it sounded like truculence, I wasn't 'defending' anything - merely making the observation that this is apparently 'intended behaviour'.

As I mentioned here: http://www.joomla.org/component/option, ... 105/p,240/
I have my own 'issues' with the behaviour.

Cheers

Dean

serrbiz
Joomla! Enthusiast
Posts: 231
Joined: Mon Sep 18, 2006 3:48 pm
Location: Dallas, TX

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by serrbiz » Mon Jul 02, 2007 10:47 am

Good stuff all.  Thanks for the feedback.

Yes, I have removed unused menus, and played with OpenSEF for using and not using ItemID numbers etc.  I've got the system down to 2-3 URLs per content item and am having my programmers look at the issue in depth to see what we can do...

If we come up with anything, we'll post it here.

Thanks.

alledia
Joomla! Ace
Posts: 1070
Joined: Tue Jul 18, 2006 3:55 pm

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by alledia » Tue Jul 03, 2007 4:03 pm

Use a SEF URL component with automapping and your problem is solved.

http://googlewebmastercentral.[URL banned]. ... ntent.html

This provides a list of ways to deal with duplicate content, and includes the warning: "Understand your CMS"
Joomla extensions and templates: http://Joomlashack.com

serrbiz
Joomla! Enthusiast
Posts: 231
Joined: Mon Sep 18, 2006 3:48 pm
Location: Dallas, TX

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by serrbiz » Tue Jul 03, 2007 4:13 pm

The automapping feature is actually a problem with OpenSEF, as it automaps all the duplicate URLs mentioned above.

As I mentioned, the big culprits are the frontpage module and other modules such as latest news, etc.

One way to reduce the duplicate content is to make multiple "latest news" modules and make each target only a specific category.

While waiting for 1.5 to be finished and stable, Serr.biz is working on its own SEF solution, which we'll make available when done. Stay tuned.

alledia
Joomla! Ace
Posts: 1070
Joined: Tue Jul 18, 2006 3:55 pm

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by alledia » Tue Jul 03, 2007 4:25 pm

True - that's one of the (remarkably few) flaws with OpenSEF, but sh404SEF and SEF Advance both do a great job with automapping.

kenmcd
Joomla! Champion
Posts: 5672
Joined: Thu Aug 18, 2005 2:09 am
Location: California

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by kenmcd » Tue Jul 03, 2007 6:53 pm

An advanced SEF component must decide how to deal with the Joomla duplicates.
OpenSEF or any advanced SEF component does not "create duplicates."
Joomla and add-ons create the duplicates.

JoomSEF defaulted to unique FURLs which acknowledges the different ItemID.
The FURLs were numbered 1,2,3.
Some users liked this, some did not.
This is the way Joomla 1.5 SEF works presently.
Good default because everything just works without any intervention.
But you will continue to have duplicate URLs for the same content.
The user who started this thread would have 16 FURLs for one content item in this case.
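
A minimal sketch of the "unique FURL per ItemID" default described above. The function name `numbered_furls` and the `alias-N` suffix format are assumptions for illustration, not the actual JoomSEF or Joomla 1.5 code.

```python
def numbered_furls(alias, internal_urls):
    """Give each internal URL its own friendly URL by appending a counter."""
    return {url: f"{alias}-{n}" for n, url in enumerate(internal_urls, start=1)}


# Three of the internal URLs for content item 73 (sample only).
internal = [
    "index.php?option=com_content&task=view&id=73&Itemid=1",
    "index.php?option=com_content&task=view&id=73&Itemid=24",
    "index.php?option=com_content&task=view&id=73&Itemid=27",
]

for url, furl in numbered_furls("my-article", internal).items():
    print(furl, "<-", url)
```

Nothing breaks, since every internal URL keeps a working address, but the duplicates are all still there under friendlier names.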

sh404SEF currently guesses and selects the first internal URL when there are multiple ItemIDs.
This is the first URL in the database.
This is the way Joomla works, subject to the internal link type priority.
sh404SEF does also have an entire "Itemid management" configuration to help deal with the issues.
This includes "Guess Itemid on homepage?" which attempts to automatically "fix" the issues.
This is selectable because for some users this fixes their site; for others it breaks their site.
Additionally, the ability to manually select which internal URL to use is in testing now.
This more precise control is being added due to user requests.
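
A minimal sketch of the "first internal URL wins" approach, for contrast. The name `first_wins` is hypothetical and the internal link type priority mentioned above is ignored; this is not sh404SEF's actual code.

```python
def first_wins(alias, internal_urls):
    """Map every variant to one friendly URL that resolves to the first variant."""
    outgoing = {url: alias for url in internal_urls}  # used when rendering links
    incoming = {alias: internal_urls[0]}              # used when routing a request
    return outgoing, incoming


# Three of the internal URLs for content item 73 (sample only).
internal = [
    "index.php?option=com_content&task=view&id=73&Itemid=1",
    "index.php?option=com_content&task=view&id=73&Itemid=24",
    "index.php?option=com_content&task=view&id=73&Itemid=27",
]

outgoing, incoming = first_wins("my-article", internal)
print(len(set(outgoing.values())), "friendly URL for", len(outgoing), "internal URLs")
print("requests resolve to:", incoming["my-article"])
```

The trade-off shows up in `incoming`: a visitor always lands in the menu context of the first variant, whichever page the link was on, which may be fine on one site and wrong on another.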

SEF Advance does enable you to choose to have unique URLs by adding ItemID etc.
Default is to guess which internal URL to use based on the internal link type priority.
(this was reported to be the source of Joomla's internal link type priority)
All URL rewriting previously was done on-the-fly so the multiple URLs remained hidden.
Have not seen the current version for obvious reasons.

OpenSEF was designed to enable a knowledgeable user more visible and complete control.
The Used button and Link Priority info enables specific control.
More control generally means more complexity.
Bad for average user as it is more confusing. Good for more knowledgeable users.
The various Append to URL features enable unique FURLs if desired, or needed.
All internal URLs are displayed in Manage FURLs because they exist on the user's site.
Because these URLs exist they will be used, and an SEF component must deal with it.
The only difference is whether it is visible or not visible.
Auto-mapping is selectable - for most, On is best; for others, Off is best.

ALL advanced SEF components are dealing with the multiple URLs.
Just because you do not see the multiple URLs in the database does not mean they are not there on the site.
Some web sites "just work" with the defaults in some components.
Some do not.
The ability to manage the ItemID issues is the key.

This is why different SEF components may be right for different users.
Install and go may be important.
Add-ons may be important.
Speed may be important.
Managing ItemIDs may be important.
Just depends.

Due to lack of an "official" advanced SEF component, we now have 4 different SEF extension formats.
sh404SEF shines here in that it attempts to use any SEF extension format available.
And Yannick (shumisha) has been writing new extensions at a fast pace.

There are two other advanced SEF components in development now (that I have heard of).
The fun continues!
:D

alledia
Joomla! Ace
Posts: 1070
Joined: Tue Jul 18, 2006 3:55 pm

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by alledia » Tue Jul 03, 2007 6:58 pm

Great overview Ken.

Don't forget Remosef and JPromoter too .... No wonder users are confused.

kenmcd
Joomla! Champion
Posts: 5672
Joined: Thu Aug 18, 2005 2:09 am
Location: California

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by kenmcd » Tue Jul 03, 2007 7:37 pm

alledia wrote: Great overview Ken.

Don't forget Remosef and JPromoter too .... No wonder users are confused.
RemoSEF - thought this was dormant.

JPromoter - last looked at this in v1.0.x.
At that time it only dealt with meta tags, and SEF Patch Extended was better.
Did not know it now also rewrites URLs.
Another commercial advanced SEF component.
:(

alledia
Joomla! Ace
Posts: 1070
Joined: Tue Jul 18, 2006 3:55 pm

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by alledia » Tue Jul 03, 2007 7:42 pm

JPromoter takes an interesting approach - avoids the sef_ext.php files entirely and uses .xml files for each component.

They seem quicker to write than the sef_ext.php files, but it also means there's no overlap with work done for other SEF URL components.

K123
Joomla! Apprentice
Posts: 14
Joined: Fri Jun 08, 2007 5:10 am

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by K123 » Sun Jul 08, 2007 10:51 pm

I recently began my search for SEF urls and other SEO and intend to invest my time into an ‘advanced’ solution but for now I plead ignorance…

Wouldn’t it be nice if the Joomla core included manual or semi-manual URL overrides for “pages”? For example:
http://www.mysite.com/arbitory_subgroup ... nt/content
http://www.mysite.com/blog/id=55

Of course this must become a lot more complicated when considering that there is no such thing as a page and articles can be on many “pages”…

Just my thoughts, take em or leave em. Very informative thread by the way.

serrbiz
Joomla! Enthusiast
Posts: 231
Joined: Mon Sep 18, 2006 3:48 pm
Location: Dallas, TX

Re: Google Duplicate Content Problems - 16 URLS for 1 content item

Post by serrbiz » Tue Sep 04, 2007 11:55 am

To address this issue of duplicate content, Serr.biz created a GPL component called SerrBizSEF. Check it out in the extensions directory.
http://extensions.joomla.org/component/ ... Itemid,35/
