Hmm, it may not be a "bug", but it is something that needs to be "fixed", since duplicate content can adversely affect your PageRank (although I don't think it _should_... maybe Google could be petitioned to ignore duplicate content for Joomla! sites?).
However, I understand that it may be too much to ask for 1.5.
I'm wondering what the solution might be, though. Looking at my "FAQs" page, as an example, I see that:
http://www.mikemillsconsulting.com/faqs is a menu item URL that points to my "Frequently Asked Questions" category.
http://www.mikemillsconsulting.com/faqs ... -questions is the URL for my category, but it is incorrectly pointing to "Newsflash 4". They both have an ID of 42 - this looks like a SEF bug, by the way.
http://www.mikemillsconsulting.com/faqs ... ocess is the URL for a particular FAQ entry on my site.
http://www.mikemillsconsulting.com/faqs ... ocess is the direct Article URL for that same FAQ entry (duplicate content).
It seems to me that there are at least two "problems":
1) That menu items have their own URL, instead of just using the URL of the item in question. (That's why I suggested using the "External Link" option above.)
2) That both menu items and categories are included in the URLs of articles. Articles are unique, by way of their itemID (the bug noticed above notwithstanding), and it seems to me that URLs should just point to the article, and not even mention the menu and category in the path. When you consider a future where an article could be in multiple categories, then I think that makes even more sense.
Basically, I think "how to find an article" (menu and category) should be taken out of the Article URL. An article URL should, I think, be permanent and probably not categorized (except maybe including the year and month it was published).
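To make that concrete, here is a rough sketch of what I mean by a permanent article URL. The function name `permalink`, the `slug` argument, and the foo.com host are all made up for illustration; this is not how Joomla! builds URLs today.

```python
from datetime import date

def permalink(slug: str, published: date, host: str = "foo.com") -> str:
    """Build a permanent article URL from the publish date and slug only.

    Hypothetical helper: the path contains no menu or category segments,
    so it does not change when the article is moved or re-categorized.
    """
    return f"http://{host}/{published.year:04d}/{published.month:02d}/{slug}.html"

print(permalink("sun_explodes", date(2007, 12, 1)))
```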
Hmm, on the other hand, I suppose one could make an argument that the search engines have it wrong, and should allow "duplicate" content as a way of supporting "taxonomy friendly URLs". For example:
http://foo.com/news/science/sun_explodes.html
and also
http://foo.com/news/topnews/sun_explodes.html
Otherwise, without "taxonomy friendly URLs", you might browse to "http://foo.com/news/topnews", and then select the "sun explodes" article with a permanent article URL like:
http://foo.com/2007/12/sun_explodes.html
Hmm...thinking out loud, sorry

What if, by convention, every article were to have a permanent URL (like http://foo.com/2007/12/sun_explodes.html)? A meta tag could tell search engines that this is the permanent URL, and that its "basename" is "/2007/12/sun_explodes.html". Then, what if - also by convention - search engines looked at the "basename" of URLs, and if two URLs have the same "basename", did NOT treat them as duplicate content? In other words, this would allow you to put whatever taxonomy you want (including formatting tags, like blog, etc.) in front of the basename, and it would be OK with search engines. So, "http://foo.com/science/2007/12/sun_explodes.html" would be okay and not treated as duplicate content - it has the same basename as the article's permanent URL.
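Here is a minimal sketch of the comparison a search engine would have to make under this convention. It assumes the "basename" is always the trailing /YYYY/MM/name.html part of the path, and that both URLs must be on the same host; both of those are my assumptions, and `basename_of` and `same_article` are hypothetical names.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumed basename shape: /YYYY/MM/name.html at the very end of the path.
BASENAME_RE = re.compile(r"(/\d{4}/\d{2}/[^/]+\.html)$")

def basename_of(url: str) -> Optional[str]:
    """Return the trailing /YYYY/MM/name.html part of the path, or None."""
    match = BASENAME_RE.search(urlparse(url).path)
    return match.group(1) if match else None

def same_article(url_a: str, url_b: str) -> bool:
    """True when both URLs are on one host and share a basename."""
    a, b = urlparse(url_a), urlparse(url_b)
    base_a, base_b = basename_of(url_a), basename_of(url_b)
    return a.netloc == b.netloc and base_a is not None and base_a == base_b

permanent = "http://foo.com/2007/12/sun_explodes.html"
by_topic = "http://foo.com/science/2007/12/sun_explodes.html"
print(same_article(permanent, by_topic))
```

Note that a URL like http://foo.com/news/topnews/sun_explodes.html has no basename under this rule, so it would still count as a separate page.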
Going further, this could enable searching for an article using just the URL:
http://foo.com/news/science/ - a hierarchical search URL
http://foo.com/science,sun/ - an "AND" search URL, returning articles tagged with both science and sun.
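A rough sketch of how a router could tell those two kinds of search URL apart. The rule used here (a single path segment containing commas means an "AND" search, anything else is a hierarchy) and the name `parse_search_url` are just my assumptions for illustration.

```python
from typing import List, Tuple
from urllib.parse import urlparse

def parse_search_url(url: str) -> Tuple[str, List[str]]:
    """Classify a search URL as a hierarchical or an "AND" tag search."""
    # Split the path into non-empty segments, ignoring the trailing slash.
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Assumed rule: one segment with commas is a list of tags to AND together.
    if len(segments) == 1 and "," in segments[0]:
        return ("and", [t for t in segments[0].split(",") if t])
    # Otherwise treat each segment as one level of the taxonomy.
    return ("hierarchy", segments)

print(parse_search_url("http://foo.com/news/science/"))
print(parse_search_url("http://foo.com/science,sun/"))
```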
Anyway, sorry for babbling and thinking out loud...it's just something that caught my interest.
--Mike