
Non-existent URL reachable if its first character is the ID of an article or category

Posted: Sat Jan 26, 2019 8:17 pm
by zyltar
Hi,

I'm not sure it's a bug, but I noticed that as long as a URL begins with the ID of an existing article or category, it does not trigger a 404, even if what follows does not correspond to any alias on the site.

Examples on a demo website:

https://jdemo.anw-sandbox.ch/fr/2this-a ... -not-exist
https://jdemo.anw-sandbox.ch/fr/2/this- ... e/and-more

To check that the alias really does not exist, try the same URL without the number, or with a number that is not the ID of an existing article; it then raises a 404 as expected:

https://jdemo.anw-sandbox.ch/fr/this-al ... -not-exist
https://jdemo.anw-sandbox.ch/fr/9this-a ... -not-exist
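
The four results above are consistent with a router that reads only the leading digits of the first path segment as an ID and ignores everything after them. Here is a minimal Python sketch of that idea. It is a simplified model, not Joomla's actual code: the `ARTICLES` table, its alias, the function names and the sample paths are all made up for illustration, and the `/fr/` language prefix is assumed to have been stripped already. The only facts taken from the examples are that ID 2 exists and ID 9 does not.

```python
import re

# Hypothetical article table (ID -> alias). Only ID 2 exists, as on the demo site.
ARTICLES = {2: "some-article"}


def leading_id(segment: str) -> int:
    """Return the integer formed by the leading digits of a segment, or 0."""
    match = re.match(r"\d+", segment)
    return int(match.group()) if match else 0


def resolve_lenient(path: str) -> int:
    """Return an HTTP status, looking only at the leading ID of the first segment."""
    first_segment = path.strip("/").split("/")[0]
    return 200 if leading_id(first_segment) in ARTICLES else 404


print(resolve_lenient("2anything-at-all"))     # 200
print(resolve_lenient("2/anything/and-more"))  # 200
print(resolve_lenient("anything-at-all"))      # 404
print(resolve_lenient("9anything-at-all"))     # 404
```

In this model the text after the digits is never compared with anything, which is why any suffix is accepted once the ID matches.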


This behavior is very annoying: Google reports problems on pages that do not exist, so it is impossible to fix them.

Is this the intended behavior, or is it a bug?


Thanks in advance for your help.

Re: Non-existent URL reachable if its first character is the ID of an article or category

Posted: Sun Jan 27, 2019 5:43 pm
by Alejo
Try enabling modern routing in the Articles/Options settings so URLs will not have IDs to begin with.

Re: Non-existent URL reachable if its first character is the ID of an article or category

Posted: Wed Jan 30, 2019 7:03 pm
by zyltar
Thanks a lot for your feedback, and sorry for my late reply; I missed the notification.

Enabling modern routing solves it only if I also set "Remove IDs from URLs" to Yes, and then every article URL containing an ID raises a 404, even when the ID is followed by the correct article alias.

Of course, I could probably fix that by adding some kind of rewrite rule, but I suspect it would quickly get me into trouble on websites with many third-party extensions.

Honestly, I do not understand how routing actually works in the Joomla content component. Even if a URL begins with the ID of an existing article, it should raise a 404 when what follows does not match the article alias, no? What's the point of accepting any URL, whatever it is made of, as long as it begins with an existing ID?
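
To make the expected behavior concrete, here is a minimal Python sketch of a strict check that accepts an `id-alias` segment only when the alias belongs to that ID. It is an illustration of the rule I have in mind, not how Joomla is implemented; the `ARTICLES` table, its alias and the `resolve_strict` name are hypothetical.

```python
import re

# Hypothetical article table (ID -> alias), for illustration only.
ARTICLES = {2: "some-article"}

# A valid segment is "<digits>-<alias>" and nothing else.
SEGMENT = re.compile(r"(\d+)-(.+)")


def resolve_strict(segment: str) -> int:
    """Return 200 only if both the ID and the alias match a known article."""
    match = SEGMENT.fullmatch(segment)
    if match is None:
        return 404
    article_id, alias = int(match.group(1)), match.group(2)
    return 200 if ARTICLES.get(article_id) == alias else 404


print(resolve_strict("2-some-article"))     # 200
print(resolve_strict("2-something-else"))   # 404
print(resolve_strict("2something-else"))    # 404
print(resolve_strict("9-some-article"))     # 404
```

With a rule like this, a URL that merely starts with an existing ID is rejected unless the rest of the segment is the article's own alias.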

I noticed this issue because I received a warning from Google about pages with incorrect hreflang on the website of an animal rescue park. When I browsed the errors Google found, I saw URLs such as https://www.parc-challandes.ch/fr/3-month-payday-loans, which is reachable only because there is an article with ID 3: https://www.parc-challandes.ch/fr/le-parc

That does not make any sense to me, and it may be quite a problem for SEO. I do not know why Google tries to browse such URLs, but I am pretty sure they should raise a 404 even with legacy routing.

Anyway, thanks for your help. Maybe I'll eventually find a way around this; if so, I'll come back here to share it :)

Re: Non-existent URL reachable if its first character is the ID of an article or category

Posted: Wed Jan 30, 2019 7:34 pm
by Alejo
Read my other reply in a related topic here: viewtopic.php?p=3557099#p3557116

Yes, you have to enable modern routing and remove IDs. Then by using redirect rules you can have the old URLs using IDs automatically go to the new URLs without IDs.
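
As a rough illustration of the redirect idea, here is a minimal Python sketch of an exact-match redirect table, not a ready-made Joomla or server configuration. The `REDIRECTS` table and the `redirect_for` name are hypothetical, and the old path `/fr/3-le-parc` is an assumed example of an ID-based URL; only the target `/fr/le-parc` comes from the site mentioned earlier in this thread.

```python
from typing import Optional, Tuple

# Hypothetical redirect table: old ID-based path -> new ID-free path.
REDIRECTS = {"/fr/3-le-parc": "/fr/le-parc"}


def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, target) for a known old path, or None if no rule applies."""
    target = REDIRECTS.get(path.rstrip("/"))
    return (301, target) if target else None


print(redirect_for("/fr/3-le-parc"))            # (301, '/fr/le-parc')
print(redirect_for("/fr/3-month-payday-loans")) # None
```

Because the lookup is an exact match on the whole old path, a URL that only shares the leading ID gets no redirect and can fall through to a 404.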

Re: Non-existent URL reachable if its first character is the ID of an article or category

Posted: Mon Feb 18, 2019 5:38 am
by zyltar
Once again, sorry for my late reply.

I did some testing but ran into problems on some websites with third-party extensions that do not support modern routing, at least for now.

So I had to find another solution and ended up using the "Prevent content link spamming" function of Jredirects: https://extensions.joomla.org/extension/jredirects. It does the job without forcing me to migrate all sites to modern routing.

And once I am ready for that migration, it will help me there too, as it has a function for that as well, so it's a pretty nice discovery.

Thanks for your help and have a nice day.