Multilingual: "alternate" links are ok (UTF-8) but not the "canonical" link (ASCII)

Discuss Search Engine Optimization in relation to Joomla! 4.x. This forum will also have discussions on SEF/SEO Joomla! 4.x extensions.

StoneFree
Joomla! Fledgling
Posts: 3
Joined: Mon Nov 04, 2024 10:39 pm

Post by StoneFree » Mon Nov 04, 2024 11:51 pm

Hello,

I have a live multilingual site, with English as the default language, and 5 other ones.

All links in the html <head> section of the articles are fine for the rel="alternate" hreflang="..." attributes:
<link href="https://mysite.com/url-en" rel="alternate" hreflang="en-GB">
<link href="https://mysite.com/de/url-de" rel="alternate" hreflang="de-DE">
<link href="https://mysite.com/fr/url-fr" rel="alternate" hreflang="fr-FR">
<link href="https://mysite.com/es/url-es" rel="alternate" hreflang="es-ES">
<link href="https://mysite.com/pt/url-pt" rel="alternate" hreflang="pt-BR">
<link href="https://mysite.com/zh/url-zh" rel="alternate" hreflang="zh-CN">

All above URLs (url-de, url-fr, url-es, url-pt & url-zh) are UTF-8 with the right characters (ü, è, é, á, ó, â, 更, 统, 腺, 支, etc).
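
For anyone who wants to inspect these tags outside the browser's "view source", here is a minimal Python sketch that pulls the rel="alternate" and rel="canonical" links out of a page head. The class name HeadLinkCollector and the sample markup (including its canonical line) are made up for illustration; they are not taken from the live site.

```python
from html.parser import HTMLParser


class HeadLinkCollector(HTMLParser):
    """Collect hreflang alternates and the canonical URL from <link> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}   # hreflang -> href
        self.canonical = None  # href of the rel="canonical" link, if any

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if rel == "alternate" and attributes.get("hreflang"):
            self.alternates[attributes["hreflang"]] = attributes.get("href")
        elif rel == "canonical":
            self.canonical = attributes.get("href")


# Illustrative markup only; the canonical line here is an assumption.
SAMPLE_HEAD = """<head>
<link href="https://mysite.com/url-en" rel="alternate" hreflang="en-GB">
<link href="https://mysite.com/de/url-de" rel="alternate" hreflang="de-DE">
<link href="https://mysite.com/url-en" rel="canonical">
</head>"""

collector = HeadLinkCollector()
collector.feed(SAMPLE_HEAD)
print(collector.alternates)
print(collector.canonical)
```

Feeding it the real page source instead of SAMPLE_HEAD would list every alternate next to the canonical, which makes the two easier to compare.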

The problem is with the canonical link.

Below the rel="alternate" links, the rel="canonical" link is not shown in UTF-8. For instance, in Chinese (zh):
<link href="https://mysite/zh/%E6%9B%B4%E5%A4etc" rel="canonical">
It is percent-encoded instead: plain ASCII, with each byte written as "%" followed by two hexadecimal digits.

I don't understand why the canonical link is not correct (ASCII) when the line right above is ok (UTF-8): rel="alternate" for zh shows the right characters (更, 统, 腺, 支, etc).

I have tried in vain:
- Online search: old posts, Joomla forums, Stack Overflow, github, etc
- Browsers issue (Firefox, Chrome, Edge, cookies etc)
- Joomla setup: SEF & multilingual
- MySQL: all tables seem ok, with UTF-8 encoding
- Quick search on "canonical" in Joomla's code (I'm not a pro, but I developed a component a few years ago) - no clue

Any help would be appreciated.
Thanks!

pe7er
Joomla! Master
Posts: 25287
Joined: Thu Aug 18, 2005 8:55 pm
Location: Nijmegen, Netherlands

Re: Multilingual: "alternate" links are ok (UTF-8) but not the "canonical" link (ASCII)

Post by pe7er » Tue Nov 05, 2024 7:55 am

Welcome to Joomla forum!

I have experience with multilingual websites but not with the Chinese language.

How did you configure your "Unicode Aliases"?
Check Global Configuration > Site > SEO (at the bottom) > "Unicode Aliases"
Is it enabled to allow non-ASCII characters in URLs?

No: Joomla transliterates aliases which contain non-Latin-1 characters using the transliteration engine provided by the Joomla language pack for the content's selected language, e.g. über becomes ueber for en-GB (English, Great Britain).

Yes: Joomla does not transliterate aliases; the alias is used as is, e.g. über remains as-is.
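
To make the difference between the two settings concrete, here is a rough Python sketch of the idea. It is not Joomla's code (Joomla is PHP and each language pack ships its own transliteration rules); the function make_alias and the tiny TRANSLIT table are assumptions for illustration only.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny table; real language packs define their own rules.
TRANSLIT = {"ü": "ue", "ö": "oe", "ä": "ae", "ß": "ss"}


def make_alias(title: str, unicode_aliases: bool) -> str:
    """Build a URL alias from a title, with or without transliteration."""
    text = title.strip().lower()
    if not unicode_aliases:
        # "No": map known characters, strip remaining accents, drop non-ASCII.
        text = "".join(TRANSLIT.get(ch, ch) for ch in text)
        text = unicodedata.normalize("NFKD", text)
        text = text.encode("ascii", "ignore").decode("ascii")
    # Replace each run of characters that are not letters, digits or
    # underscores (Unicode-aware) with a single hyphen.
    text = re.sub(r"\W+", "-", text)
    return text.strip("-")


print(make_alias("über", unicode_aliases=False))              # ueber
print(make_alias("über", unicode_aliases=True))               # über
print(make_alias("Über uns: Kontakt", unicode_aliases=True))  # über-uns-kontakt
```

In this sketch Chinese characters count as letters, so with unicode_aliases=True they are kept, while spaces and colons still become hyphens.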
Kind Regards,
Peter Martin, Global Moderator
Company website: https://db8.nl/en/ - Joomla specialist, Nijmegen, Netherlands
The best website: https://the-best-website.com

shumisha
Joomla! Guru
Posts: 554
Joined: Sat Aug 20, 2005 3:15 pm

Post by shumisha » Tue Nov 05, 2024 8:02 am

Hi

1 - What you see is not really important: the URL in the canonical tag is simply URL-encoded, and Google will know how to handle it

2 - Do you know what produces this canonical tag?

Joomla by default does not insert a canonical, unless you have entered a domain name in the options of the Joomla SEF system plugin. If you did that, then remove the domain name: the Joomla-generated canonical is not useful, it was not meant for "SEO" canonicalization and is not able to determine which URL is canonical.

And you don't need a canonical tag if you don't have duplicate content.
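
To illustrate point 1, here is a small Python sketch showing that the readable and the percent-encoded forms describe the same path. The helper names percent_encode_path and same_url_path are made up for this example; quote and unquote are the standard library functions.

```python
from urllib.parse import quote, unquote


def percent_encode_path(path: str) -> str:
    """Percent-encode the UTF-8 bytes of a URL path, leaving "/" intact."""
    return quote(path, safe="/")


def same_url_path(a: str, b: str) -> bool:
    """Compare two paths after decoding any percent-encoded bytes."""
    return unquote(a) == unquote(b)


readable = "/zh/更"
encoded = percent_encode_path(readable)

print(encoded)                           # /zh/%E6%9B%B4
print(same_url_path(readable, encoded))  # True
```

The character 更 is three bytes in UTF-8 (E6 9B B4), which is exactly what the start of the canonical href shows.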
4SEO, 4AI, 4Command, 4Podcast, 4Video, SEO and content extensions for Joomla 3, 4 & 5 - https://weeblr.com
I don't reply to PM anymore. Thanks for using our extensions.

StoneFree
Joomla! Fledgling
Posts: 3
Joined: Mon Nov 04, 2024 10:39 pm

Post by StoneFree » Tue Nov 05, 2024 9:28 pm

Hi Pe7er and shumisha,

Thanks for your help.

@Pe7er:

"Unicode Aliases" is set to yes.

From what I've seen, Joomla's handling of article aliases works fine with all the languages I use, except Chinese.
For instance, with de, fr, es & pt, aliases keep their special characters (ü, è, é, á, ó, â...), lose spaces, colons (:), etc., and get hyphens instead, which gives a clean URL.

That's not the case with Chinese: article aliases are just a copy of the title, sometimes with a hyphen here or there, but with all the spaces, colons, etc. still in place.

The good news is that my Chinese articles get their URLs from one-word menu items, so there is no problem reaching them through menu navigation. But there is no way to reach them through aliases that contain characters not valid in a URL.

My conclusion: Joomla not yet ready for development in China ;)

@shumisha:

Considering all this complexity, with your info about how canonical links matter for our preferred search engine, I have removed the domain name in the SEF plugin.

I will see what influence it has on "Why pages aren’t indexed" in Google Search Console:
"Duplicate without user-selected canonical"
"Duplicate, Google chose different canonical than user"

What also bothers me is that the hreflang tag testing tool (https://technicalseo.com/tools/hreflang/) recommended by Google (https://developers.google.com/search/do ... n-mistakes) gives me errors like:
- "Self-referencing: Missing" for the URL tested
- "Missing return link" for every URL listed in 'alternate hreflang' on the URL tested (i.e. all other languages)
- plus, obviously, a mismanaged "x-default".
All this while scrupulously using the Joomla multilingual standard.

Conclusion: I think Joomla still has some room for improvement in its multilingual & SEO management, but Google knows this and correctly interprets what it needs for its search results.

shumisha
Joomla! Guru
Posts: 554
Joined: Sat Aug 20, 2005 3:15 pm

Post by shumisha » Wed Nov 06, 2024 8:10 am

Hi
StoneFree wrote:
Considering all this complexity, with your info about how canonical links matter for our preferred search engine, I have removed the domain name in the SEF plugin.

A correct and valid canonical would be good, when needed. But the one Joomla generates through the SEF plugin is not good enough, and you are better off removing it, regardless of your issue with URL encoding.

StoneFree wrote:
I think Joomla still has some room for improvement in its multilingual

In my many years with Joomla and Joomla multilingual sites, I have found that Joomla's implementation is very good and does not show the problems you mention.

StoneFree wrote:
What also bothers me is that the hreflang tag testing tool (https://technicalseo.com/tools/hreflang/) recommended by Google (https://developers.google.com/search/do ... n-mistakes) gives me errors like:
- "Self-referencing: Missing" for the URL tested
- "Missing return link" for every URL listed in 'alternate hreflang' on the URL tested (i.e. all other languages)
- plus, obviously, a mismanaged "x-default".
All this while scrupulously using the Joomla multilingual standard.

That's unlikely and would definitely not happen with a site done "according to Joomla ML standard". It looks more like you have issues in your menu configuration. I'd suggest reviewing all that, especially whether you have proper home menu items in each language and proper associations where needed.
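
For reference, the two errors the testing tool reports boil down to a reciprocity check, which can be sketched in a few lines of Python. This is only an illustration of the rule, not the tool's actual logic; check_hreflang and the page data are hypothetical, and x-default is not covered.

```python
def check_hreflang(pages: dict[str, dict[str, str]]) -> list[str]:
    """Report missing self-references and missing return links.

    `pages` maps each page URL to its own {hreflang: href} alternates.
    """
    problems = []
    for url, alternates in pages.items():
        targets = set(alternates.values())
        if url not in targets:
            problems.append(f"{url}: self-reference missing")
        for target in sorted(targets - {url}):
            # A return link exists when the target page lists this URL among
            # its own alternates. Pages absent from `pages` count as missing.
            if url not in pages.get(target, {}).values():
                problems.append(f"{url}: no return link from {target}")
    return problems


EN = "https://mysite.com/url-en"
DE = "https://mysite.com/de/url-de"

pages = {
    EN: {"en-GB": EN, "de-DE": DE},
    DE: {"de-DE": DE},  # the German page does not link back to the English one
}

for problem in check_hreflang(pages):
    print(problem)
```

With proper associations, every language version lists itself and all the others, and the function returns an empty list.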
