OK,it's a BugReport now-Re:Prevent unescaping in com_content

This forum is for reporting bugs in Joomla!. Please don't report problems with extensions in here.
beanluc
Joomla! Guru
Posts: 922
Joined: Wed Mar 04, 2009 9:50 am
Location: Silicon Valley, CA, USA

OK,it's a BugReport now-Re:Prevent unescaping in com_content

Post by beanluc » Mon Nov 02, 2009 11:10 pm

Original topic title: How to DISPLAY (not execute, NOT FORMAT) Code In Content
I found 2 threads which ask this question.

The help which was offered was along the lines of "use one of the many code-in-content plugins for this".

After the posters clarified that they aren't trying to put executable code in content, but instead are trying to display bare code, no responses.

http://forum.joomla.org/viewtopic.php?p=1491043
http://forum.joomla.org/viewtopic.php?p=1465612


Anyway: I'm in the same boat.

Actually, using JCE, I am able to save code in content for display.
But when I later open the same article for editing, JCE wrecks the displayable code.
It takes characters which were saved in the database (the first time) as properly escaped &lt; entities and UNESCAPES them to <. The code which is supposed to be escaped for display then becomes actual HTML, XML or PHP tags (because JCE unescaped them).

Obviously the effect is either that JCE then strips these tags, OR, if the tags are present in JCE's "Extended Elements" configuration, then they're not stripped, but they're not escaped either, so, they either become invisible OR they get removed by Joomla's blacklist.

None of these effects are wanted, and none of them should happen at all.

Is anybody successful at using Joomla CMS with a decent WYSIWYG editor to publish and REVISE content which contains sourcecode snippets for display in content, not for execution?

THANKS SO MUCH for any help here!
Last edited by beanluc on Fri Nov 06, 2009 6:13 pm, edited 5 times in total.


Re: How to DISPLAY (not execute, NOT FORMAT) Code In Content

Post by beanluc » Tue Nov 03, 2009 12:59 am

Here's another user who has experienced the same problem as me:

http://forum.joomla.org/viewtopic.php?p ... 0#p1708850

He posted in a thread about Joomla HTML filtering, though he clearly describes that no filtering is happening. Instead, his WYSIWYG editor is wrecking his content after OPENING it for editing.

I read the rest of that thread too. They are talking about TinyMCE settings which apparently yield reliable results, but JCE doesn't have the same settings available in its configuration page(s).

While these settings are said to work in TinyMCE:

- Code Cleanup on Startup = No
- Code cleanup on save = Never
- Do not clean HTML entities = Yes
- Prohibited Elements = DELETE EVERYTHING THERE

JCE has these settings instead:
- Cleanup HTML = No/Yes
- Entity Encoding = raw/named/numeric


So, how can I force JCE to use the settings described for TinyMCE?
What's Cleanup HTML, anyway? Is it onstartup? Is it onsave? Both? Neither?
What's Entity Encoding, anyway? Is it a cleanup setting? Or unrelated to cleanup?

Thanks for any attention. The reason I'm posting so much here instead of at the joomlacontenteditor.net site is that their support forum is down.
Last edited by beanluc on Tue Nov 03, 2009 5:45 am, edited 1 time in total.

imanickam
Joomla! Master
Posts: 28202
Joined: Wed Aug 13, 2008 2:57 am
Location: Chennai, India

Re: How to DISPLAY (not execute) Code In Content

Post by imanickam » Tue Nov 03, 2009 1:08 am

Ilagnayeru (MIG) Manickam | இளஞாயிறு மாணிக்கம்
Joomla! - Global Moderators Team | Joomla! Core - Tamil (தமிழ்) Translation Team Coordinator
Former Joomla! Translations Coordination Team Lead
Eegan - Support the poor and underprivileged


Re: How to DISPLAY (not execute, NOT FORMAT) Code In Content

Post by beanluc » Tue Nov 03, 2009 5:44 am

imanickam wrote:Try the extensions available at http://extensions.joomla.org/extensions ... de-display.
Thanks for looking into this, though, it's not the answer.

Fancy extras like "highlighting" don't do any good, regarding the problem described here.
Those extensions can only highlight what gets rendered in the first place.
The problem is, editing the article which has code in it wrecks the code completely.

If there's an extension which fixes the editor problem, I'll be glad to try it. But, I'm thinking this isn't something to fix with another extension, I'm thinking it's a problem, period, with the editor(s).

After I wrote the above, I realized: even using "No Editor" yields the same result. I confirmed that I have content in the database which is properly escaped with &lt; instead of <.

But when opening the article for editing, the &lt; gets UNESCAPED to <.

So saving the code the first time yields a properly escaped, properly saved body of content. Retrieving this from the DB for rendering in a pageview works great. It's only when this material gets retrieved from the DB and populated into a content editor window that it gets wrecked, with or without a WYSIWYG plugin.


Right Forum? How to DISPLAY (not execute | format) CodeInCon

Post by beanluc » Tue Nov 03, 2009 9:39 pm

OK, I found this:

http://forum.joomla.org/viewtopic.php?p ... 1#p1849761

...describing a [template]/html/com_content/article/form.php over-ride which looks like it's supposed to correct the un-escaping of "<".

I'm going to look into this more, following the path suggested above. However: this behavior happens in the back-end too, so, what's described above probably isn't the actual answer. Still, it hopefully puts me on a more fruitful track.
beanluc wrote:even using "No Editor" yields the same result. I confirmed that I have content in the database which is properly escaped with &lt; instead of <.

But when opening the article for editing, the &lt; gets UNESCAPED to <.

So saving the code the first time yields a properly escaped, properly saved body of content. Retrieving this from the DB for rendering in a pageview works great. It's only when this material gets retrieved from the DB and populated into a content editor window that it gets wrecked, with or without a WYSIWYG plugin.

Because I've learned this matter is NOT related to WYSIWYG, but is core J! behavior, I ask: Is this the right forum for this? Should it be moved to a more appropriate place and maybe even re-titled to indicate that this might not be a garden-variety support issue? Or, maybe it IS a garden-variety support issue - still, is there a better place?

Thanks

ooffick
Joomla! Master
Posts: 11615
Joined: Thu Jul 17, 2008 3:10 pm
Location: Ireland

Re: Prevent unescaping of HTML entities in com_content edit task

Post by ooffick » Tue Nov 03, 2009 10:43 pm

Did you check the blacklist setting to include HTML into your Articles:
http://docs.joomla.org/Why_does_some_HT ... n_1.5.8%3F

and if you select No Editor, there is no editor?

Olaf
Olaf Offick - Global Moderator
learnskills.org


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Tue Nov 03, 2009 11:25 pm

It's true, if I select "No Editor", there is no editor.

This isn't a blacklist matter. I see in the database that my article has exactly the content I want. Blacklist prevents HTML elements from getting saved in the DB at all.

Blacklist has nothing to do with escaped characters getting unescaped when being rendered into the textarea of the edit form.

This has nothing to do with the blacklist, and nothing to do with WYSIWYG either. I hope a moderator can move this to a more suitable area. Having it here in WYSIWYG is confusing. I know, I started it here, but since then we've all learned it's not related to WYSIWYG.


Re: Prevent unescaping of HTML entities in com_content edit task

Post by beanluc » Tue Nov 03, 2009 11:41 pm

ooffick wrote:Did you check the blacklist setting to include HTML into your Articles?
You understand, I'm not trying to include HTML in my article. I'm trying to include the escaped representation of HTML in my article.

It works, but only up until after I create and save such content. It gets destroyed when I open it for revising.

I generally don't jump quickly from "I can't make it work" to "it's got to be a bug". If there's some obscure knowledge about a built-in setting which mitigates this effect, I'll be very happy to learn it. Or even obvious knowledge. I'll be the first to say DUH ME - and then I'll gratefully carry on.

But I will say:
The fact that I can save such content the first time suggests that the intended behavior is to allow the escaped representation of entities like &amp; and &lt; in content. If this content is getting corrupted as a result of a specific routine CMS activity, like editing/revising, it suggests that the intended behavior is failing.

So: I ask for the following help from a moderator/admin:
Please help figure out where this thread will get the most helpful attention. I believe the WYSIWYG group is not it. I don't know if the Core Coding forum, a Bug Reports forum (is there one?), or some other forum is correct, but I won't go spamming all the topics I can think of with the same question. Instead I'll leave it to the mods/admins to help here.

Thanks again,
BL


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by ooffick » Tue Nov 03, 2009 11:51 pm

Hi,

I just tried to add some html code, so the HTML code is showing up in the frontend and it works without any problem.

So what exactly is the problem?

e.g. < is automatically translated to &lt; as required by the HTML standard, as it is otherwise parsed by the browser.

So what are you trying to do?

Olaf
Olaf Offick - Global Moderator
learnskills.org


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Wed Nov 04, 2009 12:28 pm

ooffick wrote:I just tried to add some html code, so the HTML code is showing up in the frontend and it works without any problem.

So what exactly is the problem?

e.g. < is automatically translated to &lt; as required by the HTML standard, as it is otherwise parsed by the browser.
I know, it works for me too when I do what you say above.

Did you do the next step?

After you saved it, and you saw that the < was escaped, did you do the next part about opening the article for editing again? What happened? Did the escaped &lt; stay escaped, or did it get unescaped back to a real <?

BL


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by ooffick » Wed Nov 04, 2009 1:03 pm

Yes, works fine, that depends if you see it with "NO Editor" or TinyMCE

Olaf
Olaf Offick - Global Moderator
learnskills.org


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Wed Nov 04, 2009 10:32 pm

Based on what I'm observing, and what I stated above, it does not depend.

"No Editor" has the same behavior.

We already moved this thread out of WYSIWYG forum based on this information.

When you say "works fine", are you saying that,
• with "No Editor" selected,
• you entered the escaped representation of a HTML tag in the body of a content-item for display,
• you saved that content-item,
• you opened it again for editing,
• received the content-item-body back from the server inside the textarea in your browser,
• confirmed that it is still escaped as you see in the textarea,
• you submitted it again,
• and it's still escaped after the second save
• and still correctly displaying in the article?

The 6th bullet (the one about confirming that the textarea content is still escaped) is where this breaks down, so, of course, bullets 7, 8 and 9 fail too.

It doesn't matter which browser. Doesn't matter which editor or if there's no editor.

You're saying it's not happening that way for you?


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Wed Nov 04, 2009 10:48 pm

OK, so, here's what I've learned:
I confirmed with Wireshark that the textarea element is receiving the escaped characters from the server, in the "form" view/edit task.

When I saw that from Wireshark, I looked in the page source (which we do after all expect to be the same thing we see in Wireshark).
Sure enough, the value inside the textarea element has the escaped characters.

Now I understand, this is a browser issue: the browser both displays the escaped characters as unescaped ones when it renders the textarea and, what's more, actually submits them as unescaped if you submit the form.

Both of those conditions (primarily the first) are what breaks my content with escaped code in it, when re-editing the article.

I'm guessing here, but the fact that six different browsers I've tried all do the exact same thing, makes me think this is absolutely expected behavior, probably part of the W3C specifications for HTML and/or XHTML.

(Not six different instances of a browser, but six distinct browsers: IE 6 and 7, FF 2 and 3.5, Safari 2, Opera 9. When there's one thing they all do the same way, that can't be accidental.)
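
The effect can be reproduced without a browser. Here is a minimal sketch, assuming PHP's htmlspecialchars_decode() is a fair stand-in for the single round of entity decoding a browser applies to a textarea's initial content. The variable names and the sample string are made up for illustration.

```php
<?php
// What is stored in the database: code escaped once, for display.
$stored = '&lt;b&gt;test&lt;/b&gt;';

// The front-end form writes the stored text into the textarea as-is.
$pageSource = '<textarea name="text">' . $stored . '</textarea>';

// Stand-in for the browser: entities inside the textarea are decoded once
// when the page is parsed, so this is what the user sees and what gets submitted.
$submitted = htmlspecialchars_decode($stored, ENT_QUOTES);

echo $pageSource, "\n"; // the page source still shows the escaped form
echo $submitted, "\n";  // <b>test</b>
```

So the page source and Wireshark both show the escaped form, while the submitted value is the unescaped one.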


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by ooffick » Wed Nov 04, 2009 11:02 pm

I open a new article with e.g. TinyMCE:

write the following into the editor:

Code: Select all

<p><b>test</b></p>
and then save it
and then open it
and then save it
and then open it

And it stays the same.

Or if you want to use NO Editor:

Code: Select all

<p><p>&lt;b&gt;test&lt;/b&gt;</p></p>
Olaf
Olaf Offick - Global Moderator
learnskills.org


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Wed Nov 04, 2009 11:04 pm

--Little later:--
So, I made a bare HTML file with nothing in it but a form with textarea.

Exact same behavior.

Even when page source has escaped entities, when you display the page, not only do you see unescaped ones in the textarea, it's actually worse than that:
Any manipulation you can try with JS, DOM, Firebug/IE Dev Toolbar, form submission, etc., all those techniques wind up treating it as unescaped despite its actually being escaped in page source. Seems like the fact that it's in a textarea introduces this behavior.

In my static HTML test page, I doubly escaped the entities.
(For example, &lt; gets doubly escaped to &amp;lt; - the & character which forms part of the original gets escaped itself too with the double escaping.)

And guess what?
With double escaping in the HTML file source, the textarea on rendering now returns the singly-escaped entities!
So, instead of &lt; getting unescaped to <, now we have &amp;lt; getting unescaped to &lt;
Now THAT'S some content we can deal with, because when that form is submitted, we have exactly the single-escaped entities we wanted all along.
Not only that, it's proper in the field before we submit it too, so, the person editing the field can see the right encoding. He doesn't have to "pretend" to not be looking at an unescaped-but-wrong content string. Of course, we want both: we want proper content in front of the editor's eyes, and, we want proper content to be saved.

OK, so, I think the secret to solving this is going to be about doubly-escaping the content which Joomla is putting into the textarea html element. Well, it's actually already escaped in the DB, so, Joomla only has to escape it again once, not twice.
Last edited by beanluc on Thu Nov 05, 2009 7:46 am, edited 1 time in total.
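
The double-escaping round trip described in this post can be sketched in PHP. This is only an illustration under stated assumptions: htmlspecialchars() does the extra escape, htmlspecialchars_decode() stands in for the browser decoding the textarea content once, and the sample string is invented.

```php
<?php
// Content as stored in the database: already escaped once for display.
$stored = '&lt;p&gt;Tom &amp; Jerry&lt;/p&gt;';

// Escape it once more before writing it into the textarea.
// Each existing "&" becomes "&amp;", so "&lt;" turns into "&amp;lt;".
$forTextarea = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// Stand-in for the browser decoding the textarea content once.
$seenByEditor = htmlspecialchars_decode($forTextarea, ENT_QUOTES);

echo $forTextarea, "\n";
var_dump($seenByEditor === $stored); // bool(true)
```

The person editing sees the single-escaped text, and that same text is what goes back to the server on save.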


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Wed Nov 04, 2009 11:10 pm

ooffick wrote:I open a new article with e.g. TinyMCE:write the following into the editor:

Code: Select all

<p><b>test</b></p>
and then save it
and then open it
and then save it
and then open it
And it stays the same.
Or if you want to use NO Editor:

Code: Select all

<p><p>&lt;b&gt;test&lt;/b&gt;</p></p>
Olaf
Olaf, thanks a lot for testing that. TinyMCE seems to be handling this for you. I'm not eager to switch to TinyMCE, so, originally I was asking if some known JCE settings would take care of this.

That's what my post was about, where I compared recommended TinyMCE settings to the options revealed in my JCE config area, and I asked what's the same, what's different, what do the JCE labels even mean, how can I get the recommended TinyMCE effect from JCE when the options are obscure or not equal.

Seems clear to me that in your case, Olaf, the editor is actually FIXING the textarea corruption. You didn't say what happened with "No editor", whether you did repeated opens/saves. Thanks again for continuing to help.
Last edited by beanluc on Thu Nov 05, 2009 7:47 am, edited 1 time in total.


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Thu Nov 05, 2009 12:20 am

--Little later:--
Note: This is not browser-specific behavior either. It turns out that this is well-known, expected behavior; it's the way X/HTML is supposed to work.

Google reveals that developers in a whole list of web development languages know to expect this. Members of these languages' user communities counsel help-seekers and learning developers to count on it and to handle it. It's not a surprise. Like I said: Six distinct browsers are all doing it the same way. That can't be an accident. It's got to be standard.

(I know, you can joke that it IS an accident when six distinct browsers all do something the same way, CSS and JS users laugh here now, that's a good one, you know what I mean, but this isn't like that. This is more like expecting browsers to talk on HTTP the same way. Forms are something which had darn well better behave predictably, or very little on the Web could be reliable.)

So, for this to work in Joomla, Joomla has to doubly-escape the material which is being put into the textarea element.

The reason I say it's Joomla's job, even though there's apparently at least one way to get at least one WYSIWYG editor to take care of this for me, is as follows:
Sure, you MIGHT be able to configure a popular editor plugin to do this for you (might not always be possible).
Joomla isn't responsible for FCK, Tiny, JCE, etc.
Joomla is responsible for Joomla.
What if I have a really good reason for not using an editor plugin? (I'm not saying I do have one, I'm saying Joomla shouldn't assume nobody's going to edit this way).
If I did have a good reason, this effect still would occur.
Why would I want to put up with entering and saving properly escaped entities, then entering and saving them again just because I want to open that content body later to edit some other part of it?
I wouldn't.
I don't.

So, there are 2 reasons why I'm continuing to try to drum up attention to this.
One is:
I need JCE to work. I hope someone can tell me how to configure JCE with the TinyMCE settings recommended above. I know JCE is based on Tiny, so, I expect the same config settings are available... somehow.

The other is:
As I described, I believe Joomla should be able to mitigate this.
I don't know if it actually is or if it actually isn't.
I would really like some informed person to say
• whether this is a known, expected, unsurprising condition of using Joomla;
• whether there's some unknown (to me) setting in Joomla for this;
• whether this (escaped representation of code intended for display in content, unreliable preservation of such content with subsequent editing) is such a novel use-case that the CoreDev's might not have considered, tested and accounted for it, with a patch, a config setting, or a documentation page;
• and whether we're still not discussing it in the place where the most likely knowledgeable parties will be exposed to it. We were in "WYSIWYG" before, a little-trafficked forum, now we're in "Administration", more traffic but only helpful if there is after all an Administrative solution or someone browsing here who knows some of what I ask above.


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Thu Nov 05, 2009 1:12 am

--little later--

OK, I have a com_content/article/form.php template override file.

On line 95, I replaced
echo $this->editor->display('text', $this->article->text, '100%', '400', '70', '15');

with

echo $this->editor->display('text', str_replace("&lt;", "&lt;", $this->article->text), '100%', '400', '70', '15');

Interestingly, the match pattern is the same as the replace pattern.
Why?
Because the escaped representation of < matches <.
That is, < matches <.
That doesn't get me anywhere because it will escape things which shouldn't be escaped.
I only want to match < and escape the & character in it.
To match < I can use &<
To escape the & in < I first match the &lt, by using its escaped representation, which is &lt;
Now I've matched the unescaped < by using the escaped & to match it, then what do I replace the match with? The escaped representation of < which is &lt;

It's twisted, it's perverse, my match pattern is identical to my replace pattern, but it's doing what I need. My editor plugins work with my code-in-content now, and so does my "No Editor" form.

I really do hope someone speaks to this and helps me learn (A) whether this is necessary and (B) whether this particular way is reliable, safe, bad practice, whatever. I've already learned a lot from this and I hope some guru would check my work so I complete the lesson, for better or for worse.

Let me put it this way:
At least four people have already brought this issue up in the forums in the last 12 months.
Who knows how many more have tried to search the forum on this subject and didn't ever post.
I predict I'm not going to be the only one who implements this view override, if nobody comes to warn us if it's all wrong, unnecessary, and dangerous, and if there actually is already a deliberate feature or setting in Joomla Core for mitigating this completely predictable effect.

Thanks a lot, indeed, sincerely, to Olaf for staying with me, trying a couple things, talking it over, and moving and re-titling the thread for me after I figured out it wasn't to be blamed on the editor plugins.
Last edited by beanluc on Thu Nov 05, 2009 7:54 am, edited 1 time in total.


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Thu Nov 05, 2009 1:43 am

Let me spell out two things I forgot about:

First, I added this on line 3 of form.php
JFilterOutput::objectHTMLSafe( $this->article );
my str_replace thing described above doesn't work without that step.

Second, I didn't make it completely clear that all my problems were about front-end editing. The back end edit task handler has different code, and wasn't giving a problem. It still wasn't a suitable work-around, due to the way our site works, and that's why I persevered until the front-end form.php override solution presented itself.
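
One detail worth checking about the line-95 edit: as far as PHP is concerned, a str_replace() whose search and replacement strings are byte-for-byte identical returns its input unchanged. Assuming the two patterns in the override really were identical in the file, that would mean the escaping call on line 3 is doing all of the work. A quick sketch with a made-up sample string:

```php
<?php
$text = 'Use &lt;b&gt; for bold &amp; &lt;i&gt; for italics.';

// Search and replacement are the same string, so nothing changes.
$afterReplace = str_replace('&lt;', '&lt;', $text);
var_dump($afterReplace === $text); // bool(true)

// For comparison: escaping the ampersand of each "&lt;" is what would
// actually add a second level of escaping to those entities.
$doubleEscaped = str_replace('&lt;', '&amp;lt;', $text);
echo $doubleEscaped, "\n";
```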


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by ooffick » Thu Nov 05, 2009 10:12 am

beanluc wrote:Seems clear to me that in your case, Olaf, the editor is actually FIXING the textarea corruption. You didn't say what happened with "No editor", whether you did repeated opens/saves. Thanks again for continuing to help.
Yes, works with JCE, and NO Editor as well.

Olaf
Olaf Offick - Global Moderator
learnskills.org


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Thu Nov 05, 2009 7:39 pm

ooffick wrote:So what are you trying to do?
Aha, Olaf, I wrote an answer to this but it didn't get saved in the thread.

I'll answer it now, even after all of the above, just because it will illustrate what my problem really was.

We have a site which we're positioning as a "wiki". Some of the contents are technical documentation, API drafts, and other types of pages which must include (mostly XML) code and pseudocode for display in content.

We're calling it a "wiki" even though it's Joomla CMS, because articles are both a product and a driver of a collaborative, iterative work process, and the activity requires many drafts on individual articles.

This is all happening from the frontside, not the backend.

(Aside:How to get Joomla to be a wiki? Simple: make all contributors Publishers, include a Comments extension, and include a Versions/Revisions extension.)

So, here is how you can see that for us it's critical that content doesn't get corrupted when an article is opened for revising and editing. We have to have this working, because we know that revisions will happen, and not by people who are webmasters and can fix it manually.

Possibly we're unlike a lot of other authors who are using Joomla, who are posting code-in-content for display. Many of them might never encounter this particular case (because they don't revise) or might fix the corruption themselves when they see it (because they're webmasters).

What I discovered is that the backend com_content form layout handles this case well, and the frontend com_content form layout doesn't work the same way and fails to ensure that certain escaped entities remain escaped inside the textarea.

So, this seems serious to me - if only for a small sub-set of sites.

I'm still interested to get attention on this from com_content developers. Olaf, as a moderator, would you consider moving this thread again to the right place for core functionality assessment and coding?

Thanks, Olaf,
BL


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by ooffick » Thu Nov 05, 2009 8:40 pm

beanluc wrote:What I discovered is that the backend com_content form layout handles this case well, and the frontend com_content form layout doesn't work the same way and fails to ensure that certain escaped entities remain escaped inside the textarea.
Yes, there seems to be some problems in the frontend form encoding.

Olaf
Olaf Offick - Global Moderator
learnskills.org


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by ooffick » Thu Nov 05, 2009 11:24 pm

Hi,

to fix it try the following:
  1. create a folder in your template folder called "html" (if it doesn't exist)
  2. create a folder in that html folder called "com_content" (if it doesn't exist)
  3. create a folder in that com_content folder called "article" (if it doesn't exist)
  4. copy the file .../components/com_content/views/article/tmpl/form.php into the .../templates/[your-template]/html/com_content/article/ folder
  5. open that file
  6. find the following line:

    Code: Select all

    echo $this->editor->display('text', $this->article->text, '100%', '400', '70', '15');
  7. and replace it with the following:

    Code: Select all

    echo $this->editor->display('text', htmlentities($this->article->text), '100%', '400', '70', '15');
  8. save the file
  9. set the Filter groups to "Public Frontend" in the Global Parameters.
Olaf
Olaf Offick - Global Moderator
learnskills.org
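
If you go the htmlentities() route, it may be worth passing the flags and character set explicitly. A small sketch, assuming the article text is UTF-8 and keeping in mind that PHP versions before 5.4 default htmlentities() to ISO-8859-1, which can mangle non-ASCII characters. The sample string is invented for illustration.

```php
<?php
// Article body as stored: escaped code plus a non-ASCII character.
$stored = 'Café: &lt;b&gt;test&lt;/b&gt;';

// Explicit flags and charset, so the result does not depend on PHP defaults.
$forTextarea = htmlentities($stored, ENT_QUOTES, 'UTF-8');

echo $forTextarea, "\n";
// Caf&eacute;: &amp;lt;b&amp;gt;test&amp;lt;/b&amp;gt;

// Decoding once (standing in for the browser and its textarea handling)
// restores the stored text.
$roundTrip = html_entity_decode($forTextarea, ENT_QUOTES, 'UTF-8');
var_dump($roundTrip === $stored); // bool(true)
```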


Re: Prevent unescaping HTML entities in com_content "edit" task

Post by beanluc » Fri Nov 06, 2009 6:53 am

That's right, I already had the form.php override file in my template html directory.

Turns out, the only thing that's necessary is simply to add this to line 3:

Code: Select all

JFilterOutput::objectHTMLSafe( $this->article );
that's how the admin version of form.php handles this. Makes sense to (A) do it the same way in the frontend form.php, and (B) to use the Joomla built-in method.

So, not necessary to edit line 95 with a PHP function. Only to add the JFilterOutput line near the top.

Do you think this is an oversight? Deliberate? If so, why? Should this be brought to, let's say, http://forum.joomla.org/viewforum.php?f=304 for bugfix consideration? The thread could be moved, or, a new (concise) one could be started over there.

What do you say?


Bug Report: Re: Prevent unescaping in com_content "edit" tas

Post by beanluc » Fri Nov 06, 2009 6:09 pm

Problem Description:
HTML entities in body of article content get UNESCAPED when opening article for editing from the front end. So the content gets corrupted if it's revised and saved.

This arises from normal behavior of the textarea HTML element. It's predictable that when a textarea's value contains HTML entities, they're unescaped both for in-page rendering of the textarea, and also for actual data transmission on form submission.

In the admin article-editing form, Joomla provides JFilterOutput::objectHTMLSafe() for re-escaping HTML entities before populating them into the editing textarea's value, so that when the original content is unescaped by the textarea, it winds up back to the original, deliberately single-escaped state, which can then be edited and saved without corruption.

This is missing from the front end article-editing form.
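
For readers without the Joomla source at hand, here is a rough standalone sketch of the kind of escaping described above. The function escapeObjectForForm() is a hypothetical stand-in, not Joomla's code; it assumes the real JFilterOutput::objectHTMLSafe() behaves similarly by running htmlspecialchars() over an object's string properties, which should be checked against the actual source.

```php
<?php
/**
 * Hypothetical stand-in for JFilterOutput::objectHTMLSafe():
 * escape every non-empty string property of an object in place.
 */
function escapeObjectForForm($object, $quoteStyle = ENT_QUOTES)
{
    foreach (get_object_vars($object) as $key => $value) {
        // Leave arrays, nested objects, numbers and empty values untouched.
        if (!is_string($value) || $value === '') {
            continue;
        }
        $object->$key = htmlspecialchars($value, $quoteStyle, 'UTF-8');
    }
}

// A made-up article object with already-escaped code in its body.
$article = new stdClass();
$article->title = 'Bold & italic';
$article->text  = '&lt;b&gt;test&lt;/b&gt;';

escapeObjectForForm($article);

echo $article->title, "\n"; // Bold &amp; italic
echo $article->text, "\n";  // &amp;lt;b&amp;gt;test&amp;lt;/b&amp;gt;
```

After this step the textarea's own unescaping brings the body back to the single-escaped form that was stored.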

Extended, detailed problem description can be found as follows.
Please read carefully before forming judgements, because this issue has been very frequently misunderstood as being related to Filtering (blacklisting) or to Formatting (supported not by Core but by "code-display" extensions).

http://forum.joomla.org/viewtopic.php?p ... 7#p1917117
(what the overview scenario is)

http://forum.joomla.org/viewtopic.php?p ... 4#p1913424
(one description of replicating the problem effect)

http://forum.joomla.org/viewtopic.php?p ... 1#p1915921
(another description of replicating the problem effect)

http://forum.joomla.org/viewtopic.php?p ... 7#p1913167
(description of the actual effect and why it's a serious problem)


Actions Taken To Resolve:
in
/components/com_content/views/article/tmpl/form.php,
add after line 2

Code: Select all

JFilterOutput::objectHTMLSafe( $this->article );
Diagnostic Information
Joomla! Version: Joomla! 1.5.14 Stable [ Wojmamni Ama Naiki ] 30-July-2009 23:00 GMT
configuration.php: Writable (Mode: 666 ) | RG_EMULATION: N/A
Architecture/Platform: Windows NT 6.0 ( i586) | Web Server: Microsoft-IIS/7.0 ( geminiweb.genesyslab.com ) | PHP Version: 5.2.11
PHP Requirements: register_globals: Disabled | magic_quotes_gpc: Disabled | safe_mode: Disabled | MySQL Support: Yes | XML Support: Yes | zlib Support: Yes
mbstring Support (1.5): Yes | iconv Support (1.5): Yes | save.session_path: Writable | Max.Execution Time: 1200 seconds | File Uploads: Enabled
MySQL Version: 5.1.39-community ( localhost via TCP/IP )

Extended Information:
SEF: Disabled (without ReWrite) | FTP Layer: Disabled | htaccess: Not Implemented
PHP/suExec: User and Web Server accounts are not the same. (PHP/suExec probably not installed)
PHP Environment: API: cgi-fcgi | MySQLi: Yes | Max. Memory: 128M | Max. Upload Size: 60M | Max. Post Size: 60M | Max. Input Time: 300 | Zend Version: 2.2.0
Disabled Functions:
MySQL Client: 5.0.51a ( latin1 )


Re: OK,it's a BugReport now-Re:Prevent unescaping in com_content

Post by beanluc » Thu Mar 25, 2010 1:24 am


mcsmom
Joomla! Exemplar
Posts: 7897
Joined: Thu Aug 18, 2005 8:43 pm
Location: New York

Re: OK,it's a BugReport now-Re:Prevent unescaping in com_content

Post by mcsmom » Fri Mar 26, 2010 1:58 pm

Thanks beanluc. Would you mind tracing the same thing in 1.6 and see if it is still happening? Now is the time to get it resolved. Would be great if you would post on the cms-dev list if it is still an issue in 1.6.
So we must fix our vision not merely on the negative expulsion of war, but upon the positive affirmation of peace. MLK 1964.
http://officialjoomlabook.com Get it at http://www.joomla.org/joomla-press-official-books.html Buy a book, support Joomla!.

