Open Bug 372471 Opened 17 years ago Updated 12 years ago

[editor] Don't encode special characters in attribute values

Categories

(SeaMonkey :: Composer, enhancement)

SeaMonkey 1.1 Branch
PowerPC
macOS
enhancement
Not set
normal

Tracking

(Not tracked)

People

(Reporter: d.blair, Unassigned)

References

Details

(Whiteboard: DUPEME)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.2) Gecko/20070221 SeaMonkey/1.1.1
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.2) Gecko/20070221 SeaMonkey/1.1.1

When editing web page counters, the character '|' is consistently saved as '%7C', which breaks the code. I have to correct this each and every time with a separate HTML editing program. Not good.

Reproducible: Always

Steps to Reproduce:
1. Save changes to html file (with character '|' inserted).
2. Check html file and find that code does not display properly due to misbegotten character.
3. Return to code via html editor and correct character where SeaMonkey cannot.
Actual Results:  
If not fixed, code fails to produce result on webpage

If fixed through separate html editing program, code succeeds to produce result on webpage

Expected Results:  
The '|' character should be preserved, and keep working, after any changes have been made to the html document.

Fix this so that I can use Seamonkey and not some other program.
Can you reproduce with SeaMonkey v1.1.9 ?
If yes, please attach a page example, so someone can try and reproduce.
Whiteboard: DUPEME
Version: unspecified → SeaMonkey 1.1 Branch
Assignee: composer → nobody
QA Contact: composer
I'm using Version 2.2
The character  |  gets converted to  %7C  in my visitor counter code causing it to not work at all.

Here is the code before editing with SeaMonkey:
<b>You are visitor number:&nbsp; <img
 src="/cgi-sys/Count.cgi?df=aakdq06.dat|display=Counter|ft=6|md=5|frgb=100;139;216|dd=A">
Since April 09, 2007<b>
<br>

Here is the code after editing with SeaMonkey:
<b>You are visitor number:&nbsp; <img
src="/cgi-sys/Count.cgi?df=aakdq06.dat%7Cdisplay=Counter%7Cft=6%7Cmd=5%7Cfrgb=100;139;216%7Cdd=A">Since
April 09, 2007<b>
<br>
That should work. Your server should automatically unescape the query string. Please consult the administrator of your webserver to see if there are some settings you need to change on the server.
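
As a rough illustration of that point, the sketch below percent-decodes the query part of the two src values quoted in the previous comment and compares them. The helper name decodedQuery is made up for this example, and whether a particular counter script actually decodes its input this way is an assumption, not something this report confirms.

```javascript
// The two src values quoted in the previous comment.
const original =
  "/cgi-sys/Count.cgi?df=aakdq06.dat|display=Counter|ft=6|md=5|frgb=100;139;216|dd=A";
const saved =
  "/cgi-sys/Count.cgi?df=aakdq06.dat%7Cdisplay=Counter%7Cft=6%7Cmd=5%7Cfrgb=100;139;216%7Cdd=A";

// Hypothetical helper: take everything after "?" and percent-decode it,
// the way a server that unescapes the query string would.
function decodedQuery(url) {
  const query = url.slice(url.indexOf("?") + 1);
  return decodeURIComponent(query);
}

// Both forms carry the same query once decoded.
const sameAfterDecoding = decodedQuery(original) === decodedQuery(saved);
console.log(sameAfterDecoding);
```

If the script behind the URL reads the raw, undecoded query string instead, the two forms are not equivalent for it, which would match the behaviour reported here.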
I checked with my server administrator. Here is what he said:
"The way your software saves the edited page, is the issue, it is not a server issue. I would need to see the HTML code on the page and see the URL. Send the above and give me the URL to the page and I will take a look."
-----------------------
I sent the same information as shown in comment 3 above. Here is his reply: 
"Ok I see what the issue is, the software is converting it to Hex character (%7C ) and not decimal code as it should be (&#124;), the software should never change anything in URL's you enter into the software."

"If it was changing it to decimal code, the query string would be converted and it would work, but I have never seen it do so with hex character's. There may be a setting for it and you may need to turn that off in the settings."
------------------------
I don't understand how it could be a server setting if the code gets changed while I'm using SeaMonkey Composer, and not even connected to the internet. The code gets altered with no internet connection present.
This file shows the code after loading it into SeaMonkey, as well as after saving it. The original code had the "|" character, but every instance got changed to "%7C". In order to make this work, I have to go into Notepad and change each instance of "%7C" back to "|".
Hmm.

encodeURI("|")
> %7C

http://www.w3schools.com/jsref/jsref_encodeuri.asp
https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/encodeURI

encodeURI replaces all characters except the following with the appropriate UTF-8 escape sequences:
Reserved characters 	; , / ? : @ & = + $
Unescaped characters 	alphabetic, decimal digits, - _ . ! ~ * ' ( )
Score 	#

So from this perspective we are correctly encoding the query string.
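
A small check of the rules quoted above, runnable in any JavaScript console. The variable names are only for illustration; the query fragment is taken from the example in this report.

```javascript
// Characters that encodeURI leaves alone, per the table above.
const reserved = ";,/?:@&=+$";
const unescaped = "-_.!~*'()";
const reservedKept = encodeURI(reserved) === reserved;
const unescapedKept = encodeURI(unescaped) === unescaped;
const hashKept = encodeURI("#") === "#";

// "|" is in none of those sets, so it is percent-encoded.
const pipeEncoded = encodeURI("|");

// decodeURI reverses the encoding, so no information is lost.
const fragment = "df=aakdq06.dat|display=Counter";
const roundTrip = decodeURI(encodeURI(fragment));

console.log(reservedKept, unescapedKept, hashKept);
console.log(pipeEncoded);
console.log(roundTrip === fragment);
```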

CC kaze.

Hey Kaze, what does the latest version of KompoZer do in this instance?
This is all very nice but the end result is that SeaMonkey is, what I consider, the best HTML editor program except that it alters the code so it doesn't work. 

I've used several other HTML editor programs, that do not alter this code. So why does SeaMonkey do it? I just don't understand why this is being done and why it isn't being converted back so that the HTML code will work.

This simply makes no sense to me.
FJA: SeaMonkey, like all Mozilla-based HTML editors, is a DOM-to-text editor. The HTML markup is parsed by Gecko, rendered on your screen like any browser would do, then serialized to text when the document is saved. It *will* reformat your code, there’s not much we can do about it.

However, there are some tweaks we could do to control *how* the document is serialized, e.g. by adding a few prefs to control which entities should be escaped, or whether non-ASCII characters are allowed in attributes. That could result in invalid pages, but it can be necessary when dealing with server-side code (e.g. <a href="<?php $myVar ?>">link</a> or with your example above).
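
To make the idea of such a pref concrete, here is a minimal sketch in JavaScript. It is not Gecko's serializer: the function serializeAttribute, the option escapeURIAttributes and the list of URI attributes are all hypothetical names chosen for this example.

```javascript
// Hypothetical sketch, not Gecko code: the option name and the list of
// URI-valued attributes are assumptions made for illustration.
const URI_ATTRIBUTES = new Set(["href", "src"]);

function serializeAttribute(name, value, options = { escapeURIAttributes: true }) {
  let out = value;
  // With the (hypothetical) pref on, URI attributes are percent-encoded.
  if (options.escapeURIAttributes && URI_ATTRIBUTES.has(name)) {
    out = encodeURI(out);
  }
  // "&" and double quotes are always escaped so the attribute stays well-formed.
  out = out.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return `${name}="${out}"`;
}

const query = "Count.cgi?df=aakdq06.dat|display=Counter";
const escaped = serializeAttribute("src", query);
const verbatim = serializeAttribute("src", query, { escapeURIAttributes: false });
console.log(escaped);
console.log(verbatim);
```

With the option off, the value is written back as typed, which is the behaviour requested in this bug; the cost is that the saved page may contain attribute values that are not valid URIs.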

(In reply to Philip Chee from comment #6)
> So from this perspective we are correctly encoding the query string.

Right. This is not a bug, this is a request for enhancement.
I sure understand that it is a bug from the user’s perspective.

(In reply to Philip Chee from comment #6)
> Hey Kaze, what does the latest version of KompoZer do in this instance?

In KompoZer 0.7.x and 0.8.x we had a specific serialization pref for that: “Don’t encode special characters in attribute values”. This pref required a dirty hack on the DOM serializer. As the serializer has been seriously revamped in Gecko 2 (1.9.3 IIRC), I’ll have to re-implement these tweaks properly and propose a patch for the Core Editor.

I’ve just renamed this bug after that pref. I’m afraid I can’t work on this right now but I’ll have a look.
Severity: major → enhancement
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: In editing web page counters, the character '|' consistently saves as '%7C' thus fouls the code → [editor] Don't encode special characters in attribute values
Note: Gecko now supports 21 serialization flags, SeaMonkey probably does not use all of them.
https://mxr.mozilla.org/mozilla-central/source/content/base/public/nsIDocumentEncoder.idl#71

We’ll probably have to add another serialization flag for this bug.
Hmmmm, that should be a rather easy fix, see nsXHTMLContentSerializer::SerializeAttributes.

An even easier fix would be to set a pref to avoid escaping special characters in URI attributes, but that might not be enough.
https://mxr.mozilla.org/mozilla-central/source/content/base/src/nsXHTMLContentSerializer.cpp#406
OK! This has gotten far beyond my ability to understand. I won't comment anymore unless you request something else from me.

Thanks for looking into this. I'll keep plodding along and if someday it works, I'll know it.
When I first saw the Bugzilla message, I thought that maybe this problem had been resolved. Instead it only showed that someone else had the same problem. Apparently no one is bothering with it after 5 years. Too bad.