view source: raw french accented characters appear as chinese!

VERIFIED DUPLICATE of bug 12502

Status

P3
minor
19 years ago
19 years ago

People

(Reporter: Herve.Renault, Assigned: jbetak)

Tracking

Trunk
x86
Linux

Details

(Whiteboard: [PDT-])

Attachments

(5 attachments)

(Reporter)

Description

19 years ago
Although it is not correct to leave accented characters unencoded instead of
writing them as HTML entities, lots of people here leave those raw characters
in their pages... and then with Mozilla, those characters appear as Chinese
ones in the "View Source" window!

take a look at the screenshot...

NOTE: this does not happen when you put the correct
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
tag in the head of the page.
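
A minimal sketch of how a declared charset like the one in the NOTE can be found before a page is decoded. This is not Mozilla's code: the function name `declared_charset`, the regular expression, and the 1024-byte scan window are all illustrative assumptions.

```python
import re

# Hypothetical pattern: finds charset=... inside a <meta ...> tag.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)",
    re.IGNORECASE,
)


def declared_charset(raw: bytes, scan_limit: int = 1024):
    """Return the charset named in a META tag near the top of the page, or None."""
    match = META_CHARSET.search(raw[:scan_limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


# Same raw byte (0xE9 = e-acute in ISO-8859-1), with and without a declaration.
with_meta = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=iso-8859-1"></head>'
    b"<body>h\xe9 !</body></html>"
)
without_meta = b"<html><body>h\xe9 !</body></html>"

print(declared_charset(with_meta))
print(declared_charset(without_meta))
```

With no declaration the function returns None, and whatever default the viewer picks decides how the raw byte is displayed.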
(Reporter)

Comment 1

19 years ago
Created attachment 4104 [details]
funny screenshot to illustrate bug #23577

Updated

19 years ago
Status: NEW → ASSIGNED
Summary: raw french accented characters in html source appear as chinese characters ! → view soruce: raw french accented characters appear as chinese!

Updated

19 years ago
Target Milestone: M13

Comment 2

19 years ago
Can you please attach your test HTML to this bug report so I can reproduce
this more easily?

Comment 3

19 years ago
He posted his html on irc earlier and it worked for me just fine
but not for others on #mozilla. I noticed that the default charset
on my rh 6.1 box is iso-8859-1. Perhaps the others are using
something else, like us-ascii.

Comment 4

19 years ago
html code from the screenshot :

     <INPUT TYPE="text" NAME="Titre_page_accueil" VALUE="hé ! & ! ok avec le 'nouveau 'FormatteSite en bash..." SIZE="70">
        <br>
        <br>

     <script language="Javascript" type="text/javascript">
     <!--
       // si Javascript est activé, on peut faire mieux qu un échappement basique des doubles-quotes...

       document.zeform.Titre_page_accueil.value = "hé ! & ! ok avec le 'nouveau \"FormatteSite en bash...";
     //-->
     </script>
(Reporter)

Comment 5

19 years ago
Endico,

you wrote:

I noticed that the default charset on my rh 6.1 box is iso-8859-1. Perhaps the
others are using something else, like us-ascii.

but when i start mozilla i can read:

nsCollationUnix::Initialize mLocale = fr_FR
nsCollationUnix::Initialize mCharset = ISO-8859-1

does it correspond to what you mean?

Comment 6

19 years ago
Yes, that's what I meant. This is what I have.

nsCollationUnix::Initialize mLocale = C
nsCollationUnix::Initialize mCharset = ISO-8859-1

Comment 7

19 years ago
Could you kindly attach this page to make it easier to test?
Otherwise, I am afraid the copy and paste may corrupt the accuracy of testing.
Thanks.

Updated

19 years ago
Summary: view soruce: raw french accented characters appear as chinese! → view source: raw french accented characters appear as chinese!

Comment 8

19 years ago
oops, sorry. i'm doing it right now.
+ changed typo in the "Summary" field ("view soruce ...")
(Reporter)

Comment 9

19 years ago
Created attachment 4217 [details]
test case for bug #23577

Comment 10

19 years ago
I just saw the same problem in the mail module (Messenger).
see the attached screenshot.
the "Subject:" field contains non-MIME-encoded accents...
it should be :

Subject: Internet Actu Spécial Lecteurs, 18 janvier 2000
                         ^
                         |
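
For comparison, a sender that follows the MIME rules for message headers (RFC 2047) would put the accented character inside an encoded-word instead of sending the raw byte. A minimal sketch using Python's standard `email.header` module; the choice of iso-8859-1 as the header charset is an assumption, since this report does not say which charset the original message used.

```python
from email.header import Header, decode_header

subject = "Internet Actu Spécial Lecteurs, 18 janvier 2000"

# Encode the subject as RFC 2047 encoded-word(s); the result is pure ASCII.
encoded = Header(subject, charset="iso-8859-1").encode()


def decode_subject(value: str) -> str:
    """Turn an RFC 2047 encoded header value back into readable text."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


print(encoded)
print(decode_subject(encoded))
```

A mail client that receives the encoded form knows which charset to use for the subject; with a raw 8-bit byte it has to guess.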

Comment 11

19 years ago
Created attachment 4309 [details]
screenshot for bug #23577 as seen in the mail module

Updated

19 years ago
Assignee: ftang → erik
Status: ASSIGNED → NEW
Target Milestone: M13 → M14

Comment 12

19 years ago
Hmm... reassigning to erik for investigation. Could it be the font list issue?
Herve: can you attach your 'xlsfonts' results to this bug report as an
attachment? This may help us track down the problem faster.
(Reporter)

Comment 13

19 years ago
Created attachment 4334 [details]
my xlsfonts...

Updated

19 years ago
Assignee: erik → cata

Comment 14

19 years ago
The first comment in this bug report says that the View Source problem does not
occur when there is a META charset in the document. This is an indication that
the problem is not in the font engine, but in the code that tries to determine
the charset of a document. Tentatively reassigning to Cata.

Comment 15

19 years ago
I think this is probably related to the plain text file problem. Somehow we now
use nsXMLDocument instead of nsHTMLDocument to view source or plain text files.
When there is no META tag, view source falls back to the "UTF-8" charset in
nsXMLDocument. The default charset for XML is UTF-8.
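
To make the mismatch concrete: é in ISO-8859-1 is the single byte 0xE9, which in UTF-8 is the lead byte of a three-byte sequence covering U+9000 to U+9FFF, inside the CJK Unified Ideographs block. A strict UTF-8 decoder rejects the byte, but a decoder that does not validate continuation bytes combines it with the next two bytes into an ideograph. Whether Mozilla's decoder behaved this way at the time is an assumption, not something stated in this bug; `lenient_utf8_decode` below is a deliberately naive, hypothetical decoder written only for illustration.

```python
def lenient_utf8_decode(data: bytes) -> str:
    """Decode UTF-8 WITHOUT validating continuation bytes (deliberately naive)."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b < 0xE0 and i + 1 < len(data):
            # Two-byte lead: blindly consume one following byte.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        elif 0xE0 <= b < 0xF0 and i + 2 < len(data):
            # Three-byte lead: blindly consume two following bytes.
            cp = ((b & 0x0F) << 12) | ((data[i + 1] & 0x3F) << 6) | (data[i + 2] & 0x3F)
            out.append(chr(cp))
            i += 3
        else:
            # Anything else becomes the replacement character.
            out.append("\ufffd")
            i += 1
    return "".join(out)


raw = "hé ! ok".encode("iso-8859-1")  # b'h\xe9 ! ok'

# A strict decoder refuses the Latin-1 bytes outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

garbled = lenient_utf8_decode(raw)
print(strict_ok)
print(repr(garbled), hex(ord(garbled[1])))
```

In this sketch the bytes for "é !" (0xE9 0x20 0x21) collapse into the single code point U+9821, so the accented letter and the two characters after it are replaced by one ideograph.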
Status: NEW → ASSIGNED

Comment 16

19 years ago
In my tree -- I've changed plaintext from using XML to being an HTML document. 
ViewSource however is XML and will remain XML because of the additional 
capabilities we expect to layer upon it.

Comment 17

19 years ago
If the entire problem is that the document didn't have a meta tag,
then why did it work on my machine? Back when he first found the bug
he put the offending document online and a bunch of us on IRC tried
it out. View Source was broken for many people but it worked for me.
I'm sure I don't have any special i18n settings set in mozilla. 

Comment 18

19 years ago
RickG: I am a little confused by your comment. Are you going to check in
the change that makes plain text use nsHTMLDocument in the near future?
Reassigning this bug back to myself.
 
Assignee: cata → ftang
Status: ASSIGNED → NEW

Updated

19 years ago
Status: NEW → ASSIGNED

Comment 19

19 years ago
reassign View source problem to jbetak
Assignee: ftang → jbetak
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED

Updated

19 years ago
Keywords: beta1

Comment 20

19 years ago
Putting on PDT- radar for beta1.
Whiteboard: [PDT-]

Comment 21

19 years ago
I changed the way nsXMLDocument treats the charset setting of a view-source
webshell: instead of defaulting to the XML document standard, UTF-8, we now
default to the webshell standard charset, which happens to be Latin-1. Even
though the webshell (see bug 27646) currently doesn't inherit the charset info
from the "parent" window, Latin-1 is a good enough starting point for the
problem at hand, and the user can switch to any charset later, since the
charset menu now works properly in view-source mode.
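
A minimal sketch of the fallback order described in this comment: use the charset the document declares if there is one, otherwise fall back to a Latin-1 default instead of UTF-8. The function `decode_for_view_source` and its parameters are hypothetical names for illustration and do not correspond to the actual nsXMLDocument or webshell code.

```python
from typing import Optional

# Assumed stand-in for the webshell default charset (Latin-1).
DEFAULT_CHARSET = "iso-8859-1"


def decode_for_view_source(raw: bytes, declared: Optional[str] = None,
                           default: str = DEFAULT_CHARSET) -> str:
    """Decode page bytes with the declared charset, else with the default."""
    charset = declared or default
    # errors="replace" substitutes U+FFFD for bytes that are invalid in the
    # chosen charset instead of raising an exception.
    return raw.decode(charset, errors="replace")


raw_latin1 = "hé ! ok".encode("iso-8859-1")

# No declaration: the Latin-1 default recovers the accented character.
print(decode_for_view_source(raw_latin1))
# Wrong charset forced: the accented byte turns into a replacement character.
print(repr(decode_for_view_source(raw_latin1, "utf-8")))
```

The `declared` argument also covers the case where the user picks a different charset from the menu afterwards.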

The changes should be in on Sunday or Monday - code review permitting.
Status: ASSIGNED → RESOLVED
Last Resolved: 19 years ago
Resolution: --- → FIXED

Comment 22

19 years ago
I verified this in the 2000021408 Win32 and Linux builds and the 2000021413 Mac build.
Status: RESOLVED → VERIFIED

Comment 23

19 years ago
I'm afraid there's a new problem: the source is cut off from the first accented
character to the end of the page in the 2000-02-14-16-M14 binary for Linux.

take a look at the attached screenshot below...
(Reporter)

Comment 24

19 years ago
Created attachment 5262 [details]
screenshot using 2000-02-14-16-M14 linux binary

Comment 25

19 years ago
Reopening bugs which have NOT been fixed yet. teruko, how could you verify this bug?
jbetak has not checked in his fix YET.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---

Comment 26

19 years ago
mark it as dup of 12502. 

*** This bug has been marked as a duplicate of 12502 ***
Status: REOPENED → RESOLVED
Last Resolved: 19 years ago
Resolution: --- → DUPLICATE

Comment 27

19 years ago
Verified as dup.
Status: RESOLVED → VERIFIED