Closed
Bug 138215
Opened 22 years ago
Closed 3 years ago
Unicode control characters are printed as symbols
Categories
(Core :: Internationalization, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: bronger, Assigned: jshin1987)
Details
(Keywords: intl)
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020313
BuildID: 2002031312
Unicode characters like "Emspace", "ThinSpace" or "PrivateUseOne" (shown in the Unicode code charts enclosed by dashed lines) are printed as their code chart symbols. Correct would be the verbatim output, i.e. a *real* em-space, or simply nothing for "PrivateUseOne". These are only examples; this report applies to all special characters.
Reproducible: Always
Steps to Reproduce:
1. Open the given URL
Actual Results: The characters named above are printed as their code chart symbols.
Expected Results: Verbatim output, i.e. a *real* em-space (broad white space), or simply nothing for "PrivateUseOne".
This doesn't happen if you use named entities in the HTML code. So &#2003; and &emsp; produce different output, which must not be.
Comment 1•22 years ago
To intl.
Assignee: attinasi → yokoyama
Status: UNCONFIRMED → NEW
Component: Layout → Internationalization
Ever confirmed: true
QA Contact: petersen → ruixu
Comment 3•22 years ago
Those are 2 different issues. For the first issue, I could not reproduce it on either Linux or Windows. The 2nd observed behavior is intentional. Because win1252 is so widespread, and MS sometimes misnames it as win-latin1, many webpages take it for granted and use 0x92 for the single quote. Since this code point is not used in latin1 anyway, we interpret it as win1252. Some people may disagree with this implementation, but if we don't do that, we will have tons of bugs and users will blame Mozilla.
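The win1252 fallback described above can be illustrated with a short Python sketch. It is only an analogy: it uses Python's standard `html` module, which follows the HTML5 parsing rules, and is not Mozilla's implementation from the time of this report. The variable names are made up for the illustration.

```python
import html

# The single byte 0x92 as it might appear in a page labelled "latin1".
raw = b"\x92"

# Strict ISO-8859-1 decoding yields the C1 control U+0092 (PRIVATE USE TWO).
as_latin1 = raw.decode("latin-1")

# windows-1252 decoding yields U+2019, the right single quotation mark.
as_cp1252 = raw.decode("cp1252")

# HTML5-style handling of the numeric reference &#146; (0x92) applies the
# same windows-1252 remapping instead of producing the control character.
from_charref = html.unescape("&#146;")

for label, value in [("latin-1", as_latin1),
                     ("cp1252", as_cp1252),
                     ("html charref", from_charref)]:
    print(f"{label:13} U+{ord(value):04X}")
```

The remapping only makes sense for HTML; an XML parser is expected to treat the same reference as the literal code point.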
Status: NEW → ASSIGNED
Comment 4•22 years ago (Reporter)
If you save the following XHTML excerpt
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head> <title>Test Page</title> </head>
<body> <p>&#x2003;&#x91;&#x82;</p> </body>
</html>
to the local file bronger.xhtml (I think the 'xhtml' extension is significant!) and load it into Mozilla 0.9.9 (Gecko/20020313, I use the Linux version), then you get this: EM SP PU1 BPH (i.e. nine letters and one digit), which is wrong. In XML files, &#x...; refers to Unicode code points, and the file is a UTF-8 XML file. No Latin-1 here. (But BTW, encoding="iso-8859-1" wouldn't change anything.) The "EM SP" must in fact be a wide white space, and Mozilla should at least ignore the other two C1 control characters; under no circumstances should it print their "names".
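As a rough check of what an XML parser should hand to the renderer for such a paragraph, here is a small Python sketch. The test string is an assumption: it uses U+2003, U+0091 and U+0082, the code points that match the reported "EM SP PU1 BPH" output, and the `describe` helper is hypothetical.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Assumed test paragraph: an em space followed by two C1 control characters.
doc = "<p>&#x2003;&#x91;&#x82;</p>"

# An XML parser resolves &#x...; to the Unicode code point itself,
# with no windows-1252 remapping.
text = ET.fromstring(doc).text

def describe(ch):
    """Return (code point, general category, name) for one character."""
    return (f"U+{ord(ch):04X}",
            unicodedata.category(ch),
            unicodedata.name(ch, "<no name>"))  # control characters have no name

report = [describe(ch) for ch in text]
for row in report:
    print(row)
```

U+2003 falls in category Zs (space separator), so it should render as white space, while the two C1 characters fall in category Cc (control) and have no visible glyph of their own.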
Comment 5•21 years ago (Reporter)
I've prepared a better demonstration document at <http://tbookdtd.sourceforge.net/unitest.xhtml>. I consider the codes in the table (except for the C1 characters above 83) more or less significant skip characters that should be printed properly. (Although Unicode offers even more.)
Comment 6•19 years ago
shanjian has not worked on Mozilla for 2 years and these bugs are still open. Marking them WONTFIX. If you want to reopen one, find a good owner first.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → WONTFIX
Comment 7•19 years ago (Reporter)
I find this bug-closing policy a little bit odd, but most of the wrong glyphs mentioned here have been fixed without being noted here anyway. The only remaining one that's worth a new bug entry is the zwnj in my opinion.
Comment 8•19 years ago
Mass re-assigning bugs that Frank Tang closed on March 1st. Spam is his fault. Mass re-open to follow.
Assignee: shanjian → nobody
Comment 9•19 years ago
Mass re-open of bugs that Frank Tang closed with no good reason. Spam is his fault, not my own.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Comment 10•19 years ago
Reassigning Frank's old bugs to Jungshik Shin for triage. Sorry for the spam.
Assignee: nobody → jshin1987
Status: REOPENED → NEW
Updated•15 years ago
QA Contact: amyy → i18n
Comment 11•3 years ago
This seems to be working now.
Status: NEW → RESOLVED
Closed: 19 years ago → 3 years ago
Resolution: --- → FIXED