Closed
Bug 17962
Opened 25 years ago
Closed 24 years ago
Display all HTML 4 character entities in browser correctly
Categories
(Core :: Internationalization, defect, P3)
Tracking
()
VERIFIED
FIXED
mozilla0.8.1
People
(Reporter: sidr, Assigned: ftang)
References
()
Details
(Keywords: html4, intl)
Attachments
(7 files)
All of the HTML 4 character entities should display a useful and meaningful
glyph when referenced in an HTML file.
| Reporter | ||
Comment 1•25 years ago
|
||
| Reporter | ||
Comment 2•25 years ago
|
||
At present, there are at least 3 issues outstanding.
1. On Windows NT (and 95) the "Miscellaneous Technical" char entities
testcase crashes the browser, and the browser hangs instead of displaying
the "ISO 8859-1" char entities, preventing inspection of those testcases
on Win32. This is bug 17958.
2. On Linux, &bull; displays as "•" instead of as a bullet. This is
bug 16872. Other character entities may also be affected; working through the
testcases linked from the attachment above will tell the tale.
3. On Linux, &ldquo; and &rdquo; (left and right double quotes) delay loading
of straightforward pages for tens of seconds on at least some machines,
if they end up being found in an ISO 10646-1 (Unicode) character set,
due to the sheer size of the Unicode set. This is bug 14961. On Windows and
Mac, these characters are part of the first 256 characters in any set. This
could be a general problem for other platforms that use ISO 8859-1, rather
than the Windows or Mac adaptations, as their default character set. For
practical purposes, if character entities take tens of seconds to find,
that can't be considered adequate support. Also, even if this took less
time, it is less than ideal to use a different font for the glyphs for
only a few characters.
Issue 1 is waiting on bug 17958.
Issues 2 & 3 are of unknown severity at present. The only simple way
to find out their severity would be to view the character entity testcases
on several platforms other than Windows and Mac.
BTW, the component is set to "internationalization" not because this is
of consequence only for i18n (this is really a "Browser-General" problem),
but because presumably that team is already working with character entities
and character sets.
| Assignee | ||
Updated•25 years ago
|
| Assignee | ||
Comment 3•25 years ago
|
||
This looks like a tracking bug rather than a real bug to me. All the problems
mentioned in this bug report have a separate bug # associated with me. Marking this
M20 since this is a tracking bug. The separate bugs should have different M numbers
since they should be fixed earlier.
Reassigning this bug to erik, even though it is a tracking bug. Most of the issues
mentioned here are GFX issues.
| Reporter | ||
Comment 4•25 years ago
|
||
Sorry, yes, this is *also* a tracking bug, but it probably shouldn't be.
Resolving the three bugs referred to won't invalidate this report.
The only thing that can invalidate this report is testing.
At its core, this bug is the general case for bug 16872, where
&bull; displays incorrectly on at least one platform. The reasoning:
where there's smoke, there may be fire. The real question is, do all
of the HTML 4 character entities that should display something display
something useful on all Platform/OS combos?
Updated•25 years ago
|
Status: NEW → ASSIGNED
Comment 5•25 years ago
|
||
Comment 6•25 years ago
|
||
Comment 7•25 years ago
|
||
Many entities that displayed correctly in M16 are "broken" in the nightly I'm
using now (ID 2000071620). They appear as inverted solid triangles.
There are more complete entity reference pages, but mine is at
http://www.r5i.com/~tim/symbols.shtml, or see the letterlike symbols, math
symbols, and arrow in the test case attachment.
Sorry if this is the wrong place for this. It's the closest I could find in my
Bugzilla search.
These looked OK for me with last week's 2000071108 build on US Win95, but I
see the inverted triangles with today's build, 2000071709.
Reassigned to ftang because erik just left for sabbatical.
Assignee: erik → ftang
Status: ASSIGNED → NEW
| Reporter | ||
Comment 10•25 years ago
|
||
Testing with the 2000-07-17-09-M17 nightly binary on WinNT, 53 of the HTML 4
character entities display the same, incorrect, glyph, one that looks like a
bold, bold left single quote mark.
None of the ISO-8859-1 characters are affected. The affected symbols are mostly
mathematical or quasi-mathematical. They are found in:
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2604
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2606
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2607
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2608
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2610
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2616
Nominating for nsbeta3 - surely reviewers will subject it to test suites,
and this is a very basic test to be failing. Looking at comments in bug
45543, the full fix for this will probably be waiting until Erik gets back:
see the Additional Comments From rbs@maths.uq.edu.au 2000-07-17 11:18.
This bug would almost certainly depend on bug 45543 except that there are no
Korean characters in the HTML 4 set; the root problem looks to be the same.
rbs@maths.uq.edu.au, is the current problem with the math glyphs a blocker for
you?
Updating Platform/OS to All/All, as this will need to be verified everywhere
for HTML 4.0 compliance.
Comment 11•25 years ago
|
||
> is the current problem with the math glyphs a blocker for you?
No, it isn't. The hack I indicated on 45543 is temporarily doing the trick.
When I visit the links you gave above, they look okay (with missing glyphs
represented by '?' as expected). Also, MathML-enabled builds include the ucvmath
module which gives access to more mathematical/scientific symbols for those who
have the corresponding fonts.
(Notice that the default Mozilla can display many of the HTML4 symbols if
the user has the "Lucida Unicode Sans" font.)
| Assignee | ||
Comment 12•25 years ago
|
||
Reassigning back to erik. It seems the fix for 45543 is good for the short term, and we
should wait for erik to return to fix the rest.
Assignee: ftang → erik
Comment 13•25 years ago
|
||
This now WORKSFORME completely on Windows 2000 commercial build 6.0.17.2000080104.
Should this be closed, or are there remaining issues?
Whiteboard: WORKSFORME?
| Reporter | ||
Comment 14•25 years ago
|
||
> Should this be closed, or are there remaining issues?
Not quite yet, and who knows? To date the greatest number of font display
problems have occurred on Linux, but this also needs testing on Mac to
be sure that all of the character entities display properly on the Tier 1
builds. For font issues, testing on Windows only is not enough.
Whiteboard: WORKSFORME? → Win: WFM; Linux: ???; Mac: ???;
Comment 15•25 years ago
|
||
Agreed.
Eli, can you verify that this is WORKSFORME on all three tier 1 platforms?
The attachment is a set of links to the comprehensive testcases you were
after the other day...
QA Contact: teruko → elig
Comment 16•25 years ago
|
||
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2608
does not seem to be the right testcase
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2610
I get '?'s for the first 4 entities
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2616
I get '?' for zwnj through rlm.
Linux build 2000.08.02.08 on RH 6.2
| Reporter | ||
Comment 17•25 years ago
|
||
| Reporter | ||
Comment 18•25 years ago
|
||
The second attachment shows all of the HTML 4 character entities in named
and numeric form in one testcase; now that random entities aren't crashing
Mozilla, that's feasible and convenient.
New testcase for spacing and zero-width characters in text:
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=12289
Testing with 2000-08-02-08-M18 shows the "Windows" results in the next paragraph
Remaining problem character entities:
On Linux: &lceil; &rceil; &lfloor; &rfloor; &zwnj; &zwj; &lrm; &rlm;
On Windows: &zwnj; &zwj; &lrm; &rlm;
On Mac: as yet unknown
Richard: yeah, 2608 is not a character entity testcase; never was;
mea culpa typoa: should have been 2609.
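For anyone rebuilding a combined testcase like the one described above (each entity shown in named and numeric form side by side), a row can be generated mechanically. The following is a minimal sketch in C, not the code used to produce the attachment; the function name format_entity_row and the three-column layout are assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the attachment): writes one HTML table row
 * for a character entity into buf. The row has three cells:
 *   1. the entity name as literal text (escaped with &amp;),
 *   2. the named reference, which the browser should render as a glyph,
 *   3. the decimal numeric reference for the same code point.
 * Returns the value of snprintf, i.e. the length the row needs.
 */
int format_entity_row(char *buf, size_t size,
                      const char *name, unsigned codepoint)
{
    return snprintf(buf, size,
                    "<tr><td>&amp;%s;</td><td>&%s;</td><td>&#%u;</td></tr>",
                    name, name, codepoint);
}
```

If the second and third cells of a row render differently, the named and numeric forms of that entity are being handled inconsistently, which is the kind of mismatch the testcases are meant to expose.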
Whiteboard: Win: WFM; Linux: ???; Mac: ???; → Win: problems; Linux: problems; Mac: ???;
| Assignee | ||
Comment 19•25 years ago
|
||
bug 47714 is about Mac and Symbol entity set.
| Reporter | ||
Comment 20•25 years ago
|
||
Updated remaining problem character entities:
On Linux: &lceil; &rceil; &lfloor; &rfloor; &zwnj; &zwj; &lrm; &rlm;
On Windows: &zwnj; &zwj; &lrm; &rlm;
On Mac: &lceil; &rceil; &lfloor; &rfloor; (Miscellaneous Technical)
It is entirely possible that the problems with the Miscellaneous Technical
glyphs on Linux have a similar cause to the codepoint translation
problem on Macs (bug 47714).
Note that &zwnj;, &zwj;, &lrm;, and &rlm; are displaying properly as "nothing"
in the "General Punctuation" table in the second attachment, but in the
"Spacing and Zero-width Characters", either a "?" or a thin vertical bar is
appearing when they are placed in text - and these characters' normal habitat
is in the midst of printable text.
| Reporter | ||
Comment 21•25 years ago
|
||
It always helps, when evaluating testcase results, to know what to expect.
For &lrm; and &rlm;, it appears that visible glyphs looking almost like
thin vertical bars, with tiny right- and left- pointing arrows at the top,
should be expected.
To see this clearly, view http://www.hclrss.demon.co.uk/demos/ent4_frame.html ,
scroll down in the left frame to _left-to-right mark_, and click on that link.
Looking carefully at the in-text testcase (end of second attachment), the same
glyphs are shown on WinNT testing with 2000-08-09-08-M18 -- they are smaller,
and butted against the adjacent text characters, but the characters are clearly
*not* just thin vertical bars. On the other hand these characters mysteriously
do not appear when they are the only content of a table cell.
| Assignee | ||
Comment 22•25 years ago
|
||
I don't think it is reasonable to fix
&zwnj; &zwj; &lrm; &rlm;
for any platform. These characters are control characters, and their rendering
should not be tested alone; instead, their appearance should change depending on the
surrounding characters. There is no visual requirement for how to display them alone.
Some applications and OSes display them one way or the other. What matters is how
they change the rendering of the surrounding characters. For example, they should
change the display behavior of Arabic and Indic scripts.
The &lceil; &rceil; &lfloor; &rfloor; issue should be possible to fix if we
remap according to the HTML mapping instead of the Adobe mapping. We need to remap
the Mac code also, but it should be easy.
| Assignee | ||
Comment 23•25 years ago
|
||
&lceil; is U+2308 in Unicode. Looking at the Symbol font, it appears to be at code
point 0xE9, which means that on Macintosh the font encodes it as U+F8EE.
&rceil; is U+2309 in Unicode. Looking at the Symbol font, it appears to be at code
point 0xF9, which means that on Macintosh the font encodes it as U+F8F9.
&lfloor; is U+230A in Unicode. Looking at the Symbol font, it appears to be at code
point 0xEB, which means that on Macintosh the font encodes it as U+F8F0.
&rfloor; is U+230B in Unicode. Looking at the Symbol font, it appears to be at code
point 0xFB, which means that on Macintosh the font encodes it as U+F8FB.
Therefore, the way to fix this bug is to change the Unicode-to-Symbol-font
mapping table so that the entries for 0xE9, 0xEB, 0xF9, 0xFB map to
2308, 230a, 2309, 230b.
To fix Mac, we should add four if branches:
if ((0x2308 <= u) && (u <= 0x230b)) {
  /* remap ceiling/floor brackets to the Mac Symbol font's code points */
  if (u == 0x2308)
    u = 0xf8ee;
  else if (u == 0x2309)
    u = 0xf8f9;
  else if (u == 0x230A)
    u = 0xf8f0;
  else if (u == 0x230B)
    u = 0xf8fb;
}
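For reference, both halves of the proposal (the Unicode-to-Symbol table entries and the Mac remap) can be sketched as two small standalone C functions. This is only an illustration under stated assumptions: the function names remap_bracket_for_mac and symbol_byte_for_bracket are hypothetical and are not taken from the Mozilla source, and only the code point values come from the analysis above.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the actual Mozilla code: remap the four
 * ceiling/floor bracket code points to the values the analysis above
 * gives for the Mac Symbol font. Any other value passes through unchanged.
 */
uint16_t remap_bracket_for_mac(uint16_t u)
{
    switch (u) {
    case 0x2308: return 0xF8EE; /* left ceiling,  Symbol byte 0xE9 */
    case 0x2309: return 0xF8F9; /* right ceiling, Symbol byte 0xF9 */
    case 0x230A: return 0xF8F0; /* left floor,    Symbol byte 0xEB */
    case 0x230B: return 0xF8FB; /* right floor,   Symbol byte 0xFB */
    default:     return u;
    }
}

/*
 * Hypothetical helper for the Unicode-to-Symbol direction: returns the
 * Symbol font byte for one of the four brackets, or 0 for any other
 * code point (0 is used here only as a "not mapped" marker).
 */
unsigned char symbol_byte_for_bracket(uint16_t u)
{
    switch (u) {
    case 0x2308: return 0xE9;
    case 0x2309: return 0xF9;
    case 0x230A: return 0xEB;
    case 0x230B: return 0xFB;
    default:     return 0;
    }
}
```

A switch keeps each comparison explicit, which avoids the easy slip of writing an assignment (=) where a comparison (==) is intended in a chain of if/else branches.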
| Assignee | ||
Comment 24•25 years ago
|
||
nsbeta3- per bug meeting (ekrock)
Whiteboard: Win: problems; Linux: problems; Mac: problems; → [nsbeta3-]Win: problems; Linux: problems; Mac: problems;
Comment 25•25 years ago
|
||
Accepting bug, but marking Future, since it's nsbeta3-.
Status: NEW → ASSIGNED
Target Milestone: M20 → Future
Updated•24 years ago
|
QA Contact: elig → teruko
Comment 26•24 years ago
|
||
Nominating for Mozilla1.0 as a polish/compliance issue.
Keywords: mozilla1.0
Comment 27•24 years ago
|
||
Frank, I'm reassigning this to you since you seem to know what to do, and I
don't know my way around your Unicode conversion tables.
Should this be marked nsbeta1?
Assignee: erik → ftang
Status: ASSIGNED → NEW
| Assignee | ||
Comment 28•24 years ago
|
||
We should probably fix the Mac part. That should be easy to do. Marking this bug as P3,
mozilla0.9, for the Mac enhancement part only.
Comment 29•24 years ago
|
||
Changed QA contact to andreasb@netscape.com for now.
QA Contact: teruko → andreasb
| Assignee | ||
Comment 30•24 years ago
|
||
The fix for the &lceil; &rceil; &lfloor; &rfloor; display problem on Mac was checked in
on 8/15/2000. It seems the only remaining issue is these entities in Gtk.
| Assignee | ||
Comment 31•24 years ago
|
||
OK, I also fixed Gtk. Here is the patch.
| Assignee | ||
Comment 32•24 years ago
|
||
| Assignee | ||
Comment 33•24 years ago
|
||
| Assignee | ||
Comment 34•24 years ago
|
||
| Assignee | ||
Comment 35•24 years ago
|
||
| Assignee | ||
Comment 36•24 years ago
|
||
| Assignee | ||
Comment 37•24 years ago
|
||
bstell, can you review this?
Target Milestone: mozilla0.9 → mozilla0.8.1
Comment 38•24 years ago
|
||
sr=erik
Looks good.
Comment 39•24 years ago
|
||
since this is only used for converting from Unicode to Adobe code for display
this is okay.
r=bstell@netscape.com
| Assignee | ||
Comment 40•24 years ago
|
||
Fixed Linux lceil/rceil/rfloor/lfloor.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 41•24 years ago
|
||
Verifying this bug; however, see the new bug report (bug 75059), which narrows down
the problematic characters.
Status: RESOLVED → VERIFIED