Closed
Bug 6585
Opened 25 years ago
Closed 24 years ago
missing glyphs should use substitute
Categories
(Core :: Layout, defect, P1)
Tracking
()
VERIFIED
FIXED
M16
People
(Reporter: erik, Assigned: erik)
References
()
Details
(Keywords: css1, Whiteboard: [PDT-])
Attachments
(5 files)
If we cannot find any font that contains a glyph for the particular character we are trying to measure/draw, then we should pop up a window that explains where you can download such a font, similar to the window that comes up when you visit a page containing a plug-in that you don't have yet. Currently, the measuring code returns zero, and the drawing code doesn't attempt to draw anything for such characters. (We might want to at least draw some boxes in this case, like Windows does.) For the pop-up window, there are various issues to consider: What do we say to the user? "You don't have a font for the character U+1234"? "Would you like to download a Japanese font?" Etc. Do we automatically re-render the page after the user has downloaded the font? How often do we trigger the pop-up? Once per page? Once per session? Once per session for each character? For each "script" (Unicode range)? Etc.
Assignee | ||
Updated•25 years ago
|
Status: NEW → ASSIGNED
Target Milestone: M15
Updated•25 years ago
|
OS: other → All
Summary: pop up window when no font found for particular character → {css1} missing glyphs should use substitute (U+FFFD)
Comment 2•25 years ago
|
||
The following page: http://www.bath.ac.uk/%7Epy8ieh/internet/eviltests/glyphs.html ...should help with this bug (it has some obscure maths codes in it). You should definitely be drawing something for unknown glyphs. That is a {css1} compliance issue. I am increasing the severity for the moment, since the missing glpyh is an error, not an enhancement. Technically, if you cannot display a glyph then you should be displaying the glyph with unicode code point U+FFFD, REPLACEMENT CHARACTER. If that doesn't map to anything, then, well... a space (U+0020) would be better than nothing! A popup window would be a possibility, but see also the suggestion in bug 6211: a window detailing all errors in the document (HTML, CSS, image downloading problems, and so on). I will add "glyphs not found" to the list of errors that should be reported.
Assignee | ||
Comment 3•25 years ago
|
||
See also: http://www.w3.org/TR/REC-html40/charset.html#h-5.4
Adding lyecies@netscape.com to cc list.
Comment 6•25 years ago
|
||
Here is the test case from bug 17567: http://bugzilla.mozilla.org/showattachment.cgi?attach_id=2486 BTW, on my system IE5 has a really amusing attempt at this: it displays three different "missing glyph" glyphs!!! :-) I'm hoping we will be slightly more consistent than that...
Bug 17567 had a note saying that the code to fix this bug (or at least the part covered by 17567) has already landed, but is #ifdef MOZ_MATHML.
Assignee | ||
Comment 8•25 years ago
|
||
Actually, the font code that Roger checked in for MathML was mostly written by me, and was designed to be extensible, so that we could extend it to cover this bug too. So, it isn't just a matter of making the #ifdef MATHML code be the default. We need to write some more code.
Comment 9•25 years ago
|
||
Migrating from {css1} to css1 keyword. The {css1}, {css2}, {css3} and {css-moz} radars should now be considered deprecated in favour of keywords. I am *really* sorry about the spam...
Assignee | ||
Comment 10•25 years ago
|
||
Moving all of my M15s to M16. Please add comments if you disagree.
Target Milestone: M15 → M16
Updated•25 years ago
|
Summary: {css1} missing glyphs should use substitute (U+FFFD) → missing glyphs should use substitute (U+FFFD)
I would hope to see this fixed before M16. How hard is it? I would nominate it for beta1 if I were more sure that it isn't that complicated... It's not a minor issue, especially when reading texts in certain languages (that are not the user's default language).
Assignee | ||
Comment 12•25 years ago
|
||
Beta1 candidacy is not so much dependent on how hard it is to fix it, but how much we need it for "beta quality", which is of course difficult to define, but the PDT team has a pretty good sense of what is important and what isn't. So, how about just nominating for Beta1, by adding the beta1 keyword above?
OK, nominating for beta1, because I think it's important that the user knows when bits of text are missing.
Keywords: beta1
Comment 14•25 years ago
|
||
Agreed ... it's frustrating that there's content on the page that mozilla just doesn't show at all. 4.x shows them wrong, but at least it shows that there's something there.
Comment 15•25 years ago
|
||
PDT indicated it would not hold beta for this... so PDT-
Whiteboard: [PDT-]
Comment 16•25 years ago
|
||
Perhaps a tooltip can be displayed when hovering over a missing glyph (U+FFFD), e.g. _Latin small letter e with inverted breve_ for U+0207.
Comment 17•25 years ago
|
||
How can you hover over something that's not there at all? And how would the user know where to hover?
Comment 18•25 years ago
|
||
> How can you hover over something that's not
> there at all? And how would the
> user know where to hover?
When a glyph can't be found, the REPLACEMENT CHARACTER U+FFFD should be displayed. When hovering over this, the name of the missing glyph can be shown as a tooltip. Hope this makes things clearer.
Comment 19•25 years ago
|
||
Erik, now that I am wading into fonts again for MathML add-ons, things are pretty fresh in my memory, and I may lay a helping hand w.r.t. this bug in the Win32 corner. The fix has to be in FindGlobalFont(), right? Rather than return nsnull, return the first font that has the Unicode's REPLACEMENT CHAR?
Comment 20•25 years ago
|
||
Thinking again, just returning a font there is not enough! Since this doesn't tell to other rendering/measuring functions that that particular char has to be re-mapped to the missing char.
Assignee | ||
Comment 21•25 years ago
|
||
Right, you can't just return a font that contains a glyph for U+FFFD. I would rather make it more general, and return an object that implements the GetWidth and DrawString APIs, but isn't actually a font on the inside. It is just a couple of methods that deal with rectangular boxes on the screen. I.e. the GetWidth method returns the width of the boxes that would be drawn by the DrawString method. Also, we want to create another subclass that maps Unicodes to other character codes for which we *do* have glyphs on the system. E.g. Unix, with limited iso8859-1 fonts, where windows-1252 docs can't be displayed normally. But a box-drawing subclass is a good first step, and I think we need it anyway.
Comment 22•25 years ago
|
||
Got a fix in my tree. As you suggested, I added another subclass (nsFontWinSubstitute) which acts as a fall-back for chars without glyphs. Great job with this foundation, erik! I didn't implement a drawing box font. I am simply using an existing font (out of the global list) that has the replacement char. This font is added in mLoadedFonts[], but its map only contains chars without glyphs, and it is not added in the hashtable (this way, it cannot be looked up, and therefore it will not interfere with the other font switching tricks, and its name doesn't matter). The way to get hold of the font is with something like mLoadedFonts[mIndexOfSubstituteFont], which is then used to instantiate a nsFontWinSubstitute that has its own GetWidth/DrawString. [e.g., GetWidth('xyz') actually does GetWidth('???'), DrawString('xyz') actually does DrawString('???')]; Since the font is kept in mLoadedFonts[], subsequent searches for the same chars without glyphs are speedy, and return the substitute font straight away.
Comment 23•25 years ago
|
||
Should this go in M14?
Assignee | ||
Comment 24•25 years ago
|
||
The PDT team has already looked at this and marked it PDT- (i.e. not a candidate for Beta 1). Your work *is* appreciated though, especially by me!
While the tree is still open (i.e., today), why should not being marked PDT+ prevent someone who doesn't work for Netscape from checking in the fix (assuming it's reviewed, etc.)? Some of us do think this should definitely be in beta1.
Comment 26•25 years ago
|
||
I wondered that as well. The tree isn't closed to non-PDT+ fixes, it's just that Netscape engineers aren't supposed to be spending time tracking down non-PDT+ fixes. Or at least, that's been my understanding.
Assignee | ||
Comment 27•25 years ago
|
||
I don't have time to review the code since I'm working on PDT+ bugs. If somebody else does have time to review it, and both Roger and the reviewer can promise that this code will not cause other bugs in M14 or cause performance problems, then I may give my reluctant "OK". I'm not trying to be mean here. I've had years of experience shipping betas, and this is not the right time to introduce significant numbers of lines of code in sensitive areas for features that are not even deemed PDT+ by the PDT team. When you are close to the deadline, a fix must be for something that is *both* actually needed *and* very safe and understood.
Assignee | ||
Comment 28•25 years ago
|
||
By the way, Windows GFX changes now require Troy's review, and Kevin McCluskey is the GFX owner, so his review is also required.
Assignee | ||
Comment 29•25 years ago
|
||
By the way, as Jim Roskind said in mozilla.seamonkey, M14 may not become Netscape's Beta 1. If M14 has too many beta-stoppers, M15 will be the target. So there is still some chance that this fix will make Netscape's Beta 1, and even if it doesn't we can still get it into Beta 2.
Comment 30•25 years ago
|
||
If you can post the diffs and get Troy and Kevin McCluskey to review the changes, go for it... cc'd troy and kmcclusk
Comment 31•25 years ago
|
||
Comment 32•25 years ago
|
||
Comment 33•25 years ago
|
||
Comment 34•25 years ago
|
||
Comment 35•25 years ago
|
||
Comment 36•25 years ago
|
||
Ok guys, now you have everything. Upon review, the decision is yours. I am pretty confident that the code is now solid. It makes the browsing experience quite nice. Erik has been very helpful in providing insights and a great deal of feedback during development. I understand his prudence, and if the code has to wait, no problemo. The part in intl consists in adding extra mapping tables to enable the conversion of further math fonts. These fonts can be enabled by default in gfx by removing the #ifdef MOZ_MATHML that is in gFontsHaveConverters, if you deem appropriate. Screenshots of the testcases when these were enabled and disabled were attached. The fact that some math chars "overflow" in the testcase is due to the FindFont('a') hack --it returns the "ascii" metrics that are used as the default metrics. This is a recorded bugzilla bug 20394 which erik plans to address with a GetDimensions() method that will supersede the current GetHeight/GetWidth.
Comment 37•25 years ago
|
||
Of note, the patch for gfx can be tried in *isolation*. So, feel free to apply it on gfx, and report what you see (on win32)...
Assignee | ||
Updated•25 years ago
|
Severity: normal → major
Priority: P3 → P1
Target Milestone: M16 → M15
Comment 38•25 years ago
|
||
The bug is scheduled for M15, meaning there is time for improvements on the present solution, if worthwhile. Currently, since "viewsource" goes through the same rendering process, it means that chars without glyphs are still renderered with the REPLACEMENT CHAR, even within "viewsource". Is it, instead, desirable to display the unicode points there? Implementing this is not really demanding, because the font engine knows everything, and can either choose to replace the chars, or render their unicode points. However, "viewsource" is currently known to be notoriously slow and the question is whether it is worthwhile doing these things that will make it even slower. Also, the font engine will then have to know that the consumer is "viewsource" BTW, is there a common unicode font that actually has a glyph for the Unicode's replacement char U+FFFD?
Assignee | ||
Comment 39•25 years ago
|
||
nsFont.decorations is a PRUint8 bit-mask that could be used for your View Source idea. If we had 2 bits, we could have 4 modes: (1) show glyphs as usual (like browser) (2) unavailable glyphs are shown as Unicodes (3) all non-ASCII chars (> 0x7F) are shown as Unicodes (4) all chars are shown as Unicodes Number (4) is not that useful though...
Comment 40•25 years ago
|
||
Sounds good. Some comments/clarifications: (1) show glyphs as usual (like browser). All missing chars show up as the replacement char (this is the current behavior --not really helpful, but can be left as an option, as it is the fatest mode, and is a sort of WYSIWYG too!). (2) unavailable glyphs are shown as Unicodes -- this mode allow copy-pasting of the screen content of viewsource. (3) all non-ASCII chars (> 0x7F) are shown as Unicodes -- Not very sure how helpful is this. So the reformulation: (3) all non-ASCII chars (> 0x7F) are shown as both Unicodes and Glyphs, in the format "0xUnicode{Glyph}". Hence a char withouht glyph will show up as "0xUnicode{REPLACEMENT_CHAR}".
Comment 41•25 years ago
|
||
I now have all this coded up in my tree, and there are in sync with the changes Troy made recently --bug 27056, to optimize the use of SelectObject(). For (1), that was the original code. I didn't have to do anything --save bringing the code in sync. For (3), this is coded in nsRenderingContextWin. I added a convenience function SubstituteNonAsciiChars() that checks to see if some non-ascii chars have to be substituted. And... it is the output of this transformation that is used in the body of GetWidth and DrawString. (This way, the font-swithing machinery will pick the glyphs... For example, "Å" will appear as "ÅU+000C5". "Å" could be any strange glyph that could be picked in any font.) The SubstituteNonAsciiChars() function is carefully optimized and does nothing if nothing has to be done (e.g., when the current text run consists only of ascii chars). Concretely, the code of GetWidth(aString) and DrawString(aString) now loos like: PRUnichar* pstr = aString; if (shouldSubstituteNonAsciiChars) { pstr = SubstituteNonAsciiChars(aString); } ... use the *same* code as before to do the measurement/rendering using pstr For (2), this is coded by nsFontWinSubstitute in nsFontMetricsWin. (Doing so fulfills the intended purpose because nsFontWinSubstitute only applies to chars without glyphs.) The code of GetWidth(aString) and DrawString(aString) there now loos like: PRUnichar* pstr = aString; if (shouldDisplayUnicode) { ... sprintf Unicode points } else { ... default to the REPLACEMENT CHAR } ... use the *same* code as before to do the measurement/rendering using pstr ==== It is interesting to see the HEX (huh, Unicode) modes in action, and as you can see from the description above, the code is solidly standing upon the existing foundation and should therefore be as robust as the foundation :-) That's why I am pretty confident that the code is OK and should go in M15 as early as possible to enable further testing and enjoyement by others. What is missing in the code is how to activate these 'shouldDisplayUnicode' or 'shouldSubstituteNonAsciiChars'.
Comment 42•25 years ago
|
||
I just ran across this font problem while porting to the Photon UI, which may be the same as this problem, on articles at www.abcnews.com I see HTML that looks like: <font face= geneva,arial,helvetica size=5><b> <a href="/sections/us/DailyNews/bettyloubeets000218.html" > Weighing ‘Black Widow’s’ Fate </a> </b></font><br> The "‘" character looks like an arrow pointing upward under Photon. It is correct under Windows, Mac and Linux Netscape 4.7 but it is not correct under Photon or GTK Mozilla. I found out that numeric entities between 130-159 are special and need to be translated. Check out this link for more info: http://www.hut.fi/u/jkorpela/www/windows-chars.html I guess I can put some sort of translation in my own GFX code to do the translation but I was not sure if the purpose and fix for this bug would also fix this problem. Does this bug encompass this problem??
Those numeric entities should not be supported in strict mode, since they are undefined. (However, as characters in the document, they could be supported if it's in a windows encoding, rather than iso-8859-1.)
Comment 44•25 years ago
|
||
I am not sure if these codes are treated as special cases somewhere, and re-mapped to their proper unicode values --At the level of GFX they are not treated as special cases.
Assignee | ||
Comment 45•25 years ago
|
||
NCRs (numeric character references) like ‘ are indeed illegal, but very common on the Net. Our parser converts them to the correct Unicodes, and GFX then tries to find fonts with glyphs for those Unicodes. On Unix we have for a long time had the problem that Windows extensions in the 0x80-0x9F range could not be displayed. This bug report does indeed refer to that kind of problem. There are several bug reports assigned to me in this area, some Unix-specific.
Comment 46•25 years ago
|
||
BTW, we still need additional API to really nail this bug 6585 down. The substitution mode is not yet defined and I saw in bug 5100 that peterl didn't like seeing nsFont used for extra things. So we will need some other ways to pass rendering options in GFX. There could be a SetFlags() (or whatever) to specify rendering policies. RFE: going back to the initial questions raised in this bug report, an interesting option could be to pool/sort all the missing chars, in view of possibly sending a one-off feedback to the user (bug 6211, with a listing of names?), or in view of determining the fonts that can be auto-downloaded (or suggested for download). Note that it would be hard to implement a tooltip --compared to improving the viewsource trick to show their unicode in an ERROR-type color, but even achieving this is not straightforward.
Assignee | ||
Comment 47•25 years ago
|
||
This bug is now for Windows only, and assigned to Roger for M16 (i.e. after Beta 1 branch). I created bug 31252 for Unix. Mac already has substitutes.
Assignee: erik → rbs
Status: ASSIGNED → NEW
OS: All → Windows NT
Hardware: All → PC
Summary: missing glyphs should use substitute (U+FFFD) → missing glyphs should use substitute
Target Milestone: M15 → M16
Comment 48•24 years ago
|
||
Code checked-in. Chars with missing glyphs are now substituted. Currently only one replacement mode is activated. Additional API is needed to switch to different modes on the fly.
Assignee | ||
Comment 49•24 years ago
|
||
Roger, thanks for taking on this work, and for checking it in. This bug report is out of control, and we ought to create separate bug reports for the remaining issues. Re-assign the bug back to me if you'd like me to do that.
Comment 50•24 years ago
|
||
Re-assigning to erik to better organise where to head to: * Use substitute as per bug title (done). * API to flag different rendering policies (e.g., could be with a general struct, or for a start, a bitarray with some bits for the replacement mode) * Visual feedback of missing chars (Error-color for ViewSource? popup?) * Auto-download of fonts ? * Error log ? (collect all missing chars -- bug 6211) ?
Assignee: rbs → erik
Assignee | ||
Comment 51•24 years ago
|
||
At some point, I will look into this bug, and create one or more new bugs to cover the remaining issues. For now, setting milestone to M17.
Status: NEW → ASSIGNED
Target Milestone: M16 → M17
Comment 52•24 years ago
|
||
I'd prefer we think about the design sooner, so we can define and prioritize tasks and possibly delegate them as well...
Comment 53•24 years ago
|
||
I did similar thing on Mac already. However, we should do special process for "space" characters in the range of U+2000-U+206F. (some of them, not all of them). There are no reason we cannot "render"/"measure" a space characters, even non of the font have these glyph (a glyph for a space character is simply a matrix without any countor....)
Assignee | ||
Comment 54•24 years ago
|
||
This bug is *way* out of control. Please do not add any more comments to it. I have created a new bug for the Unicode space character issue: http://bugzilla.mozilla.org/show_bug.cgi?id=33032
Assignee | ||
Comment 55•24 years ago
|
||
Please do not add any more comments to this bug. Marking FIXED (since we now at least get question marks for unavailable glyphs). Other issues have been separated out. Thanks, Shrirang! From: shrir@netscape.com (Shrirang Khanzode) I have seperated the other issues from bug 6585. Bugs logged: Bug 41251 : API to flag rendering policies Bug 41249: Visual feedback for missing characters Bug 41250 : Auto download of fonts (Filed already) Bug 6211 - Error log to collect missing characters
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Target Milestone: M17 → M16
You need to log in
before you can comment on or make changes to this bug.
Description
•