We need to draw the "missing glyph" boxes on all platforms.
Created attachment 257300 [details] [diff] [review] fix I think this is the right idea. But for some reason it doesn't work on Windows. For example with the testcase "ᝣ" we call GetCharacterPlacement and it returns advance width zero, glyph index 3. So I can't tell that this is a missing glyph, and even if I could, how do I get the missing-glyph glyph to render instead?
Attachment #257300 - Attachment mime type: application/octet-stream → text/plain
That question gave me a lot of grief in bug 353756. See comments there and my "fonts suck" vent at http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/gfx/thebes/src/gfxWindowsFonts.cpp&rev=1.86&mark=1275-1284#1273
I'm using that logic in the Uniscribe path.
Attachment #257300 - Attachment is patch: true
I'll fix this up after my patch in bug 372631 lands. I'd actually like to copy Pango's approach of drawing the Unicode character number in the missing-glyph box (hexboxes). We can't use Pango's because we don't draw using pangocairo and a cross-platform solution would be great. So I'm thinking of doing it cross-platform, mostly in gfxFont::Draw. I'll modify attachment 257300 [details] [diff] [review] to store the Unicode codepoint for a character with missing glyphs in the mGlyphID for the DetailedGlyph record for the character. We'll have a 3x5 custom bitmap "font" that we can render via a temporary image surface and some algorithm for calculating the desired size of the hexbox based on the font metrics and actually drawing 4 or 6 hex digits in it plus a border. This will give us an advance width for each hexbox. In gfxFont::Draw we will draw any necessary hexboxes as we traverse the textrun building the glyph array that we pass to cairo_show_glyphs.
Created attachment 259164 [details] [diff] [review] draw missing-glyph boxes including "hexboxes" showing UTF16 codepoint Okay, here's a cross-platform implementation of hexboxes. I'm drawing my own missing-glyph box (with 50% opacity) and my own hex numerals, but it's not much code and it's nice to have on all platforms. I'll attach a screenshot in a moment.
Will this override the Mac OS X missing-character glyphs? Zoomed out they appear as a boxed letter chosen to represent the language, and if you zoom way in, you can see the name of the language (in English) and the unicode range for that language. They're more useful than hexboxes in some ways, as well as prettier and more Mac-like ;)
Here's the doc for Mac OS Last Resort Font : http://developer.apple.com/textfonts/LastResortFont/ The glyphes : http://developer.apple.com/textfonts/LastResortFont/LastResortGlyphs236pp.pdf It seems to me that whenever there's a usual way of handling this on any OS, we should reuse it : - so use LastResort font on Mac OS (it was there already in Mac OS 9) - use pango's hex boxes on Linux ( Would there be a way to use them instead of reimplementing them ? http://bugzilla.gnome.org/show_bug.cgi?id=313551 for implementation info ) - use whatever is convenient under Windows, maybe pango's hex boxes.
On a platform where no meaningful (better than '?') missing glyph/last resort font is natively available (Windows), I'd use a rectangle with 6-digit hex number rather than two 4-digit hex numbers to represent the corresponding surrogate pair. That's what Pango does (I guess) and what the old Gfx:Xft does.
This feedback would have been more useful *before* I'd done the work of implementing it. I don't know how to select the Mac 'last resort' font in our font fallback code. Vlad left a comment saying he didn't know how to select it either. We can't reuse the Pango code without going through pango-cairo, and it's easier and faster to call cairo_show_glyphs than to go through pango-cairo. For Windows, we need this code anyway.
I thought about the non-BMP case, but it's a bit more code to handle a very rare situation, and it's really only a slight improvement in that situation (anyone who cares can figure out the character from the surrogate pair), so I don't think it's worth the effort. I'd accept a patch though :-). It would probably require per-platform code to put the correct character into mGlyphID and also require some small changes to the missing-glyph drawing code. Note that Apple's "last resort" font probably can't handle non-BMP characters (Apple only supports 65535 glyphs per font).
Sorry roc, I wasn't following this bug until your last blog entry. It's good to have your current version and we can open another bug about making things more platform conformant on Mac in the future. BTW Last Resort has glyphs for many non BMP languages, it even has glyphs for script that are only proposition for inclusion in unicode.
For what it's worth, I'd really like to see *something* land here, because this is causing a bunch of reftests that should be marked "fails" to have to be marked "random" because we don't know whether the characters are present or not. (bugs/351641-2b.html, counters/t1202-counter-10-b-test.html, counters/t1202-counters-10-b-test.html). Things can always be refined later...
Comment on attachment 259164 [details] [diff] [review] draw missing-glyph boxes including "hexboxes" showing UTF16 codepoint Can you rename gfxMissingGlyphDrawing to something like gfxFontMissingGlyph or gfxFontSomething just make it easier to find things related to fonts? any idea what the perf hit is?
Attachment #259164 - Flags: review?(pavlov) → review+
(In reply to comment #13) > BTW Last Resort has glyphs for many non BMP languages, it even has glyphs for > script that are only proposition for inclusion in unicode. OK. I guess that can work because they don't have one glyph per character, just one glyph per language. Once someone figures out how to do it, adding it to our code would be relatively easy. (In reply to comment #15) > Can you rename gfxMissingGlyphDrawing to something like gfxFontMissingGlyph or > gfxFontSomething just make it easier to find things related to fonts? Sure > any idea what the perf hit is? No, but I'll be very surprised if it impacts perf. I haven't added anything significant to any important paths. Pages full of missing glyphs will slow down a lot of course.
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
I now see "missing glyph" in some cases where there should be a valid glyph, like in this page "in‑testsuite" the '-' is rendered as a box with 20|11 in it.
Were you seeing the glyph before this landed, or a blank space? I've been seeing that character missing for a while, since Linux font fallback to anything other than specified fonts hasn't been working for ages. But there's a separate bug on that.
Ah, true, in a3 the character is missing.
Created attachment 259845 [details] [diff] [review] fix Windows reftest regression 1) Handle Unicode zero-width space specially on Windows --- even if the font doesn't have a glyph for it, don't show a missing-glyph box 2) Robustify the gfxFont code so that we can handle missing glyphs without the box information in mDetailedGlyphs
Comment on attachment 259845 [details] [diff] [review] fix Windows reftest regression Vlad could review this too, I think, if Stuart's offline
Created attachment 259850 [details] [diff] [review] updated patch Update; force ZWSP to be invisible and zero-width even if the font has a non-zero-width glyph for it.
Comment on attachment 259850 [details] [diff] [review] updated patch if a font has a non-zero width glyph then thats up to it. We should honor it. I'd prefer we don't special case it for when it is missing either since knowing what the character is is useful. what exactly is this trying to fix?
Created attachment 259868 [details] [diff] [review] updated patch Alright, this is better. It just refines the IsGlyphMissing check to accept fZeroWidth as a valid glyph if the Unicode character is a zero-width space.
Comment on attachment 259868 [details] [diff] [review] updated patch Get rid of UNICODE_ZWSP -- it is only needed in 1 spot. Also make IsZeroWidthUnicodeChar const.. static as well? also can you fix the PR_LOG(..."crappy font...") to actually pass glyph in?
Attachment #259868 - Flags: review?(pavlov) → review+
Checked that in.
To link the bugs, it looks like this caused Bug #375506.
(In reply to comment #12) > I thought about the non-BMP case, but it's a bit more code to handle a very > rare situation, and it's really only a slight improvement in that situation > (anyone who cares can figure out the character from the surrogate pair), so I > don't think it's worth the effort. Personally I think it is worth the effort. If I saw the current behaviour in somebody else's app my reaction would be "those guys don't really understand Unicode".
I think it's worth doing as well.
You need to log in before you can comment on or make changes to this bug.