Created attachment 313957 [details] backtrace User agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b5) Gecko/2008032620 Firefox/3.0b5 This bug appears whenever I copy any valid (or invalid) font over a different, existing font of the same type (*.fon) within C:\windows\fonts. I used the command prompt with "copy font.fon fonttooverwrite.fon" and then opened a page with <font face="someotherfonttooverwrite">some text</font> (refresh if needed). It seems to be a write access violation. 014C9566 movzx ecx,word ptr [eax] <-- ecx=0x0000CDCD 014C9569 mov edx,dword ptr [aSFP] 014C956C movzx eax,word ptr [edx+6] 014C9570 cmp ecx,eax 014C9572 jne UniscribeItem::IsGlyphMissing+2Bh (14C957Bh) The fuzzer that I'm using for font files can't do any real testing because this bug gets hit on the first iteration every time. The way that the tester works is by writing directly to the windows/fonts directory and then opening/closing a window with a font tag referencing the font. I have a couple questions that I hope someone can help me with. Do fonts that Firefox use need to be registered somehow (besides just existing within the fonts directory), and, if so, can't I simply fuzz the contents of an already-registered font file within the directory itself without having to re-register the font after being fuzzed? Also, Firefox's behavior seems to be that it opens a font file when needed and closes it when the page referencing the font closes. Is the font cached within Firefox or are the file contents reloaded every time the font is referenced? One more thing, this and bug 425592 seem to be the same thing, but that one is not marked as security-sensitive and doesn't have any details so I filed a new bug (hope that was the right thing to do).
In the first paragraph, fonttooverwrite.fon and someotherfonttooverwrite are the same thing; I forgot to change both references to be consistent.
Fixing this may well fix the top remaining font-related crash (one that showed up high right after the release shipped and then went down, I think).
Component: Layout: Fonts and Text → GFX: Thebes
QA Contact: layout.fonts-and-text → thebes
We keep a lot of information about the fonts installed on your system and refresh our info any time we receive a WM_FONTCHANGE message from Windows. I'm not sure if changing files directly in your font directory triggers that or not. If not, it could be causing some weird problems.
Paul can you send the WM_FONTCHANGE message yourself targeting all windows http://msdn2.microsoft.com/en-us/library/ms533930(VS.85).aspx to work around this?
Pav: should this block? It doesn't sound like a common activity, but dbaron's comment 2 gives us pause ...
beltzner: if the only way to trigger this is manual changes in the fonts directory, we shouldn't block. I don't think it is though based on the crash reports. This could possibly be related to some of the nsTArray<nsAutoString> stuff, but I'm not really sure what would be causing it.
I can reproduce this without manual changing something in the fonts directory. It involves using a lot of different fonts and character encodings. The crash then happens when Firefox reaches the amount of 10000 GDI objects.
I've now filed that as bug 429148. I'm marking this one as blocking bug 425592 (the topcrash bug).
If this indeed is a #12 top crash blocker, we should really try to fix this. +'ing this. Assigning to Stuart for now, but maybe jdagget could take a look as well?
Assignee: nobody → pavlov
Flags: blocking1.9? → blocking1.9+
Paul, could you write up a more detailed set of steps for this? I played around with this but wasn't able to duplicate it as described. I used a FON editor to create a bitmap font 'bongo.fon', using symbol.fon. Then I used Explorer to copy it over into the Fonts folder. Then I used these steps: 1. Start Minefield 2. Open a command shell and copy roman.fon over bongo.fon 3. Open a page that specifies "Bongo" in a style (see attached) Result: no crash, but wacky rendering, line height appears to be something huge, and only a single line of the testcase text is displayed. Right now, this testcase is a bit crazy but if it helps us find a bug that actual users are running into another way, then it's all good. Paul, any more specifics you can add would be a huge help.
Try using a font that already exists and is known to be recognized to Firefox. For example, try these steps: In the prompt, type copy roman.fon roman2.fon copy symbol.fon roman.fon (y to overwrite) Then update your references and open your page. That's all that I needed to do. If you're seeing weird stuff, perhaps the bug already activated but didn't crash your browser; compare your user agent to the one listed at the top of the bug report.
Btw, reference roman in the past example
Stuart, is it OK if I steal this from you?
Assignee: pavlov → roc
What characters are in your test page, Paul? I don't have symbol.fon in my system. I have symbole.fon though. Does copying symbole.fon over roman.fon cause the bug for you? Can you reproduce this by copying over the font and then starting Firefox? or do you have to start Firefox first, then copy over the font, then load the page? what if you start Firefox, load the page, copy over the font, then reload the page? The more details you can give, the better.
KarlT says that deleting a font from your system and then restarting Firefox will actually make Firefox crash on startup. I don't see that. I do see strange things happening when I delete or replace a font, though; the document appears to be very tall, but scroll buttons don't work (line-height busted?). I'll try to address that weirdness. It's possible that opt builds crash and debug builds just do this weird stuff.
Whiteboard: [needs patch stuart
Okay, so the following happens when you remove a TTF font from the file system and then Gecko tries to use it: -- Windows still gives us the font when enumerating fonts. We have a reasonable looking LOGFONT structure. -- We can still create a font object and get a valid HFONT. -- ReadCMAP fails. We interpret this as meaning it's a non-Unicode font. -- HasCharacter takes the mUnknownCMAP path, which calls ScriptGetCMap, which as far as I have seen so far, always fails. (This means we don't use the font in any text runs, so I'm not sure how this could cause a crash in UniscribeItem::IsGlyphMissing, since no UniscribeItems should be using this font.) -- We still get a gfxWindowsFont for the font in our fontgroup, and we still try to use its metrics for computing line heights etc... -- gfxWindowsFont::ComputeMetrics calls GetOutlineTextMetrics, which fails -- so it calls GetTextMetrics, which fails too. But we ignore the failure and carry on using 'metrics', which is uninitialized. So we return basically random values to layout. That's bad. So I'm still not sure how we're hitting the crash in UniscribeItem::IsGlyphMissing, but we should definitely try to detect that the font is missing earlier and avoid or detect failure in GetTextMetrics. I think when ReadCMAP fails we should call GetTextMetrics, and if *that* fails we should conclude the font is completely unusable and drop it from our font list.
I don't actually use symbol.fon or symbole.fon, but I assumed John Daggett did so my directions were targeted to him. I used a font that is on my system called modern.fon. My testing indicated that every font should work, including true-type. I only used fonts that already existed within the fonts directory, so perhaps that is the difference.
(In reply to comment #4) > Paul can you send the WM_FONTCHANGE message yourself targeting all windows > http://msdn2.microsoft.com/en-us/library/ms533930(VS.85).aspx to work around > this? > Sorry not to get back to you sooner. I wrote some code to do this and Firefox still crashed, same as before.
Created attachment 317197 [details] [diff] [review] fix Okay! The idea here is that whenever we convert a FontEntry to a gfxWindowsFont, we check whether the gfxWindowFont got valid textmetrics --- i.e. one of GetOutlineTextMetrics or GetTextMetrics succeeded. At least on XP, when a font file is removed unexpectedly those both fail even if the font is still enumerable. For gfxWindowsFontGroup, we convert the first fontentry to a gfxWindowsFont when we construct the fontgroup because various callers to GetFontAt(0) don't want to deal with getting a bad gfxWindowsFont back. So the gfxWindowsFontGroup constructor will detect the bad font early and remove it so no-one ever sees it. In font fallback we just reject any font entry that fails to give us a good gfxWindowsFont. This approach means we need to convert to gfxWindowsFont earlier than we did before, so that we can continue iterating if the gfxWindowsFont we got was bad, and various fallback subroutines need to return gfxWindowsFonts instead of FontEntries. I don't think this will have any perf impact in the normal case because I've tried to make things not do any more work than we would have done before, although we may do it in different places. I think it's still possible for a font to go away even while we have a gfxWindowsFont with an open HFONT for it. Sometimes I've seen font files being locked, but it seems GDI can drop the locks even while we have the HFONT. Depends on GDI's internal caching, I guess. There's not much we can do about that, but this should still make us a lot more robust.
I'm not sure if this will solve Paul's problem and the other crashes but I am hopeful. It definitely fixes the weirdness I see when font files are removed and then Gecko tries to use those fonts.
(In reply to comment #22) > I'm not sure if this will solve Paul's problem and the other crashes but I am > hopeful. It definitely fixes the weirdness I see when font files are removed > and then Gecko tries to use those fonts. Just to confirm, you weren't able to reproduce a crash right? But you did experience the weird "giant line height" effect, which this patch fixes?
(In reply to comment #23) > Just to confirm, you weren't able to reproduce a crash right? Right. > But you did experience the weird "giant line height" effect, which this patch > fixes? Right. The giant line height was caused by us using uninitialized text metrics due to GetTextMetrics failing, and now if that happens we simply won't use the font --- we'll fall back to some other font.
A couple more comments about the patch: Code that calls GetFontAt(i) for i > 0 can get a bad font. But AFAICT everyone passes 0 there except for gfxWindowsFontGroup::gfxWindowsFontGroup when it gets underline offset for the first "bad underline" font, and that's harmless. Perhaps we should just replace GetFontAt(i) with GetFirstFont()... This won't actually do what Paul wants. We won't crash, but we won't use the newly-replaced roman.fon or whatever, that font will just be ignored. It might be possible to trigger Windows to rescan the font directory or otherwise update its internal font catalog when we detect font breakage, but I don't know how to do that. Broadcasting WM_FONTCHANGE won't work, that's broadcast by Windows *after* it has updated the catalog so it can't be using that as a trigger as well; when we receive a WM_FONTCHANGE we will rerun font enumeration and Windows will tell us again about fonts that aren't really available. I have noticed that removing a font file, say "cambria.ttc" directly and then removing a different font the "right" way by dragging it out of the Fonts window does *not* cause the Fonts window and GDI font enumeration to notice that Cambria is gone. I wish I knew whether this would address the topcrash issue, but it's hard to know without more data on those crashes. Karl did say that he saw crashes when uninstalling fonts as one user while running Firefox as another user, and this patch should fix that. I'd be surprised if that sort of thing was a common scenario, but I guess you never know what apps might be doing out in the wild.
Whiteboard: [needs review]
See bug 429148 of a case where I crashed with a similar testcase, but perhaps this patch fixes also that crash.
Whiteboard: [needs review] → [needs review pavlov]
Comment on attachment 317197 [details] [diff] [review] fix in gfxWindowsFontGroup::gfxWindowsFontGroup() we should really fix up the arrays by shifting things down. Getting a gfxWindowsFont in WhichFontSupportsChar is going to hurt perf. :( I'm also concerned about perf for the same reason in FindFontForCharProc
(In reply to comment #27) > (From update of attachment 317197 [details] [diff] [review]) > in gfxWindowsFontGroup::gfxWindowsFontGroup() we should really fix up the > arrays by shifting things down. That's what RemoveElementAt does. > Getting a gfxWindowsFont in WhichFontSupportsChar is going to hurt perf. :( Why? We only do it for a font which supports the character. In that case the old code would have eventually created the gfxWindowsFont anyway, right? gfxWindowsPlatform::FindFontForCharProc is actually a slight perf regression, now that I look at it closer, because we'll create a gfxWindowsFont every time the font looks like the best match *so far*. I would hope it's only a small regression since the number of gfxWindowsFonts we create should be small compared to the number of fonts we're scanning, and these fonts are cached so that after we've scanned the system fonts for the first character, subsequent characters with the same font style won't create any new gfxWindowsFont objects. However, we could avoid this by leaving FindFontFromCharProc alone, and then after we've run it, we create a gfxWindowsFont for the final best match. If that's broken, we record that that font is no good and rerun FindFontFromCharProc asking it to ignore that font this time. That would avoid slowing down FindFontFromCharProc when there are no bad fonts ... although if you had a lot of bad fonts, it could make FindFontFromCharProc a lot slower. Want me to do this?
if the best fontentry match from FindFontFromCharProc is invalid we could probably just draw boxes.
So gfxWindowsPlatform::FindFontForChar still returns already_AddRefed<gfxWindowsFont>, but returns nsnull if the best-match font is bad?
Created attachment 317839 [details] [diff] [review] fix v2 This what you had in mind?
mostly, yeah. we probably don't want to set mCodepointsWithNoFonts.set(aCh); when bestMatch is invalid (maybe I'm being hopeful that it will become un-invalid eventually? We might also want to fire a timer when we encounter an invalid font and clear the font cache and our platform caches. (we should probably only do it once for a given invalid font). Clearing the font cache might help us if we're just out of GDI objects. thought? aside from that, things look OK.
(In reply to comment #32) > mostly, yeah. we probably don't want to set mCodepointsWithNoFonts.set(aCh); > when bestMatch is invalid (maybe I'm being hopeful that it will become > un-invalid eventually? OK. > We might also want to fire a timer when we encounter an invalid font and clear > the font cache and our platform caches. (we should probably only do it once > for a given invalid font). Clearing the font cache might help us if we're just > out of GDI objects. thought? That sounds like more work than I want to do at this stage of the release. Also I think a better approach would be to periodically check our GDI object usage and flush if it's too high for whatever reason. > aside from that, things look OK. I'll post a new patch addressing the first point.
Created attachment 317857 [details] [diff] [review] fix v3
(In reply to comment #33) > That sounds like more work than I want to do at this stage of the release. Also > I think a better approach would be to periodically check our GDI object usage > and flush if it's too high for whatever reason. This is actually harder than it sounds -- there's some code in nsThebesImage.cpp where I check the number of GDI objects in use to decide how to optimize images, but the problem is that there's both a per-process limit (10,000) as well as a system global limit (which is fairly low). I don't know of a way to query the second number, so given a lot of GDI-using apps, we could still get back errors even when we're not close to 10k. (Solution: Stop using GDI :p)
Comment on attachment 317857 [details] [diff] [review] fix v3 safe fix for an issue that is apparently on our topcrash radar.
Attachment #317857 - Flags: approval1.9?
Whiteboard: [needs review pavlov] → [needs approval]
Weird, evil looking font install hack for Windows: http://www.microsoft.com/technet/scriptcenter/resources/qanda/apr08/hey0425.mspx Ick, patooey.
Comment on attachment 317857 [details] [diff] [review] fix v3 a1.9+=damons
Attachment #317857 - Flags: approval1.9? → approval1.9+
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
Whiteboard: [needs approval]
Caused crashes on tinderbox, backed out.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1209426647.1209428972.25823.gz Looks like we crashed in /tests/extensions/universalchardet/tests/test_bug171813.html or thereabouts. Not sure about 'browser' tests.
Those tests pass for me. Rerunning all mochitests.
Mochitests and browser tests all pass. I need help!
Tests pass on Chris' Vista machine and our Mac Mini XP machine. Stuart, would you mind running mochitests on your machine with this patch? And/or, do you know what fonts the unit test boxes have installed?
Keywords: crash, topcrash → qawanted
Robcee: do you have a list of what fonts the tinderboxen have installed by default?
(In reply to comment #48) > Robcee: do you have a list of what fonts the tinderboxen have installed by > default? The unittest buildbot slaves just have whatever's installed on the ref-platform installs by default. Nothing added to them.
Created attachment 318538 [details] [diff] [review] fix v4 Gavin tracked down the problem. In gfxWindowsFontGroup::gfxWindowsFontGroup, when mFontEntries.Length() is zero, we were appending to mFontEntries but leaving mFonts empty. Later we'd try to set mFonts, bad idea. We need to append an empty element to mFonts to keep its length in sync with the length of mFontEntries. Apparently that fixes the crash in tests. That made me think about the fact that along that path, mFonts could end up being an invalid font (although since it's supposed to be a stock font, that would be very bad news). We should be OK even if we end up with a bad font there --- all metrics will be zero, mUnknownCMAP will be true and no characters should match --- but just to be a little more careful I've added an IsValid test to HasCharacter to ensure that an invalid font doesn't match any characters. I'll reland with these changes.
Attachment #317857 - Attachment is obsolete: true
Whiteboard: [needs landing]
Status: REOPENED → RESOLVED
Last Resolved: 10 years ago → 10 years ago
Resolution: --- → FIXED
Whiteboard: [needs landing]
this has caused 432062
No longer depends on: 432062
Crash Signature: [@UniscribeItem::IsGlyphMissing]
You need to log in before you can comment on or make changes to this bug.