Closed Bug 739858 Opened 12 years ago Closed 12 years ago

Crash [@ nsAString_internal::Assign ] with downloaded font

Categories

(Core :: Graphics, defect)

13 Branch
x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla15
Tracking Status
firefox13 + fixed
firefox14 + fixed
firefox15 --- fixed

People

(Reporter: bc, Assigned: jfkthame)

References

()

Details

(Keywords: crash, regression, Whiteboard: [qa-])

Crash Data

Attachments

(2 files)

1. http://blogs.dunyanews.tv/?p=3406
2. Crash Nightly/14 Linux debug only. Didn't crash a nightly Nightly.

Operating system: Linux
                  0.0.0 Linux 3.2.10-3.fc16.i686.PAE #1 SMP Thu Mar 15 20:37:01 UTC 2012 i686
CPU: x86
     GenuineIntel family 6 model 37 stepping 1
     2 CPUs

Crash reason:  SIGSEGV
Crash address: 0x10

Thread 0 (crashed)
 0  libxul.so!nsAString_internal::Assign [nsTSubstring.cpp : 365 + 0x3]
    eip = 0xb5feadf2   esp = 0xbff7ff20   ebp = 0xbff7ff58   ebx = 0xb70238a8
    esi = 0x0b395840   edi = 0x00001478   eax = 0x0000000c   ecx = 0xb70238a8
    edx = 0x0000000c   efl = 0x00210287
    Found by: given as instruction pointer in context
 1  libxul.so!nsString::nsString [nsTString.h : 92 + 0xa]
    eip = 0xb49a7aea   esp = 0xbff7ff60   ebp = 0xbff7ff78   ebx = 0xb70238a8
    esi = 0x0b395840   edi = 0x00001478
    Found by: call frame info
 2  libxul.so!gfxFontEntry::gfxFontEntry [gfxFont.h : 221 + 0x3b]
    eip = 0xb608e6b2   esp = 0xbff7ff80   ebp = 0xbff7ffa8   ebx = 0xb70238a8
    esi = 0x0b395840   edi = 0x00001478
    Found by: call frame info
 3  libxul.so!gfxFcFontEntry::gfxFcFontEntry [gfxPangoFonts.cpp : 234 + 0x21]
    eip = 0xb6099bf4   esp = 0xbff7ffb0   ebp = 0xbff7ffc8   ebx = 0xb70238a8
    esi = 0x0b395840   edi = 0x00001478
    Found by: call frame info
 4  libxul.so!gfxUserFcFontEntry::gfxUserFcFontEntry [gfxPangoFonts.cpp : 442 + 0x1e]
    eip = 0xb609a3f7   esp = 0xbff7ffd0   ebp = 0xbff7ffe8   ebx = 0xb70238a8
    esi = 0x0b395840   edi = 0x00001478
    Found by: call frame info
 5  libxul.so!gfxDownloadedFcFontEntry::gfxDownloadedFcFontEntry [gfxPangoFonts.cpp : 560 + 0x11]
    eip = 0xb609a914   esp = 0xbff7fff0   ebp = 0xbff80018   ebx = 0xb70238a8
    esi = 0x0b395840   edi = 0x00001478
    Found by: call frame info
 6  libxul.so!gfxPangoFontGroup::NewFontEntry [gfxPangoFonts.cpp : 2372 + 0x20]
    eip = 0xb609ec05   esp = 0xbff80020   ebp = 0xbff80058   ebx = 0xb70238a8
    esi = 0x0b395840   edi = 0x00001478
    Found by: call frame info
 7  libxul.so!gfxPlatformGtk::MakePlatformFont [gfxPlatformGtk.cpp : 275 + 0x18]
    eip = 0xb60a6c85   esp = 0xbff80060   ebp = 0xbff80078   ebx = 0xb70238a8
    esi = 0xb4963fbe   edi = 0x00001478
    Found by: call frame info
 8  libxul.so!gfxUserFontSet::LoadFont [gfxUserFontSet.cpp : 674 + 0x23]
    eip = 0xb6090426   esp = 0xbff80080   ebp = 0xbff80268   ebx = 0xb70238a8
    esi = 0xb4963fbe   edi = 0x00001478
    Found by: call frame info
Crash Signature: [@ nsAString_internal::Assign]
Depends on: 557234
OS: Mac OS X → Linux
The only way I can see that we'd hit a crash here would be if the proxy font entry's family pointer has been reset to null, in which case we lose because the gfxUserFcFontEntry wants to use the family's name string.

If the proxy doesn't belong to a family, I think this means it's going to be unusable anyay (no way to reach it!), so we may as well check this and bail out of the font-loading process.
Attachment #610157 - Flags: review?(jdaggett)
Attachment #610157 - Attachment is patch: true
(In reply to Jonathan Kew (:jfkthame) from comment #1)

> The only way I can see that we'd hit a crash here would be if the proxy font
> entry's family pointer has been reset to null, in which case we lose because
> the gfxUserFcFontEntry wants to use the family's name string.
> 
> If the proxy doesn't belong to a family, I think this means it's going to be
> unusable anyay (no way to reach it!), so we may as well check this and bail
> out of the font-loading process.

I don't like the idea of this patch, the logic in DetachFontEntries clears out family pointers and it's not terribly clear how the logic is guaranteed to always restore that.  So couldn't there be a scenario where @font-face rules are added while fonts are still loading that would create this sort of crash path?  I think this needs more testing to see if we can come up with a reproducible testcase before sticking in a workaround patch like this.  My concern is that we're just getting lucky here on other platforms that don't use mFamily at the "right" time to cause a crash.
No longer depends on: 557234
(In reply to John Daggett (:jtd) from comment #2)
> (In reply to Jonathan Kew (:jfkthame) from comment #1)
> 
> > The only way I can see that we'd hit a crash here would be if the proxy font
> > entry's family pointer has been reset to null, in which case we lose because
> > the gfxUserFcFontEntry wants to use the family's name string.
> > 
> > If the proxy doesn't belong to a family, I think this means it's going to be
> > unusable anyay (no way to reach it!), so we may as well check this and bail
> > out of the font-loading process.
> 
> I don't like the idea of this patch, the logic in DetachFontEntries clears
> out family pointers and it's not terribly clear how the logic is guaranteed
> to always restore that.

DetachFontEntries clears the family pointers for all the existing entries (but we keep hold of those entries in the old rule set while creating the new one). As the new rule set is created, we check each rule to see if it existed in the old set; if so, we just move the old rule to the new set, and put its entry into the new family we're building:

http://mxr.mozilla.org/mozilla-central/source/layout/style/nsFontFaceLoader.cpp#505

This will set up the entry's family pointer (via gfxUserFontSet::AddFontFace calling gfxFontFamily::AddFontEntry calling gfxFontEntry::SetFamily) for each such entry that belonged to a pre-existing rule that's being preserved in the new font set.

So the only font entries that will be left with a null family pointer are those belonging to rules that existed in the old set but are not present in the new one. That's easy to achieve, of course, by dynamically modifying the CSS (enabling/disabling rules, etc.)

Such entries might have already begun to download a font, in which case the loader is going to send OnLoadComplete to the font set, but as the entry is no longer attached to a family it will never be used.

>  So couldn't there be a scenario where @font-face
> rules are added while fonts are still loading that would create this sort of
> crash path?

Not *added*, AFAICS, but *modified or removed*, as described above.

>  I think this needs more testing to see if we can come up with a
> reproducible testcase before sticking in a workaround patch like this. 

I don't have a reproducible testcase (it'd be timing-sensitive, for one thing), but what I think we can do to more proactively prevent the problem is to have nsUserFontSet::UpdateRules iterate over any old rules that don't get migrated to the new set, and call CancelLoader on them. This is a good thing anyway because it means we don't keep on downloading a font that we're never going to use.

Nevertheless, I think we should also take the original patch here as well, as a precautionary measure if nothing else. Also, it's safe enough that I think we could nominate it for Aurora, whereas a reworking of UpdateRules() to cancel in-progress-but-obsolete loaders might be considered riskier.

> My
> concern is that we're just getting lucky here on other platforms that don't
> use mFamily at the "right" time to cause a crash.

The font activation code on the other platforms doesn't need to refer to mFamily, which is why they don't crash. They'll successfully activate the downloaded font (assuming it's valid), but nsUserFontSet::ReplaceFontEntry checks the family pointer for null before calling family->ReplaceFontEntry, so the proxy/real replacement never happens (and the downloaded font is useless).
This should prevent us reaching the state where this crash can happen; it also makes us more efficient by abandoning downloads for resources that we're no longer going to use anyway.

(I'd like to take the original crash-prevention patch in attachment 610157 [details] [diff] [review] as well, to reduce the risk of running into something like this again if we modify loading behavior in the future.)
Attachment #614837 - Flags: review?(dbaron)
How do these two patches relate to each other?  Are they alternatives?  (And is there a reason you're asking me for review rather than jdaggett, who I think would probably be a better reviewer?)
(In reply to David Baron [:dbaron] from comment #5)
> How do these two patches relate to each other?  Are they alternatives? 

I believe either one of them should resolve the crash here; the first is a small bandaid to prevents the crash, while the second addresses the presumed crashing scenario at a deeper level (and should be beneficial across all platforms, even though only Linux actually crashes). IMO, we should take both; the first is a sensible precaution even if the second makes it not strictly needed at this point.

> (And
> is there a reason you're asking me for review rather than jdaggett, who I
> think would probably be a better reviewer?)

My recollection was that you'd previously reviewed some changes to user font set updating (in bug 633299), and this seemed to follow on naturally from that. But if you'd rather pass it to John, that's ok.
Comment on attachment 614837 [details] [diff] [review]
patch, cancel any in-progress loaders for rules that have been dropped from the font-set

Redirecting r? to jdaggett, though whichever of you reviews it would be fine with me. :)
Attachment #614837 - Flags: review?(dbaron) → review?(jdaggett)
Attachment #610157 - Flags: review?(jdaggett) → review+
Attachment #614837 - Flags: review?(jdaggett) → review+
Backed out for compile errors:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=476998cbd69f
eg https://tbpl.mozilla.org/php/getParsedLog.php?id=11236148&tree=Mozilla-Inbound

{
../../../gfx/thebes/gfxUserFontSet.cpp: In member function 'gfxFontEntry* gfxUserFontSet::LoadFont(gfxProxyFontEntry*, const PRUint8*, PRUint32&)':
../../../gfx/thebes/gfxUserFontSet.cpp:634:28: error: invalid conversion from 'const void*' to 'void*'
../../../gfx/thebes/gfxUserFontSet.cpp:634:28: error:   initializing argument 1 of 'void NS_Free_P(void*)'
make[6]: *** [gfxUserFontSet.o] Error 1
}

https://hg.mozilla.org/integration/mozilla-inbound/rev/ccb7ddfe749b
Target Milestone: mozilla15 → ---
Summary: Debug Crash [@ nsAString_internal::Assign ] with downloaded font → Crash [@ nsAString_internal::Assign ] with downloaded font
This is occurring in the wild, not debug-only (as originally filed); see bug 741504 and bug 757108.
Comment on attachment 610157 [details] [diff] [review]
patch, bail out of font loading if the proxy has been detached

[Approval Request Comment]
Bug caused by (feature/regressing bug #): not sure this is a recent regression, may be longstanding but low-volume crash; possibly bug 633299?

User impact if declined: occasional crashes on pages with downloadable fonts

Testing completed (on m-c, etc.): bug 741504#c9 confirms the crash can no longer be reproduced on trunk

Risk to taking this patch (and alternatives if risky): minimal risk, just detects problem scenario and bails out of font loading

String or UUID changes made by this patch: none
Attachment #610157 - Flags: approval-mozilla-aurora?
Only nominating the first patch for Aurora uplift at this point, as this is sufficient to prevent the crash and is much less invasive (risky?) than the more extensive code improvements in the second patch.
Doesn't this bug affect all (most?) sites with downloadable fonts? I'd love to see it in Firefox 13 before release, it's a crash happening consistently on certain pages after all.
I suspect it only occurs under certain circumstances when the set of @font-face rules is being updated, not on all sites with downloadable fonts. However, AFAICT from crash-stats it does seem to be the #3 topcrash on Linux in 13.0 and 14.0 builds (around 5-6% of Linux crashes), so it would be good to fix. Nominating for beta as well, given how safe the patch looks.
Comment on attachment 610157 [details] [diff] [review]
patch, bail out of font loading if the proxy has been detached

[Approval Request Comment]
Bug caused by (feature/regressing bug #): see above - related to @font-face usage, but precise regression point not identified

User impact if declined: occasional crashes on pages with downloadable fonts

Testing completed (on m-c, etc.): bug 741504#c9 confirms the crash can no longer be reproduced on trunk

Risk to taking this patch (and alternatives if risky): minimal risk, just detects problem scenario and bails out of font loading

String or UUID changes made by this patch: none

This is currently the #3 topcrash in FF13 and 14 on Linux, if I'm reading crash-stats correctly.
Attachment #610157 - Flags: approval-mozilla-beta?
After digging through crashstats and pushlog some more, I believe this appeared as a regression in mozilla-13 due to bug 721315 on 2012-02-03.

There was a potential bug lurking in the code prior to that, but bug 721315 made it much more likely to show up as an actual crash; previously, it would have been a potential use-after-free which might only very rarely lead to a crash (probably without a consistent signature), depending on heap churn.
Version: Trunk → 13 Branch
Comment on attachment 610157 [details] [diff] [review]
patch, bail out of font loading if the proxy has been detached

[Triage Comment]
Low risk and verified fix for a top Linux crasher. Approved for Aurora 14 and Beta 13. Please land asap.
Attachment #610157 - Flags: approval-mozilla-beta?
Attachment #610157 - Flags: approval-mozilla-beta+
Attachment #610157 - Flags: approval-mozilla-aurora?
Attachment #610157 - Flags: approval-mozilla-aurora+
Landed on Aurora:
https://hg.mozilla.org/releases/mozilla-aurora/rev/0433b21faf60
And followup bustage fix:
https://hg.mozilla.org/releases/mozilla-aurora/rev/9620de534159

(The bustage fix from the original landing had ended up in the second patch, so it got missed when transplanting just the first one from m-c to aurora - sorry about the messy result.)

Beta landing to follow, after verifying local build.
Whiteboard: [qa+]
Tried to verify on:
Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120529 Firefox/13.0 (20120529072024)
with the steps from this bug and from bug 757108.

The crash didn't reproduce but it also didn't reproduce on older builds without this fix. If anybody finds reliable STR for this crash, please add them in a comment.
Whiteboard: [qa+] → [qa-]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: