Closed Bug 847272 Opened 7 years ago Closed 7 years ago

"ASSERTION: caching a font associated with no family yet"

Categories

(Core :: Graphics: Text, defect)

x86_64
macOS
defect
Not set

Tracking

()

VERIFIED FIXED
mozilla22
Tracking Status
firefox20 --- unaffected
firefox21 --- verified

People

(Reporter: jruderman, Assigned: jtd)

References

(Blocks 2 open bugs)

Details

(4 keywords)

Attachments

(4 files, 1 obsolete file)

1. Create a new profile (mkdir -p ~/px/a; firefox -profile ~/px/a)
2. Install https://www.squarefree.com/extensions/domFuzzLite3.xpi
3. Set 
     user_pref("security.fileuri.strict_origin_policy", false);
4. Fix a path in the testcase.
5. Load the testcase.
6. Wait a bit.

Result: after 10 seconds (on average),

###!!! ASSERTION: caching a font associated with no family yet: 'aFontEntry->mFamilyName.Length() != 0', file gfx/thebes/gfxUserFontSet.cpp, line 773

This assertion was added in bug 833169.
Attached file stack
Assignee: nobody → jdaggett
So I traced through this today, there appears to be a circular logic bug within the lazy user font set accessor in nsPresContext::GetUserFontSet and nsFontFaceLoader::OnStreamComplete.  It looks like what is happening is that while a given font is being downloaded, the enable/disable flag on the stylesheet is changed which invalidates the user font set (i.e. mUserFontSetDirty gets set).  When the download completes, within OnStreamComplete the GetUserFontSet method is called and this flushes the font set and cleans out all the families via UpdateRules.  So when OnLoadComplete is called, the family is empty and when calling ReplaceFontEntry no family name is set which later on causes the assertion to fire.

Next step is to figure out how to untangle this mess... ;)
Two options, I think. One is to just bail out of caching the font entry if its family name has been cleared, as this means the entry has been "detached" from its family and is no longer reachable - the only thing keeping it alive is the in-progress fontloader, which is about to go away. So in this case, we'd just discard the font, as the stylesheet has changed and we no longer need it.

A (better?) alternative would be to -not- clear the family name from the font entry if there's no new name to be set, so that we'd end up caching the font entry under its "original" family name even though it no longer belongs to it. This way, if the stylesheet is dynamically toggled on/off, we'd have the font already in cache rather than having to re-download it (assuming the family name being used doesn't change, of course) when the rule is re-activated.
Since nsPresContext::GetUserFontSet can potentially alter the contents of the user font set, avoid calling that method within font loader code.  There's already mFontSet so simply use that.  Even if the font set is going to be reconstructed, the downloaded font will end up in the cache with the right name associated with it, such that it'll be pulled out of the cache after the font set is rebuilt.  Also added some more asserts to assure that we're not manipulating font entries in zombie font sets.
Attachment #724241 - Flags: review?(jfkthame)
Comment on attachment 724241 [details] [diff] [review]
patch, avoid font set updates within loader code

Review of attachment 724241 [details] [diff] [review]:
-----------------------------------------------------------------

In principle this seems reasonable, modulo a couple of issues noted below:

::: gfx/thebes/gfxUserFontSet.h
@@ +99,4 @@
>                  break;
>              }
>          }
> +        NS_ASSERTION(found, "font entry not found in family!");

Rather than creating and updating a separate 'found' variable, just for the sake of an assertion in debug builds, how about declaring the loop variable outside the loop (so it is still in scope here), and then simply asserting that i < numFonts at the end. (That makes the initial assertion redundant, too.)

::: layout/style/nsFontFaceLoader.cpp
@@ +222,5 @@
>      LOG(("fontdownloader (%p) reflow\n", this));
>    }
>  
> +  // done with font set
> +  mFontSet = nullptr;

Hmm, if we're setting this to null here, maybe we'd also better explicitly cancel the loader's mLoadTimer. I'm not entirely sure if there's currently any risk it could fire after OnStreamComplete has been called, but before the loader is actually deleted, but it sounds like a potential footgun - the timer callback would attempt to access mFontSet. Cancelling the timer in OnStreamComplete just makes sense anyway, as it no longer serves any purpose.
Revised based on review comments.
Attachment #724395 - Flags: review?(jfkthame)
Attachment #724241 - Attachment is obsolete: true
Attachment #724241 - Flags: review?(jfkthame)
Attachment #724395 - Flags: review?(jfkthame) → review+
https://hg.mozilla.org/mozilla-central/rev/06f85e10a40c
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Comment on attachment 724395 [details] [diff] [review]
patch v2, avoid font set updates within loader code

[Approval Request Comment]
Bug caused by (feature/regressing bug #): prior changes to font loading code

User impact if declined: 
This fix seems to have decreased the rate at which topcrasher bug 838105 occurs.

Testing completed (on m-c, etc.): already in aurora

Risk to taking this patch (and alternatives if risky):
The patch merely adds better error checking to the font loading code, it doesn't represent a large change in normal code flow.  Since it seems to be correlated with a significant drop in the rate at which a the topcrasher is occurring, I strongly recommend that we uplift this to beta.

String or IDL/UUID changes made by this patch: none
Attachment #724395 - Flags: approval-mozilla-beta?
(In reply to John Daggett (:jtd) from comment #10)
> Comment on attachment 724395 [details] [diff] [review]
> patch v2, avoid font set updates within loader code
> 
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): prior changes to font loading code
> 
> User impact if declined: 
> This fix seems to have decreased the rate at which topcrasher bug 838105
> occurs.
> 
> Testing completed (on m-c, etc.): already in aurora
> 
> Risk to taking this patch (and alternatives if risky):
> The patch merely adds better error checking to the font loading code, it
> doesn't represent a large change in normal code flow.  Since it seems to be
> correlated with a significant drop in the rate at which a the topcrasher is
> occurring, I strongly recommend that we uplift this to beta.
John, is there any way for QA to verify this patch or bug 838105 once the patch lands on beta ?
> 
> String or IDL/UUID changes made by this patch: none
(In reply to bhavana bajaj [:bajaj] from comment #11)
> (In reply to John Daggett (:jtd) from comment #10)
> > Comment on attachment 724395 [details] [diff] [review]
> > patch v2, avoid font set updates within loader code
> > 
> > [Approval Request Comment]
> > Bug caused by (feature/regressing bug #): prior changes to font loading code
> > 
> > User impact if declined: 
> > This fix seems to have decreased the rate at which topcrasher bug 838105
> > occurs.
> > 
> > Testing completed (on m-c, etc.): already in aurora
> > 
> > Risk to taking this patch (and alternatives if risky):
> > The patch merely adds better error checking to the font loading code, it
> > doesn't represent a large change in normal code flow.  Since it seems to be
> > correlated with a significant drop in the rate at which a the topcrasher is
> > occurring, I strongly recommend that we uplift this to beta.
> John, is there any way for QA to verify this patch or bug 838105 once the
> patch lands on beta ?
> > 
other than the test-case here ? I see some steps in https://bugzilla.mozilla.org/show_bug.cgi?id=838105#c6 , could that be a good verification check ? Any additional testing that you would recommend ?
> > String or IDL/UUID changes made by this patch: none
(In reply to bhavana bajaj [:bajaj] from comment #11)

> John, is there any way for QA to verify this patch or bug 838105 once the
> patch lands on beta ?

Yes, I was able to reproduce the bug 838105 crash running with beta:

http://people.mozilla.org/~jdaggett/tests/bug838105-loadtests.html

It took roughly 65 minutes before the crash occurred, but this is consistent with the crash reports, many of them show very long uptimes before the crash occurs.

Crash report:
https://crash-stats.mozilla.com/report/index/bp-edfef3b0-caab-4828-b824-57a682130429

Landing the patch should make it so the crash takes much longer to occur (I've struggled to reproduce it on trunk).
Comment on attachment 724395 [details] [diff] [review]
patch v2, avoid font set updates within loader code

Approving this to mitigate the top-crasher bug 838105 . This has baked very well on m-c/aurora and we have it with very low volume on those channels.

I am also going to request some QA testing here to see if they can reproduce John's case and verify the patch.

Also we should be keeping a close eye for this signature once we ship a beta with this patch to make sure it helped !

Please land by tomorrow(Tuesday) morning PT on beta channel for this to get into our next beta.
Attachment #724395 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Keywords: verifyme
No crash occurred in more than three hours testing using FF 21 beta 6 (Build ID: 20130430204233).
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0

Marking as verified.
If someone else is able to reproduce it, please reopen this issue.
Status: RESOLVED → VERIFIED
Blocks: 838105
You need to log in before you can comment on or make changes to this bug.