Last Comment Bug 763702 - crash in nsFontInflationData::FindFontInflationDataFor at crash address 0x28 (((nsIFrame*)0)->GetStateBits())
: crash in nsFontInflationData::FindFontInflationDataFor at crash address 0x28 ...
Status: VERIFIED FIXED
[native-crash]
: crash, testcase, topcrash
Product: Core
Classification: Components
Component: Layout (show other bugs)
: 14 Branch
: ARM Android
: P1 critical (vote)
: mozilla16
Assigned To: Scott Johnson (:jwir3)
:
Mentors:
: 767298 767804 769740 (view as bug list)
Depends on: 749186
Blocks: 766599 747720 756999
  Show dependency treegraph
 
Reported: 2012-06-11 14:24 PDT by Jet Villegas (:jet)
Modified: 2012-06-29 23:25 PDT (History)
20 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
fixed
+
verified
verified
.N+
15+


Attachments
testcase (421 bytes, application/xhtml+xml)
2012-06-16 10:00 PDT, Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( )
no flags Details
b763702 - WIP (8.68 KB, patch)
2012-06-18 12:17 PDT, Scott Johnson (:jwir3)
no flags Details | Diff | Review
b763702 - WIP (v2) (24.16 KB, patch)
2012-06-18 19:22 PDT, Scott Johnson (:jwir3)
no flags Details | Diff | Review
b763702 (1.06 KB, patch)
2012-06-19 13:20 PDT, Scott Johnson (:jwir3)
dbaron: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
Details | Diff | Review
b763702-crashtest: Crashtest for this bug (1.53 KB, patch)
2012-06-19 13:46 PDT, Scott Johnson (:jwir3)
dbaron: review+
Details | Diff | Review

Description Jet Villegas (:jet) 2012-06-11 14:24:42 PDT
+++ This bug was initially created as a clone of Bug #749186 +++

This will record new instances of crashes with this signature. Bug 749186 will be closed/fixed as the STR in that bug no longer reproduce. However, there may still be instances of similar crashes in the field. We'll need new STR's with any new crash reports.
Comment 1 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2012-06-11 14:26:24 PDT
This doesn't have a testcase, does it?
Comment 2 Scoobidiver (away) 2012-06-12 01:01:05 PDT
It's #5 top crasher in 15.0a2. Note that the patch of bug 749186 landed in 15.0a2/20120606.
Comment 3 Scoobidiver (away) 2012-06-15 00:36:53 PDT
It accounts for 19.6% of crashes in 14.0b7 after the patch of bug 749186 landed.
Before the patch, it was accounting for 18.5% of crashes in 14.0b6.
Comment 4 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2012-06-16 10:00:18 PDT
Created attachment 633823 [details]
testcase

Ok, I'm seeing this crash every time with this testcase
Comment 5 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2012-06-16 10:12:46 PDT
Reasking for blocking Fennec1.0, since this is the biggest topcrash by far and there is a testcase now.
Comment 6 Scott Johnson (:jwir3) 2012-06-16 10:24:53 PDT
(In reply to Martijn Wargers [:mw22] (QA - IRC nick: mw22) from comment #4)
> Created attachment 633823 [details]
> testcase
> 
> Ok, I'm seeing this crash every time with this testcase

Hmm... that's really interesting, because it's a mobile-optimized testcase. Font inflation should be disabled for this testcase, which in turn should not cause this code to be executed.

I'm looking into it.
Comment 7 Scott Johnson (:jwir3) 2012-06-18 12:17:33 PDT
Created attachment 634138 [details] [diff] [review]
b763702 - WIP

This is a patch from bug 749186 that we were originally going to land, but it throws assertions in two of our reftests for font inflation. I'm trying right now to debug these assertions, but the good news is that it appears to fix the crash in this bug.
Comment 8 Scott Johnson (:jwir3) 2012-06-18 13:15:17 PDT
(In reply to Scott Johnson (:jwir3) from comment #7)
> Created attachment 634138 [details] [diff] [review]
> b763702 - WIP
> 
> This is a patch from bug 749186 that we were originally going to land, but
> it throws assertions in two of our reftests for font inflation. I'm trying
> right now to debug these assertions, but the good news is that it appears to
> fix the crash in this bug.

So, it appears that the assertions are not actually triggered by video-1.html and intrinsic-min-1.html, the tests that are reported to have assertions. Instead, if I comment out the test above those, container-with-clamping.html, the assertions go away, and the tests all pass. 

This test, container-with-clamping.html, has an iframe element that loads another html page. I'm wondering if perhaps there's an extra presShell that is getting thrown away that shouldn't be (or perhaps is not getting thrown away when it should be...).
Comment 9 Scott Johnson (:jwir3) 2012-06-18 19:22:46 PDT
Created attachment 634272 [details] [diff] [review]
b763702 - WIP (v2)

I think I've found what's causing the issue, with dbaron's help. It seems that in the case of <iframe> elements, the parent document is getting reframed first, which seems to cause problems. If I only reframe the child, as in the example I've posted, then everything seems to work fine.

The problem I'm faced with now, though, is that I want to do this only in certain cases. That is, the crash might still happen in cases where there isn't a second document in the document tree. I'm looking at finding a method of doing this efficiently right now.

In the case where we have nested documents like the example container-with-clamping.html, do we need to reframe each document individually? (I guess what I'm asking is whether we should perform the reframe from the leaves of the document tree upward, or whether we should ONLY reframe the leaves). I want to avoid reframing sub documents multiple times, but I'm not sure what the best approach to doing this might be.

Also, it's probably worth mentioning that while I don't think this patch is (as it stands right now) incredibly risky, I do think that it's not ready to be landed until we put some more testing into it. This crash has been quite tricky to track down, and we really should have some additional testing before we release this patch into the wild, just to verify that it actually fixes the crash...
Comment 10 Scott Johnson (:jwir3) 2012-06-18 19:25:23 PDT
>		if (!doc->GetParentDocument()) {
>		// We don't want to reconstruct frames for the child document
>		// twice.
>		return;
>		}

Note that this code is actually not doing anything if we don't have a parent document. This is the first approach to this, just to verify that I could get it to work in the multi-document case, but it's not going to fix the crash for the single document case.
Comment 11 Scott Johnson (:jwir3) 2012-06-19 10:52:19 PDT
(In reply to Scott Johnson (:jwir3) from comment #10)
> Note that this code is actually not doing anything if we don't have a parent
> document. This is the first approach to this, just to verify that I could
> get it to work in the multi-document case, but it's not going to fix the
> crash for the single document case.

Actually, bz said that my assumption might not be correct. Apparently, all non-chrome documents should have a non-null parent. So, this might be filtering out chrome documents, which could be the cause of our problems.

Another thing we might be able to do to fix this bug is to simply re-traverse the frame tree when these preferences have changed, and reset state bits for font inflation. This would allow us to keep the existing frames, but set the appropriate state bits. I'm not sure whether there is something else we need to do that's being performed during frame construction, though, that needs to be done to prevent other problems.
Comment 12 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-06-19 11:38:01 PDT
OK, so the reason the previous patch wasn't sufficient here is that the result of nsContentUtils::GetViewportInfo can change, at the very least when the doctype changes as a result of displaying the parser error.  This means that keeping the prefs constant isn't sufficient to prevent nsLayoutUtils::FontSizeInflationEnabled from changing.
Comment 13 Scott Johnson (:jwir3) 2012-06-19 13:20:27 PDT
Created attachment 634561 [details] [diff] [review]
b763702

This seems to fix the crash exhibited by the test case. I'm still working on making the testcase a crashtest.
Comment 14 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-06-19 13:22:59 PDT
Comment on attachment 634561 [details] [diff] [review]
b763702

r=dbaron, though at some point we should try to get the rest of the patch in (and revert the other fix)
Comment 15 Scott Johnson (:jwir3) 2012-06-19 13:30:53 PDT
(In reply to David Baron [:dbaron] from comment #14)
> Comment on attachment 634561 [details] [diff] [review]
> b763702
> 
> r=dbaron, though at some point we should try to get the rest of the patch in
> (and revert the other fix)

Agreed.
Comment 16 Scott Johnson (:jwir3) 2012-06-19 13:46:05 PDT
Created attachment 634573 [details] [diff] [review]
b763702-crashtest: Crashtest for this bug
Comment 17 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-06-19 13:47:57 PDT
Comment on attachment 634573 [details] [diff] [review]
b763702-crashtest: Crashtest for this bug

r=dbaron if you've checked that it crashes in the test harness without the patch
Comment 18 Scott Johnson (:jwir3) 2012-06-19 14:19:20 PDT
(In reply to David Baron [:dbaron] from comment #17)
> Comment on attachment 634573 [details] [diff] [review]
> b763702-crashtest: Crashtest for this bug
> 
> r=dbaron if you've checked that it crashes in the test harness without the
> patch

I have confirmed that it does, assuming that we're in optimized mode and not in debug mode (since debug mode automatically assigns the frame state bits for assertion checking).
Comment 20 Jet Villegas (:jet) 2012-06-20 10:49:52 PDT
The patch looks safe enough to take as a late fix. Re-nomming...
Comment 22 Scott Johnson (:jwir3) 2012-06-21 08:45:49 PDT
Comment on attachment 634561 [details] [diff] [review]
b763702

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 706193
User impact if declined: Some crashes in native fennec, #1 topcrash
Testing completed (on m-c, etc.): Some testing completed on m-c, but most testing performed locally.
Risk to taking this patch (and alternatives if risky): Very low risk, as it's only a small change that will, in additional cases, add font inflation state bits to frames. This will, at worst, result in a small performance loss if there is a problem with the code for some reason.
String or UUID changes made by this patch: none
Comment 23 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2012-06-22 08:45:46 PDT
Verified fixed for trunk Firefox (on Desktop), using the testcase.
Comment 24 Aaron Train [:aaronmt] 2012-06-22 09:16:39 PDT
*** Bug 767298 has been marked as a duplicate of this bug. ***
Comment 25 Alex Keybl [:akeybl] 2012-06-22 12:23:25 PDT
Comment on attachment 634561 [details] [diff] [review]
b763702

[Triage Comment]
Approving for Aurora 15, in preparation for taking prior to our next Beta 14 build.
Comment 26 Scott Johnson (:jwir3) 2012-06-22 14:18:58 PDT
Pushed to aurora:
https://hg.mozilla.org/releases/mozilla-aurora/rev/eed1fb59b32f
Comment 27 Aaron Train [:aaronmt] 2012-06-24 08:05:10 PDT
*** Bug 767804 has been marked as a duplicate of this bug. ***
Comment 28 Alex Keybl [:akeybl] 2012-06-26 09:51:01 PDT
Comment on attachment 634561 [details] [diff] [review]
b763702

Haven't heard about regressions yet - let's land on beta.
Comment 29 Scoobidiver (away) 2012-06-26 10:19:35 PDT
The latest crash on Aurora happened in 15.0a2/20120622042007.
Comment 30 Scott Johnson (:jwir3) 2012-06-26 11:53:35 PDT
Pushed to beta:
https://hg.mozilla.org/releases/mozilla-beta/rev/150edb8ff472
Comment 31 Scoobidiver (away) 2012-06-29 23:25:15 PDT
*** Bug 769740 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.