Closed Bug 763702 Opened 12 years ago Closed 12 years ago

crash in nsFontInflationData::FindFontInflationDataFor at crash address 0x28 (((nsIFrame*)0)->GetStateBits())

Categories

(Core :: Layout, defect, P1)

14 Branch
ARM
Android
defect

Tracking

()

VERIFIED FIXED
mozilla16
Tracking Status
firefox14 - fixed
firefox15 + verified
firefox16 --- verified
blocking-fennec1.0 --- .N+
fennec 15+ ---

People

(Reporter: bugs, Assigned: jwir3)

References

(Blocks 1 open bug)

Details

(Keywords: crash, testcase, topcrash, Whiteboard: [native-crash])

Crash Data

Attachments

(3 files, 2 obsolete files)

+++ This bug was initially created as a clone of Bug #749186 +++

This will record new instances of crashes with this signature. Bug 749186 will be closed/fixed as the STR in that bug no longer reproduce. However, there may still be instances of similar crashes in the field. We'll need new STR's with any new crash reports.
This doesn't have a testcase, does it?
It's #5 top crasher in 15.0a2. Note that the patch of bug 749186 landed in 15.0a2/20120606.
Whiteboard: [native-crash]
tracking-fennec: ? → 15+
blocking-fennec1.0: ? → -
It accounts for 19.6% of crashes in 14.0b7 after the patch of bug 749186 landed.
Before the patch, it was accounting for 18.5% of crashes in 14.0b6.
Attached file testcase
Ok, I'm seeing this crash every time with this testcase
Reasking for blocking Fennec1.0, since this is the biggest topcrash by far and there is a testcase now.
blocking-fennec1.0: - → ?
Keywords: qawantedtestcase
(In reply to Martijn Wargers [:mw22] (QA - IRC nick: mw22) from comment #4)
> Created attachment 633823 [details]
> testcase
> 
> Ok, I'm seeing this crash every time with this testcase

Hmm... that's really interesting, because it's a mobile-optimized testcase. Font inflation should be disabled for this testcase, which in turn should not cause this code to be executed.

I'm looking into it.
Assignee: nobody → sjohnson
blocking-fennec1.0: ? → .N+
Attached patch b763702 - WIP (obsolete) — Splinter Review
This is a patch from bug 749186 that we were originally going to land, but it throws assertions in two of our reftests for font inflation. I'm trying right now to debug these assertions, but the good news is that it appears to fix the crash in this bug.
(In reply to Scott Johnson (:jwir3) from comment #7)
> Created attachment 634138 [details] [diff] [review]
> b763702 - WIP
> 
> This is a patch from bug 749186 that we were originally going to land, but
> it throws assertions in two of our reftests for font inflation. I'm trying
> right now to debug these assertions, but the good news is that it appears to
> fix the crash in this bug.

So, it appears that the assertions are not actually triggered by video-1.html and intrinsic-min-1.html, the tests that are reported to have assertions. Instead, if I comment out the test above those, container-with-clamping.html, the assertions go away, and the tests all pass. 

This test, container-with-clamping.html, has an iframe element that loads another html page. I'm wondering if perhaps there's an extra presShell that is getting thrown away that shouldn't be (or perhaps is not getting thrown away when it should be...).
Attached patch b763702 - WIP (v2) (obsolete) — Splinter Review
I think I've found what's causing the issue, with dbaron's help. It seems that in the case of <iframe> elements, the parent document is getting reframed first, which seems to cause problems. If I only reframe the child, as in the example I've posted, then everything seems to work fine.

The problem I'm faced with now, though, is that I want to do this only in certain cases. That is, the crash might still happen in cases where there isn't a second document in the document tree. I'm looking at finding a method of doing this efficiently right now.

In the case where we have nested documents like the example container-with-clamping.html, do we need to reframe each document individually? (I guess what I'm asking is whether we should perform the reframe from the leaves of the document tree upward, or whether we should ONLY reframe the leaves). I want to avoid reframing sub documents multiple times, but I'm not sure what the best approach to doing this might be.

Also, it's probably worth mentioning that while I don't think this patch is (as it stands right now) incredibly risky, I do think that it's not ready to be landed until we put some more testing into it. This crash has been quite tricky to track down, and we really should have some additional testing before we release this patch into the wild, just to verify that it actually fixes the crash...
Attachment #634138 - Attachment is obsolete: true
>		if (!doc->GetParentDocument()) {
>		// We don't want to reconstruct frames for the child document
>		// twice.
>		return;
>		}

Note that this code is actually not doing anything if we don't have a parent document. This is the first approach to this, just to verify that I could get it to work in the multi-document case, but it's not going to fix the crash for the single document case.
(In reply to Scott Johnson (:jwir3) from comment #10)
> Note that this code is actually not doing anything if we don't have a parent
> document. This is the first approach to this, just to verify that I could
> get it to work in the multi-document case, but it's not going to fix the
> crash for the single document case.

Actually, bz said that my assumption might not be correct. Apparently, all non-chrome documents should have a non-null parent. So, this might be filtering out chrome documents, which could be the cause of our problems.

Another thing we might be able to do to fix this bug is to simply re-traverse the frame tree when these preferences have changed, and reset state bits for font inflation. This would allow us to keep the existing frames, but set the appropriate state bits. I'm not sure whether there is something else we need to do that's being performed during frame construction, though, that needs to be done to prevent other problems.
OK, so the reason the previous patch wasn't sufficient here is that the result of nsContentUtils::GetViewportInfo can change, at the very least when the doctype changes as a result of displaying the parser error.  This means that keeping the prefs constant isn't sufficient to prevent nsLayoutUtils::FontSizeInflationEnabled from changing.
Attached patch b763702Splinter Review
This seems to fix the crash exhibited by the test case. I'm still working on making the testcase a crashtest.
Attachment #634272 - Attachment is obsolete: true
Attachment #634561 - Flags: review?(dbaron)
Comment on attachment 634561 [details] [diff] [review]
b763702

r=dbaron, though at some point we should try to get the rest of the patch in (and revert the other fix)
Attachment #634561 - Flags: review?(dbaron) → review+
(In reply to David Baron [:dbaron] from comment #14)
> Comment on attachment 634561 [details] [diff] [review]
> b763702
> 
> r=dbaron, though at some point we should try to get the rest of the patch in
> (and revert the other fix)

Agreed.
Attachment #634573 - Flags: review?(dbaron)
Comment on attachment 634573 [details] [diff] [review]
b763702-crashtest: Crashtest for this bug

r=dbaron if you've checked that it crashes in the test harness without the patch
Attachment #634573 - Flags: review?(dbaron) → review+
(In reply to David Baron [:dbaron] from comment #17)
> Comment on attachment 634573 [details] [diff] [review]
> b763702-crashtest: Crashtest for this bug
> 
> r=dbaron if you've checked that it crashes in the test harness without the
> patch

I have confirmed that it does, assuming that we're in optimized mode and not in debug mode (since debug mode automatically assigns the frame state bits for assertion checking).
Blocks: 766599
The patch looks safe enough to take as a late fix. Re-nomming...
tracking-fennec: 15+ → ?
blocking-fennec1.0: .N+ → ?
tracking-fennec: ? → 15+
blocking-fennec1.0: ? → .N+
https://hg.mozilla.org/mozilla-central/rev/46d151cf4291
https://hg.mozilla.org/mozilla-central/rev/e85f1e527b3e
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
Comment on attachment 634561 [details] [diff] [review]
b763702

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 706193
User impact if declined: Some crashes in native fennec, #1 topcrash
Testing completed (on m-c, etc.): Some testing completed on m-c, but most testing performed locally.
Risk to taking this patch (and alternatives if risky): Very low risk, as it's only a small change that will, in additional cases, add font inflation state bits to frames. This will, at worst, result in a small performance loss if there is a problem with the code for some reason.
String or UUID changes made by this patch: none
Attachment #634561 - Flags: approval-mozilla-beta?
Attachment #634561 - Flags: approval-mozilla-aurora?
Verified fixed for trunk Firefox (on Desktop), using the testcase.
Status: RESOLVED → VERIFIED
Comment on attachment 634561 [details] [diff] [review]
b763702

[Triage Comment]
Approving for Aurora 15, in preparation for taking prior to our next Beta 14 build.
Attachment #634561 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 634561 [details] [diff] [review]
b763702

Haven't heard about regressions yet - let's land on beta.
Attachment #634561 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
The latest crash on Aurora happened in 15.0a2/20120622042007.
You need to log in before you can comment on or make changes to this bug.