Closed Bug 770041 Opened 9 years ago Closed 9 years ago

Firefox 16.0a1 crash in nsIFrame::GetOffsetToCrossDoc

Categories

(Core :: Layout, defect)

15 Branch
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
mozilla16

People

(Reporter: scoobidiver, Assigned: mattwoodrow)

References

Details

(Keywords: crash, regression, reproducible, Whiteboard: [native-crash])

Crash Data

Attachments

(2 files)

It's #1 top crasher in today's build with about 70 crashes per hour!
It first appeared in 16.0a1/20120701. The regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f08d285b63b0&tochange=d9d61d199b114
It's likely a regression from bug 539356.

Here are some comments:
"Mouse back button crash. "
"This is reproducible but depends on the google search result. The last operation before crash was selecting one of the search result then scrolled down on that page then pressing ALT+LEFT to come back to the serach result. The moment ATL+LEFT is pressed the browser crassehs."

Signature 	nsIFrame::GetOffsetToCrossDoc(nsIFrame const*, int) More Reports Search
UUID	c7def41e-831c-4fe6-ad80-191802120701
Date Processed	2012-07-01 18:14:18
Uptime	522
Last Crash	1.6 weeks before submission
Install Age	8.7 minutes since version was first installed.
Install Time	2012-07-01 18:04:40
Product	Firefox
Version	16.0a1
Build ID	20120701030537
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7600
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 37 stepping 5
Crash Reason	EXCEPTION_STACK_OVERFLOW
Crash Address	0x5b61b8bc
App Notes 	
AdapterVendorID: 0x8086, AdapterDeviceID: 0x0046, AdapterSubsysID: 3674103c, AdapterDriverVersion: 8.15.10.2430
D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ 
Processor Notes 	This dump is too long and has triggered the automatic truncation routine
EMCheckCompatibility	True
Adapter Vendor ID	0x8086
Adapter Device ID	0x0046
Total Virtual Memory	2147352576
Available Virtual Memory	1552732160
System Memory Use Percentage	75
Available Page File	2039160832
Available Physical Memory	482992128

Frame 	Module 	Signature 	Source
0 	xul.dll 	nsIFrame::GetOffsetToCrossDoc 	layout/generic/nsFrame.cpp:4300
1 	xul.dll 	nsIFrame::GetOffsetToCrossDoc 	layout/generic/nsFrame.cpp:4340
2 	xul.dll 	nsIFrame::GetOffsetToCrossDoc 	layout/generic/nsFrame.cpp:4340
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=nsIFrame%3A%3AGetOffsetToCrossDoc%28nsIFrame+const*%2C+int%29
Matt, maybe bug 539356, comment 57 can help with this.
I can reproduced this crash with IE Tab V2

STR
1. Create Clean Profile and Install IE Tab V2 https://addons.mozilla.org/ja/firefox/addon/ie-tab-2-ff-36/ and Restart
2. Show Addon Bar(Ctrl+/)
3. Goto  any page in current tab, and navigate any link in current tab
4. Switch to IE engine by click IE tab Icon in Addon Bar
5. Navigation back
6. Navigation forward

Actual Results:
Crashes bp-5712bef0-dec0-450d-93ee-0670a2120701

And I confirmed  bug 539356 causes this crash.
Keywords: reproducible
It impacts Mac OS X and Fennec (bp-34adbf88-92bd-49a9-8d52-4f8992120701):
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3AFramePropertyTable%3A%3AGet
Crash Signature: [@ nsIFrame::GetOffsetToCrossDoc(nsIFrame const*, int)] → [@ nsIFrame::GetOffsetToCrossDoc(nsIFrame const*, int)] [@ mozilla::FramePropertyTable::Get]
OS: Windows NT → All
Hardware: x86 → All
Whiteboard: [native-crash]
Assignee: nobody → matt.woodrow
This is a regression from part 3 of DBLI.

In this, in changed nsPresContext::GetParentPresContext() to succeed, even when the current pres context has no frames.

It looks like this call here used to return nsnull and now is returning a valid pres context: http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsObjectFrame.cpp#70

We then end up calling GetOffsetToCrossDoc with frames that are in different document hierarchies and it recurses infinitely. 

I assume we need extra checks in GetParentPresContext() to prevent this from happening, but I'm not sure what to check, ideas?
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> It looks like this call here used to return nsnull and now is returning a
> valid pres context:
> http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsObjectFrame.
> cpp#70

Incorrect link?
What if you do what tnikkel suggested and follow the view hierarchy instead? So: get the view manager, get its root view, see if the root view has a parent, if so, get that parent's view manager's pres context (perhaps by getting its root view's frame's prescontext).
This fixes the crash for me locally, I'll push it it to try to see if it breaks anything else.
Attachment #638250 - Flags: review?(roc)
I added a 64-bit signature that has similar comments, i.e. hitting Back button: https://crash-stats.mozilla.com/report/list?signature=nsPoint%3A%3AConvertAppUnits%28int%2C+int%29
Crash Signature: [@ nsIFrame::GetOffsetToCrossDoc(nsIFrame const*, int)] [@ mozilla::FramePropertyTable::Get] → [@ nsIFrame::GetOffsetToCrossDoc(nsIFrame const*, int)] [@ mozilla::FramePropertyTable::Get] [@ nsPoint::ConvertAppUnits(int, int)]
Even with a few crash problems fixed, this still isn't working correctly.

For DLBI, we need to always be able to get the painting refresh driver, which we do via the root pres context.

Though this patch fixes the crash, it makes GetRootPresContext fail in some situations where it wouldn't have previous, and we can't schedule a paint with the refresh driver.

Example failure:

https://tbpl.mozilla.org/php/getParsedLog.php?id=13159976&tree=Try&full=1
It appears that there are multiple definitions of the root pres context being used here, one by existing callers and one by DLBI.

This just splits this into two functions so that existing code is unaffected and DLBI gets the right object.
Attachment #638325 - Flags: review?(roc)
Comment on attachment 638250 [details] [diff] [review]
Use view system instead

>+  nsIViewManager *viewManager = mShell->GetViewManager();
>+  nsIView *rootView = viewManager->GetRootView();
>+  if (rootView) {
>+    nsIView *parentView = rootView->GetParent();
>+    if (parentView) {
>+      nsIFrame *parentFrame = parentView->GetFrame();
>+      if (parentFrame) {
>+        return parentFrame->PresContext();

I don't think this is right. parentView is the anonymous inner view of a subdocument frame. anonymous meaning it has no frame. You want to get the frame from parentView->GetParent().
Why do we need SchedulePaint to schedule a paint when GetRootPresContext() returns null?
I think we *might* not, but I'd like to keep it the same as we have it currently since it works.

I think we just need IsDOMPaintEventPending to always be able to find the painting refresh driver, since it can be called from PresShells with no frames. Failing to find the refresh driver, and returning false means that we don't wait for paints and fail reftests
Comment on attachment 638325 [details] [diff] [review]
Alternate fix: Add second GetRootPresContext

Review of attachment 638325 [details] [diff] [review]:
-----------------------------------------------------------------

OK, as a temporary fix.
Attachment #638325 - Flags: review?(roc) → review+
Crash Signature: [@ nsIFrame::GetOffsetToCrossDoc(nsIFrame const*, int)] [@ mozilla::FramePropertyTable::Get] [@ nsPoint::ConvertAppUnits(int, int)] → [@ nsIFrame::GetOffsetToCrossDoc(nsIFrame const*, int)] [@ nsIFrame::GetOffsetToCrossDoc] [@ mozilla::FramePropertyTable::Get] [@ nsPoint::ConvertAppUnits(int, int)]
https://hg.mozilla.org/mozilla-central/rev/25061ce7382b
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
Backed out as part of DLBI: https://hg.mozilla.org/mozilla-central/rev/25061ce7382b
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Keywords: topcrash
I can't reproduce this.
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → WORKSFORME
See Also: → 1378100
You need to log in before you can comment on or make changes to this bug.