Closed Bug 1198674 Opened 4 years ago Closed 4 years ago

crash in mozilla::layout::GetFrom(nsFrameLoader*)

Categories

(Core :: Graphics: Layers, defect, critical)

42 Branch
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox41 --- unaffected
firefox42 + fixed
firefox43 + fixed

People

(Reporter: mats, Assigned: nical)

Details

(Keywords: crash, regression, Whiteboard: gfx-noted)

Crash Data

Attachments

(3 files)

This bug was filed from the Socorro interface and is 
report bp-e81d857c-dfcf-4746-9090-f9edc2150825.
=============================================================

This crash appears to be a regression in v42:

Product 	Version 	Percentage 	Number Of Crashes
Firefox 	42.0a1 	36.79 %	78
Firefox 	42.0a2 	34.91 %	74
Firefox 	43.0a1 	22.64 %	48
Firefox 	41.0a2 	5.66 %	12
(no reported incidents in older versions; numbers are for 28 days)

All crash addresses are 0x48 or 0x40, so I'm guessing 'mFrameLoader'
is null here:
http://hg.mozilla.org/mozilla-central/annotate/04b8c412d9f5/layout/ipc/RenderFrameParent.cpp#l420


Stack:
mozilla::layout::GetFrom(nsFrameLoader*)
mozilla::layout::RenderFrameParent::OwnerContentChanged(nsIContent*)
nsFrameLoader::SetOwnerContent(mozilla::dom::Element*)
nsFrameLoader::StartDestroy()
nsXULElement::UnbindFromTree(bool, bool)
mozilla::dom::Element::UnbindFromTree(bool, bool)
mozilla::dom::Element::UnbindFromTree(bool, bool)
mozilla::dom::Element::UnbindFromTree(bool, bool)
mozilla::dom::Element::UnbindFromTree(bool, bool)
mozilla::dom::Element::UnbindFromTree(bool, bool)
mozilla::dom::Element::UnbindFromTree(bool, bool)
nsXBLBinding::UninstallAnonymousContent(nsIDocument*, nsIContent*)
nsXBLBinding::ChangeDocument(nsIDocument*, nsIDocument*)
nsBindingManager::RemovedFromDocumentInternal(nsIContent*, nsIDocument*)
mozilla::dom::FragmentOrElement::DestroyContent()
...
The "GetFrom(mFrameLoader)" call was added here:
http://hg.mozilla.org/mozilla-central/diff/6dd3690b879c/layout/ipc/RenderFrameParent.cpp
for bug 918634, so it might be a regression in v36 although it's only showing up
in statistics now when e10s is more widely enabled?
Flags: needinfo?(davidp99)
Whiteboard: gfx-noted
It appears the assignee in bug 918634 (David Parks) isn't active anymore.

Nicolas, do you know anyone who can investigate this?  I'm not really sure who
knows this code.
Flags: needinfo?(davidp99) → needinfo?(nical.bugzilla)
I don't know who owns this code (that is still active), but maybe Sotaro knows it enough that he can do the review.

Looking at the code it seems fair to just null-check and early return. For one we null-check the result of GetFrom, and the only ways I can see mFrameLoader being null (by reading the code) is if ActorDestroy has already run or if the creation of the object failed. Either way we should not be running the code that is executed in OwnerContentChanged. I don't know this code enough to know if this is the symptom of another bug, but null-checking here is simple enough that it should not cause any issue (other than potentially hiding the symptom of a bug) and will get rid of this crash.
Assignee: nobody → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
Attachment #8656480 - Flags: review?(sotaro.ikeda.g)
Comment on attachment 8656480 [details] [diff] [review]
null-check mFrameLoader

Review of attachment 8656480 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.
Attachment #8656480 - Flags: review?(sotaro.ikeda.g) → review+
https://hg.mozilla.org/mozilla-central/rev/4fe8c268e837
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
We still crash from the call in GetTextureFactoryIdentifier:
bp-aaa2c957-4da7-4e51-96ce-c44de2150907
Status: RESOLVED → REOPENED
Flags: needinfo?(nical.bugzilla)
Resolution: FIXED → ---
Crash Signature: [@ mozilla::layout::GetFrom(nsFrameLoader*)] → [@ mozilla::layout::GetFrom(nsFrameLoader*)] [@ nsFrameLoader::GetOwnerDoc() ]
I could have seen that one coming.
Basically, in TabChild::ProvideWindow we try to create a PRenderFrame actor pair. It fails on the host side because TabParent::GetFrameLoader returns nullptr in TabParent::AllocPRenderFrameParent. so we early-return in the constructor but the PRenderFrameParen is created still with aSuccess set to fals which means AddTabParentToTable is not called but the actor is still created. Back in TabChild::ProvideWindowCommon we immediately send a synchronous ipc to query some of RenderFrameParent's info, which leads to the crashing function GetTextureFactoryIdentifier.

If we add an extra null-check here we should be good because the following call to GetLayersId() will return 0 which will be interpreted as an error in TabChild::ProvideWindowCommon and the latter will then deallocate the PRenderFrame actor pair, and mFrameLoader won't be read again.

As to why is TabParent::GetFrameLoader, I have no idea and there are many paths in there that could lead to this so I can't get much further without reproducing locally. Hg history vaguely points to bug 1126089 (introduced some of the nullptr paths).
Flags: needinfo?(nical.bugzilla)
Attachment #8658603 - Flags: review?(sotaro.ikeda.g)
Attachment #8658603 - Flags: review?(sotaro.ikeda.g) → review+
[Tracking Requested - why for this release]:
https://hg.mozilla.org/mozilla-central/rev/3d25434042f2
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Crash, tracking.

Nical, could you fill the uplift request to aurora? Thanks
Flags: needinfo?(nical.bugzilla)
Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: none
[Risks and why]: low risk, just a few null checks
[String/UUID change made/needed]: none
Flags: needinfo?(nical.bugzilla)
Attachment #8663586 - Flags: approval-mozilla-aurora?
Attachment #8663586 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.