Closed Bug 1264530 Opened 8 years ago Closed 8 years ago

crash in nsPluginInstanceOwner::UseAsyncRendering

Tracking

(firefox48 fixed, firefox49- fixed, relnote-firefox 48+, firefox50 fixed, firefox51 fixed)

Status:

RESOLVED FIXED

Milestone:

mozilla51

Tracking Flags:

Tracking

Status

firefox48

---

fixed

firefox49

fixed

relnote-firefox

---

48+

firefox50

---

fixed

firefox51

---

fixed

People

(Reporter: n.nethercote, Assigned: BenWa)

References

Details

(4 keywords)

Crash Data

Attachments

(2 files)

patch 8 years ago Benoit Girard (:BenWa) 1.23 KB, patch	n.nethercote : review+	Details \| Diff \| Splinter Review
Bug 1264530 - Hold on to Plugin Instance to survive frame poisoning. 8 years ago Benoit Girard (:BenWa) 58 bytes, text/x-review-board-request	jimm : review+ Sylvestre : approval-mozilla-aurora+ Sylvestre : approval-mozilla-beta+ Sylvestre : approval-mozilla-release+	Details

Nicholas Nethercote [inactive]

Reporter

Description

•

8 years ago

This bug was filed from the Socorro interface and is 
report bp-091b5f59-428a-4a40-9f25-9fb662160413.
=============================================================

45 crashes with this signature in the past seven days.

Nicholas Nethercote [inactive]

Reporter

Comment 1

•

8 years ago

bgirard, any ideas?

Nicholas Nethercote [inactive]

Reporter

Updated

•

8 years ago

Flags: needinfo?(bgirard)

Benoit Girard (:BenWa)

Assignee

Comment 2

•

8 years ago

umm, best guess is that mPluginWindow isn't always initialized.

Benoit Girard (:BenWa)

Assignee

Comment 3

•

8 years ago

Attached patch patch — Details — Splinter Review

I meant to attach this patch but forgot.

I'm not sure that this will help, but it's probably a good idea anyways.

Attachment #8747105 - Flags: review?(n.nethercote)

Nicholas Nethercote [inactive]

Reporter

Comment 4

•

8 years ago

Comment on attachment 8747105 [details] [diff] [review]
patch

Review of attachment 8747105 [details] [diff] [review]:
-----------------------------------------------------------------

I also don't see how this will help, but it can't hurt. I suggest leaving the bug open after you land this patch.

Attachment #8747105 - Flags: review?(n.nethercote) → review+

Benoit Girard (:BenWa)

Assignee

Comment 5

•

8 years ago

The crash addresses are all over the place:
https://crash-stats.mozilla.com/report/list?product=Firefox&signature=nsPluginInstanceOwner%3A%3AUseAsyncRendering#tab-reports

And we're de-referencing mPluginwindow. So I'd say this has a non trivial chance of addressing the issue.

Flags: needinfo?(bgirard)

Whiteboard: leave-open

Benoit Girard (:BenWa)

Assignee

Updated

•

8 years ago

Keywords: checkin-needed

Nicholas Nethercote [inactive]

Reporter

Comment 6

•

8 years ago

I see: if the do_GetService(MOZ_PLUGIN_HOST_CONTRACTID) call fails then without this patch mPluginWindow will be uninitialized. Good to fix, then.

Pulsebot

Comment 7

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/33c2d68c985b

Keywords: checkin-needed

Carsten Book [:Tomcat]

Comment 8

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/33c2d68c985b

[:philipp]

Comment 9

•

8 years ago

hi, crashes with this signature are still occurring on nightly & dev-edition after the patch has landed.

status-firefox49: --- → affected

status-firefox50: --- → affected

Jim Mathies [:jimm]

Comment 11

•

8 years ago

Currently #6 overall top crash in 48.

Jim Mathies [:jimm]

Updated

•

8 years ago

Keywords: topcrash

David Bolter [:davidb] (NeedInfo me for attention)

Comment 12

•

8 years ago

Adding a NI to make sure BenWa notices this is still alive. We need an owner.

Flags: needinfo?(bgirard)

Sylvestre Ledru [:Sylvestre]

Comment 13

•

8 years ago

Milan, do you know who else could help with this? Thanks.
We could take this as a ride along of a 48.0.1

Flags: needinfo?(milan)

Marcia Knous [:marcia]

Comment 14

•

8 years ago

This has now moved up to the #2 top overall rash in 48.

Sylvestre Ledru [:Sylvestre]

Comment 15

•

8 years ago

Johnny, any clues? Thanks

Flags: needinfo?(jst)

Benoit Girard (:BenWa)

Assignee

Comment 16

•

8 years ago

I can look on Wednesday if that's not too late.

Sylvestre Ledru [:Sylvestre]

Comment 17

•

8 years ago

It is a bit too late, this is blocking our capability to push to more than 25% of our release users and the creation of a dot release for 48.

Benoit Girard (:BenWa)

Assignee

Comment 18

•

8 years ago

From the looks of it this crash started to explode recently and most of the crashing addresses are at 0xfffffffff0de8023. I haven't seen something like this before. It could be that the upper word is getting corrupted with something like -1 but then why would the lower word always be same?

Is there's an easy way to get changelogs for the period where the crash rate started exploding?

Benoit Girard (:BenWa)

Assignee

Comment 19

•

8 years ago

Perhaps the crash explosion is linked with something such as the e10s role out?

Benoit Girard (:BenWa)

Assignee

Comment 20

•

8 years ago

Looks like we slice the object here:
https://dxr.mozilla.org/mozilla-central/source/dom/plugins/base/nsPluginInstanceOwner.cpp#3575

Probably not the problem since it's very old code however.

Jim Mathies [:jimm]

Comment 21

•

8 years ago

(In reply to Benoit Girard (:BenWa) from comment #19)
> Perhaps the crash explosion is linked with something such as the e10s role
> out?

Does not appear to be related, it's #33 in the e10s test group, #2 overall.

Marcia Knous [:marcia]

Comment 22

•

8 years ago

Adding regressionwindow-wanted since it was requested in Comment 18.

Keywords: regressionwindow-wanted

Milan Sreckovic [:milan] (needinfo for best results)

Comment 23

•

8 years ago

Not sure how I can help in this area, but I wonder if bug 132759 contains any useful hints.

Flags: needinfo?(milan)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 24

•

8 years ago

[Tracking Requested - why for this release]: Should we block on this?

tracking-firefox49: --- → ?

Keywords: regression

Jet Villegas (inactive)

Updated

•

8 years ago

Flags: needinfo?(bugs)

Marco Castelluccio [:marco]

Comment 25

•

8 years ago

This is happening often on Windows 10 (10.0.10586) [~60% of crashes with this signature compared to ~30% overall] and often with an URL starting with wyciwyg [~30% with this signature compared to less than 5% overall].

Marco Castelluccio [:marco]

Comment 26

•

8 years ago

Another interesting data point is that most users that are experiencing this crash don't have AdBlock Plus installed [more than 95% of crashes with this signature compared to less than 80% overall].

Benoit Girard (:BenWa)

Assignee

Comment 27

•

8 years ago

Thanks for the data. I'll keep it in mind while looking however nothing jumps to mind. I'll spend some of today searching.

Benoit Girard (:BenWa)

Assignee

Comment 28

•

8 years ago

Could the crash explosion correlate with the landing/uplift of https://hg.mozilla.org/releases/mozilla-release/rev/28ba9cceda61 ?

Benoit Girard (:BenWa)

Assignee

Comment 29

•

8 years ago

Some progress:
From the minidump: We're crashing looking at memory point by using nsPluginFrame's mInstanceOwner->mInstance because mInstanceOwner is bad. 28ba9cceda61 added a plugin call which can destroy the nsPluginFrame I believe so it's very likely responsible for the increase.

Marco Castelluccio [:marco]

Comment 30

•

8 years ago

Build ID 20160726073904 was normal for a few days (from 2016-07-27 to 2016-08-02) and started spiking from 2016-08-03 (see https://crash-stats.mozilla.com/signature/?date=%3E%3D2016-05-05&signature=nsPluginInstanceOwner%3A%3AUseAsyncRendering#graphs, if you select "build ID" in "Crashes per day, by ..."). So the spike seems due to a chenage in the external conditions (the finding from comment 26 would support this hypothesis).

If you look at the graph for 48.0a2, the number of crashes is pretty stable and basically the same before/after the uplift of bug 1269444 (although there is an interesting spike on June 1st): https://crash-stats.mozilla.com/signature/?date=%3E%3D2016-05-05&version=48.0a2&signature=nsPluginInstanceOwner%3A%3AUseAsyncRendering#graphs

N.B.: This doesn't mean that 28ba9cceda61 can't be the cause of the crash.

Benoit Girard (:BenWa)

Assignee

Comment 31

•

8 years ago

Ok cool, I think we got everything adding up.
WinDbg points the address to be a block of MEM_RESERVE memory which is something that frame poisoning. This would explain why nearly all the crashes are the same address. We end up deleting nsPluginFrame, we poison the values and we end up with a lot of crashes.

What changed is we added a plug-in call which means that now plugins can do whatever they please including destroying the plugin frame while we have an active call against it. Too bad' :marco data does not support this (nor does it reject it). But I'm sure enough to proceed with a patch.

Patch incoming.

Comment hidden (mozreview-request)

Review commit: https://reviewboard.mozilla.org/r/70758/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/70758/

Jim Mathies [:jimm]

Updated

•

8 years ago

Attachment #8779850 - Flags: review?(jmathies) → review+

Benoit Girard (:BenWa)

Assignee

Comment 33

•

8 years ago

Just a note on the patch. Without reproducing this bug I can't say for sure that it wont just move the crash somewhere else. But likely that will have a similar problem and similar crash signature and address clustering. But it's unlikely to make things worse... maybe?

Benoit Girard (:BenWa)

Assignee

Updated

•

8 years ago

Assignee: nobody → bgirard

Johnny Stenback (:jst)

Comment 34

•

8 years ago

Benoit, while I agree that it's hard to tell for sure, I think you're very likely looking at the real issue here. We have two options here, generally speaking... We either do what your patch does and continue executing this method after the plugin is dead, as your patch does. Or we bail, as we do a bit earlier in this function (on OSX).

Flags: needinfo?(jst)

Benoit Girard (:BenWa)

Assignee

Comment 35

•

8 years ago

Yes, I'm being very pessimistic in that statement. My first patch was a shot in the dark but this one is not and we will likely fix the issue. We should decide how far we want to uplift this. Sounds like we missed the point release already.

Let me know if you want me to change the patch to use bail out instead but I don't know which calls are destroying the frame. I believe it's not just ResolutionMayHaveChanged.

Benoit Girard (:BenWa)

Assignee

Comment 36

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/f1f22c6fad8184a3b233846c22c979e84c903cea
Bug 1264530 - Hold on to Plugin Instance to survive frame poisoning. r=jimm

Johnny Stenback (:jst)

Comment 37

•

8 years ago

I'm not convinced there's a good way to tell which is better. What's the crash volume for this one on nightly/aurora/beta? Can we know if this was fixed in a few days on nightly I guess is the real question? If we can, and it's not, we should change the patch to the other alternative approach.

Benoit Girard (:BenWa)

Assignee

Comment 38

•

8 years ago

Since it's a plugin change I want to get at least one day in nightly. However ill request uplift immediately after to aurora and beta. The fix looks good and if it isn't with that population we will know in time to have another look.

Flags: needinfo?(bgirard)

[:philipp]

Comment 39

•

8 years ago

(In reply to Johnny Stenback  (:jst, jst@mozilla.com) from comment #37)
> What's the crash volume for this one on nightly/aurora/beta?
on nightly there are generally 1-2 crashes a day, on aurora 3-10 per day & on beta around 100

Sylvestre Ledru [:Sylvestre]

Comment 40

•

8 years ago

Benoit, could you fill the uplift requests to aurora, beta & release? Thanks

Flags: needinfo?(bgirard)

Benoit Girard (:BenWa)

Assignee

Comment 41

•

8 years ago

Comment on attachment 8779850 [details]
Bug 1264530 - Hold on to Plugin Instance to survive frame poisoning.

I'm putting the request in but I'd like to have one day coverage in nightly before we uplift.

Approval Request Comment
[Feature/regressing bug #]: Bug 1269444 *might* of made it worse, we're not sure exactly
[User impact if declined]: Topcrasher due to deleted and sometimes poison framed.
[Describe test coverage new/current, TreeHerder]: There's some plugin and flash coverage but this bug path is not exercised.
[Risks and why]: Fairly low. This has a very high chance of fixing the bug. Worse case is likely we will see another signature pop-up with frame poisoning addresses. If so we can react fast but it's very unlikely.
[String/UUID change made/needed]: none

Attachment #8779850 - Flags: approval-mozilla-beta?

Attachment #8779850 - Flags: approval-mozilla-aurora?

Benoit Girard (:BenWa)

Assignee

Updated

•

8 years ago

Attachment #8779850 - Flags: approval-mozilla-release?

Sylvestre Ledru [:Sylvestre]

Comment 42

•

8 years ago

Comment on attachment 8779850 [details]
Bug 1264530 - Hold on to Plugin Instance to survive frame poisoning.

Taking it now to have it during this week end on aurora
Will decide on release next week

Attachment #8779850 - Flags: approval-mozilla-beta?

Attachment #8779850 - Flags: approval-mozilla-beta+

Attachment #8779850 - Flags: approval-mozilla-aurora?

Attachment #8779850 - Flags: approval-mozilla-aurora+

Benoit Girard (:BenWa)

Assignee

Updated

•

8 years ago

Flags: needinfo?(bgirard)

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 43

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/f1f22c6fad81

[:philipp]

Updated

•

8 years ago

status-firefox51: --- → fixed

Keywords: checkin-needed

Sylvestre Ledru [:Sylvestre]

Comment 44

•

8 years ago

Comment on attachment 8779850 [details]
Bug 1264530 - Hold on to Plugin Instance to survive frame poisoning.

Top crash #2, taking it in release.

Attachment #8779850 - Flags: approval-mozilla-release? → approval-mozilla-release+

Benoit Girard (:BenWa)

Assignee

Comment 45

•

8 years ago

It's been on Nightly for a bit now and this hasn't broken plugins AFAIK. Uplifting now should be fine.

Jet Villegas (inactive)

Updated

•

8 years ago

Flags: needinfo?(bugs)

Ryan VanderMeulen [:RyanVM]

Updated

•

8 years ago

Status: NEW → RESOLVED

Closed: 8 years ago

Keywords: checkin-needed

Resolution: --- → FIXED

Whiteboard: leave-open

Target Milestone: --- → mozilla51

Mark Banner (:standard8)

Comment 46

•

8 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-aurora/rev/fa8506ec4649

status-firefox50: affected → fixed

Mark Banner (:standard8)

Comment 47

•

8 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-beta/rev/3af56860ad1b

status-firefox49: affected → fixed

Mark Banner (:standard8)

Comment 48

•

8 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-release/rev/e360efec6839

status-firefox48: affected → fixed

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

8 years ago

tracking-firefox49: ? → -

Gerry Chang [:gchang]

Comment 49

•

8 years ago

Added in the 48.0.1 release notes: "Fix a top crash caused by plugin issues".

relnote-firefox: --- → 48+

Marco Castelluccio [:marco]

Comment 50

•

8 years ago

Looks like the volume went back to normal for 48.0 (which does not include the patch). Just one crash with 48.0.1, so I guess we can call it fixed.

I suppose the problem was due to some Flash advertisement, which bug 1269444 made worse.

BMO Automation

Updated

•

2 years ago

Product: Core → Core Graveyard

You need to log in before you can comment on or make changes to this bug.