Closed Bug 722394 Opened 12 years ago Closed 7 years ago

Increase in plugin hangs in Firefox 11

Categories

(Core Graveyard :: Plug-ins, defect)

Priority: Not set
Severity: normal

Tracking

(firefox11-, firefox12-, firefox13-)

RESOLVED INCOMPLETE
Tracking Status
firefox11 - ---
firefox12 - ---
firefox13 - ---

People

(Reporter: kairo, Unassigned)

Since we backed out bug 716945 on 11, we should be getting the same volume of plugin hangs as in 9 and earlier, but that doesn't turn out to be the case. In fact, from the few days of data we already have, it looks like the hang volume nearly doubled instead.

Here's some data ("Ratio" is hangs per 100 ADU):

11.0a2:
Date         Hangs   ADU       Ratio
2012-01-29   1,498   118,563   1.26
2012-01-28   1,277   115,531   1.11
2012-01-27   1,271   126,332   1.01
2012-01-26*  1,107   125,715   0.88
2012-01-25     235   128,157   0.18

* This seems to be the first build with bug 716945 backed out, so the numbers were still ramping up on that day.

9.0a2 (at similar ADU level):
Date         Hangs   ADU       Ratio
2011-11-11     819   114,758   0.71
2011-11-10     772   125,244   0.62
2011-11-09     726   127,095   0.57
2011-11-08     689   123,375   0.56
2011-11-07     729   119,878   0.61
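
As a cross-check of the "Ratio" column and the "nearly doubled" estimate, here is a minimal Python sketch that recomputes hangs per 100 ADU from the figures in the two tables. The function names are made up for illustration, and restricting 11.0a2 to 2012-01-27 through 2012-01-29 (the full days after the ramp-up day) is an assumption of this sketch, not something the data mandates.

```python
# Daily figures copied from the tables above: (date, hangs, ADU).
# Assumption: for 11.0a2 only the days after the ramp-up day (2012-01-26) are used.
HANGS_11_0A2 = [
    ("2012-01-29", 1498, 118563),
    ("2012-01-28", 1277, 115531),
    ("2012-01-27", 1271, 126332),
]
HANGS_9_0A2 = [
    ("2011-11-11", 819, 114758),
    ("2011-11-10", 772, 125244),
    ("2011-11-09", 726, 127095),
    ("2011-11-08", 689, 123375),
    ("2011-11-07", 729, 119878),
]


def hangs_per_100_adu(hangs, adu):
    """Return the 'Ratio' column: hang reports per 100 active daily users."""
    return 100.0 * hangs / adu


def pooled_ratio(rows):
    """Ratio over a whole period: total hangs divided by total ADU."""
    total_hangs = sum(hangs for _, hangs, _ in rows)
    total_adu = sum(adu for _, _, adu in rows)
    return hangs_per_100_adu(total_hangs, total_adu)


if __name__ == "__main__":
    for date, hangs, adu in HANGS_11_0A2 + HANGS_9_0A2:
        print(f"{date}  {hangs:>5,}  {adu:>8,}  {hangs_per_100_adu(hangs, adu):.2f}")
    factor = pooled_ratio(HANGS_11_0A2) / pooled_ratio(HANGS_9_0A2)
    print(f"11.0a2 vs 9.0a2 pooled ratio factor: {factor:.2f}")
```

With these inputs the pooled 11.0a2 ratio comes out at roughly 1.8 times the pooled 9.0a2 ratio, which matches the "nearly doubled" reading.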

See more detailed data at https://crash-stats.mozilla.com/daily?form_selection=by_version&p=Firefox&v[]=9.0a2&throttle[]=100.00&v[]=11.0a2&throttle[]=100.00&hang_type=hang&os[]=Windows&os[]=Mac&os[]=Linux&date_start=2011-10-01&date_end=2012-01-29&submit=Generate

Unfortunately, we currently don't get good signatures for those hangs; we need bug 721382 to fix that. Still, there apparently is a large increase and we need to get a handle on it, esp. given that we only have 6 weeks (or less) on beta to react to any code change that could have caused this.
We also don't have good data on when the increase might have happened, because bug 716945 masked it until last week.
Depends on: 716945, 721382
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #0)
> Unfortunately, we currently don't get good signatures for those hangs, we need
> bug 721382 to fix that.

Bah, that was a dupe of my own bug 709209. :-/
Depends on: 709209
No longer depends on: 721382
I took a bit of a look into how hang volume fared on trunk to try to see if we can find a regression window there.

https://crash-stats.mozilla.com/daily?form_selection=by_version&p=Firefox&v[]=10.0a1&throttle[]=100.00&v[]=9.0a1&throttle[]=100.00&v[]=&throttle[]=100.00&v[]=&throttle[]=100.00&hang_type=hang&os[]=Windows&os[]=Mac&os[]=Linux&date_start=2011-09-15&date_end=2011-11-15&submit=Generate
This mainly shows how the rate decreased around 2011-10-29 when bug 716945 landed. Not much else there, esp. as 2011-11-08 was uplift day, so from the 9th on people were moving away from 10 Nightly and the rates can't be counted on after that due to that decrease in ADU.

https://crash-stats.mozilla.com/daily?form_selection=by_version&p=Firefox&v[]=11.0a1&throttle[]=100.00&v[]=10.0a1&throttle[]=100.00&v[]=&throttle[]=100.00&v[]=&throttle[]=100.00&hang_type=hang&os[]=Windows&os[]=Mac&os[]=Linux&date_start=2011-10-28&date_end=2011-12-24&submit=Generate
This is probably the "hot" period. Somewhere in that timeframe (when 11 was on Nightly) we should have had the regression. There's nothing really clear there, though. There *could* be something between 2011-11-26 and 2011-11-29, and we probably should look at plugin-related checkins in that timeframe very closely, but what the graph shows there doesn't really stand out from the noise, so this looks inconclusive in the end. 2011-12-20 was the next uplift day, FTR.

https://crash-stats.mozilla.com/daily?form_selection=by_version&p=Firefox&v[]=13.0a1&throttle[]=100.00&v[]=12.0a1&throttle[]=100.00&v[]=11.0a1&throttle[]=100.00&v[]=&throttle[]=10.00&hang_type=hang&os[]=Windows&os[]=Mac&os[]=Linux&date_start=2011-12-25&date_end=2012-02-06&submit=Generate
Just for completeness, that's a "since then" graph on trunk; it doesn't give us any new info.

So, my advice here is to look at *all* plugin-related checkins from 2012-01-25 to the end of 11 Nightly, i.e. 2011-12-20, and to take an extra close look at the timeframe between 2011-11-25 and 2011-11-29.
This is a grave problem and we need to investigate; unfortunately, the numbers don't help us much. :(
Bug 705365 which shortened the plugin hang timeout from 45 seconds to 25 seconds landed on 25-Nov.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> Bug 705365 which shortened the plugin hang timeout from 45 seconds to 25
> seconds landed on 25-Nov.

Smells like a smoking gun. As we turned off chromehang (and I guess might remove it at some point given we implement this stuff on the telemetry side), it might make sense to revert this to 45s again.
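
To illustrate why a shorter timeout alone could inflate the reported hang volume, here is a small Python sketch. It is a toy model, not Firefox's actual hang detector: the function name and the stall durations are invented for illustration, and the only values taken from this bug are the 45-second and 25-second timeouts.

```python
def count_reported_hangs(stall_seconds, timeout_seconds):
    """Count stalls that last at least as long as the hang timeout.

    Toy model: a plugin stall is reported as a hang once it reaches the timeout.
    """
    return sum(1 for stall in stall_seconds if stall >= timeout_seconds)


# Hypothetical plugin stall durations in seconds (invented, not measured data).
HYPOTHETICAL_STALLS = [5, 12, 26, 30, 33, 41, 47, 60, 90]

old_count = count_reported_hangs(HYPOTHETICAL_STALLS, 45)  # timeout before bug 705365
new_count = count_reported_hangs(HYPOTHETICAL_STALLS, 25)  # timeout after bug 705365

if __name__ == "__main__":
    print(f"45s timeout: {old_count} hangs reported")
    print(f"25s timeout: {new_count} hangs reported")
```

In this model every stall between 25 and 45 seconds that would previously have recovered unnoticed is now counted as a hang, so the count rises even though plugin behavior itself is unchanged.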
Tracking in case bug 725869 doesn't resolve the issue. If it does, we should consider doing the same on Aurora 12 and m-c 13.
Depends on: 705365
Bug 725869 has the fix, but this bug has all the proper tracking etc. Also, that other bug is marked fixed even though the fix landed for 11 only; 12 and 13 don't have it.

This is a bit of a Bugzilla mess now. I'm not sure how to proceed here without losing proper tracking for 12 and 13 while still having things marked correctly.

Alex, Sheila, any proposal on how to do that?
Blocks: 705365
Depends on: 725869
No longer depends on: 705365
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #6)
> Alex, Sheila, any proposal on how to do that?

We'll track the speculative fix in bug 725869 separately, and will likely do the same for m-c/Aurora soon. Both bugs are now tracked and will not fall off of my radar.
Based upon the feedback in https://bugzilla.mozilla.org/show_bug.cgi?id=725869#c3, we'll untrack for FF11/12.
Backout of bug 725869 occurred for both FF13 and trunk. No need to track any longer.
Resolving old bugs which are likely not relevant any more, since NPAPI plugins are deprecated.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Product: Core → Core Graveyard