Closed Bug 1111791 Opened 10 years ago Closed 9 years ago

Telemetry report: effect of the Flash protected-mode experiment

Categories

(Toolkit :: Telemetry, defect)

x86
Windows 7
defect
Not set
normal

RESOLVED FIXED

People

(Reporter: benjamin, Assigned: rvitillo)

Attachments

(1 file)

This bug tracks reporting on the Flash protected-mode experiment, bug 1110215.

We'd like to compare the two branches of the experiment with at least the following metrics:

* the plugin abort rate reported by SUBPROCESS_ABNORMAL_ABORT
* the plugin crash rate reported by SUBPROCESS_CRASHES_WITH_DUMP
* the rate and severity of browser hangs, if we have any useful metrics from beta (I understand that BHR is nightly/aurora only). Vladan says to use EVENTLOOP_UI_LAG_EXP_MS.
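
A minimal sketch of how such a per-branch comparison might be computed, assuming each session has already been flattened into a record with a branch name and an event count. The field names `branch` and `plugin_crashes`, and the sample records, are hypothetical and are not taken from the actual report.

```python
from collections import defaultdict

def summarize_by_branch(sessions, metric):
    """Return {branch: (share of sessions with >= 1 event, mean events per session)}."""
    # Per branch: [session count, sessions with at least one event, total events]
    totals = defaultdict(lambda: [0, 0, 0])
    for session in sessions:
        bucket = totals[session["branch"]]
        count = session.get(metric, 0)
        bucket[0] += 1
        bucket[1] += 1 if count > 0 else 0
        bucket[2] += count
    return {b: (hit / n, events / n) for b, (n, hit, events) in totals.items()}

# Made-up sessions, only to show the call shape (hypothetical field names).
sessions = [
    {"branch": "control", "plugin_crashes": 2},
    {"branch": "control", "plugin_crashes": 0},
    {"branch": "experiment", "plugin_crashes": 0},
    {"branch": "experiment", "plugin_crashes": 1},
]
summary = summarize_by_branch(sessions, "plugin_crashes")
```

Reporting both the share of affected sessions and the mean count per session keeps a few very crashy sessions from hiding behind a single number.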

Vladan, assuming we start getting data from beta tomorrow, is this something rvitillo or you could do and have results Friday morning?
Flags: needinfo?(vdjeric)
Yes, we could do an analysis by Friday morning, although I have to caution that data from the first few days (early adopters) might be skewed toward better machines.
Flags: needinfo?(vdjeric)
Assignee: nobody → rvitillo
Yes, that's a risk, but once we have the scripts we can repeat the analysis next week with more data. I suspect that because we have a control we'll see relatively useful numbers in any case.
I don't see how we can get anything useful out of the new probes by Friday. The probes landed on the 15th.

- For all the data collected on the 16th on Beta (~7M pings), only 38 submissions have the SUBPROCESS_CRASHES_WITH_DUMP or SUBPROCESS_ABNORMAL_ABORT histograms, and in all 38 of them the keyed histograms are empty (i.e. {}).
- For all the data collected so far today on Beta (~700K pings), only 13 submissions have the SUBPROCESS_CRASHES_WITH_DUMP or SUBPROCESS_ABNORMAL_ABORT keyed histograms, and only one of those 13 has non-empty SUBPROC* histograms.

Any idea why those keyed histograms are empty? The only usable histogram so far is EVENTLOOP_UI_LAG_EXP_MS.
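
For reference, a tally like the one above could be sketched roughly as follows. This assumes each ping is a dict with a `keyedHistograms` mapping from probe name to a per-key dict; that layout, the `plugin` key, and the sample pings are assumptions for illustration and not the actual job that produced the counts.

```python
PROBES = ("SUBPROCESS_CRASHES_WITH_DUMP", "SUBPROCESS_ABNORMAL_ABORT")

def count_probe_coverage(pings):
    """Return (pings carrying either probe, pings where a probe has at least one key)."""
    present = non_empty = 0
    for ping in pings:
        keyed = ping.get("keyedHistograms", {})
        found = [keyed[p] for p in PROBES if p in keyed]
        if found:
            present += 1
            # An empty dict ({}) is falsy, so any() is True only if some probe has keys.
            if any(found):
                non_empty += 1
    return present, non_empty

# Made-up pings, only to show the call shape.
pings = [
    {"keyedHistograms": {"SUBPROCESS_ABNORMAL_ABORT": {}}},
    {"keyedHistograms": {"SUBPROCESS_CRASHES_WITH_DUMP": {"plugin": {"sum": 1}}}},
    {"keyedHistograms": {}},
    {},
]
coverage = count_probe_coverage(pings)
```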
Flags: needinfo?(benjamin)
These probes are in 35.0b4 which I don't think has shipped yet.

When it does ship, we should see non-empty histograms when a plugin crashes, which is common enough that I expect it to show up.
Flags: needinfo?(benjamin)
Update: Because of bug 1112709, the hook we're using is not effective on Windows 8 and 8.1. This means that we should measure only users running Windows 7 or Windows Vista. (Windows XP doesn't have protected mode, so there's really no experiment there).

beta4 shipped today, the experiment is live, and uptake will continue over the next few days.

In case it wasn't clear from the other bugs, there are two experiment branches for this experiment: "control" keeps protected-mode on, and "experiment" turns it off.
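
A sketch of the filtering this implies, assuming each ping exposes the Windows NT version string and the experiment branch. The field names `os_version` and `branch` are hypothetical; the version numbers themselves are the standard NT ones (Vista is 6.0, Windows 7 is 6.1, Windows 8 is 6.2, Windows 8.1 is 6.3, XP is 5.1).

```python
# Only Windows Vista (6.0) and Windows 7 (6.1) are eligible for the comparison.
ELIGIBLE_NT_VERSIONS = {"6.0", "6.1"}

def split_eligible(pings):
    """Group pings from eligible Windows versions by experiment branch."""
    groups = {"control": [], "experiment": []}
    for ping in pings:
        if ping.get("os_version") not in ELIGIBLE_NT_VERSIONS:
            continue  # Windows 8/8.1 (hook ineffective) and XP (no protected mode)
        branch = ping.get("branch")
        if branch in groups:  # ignore pings that are not enrolled in either branch
            groups[branch].append(ping)
    return groups

# Made-up pings, only to show the call shape (hypothetical field names).
pings = [
    {"os_version": "6.1", "branch": "control"},
    {"os_version": "6.3", "branch": "experiment"},
    {"os_version": "6.0", "branch": "experiment"},
    {"os_version": "5.1", "branch": "control"},
    {"os_version": "6.1"},
]
groups = split_eligible(pings)
```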
Here is the report: http://nbviewer.ipython.org/gist/vitillo/3ecd70160a7fc3b24eff

TL;DR: no difference so far. If there truly is one, it's very small, and we would need to collect more data to verify it.
(In reply to Roberto Agostino Vitillo (:rvitillo) on PTO from 20/12-2/1 from comment #6)
> Here is the report:
> http://nbviewer.ipython.org/gist/vitillo/3ecd70160a7fc3b24eff
> 
Not found 404

I wonder whether the results would also be biased because Nightly users
1. may not repeatedly try something known to cause hangs
2. may well resolve the issue themselves, possibly by disabling protected mode
Disabling protected mode reduces by about 50% the number of sessions with at least one hang and by about 50% the number of sessions with at least one crash. Please see [1] for the precise definition of hang and crash.

Furthermore, the average number of crashes per session is reduced by about 75% and the average number of hangs is reduced by about 50%.

UI lag was reduced by disabling protected mode. All effects are statistically significant.

[1] The updated report is visible at http://nbviewer.ipython.org/gist/vitillo/a7d4a3689c8d1295eb7f.
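
As an illustration of what a relative reduction and a significance check look like for a "sessions with at least one crash" metric, here is a sketch using a pooled two-proportion z-test. The test choice is an assumption (the linked report may well use a different method), and the counts are invented round numbers, not results from the experiment.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (relative reduction of b vs a, z statistic, two-sided p-value)."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value under the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return 1 - p_b / p_a, z, p_value

# Invented counts: sessions with at least one crash, out of all sessions per branch.
reduction, z, p_value = two_proportion_z_test(
    hits_a=200, n_a=1000,  # control (protected mode on)
    hits_b=100, n_b=1000,  # experiment (protected mode off)
)
```

With large session counts, even small differences become significant under this test, so the effect size (the relative reduction) is worth reading alongside the p-value.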
The first report showed no difference for two reasons: we didn't have enough data, and the hook wasn't effective on Windows 8 and 8.1, which the analysis didn't account for.
(In reply to Roberto Agostino Vitillo (:rvitillo) on PTO from 20/12-2/1 from comment #8)
> UI lag was reduced by disabling protected mode. All effects are statistically
> significant.

Are you sure? I thought we concluded that the control group is higher on buckets 50ms upwards (which is everything we can see in the graph because we don't collect sub-50ms hangs), and from this we derived that it's highly likely that the invisible bucket of 0-50ms has more entries for the experiment group, which should indicate that the experiment group is performing better.

And the experiment group is with protected mode enabled. Is it not?
Also, when Roberto and I discussed it earlier, both of us were somewhat surprised that most of the difference appears at the lower buckets, while we expected Flash-related differences to manifest at higher buckets.

It's as if turning protected mode on or off affects unexpected buckets (the lower buckets), which we think should be dominated by non-Flash-related stuff.
(In reply to Avi Halachmi (:avih) from comment #10)
> (In reply to Roberto Agostino Vitillo (:rvitillo) on PTO from 20/12-2/1 from
> comment #8)
> > UI lag was reduced by disabling protected mode. All effects are statistically
> > significant.
> 
> Are you sure? I thought we concluded that the control group is higher on
> buckets 50ms upwards (which is everything we can see in the graph because we
> don't collect sub-50ms hangs), and from this we derived that it's highly
> likely that the invisible bucket of 0-50ms has more entries for the
> experiment group, which should indicate that the experiment group is
> performing better.
> 
> And the experiment group is with protected mode enabled. Is it not?

The experiment group has the protected mode disabled.
Right. Sorry for the mixup.

So yeah, everything we've been tracking improved with protected mode disabled, and especially the crashes and big hangs. The smaller hangs also improved, but to a lesser degree.
If I'm not mistaken bug 1110215 essentially only sets "ProtectedMode=0" in:
  "C:\Windows\[System32/SysWOW64]\Macromed\Flash\mms.cfg", correct?

If the above is true, then bug 949121 should probably be taken into account before a decision is made.

I believe the lack of reported similar problems is due to protected mode being enabled by default, and the stringentish requirements for reproducing it. Basically you need a mouse with a backwards/forwards button, which is clicked while hovering a video on Youtube or CNN.

Disabling protected mode could potentially unleash... bad stuff. Especially if bug 949121 isn't some freak of nature.
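
For anyone wanting to check which setting a given machine ended up with, a small sketch that reads the ProtectedMode value out of mms.cfg-style text. It assumes simple `key=value` lines with `#` comments and treats the key as case-insensitive; both are assumptions about the file format, and the function name is made up for illustration.

```python
def protected_mode_setting(cfg_text):
    """Return the last ProtectedMode value found in mms.cfg-style text, or None."""
    value = None
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        key, sep, val = line.partition("=")
        if sep and key.strip().lower() == "protectedmode":
            value = val.strip()  # a later entry overrides an earlier one
    return value

# Example text, made up for illustration.
sample = "# managed settings\nAutoUpdateDisable=1\nProtectedMode=0\n"
setting = protected_mode_setting(sample)
```

A return value of None here would mean the file carries no ProtectedMode entry at all, which is a different situation from an explicit `ProtectedMode=1`.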
(In reply to Johan C from comment #14)
> If I'm not mistaken bug 1110215 essentially only sets "ProtectedMode=0" in:
>   "C:\Windows\[System32/SysWOW64]\Macromed\Flash\mms.cfg", correct?
> 
> If the above is true, then bug 949121 should probably be taken into account
> before a decision is made.
> 
It looks as if bug 949121 and bug 1066600 may have been the same issue and were thought to be Adobe bugs.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1066600#c8
> > Since FF seems to be doing its job and preventing the whole browser from
> > crashing (as long as dom.ipc.plugins.enabled is set to true), maybe I should
> > look at this as a Flash bug and not a Firefox bug??
> 
> Yes, this is Flash crashing and probably best reported with Adob
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
(In reply to John Hesling [:John99] from comment #15)
> It looks as if bug 949121 and bug 1066600 may have been the same issue and were
> thought to be Adobe bugs
Indeed, that's why I resolved bug 1066600 as a duplicate of bug 949121. :)


> > Yes, this is Flash crashing and probably best reported with Adob
Filed https://bugbase.adobe.com/index.cfm?event=bug&id=3917345 with Adobe.

From my "manual regression testing" (bug 949121, comment 14), the further I backtracked with Firefox, it looked like Flash wasn't actually crashing, but locking up instead. In bug 949121, Firefox also locks up when Flash does, something I thought OOP-plugins was supposed to solve.

-----

Do you have data on stability for each Flash version? If so, how many users are stuck on older versions, and are later versions perhaps more stable? <- (With protected mode on.)

I disabled protected mode around when it first appeared in Flash (version 11.3?) because of stability issues, but since Flash version 15.(?), protected mode has been stable.

If later versions are in fact more stable, maybe protected mode should stay on? :)

-----

Something more relevant to this bug:
When I debugged protected mode further today, I discovered that toggling the pref introduced in bug 1108035 didn't actually update the setting in 'mms.cfg' (the "mms.cfg" file had been created manually, prior to bug 1108035).


Sorry for spamming this bug, I figured I had relevant info and that it couldn't hurt to mention these things. :)
One downside to protected mode is that the separate process presumably causes a lot more IPC traffic (between plugin-container.exe and the Flash process). A bit of an anecdote, but I found that Youtube videos were much less choppy on my laptop with protected mode turned off than with it on (at least while they were being loaded). My laptop only has a dual core CPU, and with protected mode on it couldn't keep up. It has been many months since then, however.
This bug is not the place to discuss the relative merits of protected mode. Please only comment in this bug if you have a question about the methodology or statistics for the report here.