Closed Bug 841967 Opened 7 years ago Closed 7 years ago

What should we do about security.notification_enable_delay?

Categories

(Toolkit :: Telemetry, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox20 --- unaffected
firefox21 + disabled
firefox22 + fixed
firefox23 --- fixed

People

(Reporter: MattN, Assigned: MattN)

References

()

Details

(Whiteboard: [qa-])

Attachments

(2 files, 1 obsolete file)

I just landed the probe for POPUP_NOTIFICATION_MAINACTION_TRIGGERED_MS this week to measure "The time (in milliseconds) after showing a PopupNotification that the mainAction was first triggered".  In other words, the user sees a popup (aka. doorhanger) notification and they click the button for the default action. In my testing it's hard to do this in less than 200ms and that's when I trying to be quick and have my cursor at the correct coordinates (see testcase at [1]).

We want to know the response time of our users because we added a security delay (currently 500ms) to protect users from being tricked into giving permissions to a website (bug 583175).

IIUC, the data is showing that over 50% of telemetry reports have users clicking the main action in less than 200ms which seems very unlikely for humans.  I suspect that there is some automation that is clicking the button as part of some test suite or there is something wrong with the probe.  Since the security delay is currently set to 500ms, the majority of main action clicks are getting ignored for security.  If there is some automation clicking the button I would expect the tests are now failing so will hopefully hear about it.

I'm wondering if people see a bug in the probe or have ideas as to what is causing the surprising data?

[1] http://jsfiddle.net/wrCs8/15/embedded/result/
Dave Hunt confirmed that Mozmill was submitting telemetry and I filed bug 841978 to have it disabled.  I suspect that this became an issue when it was enabled by default for Nightly and Aurora in 819732 since Dave told me that Mozmill is run on Nightlies.  If that is the case, then telemetry data on any channel where Mozmill runs may be skedwed since that landed.  For most channels that is probably fine but it is noticeable on Nightly where the population is smaller.

I'll keep this bug open until Telemetry is disabled in Mozmill and I see that the telemetry numbers look more reasonable.
Daniel, could we get data on Telemetry reports with the probe POPUP_NOTIFICATION_MAINACTION_TRIGGERED_MS having a value <= 50ms? We'd like to see if there is a pattern in the subnet or IP address owner to identify who or what is running automated tests on Nightly and sending telemetry data.

Thanks
Assignee: nobody → deinspanjer
Flagging for tracking Firefox 21 since that's when the security delay was added in bug 583175.  We need accurate data to know if users will hit the security delay in everyday use and whether we need to tweak the pref before release.
Will file a metrics bug for comment 2.
Assignee: deinspanjer → mnoorenberghe+bmo
Summary: POPUP_NOTIFICATION_MAINACTION_TRIGGERED_MS data seems incorrect → What should we do about security.notification_enable_delay?
Depends on: 847664
(In reply to Matthew N. [:MattN] from comment #3)
> Flagging for tracking Firefox 21 since that's when the security delay was
> added in bug 583175.  We need accurate data to know if users will hit the
> security delay in everyday use and whether we need to tweak the pref before
> release.

Hey :MattN, looks like bug 847664 has been recently updated with the data needed here.Can you please help with the needed analysis and help with next steps on this bug accordingly. Thanks !
I don't see a trend in the raw data and there are hits from around the world so it seems like it's not our own automation that's causing this. I asked in the SUMO contributor forum if people have seen any reports of this problem[1]. About 1.5%[2] of main action clicks on beta are reported as being in less than 50ms after it appears.

It's possible that the issue is with the resolution of Date.now() and perhaps switching to another API such as window.performance.now() would fix the issue. I would suspect that the click delay is being calculated as 0 as the two consecutive Date.now() calls are getting the same value. (This is a known issue).

Does anyone have any other ideas? I'll quickly attempt using performance.now().

[1] https://support.mozilla.org/en-US/forums/support-forum-contributors/709151
[2] http://bit.ly/XHlXnl
I think this may help.

Another possible improvement would be to set timeShown before calling openPopup in PopupNotifications_showPanel but I didn't dig into openPopup to see if it's possible the panel is shown for a while before the NOTIFICATION_EVENT_SHOWN callback is called. I wasn't sure of the cost of another notificationsToShow.forEach(…) before this.panel.openPopup. What do you think dolske?

At this point I don't see a strong case for reducing the timeout as it's only the less than 50ms bucket that seems like an issue and I'm guessing those are all 0 because Date.now is returning the same time twice.
Attachment #738856 - Flags: review?(dolske)
Thanks for looping us in. I haven't seen anything personally but is also truth that we don't necessarily see a lot of feedback on edge cases in pre-release builds. I will pass the message on to see if there's anything that I have missed.
This implements the other improvement that I mentioned in comment 7.  Since there are more clues that lead me to believe this is the problem:
If n.timeShown is undefined in _onButtonCommand then timeSinceShown is NaN and telemetry would put that in the 0-50ms bucket. The good news is that the security protection wouldn't take effect in that case since timeSinceShown < this.buttonDelay would be false. This aligns with a lack of bug reports (although the low amount of potentially affected users [1.5%] is also a factor).

An guess at why this could happen:
An exception is thrown in a notification callback before the affected notification.  This would prevent notificationsToShow.forEach from iterating over the remaining notifications and setting n.timeShown, therefore leading to timeSinceShown being NaN.
Attachment #738856 - Attachment is obsolete: true
Attachment #738856 - Flags: review?(dolske)
Attachment #738890 - Flags: review?(dolske)
(In reply to Matthew N. [:MattN] from comment #9)
> An guess at why this could happen:
> An exception is thrown in a notification callback before the affected
> notification.  This would prevent notificationsToShow.forEach from iterating
> over the remaining notifications and setting n.timeShown, therefore leading
> to timeSinceShown being NaN.

Hrm, seems to me like _fireCallback should be protecting against exceptions thrown in the consumer-specified callbacks (catching and reporting exceptions). Same for the other callbacks (mainAction, secondaryActions, etc.).
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #10)
> Hrm, seems to me like _fireCallback should be protecting against exceptions
> thrown in the consumer-specified callbacks (catching and reporting
> exceptions). Same for the other callbacks (mainAction, secondaryActions,
> etc.).

I agree and I started to include that in the v.2 patch but then thought it should go in a separate bug.  If we think this is the primary cause of the problem then I could go back to v.1 and add try/catch but at this point I'm still guessing at the causes. What do you prefer?
New bug is fine, I just want to make sure we do it :)
[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 583175
User impact if declined: Possibility that ~1.5% of first clicks on the main action of a popup notification (doorhanger) do not work.
Testing completed (on m-c, etc.): None yet. Functionality will return to prior behaviour (no security delay).
Risk to taking this patch (and alternatives if risky): Low risk pref change. We could also just take the long-term fix to use more accurate timers and set the time earlier.
String or IDL/UUID changes made by this patch: None
Attachment #740521 - Flags: review?(dolske)
Attachment #740521 - Flags: approval-mozilla-beta?
Attachment #738890 - Flags: review?(dolske) → review+
Attachment #740521 - Flags: review?(dolske) → review+
(In reply to Matthew N. [:MattN] from comment #13)
> Created attachment 740521 [details] [diff] [review]
> Beta v.1 - Disable the security delay
> 
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): bug 583175
> User impact if declined: Possibility that ~1.5% of first clicks on the main
> action of a popup notification (doorhanger) do not work.
> Testing completed (on m-c, etc.): None yet. Functionality will return to
> prior behaviour (no security delay).
> Risk to taking this patch (and alternatives if risky): Low risk pref change.
> We could also just take the long-term fix to use more accurate timers and
> set the time earlier.

lets take the low risk fwd fix for beta.We can still try and land the patch for the long-term fix on aurora to mitigate risk in this cycle here .
> String or IDL/UUID changes made by this patch: None
Comment on attachment 740521 [details] [diff] [review]
Beta v.1 - Disable the security delay

Please make sure to land by tomorrow noon for this to make it into Fx21 beta4.Thanks !
Attachment #740521 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
https://hg.mozilla.org/mozilla-central/rev/aa620f3fc2f7
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Matt, should the Fx21 patch land on Fx22 too?
Comment on attachment 738890 [details] [diff] [review]
v.2 performance.now + set timeShown before openPopup

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 583175
User impact if declined: Possibility that ~4% of first clicks on the main action of a popup notification (doorhanger) do not work.
Testing completed (on m-c, etc.): m-c since 2013-04-23 (3 weeks). Rode train to m-a.
Risk to taking this patch (and alternatives if risky): Low risk code movement and drop-in replace of Date.now with performance.now
String or IDL/UUID changes made by this patch: None
Attachment #738890 - Flags: approval-mozilla-beta?
Attachment #738890 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Is there anything QA can do here? If so, can this be verified using the Talos Tool?
(In reply to Paul Silaghi [QA] from comment #22)

I don't think QA would be able to help here.
Thanks Matthew.
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.