Closed Bug 1625601 Opened 5 years ago Closed 5 years ago

Make improvements to new-password heuristics for accuracy and performance

Categories

(Toolkit :: Password Manager, enhancement, P1)

Desktop
Unspecified
enhancement

Tracking

()

RESOLVED FIXED
mozilla76
Tracking Status
firefox-esr68 --- unaffected
firefox74 --- unaffected
firefox75 --- unaffected
firefox76 --- fixed

People

(Reporter: bdanforth, Assigned: bdanforth)

References

Details

Attachments

(1 file)

After Bug 1595244 landed in Nightly, initial verification and telemetry data (courtesy of Bug 1548878, Bug 1616356 and Bug 1619498) identified some opportunities to improve the new-password heuristics -- particularly their performance and their accuracy in identifying new-password fields on password-change forms.

These improvements from the upstream GitHub repo should be incorporated into NewPasswordModel.jsm.

Assignee: nobody → bdanforth
Status: NEW → ASSIGNED
  • Update NewPasswordModel.jsm to use an improved model at 4a2349963d from the upstream GitHub repo (see this comment).
  • Add a check in the test_isProbablyANewPasswordField.js unit test to ensure the method returns early (i.e. returns false) when the feature is disabled by the signon.generation.confidenceThreshold pref.
  • Fix a couple of existing plain mochitests whose checkAutoCompleteResults assertions were failing because the updated model unexpectedly triggered the password generation result in the autocomplete popup for the tests' forms. Those forms now have enough signal for _isProbablyANewPasswordField to return false.
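
The early return described in the second bullet can be sketched roughly as follows. This is a standalone illustration, not the code from the patch: the function signature, the `scoreField` callback and the use of `-1` as the "disabled" value of signon.generation.confidenceThreshold are all assumptions made for the example.

```javascript
// Assumption for this sketch: a threshold of -1 means the feature is disabled.
const DISABLED_THRESHOLD = -1;

/**
 * Hypothetical stand-in for the real method.
 * @param {object} field - placeholder for a form field
 * @param {number} threshold - placeholder for the pref value
 * @param {function(object): number} scoreField - hypothetical model that
 *   returns a confidence in [0, 1]
 * @returns {boolean}
 */
function isProbablyANewPasswordField(field, threshold, scoreField) {
  if (threshold === DISABLED_THRESHOLD) {
    // Early return: the model is never run when the feature is off.
    return false;
  }
  return scoreField(field) >= threshold;
}

// Count model invocations so a test can tell an early return apart from
// a low score.
let modelCalls = 0;
const fakeModel = () => {
  modelCalls++;
  return 0.9;
};

const disabledResult = isProbablyANewPasswordField({}, -1, fakeModel);
const enabledResult = isProbablyANewPasswordField({}, 0.75, fakeModel);
console.log(disabledResult, enabledResult, modelCalls); // false true 1
```

Counting the calls matters because a check that only looks at the return value cannot tell "disabled" from "the model scored the field below the threshold".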

(In reply to Erik Rose [:erik][:erikrose] from bug 1595244 comment #19)

https://github.com/mozilla-services/fathom-login-forms/blob/4a2349963df5d3c428847565f44d3d4ec65ae38b/new-password/rulesets.js#L6-L271

Precision is up 4% and recall 10%:

   Testing accuracy per tag:  0.98137    95% CI: (0.96048, 1.00000)
                        FPR:  0.06818    95% CI: (0.00000, 0.14266)
                        FNR:  0.00000    95% CI: (0.00000, 0.00000)
                  Precision:  0.97500    Recall: 1.00000
                   F1 Score:  0.98734

Hi Erik, thanks for the updated model! We were aiming for a 2-3% FPR to ship (vs. the 6.8% above @ a 0.5 threshold), so could you provide a confidence threshold that hits that target at the expense of a higher FNR? (FWIW, we did manual testing on Chrome and found a 1% FPR for the page set we used, which mostly included popular login and registration forms.)

Thanks

Flags: needinfo?(erik)

Sure, a threshold of .75 will do what you want:

Testing accuracy per tag:  0.93168    95% CI: (0.89270, 0.97065)
                     FPR:  0.02273    95% CI: (0.00000, 0.06676)
                     FNR:  0.08547    95% CI: (0.03481, 0.13613)
               Precision:  0.99074    Recall: 0.91453
                F1 Score:  0.95111

Given that the middle of the confidence histogram is sparse, this shouldn't change your real-world performance much and almost certainly not for the worse†. Just beware that we're now fitting to the testing set, so we have no blind evaluation of how we're doing. Perhaps telemetry can fill part of that hole.

†There are only 3 FPs in the testing set, so we're effectively hand-picking a confidence threshold that's high enough to exclude 2 of them. As you might imagine, there's a lot of chance involved in what confidences are needed to succeed on that tiny set of 3.
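
For readers who want to check the trade-off between the two thresholds, the sketch below recomputes the reported figures. The confusion counts are not stated anywhere in this bug. They are inferred from the reported rates (161 tagged samples, 44 of them negatives) and should be treated as an assumption. The reported intervals also happen to match a normal-approximation (Wald) interval clipped to [0, 1], which is likewise an inference rather than something the thread confirms.

```javascript
// Inferred, not reported: counts chosen so that the derived rates match the
// numbers quoted in the comments above.
const COUNTS = {
  "0.5": { tp: 117, fp: 3, tn: 41, fn: 0 },
  "0.75": { tp: 107, fp: 1, tn: 43, fn: 10 },
};

// Normal-approximation (Wald) 95% interval for a proportion, clipped to [0, 1].
function waldInterval(p, n, z = 1.96) {
  const half = z * Math.sqrt((p * (1 - p)) / n);
  return [Math.max(0, p - half), Math.min(1, p + half)];
}

// Standard binary-classification metrics from a confusion matrix.
function metrics({ tp, fp, tn, fn }) {
  const accuracy = (tp + tn) / (tp + fp + tn + fn);
  const fpr = fp / (fp + tn);
  const fnr = fn / (fn + tp);
  const precision = tp / (tp + fp);
  const recall = tp / (tp + fn);
  const f1 = (2 * precision * recall) / (precision + recall);
  return {
    accuracy,
    fpr,
    fnr,
    precision,
    recall,
    f1,
    fprInterval: waldInterval(fpr, fp + tn),
  };
}

const results = {};
for (const [threshold, counts] of Object.entries(COUNTS)) {
  const m = metrics(counts);
  results[threshold] = m;
  console.log(
    `threshold ${threshold}: FPR ${m.fpr.toFixed(5)}, FNR ${m.fnr.toFixed(5)}, ` +
      `FPR 95% CI (${m.fprInterval.map(x => x.toFixed(5)).join(", ")})`
  );
}
```

With only 44 negatives, moving from 3 false positives to 1 changes the FPR by more than four percentage points, and the interval at 0.75 still reaches above 6%, which is the caveat the footnote raises.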

Flags: needinfo?(erik)
Pushed by mozilla@noorenberghe.ca: https://hg.mozilla.org/integration/autoland/rev/fdf88d6d3985 Make improvements to new-password heuristics for accuracy and performance r=MattN
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76
Blocks: 1629132
Blocks: 1638187
Blocks: 1595244
No longer blocks: 1638187
No longer depends on: 1595244
Regressions: 1683077