Closed Bug 1685798 Opened 3 years ago Closed 2 years ago

Focusing the summary text field on Bugzilla janks for one second

Categories

(Toolkit :: Form Autofill, defect, P3)

defect

Tracking

()

VERIFIED FIXED
96 Branch
Performance Impact high
Fission Milestone Future
Tracking Status
firefox-esr78 --- wontfix
firefox-esr91 --- wontfix
firefox84 --- wontfix
firefox85 --- wontfix
firefox86 --- wontfix
firefox87 --- wontfix
firefox92 --- wontfix
firefox93 --- wontfix
firefox94 --- wontfix
firefox95 --- wontfix
firefox96 --- verified

People

(Reporter: mstange, Assigned: dlee)

References

(Regression)

Details

(Keywords: perf, perf:responsiveness, regression, Whiteboard: fission-soft-blocker)

Attachments

(2 files)

Profile: https://share.firefox.dev/2JYg2vt

Steps to reproduce:

  1. Make sure your Firefox profile has Form Autofill enabled, by setting extensions.formautofill.available to on and restarting.
  2. Open the attached testcase. Alternatively, on this very bugzilla bug, click the Edit Bug button in the top right corner. (This only works if you're logged in and have editbugs rights.)
  3. Focus the "Summary" text field at the top and start typing immediately.

Expected results:
No jank, the form field should be responsive immediately.

Actual results:
The page hangs for a full second before input is accepted.

Here's a patch to add profiler markers to the relevant code:

diff --git a/browser/extensions/formautofill/FormAutofillHeuristics.jsm b/browser/extensions/formautofill/FormAutofillHeuristics.jsm
--- a/browser/extensions/formautofill/FormAutofillHeuristics.jsm
+++ b/browser/extensions/formautofill/FormAutofillHeuristics.jsm
@@ -1101,17 +1101,20 @@ this.FormAutofillHeuristics = {
    * @param {Array<string>} regexps
    *        The regex key names that correspond to pattern in the rule list.
    * @returns {?string} The first matched field name
    */
   _findMatchedFieldName(element, regexps) {
     const getElementStrings = this._getElementStrings(element);
     for (let regexp of regexps) {
       for (let string of getElementStrings) {
-        if (this.RULES[regexp].test(string)) {
+        let startTime = Cu.now();
+        let matched = this.RULES[regexp].test(string);
+        ChromeUtils.addProfilerMarker(`FormAutofill ${regexp} regexp`, startTime);
+        if (matched) {
           return regexp;
         }
       }
     }

     return null;
   },

In the call tree, it looks like some regular expressions get compiled repeatedly.

QA Whiteboard: [qa-regression-triage]

I can't reproduce your issue using Firefox Release v84.0.1 or v84.0.2 or Nightly v86.0a1 (2021-01-09) or Nightly v86.0a1 (2021-01-10) on Windows 10 or Mac OS 11.1 (both ARM and non-ARM processor devices) or Mac OS 10.15. I tested it by editing a test bug and saved values (form history, not autofill) on some of the form fields, then edited the bug and clicked on the fields with saved form history values.

We don't have Mac 10.16 (as the performance profile suggests), but I assume that it is actually Mac OS 11:
"As of macOS Big Sur Beta 3, the system version is reported as "11.0" on all Macs running the beta.[citation needed] However, to maintain backwards compatibility, macOS Big Sur also identifies itself as 10.16 to software and in the browser user agent."

This being said, it appears we need more information about your system.

  1. Is it an ARM architecture device? What exact Apple device reproduces this issue?
  2. I would like you to test your issue in safe mode and tell us the result.
    (https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode)
  3. A screen recording might be of some use if you think I've misunderstood your steps.

I don't know if the provided perf profile confirms the issue or exposes the cause of your hang, but I certainly cannot reproduce it.

Flags: needinfo?(mstange.moz)

It's an Intel Macbook Pro. I'd be very surprised if this problem was OS-specific, though.
Maybe it requires some saved form history, or saved credit card information that it considers for autofill.

I also cannot reproduce this in a fresh profile.

Flags: needinfo?(mstange.moz)

I can reproduce in a fresh profile if I set the pref extensions.formautofill.available to on, for example like this: mozregression --launch 2021-01-11 --pref extensions.formautofill.available:on.
Without that pref, Form Autofill only gets enabled if browser.search.region is "US" or "CA", and also it seems like it's not getting enabled on the first start of a fresh profile.

Attached file testcase
Summary: Focusing the "depends on" field on Bugzilla janks for one second because of form autofill regexp matching → Focusing the summary text field on Bugzilla janks for one second because of form autofill regexp matching

I'm looking for a regression window now.

8:31.36 INFO: Last good revision: 256602905ce82b1866a1d2f37b409fa768fb7ae6
8:31.36 INFO: First bad revision: d1114574b777bb00b85f73733a8bd6a78d2ad87e
8:31.36 INFO: Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=256602905ce82b1866a1d2f37b409fa768fb7ae6&tochange=d1114574b777bb00b85f73733a8bd6a78d2ad87e

So probably bug 1634135.

Iain, can you take a look?

Component: Form Autofill → JavaScript Engine
Flags: needinfo?(iireland)
Keywords: regression
Product: Toolkit → Core
Regressed by: 1634135
Version: unspecified → Trunk

Ugh. Looking at the profile, we're spending 80%+ of our time in UnicodeSet::closeOver, which is used by AddUnicodeCaseEquivalents to do case-insensitive matching. It's an open performance issue in V8, and also just generally kind of buggy.

The new Intl team Zibi is starting up will be working on replacing ICU throughout Gecko with a new Rust implementation. Part of that is going to have to include replacing ICU in irregexp (which will be exciting because we'll have to coordinate with V8 upstream, although my understanding is that the V8 team will also be migrating to the same Rust implementation). I don't know what the timeframe on the ICU replacement will be, though.

Flags: needinfo?(iireland)

Does that mean Chrome would be just as slow on this code?

Is there some way we can change the regular expressions to not run into this slowness? We control them, they're here: https://searchfox.org/mozilla-central/rev/07342ce09126c513540c1c343476e026cfa907bf/browser/extensions/formautofill/content/heuristicsRegexp.js#559,562,566

What are your thoughts on the fact that we seem to be compiling the regular expressions repeatedly? (At least that's what the profile from comment 0 looks like to me.)

Flags: needinfo?(mstange.moz)
Flags: needinfo?(mstange.moz) → needinfo?(iireland)

I would expect irregexp to have similar problems on this code.

Some thoughts, which might be helpful:

AddUnicodeCaseEquivalents is only called for case-insensitive (/i) regexps. If we can rewrite our regexps to not need the i flag, I think that would help a lot.

Every RegExpObject has a reference to a RegExpShared, which does the heavy lifting. RegExpObjects will have the same RegExpShared if they have the same pattern and flags. There is a cache of RegExpShared in the Zone. Unused entries are freed after every GC.

The first time we execute a RegExpShared, we generate bytecode for an interpreter and execute that. After ten executions, we recompile and generate native code. Orthogonally, we handle latin1 and two-byte inputs separately. This means each RegExpShared can be compiled up to 4 times: latin1-interpreter, latin1-compiler, two-byte-interpreter, and two-byte-compiler. (Note that latin1 vs two-byte is based on the string being matched, not the regexp pattern itself.)

Aside: V8 tiers up after a single execution. I tested a local patch to go straight to the compiled version (by setting this value to 0), but it didn't make a noticeable difference (although I was just eyeballing it).

Given that we don't GC during your profile, if the same regexp is compiled more than twice, I would assume it's being called with both latin1 and two-byte inputs. If it is compiled more than four times, my first assumption would be that we've changed the pattern or the flags. Otherwise, it should reuse the existing RegExpShared.

Oh, wait. Tracking down the equivalent Chrome code, it looks like they're using some sort of regex matcher built into ICU for form-fills, instead of irregexp? The ICU documentation is here.

Flags: needinfo?(iireland)

Iain, is this right?

  • Changing the regexps to avoid the /i flag might fix it.
  • Alternatively, reimplementing this in C++, using icu::RegexMatcher like Chrome, should fix it.
  • It's hard to fix in the JS engine proper, as the performance issue is in V8's regexp engine.

If so, please redirect this bug to the Form Autofill component.

(It'd be nice to fix upstream in irregexp, if possible. But I know that's hard.)

Flags: needinfo?(iireland)

That is correct.

We will have to coordinate with V8 to fix this in irregexp as part of the Intl team's work to reimplement ICU in Rust, but I can't guarantee a timeframe on that.

Component: JavaScript Engine → Form Autofill
Flags: needinfo?(iireland)
Product: Core → Toolkit

Could you perhaps ensure toolkit folks notice this bug. This is a rather bad regression.

Flags: needinfo?(iireland)
Whiteboard: [qf] → [qf:p1:responsiveness]

I talked to Zibi, who says that the current plan is to replace this code with Fathom.

Erik: If your Fathom implementation is also using /i JS regexps, you may encounter a similar problem. If you do, I believe you can work around it by converting /abc/iu to /[aA][bB][cC]/u, or something along those lines.

Flags: needinfo?(iireland)

Fathom will be using a superset of the previous autofill heuristics, so yes, it applies. Thanks! I'll have a look after the use-Fathom patch (https://phabricator.services.mozilla.com/D100141) is landed.

Hi, I will update "has regression range" field accordingly, as per comment 7.
Best,
Clara

Has Regression Range: --- → yes

I re-tested this, now that Fathom has landed. The 1-second jank is still there: https://share.firefox.dev/3cj5uCm

However, the problem is no longer slow regexes; now the profile seems to be dominated by XrayWrappers and Proxy stuff.

Summary: Focusing the summary text field on Bugzilla janks for one second because of form autofill regexp matching → Focusing the summary text field on Bugzilla janks for one second

Bug 1700838 has an extreme version of this where there jank is >5 minutes. I added a minimal test case there that also gets bogged down on phone-number-field matching when the form has a lot of inputs.

It looks like the algorithm needs to be refactored a bit. For example, the isFieldEligibleForAutofill takes 30% of runtime and is called from the inside of loop. It is called for the same element many times and uses DOM accessors plus the linear-search Array.includes method.

See Also: → 1700838
Flags: needinfo?(erik)

Hi, Kim. I'm chasing this over on bug 1701806. I'm either going to fix it in the next day or 2 or back it out.

Flags: needinfo?(erik)
See Also: → 1701806

Hi Erik, can this bug be closed now?

Flags: needinfo?(erik)

The 1 second jank is still there for me.

I can tell you the Fathom CC stuff is no longer causing the jank, because I backed it out for now. If there's jank, it's coming from elsewhere.

Flags: needinfo?(erik)

Iain, we are back to regexp version of the code here for the foreseeable future. Does your proposal of tweaking the regexps still makes sense?

Flags: needinfo?(iireland)

Yes, converting /abc/iu to /[aA][bB][cC]/u should still help.

I'm also hopeful that either the Outreachy project will help here, or I can make some changes nearby while the Outreachy project is happening, but if that happens it will be closer to the end of the project.

Flags: needinfo?(iireland)
Blocks: 1657998

I've attached a patch which reduces the jank from 1.5 seconds to 90ms on my machine. It's very rough and might not work correctly, but it proves that this bug should be fixable without sacrificing a limb.

So, some of these regexes were already manually dealing with case-insensitivity. For example, this one recognizes both "email" and "Email":

"|(?:이메일|전자.?우편|[Ee]-?mail)(.?주소)?", // ko-KR

It would be great to apply this principle to all of them: Go back to the original corpus of field names, check which letters are actually used in both uppercase and lower case form, and only make those specific letters case-insensitive.

One more thing: This bug is much more severe with Fission, as shown in bug 1657998 comment 15: Every new content process encounters the jank.

(In reply to Markus Stange [:mstange] from comment #29)

One more thing: This bug is much more severe with Fission, as shown in bug 1657998 comment 15: Every new content process encounters the jank.

Tentatively tracking for Fission MVP since Fission this bug is "much more severe with Fission" and causes an intermittent test failure.

Fission Milestone: --- → MVP
Whiteboard: [qf:p1:responsiveness] → [qf:p1:responsiveness] fission-soft-blocker

Dimi, do you know where these regular expressions came from, originally? Or if there's a data set that I can test the tweaked regular expressions on? Ideally I think we should do what I suggested in comment 28: Make all regexps case-sensitive, see what breaks, and add case-insensitive matches for only the characters that need it.

Flags: needinfo?(dlee)

Make all regexps case-sensitive, see what breaks, and add case-insensitive matches for only the characters that need it.

Instead of making the regexps case-insensitive, can you lowercase the input text that you're comparing to the regexp rules? Then the lowercase regexps would match any input case even in the faster case-sensitive mode.

... Oh. That's a great idea and I'm a bit embarrassed that I didn't think of it.

I'll try to give that a spin.

Assignee: nobody → mstange.moz
Status: NEW → ASSIGNED

Many of the form autofill rules have non-English characters. Do we have form autofill tests for these non-English attribute values? We should verify that lowercasing non-English input text doesn't produce unexpected results that no longer match the non-English rules.

(In reply to Markus Stange [:mstange] from comment #31)

Dimi, do you know where these regular expressions came from, originally? Or if there's a data set that I can test the tweaked regular expressions on? Ideally I think we should do what I suggested in comment 28: Make all regexps case-sensitive, see what breaks, and add case-insensitive matches for only the characters that need it.

I think these regular expressions came from chromium src. It looks like we do have tests cover this part, but i'm not sure about the coverage.

Flags: needinfo?(dlee)

(In reply to Dimi Lee [:dimi] from comment #36)

I think these regular expressions came from chromium src.

I see. And the Chromium code indeed uses case-insensitive matching for all of these regexes.

It looks like we do have tests cover this part, but i'm not sure about the coverage.

Thank you, I'll take a look at those.

Markus, do you have any updates on this regex jank bug? Do you think you will be able to land a fix before the end of the Nightly 94 cycle (September 30)? That's the code freeze date for Firefox 94, our Fission release.

Flags: needinfo?(mstange.moz)

Changing severity to S4 because it fits S4's definition.

Severity: -- → S4
Priority: -- → P2

I tried Chris's suggestion and it passes the tests, and fixes the performance bug. I don't know to what degree it is equivalent to the current implementation.

(In reply to Chris Peterson [:cpeterson] from comment #35)

Many of the form autofill rules have non-English characters. Do we have form autofill tests for these non-English attribute values?

I couldn't find any on a quick scan. The fixtures under https://searchfox.org/mozilla-central/source/browser/extensions/formautofill/test/fixtures/third_party are all from English language websites.

We should verify that lowercasing non-English input text doesn't produce unexpected results that no longer match the non-English rules.

Yes... I'm not an expert on this. But my hope is that, if we call toLowerCase() on both the regex string and on the matched string, we should always match at least as many cases as if we had done a case-sensitive match. But we might match fewer cases than we did in the past with the "iu" flag.

Flags: needinfo?(mstange.moz)
Attachment #9227783 - Attachment description: WIP: Bug 1685798 - WIP: Manually build a case-insensitive regex so that we can use //u instead of //iu, and avoid a performance cliff. → Bug 1685798 - Don't use //iu regular expressions in form autofill regex matching, in order to avoid a performance cliff. r=dimi
Pushed by mstange@themasta.com:
https://hg.mozilla.org/integration/autoland/rev/1fc975f87767
Don't use //iu regular expressions in form autofill regex matching, in order to avoid a performance cliff. r=dimi

Backed out for causing high frequency failures on test_autocomplete_basic_form_insecure.html

[task 2021-09-21T04:03:29.344Z] 04:03:29     INFO - TEST-PASS | toolkit/components/passwordmgr/test/mochitest/test_autocomplete_basic_form_insecure.html | Check popup is initially closed 
[task 2021-09-21T04:03:29.344Z] 04:03:29     INFO - add_task | Leaving test test_form1_initial_empty
[task 2021-09-21T04:03:29.345Z] 04:03:29     INFO - add_task | Entering test test_form1_warning_entry
[task 2021-09-21T04:03:29.345Z] 04:03:29     INFO - Buffered messages finished
[task 2021-09-21T04:03:29.346Z] 04:03:29     INFO - TEST-UNEXPECTED-FAIL | toolkit/components/passwordmgr/test/mochitest/test_autocomplete_basic_form_insecure.html | Test timed out. - 
[task 2021-09-21T04:03:30.285Z] 04:03:30     INFO - GECKO(6436) | MEMORY STAT | vsize 2103708MB | vsizeMaxContiguous 71932684MB | residentFast 60MB | heapAllocated 7MB
[task 2021-09-21T04:03:30.303Z] 04:03:30     INFO - TEST-OK | toolkit/components/passwordmgr/test/mochitest/test_autocomplete_basic_form_insecure.html | took 301118ms
Flags: needinfo?(mstange.moz)

Hmm, not sure why that test would fail. Maybe there is an existing dependency on timing, and the one-second freeze was covering it up. I'm also not sure which one of the promises in that test task time out. I've triggered a try build with more logging:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=65e09c11a4c7f0da83cfb92c210ecad48f0c8bad

Flags: needinfo?(mstange.moz)

The timeout occurs when waiting for the autocomplete to be shown after pressing the down arrow key. Lots of tests use this same pattern, so I'm not sure why it wouldn't work in this test.
Tim, do you have any ideas for what could be happening here?

Flags: needinfo?(tgiles)

I may have run into this issue before, if I'm understanding correctly. What is probably happening is, we set the listener for the autocomplete menu to be shown but that function doesn't run quick enough before we end up showing the autocomplete menu. Then, since the autocomplete menu was shown before the listener was ready, we do nothing and timeout. I'm not sure why this happens sometimes and not other times, must be some timing issue somewhere.

IIRC, moving line 148 up one line to before "restoreForm" should resolve this...if it's the same issue I've experienced. For example, here's another test that sets up the listener before we do any kind of focusing on fields that could trigger the autocomplete popup

Seems like we (password manager team) may need to revisit these tests and move some code around to avoid these timing issues.

Markus, feel free to keep me in the loop, happy to help out where I can.

Flags: needinfo?(tgiles)

Thanks!

Try push with promiseACShown() moved before restoreFrom(): https://treeherder.mozilla.org/jobs?repo=try&revision=9979ecac4cca10f112f8df46f98aa7a9bb05d8e0

Unfortunately that didn't work.

Hmm, I'll keep looking into this then and try to find some alternative way to get that test to pass. Thanks for sending off that previous Try run, I'll report back when I find out more.

I've looked at this for a while now and unfortunately, I'm not finding any good reason why the test should be failing. I've tried to reproduce on my win10 machine (with the same configuration that the Try run has for mozinfo, win10_2004 = true and webrender = true among other parts in that list) but no success. I can't think of other steps to try and debug. It would be nice if we could take a screenshot right before we get stuck waiting as the UI might give some hint as to what is going on...but I don't know if Try runs in headless mode, so might be moot.

We'll see if moving the shownPromise code as the first thing that happens in the test makes any difference, just don't know because of timing. Try run

Went ahead and added some debug logging in satchel in the promiseACShown function to try and see what is happening with that function. Try run

This is a tricky one to resolve.

Thanks a ton for your help. We do in fact take screenshots when tests time out; you can find the link in the Artifacts panel of the selected try run. And the screenshot shows that no autocomplete popup is open - just a focused form field.

Unassigning myself; the remaining work in this bug is to figure out the test failure. This is probably going to involve a lot of trial-and-error with repeated try pushes and logging.

Tim, do you want to take the bug from here?

Assignee: mstange.moz → nobody
Status: ASSIGNED → NEW

(In reply to Markus Stange [:mstange] from comment #53)

Unassigning myself; the remaining work in this bug is to figure out the test failure. This is probably going to involve a lot of trial-and-error with repeated try pushes and logging.

Tim, do you want to take the bug from here?

Dimi, is there someone who works on Form Autofill that can help debug this test failure?

mstange's heuristicsRegexp.js regex optimization got backed out because test_autocomplete_basic_form_insecure.html started failing intermittently. A timeout occurs when waiting for the autocomplete menu to be shown after the test presses the down arrow key.

Fission Milestone: MVP → Future
Flags: needinfo?(dlee)
Priority: P2 → P3

(In reply to Chris Peterson [:cpeterson] from comment #54)

Dimi, is there someone who works on Form Autofill that can help debug this test failure?

I'll look into this issue.

Assignee: nobody → dlee
Flags: needinfo?(dlee)
See Also: → 1739170

okay, I can confirm that :Markus patch is not the reason why we have the intermittent errors in Comment 44, but it does change the autofill/autocomplete timing and causes a higher frequency failures.
I'll fix Bug 1554453 and land this patch afterwards.

Pushed by dlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ca6647e60fcc
Don't use //iu regular expressions in form autofill regex matching, in order to avoid a performance cliff. r=dimi
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 96 Branch

Setting status-firefox95=wontfix because we don't need to uplift this autofill optimization to Beta 95.

Flags: qe-verify+

I am not able to reproduce the issue using settings from description and build 86.0a1(20210108215736) on Win10x64, same as mentioned on comment #1.
Markus, is the issue still reproducing on your side on latest Beta? Thank you.

Flags: needinfo?(mstange.moz)

See comment 4 - this bug affects functionality that is only enabled by default in the US and Canada. You can override it with the extensions.formautofill.available pref, by setting it to on.

Flags: needinfo?(mstange.moz)

(In reply to Markus Stange [:mstange] from comment #61)

See comment 4 - this bug affects functionality that is only enabled by default in the US and Canada. You can override it with the extensions.formautofill.available pref, by setting it to on.

Hello Markus,
I used the settings mentioned by you on Mac 10.13, using build 86.0(20210215141125) and I was still not able to reproduce the issue. Can you please confirm issue is fixed on latest Beta?
Thank you very much.

Flags: needinfo?(mstange.moz)

Sure!

Reproduced in 95.0b13.
Could not reproduce in 96.0b11.

mozregression --pref extensions.formautofill.available:on --repo beta --launch 95 -a "https://bug1685798.bmoattachments.org/attachment.cgi?id=9196477"

Flags: needinfo?(mstange.moz)
No longer blocks: 1657998
Depends on: 1657998
No longer depends on: 1681985
Performance Impact: --- → P1
Whiteboard: [qf:p1:responsiveness] fission-soft-blocker → fission-soft-blocker

Mark issue as verified based on comment#63.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: