Closed Bug 1921771 Opened 1 month ago Closed 26 days ago

30.1 - 2.06% speedometer3 TodoMVC-JavaScript-ES6-Webpack-Complex-DOM/Adding100Items/Async / speedometer Angular2-TypeScript-TodoMVC/Adding100Items + 129 more (Android, Linux, OSX, Windows) regression on Wed September 25 2024

Categories

(Toolkit :: Form Autofill, defect)

defect

Tracking

()

RESOLVED FIXED
133 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox131 --- unaffected
firefox132 --- disabled
firefox133 + fixed

People

(Reporter: intermittent-bug-filer, Assigned: dimi)

References

(Blocks 1 open bug, Regressed 1 open bug, Regression)

Details

(Keywords: perf, perf-alert, regression, Whiteboard: [fxcm-iframe][sp3])

Attachments

(2 files)

Perfherder has detected a browsertime performance regression from push c1610a2d9ce55366697288544e277b4865b552c7. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
30% speedometer3 TodoMVC-JavaScript-ES6-Webpack-Complex-DOM/Adding100Items/Async linux1804-64-shippable-qr fission webrender 6.27 -> 8.16 Before/After
25% speedometer3 TodoMVC-JavaScript-ES6-Webpack-Complex-DOM/Adding100Items/Async linux1804-64-nightlyasrelease-qr fission webrender 6.51 -> 8.14 Before/After
22% wikipedia largestContentfulPaint linux1804-64-shippable-qr bytecode-cached cold fission webrender 1,242.73 -> 1,512.79
21% speedometer Angular2-TypeScript-TodoMVC/Adding100Items/Async windows11-64-nightlyasrelease-qr fission webrender 1.25 -> 1.52
21% speedometer Inferno-TodoMVC/Adding100Items/Async windows11-64-nightlyasrelease-qr fission webrender 1.35 -> 1.64
21% speedometer Angular2-TypeScript-TodoMVC/Adding100Items/Async windows11-64-shippable-qr fission webrender 1.29 -> 1.55
19% speedometer3 TodoMVC-JavaScript-ES6-Webpack-Complex-DOM/Adding100Items/Async windows11-64-shippable-qr fission webrender 3.95 -> 4.72 Before/After
19% speedometer Inferno-TodoMVC/Adding100Items/Async windows11-64-shippable-qr fission webrender 1.39 -> 1.65
19% speedometer Angular2-TypeScript-TodoMVC/Adding100Items/Async macosx1400-64-shippable-qr fission webrender 0.80 -> 0.95 Before/After
17% speedometer EmberJS-TodoMVC/Adding100Items/Async windows11-64-nightlyasrelease-qr fission webrender 2.19 -> 2.56
... ... ... ... ... ...
2% speedometer Angular2-TypeScript-TodoMVC windows11-64-shippable-qr fission webrender 21.05 -> 21.54
2% speedometer Vanilla-ES2015-TodoMVC/Adding100Items windows11-64-shippable-qr fission webrender 22.59 -> 23.10
2% speedometer VueJS-TodoMVC windows11-64-shippable-qr fission webrender 13.91 -> 14.22
2% speedometer VanillaJS-TodoMVC windows11-64-shippable-qr fission webrender 27.97 -> 28.56
2% speedometer Angular2-TypeScript-TodoMVC/Adding100Items macosx1400-64-shippable-qr fission webrender 8.30 -> 8.47 Before/After

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
5% twitter loadtime macosx1015-64-shippable-qr cold fission webrender 610.38 -> 579.74

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run all of these tests on try with ./mach try perf --alert 2243

The following documentation link provides more information about this command.

For more information on performance sheriffing please see our FAQ.

If you have any questions, please do not hesitate to reach out to fbilt@mozilla.com.

Flags: needinfo?(dlee)

Set release status flags based on info from the regressing bug 1916927

Flags: needinfo?(dlee)
Whiteboard: [fxcm-iframe]
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Pushed by dlee@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3fd1200f4db2 Do not notify the parent when none of the detected fields are address or credit card fields r=credential-management-reviewers,issammani
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED
Target Milestone: --- → 133 Branch

Here's a profile showing the form autofill work that's done during the Adding100Items-Async step of the TodoMVC-JavaScript-ES6-Webpack-Complex-DOM subtest: https://share.firefox.dev/4gS2qMK (warning: this profile is big and slow to load!)

I think we're just doing the content process work at an unfortunate time. You can ask Olli for help determining when to run the work (and how to schedule it).

I think we're just doing the content process work at an unfortunate time. You can ask Olli for help determining when to run the work (and how to schedule it).

Hi Marks, thank you for sharing the profile; it's really helpful! And yes, you're correct about "doing the content process work at an unfortunate time".

Taking TodoMVC-JavaScript-ES6-Webpack-Complex-DOM as an example: Before the patch in Bug 1916927, when users focused on an input field, the FormAutofill code ran entirely in the content process during suite-TodoMVC-JavaScript-ES6-Webpack-Complex-DOM-prepare-start. However, since the patch, some logic has moved to the parent process, altering the workflow as follows:

  1. The child process does some work upon field focus and notifies the parent process if necessary.
  2. The parent process receives the message, processes it, and then sends the updated result back to the child.
  3. The child process receives this update and applies the result.

Because this operation has become asynchronous, step 3 now runs during TodoMVC-JavaScript-ES6-Webpack-Complex-DOM.Adding100Items instead of suite-TodoMVC-JavaScript-ES6-Webpack-Complex-DOM-prepare-start. If my understanding is correct, things done in the preparation step doesn't count when we measure the performance, which leads to a performance drop if we only look at the Adding100Items phase.

Overall, the patch doesn't introduce additional overhead; in fact, it reduces the workload on the child process. From the result in the description, we even see a slight performance improvement in the Twitter load time.

Markus, does above make sense to you? if yes, do you think this is an issue that needs to be addressed? If we were to find a solution, it would likely be a workaround to make the performance looks nicer for the sp3 benchmark. Please let me know your thoughts, thanks!

Flags: needinfo?(dlee) → needinfo?(mstange.moz)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 133 Branch → ---

Olli, do you have opinions on what to do here?

Flags: needinfo?(mstange.moz) → needinfo?(smaug)

"the patch doesn't introduce additional overhead"? If there are more steps and more IPC, doesn't it mean there is extra overhead?
Looking at the profile looks like lots of time is spent loading some module for the JS actor.
(1) why are we still getting the message from the parent process? (maybe sp3 has creditcard or address types of fields?)
(2) could we load the relevant module sooner? Looks like FormAutofillChild.sys.mjs is using some lazy loading which pushes module loading to happen at bad time.

I don't quite understand the reasoning for bug 1916927. Do we really care about cross-origin frames case too in heuristics? Or could the heuristics stay within a content process and only check same origin iframes?
(I could very well be missing some valid cases)

Flags: needinfo?(smaug)

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #9)

"the patch doesn't introduce additional overhead"? If there are more steps and more IPC, doesn't it mean there is extra overhead?

True, the patch does add more IPC. I should be more precise what i meant when i said the patch doesn't introduce additional overhead, sorry!
When i said that, i meant the patch itself doesn't add additional steps or change anything from "heuristics" perspective. The patch only moves part of the steps from the content process to the parent process.

Looking at the profile looks like lots of time is spent loading some module for the JS actor.
(1) why are we still getting the message from the parent process? (maybe sp3 has creditcard or address types of fields?)

The patch landed earlier in this bug fixes the issues that the content process always sends a messageto the parent process even when there is no credit card or address fields. However, for some sp3 test pages, they do contain address fields, that's why the issue remains.

(2) could we load the relevant module sooner? Looks like FormAutofillChild.sys.mjs is using some lazy loading which pushes module loading to happen at bad time.

I can take a look what can be moved so we do those steps sooner so they don't happen at a bad time. But just to confirm, when we say "bad time", does this also apply for everyday use scenario? or does "bad time" only apply to sp3 benchmark?

I don't quite understand the reasoning for bug 1916927. Do we really care about cross-origin frames case too in heuristics? Or could the heuristics stay within a content process and only check same origin iframes?
(I could very well be missing some valid cases)

Yes, we do consider cross-origin frames. The main reason is for credit card forms where very often not all cc fields are located within the sameframe. Regardless of whether we decide to autofill across origins, referencing cross-origin frames help improve the accuracy of our field detection heuristic. For example if the main-Frame contains name field and there is a cross-Origin iframe containing cc-number

In the parent process, knowing that "name" is followed by "cc-number" allows us to correctly adjust "name" to "cc-name".

Whiteboard: [fxcm-iframe] → [fxcm-iframe][sp3]

Bug 1916927 backs out cleanly from Beta. Should we do that for now while this investigation continues?

Flags: needinfo?(dlee)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #11)

Bug 1916927 backs out cleanly from Beta. Should we do that for now while this investigation continues?

In my opinion, I don’t think we need to back out the patch because, based on my investigation (Comment 7), the performance decrease seems more related to how SP3 measures performance.

Olli, what do you think? Also, sorry I forgot to needinfo you after replying to your questions.

Flags: needinfo?(dlee) → needinfo?(smaug)

sp3 is an important benchmark, to most important we have atm. We shouldn't regress it lightly. Do we need Bug 1916927 in beta? (Do we utilize the cross-origin data already? Hmm, is the use of cross-origin data safe from privacy point of view?)

Flags: needinfo?(smaug)

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #13)

sp3 is an important benchmark, to most important we have atm. We shouldn't regress it lightly. Do we need Bug 1916927 in beta?

okay, then let's back it out from beta. Ryan, could you help on that? thanks!

(Do we utilize the cross-origin data already? Hmm, is the use of cross-origin data safe from privacy point of view?)

Yes, but just to be clear, the data we’re discussing here is not the user’s data; it is the internally defined data structure used to represent the type of field, such as a credit card number field. This data will not be exposed cross-origin; we only use it to correct the prediction results for other frames if necessary.

Flags: needinfo?(ryanvm)

(In reply to Dimi Lee [:dimi] from comment #14)

okay, then let's back it out from beta. Ryan, could you help on that? thanks!

Backed out for 132.0b9.
https://hg.mozilla.org/releases/mozilla-beta/rev/38da3560f8539e8ec2fbff90d0ae5620bc39201a

Flags: needinfo?(ryanvm)

(In reply to Pulsebot from comment #3)

Pushed by dlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3fd1200f4db2
Do not notify the parent when none of the detected fields are address or
credit card fields r=credential-management-reviewers,issammani

Perfherder has detected a browsertime performance change from push 3fd1200f4db2e2bdce0383e795c2874eef90ee67.

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
17% speedometer Angular2-TypeScript-TodoMVC/Adding100Items/Async windows11-64-shippable-qr fission webrender 1.56 -> 1.29 Before/After
17% speedometer Inferno-TodoMVC/Adding100Items/Async windows11-64-shippable-qr fission webrender 1.66 -> 1.38 Before/After
17% speedometer Angular2-TypeScript-TodoMVC/Adding100Items/Async windows11-64-nightlyasrelease-qr fission webrender 1.52 -> 1.26 Before/After
16% speedometer Inferno-TodoMVC/Adding100Items/Async windows11-64-nightlyasrelease-qr fission webrender 1.62 -> 1.36 Before/After
15% speedometer Angular2-TypeScript-TodoMVC/Adding100Items/Async macosx1400-64-shippable-qr fission webrender 0.95 -> 0.80 Before/After
... ... ... ... ... ...
2% speedometer VueJS-TodoMVC windows11-64-shippable-qr fission webrender 14.34 -> 14.03 Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 2430

For more information on performance sheriffing please see our FAQ.

This patch implements the following optimizations:

  1. Adds element id to the 'FormHandler'' cache. This avoids the need to call 'findRootForField'.
  2. Removes the 'isAutofillEnabled' check in 'receiveMessage'. This check is unnecessary because
    those messages are only triggered if autofill is enabled.
  3. Skips registering the form submission handler if the address fields do not contain 'street-address'.
    This is because, for every country, we don’t save the address if 'street-address' is not present.
  4. Passes 'fieldName' to 'showPopupIfEmpty', so there is no need to call 'getFormHandler' in the function.
Pushed by dlee@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b89a2978b2ab Optimization for field detection flow in autofill r=NeilDeakin
Regressions: 1925641
Status: REOPENED → RESOLVED
Closed: 1 month ago26 days ago
Resolution: --- → FIXED
Target Milestone: --- → 133 Branch

It looks like this is indeed fixed now, thanks!

But it looks like it was actually fixed by bug 1923615, not by this patch. See the graph.

Depends on: 1923615
Regressions: 1927440
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: