Evaluate the performance of Firefox's menubar in L20n

RESOLVED FIXED

Status

RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: stas, Assigned: gandalf)

Tracking

(Blocks: 1 bug)

Details

(Reporter)

Description

2 years ago
We need perf numbers to test l20n-chrome-xul.
(Reporter)

Updated

2 years ago
Blocks: 1289535
(Reporter)

Comment 1

2 years ago
Zibi, when bug 128840 lands what should be our next step to start getting perf numbers?
Flags: needinfo?(gandalf)
(Assignee)

Comment 2

2 years ago
I'd like to plug https://github.com/zbraniecki/gecko-dev/blob/l20n-zibi/toolkit/content/l20n-perf-monitor.js into the menubar document.

Should I do this on l20n branch or against larch?
Flags: needinfo?(gandalf) → needinfo?(stas)
(Reporter)

Comment 3

2 years ago
My votes goes to doing it against larch when bug 1288406 lands.
Flags: needinfo?(stas)
(Assignee)

Updated

2 years ago
Assignee: nobody → gandalf
(Reporter)

Comment 4

2 years ago
I pushed bug 1288406 to try and enabled Talos tests:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=fdcee57b4e4f&newProject=try&newRevision=89fccfd367377df5296d16f41b9933217e8c63f2&framework=1

Some builds failed so these results might be a bit off but as it stands today we're looking at a few regressions:

  sessionrestore                   +9%
  sessionrestore_no_auto_restore   +7%
  tpaint                          +29%
  ts_paint                         +4%

Updated

2 years ago
Blocks: 1291693
(Assignee)

Updated

2 years ago
Depends on: 1295292
(Assignee)

Comment 5

2 years ago
Here's the first batch of results

domContentLoadedEventStart	267.20
domContentLoadedEventEnd	267.20
domComplete	313.40
loadEventStart	330.60
loadEventEnd	391.40
-------------------------------------------------
l20n-start	60.60
io-start	89.60
io-loaded	122.80
fetchFirstBundle	123.80
io-parsed	129.80
translateRoot	130.00
translateElements	130.00
getElementsTranslationStart	130.00
getKeysForElementsStart	130.00
getKeysForElementsEnd	155.00
getElementsTranslationKeys	155.00
applyTranslationsStart	176.20
applyTranslationsEnd	178.60
mozBeforeLayout	196.60
firstPaint	447.00

It's based on 5 runs of larch with l20n-perf-monitor plugged in and performance.mark's sprinkled around l20n-chrome-xul.js.

The takeaways:
 - IO could be started ~30ms earlier
 - IO takes ~33ms
 - we may want to do sync-IO[0]
 - parsing takes ~6ms
 - getting the list of data-l10n-id entities ~25ms
 - formatting entites takes ~20ms
 - we're still done before layout

[0] There's no way to prevent layout/FOUC in XUL other than sync IO. With browser.xul scenario we seem to be well within time budget, but it's not deterministic.

Areas worth investigating:
 - earlier IO
 - sync IO
 - identifying why iterating over entites takes so much time and can we optimize it
 - identifying if we can optimize the formatter - it has a fastpath for string value, but may not for a single text attribute ("label")
(Reporter)

Comment 6

2 years ago
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #5)

> The takeaways:
>  - IO could be started ~30ms earlier

Let's move this into bug 1289535. I'd like us to first experiment with not waiting for documentReady().  Would love to get your help setting up l20n-perf-monitor.

>  - IO takes ~33ms
>  - we may want to do sync-IO[0]
>  - parsing takes ~6ms
>  - getting the list of data-l10n-id entities ~25ms
>  - formatting entites takes ~20ms

That's pretty slow.  I'll investigate.

>  - identifying why iterating over entites takes so much time and can we
> optimize it

Do we know if it's iterating or maybe the cost of querySelectorAll()?

>  - identifying if we can optimize the formatter - it has a fastpath for
> string value, but may not for a single text attribute ("label")

The fastpath is in ctx.format which is used both for values and traits:

https://github.com/l20n/l20n.js/blob/a51193c62dd24c406101ad699b3e3cb34d05126a/src/lib/format.js#L54
(Reporter)

Comment 7

2 years ago
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #5)

> getElementsTranslationStart   130.00
> getKeysForElementsStart       130.00
> getKeysForElementsEnd	        155.00
> getElementsTranslationEnd     155.00
> applyTranslationsStart        176.20

Looking at https://github.com/l20n/l20n.js/blob/a51193c62dd24c406101ad699b3e3cb34d05126a/src/lib/observer/base.js#L236-L245 getKeysForElements is pretty straight-forward:

  getKeysForElements(elems) {
    return elems.map(elem => {
      const id = elem.getAttribute('data-l10n-id');
      const args = elem.getAttribute('data-l10n-args');

      return args ?
        [id, JSON.parse(args.replace(reHtml, match => htmlEntities[match]))] :
        id;
    });
  }

Are we paying for accessing the attributes via getAttribute()?

I also have a couple of ideas for optimizing getElementsTranslation; I filed bug 1296618 for this.
(Reporter)

Comment 8

2 years ago
(In reply to Staś Małolepszy :stas from comment #6)

> Let's move this into bug 1289535. I'd like us to first experiment with not
> waiting for documentReady().  Would love to get your help setting up
> l20n-perf-monitor.

OK, I figured out how to set it up.  I'll be investigating the performance of formatWithFallback on Monday.  I thought the slowness was related to this.interactive in formatEntities, but it looks like all of 20ms are spent formatting.
(Reporter)

Comment 10

2 years ago
I ran a few more experiments and it looks like iterating over the [data-l10n-id] NodeList is the most expensive thing we do -- taking up to 60 ms.  In browser.xul it's only 114 items long.  I would expect it to be faster.

Then there's formatting which takes 20-25 ms.

I also experimented with removing formatEntities in favor of a simpler formatEntity.  The code gets simpler, true, but there's a perf hit of ca. 10-15 ms.
(Assignee)

Comment 12

2 years ago
Ugh, I'm trying to dig into how tpaint matches our perf-monitor results. First problem is that I can't increase the sample because of bug 1295292 (which overall reduces my trust in this test).

Then, I'm getting vastly different numbers than treeherder results.

For mozilla-central rev 63988dbb2532 I see tpaint ~280ms, while for larch rev 7214320961ed I see tpaint ~400ms.

That's a huge difference, much greater than ~20% hit reported by treeherder.

One of us must be wrong.

:stas - can you try to replicate my test and check if you see numbers similar to treeherder or mine?

1) hg up 63988dbb2532
2) ./mach build toolkit;./mach build browser
3) ./mach talos-test --activeTests tpaint
4) hg up larch
5) ./mach build toolkit;./mach build browser
6) ./mach talos-test --activeTests tpaint
(Reporter)

Comment 13

2 years ago
Revision 63988dbb2532:

{
  "framework": {
    "name": "talos"
  }, 
  "suites": [
    {
      "extraOptions": [
        "e10s"
      ], 
      "name": "tpaint", 
      "subtests": [
        {
          "name": "tpaint", 
          "replicates": [
            242.11000000000058, 
            252.62000000000262, 
            254.91499999999996, 
            260.9950000000001, 
            264.6949999999997, 
            268.78000000000065, 
            270.2900000000009, 
            273.9300000000003, 
            275.58500000000276, 
            277.8399999999997, 
            280.4800000000014, 
            290.0, 
            299.6850000000013, 
            352.494999999999, 
            355.5, 
            359.8500000000022, 
            361.2800000000025, 
            361.89500000000044, 
            372.46500000000015, 
            373.03000000000065
          ], 
          "value": 299.6850000000013
        }
      ]
    }
  ]
}


Revision 7214320961ed:

{
  "framework": {
    "name": "talos"
  }, 
  "suites": [
    {
      "extraOptions": [
        "e10s"
      ], 
      "name": "tpaint", 
      "subtests": [
        {
          "name": "tpaint", 
          "replicates": [
            273.34500000000025, 
            304.9300000000003, 
            310.6100000000006, 
            311.1949999999997, 
            314.6949999999997, 
            324.1350000000002, 
            325.6400000000003, 
            327.345, 
            330.10000000000036, 
            331.7400000000016, 
            335.0749999999971, 
            337.1999999999998, 
            392.2600000000002, 
            396.41999999999825, 
            396.6750000000029, 
            401.9300000000003, 
            407.76000000000204, 
            422.6199999999999, 
            425.20500000000175, 
            449.4249999999993
          ], 
          "value": 392.2600000000002
        }
      ]
    }
  ]
}
(Assignee)

Comment 14

2 years ago
Because talos is so unstable, and we're getting really weird results out of it, I'm trying to verify if we are at all able to use talos tests to evaluate our performance.

Each build has been issued with this spell:
`./mach try -b o -p linux64 -u none -t other,other-e10s --rebuild-talos 20`

Here are 3 try builds of mozilla-central (rev=63988dbb2532) with 20 talos test reps:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fcaa363fdb1d
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cfdc7920edf1
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ba7ca126973f

Here are 3 try builds of larch (rev=468244986b20) with 20 talos test reps:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=857981d6bf5d
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ac7d9aee4052
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bdaab89e9dc5

Here are 3 try builds of larch-nuclear-option (rev=6f82cd715cb1) with 20 talos test reps:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=42ac7e609436
https://treeherder.mozilla.org/#/jobs?repo=try&revision=416dd2cf8fc9
https://treeherder.mozilla.org/#/jobs?repo=try&revision=06ca56ed5e09


Hypothesis:

* Talos results between builds of the same branch should have the same numbers and similar STDEV

If this will not be true, I believe we should put into question using talos tests for informing our performance related decisions in the whole Mozilla project.

* Each larch build should be reliably slower by the same margin from mozilla-central

If this will not be true that means that we should put into question using talos tests for evaluation of L20n performance.

* Each nuclear-option build should be faster than mozilla-central build

If this will not be true we should put into question if we can achieve comparable performance with JS-based implementation of L20n to replace C++ DTD implementation.
(Assignee)

Comment 15

2 years ago
`larch-nuclear-options` is a branch on top of larch which removes all l20n.js content. It basically means that it shaves off all the DTD's that larch removed, but doesn't add any code that larch added. It does nothing in l20n land and does less in DTD land. We should see it reflected in perf numbers if we are to be able to use those perf numbers to inform our decisions.
(Assignee)

Comment 16

2 years ago
The results indicate that in the current setup, larch does not remove enough DTD footprint to create any budget for l20n cost.

the m-c vs nuclear-option is 0, which means that any footprint l20n adds will be an overhead.

I created another attempt which removes *all* DTD from browser.xul:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=530586398dd2
(Reporter)

Comment 17

2 years ago
We ran an experiment on try with the following changes:

 - don't wait on documentReady() to start the IO,
 - adjust documentReady() to recognize XUL's uninitialized state,
 - remove one unnecessary then() from translateRoot(),
 - move <script src="l20n.js"> to the top-most position in <window>.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=01d978ee21c0
(Assignee)

Comment 18

2 years ago
In result of bug 1289318 we decided to merge mozilla-central again, because the amount of Promise related code in l20n.js may impact the performance for us.

The new mozilla-central base revision is acfb2c3ac6ae.

The new larch with 20 talos jobs:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c360ec477a86
(Assignee)

Comment 19

2 years ago
The new larch revision is e980aee95cb2
(Reporter)

Comment 20

2 years ago
(In reply to Staś Małolepszy :stas from comment #17)
> We ran an experiment on try with the following changes:
> 
>  - don't wait on documentReady() to start the IO,
>  - adjust documentReady() to recognize XUL's uninitialized state,
>  - remove one unnecessary then() from translateRoot(),
>  - move <script src="l20n.js"> to the top-most position in <window>.
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=01d978ee21c0


There weren't significant changes in talos numbers, but I still filed bug 1298974 to land these changes on larch as they are conceptually sound.
(Assignee)

Comment 21

2 years ago
And here's a new mozilla-central try build for revision acfb2c3ac6ae with 20 talos runs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e80496ddeea
(Assignee)

Comment 24

2 years ago
Performance is pretty well evaluated at this point: https://wiki.mozilla.org/index.php?title=L20n/Firefox/Performance
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.