Closed Bug 1501607 Opened 6 years ago Closed 6 years ago

1.52 - 2.06% Base Content JS (linux64-qr, osx-10-10, windows10-64-qr, windows7-32) regression on push 59160a8260a02fda2bc625b02c3132d9330e2dd7 (Tue Oct 23 2018)

Categories

(Core :: DOM: Bindings (WebIDL), defect)

Type: defect
Priority: Not set
Severity: normal

Tracking


Status: VERIFIED FIXED
Target Milestone: mozilla65
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- unaffected
firefox64 --- unaffected
firefox65 + fixed

People

(Reporter: igoldan, Assigned: nika)

References

Details

(Keywords: perf, regression)

We have detected an AWSY regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=59160a8260a02fda2bc625b02c3132d9330e2dd7

Since you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

  2%  Base Content JS osx-10-10 opt stylo           5,213,170.00 -> 5,320,400.00
  2%  Base Content JS linux64-qr opt stylo          5,211,669.33 -> 5,317,050.67
  2%  Base Content JS windows10-64-qr opt stylo     5,232,114.67 -> 5,332,000.00
  2%  Base Content JS windows7-32 opt stylo         4,203,915.00 -> 4,267,992.00


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=17065

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/AWSY/Tests
Component: General → DOM: Bindings (WebIDL)
Product: Testing → Core
Flags: needinfo?(nika)
My initial guess is that something in the test harness is enumerating system globals and hence now resolving more stuff.  But it's hard to tell for sure because I can't find any instructions for running this test locally...

The other option is that SystemBindingInitIds is now creating a lot more ids. If so, fixing bug 1501124 would likely fix this.
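
To make that second guess concrete, here is a toy model of eager id interning (illustration only, with made-up names; this is not the SpiderMonkey atoms table or the generated binding code): every property and method name of a system-scope interface gets interned into a process-lifetime table at binding init, whether or not the interface is ever used.

  // Toy model of eager id interning; types and names are hypothetical.
  #include <string>
  #include <unordered_set>
  #include <vector>

  // Stand-in for the per-process atoms table: every interned name stays
  // here for the lifetime of the content process.
  static std::unordered_set<std::string> gAtomTable;

  // Eager variant: all names are interned at startup, so several hundred
  // strings become long-lived atoms in every content process even if the
  // corresponding bindings are never touched.
  void InitIdsEagerly(const std::vector<std::string>& names) {
    for (const auto& name : names) {
      gAtomTable.insert(name);
    }
  }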
So the total length of the strings that now pass through SystemBindingInitIds is 11,378 characters on opt Linux, spread across around 700 strings. We're seeing ~106-114KB regressions on 64-bit and a ~64KB regression on 32-bit. I'm not sure how big each atom is.
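
A quick back-of-envelope check of those numbers (my own arithmetic, not measured from the builds) puts the implied cost at roughly 160 bytes per new string on 64-bit and ~95 bytes on 32-bit, which is at least plausible for a GC-heap string plus its atoms-table entry:

  // Rough per-string cost implied by the regression figures above.
  #include <cstdio>

  int main() {
    const double growth64 = 110.0 * 1024.0;  // ~110 KB regression, 64-bit
    const double growth32 = 64.0 * 1024.0;   // ~64 KB regression, 32-bit
    const double strings  = 700.0;           // new strings through InitIds
    const double chars    = 11378.0;         // total characters
    std::printf("avg string length : %.1f chars\n", chars / strings);     // ~16
    std::printf("64-bit cost/string: %.0f bytes\n", growth64 / strings);  // ~161
    std::printf("32-bit cost/string: %.0f bytes\n", growth32 / strings);  // ~94
    return 0;
  }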
Is there a way to diff the actual about:memory reports between the two builds involved?
Flags: needinfo?(igoldan)
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #3)
> Is there a way to diff the actual about:memory reports between the two
> builds involved?

I am not aware of a way to do that. Maybe :erahm can help us here?
Flags: needinfo?(igoldan) → needinfo?(erahm)
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #3)
> Is there a way to diff the actual about:memory reports between the two
> builds involved?

Download the memory reports from the artifact list of the AWSY jobs, open about:memory, click "Load and diff...", select the two reports.
(In reply to Kris Maglione [:kmag] from comment #5)
> (In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #3)
> > Is there a way to diff the actual about:memory reports between the two
> > builds involved?
> 
> Download the memory reports from the artifact list of the AWSY jobs, open
> about:memory, click "Load and diff...", select the two reports.

More specifically on treeherder: select the 'ab' job, navigate to the 'Job Details' panel, download the 'memory-report-TabsOpenForceGC-0.json.gz' about:memory report.

> Web Content (pid NNN)
> Explicit Allocations 
> 1.12 MB (100.0%) -- explicit
> ├──0.80 MB (71.08%) ++ heap-overhead
> ├──0.36 MB (31.87%) -- js-non-window
> │  ├──0.23 MB (20.82%) -- runtime
> │  │  ├──0.21 MB (18.73%) ── atoms-table [5]
> │  │  └──0.02 MB (02.09%) ── atoms-mark-bitmaps [5]
> │  ├──0.12 MB (11.06%) -- zones/zone(0xNNN)
> │  │  ├──0.12 MB (10.41%) -- strings/string(<non-notable strings>)
> │  │  │  ├──0.11 MB (09.63%) ── gc-heap/latin1 [10]
> │  │  │  └──0.01 MB (00.78%) ── malloc-heap/latin1 [10]
> │  │  └──0.01 MB (00.65%) -- (6 tiny)
> │  │     ├──0.01 MB (00.72%) ── gc-heap-arena-admin [5]
> │  │     ├──-0.01 MB (-00.57%) ── sundries/gc-heap [15]
> │  │     ├──0.00 MB (00.37%) ── unused-gc-things [15]
> │  │     ├──0.00 MB (00.11%) ++ realm([System Principal], shared JSM global)
> │  │     ├──0.00 MB (00.02%) ++ shapes
> │  │     └──0.00 MB (00.00%) ── object-groups/gc-heap [5]

> Other Measurements 
> 0.11 MB (100.0%) -- js-main-runtime-gc-heap-committed
> ├──0.11 MB (96.36%) -- used
> │  ├──0.11 MB (94.89%) -- gc-things
> │  │  ├──0.11 MB (95.29%) ── strings [5]
Flags: needinfo?(erahm)
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #1)
> My initial guess is that something in the test harness is enumerating system
> globals and hence now resolving more stuff.  But it's hard to tell for sure
> because I can't find any instructions for running this test locally...

The 'Content Memshrink Measurements' doc [1] has details. To run the test locally, use:
./mach awsy-test testing/awsy/awsy/test_base_memory_usage.py

[1] https://docs.google.com/document/d/1i3BMYUMC2eDAtJbcTLWibB5PGQlFs8UdUZ3QVlplZsQ/edit?usp=sharing
Thank you.  So for the treeherder linux64-opt build, I see the following increases:

* 250KB page-cache
* 110KB latin1 strings
* 210KB atoms table

Given that, I suspect the id init work is the culprit, so we should fix bug 1501124.
Sadly, I can't add the dependency because of dependency loops.
See Also: → 1501124
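
For illustration, the obvious way to avoid paying this cost at process startup is to intern ids lazily, on first use of an interface, instead of eagerly for every system binding. A minimal sketch of that pattern, using the same kind of toy atoms table as the sketch above (this is not the actual bug 1501124 patch):

  // Lazy variant of the toy model above: names are interned only the first
  // time an interface's ids are actually needed.
  #include <string>
  #include <unordered_set>
  #include <vector>

  static std::unordered_set<std::string> gAtomTable;  // stand-in atoms table

  struct InterfaceIds {
    std::vector<std::string> names;
    bool initialized = false;

    // Called from the interface's resolve/first-use path; bindings that are
    // never used never touch the atoms table, so base content-process memory
    // stays flat.
    void EnsureInitialized() {
      if (initialized) {
        return;
      }
      for (const auto& name : names) {
        gAtomTable.insert(name);
      }
      initialized = true;
    }
  };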
I've thrown together a patch on bug 1501124.
Flags: needinfo?(nika)
The fix for bug 1501124 reduces the atoms table by about 200KB and latin1 strings by ~110KB.  So I expect it should fix this regression.

Looking at the graph at https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,1684808,1,4&selected=mozilla-inbound,1684808,395151,618631241,4 it looks like things went down there too.  So I suspect this is fixed.
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #11)
> The fix for bug 1501124 reduces the atoms table by about 200KB and latin1
> strings by ~110KB.  So I expect it should fix this regression.
> 
> Looking at the graph at
> https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,
> 1684808,1,4&selected=mozilla-inbound,1684808,395151,618631241,4 it looks
> like things went down there too.  So I suspect this is fixed.

Yup.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Assignee: nobody → nika
Target Milestone: --- → mozilla65

I can confirm these regressions have been fixed:

== Change summary for alert #17166 (as of Thu, 25 Oct 2018 09:56:06 GMT) ==

Improvements:

  2%  Base Content JS linux64 opt stylo            5,306,918.00 -> 5,186,726.67
  2%  Base Content JS osx-10-10 opt stylo          5,293,401.33 -> 5,177,417.33
  2%  Base Content JS linux64-qr opt stylo         5,281,841.33 -> 5,174,032.00
  2%  Base Content JS windows10-64-qr opt stylo    5,332,058.67 -> 5,231,868.00
  2%  Base Content JS windows10-64 opt stylo       5,332,021.33 -> 5,231,836.00
  2%  Base Content JS windows7-32 opt stylo        4,267,976.00 -> 4,198,521.33

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=17166

Status: RESOLVED → VERIFIED