Bug 1273962 (Closed) — Opened 4 years ago, Closed 3 years ago

Regressions on massive-poppler-warm-preparation

Categories

(Core :: JavaScript Engine: JIT, defect)


Tracking


RESOLVED FIXED

People

(Reporter: h4writer, Unassigned)

References

Details

Requesting need-info. The first regression is possibly from:
Luke Wagner <luke@mozilla.com> Bug 1219098 - Reenable compression on large sources, but revert to uncompressed if decompression happens (r=jandem)

The other one (I think) is:
Nick Fitzgerald <fitzgen@gmail.com> Bug 1211723 and 1260570 - Share JS source text between JSRuntimes; r=jimb

Is this regression important to look into? Both patches seem important, and I don't know how important the benchmark is. Can you enlighten us about how bad regressing this benchmark is?

(Full massive score: https://arewefastyet.com/#machine=31&view=single&suite=massive&start=1461581127&end=1462989360)
Flags: needinfo?(luke)
The regression from bug 1219098 is expected, and devs can fix it by compiling with --separate-asm.  Alon, do you suppose we could get some new builds for awfy that have --separate-asm?

Nick: is this just the slowdown from hashing the huge script source?  Can you perhaps confirm that there's nothing pathologically wrong here?
Flags: needinfo?(luke) → needinfo?(nfitzgerald)
Are there docs for running this test locally? Under just the js shell?
Luke: Sure, I'll get separate-asm builds for awfy.

Nick: Massive is a browser-only test, so no shell option. But you can run it locally, just clone https://github.com/kripken/Massive and load the html. The main page has docs for how to run just part of the suite.
Blocks: 1211723
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(nfitzgerald)
Actually, it turns out not to be feasible to make --separate-asm builds for Massive. Massive doesn't use the normal code loading mechanisms; it does everything itself so that it can add hooks in various places and manipulate things (to ensure a cold load, to measure responsiveness during specific parts of the load, etc.). So a normal --separate-asm build can't work here; we'd need special new hooks. Writing new hooks would require carefully verifying that we are measuring what we want (2 XHRs instead of 1 might alter things, etc.) under the new loading model, and we might need to remove some benchmarks that are no longer possible, or change what they mean. Another concern is that the intrusive hooks mean we can't just use normal builds, which would make it hard to update.

Overall I think we just need a rethinking of the entire benchmark, which perhaps we want to do anyhow for wasm.
Is higher or lower better?

On master, I get 1457. With my patch queue in bug 1269451, I get 1729.

I guess lower, based on the graphs in comment 0.
Flags: needinfo?(nfitzgerald)
For all the normalized scores reported, higher is better (see the table header). If you are looking at internal scores somehow (in the console, I think they are viewable as part of the JSON messages), then it varies.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #6)
> Is higher or lower better?
> 
> On master, I get 1457. With my patch queue in bug 1269451, I get 1729.

I estimate that 1729 would improve the situation, but not fully fix the regression. Given a base of 1457 on master, I would expect about 2000 to fully fix the "Share JS source text between JSRuntimes" regression. Though this is only an estimate; I'm not sure.

> I guess lower, based on the graphs in comment 0.
The graphs are "lower on the graph is better", but for that particular graph that means: higher scores are better.
There are a few more things I can do to help speed up the shared sources. But first: I'm getting frustratingly generic mochitest failures with that patch queue, so I'm trying to pick it apart and land the pieces I can individually. Once I pick it apart and start landing things, I will circle back and try the next set of optimizations. I just really don't want to stay in my current state where I have a bunch of patches r+'d and most pretty much ready to land but not quite all of them and therefore end up not landing any of them.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #9)
> There are a few more things I can do to help speed up the shared sources.
> But first: I'm getting frustratingly generic mochitest failures with that
> patch queue, so I'm trying to pick it apart and land the pieces I can
> individually. Once I pick it apart and start landing things, I will circle
> back and try the next set of optimizations. I just really don't want to stay
> in my current state where I have a bunch of patches r+'d and most pretty
> much ready to land but not quite all of them and therefore end up not
> landing any of them.

Sure, understandable. It seems there are improvements coming, and the patches are important enough to temporarily accept a regression in order to move forward. Luke, can you give an indication of whether this is a viable road with this regression? It seems we are still the fastest browser on that benchmark by far, even with this regression.
Flags: needinfo?(luke)
FWIW, the mochitest failures I was seeing are from us running Windows in a VM and the copy/paste clipboard not working correctly. Apparently, this is "known" but treeherder doesn't have any way to star it, suggest it, or anything else... *sigh*

So I landed that initial queue.

Remaining low hanging fruit:

* Land the patch for de-duping of source text off main thread in the SourceCompressionTask, from bug 1269451.

* Don't just construct lookups without holding the lock; also eagerly hash strings before we take the lock. (Needs bug).

* Only hash the first and last N characters of source text. (Needs bug).
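The last item can be sketched roughly as follows. This is a hypothetical illustration, not SpiderMonkey's actual hashing code: the function name, the mixing step, and the choice of N are all made up. The idea is to bound hashing cost for huge sources by hashing only the first and last N code units plus the length; sources that agree on both ends collide, so a full comparison is still needed on insertion (not shown).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sketch: hash only the first and last N code units of a large
// source, plus its length, instead of the whole buffer. A simple FNV-1a-style
// mixing step stands in for the real hash function.
static uint32_t addToHash(uint32_t hash, char16_t c) {
  return (hash ^ c) * 16777619u;
}

uint32_t hashPrefixAndSuffix(const std::u16string& src, size_t N = 128) {
  uint32_t hash = 2166136261u;
  size_t len = src.size();
  // Fold the length in so sources with equal ends but different lengths
  // still hash differently.
  hash = addToHash(hash, char16_t(len & 0xFFFF));
  size_t head = len < N ? len : N;
  for (size_t i = 0; i < head; i++) {
    hash = addToHash(hash, src[i]);
  }
  if (len > 2 * N) {
    // Long source: also hash the last N code units; the middle is skipped.
    for (size_t i = len - N; i < len; i++) {
      hash = addToHash(hash, src[i]);
    }
  } else if (len > N) {
    // Short enough that prefix and suffix overlap: hash the remainder once.
    for (size_t i = N; i < len; i++) {
      hash = addToHash(hash, src[i]);
    }
  }
  return hash;
}
```

Note the trade-off this sketch makes explicit: two sources of the same length that differ only in the middle hash identically, which is exactly why the de-duplication table must fall back to a full character-by-character comparison before treating them as the same source.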
Yes, that sounds reasonable.  Thanks a lot, Nick, for your diligence on this whole project.  Could you perhaps post again when the intended patches have landed so we can see what the final delta is?  I agree that we're in a position where we can trade one thing for another as long as we understand that it's necessary.
Flags: needinfo?(luke)
Ok all low hanging fruit have landed on inbound.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #13)
> Ok all low hanging fruit have landed on inbound.

And now they are all on m-c too.
Nice work! Looking at awfy, it looks like those patches won back the perf lost to bug 1211723 and bug 1260570.
\o/ can we close this bug then?
Well, there's still the regression from bug 1219098, and I had an idea that maybe there's a better way to solve that bug without this big synchronous decompression jank.  So I'd like to keep it open for now and see if I can do that.
Is your idea for solving this more feasible now that bug 1304390 brought us chunked source compression, by any chance?
Flags: needinfo?(luke)
It is wholly subsumed and obviated by bug 1304390 :)
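For context, the chunked scheme from bug 1304390 can be modeled roughly like this. This is a toy sketch, not SpiderMonkey's actual implementation: the class name, chunk size, and the trivial run-length encoding (standing in for real deflate) are all made up. The point it illustrates is why chunking removes the big synchronous decompression jank: each chunk is compressed independently, so reading a character only requires decompressing the one chunk containing it rather than the whole source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy model of chunked source compression. The source is split into
// fixed-size chunks, each compressed independently (here with a trivial
// run-length encoding in place of real deflate). charAt() decompresses
// only the chunk that holds the requested offset.
class ChunkedSource {
  static constexpr size_t ChunkSize = 64;  // Real chunks are much larger.
  std::vector<std::vector<std::pair<char, uint8_t>>> chunks_;  // RLE pairs.
  size_t length_ = 0;

 public:
  size_t chunksDecompressed = 0;  // Instrumentation for this sketch.

  explicit ChunkedSource(const std::string& src) : length_(src.size()) {
    for (size_t base = 0; base < src.size(); base += ChunkSize) {
      std::vector<std::pair<char, uint8_t>> chunk;
      size_t end = std::min(src.size(), base + ChunkSize);
      for (size_t i = base; i < end;) {
        size_t run = 1;
        while (i + run < end && src[i + run] == src[i] && run < 255) {
          run++;
        }
        chunk.push_back(std::make_pair(src[i], uint8_t(run)));
        i += run;
      }
      chunks_.push_back(std::move(chunk));
    }
  }

  char charAt(size_t offset) {
    assert(offset < length_);
    // Decompress just the one chunk containing this offset.
    const auto& chunk = chunks_[offset / ChunkSize];
    chunksDecompressed++;
    std::string text;
    for (size_t j = 0; j < chunk.size(); j++) {
      text.append(chunk[j].second, chunk[j].first);
    }
    return text[offset % ChunkSize];
  }

  size_t length() const { return length_; }
};
```

With whole-source compression, any lazy access to the source (e.g. for `Function.prototype.toString` or lazy parsing) forced one big synchronous decompression of everything; in the chunked model the worst case per access is one chunk, which is what subsumes the separate-asm idea discussed above.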
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(luke)
Resolution: --- → FIXED