Open Bug 1391710 Opened 2 years ago Updated 6 months ago

Nightly still has 2.2 times poorer JS perfomance than Chrome on Mozilla Log Analyzer

Categories

(Core :: JavaScript Engine, defect, P3)

defect

Tracking

()

Tracking Status
firefox57 --- affected

People

(Reporter: mayhemer, Unassigned, NeedInfo)

References

(Depends on 1 open bug, )

Details

(Keywords: perf, Whiteboard: [qf:p3])

STR:
- download and unpack https://janbambas.cz/dwnlds/js-perf-test-logs.tar.xz
- go to https://janbambas.cz/moz/logan/
- load all log files from the xzip

Then observe the load performance (there is a blue load progress bar appearing at the top.)  

Nightly is 2.5 slower than Chrome (Version 60.0.3112.101 (Official Build) (64-bit)).

Reporting as this can be a good test case for performance improvement optimizations.
Would you be able to follow the steps in this MDN article? https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem

If you link the "Share" URL those steps produces here, we can see exactly what Firefox is doing on your machine.
c1
Flags: needinfo?(honzab.moz)
(In reply to Sean Stangl [:sstangl] from comment #1)
> Would you be able to follow the steps in this MDN article?
> https://developer.mozilla.org/en-US/docs/Mozilla/Performance/
> Reporting_a_Performance_Problem
> 
> If you link the "Share" URL those steps produces here, we can see exactly
> what Firefox is doing on your machine.

From 57.0a1 (2017-08-22) (64-bit)

https://perfht.ml/2vYncWH

Loaded logan, started the profiler, dropped the files, let it be processed by logan and then captured the profile.

Nightly: 33s, Chrome: 10s.
Flags: needinfo?(honzab.moz) → needinfo?(sstangl)
Keywords: perf
And a clean profile (default preferences)
Priority: -- → P3
Whiteboard: [qf:p3]
Any news? :)
Honza, I often get an exception (just "error") reported to the console. This happens at random but it's very frequent - it makes it hard to profile this. Any idea what's going on there?
(In reply to Jan de Mooij [:jandem] from comment #6)
> Honza, I often get an exception (just "error") reported to the console. 

When you do what?  Is there a location side by it?

> This
> happens at random but it's very frequent - it makes it hard to profile this.
> Any idea what's going on there?

No.
Ah!  I unpacked the files and tested in Firefox (using only Chrome for logan these day - because of this bug).  Yes, I can see the error message with no data added.  Will look into it shortly.

You could let me know about this sooner, so we could move on.
(In reply to Honza Bambas (:mayhemer) from comment #8)
> You could let me know about this sooner, so we could move on.

I didn't know about this before I commented - din't try it until today.
(In reply to Jan de Mooij [:jandem] from comment #9)
> (In reply to Honza Bambas (:mayhemer) from comment #8)
> > You could let me know about this sooner, so we could move on.
> 
> I didn't know about this before I commented - din't try it until today.

Aha!  Yeah, this used to work recently.

Hmm.. so, I dug this down to NS_ERROR_DOM_FILE_NOT_READABLE_ERR, which is quite surprising...  Looks like we have a bug in DOM exposed file/blob reader that emerged only recently.

Filed bug 1408397.  It's sad it's blocking this bug investigation now :(
Depends on: 1408397
Flags: needinfo?(sstangl)
Jan, the blocker has been fixed.  You can now test again.  Thanks.
Flags: needinfo?(jdemooij)
So, any news? :)
I looked into this a bit. We spent a lot of time (> 30%) in nursery GCs. Then there's some slowness related to |delete| (please don't use that operator if you can avoid it, it's slow). I'll look into the GC stuff today.
We have a ton of large arrays (length > 600,000) when I load just the mozilla-FX2o.log file. Tenuring them is very slow.

Honza, can you show me where in the code we create these arrays?
Actually it seems to be a single array so it's probably from the WholeCell StoreBuffer.

I'll do some more digging.
Depends on: 1415119
The patch in bug 1415119 improves this noticeably. We're still ~2x slower than Chrome on just the mozilla-FX2o.log file but it used to be about 3x slower for me.
Depends on: 1415124
Thanks Jan!  To answer the (associative) array question, I never investigate exactly which is the largest one, but I assume there is one I add to and remove from quite often - an array keeping every "live" object (from ctor to dtor) being currently tracked in the log (as it's being initially parsed), mapped by the pointer value:

window.logan._proc.objs

+ https://github.com/mayhemer/logan/blame/master/logan.js#L660

- https://github.com/mayhemer/logan/blame/master/logan.js#L345
  https://github.com/mayhemer/logan/blame/master/logan.js#L351

removal happens with delete.
Depends on: 1415161
Flags: needinfo?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #16)
> The patch in bug 1415119 improves this noticeably. We're still ~2x slower
> than Chrome on just the mozilla-FX2o.log file but it used to be about 3x
> slower for me.

It was down to 1.6x slower a few days ago. I think the remaining difference is due to lexical environment stuff (for let/const) being slower than necessary (we're working on that) and the page allocates so much memory that we switch to non-incremental GCs and that involves waiting for background threads to finish. That's also something I expect to improve soonish.
(In reply to Jan de Mooij [:jandem] from comment #18)
> (In reply to Jan de Mooij [:jandem] from comment #16)
> > The patch in bug 1415119 improves this noticeably. We're still ~2x slower
> > than Chrome on just the mozilla-FX2o.log file but it used to be about 3x
> > slower for me.
> 
> It was down to 1.6x slower a few days ago. 

Yes, same measurements on my side, great improvement!  Thanks.

> I think the remaining difference
> is due to lexical environment stuff (for let/const) being slower than
> necessary (we're working on that) 

bug#?

> and the page allocates so much memory that

truth.

> we switch to non-incremental GCs and that involves waiting for background
> threads to finish. That's also something I expect to improve soonish.

again, bug#?  note that chrome has similar issues (I can see the blue progress bar getting stuck from time to time, apparently GC is happening).

I started a rewrite how logan saves the captured lines (currently we keep all the stuff in memory).  maybe that will lower the memory allocation amount.  

do you want me to expose the current state of logan somewhere in case you want to test with it?
Depends on: 1416180
(In reply to Honza Bambas (:mayhemer) from comment #19)
> > I think the remaining difference
> > is due to lexical environment stuff (for let/const) being slower than
> > necessary (we're working on that) 
> 
> bug#?

Bug 1341937.

> again, bug#?  note that chrome has similar issues (I can see the blue
> progress bar getting stuck from time to time, apparently GC is happening).

Filed bug 1416180.

> do you want me to expose the current state of logan somewhere in case you
> want to test with it?

Maybe attach as zip file to this bug? Also a copy of the test data maybe. It's always annoying when you get back to a bug later and URLs are dead.
(In reply to Jan de Mooij [:jandem] from comment #20)

Thanks.

> Maybe attach as zip file to this bug? Also a copy of the test data maybe.
> It's always annoying when you get back to a bug later and URLs are dead.

I've set URL to https://github.com/mayhemer/logan/archive/9db48e3be176d6dd41276db3a809104d6f614ffe.zip which is a snapshot of the state as it was when I filed this bug (and before the latest large changes influencing perf have landed).  Just unzip and load from file:///, no need to host it.

The example log set at https://janbambas.cz/dwnlds/js-perf-test-logs.tar.xz will remain there.  I'm under control of that site.

Rechecked that current nightly is ~1.6 slower than chrome using these data.
Summary: 2.5 poorer JS perfomance than Chrome on Mozilla Log Analyzer → 2.5 (update: 1.6) poorer JS perfomance than Chrome on Mozilla Log Analyzer
Depends on: 1419785
(In reply to Honza Bambas (:mayhemer) from comment #19)
> 
> do you want me to expose the current state of logan somewhere in case you
> want to test with it?

Yes please.  I'm about to start some testing, I'll use the current version, if I hear back from you I can use your preserved version.
(In reply to Paul Bone [:pbone] from comment #22)
> (In reply to Honza Bambas (:mayhemer) from comment #19)
> > 
> > do you want me to expose the current state of logan somewhere in case you
> > want to test with it?
> 
> Yes please.  I'm about to start some testing, I'll use the current version,
> if I hear back from you I can use your preserved version.

So, for convenience I unpacked the zip referred in the URL field, which is the older version of logan I used to originally report this bug, to:

https://janbambas.cz/moz/bug1391710/
With the up to date version of logan at https://janbambas.cz/moz/logan and some ~200MB log I can see the following:

Fx Nightly 61: ~26-29 seconds to parse
Fx Release 59.0.02: ~25 seconds to parse
Chr 65.0.3325.181: ~13-14 seconds to parse

all x64.
Summary: 2.5 (update: 1.6) poorer JS perfomance than Chrome on Mozilla Log Analyzer → again, Nightly has 2 times poorer JS perfomance than Chrome on Mozilla Log Analyzer
Tested again after some time with the current (up to date) Nightly x64 and Chrome (Version 70.0.3538.77 (Official Build) (64-bit)).

Arbitrary set of logs, Firefox: 35 seconds, Chrome: 16 seconds to load.



To defend Firefox, when I load a really huge (1GB) log, Chrome usually crashes the tab, we still stand!  When Chrome happen to survive, and I start expanding number of lines of long-living objects, Chrome becomes soon useless in scrolling and modifying the DOM, while Firefox remains mostly fluent!

Hence, despite the slow load times I start preferring Firefox for large logs over Chrome.  Simply because it works :)
Summary: again, Nightly has 2 times poorer JS perfomance than Chrome on Mozilla Log Analyzer → Nightly still has 2.2 times poorer JS perfomance than Chrome on Mozilla Log Analyzer
For performance testing, particularly cases like this with a lot of GC Minor activity, try setting JSGC_DISABLE_POISONING=1 in your environment.  This will disable some extra work that Nightly versions of Firefox do that helps us find bugs.
(In reply to Paul Bone [:pbone] from comment #26)
> For performance testing, particularly cases like this with a lot of GC Minor
> activity, try setting JSGC_DISABLE_POISONING=1 in your environment.  This
> will disable some extra work that Nightly versions of Firefox do that helps
> us find bugs.

Will try (ni>me)

I also tried the release build (63.0.1): 31 seconds instead of 35.  It's still around ~2 slower.
Flags: needinfo?(honzab.moz)
Please capture a Gecko/perf.html profile where you're at it.

HI for myself to attempt to reproduce this.

Flags: needinfo?(pbone)

stalled on my side, sorry.

Flags: needinfo?(honzab.moz)
You need to log in before you can comment on or make changes to this bug.