Loading https://doc.servo.org/smallvec/trait.Array.html spends multiple seconds in parsing and GC during page load

Status: NEW (defect, severity normal, priority P2)
Assigned to: Unassigned
Reporter: emilio (NeedInfo)
Blocks: 1 bug; Keywords: perf
Version: unspecified; Points: ---
Firefox tracking flags: firefox68 fix-optional
Whiteboard: [qf:p1:pageload]
Opened: 3 months ago; last updated: 11 days ago

Comment 1 • emilio (Reporter) • 3 months ago

On a clean profile: http://bit.ly/2SloUct

Here is a profile I also collected which shows about 2s of parsing at the beginning: http://bit.ly/2Tk4RQs

Whiteboard: [qf] → [qf:p1:pageload]

Based on Jason's latest investigation:

This page loads a multi-megabyte script containing a single array with tons of literals and a few variables. The variables prevent us from using the fast path for JSOP_OBJECT creation, so we go through the slow path of emitting JSOP_INITELEM for each element.

var a = 1;
var b = 2;
var arr = [1,2,3,4,5,6,7,8,9,0, a ,1,2,3,4,5,6,7,8,9,0, b ,1,2,3,4,5,6,7,8,9,0];

Two potential approaches:

  • It sounds like this is the kind of value where the parser's constant propagation could be used to resolve the a and b variables.
  • Another option is to emit JSOP_OBJECT for the constant elements, and only emit JSOP_INITELEM for the few variable elements detected while parsing.
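A toy illustration (not the actual parser code) of what approach 1's constant propagation would do: if a and b are known to hold constants when the array literal is parsed, their uses can be substituted with those constants, leaving a fully-constant literal that qualifies for the JSOP_OBJECT fast path. The element representation and the foldConstants helper here are hypothetical.

```javascript
// Hypothetical sketch of parser-side constant propagation for array
// literals. Elements are either numbers (already constant) or strings
// naming a variable; `known` maps variable names to constant values.
function foldConstants(elements, known) {
  const folded = [];
  let allConstant = true;
  for (const el of elements) {
    if (typeof el === "number") {
      folded.push(el);              // already a literal
    } else if (known.has(el)) {
      folded.push(known.get(el));   // substitute the known constant
    } else {
      folded.push(el);              // unknown: stays a slow-path element
      allConstant = false;
    }
  }
  return { folded, allConstant };   // allConstant => JSOP_OBJECT fast path
}

// `var a = 1; var b = 2;` would make a and b known constants:
const known = new Map([["a", 1], ["b", 2]]);
const { folded, allConstant } = foldConstants([1, 2, "a", 3, "b"], known);
// folded is [1, 2, 1, 3, 2] and allConstant is true
```

As comment 3 below points out, this is only sound if the parser can prove the binding really is a fresh constant (e.g. not shadowing a pre-existing read-only global).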

Jason, does either of these sound like something easy to do?

Flags: needinfo?(jorendorff)
Priority: -- → P2

I don't see a quick hit here. Naive constant propagation would be unsound: if there's already a read-only global property a, then var a = 1; does not change its value.

Flags: needinfo?(jorendorff)

Well, the first step is to avoid GC. So:

  1. Is the frontend triggering GC by making a lot of garbage? (I sort of doubt it, but if so, something like approach 2 suggested in comment 3 is the only way.)

  2. If not, is there any realistic way to avoid GC?

This style of code rapidly consumes a huge amount of memory but generates no garbage at run time (that is, all of the arrays are ultimately stored in a global variable, and thus rooted). But it would be unsound to assume that the code doesn't generate garbage.
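A minimal sketch of the allocation pattern being described (the names are illustrative, not the actual page's code): every array is stored straight into a rooted global structure, so a minor GC finds everything live and has to tenure it all.

```javascript
// Illustrative only: mimics a generated data file where every array
// literal is immediately reachable from one global root. Nothing here
// ever becomes garbage, so collecting the nursery just tenures everything.
const implementors = {};                         // global root (hypothetical name)
for (let i = 0; i < 1000; i++) {
  implementors["impl" + i] = [i, i + 1, i + 2];  // rooted immediately on creation
}
```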

jonco, any ideas?

Flags: needinfo?(jcoppeard)

The profile in comment 1 contains an 18-second event processing delay, of which:

  • 4.0sec or so in script parsing
  • 4.5sec or so in script emit
  • 5.5sec in GC
  • 0.2sec other execution

So if we can eliminate the long GC pause, the next idea is to move script parsing and emitting to a thread. This probably wouldn't reduce total time from start to DOMContentLoaded, but I think it would keep the page responsive while we're waiting.

This is happening while executing JS, so I don't think it's related to bug 1543806.

We're allocating all this data into the nursery and then tenuring it when we collect the nursery. It would be great if we could detect this situation and allocate this data in the tenured heap to start with. Pre-tenuring is based on the ObjectGroup, and I don't know how that works for literal objects like the ones we have here. Do they all end up with separate groups? If so, I don't see how to avoid this.

Looking at the profiles, we're seeing huge minor GC times. But digging in I see things like taking 40ms to allocate or clear a page of memory, which makes me think that maybe the system is swapping at this point? I don't understand why this is taking so long.

No longer depends on: 1543806

#jsapi discussion indicates the spec for <script defer> allows off-main-thread parsing.


<jorendorff> bz: ping - I have a profile in bug 1530212 that raises
    questions about <script defer>

    bz: in particular, do we have to run the script synchronously as soon as
    parsing is complete?

    that is, what if we instead started js-parsing the script as soon as
    possible, then once html-parsing is complete and all deferred script
    js-parsing is complete, run them all and then fire DOMContentLoaded

    that seems indistinguishable from the JS scripts just taking a long time to
    load, from content's perspective

<bz> jorendorff: You are correct

    jorendorff: Bas and jesup had already run into us not yielding while
    executing defer scripts...

    jorendorff: And yes, we should be doing exactly what you suggest: kick off
    the parse once we have the data and there's nothing more urgent going on,
    and just run the scripts once both its parsing and the HTML parser is done

    ...

    https://html.spec.whatwg.org/multipage/parsing.html#the-end:list-of-scripts-that-will-execute-when-the-document-has-finished-parsing

    Spec explicitly says you don't wait until all defer scripts are ready to
    run the first one

    As long as you run them in order

    And you can do whatever in between; see step 3.1 in the steps I linked.

    jorendorff: To be clear, each individual script would run to completion

    jorendorff: We can just preempt between them

<smaug> we preempt if the next script hasn't been loaded

<jesup> Preempting if the script "ran too long" before running the next defer
    script seems totally reasonable.  SetTimeout has a similar mechanism to
    avoid monopolizing the event queue and blocking other things from happening

<smaug> yup

    and parsing/compiling more on background threads sounds good too, once we
    can do that while GC runs

<tcampbell> matt is making progress on that. It is looking reasonable so far

<smaug> great
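The scheduling idea from the discussion above — run deferred scripts in document order, each to completion, but yield back to the event loop between scripts once one has "run too long" — can be sketched roughly like this. The queue shape and time budget are assumptions for illustration, not Gecko's actual implementation.

```javascript
// Hypothetical sketch: drain a queue of compiled deferred scripts in
// order, preempting between (never within) scripts once a time budget
// is exhausted, so other events can run before the remaining scripts
// and DOMContentLoaded.
async function runDeferredScripts(queue, budgetMs = 16, now = Date.now) {
  let sliceStart = now();
  for (const script of queue) {
    script();                              // each script runs to completion
    if (now() - sliceStart > budgetMs) {   // "ran too long": yield between scripts
      await new Promise(resolve => setTimeout(resolve, 0));
      sliceStart = now();
    }
  }
}
```

Execution order is preserved because the function only awaits between loop iterations, matching the spec requirement that defer scripts run in order while allowing arbitrary work in between.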

(In reply to Jon Coppeard (:jonco) from comment #7)

> Looking at the profiles, we're seeing huge minor GC times. But digging in I see things like taking 40ms to allocate or clear a page of memory, which makes me think that maybe the system is swapping at this point? I don't understand why this is taking so long.

There's a lot I don't understand about these profiles.

In two of the three profiles, the samples don't come at regular intervals. (Unfortunately, the profiler UI still treats each sample as taking up just 1ms, which makes it hard to estimate how long things take.)

(In reply to Jason Orendorff [:jorendorff] from comment #9)

> In two of the three profiles, the samples don't come at regular intervals. (Unfortunately, the profiler UI still treats each sample as taking up just 1ms, which makes it hard to estimate how long things take.)

Oh thanks, that explains what I'm seeing. This also happens when I profile locally. It looks like I get one sample every ~6ms, with reported total running/self time correspondingly reduced.

I still don't know why we're getting these long minor GCs. When I run this code in the shell I can see long minor GCs of ~15ms to evict a full 16MB nursery. When run in the browser I can see e.g. a 672ms minor GC (!!). As far as I can see in the profile, nothing else is running though.
