Closed Bug 1348134 Opened 7 years ago Closed 7 years ago

ScriptSource compression is often wasteful

Categories

(Core :: JavaScript Engine, enhancement)

Severity: normal

Tracking


RESOLVED FIXED
mozilla55
Performance Impact high
Tracking Status
firefox55 --- fixed

People

(Reporter: shu, Assigned: shu)

References

Details

Attachments

(3 files, 4 obsolete files)

ScriptSource compression interacts poorly with LazyScripts. Currently, we eagerly fire off an off-thread CompressionTask and block on it finishing after completing parsing. If there are any LazyScripts tied to the ScriptSource (the common case), those LazyScripts will have to decompress.

Compressing eagerly hurts responsiveness, doubly so when the frontend bets wrong on the LazyScript, i.e. when the LazyScript is executed immediately or almost immediately.

ISTM compression should be done on GC, not eagerly, and optimize for responsiveness up front. Let GC hysteresis optimize for memory.

This came up when investigating a 6% speedup in octane code-load when the shell is run with --no-threads.
I also discovered while looking at this that off-thread parsed scripts are never compressed.
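
A rough sketch of the proposed scheme, with illustrative names (PendingSource, onParseFinished, and onMajorGC are hypothetical, not SpiderMonkey's real API): parsing enqueues the source instead of compressing and blocking, and the queue is drained only when a GC runs.

```cpp
#include <string>
#include <vector>

// Hypothetical types for illustration; the real patch works with
// ScriptSource, SourceCompressionTask, and the helper-thread machinery.
struct PendingSource {
    std::string uncompressed;
    bool compressed = false;
};

struct Runtime {
    std::vector<PendingSource*> pendingCompressions;

    // Eager scheme: compress here and block until the task finishes.
    // Deferred scheme: just remember the source and return immediately.
    void onParseFinished(PendingSource* src) {
        pendingCompressions.push_back(src);
    }

    // Let GC hysteresis decide when memory matters: drain the queue when a
    // GC runs. A real implementation would hand each task to a helper
    // thread rather than compressing synchronously here.
    void onMajorGC() {
        for (PendingSource* src : pendingCompressions)
            src->compressed = true;  // stands in for actual compression
        pendingCompressions.clear();
    }
};
```

The point is that GC pressure, not parse completion, decides when the compression work happens.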
Not sure who best to review this; giving to Jon because of the GC interaction.
Attachment #8848675 - Flags: review?(jcoppeard)
Oops, uploaded patches in wrong order. Reuploading in right order.
Attachment #8848676 - Flags: review?(jcoppeard)
Attachment #8848675 - Attachment is obsolete: true
Attachment #8848675 - Flags: review?(jcoppeard)
This is awesome.

Would it make sense to have maxCompressionThreads return a number smaller than threadCount? To make sure a shrinking GC is not going to kick off a large number of compression tasks that would keep the helper threads from doing other GC/compilation work.
(In reply to Shu-yu Guo [:shu] from comment #0)
> ScriptSource compression interacts poorly with LazyScripts. Currently, we
> eagerly fire off an off-thread CompressionTask and block on it finishing
> after completing parsing.

I think the problem is blocking, not the fact that we attempt to compress.

(In reply to Shu-yu Guo [:shu] from comment #1)
> I also discovered while looking at this that off-thread parsed scripts are
> never compressed.

I actually recall the opposite, except when running under rr where none were compressed.

(In reply to Shu-yu Guo [:shu] from comment #4)
> Created attachment 8848677 [details] [diff] [review]
> Handle compression tasks only with shrinking GC; attach finished tasks with
> non-urgent interrupt.

Would it make sense to trigger this synchronously when we are encoding bytecode as well? We also want to limit the number of bytes that have to be transferred out of the bytecode cache, rather than saving the double-size source.
Flags: needinfo?(shu)
Attachment #8848676 - Flags: review?(jcoppeard) → review+
Comment on attachment 8848677 [details] [diff] [review]
Handle compression tasks only with shrinking GC; attach finished tasks with non-urgent interrupt.

Review of attachment 8848677 [details] [diff] [review]:
-----------------------------------------------------------------

I like the idea of doing compression on GC rather than eagerly, but I've got a couple of questions about this.

Do I understand correctly that these compression tasks are created on script creation and persist until the next shrinking GC?  Shrinking GCs happen relatively rarely, after some amount of user inactivity or on memory pressure.  Because of this it seems that the uncompressed script source could outlive the script itself, perhaps by a considerable margin.

Would it be better to kick off compression tasks after a normal GC instead?

Also, maybe we could start the compression at the end of GC and ignore compression tasks for scripts that have died (I'm thinking by checking the reference count on the ScriptSource).  And looking at that, I think we need to add a reference to the ScriptSource while it's held by the compression task.
(In reply to Jon Coppeard (:jonco) from comment #7)
> Comment on attachment 8848677 [details] [diff] [review]
> Handle compression tasks only with shrinking GC; attach finished tasks with
> non-urgent interrupt.
> 
> Review of attachment 8848677 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I like the idea of doing compression on GC rather than eagerly, but I've got
> a couple of questions about this.
> 
> Do I understand correctly that these compression tasks are created on script
> creation and persist until the next shrinking GC?  Shrinking GCs happen
> relatively rarely, after some amount of user inactivity or on memory
> pressure.  Because of this it seems that the uncompressed script source
> could outlive the script itself perhaps by a considerable margin.

You understand correctly, and that's a great catch. I should be removing any enqueued ScriptSources when their refcounts go to 0.

> 
> Would it be better to kick off compression tasks after a normal GC instead?
> 
> Also, maybe we could start the compression at the end of GC and ignore
> compression tasks for scripts that have died (I'm thinking by checking the
> reference count on the ScriptSource).  And looking at that, I think we need
> to add a reference to the ScriptSource while it's held by the compression
> task.

Also, note that many JSScripts share the same ScriptSource. I did intend to have the compression trigger be "either idle or there's memory pressure". I want to err on the side of more responsiveness, which is why I chose shrinking GC.

I don't know how to best compare the tradeoffs between every shrinking GC and every normal, major GC. Any ideas?
Flags: needinfo?(shu)
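
The refcounting fix agreed on above can be sketched with hypothetical minimal types (RefCountedSource and pendingTask are illustrative; the real code is ScriptSource::incref/decref and SourceCompressionTask): the enqueued task holds a reference, and when the refcount drops to that single task-held reference, no script can ever observe the compressed result, so the task is cancelled and the source dies with its last script instead of surviving until the next GC.

```cpp
#include <cassert>
#include <cstddef>

struct CompressionTask {};

struct RefCountedSource {
    size_t refs = 0;
    CompressionTask* pendingTask = nullptr;

    void incref() { ++refs; }

    // Returns true when the caller should free the source.
    bool decref() {
        assert(refs != 0);
        --refs;
        // Only the compression task still references us: compressing a
        // dead source is wasted work, so cancel it.
        if (refs == 1 && pendingTask)
            cancelPendingTask();
        return refs == 0;
    }

    void cancelPendingTask() {
        // A real implementation would also remove the task from the
        // helper-thread queue here.
        pendingTask = nullptr;
        --refs;  // drop the task's reference
    }
};
```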
(In reply to Shu-yu Guo [:shu] from comment #8)
> I did intend to have the
> compression trigger be "either idle or there's memory pressure". I want to
> err on the side of more responsiveness, which is why I chose shrinking GC.

In that case shrinking GC sounds OK, as long as you know it's probably common for this not to happen at all in normal use (the default inactivity timeout is 5 minutes).

> I don't know how to best compare the tradeoffs between every shrinking GC
> and every normal, major GC. Any ideas?

Since the compression happens off thread I guess the tradeoff is mainly how much memory we save vs. how often we need to decompress and how long that takes.  I suspect that compressing on normal GC will improve the former and not make much difference to the latter, but I don't know without testing.

Memory-wise you could compare builds on areweslimyet to see if it makes a noticeable difference there.
(In reply to Nicolas B. Pierron [:nbp] from comment #6)
> (In reply to Shu-yu Guo [:shu] from comment #0)
> > ScriptSource compression interacts poorly with LazyScripts. Currently, we
> > eagerly fire off an off-thread CompressionTask and block on it finishing
> > after completing parsing.
> 
> I think the problem is blocking, not the fact that we attempt to compress.

For responsiveness the problem is that all delazifications now need to block on decompression. Blocking during bytecode compilation is almost never a responsiveness problem, since compression finishes (in parallel) much faster than parsing itself.
(In reply to Jan de Mooij [:jandem] from comment #5)
> This is awesome.
> 
> Would it make sense to have maxCompressionThreads return a number smaller
> than threadCount? To make sure a shrinking GC is not going to kick off a
> large number of compression tasks that would keep the helper threads from
> doing other GC/compilation work.

Yeah, I should also change maxCompressionThreads to use fewer threads. The patch currently defers to higher-priority Ion work after every compression task so as not to hold up more important work.

After talking with Jon I'll try to see how doing it per major GC instead of shrinking GC works out, to make the work more incremental.
Comment on attachment 8848677 [details] [diff] [review]
Handle compression tasks only with shrinking GC; attach finished tasks with non-urgent interrupt.

Review of attachment 8848677 [details] [diff] [review]:
-----------------------------------------------------------------

Cancelling r? while working on new version.
Attachment #8848677 - Flags: review?(jcoppeard)
As discussed, issues addressed:

- Refcount the ScriptSources
- Start compressing on every major GC instead of shrinking GC
- Limit to 1 compression thread
Attachment #8853140 - Flags: review?(jcoppeard)
Attachment #8848677 - Attachment is obsolete: true
Comment on attachment 8853140 [details] [diff] [review]
Handle compression tasks only with major GC; attach finished tasks with non-urgent interrupt.

Cancelling review again because it's not dealing with off-thread compiles correctly.
Attachment #8853140 - Flags: review?(jcoppeard)
Now with fixes for off-thread compiles.

SourceCompressionTasks will hold on to the JSRuntime (ScriptSources are
malloced and use immutable strings from the JSRuntime-wide immutable string
table, so they're tied to JSRuntimes in that sense). When an off-thread
compression finishes and is ready to be attached, a non-urgent interrupt is
requested on the runtime's active context (if any). If there is no active
context, nothing more is done. I *think* this is okay, because it'll just end
up waiting until the next interrupt to attach.
Attachment #8853194 - Flags: review?(jcoppeard)
Attachment #8853140 - Attachment is obsolete: true
Comment on attachment 8853194 [details] [diff] [review]
Handle compression tasks only with major GC; attach finished tasks with non-urgent interrupt.

Review of attachment 8853194 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the updates.

::: js/src/vm/HelperThreads.h
@@ +92,5 @@
> +    // SourceCompressionTasks are enqueued by parsing but are not processed
> +    // until a major GC occurs. When all the enqueued tasks are finished,
> +    // non-urgent interrupts are requested on the triggering contexts and this
> +    // is reset to false.
> +    bool handleCompressionTasks_;

It sounds like it would be possible for processing of all compression tasks to take longer than the first gc slice.  This means that any new scripts created could have their source compressed immediately if handleCompressionTasks_ was still true when we returned to the mutator.

@@ +681,5 @@
>  extern bool
>  OffThreadParsingMustWaitForGC(JSRuntime* rt);
>  
> +// Compression tasks are heap allocated and enqueued in setSourceCopy. The
> +// tasks themselves are not processed until a shrinking GC occurs. Upon

s/shrinking// here.
Attachment #8853194 - Flags: review?(jcoppeard) → review+
Sorry for the churn. Jon's review comment made me realize I'm better off
handling all aspects of source compression as part of the GC instead of
triggering non-urgent interrupts.

The schema used in this version is in the block comment above
SourceCompressionTask.

Steve, since Jon's away on PTO next week, could you review?
Attachment #8853592 - Flags: review?(sphink)
Attachment #8853194 - Attachment is obsolete: true
Comment on attachment 8853592 [details] [diff] [review]
Handle compression tasks only with major GC; attach finished tasks with non-urgent interrupt.

Review of attachment 8853592 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/HelperThreads.cpp
@@ +1776,5 @@
> +        SourceCompressionTask* task = list[i];
> +        if (task->runtimeMatches(runtime))
> +            js_delete(task);
> +    }
> +    list.clear();

Oops, this is wrong. I'll fix this to use HelperThreadState().remove.
Assignee: nobody → shu
Here's 10x CodeLoad after/before numbers:

Deferred compression:
CodeLoad: 17904
CodeLoad: 17857
CodeLoad: 17418
CodeLoad: 17842
CodeLoad: 17383
CodeLoad: 17847
CodeLoad: 17931
CodeLoad: 17833
CodeLoad: 17867
CodeLoad: 17663
CodeLoad: 17866

Eager compression:
CodeLoad: 16412
CodeLoad: 16395
CodeLoad: 16543
CodeLoad: 16438
CodeLoad: 16455
CodeLoad: 16309
CodeLoad: 16498
CodeLoad: 16529
CodeLoad: 16579
CodeLoad: 16181
CodeLoad: 16381
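
Averaging the runs above (numbers copied verbatim; Octane reports a score, so higher is better) gives roughly an 8% improvement in the shell; a throwaway summary, not part of the patch:

```cpp
#include <numeric>
#include <vector>

// Runs copied from the comment above.
const std::vector<double> deferredRuns{17904, 17857, 17418, 17842, 17383, 17847,
                                       17931, 17833, 17867, 17663, 17866};
const std::vector<double> eagerRuns{16412, 16395, 16543, 16438, 16455, 16309,
                                    16498, 16529, 16579, 16181, 16381};

double mean(const std::vector<double>& runs) {
    return std::accumulate(runs.begin(), runs.end(), 0.0) / runs.size();
}

// mean(deferredRuns) is ~17765 and mean(eagerRuns) is ~16429, so deferred
// compression scores about 8% higher on this machine.
```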
Whiteboard: [qf]
Comment on attachment 8853592 [details] [diff] [review]
Handle compression tasks only with major GC; attach finished tasks with non-urgent interrupt.

Review of attachment 8853592 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsscript.h
@@ +658,5 @@
>          if (ss)
>              ss->decref();
>          ss = newss;
> +        if (ss)
> +            ss->incref();

Wow... How was this null-check missing before? Was that my fault? Probably :-/

::: js/src/vm/HelperThreads.cpp
@@ +929,5 @@
>          return 1;
> +
> +    // Compression is triggered on major GCs to compress ScriptSources. It is
> +    // considered low priority work.
> +    return 1;

This function is always returning 1 now, perhaps it should do that without the branching.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #20)
> Comment on attachment 8853592 [details] [diff] [review]
> Handle compression tasks only with major GC; attach finished tasks with
> non-urgent interrupt.
> 
> Review of attachment 8853592 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/vm/HelperThreads.cpp
> @@ +929,5 @@
> >          return 1;
> > +
> > +    // Compression is triggered on major GCs to compress ScriptSources. It is
> > +    // considered low priority work.
> > +    return 1;
> 
> This function is always returning 1 now, perhaps it should do that without
> the branching.

Ah yeah, good catch.
Whiteboard: [qf] → [qf:p1]
Comment on attachment 8853592 [details] [diff] [review]
Handle compression tasks only with major GC; attach finished tasks with non-urgent interrupt.

Review of attachment 8853592 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the fix to remove only entries matching the runtime, as you pointed out.

::: js/src/jsscript.h
@@ +485,5 @@
>          MOZ_ASSERT(refs != 0);
> +        --refs;
> +        if (refs == 1 && compressionTask_) {
> +            // If the source is enqueued to be compressed, the compression
> +            // tasks holds on to the ScriptSource. But this isn't a real use:

*task

@@ +658,5 @@
>          if (ss)
>              ss->decref();
>          ss = newss;
> +        if (ss)
> +            ss->incref();

Whoa. If this function is going to be decref'ing before incref'ing, then there should be a MOZ_ASSERT(ss != newss) at the top. Otherwise, you could either get a UAF crash or you could cancel the compression of a live ScriptSource.

But it seems like it'd be better to just incref first.

::: js/src/vm/HelperThreads.cpp
@@ +1785,5 @@
> +{
> +    AutoLockHelperThreadState lock;
> +
> +    if (!HelperThreadState().threads)
> +        return;

I found this confusing until I realized it meant the helper thread state is uninitialized. (It looked like it was checking that there were zero threads, and I worried that a thread dying might put you into this state or something.)

Now that I know what's going on, I'm not sure if it rates a comment or not.
Attachment #8853592 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink] [:s:] from comment #22)
> Comment on attachment 8853592 [details] [diff] [review]
> Handle compression tasks only with major GC; attach finished tasks with
> non-urgent interrupt.
> 
> Review of attachment 8853592 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with the fix to remove only entries matching the runtime, as you
> pointed out.
> 
> 
> @@ +658,5 @@
> >          if (ss)
> >              ss->decref();
> >          ss = newss;
> > +        if (ss)
> > +            ss->incref();
> 
> Whoa. If this function is going to be decref'ing before incref'ing, then
> there should be a MOZ_ASSERT(ss != newss) at the top. Otherwise, you could
> either get a UAF crash or you could cancel the compression of a live
> ScriptSource.
> 
> But it seems like it'd be better to just incref first.
> 
Dang, good call.
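
The hazard sfink points out, sketched with minimal hypothetical types (Source and Holder are illustrative, not the real ScriptSourceHolder): if the setter decrefs before increfing, self-assignment can drop the last reference and free, or cancel compression of, a live source; increfing first is safe even when newss == ss.

```cpp
#include <cassert>

struct Source {
    int refs = 1;
    void incref() { ++refs; }
    void decref() { assert(refs != 0); --refs; }
};

struct Holder {
    Source* ss = nullptr;

    void setSource(Source* newss) {
        if (newss)
            newss->incref();  // incref first: safe even when newss == ss
        if (ss)
            ss->decref();
        ss = newss;
    }
};
```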
One thing this patch doesn't do is attempt to compress XDR-decoded sources. Should do that as a follow-up.
Found another bug during testing (described in the comments in the patch).
Basically we need a Rooted analog for char pointers handed out from the
uncompressed buffer, because the buffer could get freed out from under us by
a GC during an interrupt.
Attachment #8856347 - Flags: review?(jcoppeard)
Attachment #8856347 - Flags: review?(jcoppeard) → review+
Pushed by shu@rfrn.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f6017c9156d
Stop eagerly compressing ScriptSources. (r=jonco)
https://hg.mozilla.org/integration/mozilla-inbound/rev/536a322e2985
Handle compression tasks with major GCs instead of eagerly. (r=sfink,jonco)
https://hg.mozilla.org/integration/mozilla-inbound/rev/1dc3aef24017
Pin chars returned from ScriptSource as an analog to Rooted. (r=jonco)
Pushed by shu@rfrn.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/53236678bcde
Stop eagerly compressing ScriptSources. (r=jonco)
https://hg.mozilla.org/integration/mozilla-inbound/rev/c7955d6d0759
Handle compression tasks with major GCs instead of eagerly. (r=sfink,jonco)
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ff285174c16
Pin chars returned from ScriptSource as an analog to Rooted. (r=jonco)
These patches improved Octane-CodeLoad by 10% on the Windows machine on AWFY, 30% on Mac and 67% on Android.
Slower machines are more affected by the change, maybe?
Depends on: 1357012
Performance Impact: --- → P1
Whiteboard: [qf:p1]