Closed Bug 1304390 Opened 4 years ago Closed 4 years ago

Compress/decompress script sources in chunks


(Core :: JavaScript Engine, defect)




Tracking Status
firefox52 --- fixed


(Reporter: jandem, Assigned: jandem)




(1 file, 1 obsolete file)

Attached patch Patch (obsolete) — Splinter Review
JS files can be large, and not only for asm.js: Gmail, Google Docs, Outlook, etc. all use JS files that are at least 1 MB. Because we inflate to two-byte chars, that's at least 2-4 MB internally.

For these huge files, I've seen decompression times like 5-15 ms, and that's on a very fast machine. What's worse, we often decompress the same files repeatedly (when delazifying functions for instance). We have the UncompressedSourceCache, but it's purged on GC.

The attached patch changes the compressor to do a full flush (Z_FULL_FLUSH) after compressing 64 KB chunks of data. At this point we record the offset (in the compressed buffer), and these offsets then let us decompress individual 64 KB chunks.
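The scheme can be sketched with Python's stdlib zlib bindings (a minimal model of the approach, not the actual SpiderMonkey C++ code; the function names are illustrative). Raw deflate (negative wbits) is used so a fresh inflater can start at any full-flush point:

```python
import zlib

CHUNK_SIZE = 64 * 1024

def compress_chunked(data, chunk_size=CHUNK_SIZE):
    # Raw deflate (wbits=-15) so each chunk can later be inflated without
    # a zlib header.
    comp = zlib.compressobj(zlib.Z_BEST_SPEED, zlib.DEFLATED, -15)
    out = bytearray()
    offsets = [0]  # start offset of each chunk in the compressed buffer
    for i in range(0, len(data), chunk_size):
        out += comp.compress(data[i:i + chunk_size])
        # Z_FULL_FLUSH resets the window: no back-references cross the
        # chunk boundary, so each chunk decompresses independently.
        out += comp.flush(zlib.Z_FULL_FLUSH)
        offsets.append(len(out))
    out += comp.flush(zlib.Z_FINISH)
    return bytes(out), offsets

def decompress_chunk(compressed, offsets, i):
    # A fresh raw inflater decodes chunk i in isolation, starting at the
    # recorded offset.
    d = zlib.decompressobj(-15)
    return d.decompress(compressed[offsets[i]:offsets[i + 1]])
```

The recorded offsets are what makes random access possible: to read one 64 KB region of the source, only the chunks covering that region are inflated.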

Doing a full flush after each 64 KB chunk makes the compressed data slightly larger, but the difference is small. I tried a number of JS files (jQuery, some huge Gmail/Gdocs JS files) and with 64 KB chunks the size difference is 1-3%. With 32 KB chunks, it's 4-5% so that's more significant. These numbers include the chunk offsets we have to save.

The UncompressedSourceCache now caches individual chunks. When the region we're interested in spans multiple chunks, we do a lookup for each chunk and copy the parts we need into a new temporary buffer. This case is pretty rare.
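The multi-chunk case can be sketched as follows (a hypothetical Python illustration; `get_chunk` stands in for the per-chunk cache lookup, which in the real code hits the UncompressedSourceCache and decompresses on a miss):

```python
CHUNK_SIZE = 64 * 1024

def source_range(get_chunk, begin, length):
    """Return bytes [begin, begin + length) of the uncompressed source."""
    first_chunk, first_off = divmod(begin, CHUNK_SIZE)
    last_chunk = (begin + length - 1) // CHUNK_SIZE
    if first_chunk == last_chunk:
        # Common case: the range lives in a single chunk; just slice it.
        return get_chunk(first_chunk)[first_off:first_off + length]
    # Rare case: stitch the needed parts of each chunk into a new buffer.
    parts = [get_chunk(first_chunk)[first_off:]]
    for i in range(first_chunk + 1, last_chunk):
        parts.append(get_chunk(i))
    last_off = (begin + length - 1) % CHUNK_SIZE
    parts.append(get_chunk(last_chunk)[:last_off + 1])
    return b"".join(parts)
```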

It seems to work well in practice. On the Emscripten ammo.js benchmark [1] for instance, we currently decompress 3.8 MB of data each time I click "go!". With the patch I'm attaching, we decompress just one or two 64 KB chunks \o/

What's nice about this is that even if we do have to decompress all chunks, it's still less janky because the main thread now gets a chance to run JS, fire interrupt callbacks, etc instead of being stuck in a single inflate() call for a long time.

Finally, this patch reverts the 3rd patch in bug 1219098: really huge files (> 5 MB or so) are currently decompressed only once (and then we replace the compressed source). Now that we can decompress in chunks, this should no longer be necessary and we can keep the compressed source.

Attachment #8793320 - Flags: review?(luke)
Attached patch Patch — Splinter Review
This fixes an issue caught by Valgrind: we left some padding bytes uninitialized. That's not a problem by itself, except that the ImmutableStringsCache hashes these bytes. This patch just initializes them to zero.
Attachment #8793320 - Attachment is obsolete: true
Attachment #8793320 - Flags: review?(luke)
Attachment #8793364 - Flags: review?(luke)
(In reply to Jan de Mooij [:jandem] from comment #0)
> I tried a number of JS files (jQuery,
> some huge Gmail/Gdocs JS files) and with 64 KB chunks the size difference is
> 1-3%. With 32 KB chunks, it's 4-5% so that's more significant. These numbers
> include the chunk offsets we have to save.

Here's a graph of these numbers, FWIW:

It shows that compression helps a lot, even on minified code, and the difference between No_chunks and 64_KB_Chunks is very small.
Great work, jandem :)
Comment on attachment 8793364 [details] [diff] [review]

Review of attachment 8793364 [details] [diff] [review]:

Really nice patch, very easy to read; great work!  So good to finally get this!

::: js/src/jsscript.cpp
@@ +1589,5 @@
> +    // Determine which chunk(s) we are interested in. Note that this also
> +    // converts firstByte and lastByte to offsets within the chunk.
> +    size_t firstChunk, lastChunk;
> +    Compressor::toChunkOffset(&firstByte, &firstChunk);
> +    Compressor::toChunkOffset(&lastByte, &lastChunk);

I'd change the toChunkOffset interface to have signature (size_t uncompressedOffset, size_t* chunk, size_t* chunkOffset) so then it's pretty easy to see the dataflow and no need to comment.
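The suggested interface boils down to a divmod (a hypothetical sketch of the reviewer's proposed dataflow, not the actual patch; C++ would return the two results through out-parameters rather than a tuple):

```python
CHUNK_SIZE = 64 * 1024

def to_chunk_and_offset(uncompressed_offset):
    # Offset in; (chunk index, offset within that chunk) out.
    return divmod(uncompressed_offset, CHUNK_SIZE)
```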

@@ -1565,5 @@
> -            // relazification, this can happen repeatedly, so conservatively go
> -            // back to storing the data uncompressed to avoid wasting too much
> -            // time yo-yoing back and forth between compressed and uncompressed.
> -            const size_t HUGE_SCRIPT = 5 * 1024 * 1024;
> -            if (lengthWithNull > HUGE_SCRIPT) {

On a side note, and definitely for another bug: I was looking at the other remaining limitations in ScriptSource::setSourceCopy and I bet we can remove the cpuCount>1 limitation. Most devices have more than one core, so we're probably not being benchmarked on single-core ones, and achieving the huge memory savings on what are probably old/weak devices seems more important.

@@ -1778,5 @@
>            case Compressor::OOM:
>              return OOM;
>          }
>      }
> -    size_t compressedBytes = comp.outWritten();

Could you remove outWritten() so there's no ambiguity in the interface as to which to use?  I see there is one other use in the MOREOUTPUT case; it looks suspicious given that it is using equality, not >=, and I don't expect that to ever be a problem in practice given the nature of text, so I'd guess it could be removed.

::: js/src/jsscript.h
@@ +502,5 @@
> +    // |*begin + len|. If the chars are compressed, we may decompress only
> +    // certain chunks, so in that case |*begin| is adjusted to be relative to
> +    // the start of the returned string.
> +    const char16_t* chars(JSContext* cx, UncompressedSourceCache::AutoHoldEntry& asp,
> +                          size_t* begin, size_t len);

IIUC, all callers end up adding the returned char16_t* to *begin to achieve the desired pointer to the substring, so could we simplify the interface here and make begin/len input/value parameters and have the return value be the pointer to the substring?

::: js/src/vm/Compression.cpp
@@ +118,3 @@
>      outbytes += zs.next_out - oldout;
> +    currentChunkSize += zs.next_in - oldin;
> +    MOZ_ASSERT(currentChunkSize <= CHUNK_SIZE);

Could you also add a general assert that matches chunkSize() up with the current state after deflate()?

@@ +145,5 @@
> +Compressor::totalBytesNeeded(uint32_t* bytes) const
> +{
> +    mozilla::CheckedInt<uint32_t> checkedBytes = AlignBytes(outbytes, sizeof(uint32_t));
> +    checkedBytes += sizeOfChunkOffsets();
> +    if (!checkedBytes.isValid())

Could you return a size_t?  Then this method could be infallible, iiuc.

::: js/src/vm/Compression.h
@@ +92,5 @@
>  bool DecompressString(const unsigned char* inp, size_t inplen,
>                        unsigned char* out, size_t outlen);
> +bool DecompressStringChunk(const unsigned char* inp, size_t chunk,
> +                           unsigned char* out, size_t outlen);

Could you add a comment for this function describing parameters and usage?
Attachment #8793364 - Flags: review?(luke) → review+
Thanks for the review, great suggestions. I'll update the patch.
Pushed by
Compress/decompress script sources in chunks. r=luke
This was a 1.5% (x64) and 2.1% (x86) win on Kraken on AWFY. 6-8% on imaging-darkroom and some smaller wins on other tests. These tests use JS files with huge arrays, and we no longer have to decompress all that :)
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Depends on: 1306963