Closed Bug 686805 Opened 13 years ago Closed 12 years ago

Do incremental zip decompression in our linker

Categories

(Core :: mozglue, defect)

ARM
Android
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13

People

(Reporter: taras.mozilla, Assigned: glandium)

References

Details

(Keywords: feature, perf, Whiteboard: [Snappy:p1])

Attachments

(7 files, 18 obsolete files)

3.15 KB, patch
taras.mozilla
: review+
Details | Diff | Splinter Review
11.60 KB, patch
glandium
: review+
Details | Diff | Splinter Review
9.38 KB, patch
taras.mozilla
: review+
Details | Diff | Splinter Review
22.20 KB, patch
taras.mozilla
: review+
Details | Diff | Splinter Review
9.14 KB, patch
glandium
: review+
Details | Diff | Splinter Review
3.92 KB, patch
glandium
: review+
Details | Diff | Splinter Review
4.10 KB, patch
glandium
: review+
Details | Diff | Splinter Review
In order to improve startup responsiveness we should decompress libxul incrementally(via segfault handler)
Depends on: 686801
Keywords: perf
Blocks: 674579
Assignee: nobody → mh+mozilla
Target Milestone: --- → mozilla9
Version: unspecified → Trunk
Target Milestone: mozilla9 → ---
Whiteboard: [Snappy]
Whiteboard: [Snappy] → [Snappy:p1]
Depends on: 683127
Taras, yell if this is the wrong place for this patch.

Adds concurrent segfault based library decompression to Mike Hommey's
new runtime linker patch in bug 683127.  Is very much a WIP and is 
based on an older version of Mike's linker (27 Dec).
A program to generate seekable compressed streams for use by the
linker mod in previous patch.  Not integrated into the build system.
Uses a block size of 16KB at the moment.
(In reply to Julian Seward from comment #1)
> Created attachment 586035 [details] [diff] [review]
> Adds concurrent decompression of seekable-compressed libraries (WIP)

Some additional comments:

For each library in seekable-compressed-stream (SZS) format, it creates a
child thread to decompress the library at the time it is dlopened.  The
child thread decompresses the library in ascending order of address space
(ie, sequentially), but cooperates with a segfault handler to detect cases
where firefox (or the linker) accesses pages of the library that have not
yet been decompressed.  In such cases it gives priority to these out-of-order
blocks and decompresses them immediately, so as to service the fault as
quickly as possible.  Then goes back to sequential decompression.

This succeeds in loading and starting Firefox on x86_64-linux.

It is though pretty hairy:

* if anybody else installs a segfault handler, we are hosed: segfaults
  resulting from accesses to not-yet-decompressed pages are not handled
  by the decompress mechanism, and generally cause Firefox to exit early.

* if Firefox forks, we are hosed: only the thread that calls fork exists
  in the child process.  In particular, all the decompress threads are missing,
  which means any segfaults that happen in the child can never lead to page
  decompression, and the child hangs, sometimes leading to the parent hanging too.

* there may be other unknown instability; I am unclear how reliable this is.

In addition, it seriously messes with our toolery:

* because the libraries are not loaded in a "standard" way, GDB and
  Valgrind do not generate sane stack traces as they cannot find the
  relevant debuginfo objects.  This makes debugging difficult.

* for reasons unknown to me, even though this appears to work OK on the
  command line, when run under GDB it segfaults.

* it breaks Valgrind (it is unusable) since V's JIT takes segfaults
  when jitting application code on as-yet decompressed pages, and V has as yet
  no way to route such segfaults back to the appropriate handler.  This means
  it is impossible to check for memory corruption (V/Memcheck) or exit-time
  races (V/Helgrind) right now.
This is a kludge to make it work on desktop Linux.  A better solution w.r.t.
the conflict between segfault handlers is needed.
OS: Mac OS X → Android
Hardware: x86 → ARM
Keywords: feature
This integrates the program in the build system, and C++ize it a bit.
Attachment #586037 - Attachment is obsolete: true
Depends on: 720704
Attachment #591396 - Attachment is obsolete: true
Attachment #591396 - Flags: review?
Attachment #591443 - Flags: review? → review?(taras.mozilla)
Attachment #591397 - Attachment is obsolete: true
Attachment #591397 - Flags: review?
Attachment #591443 - Flags: review?(taras.mozilla) → review+
Comment on attachment 591450 [details] [diff] [review]
part 2 - Use a SIGSEGV signal handler to handle segmentation faults happening in loaded libraries address space

rubberstamp
Attachment #591450 - Flags: review?(taras.mozilla) → review+
Comment on attachment 591450 [details] [diff] [review]
part 2 - Use a SIGSEGV signal handler to handle segmentation faults happening in loaded libraries address space

Sadly, this patch breaks the crash reporter on Android. Apparently, the dump process gets a SIGILL, which brings it back in the signal handler, and dead locks on a mutex in breakpad.
(In reply to Mike Hommey [:glandium] from comment #12)
> Sadly, this patch breaks the crash reporter on Android. Apparently, the dump
> process gets a SIGILL, which brings it back in the signal handler, and dead
> locks on a mutex in breakpad.

It was in fact a bug in breakpad.
Depends on: 721321
Attachment #591450 - Flags: review?(jseward)
Attachment #591450 - Flags: review?(jseward) → review+
This essentially changes one thing:

-      elf->mappable->prefetch(info->si_addr);
-      return;
+      if (elf->mappable->prefetch(info->si_addr))
+        return;

This allows to rethrow the segmentation fault if the linker couldn't
handle it, instead of deadlocking.
Attachment #591450 - Attachment is obsolete: true
Attached patch WIP part 3 & 4 (obsolete) — Splinter Review
This combines what will be part 3 and 4 once everything is cleaned up (especially comments, some don't even match what's happening anymore). This started off "Rebase against bug 683127 patches of 6 Jan (apply second)", replaced some types, used some things from Utils.h, made it more C++ish, and removed the background decompression, which in the end, was not a win. Decompression is now within the handler as discussed on IRC.
At this point, I think the remaining changes are mainly cosmetic, so I'm seeking for pre-review, here.
I'll push to try to get a build for people to test on different devices.
Attachment #586959 - Attachment is obsolete: true
Attachment #586961 - Attachment is obsolete: true
Attachment #589491 - Attachment is obsolete: true
Attachment #591871 - Flags: feedback?(jseward)
Ah, one thing I want to change is the mprotect in MappableSZ::handle_fault, to make it apply on everything that is decompressed instead of just the page that faulted. And I want to add an equivalent to MappableExtractFile (which would extract the stream in a file, completely, and then map the file, for the debug intent)
Comment on attachment 591871 [details] [diff] [review]
WIP part 3 & 4

Seems fine.  Some minor comments:

+      } else if (s.GetType() == Zip::Stream::STORE) {
+        mappable = MappableSZ::Create(name, zip, &s);

This logic assumes that DEFLATE vs STORE is a reliable indication
of non-seekable vs seekable zip.  I guess that's OK since it's
very unlikely that a compress attempt for a seekable zip would
find it to be so uncompressable that it would up being STOREd.


+SZStream::Init (void* ptr)

General comment: I think there may be unaligned accesses when
fetching bits of the seekable zstream.  Is that OK?  I think they
are ok on all important archs (x86, x86_64, armv6 and later)


+   strm.avail_in  = 999999; // FIXME

This assumes that the (un)compressed block size <= 999999;
I suppose that's OK.


+// This is the core of the thing.  It contains the segfault handler
+// and the decompression function.  These run in separate threads.

(eg) some of the comments are a bit out of date now, considering there
is no separate decompress thread.
Attachment #591871 - Flags: feedback?(jseward) → feedback+
(In reply to Julian Seward from comment #18)
> Comment on attachment 591871 [details] [diff] [review]
> WIP part 3 & 4
> 
> Seems fine.  Some minor comments:
> 
> +      } else if (s.GetType() == Zip::Stream::STORE) {
> +        mappable = MappableSZ::Create(name, zip, &s);
> 
> This logic assumes that DEFLATE vs STORE is a reliable indication
> of non-seekable vs seekable zip.  I guess that's OK since it's
> very unlikely that a compress attempt for a seekable zip would
> find it to be so uncompressable that it would up being STOREd.

Seekable zips are *not* meant to be DEFLATEd, and plain libraries are *not* meant to be STOREd. Both cases won't work, they're supported by no code. They will gracefully fail as well, since the ::Create methods will properly fail.

> +SZStream::Init (void* ptr)
> 
> General comment: I think there may be unaligned accesses when
> fetching bits of the seekable zstream.  Is that OK?  I think they
> are ok on all important archs (x86, x86_64, armv6 and later)

The le_uint32 type used in SZSHeader is automagically read properly.

> 
> 
> +   strm.avail_in  = 999999; // FIXME
> 
> This assumes that the (un)compressed block size <= 999999;
> I suppose that's OK.

I guess we could use avail_out * 2 to make it more ok.

> +// This is the core of the thing.  It contains the segfault handler
> +// and the decompression function.  These run in separate threads.
> 
> (eg) some of the comments are a bit out of date now, considering there
> is no separate decompress thread.

Yeah, that's what I wrote in comment 15, a lot of the comments need to be changed :)
Comment on attachment 591776 [details] [diff] [review]
part 2 - Use a SIGSEGV signal handler to handle segmentation faults happening in loaded libraries address space

carrying over r+, which i had forgotten to do.
Attachment #591776 - Flags: review+
Depends on: 717540
Component: Widget: Android → mozglue
QA Contact: android → mozglue
Attachment #595750 - Flags: review?(jseward)
Attachment #595750 - Flags: review?(taras.mozilla)
Attachment #595751 - Flags: review?(jseward)
Attachment #595750 - Attachment is obsolete: true
Attachment #595750 - Flags: review?(taras.mozilla)
Attachment #595750 - Flags: review?(jseward)
Attachment #595751 - Flags: review?(taras.mozilla)
Comment on attachment 595751 [details] [diff] [review]
part 3 - Tool to generate seekable compressed streams

I am going to assume the le_to_cpu stuff just works.


+  if (argc != 3 || !argv[1] || !argv[2] || (strcmp(argv[1], argv[2]) == 0)) {
+    char *lastSlash = rindex(argv[0], '/');
+    log("usage: %s file_to_compress out_file",
+                    lastSlash ? lastSlash + 1 : argv[0]);
+    return 1;
+  }

Really? Is it so bad to display a relative/absolute filename in an error?

+  struct stat st;
+  int ret = fstat(unzF, &st);
+  if (ret == -1) {
+    log("Couldn't seek %s: %s", argv[1], strerror(errno));
+    return 1;
+  }

wrong error message

+  /* Give enough room for the header and the offset table, and map them */
+  ftruncate(outF, zOff)

In theory fallocate is more efficient for this.


The variable naming in this patch is terrible:szsh(can we use zHeader or something?), SZSHeader(don't use abbrevs in type names please), s/outF/outFD/, strm(we aren't Egyptians, vowels are encouraged).

+  static const uint32_t magic; // = 0x7a5a6553;
+                                /* 'S' 'e' 'Z' - Seekable Zstream
+                                 * 'z'         - chunks compressed by zlib. */
+
Are we inventing a new zip stream format? ie is this a custom new magic?


+#pragma pack(1) <-- is this supposed to compile on windows, if not I prefer to use the GNU packed attribute on the class.
Attachment #595751 - Flags: review?(taras.mozilla) → review-
(In reply to Taras Glek (:taras) from comment #23)
> The variable naming in this patch is terrible:szsh(can we use zHeader or
> something?), SZSHeader(don't use abbrevs in type names please),
> s/outF/outFD/, strm(we aren't Egyptians, vowels are encouraged).

Looks like I didn't change enough variable names from Julian's original implementation. :)

> +  static const uint32_t magic; // = 0x7a5a6553;
> +                                /* 'S' 'e' 'Z' - Seekable Zstream
> +                                 * 'z'         - chunks compressed by zlib.
> */
> +
> Are we inventing a new zip stream format? ie is this a custom new magic?

It's just a file format that happens to start with a magic number, and there is code to handle that in Zip.h, so I though it was better to reuse it.
 
> +#pragma pack(1) <-- is this supposed to compile on windows, if not I prefer
> to use the GNU packed attribute on the class.

It's not supposed to compile on windows now, but may some day. Zip.h also uses pragma pack, it makes things consistent at the faulty.lib level.
Comment on attachment 595751 [details] [diff] [review]
part 3 - Tool to generate seekable compressed streams

Looks ok from a functional point of view.  Minor paranoia: in the
chunk compression loop, maybe assert that we didn't overrun the
output buffer for any reason?  Since it is sized as 2 * CHUNK we
should never get anywhere near zero for strm.avail_out.

+    ret = deflate(&strm, Z_FINISH);
+    MOZ_ASSERT(ret == Z_STREAM_END);
     MOZ_ASSERT(strm.avail_out > 0);
Attachment #595751 - Flags: review?(jseward) → review+
(In reply to Mike Hommey [:glandium] from comment #24)
> (In reply to Taras Glek (:taras) from comment #23)
> > The variable naming in this patch is terrible:szsh(can we use zHeader or
> > something?), SZSHeader(don't use abbrevs in type names please),
> > s/outF/outFD/, strm(we aren't Egyptians, vowels are encouraged).
> 
> Looks like I didn't change enough variable names from Julian's original
> implementation. :)

I suspected that, please change them. Whatever usual naming convention you use is ok.
> 
> > +  static const uint32_t magic; // = 0x7a5a6553;
> > +                                /* 'S' 'e' 'Z' - Seekable Zstream
> > +                                 * 'z'         - chunks compressed by zlib.
> > */
> > +
> > Are we inventing a new zip stream format? ie is this a custom new magic?
> 
> It's just a file format that happens to start with a magic number, and there
> is code to handle that in Zip.h, so I though it was better to reuse it.

I was curious if this was a brand new magic number. sounds like a yes.
>  
> > +#pragma pack(1) <-- is this supposed to compile on windows, if not I prefer
> > to use the GNU packed attribute on the class.
> 
> It's not supposed to compile on windows now, but may some day. Zip.h also
> uses pragma pack, it makes things consistent at the faulty.lib level.

ok.
Attachment #591776 - Attachment is obsolete: true
Comment on attachment 597338 [details] [diff] [review]
part 2 - Use a SIGSEGV signal handler to handle segmentation faults happening in loaded libraries address space

Carrying over r+
Attachment #597338 - Flags: review+
This is quite different from the previous iterations. While jseward's implementation was meant to be generic and may have been used for mostly anything besides the linker, this version takes some shortcuts due to what we know about libraries. It basically makes the same assumptions as the one made in MappableDeflate.
There is also an implementation of full extraction for the debug intent.
Attachment #597340 - Flags: review?(jseward)
Attachment #591871 - Attachment is obsolete: true
Attachment #597340 - Flags: review?(taras.mozilla)
This doesn't enable building with seekable compressed streams by default. One needs to export SZIP_LIBRARIES=1 in .mozconfig in order to actually enable it.
Attachment #597349 - Flags: review?(blassey.bugs)
Attachment #595751 - Attachment is obsolete: true
Comment on attachment 597340 [details] [diff] [review]
part 4 - Make the linker load libraries with on-demand decompression when they are seekable compressed streams

What is it that MappableSeekableZStream::mutex protects?  
It appears to protect ::chunkAvail, but shouldn't it also protect
::lazyMaps ?  I don't see what stops
MappableSeekableZStream::{mmap,munmap} messing with ::lazyMaps whilst
MappableSeekableZStream::ensure reads it in a different thread.



+   * When we fault in the remaining pages of that chunk, we want to decompress
+   * the complete chunk again */

we want -> we don't want ?



+#if defined(ANDROID) && defined(__arm__)
+    if (map->prot & PROT_EXEC) {
+      /* We just extracted data that may be executed in the future.
+       * We thus need to ensure Instruction and Data cache coherency. */

isn't this really just "#if defined(__arm__)" ?  Although I guess the
cacheflush call calling conventions etc might be android specific.
(In reply to Julian Seward from comment #32)
> Comment on attachment 597340 [details] [diff] [review]
> part 4 - Make the linker load libraries with on-demand decompression when
> they are seekable compressed streams
> 
> What is it that MappableSeekableZStream::mutex protects?  
> It appears to protect ::chunkAvail, but shouldn't it also protect
> ::lazyMaps ?  I don't see what stops
> MappableSeekableZStream::{mmap,munmap} messing with ::lazyMaps whilst
> MappableSeekableZStream::ensure reads it in a different thread.

What stops it from happening is the way the linker works. This is one of the shortcuts we can safely take in a non-generic implementation.

> +   * When we fault in the remaining pages of that chunk, we want to
> decompress
> +   * the complete chunk again */
> 
> we want -> we don't want ?

we want. If we didn't, we'd end up with nothing between 4k and 16k (which in android libxul.so case would mean a corrupted DT_HASH table).

> +#if defined(ANDROID) && defined(__arm__)
> +    if (map->prot & PROT_EXEC) {
> +      /* We just extracted data that may be executed in the future.
> +       * We thus need to ensure Instruction and Data cache coherency. */
> 
> isn't this really just "#if defined(__arm__)" ?  Although I guess the
> cacheflush call calling conventions etc might be android specific.

The cacheflush function is android specific, and since the code for the corresponding cache flush is not there...
Comment on attachment 597340 [details] [diff] [review]
part 4 - Make the linker load libraries with on-demand decompression when they are seekable compressed streams

> > What is it that MappableSeekableZStream::mutex protects?  
> > It appears to protect ::chunkAvail, but shouldn't it also protect
> > ::lazyMaps ?  I don't see what stops
> > MappableSeekableZStream::{mmap,munmap} messing with ::lazyMaps whilst
> > MappableSeekableZStream::ensure reads it in a different thread.
> 
> What stops it from happening is the way the linker works. This is one of the shortcuts we can safely take > in a non-generic implementation.

Maybe add a comment explaining how the linker's design makes this
safe, for the benefit of future visitors to this code.
Attachment #597340 - Flags: review?(jseward) → review+
Attachment #597389 - Attachment is obsolete: true
Attachment #597389 - Flags: review?(taras.mozilla)
Attachment #597393 - Flags: review?(taras.mozilla)
Comment on attachment 597393 [details] [diff] [review]
part 7 - Hook linker stats in Startup Timeline

For the build bits.
Attachment #597393 - Flags: review?(khuey)
Attachment #597339 - Flags: review?(taras.mozilla) → review+
Comment on attachment 597349 [details] [diff] [review]
part 5 - Optionally make libxul.so a seekable compressed stream on Android

khuey would be a better reviewer
Attachment #597349 - Flags: review?(blassey.bugs) → review?(khuey)
Comment on attachment 597340 [details] [diff] [review]
part 4 - Make the linker load libraries with on-demand decompression when they are seekable compressed streams

This looks nice, not much room for improvements, thus r+. 

+    debug("%s: Calling function @%p", GetPath(), ptr);
     f.func();

leftover debug code?

MappableExtractFile::Create <-- worried about logic duplication here(OO would be safer). could lead to badness in the future. According to IRC this code will go away in a future iteration.


+  size_t chunkEnd = chunkStart + length;
+  std::vector<LazyMap>::iterator it;
+  for (it = map; it < lazyMaps.end(); ++it) {
+    if (chunkEnd <= it->offset + it->length)
+      break;
+  }
+  if ((it == lazyMaps.end()) || (chunkEnd > it->offset + it->length)) {
+    /* The mapping "it" points at now is past the interesting one */
+    --it;
+    length = it->offset + it->length - chunkStart;
+  }

it->offset + it->length used multiple times... Use a casted ->end()..or add a new utility accessor?

i think LazyMap could also use a contains() method 


I would prefer raii mutex usage.

Please comment finalize/ensure in Mappable.h.
Attachment #597340 - Flags: review?(taras.mozilla) → review+
Comment on attachment 597391 [details] [diff] [review]
part 6 - Add functions to display stats about seekable compressed streams

+  for (size_t i = 0, j = 1; i < nEntries; i++, j++) {
+    map[j] = chunkAvail[i] ? '*' : '_';
+    if ((j == len) || (i == nEntries - 1)) {
+      map[j + 1] = ']';
+      map[j + 2] = '\0';
+      debug("%s", static_cast<char *>(map));
+      j = 0;
+    }
+  }

I think this is missing a "break;" to avoid weird overwriting due to j being reset to 0.
Attachment #597391 - Flags: review?(taras.mozilla) → review+
Comment on attachment 597393 [details] [diff] [review]
part 7 - Hook linker stats in Startup Timeline

I expected this patch to be the inverse:ie expose some of the linker timestamps via telemetry.
Attachment #597393 - Flags: review?(taras.mozilla) → review+
(In reply to Taras Glek (:taras) from comment #41)
> I think this is missing a "break;" to avoid weird overwriting due to j being
> reset to 0.

Overwriting is the wanted effect.
(In reply to Taras Glek (:taras) from comment #40)
> Comment on attachment 597340 [details] [diff] [review]
> part 4 - Make the linker load libraries with on-demand decompression when
> they are seekable compressed streams
> 
> This looks nice, not much room for improvements, thus r+. 
> 
> +    debug("%s: Calling function @%p", GetPath(), ptr);
>      f.func();
> 
> leftover debug code?

mmm yes and no. This probably doesn't have its place here, and as such it is a leftover, on the other hand, it's a useful debug output that should have been here in the first place.

> Please comment finalize/ensure in Mappable.h.

They are described in Mappable, where they are inherited from. Isn't that enough?
Comment on attachment 597349 [details] [diff] [review]
part 5 - Optionally make libxul.so a seekable compressed stream on Android

Review of attachment 597349 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/installer/packager.mk
@@ +308,5 @@
>    removed-files \
>    recommended-addons.json \
>    $(NULL)
>  
> +ifdef SZIP_LIBRARIES

Is this supposed to be ifndef?
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #45)
> Comment on attachment 597349 [details] [diff] [review]
> part 5 - Optionally make libxul.so a seekable compressed stream on Android
> 
> Review of attachment 597349 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/mozapps/installer/packager.mk
> @@ +308,5 @@
> >    removed-files \
> >    recommended-addons.json \
> >    $(NULL)
> >  
> > +ifdef SZIP_LIBRARIES
> 
> Is this supposed to be ifndef?

it's supposed to be ifdef. Maybe using the same variable is confusing though. Would you rather I use a different variable?
(In reply to Mike Hommey [:glandium] from comment #44)

> > Please comment finalize/ensure in Mappable.h.
> 
> They are described in Mappable, where they are inherited from. Isn't that
> enough?

it's enough.
Attachment #597340 - Attachment is obsolete: true
Attachment #597391 - Attachment is obsolete: true
Comment on attachment 597770 [details] [diff] [review]
part 6 - Add functions to display stats about seekable compressed streams

Carrying over r+
Attachment #597770 - Flags: review+
(In reply to Taras Glek (:taras) from comment #42)
> Comment on attachment 597393 [details] [diff] [review]
> part 7 - Hook linker stats in Startup Timeline
> 
> I expected this patch to be the inverse:ie expose some of the linker
> timestamps via telemetry.

I'll change the commit message when landing.
Attachment #597349 - Attachment is obsolete: true
Attachment #597349 - Flags: review?(khuey)
Adding MOZ_ to the config variable, per ted suggestion on irc.
Attachment #597809 - Attachment is obsolete: true
Comment on attachment 597810 [details] [diff] [review]
part 5 - Optionally make libxul.so a seekable compressed stream on Android

Carrying over r+.
Attachment #597810 - Flags: review+
Attachment #597393 - Attachment is obsolete: true
Comment on attachment 597841 [details] [diff] [review]
part 7 - Display linker stats on each Startup Timeline event

Carrying r+ over.
Attachment #597841 - Flags: review+
Comment on attachment 597769 [details] [diff] [review]
part 4 - Make the linker load libraries with on-demand decompression when they are seekable compressed streams


+  if (inflate(&zStream, (length == chunkLen) ? Z_FINISH : Z_SYNC_FLUSH)
+      != (length == chunkLen) ? Z_STREAM_END : Z_OK) {

bit ugly.

Sorry, I forgot to click submit on friday :(
Attachment #597769 - Flags: review?(taras.mozilla) → review+
Blocks: 729604
Depends on: 729596
Blocks: 729641
Blocks: 748488
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: