Closed Bug 683127 Opened 13 years ago Closed 12 years ago

New custom ELF linker

Categories

(Firefox Build System :: General, defect)

All
Linux
defect
Not set
normal

Tracking

(firefox11 fixed)

RESOLVED FIXED
mozilla12
Tracking Status
firefox11 --- fixed

People

(Reporter: glandium, Assigned: glandium)

References

Details

(Whiteboard: [inbound][qa-])

Attachments

(16 files, 41 obsolete files)

3.06 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
4.88 KB, patch
glandium
: review+
Details | Diff | Splinter Review
17.83 KB, patch
glandium
: review+
Details | Diff | Splinter Review
8.89 KB, patch
glandium
: review+
Details | Diff | Splinter Review
16.12 KB, patch
glandium
: review+
Details | Diff | Splinter Review
17.71 KB, patch
taras.mozilla
: review+
mwu
: review+
Details | Diff | Splinter Review
11.27 KB, patch
blassey
: review+
Details | Diff | Splinter Review
878 bytes, patch
khuey
: review+
Details | Diff | Splinter Review
14.32 KB, patch
glandium
: review+
Details | Diff | Splinter Review
43.43 KB, patch
glandium
: review+
Details | Diff | Splinter Review
25.73 KB, patch
glandium
: review+
Details | Diff | Splinter Review
15.70 KB, patch
glandium
: review+
Details | Diff | Splinter Review
17.96 KB, patch
glandium
: review+
Details | Diff | Splinter Review
16.36 KB, patch
glandium
: review+
Details | Diff | Splinter Review
12.94 KB, patch
glandium
: review+
Details | Diff | Splinter Review
887 bytes, patch
glandium
: review+
Details | Diff | Splinter Review
This is something I had spotted in my previous linker hacking, and it seems Google is working on binary incompatible changes to the linker structures that effectively make Firefox crash with it. The problem is that our linker uses data it gets from the system linker as if it were exactly in the format it knows, which assumes the system linker would always be binary compatible with whatever revision we forked from.

The binary incompatible changes are however not planned for Ice Cream Sandwich, but probably for the following release, which fortunately leaves us some time to get things right.
Blocks: 686282
Depends on: 706116
Assignee: nobody → mh+mozilla
Attached patch [WIP] Proof of concept (obsolete) — Splinter Review
This works on desktop x86, x86-64 and Android. On desktop builds, it breaks nss tools because nss requires the linker, but the nss tools don't provide it. I also expect the plugin-container to crash at startup because it actually is linked against the libraries (that is, it doesn't dlopen them).
On Android, the linker doesn't share memory between main and content processes like the current linker does, but considering incremental decompression is going to require serious changes to that anyways, it's not worth doing memory sharing until then. Plus, for the short term, mobile is not going to use a separate content process.
In both cases, it doesn't actually dlclose() properly, and many cases may lead to memory leaks, so it still needs some cleanup.
The nice thing is that despite it being completely not optimized (most notably, it doesn't try to reduce the amount of symbol resolution (it ends up doing symbols resolution several times in the same libraries), and it doesn't do dynamic symbol resolution for plt calls like the current linker does), it manages to be (slightly) faster than the current linker, probably because it doesn't load libfreebl and libsoftokn until actually required.
I think the ZipContent part needs serious changes to handle various cases (such as forcing decompression for the debug intent).
I forgot to mention, on android, the system linker's dlsym doesn't return values for defined weak symbols, which, unfortunately, we do required. Fortunately, it made me realize that we do need to handle __cxa_atexit for dlclose to work properly (the weak symbol we fail to dlsym is __aeabi_atexit, which is the ARM ABI equivalent). The PoC patch just defines an _aeabi_atexit dummy function that can be resolved at runtime (and breaking the shutdown path as aside effect).
I realize this is a WiP, but 
+void ElfLinker::SetZip(const char *aZip)
+{
+  zip = new Zip(aZip);
+}

let hope final version uses RAII or something more robust than above.
This also needs more comments.
(In reply to Taras Glek (:taras) from comment #3)
> I realize this is a WiP, but 
> +void ElfLinker::SetZip(const char *aZip)
> +{
> +  zip = new Zip(aZip);
> +}
> 
> let hope final version uses RAII or something more robust than above.

Yeah, initialization and shutdown paths are currently not handled very well. FWIW, I think it should initialize itself on some environment variables instead of using that SetZip.

(In reply to Taras Glek (:taras) from comment #4)
> This also needs more comments.

The lack of comments was deliberate, until I have something stable enough code-wise. Next iteration will be properly documented.
Component: Widget: Android → Widget
OS: Android → MeeGo
Summary: custom Android linker relies on system linker being binary compatible with previous versions → New custom ELF linker
Target Milestone: --- → mozilla9
Version: Trunk → 11 Branch
Depends on: 708567
Depends on: 708570
Component: Widget → Build Config
OS: MeeGo → Linux
QA Contact: android → build-config
Target Milestone: mozilla9 → ---
Component: Build Config → Widget
Depends on: 709776
OS: Linux → MeeGo
Target Milestone: --- → mozilla9
Attachment #580911 - Flags: review?(khuey)
Attachment #580910 - Flags: review? → review?(mozilla)
Component: Widget → Build Config
OS: MeeGo → Linux
Target Milestone: mozilla9 → ---
Version: 11 Branch → Trunk
Fixed an erroneous magic number
Attachment #580919 - Flags: review?
Attachment #580910 - Attachment is obsolete: true
Attachment #580910 - Flags: review?(mozilla)
Attachment #580920 - Flags: review?(khuey)
Attachment #580911 - Attachment is obsolete: true
Attachment #580911 - Flags: review?(khuey)
Oops
Attachment #580921 - Flags: review?(khuey)
Attachment #580920 - Attachment is obsolete: true
Attachment #580920 - Flags: review?(khuey)
End-of-day re-oops
Attachment #580924 - Flags: review?(khuey)
Attachment #580921 - Attachment is obsolete: true
Attachment #580921 - Flags: review?(khuey)
Attachment #580919 - Flags: review? → review?(mozilla)
Comment on attachment 580924 [details] [diff] [review]
part 2 - Build system glue for the new linker

Forget it, i'm tired.
Attachment #580924 - Flags: review?(khuey)
This should be the one.
Attachment #580929 - Flags: review?(khuey)
Attachment #580924 - Attachment is obsolete: true
A few modifications I had forgotten
Attachment #580953 - Flags: review?(mozilla)
Attachment #580919 - Attachment is obsolete: true
Attachment #580919 - Flags: review?(mozilla)
Comment on attachment 580953 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker

Turns out this only works because our apk is a weird Zip. I'm going to fix the implementation to work with more standard Zips.
Attachment #580953 - Flags: review?(mozilla)
Attachment #580953 - Attachment is obsolete: true
Attached patch part 3 - Test for the Zip reader (obsolete) — Splinter Review
Attachment #581660 - Flags: review?
Attachment #581659 - Flags: review? → review?(taras.mozilla)
Attachment #581660 - Flags: review? → review?(taras.mozilla)
Attachment #581662 - Flags: review? → review?(taras.mozilla)
Attached patch part 3 - Test for the Zip reader (obsolete) — Splinter Review
Attachment #581904 - Flags: review?(taras.mozilla)
Attachment #581660 - Attachment is obsolete: true
Attachment #581660 - Flags: review?(taras.mozilla)
Attachment #581904 - Flags: review?(taras.mozilla) → review+
Comment on attachment 581659 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker

>+    static const T *cast(const void *buf)
>+    {
>+      const T *ret = static_cast<const T *>(buf);
>+      if (ret->signature == T::magic)
>+        return ret;
>+      return NULL;
>+    }

I'd call this validate since it's not a cast
Comment on attachment 581659 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker

There is very little bounds checking in this, but I guess we can trust the file we are executing from.
Have you measured a speedup from bypassing reading the central directory? Are the files actually laid out in the startup order in apk so the hot path is always used?

Seems ok given that this code only needs to be good enough to read the apk.
Attachment #581659 - Flags: review?(taras.mozilla) → review+
Attachment #581662 - Flags: review?(taras.mozilla) → review+
Comment on attachment 581659 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker

Review of attachment 581659 [details] [diff] [review]:
-----------------------------------------------------------------

Looks mostly fine to me.

::: mozutils/linker/Zip.cpp
@@ +84,5 @@
> +{
> +  if (mapped != MAP_FAILED) {
> +    munmap(mapped, size);
> +    debug("Unmapped %s @%p", name, mapped);
> +    free(name);

name looks like it's leaked when mapped == MAP_FAILED.

@@ +113,5 @@
> +    out->type = static_cast<Stream::Type>(uint16_t(nextFile->compression));
> +
> +    /* Find the next Local File header. It is usually simply following the
> +     * compressed stream, but in cases where the 3rd bit of the generalFlag
> +     * is set, there is a Data Descriptor header before. */

Can you just test that flag instead of trying both options?

@@ +128,5 @@
> +   * Directory for the entry corresponding to the given path */
> +  if (!nextDir || !nextDir->GetName().Equals(path)) {
> +    const DirectoryEntry *entry = GetFirstEntry();
> +    debug("%s - Scan directory entries in search for %s", name, path);
> +    for (; entry && !entry->GetName().Equals(path); entry = entry->GetNext());

I think a while loop would look more normal, but hey, it's all the same..
Attachment #581659 - Flags: review+ → review?(taras.mozilla)
Comment on attachment 581659 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker

bugzilla fail
Attachment #581659 - Flags: review?(taras.mozilla) → review+
Addressed mwu and taras' comments
Attachment #581659 - Attachment is obsolete: true
Refreshed with a few minor changes
Attachment #580929 - Attachment is obsolete: true
Comment on attachment 583143 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker.

Carrying over r+
Attachment #583143 - Flags: review+
Comment on attachment 583144 [details] [diff] [review]
part 2 - Build system glue for the new linker.

Carrying over r+
Attachment #583144 - Flags: review+
Depends on: 712281
Depends on: 712579
This is meant to be folded with part 1, but is easier to review separately. This adds bookkeeping for Zip instances (will be used by the linker) and a RAII wrapper for file descriptors.
Attachment #584533 - Flags: review?
Attachment #584533 - Flags: review? → review?(taras.mozilla)
Depends on: 714029
Some changes in configure.in
Attachment #584737 - Flags: review?(khuey)
Attachment #583144 - Attachment is obsolete: true
The sXULLibHandle change could be ifdef MOZ_LINKER, but it actually doesn't hurt to change everywhere. I limited to GLIBC, though, because I'm not sure what other dlopen implementations do.
Attachment #584750 - Flags: review?(benjamin)
Attachment #584746 - Attachment is obsolete: true
Attachment #584746 - Flags: review?(taras.mozilla)
This is the main part of the linker. That is, the linker itself.
Attachment #584836 - Flags: review?(taras.mozilla)
Attachment #584750 - Flags: review?(benjamin) → review+
Attachment #584766 - Attachment is obsolete: true
Attachment #584766 - Flags: review?(taras.mozilla)
Attachment #584836 - Attachment is obsolete: true
Attachment #584836 - Flags: review?(taras.mozilla)
Attachment #584997 - Flags: review?(taras.mozilla)
This last part is not completely ready. The debug intent is not going to be very useful due to the lack of hook for gdb.
Please note that one problem that I know of that is not marked with a TODO in the patches is that fini_array functions are not called in reverse order as they ought to. Fortunately, the only library we have that has a fini_array has only one entry in it, so it's not a problem that is going to affect us. Most C++ code does destruction with functions registered with __cxa_atexit anyways.
Comment on attachment 584737 [details] [diff] [review]
part 2 - Build system glue for the new linker.

Review of attachment 584737 [details] [diff] [review]:
-----------------------------------------------------------------

I thought we didn't like system zlib?
Attachment #584737 - Flags: review?(khuey) → review+
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #39)
> Comment on attachment 584737 [details] [diff] [review]
> part 2 - Build system glue for the new linker.
> 
> Review of attachment 584737 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I thought we didn't like system zlib?

We already use it on android. And for the moment, the elf linker is there for debugging only on desktop, so in practice, zlib is not being required here. That being said, no one has been able to tell me if there was a compelling reason why we don't use the system zlib on linux nowadays. The alternative would be to link mozzlib in mozglue, but that's kind of gross.
Comment on attachment 584533 [details] [diff] [review]
part 1.5 - Add bookkeeping for Zips

Why aren't you using std::find? instead of for loops. 

Overall this is looking nicer than our ZipArchive stuff.

Why are there multiple zipfiles involved in library loading?
Attachment #584533 - Flags: review?(taras.mozilla) → review+
(In reply to Taras Glek (:taras) from comment #41)
> Comment on attachment 584533 [details] [diff] [review]
> part 1.5 - Add bookkeeping for Zips
> 
> Why aren't you using std::find? instead of for loops.

Because in the long term I want to remove libstdc++ requirement.

> Overall this is looking nicer than our ZipArchive stuff.
> 
> Why are there multiple zipfiles involved in library loading?

There aren't. It's future proofing, to allow loading components from extensions XPIs.
(In reply to Mike Hommey [:glandium] from comment #42)
> (In reply to Taras Glek (:taras) from comment #41)
> > Comment on attachment 584533 [details] [diff] [review]
> > part 1.5 - Add bookkeeping for Zips
> > 
> > Why aren't you using std::find? instead of for loops.
> 
> Because in the long term I want to remove libstdc++ requirement.

Though now that I look at it, it doesn't look like this pulls stuff from libstdc++. It looks like it's all defined in headers.
Blocks: 686805
(In reply to Mike Hommey [:glandium] from comment #34)
> Created attachment 584995 [details] [diff] [review]
> part 5 - Initial Elf Loader, wrapping around dlopen/dladdr/dlsym/dlclose.

Looks plausible, to the limited extent that I can tell

+  /* Build up a list of all library handles with direct (external) references.
+   * We actually skip system library handles because we want to keep at least
+   * some of these open. Most notably, Mozilla codebase keeps a few libgnome
+   * libraries deliberately open because of the mess that libORBit destruction
+   * is. dlclose()ing these libraries actually leads to problems. */

Where is the logic / list of libraries that are never closed?


+  void ReleaseDirectRef()
+      // ASSERT(directRefCnt >= mozilla::RefCounted<LibHandle>::refCount())

why is this commented out?  it looks useful.   ditto in
+  void ReleaseAllDirectRefs()
(In reply to Mike Hommey [:glandium] from comment #35)
> Created attachment 584996 [details] [diff] [review]
> part 7 - Use a custom Elf linker for libraries given with an absolute path
> name

various instances of
+#ifndef __GLIBC__

is that the right guarding symbol?  I suppose it probably is if it is
guarding glibc compatibility hacks.



+CustomElf::LoadSegment(const Phdr *pt_load) const
+#if PF_X == PROT_READ && PF_W == PROT_WRITE && PF_R == PROT_EXEC
+  // Optimized version of the following, in the standard case where
+  // read and exec flags are simply inverted between PF_* and PROT_*.
+  prot = ((((prot >> 2) & 1) | (prot << 2)) & 0x7) | (prot & 2);
+#else
+  prot = ((prot & PF_X) ? PROT_EXEC : 0) ||
+          ((prot & PF_W) ? PROT_WRITE : 0) ||
+          ((prot & PF_R) ? PROT_READ : 0);
+#endif

Is the optimised version really necessary?  How many times will
this get executed per Fx startup -- 15 libraries x (say) 5 segments
per library ?
(In reply to Julian Seward from comment #45)
> (In reply to Mike Hommey [:glandium] from comment #35)
> > Created attachment 584996 [details] [diff] [review]
> > part 7 - Use a custom Elf linker for libraries given with an absolute path
> > name

Duh, I missed copy-pasting some comments.  More comments for this patch:

+class CustomElf: public LibHandle
+   * content. The file descriptor is fully appropriated, and will be closed

What does "fully appropriated" mean?  (is also used below).  Not sure
what appropriated means.


+class UnboundArray

Would UnsizedArray be a better name?  I know you mean "array that 
does not have an upper bound", but it also gives the intuition
"array that has not been bound to anything", whatever that might
mean.
(In reply to Mike Hommey [:glandium] from comment #36)
> Created attachment 584997 [details] [diff] [review]
> part 8 - Allow to load Elf files from a Zip archive

+/**
+ * RAII wrapper for a mapping of the first page off a Mappable object.
+ * This calls Mappable::munmap instead of system munmap.
+ */
+class Mappable1stPagePtr: public GenericMappedPtr<Mappable1stPagePtr> {

Is this the/a replacement for AutoMunmap in the 27 Dec patch set?
Ifo so it will be too limited in some cases, eg if I want to traverse
the section header table and hence the section header's string table,
in order to pick up the .text VMA to calculate "add-symbol-file" addresses
for GDB/Valgrind etc, since IIUC the section header table can be anywhere in the file.


Sanity check: in the 27 Dec patch, 

bool
CustomElf::LoadSegment(const Phdr *pt_load,
has
      if (mprotect(GetPtr(next_page), mem_end - next_page, flags) < 0) {

is it correct that this is a direct-to-the-kernel mprotect, and
does not need to be against a Mappable ?  This patch also does not
change it to be a against-Mappable operation.
(In reply to Julian Seward from comment #44)
> (In reply to Mike Hommey [:glandium] from comment #34)
> > Created attachment 584995 [details] [diff] [review]
> > part 5 - Initial Elf Loader, wrapping around dlopen/dladdr/dlsym/dlclose.
> 
> Looks plausible, to the limited extent that I can tell
> 
> +  /* Build up a list of all library handles with direct (external)
> references.
> +   * We actually skip system library handles because we want to keep at
> least
> +   * some of these open. Most notably, Mozilla codebase keeps a few libgnome
> +   * libraries deliberately open because of the mess that libORBit
> destruction
> +   * is. dlclose()ing these libraries actually leads to problems. */
> 
> Where is the logic / list of libraries that are never closed?

The list most probably varies between versions. Most (all?) are GNOME libraries.

> +  void ReleaseDirectRef()
> +      // ASSERT(directRefCnt >= mozilla::RefCounted<LibHandle>::refCount())
> 
> why is this commented out?  it looks useful.   ditto in
> +  void ReleaseAllDirectRefs()

The ASSERTs are commented because I want to mozilla/Assertions.h, but it uses JS_Assert at the moment.(In reply to Julian Seward from comment #45)
> (In reply to Mike Hommey [:glandium] from comment #35)
> > Created attachment 584996 [details] [diff] [review]
> > part 7 - Use a custom Elf linker for libraries given with an absolute path
> > name
> 
> various instances of
> +#ifndef __GLIBC__
> 
> is that the right guarding symbol?  I suppose it probably is if it is
> guarding glibc compatibility hacks.

Indeed.

> +CustomElf::LoadSegment(const Phdr *pt_load) const
> +#if PF_X == PROT_READ && PF_W == PROT_WRITE && PF_R == PROT_EXEC
> +  // Optimized version of the following, in the standard case where
> +  // read and exec flags are simply inverted between PF_* and PROT_*.
> +  prot = ((((prot >> 2) & 1) | (prot << 2)) & 0x7) | (prot & 2);
> +#else
> +  prot = ((prot & PF_X) ? PROT_EXEC : 0) ||
> +          ((prot & PF_W) ? PROT_WRITE : 0) ||
> +          ((prot & PF_R) ? PROT_READ : 0);
> +#endif
> 
> Is the optimised version really necessary?

Probably not, but does it hurt to keep it?

(In reply to Julian Seward from comment #46)
> (In reply to Julian Seward from comment #45)
> > (In reply to Mike Hommey [:glandium] from comment #35)
> > > Created attachment 584996 [details] [diff] [review]
> > > part 7 - Use a custom Elf linker for libraries given with an absolute path
> > > name
> 
> Duh, I missed copy-pasting some comments.  More comments for this patch:
> 
> +class CustomElf: public LibHandle
> +   * content. The file descriptor is fully appropriated, and will be closed
> 
> What does "fully appropriated" mean?  (is also used below).  Not sure
> what appropriated means.

Ownership of the object is taken by the CustomElf.

> +class UnboundArray
> 
> Would UnsizedArray be a better name?

Yes, that sounds better.

(In reply to Julian Seward from comment #47)
> (In reply to Mike Hommey [:glandium] from comment #36)
> > Created attachment 584997 [details] [diff] [review]
> > part 8 - Allow to load Elf files from a Zip archive
> 
> +/**
> + * RAII wrapper for a mapping of the first page off a Mappable object.
> + * This calls Mappable::munmap instead of system munmap.
> + */
> +class Mappable1stPagePtr: public GenericMappedPtr<Mappable1stPagePtr> {
> 
> Is this the/a replacement for AutoMunmap in the 27 Dec patch set?
> Ifo so it will be too limited in some cases, eg if I want to traverse
> the section header table and hence the section header's string table,
> in order to pick up the .text VMA to calculate "add-symbol-file" addresses
> for GDB/Valgrind etc, since IIUC the section header table can be anywhere in
> the file.

You don't need to traverse the section header table with the r_debug patch I sent you.

> Sanity check: in the 27 Dec patch, 
> 
> bool
> CustomElf::LoadSegment(const Phdr *pt_load,
> has
>       if (mprotect(GetPtr(next_page), mem_end - next_page, flags) < 0) {
> 
> is it correct that this is a direct-to-the-kernel mprotect, and
> does not need to be against a Mappable ?  This patch also does not
> change it to be a against-Mappable operation.

Yes, this mprotects anonymous memory.
(In reply to Mike Hommey [:glandium] from comment #48)

> The ASSERTs are commented because I want to mozilla/Assertions.h, but it
> uses JS_Assert at the moment.

MappableSZ.cpp has a whole bunch of assertions, which will have the same problem,
so it would be good to get a solution to this.


> > +CustomElf::LoadSegment(const Phdr *pt_load) const
> > +#if PF_X == PROT_READ && PF_W == PROT_WRITE && PF_R == PROT_EXEC
> > +  // Optimized version of the following, in the standard case where
> > +  // read and exec flags are simply inverted between PF_* and PROT_*.
> > +  prot = ((((prot >> 2) & 1) | (prot << 2)) & 0x7) | (prot & 2);
> > +#else
> > +  prot = ((prot & PF_X) ? PROT_EXEC : 0) ||
> > +          ((prot & PF_W) ? PROT_WRITE : 0) ||
> > +          ((prot & PF_R) ? PROT_READ : 0);
> > +#endif
> > 
> > Is the optimised version really necessary?
> 
> Probably not, but does it hurt to keep it?

Well, yes .. it makes the code generally longer and more difficult to
understand in return for no clear benefit.

Actually (just spotted this) in the general case shouldn't those ||s be |s ?


> > What does "fully appropriated" mean?  (is also used below).  Not sure
> > what appropriated means.
> 
> Ownership of the object is taken by the CustomElf.

ok .. I didn't get that from the comment.  I know what misappropriated
means (stolen, basically) but not appropriated.


> > Is this the/a replacement for AutoMunmap in the 27 Dec patch set?
> > Ifo so it will be too limited in some cases, eg if I want to traverse
> > the section header table and hence the section header's string table,
> > in order to pick up the .text VMA to calculate "add-symbol-file" addresses
> > for GDB/Valgrind etc, since IIUC the section header table can be anywhere in
> > the file.
> 
> You don't need to traverse the section header table with the r_debug patch I
> sent you.

Yeah .. I realised that after posting the above comment :-(
Mike, pls can you indicate which of these patches need to be applied in
which order?  I'd like to rebase my stuff against this, but it's not obvious
(to me) how to get to a suitable starting point.
Attachment #577949 - Attachment is obsolete: true
(In reply to Julian Seward from comment #50)
> Mike, pls can you indicate which of these patches need to be applied in
> which order?  I'd like to rebase my stuff against this, but it's not obvious
> (to me) how to get to a suitable starting point.

All of them, in the order of their part number. They replace the bug683127-wip and subsequent patches I sent you. You'll still need the dependencies patch (which, in bugzilla, is scattered across the dependencies to this bug).

(In reply to Julian Seward from comment #49)
> Actually (just spotted this) in the general case shouldn't those ||s be |s ?

yes.
Comment on attachment 584995 [details] [diff] [review]
part 5 - Initial Elf Loader, wrapping around dlopen/dladdr/dlsym/dlclose.

seems like if you made ReleaseDirectRef() return a boolean, ReleaseAllDirectRefs would not be needed.

This looks ok. Please ask Julian for subsequent reviews since he's actually working with the code.
Attachment #584995 - Flags: review?(taras.mozilla) → review+
Comment on attachment 584996 [details] [diff] [review]
part 7 - Use a custom Elf linker for libraries given with an absolute path name

I don't know about linkers to honestly r+ this patch. I don't see anything wrong beyond Julian's review. He can review followups.
Attachment #584996 - Flags: review?(taras.mozilla)
Comment on attachment 584997 [details] [diff] [review]
part 8 - Allow to load Elf files from a Zip archive

this is a pretty cool hack
Attachment #584997 - Flags: review?(taras.mozilla) → review+
Refreshed against bug 701371, changed headers to MPL 2.0 (yay for shortened licensing boilerplate), merged part 1.5 and replaced a loop with std::find.
Attachment #583143 - Attachment is obsolete: true
Attachment #584533 - Attachment is obsolete: true
Comment on attachment 586914 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker.

Carrying over r+
Attachment #586914 - Flags: review+
Refreshed against bug 701371.
Attachment #584737 - Attachment is obsolete: true
Comment on attachment 586916 [details] [diff] [review]
part 2 - Build system glue for the new linker.

Carrying over r+
Attachment #586916 - Flags: review+
Refreshed against bug 701371.
Attachment #581904 - Attachment is obsolete: true
Comment on attachment 586918 [details] [diff] [review]
part 3 - Test for the Zip reader.

Carrying over r+
Attachment #586918 - Flags: review+
Forgot to update the license boilerplates.
Attachment #586918 - Attachment is obsolete: true
Comment on attachment 586919 [details] [diff] [review]
part 3 - Test for the Zip reader.

Carrying over r+
Attachment #586919 - Flags: review+
Attachment #581662 - Attachment is obsolete: true
Comment on attachment 586921 [details] [diff] [review]
part 4 - Use the new ELF linker Zip reader in the old linker.

Carrying over r+
Attachment #586921 - Flags: review+
Applied the suggested change to ReleaseDirectRef and got rid of ReleaseAllDirectRefs.
Attachment #584995 - Attachment is obsolete: true
Better std::find logic
Attachment #586914 - Attachment is obsolete: true
Comment on attachment 586930 [details] [diff] [review]
part 1 - Simple Zip reader for the new ELF Linker.

Carrying over r+
Attachment #586930 - Flags: review+
Use std::find in ElfLoader::Forget, like in ZipCollection::Forget
Attachment #586928 - Attachment is obsolete: true
Comment on attachment 586931 [details] [diff] [review]
part 5 - Initial Elf Loader, wrapping around dlopen/dladdr/dlsym/dlclose.

Carrying over r+
Attachment #586931 - Flags: review+
Modified the CustomElf::Load comment, changed the flags setting in CustomElf::LoadSegment and renamed UnboundArray to UnsizedArray.
Attachment #584996 - Attachment is obsolete: true
Refreshed against changes to other patches. (context changes only)
Attachment #584997 - Attachment is obsolete: true
Comment on attachment 586936 [details] [diff] [review]
part 8 - Allow to load Elf files from a Zip archive.

Carrying over r+
Attachment #586936 - Flags: review+
Attachment #586934 - Flags: review?(jseward)
Re-refreshed against bug 701371.
Attachment #586916 - Attachment is obsolete: true
Comment on attachment 586942 [details] [diff] [review]
part 2 - Build system glue for the new linker.

Carrying over r+
Attachment #586942 - Flags: review+
Comment on attachment 586934 [details] [diff] [review]
part 7 - Use a custom Elf linker for libraries given with an absolute path name.

Review of attachment 586934 [details] [diff] [review]:
-----------------------------------------------------------------

+#ifndef PAGE_SIZE
+#define PAGE_SIZE 4096
+#endif

is it ok to hardwire this instead of using sysconf() ?  At least for
some ppc64-linux and mips-linux setups, the page size isn't
necessarily 4k.


+unsigned long
+ElfHash(const char *symbol)
+{
+  unsigned long h = 0, g;
+  while (*symbol) {
+    h = (h << 4) + *symbol++;

Does the behaviour of this depend on the signedness of char?
Is the sign extension of *symbol independent of the signedness
of *symbol?  eg
(gdb) p (unsigned int)( (signed char) 0xFF )
$3 = 4294967295
(gdb) p (unsigned int)( (unsigned char) 0xFF )
$4 = 255


r+ provided the sign-extension thing is OK.  I'm not so bothered about the page size thing.
Attachment #586934 - Flags: review?(jseward) → review+
(In reply to Julian Seward from comment #75)
> Comment on attachment 586934 [details] [diff] [review]
> part 7 - Use a custom Elf linker for libraries given with an absolute path
> name.
> 
> Review of attachment 586934 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> +#ifndef PAGE_SIZE
> +#define PAGE_SIZE 4096
> +#endif
> 
> is it ok to hardwire this instead of using sysconf() ?  At least for
> some ppc64-linux and mips-linux setups, the page size isn't
> necessarily 4k.

But these systems are far from being supported at the moment.

> +unsigned long
> +ElfHash(const char *symbol)
> +{
> +  unsigned long h = 0, g;
> +  while (*symbol) {
> +    h = (h << 4) + *symbol++;
> 
> Does the behaviour of this depend on the signedness of char?
> Is the sign extension of *symbol independent of the signedness
> of *symbol?  eg
> (gdb) p (unsigned int)( (signed char) 0xFF )
> $3 = 4294967295
> (gdb) p (unsigned int)( (unsigned char) 0xFF )
> $4 = 255

There is indeed a problem here, and in fact the ELF ABI for SysV has an unsigned char argument. I wonder how I got it off.
Depends on: 716825
Attachment #586942 - Attachment is obsolete: true
Attachment #586921 - Attachment is obsolete: true
Refreshed against bug 709776
Attachment #585000 - Attachment is obsolete: true
Landed up to part 4
https://hg.mozilla.org/integration/mozilla-inbound/rev/12f6fad66924
https://hg.mozilla.org/integration/mozilla-inbound/rev/85c7cdc1a916
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea78ef72f47f
https://hg.mozilla.org/integration/mozilla-inbound/rev/52edf4287893

This is basically what we're going to get if we end up disabling the new linker once this bug lands completely (through MOZ_OLD_LINKER in configure.in). I want it tested before (finishing and) landing the rest.
Comment on attachment 587378 [details] [diff] [review]
part 2 - Build system glue for the new linker.

Carrying over r+
Attachment #587378 - Flags: review+
Comment on attachment 587379 [details] [diff] [review]
part 4 - Use the new ELF linker Zip reader in the old linker.

Carrying over r+
Attachment #587379 - Flags: review+
Changed a couple debugging messages
Attachment #586934 - Attachment is obsolete: true
Wrong log level in previous patch (sorry for the noise)
Attachment #587685 - Attachment is obsolete: true
This adds a few debugging messages, removes the useless madvise (it was applied to the decompressed buffer, not the compressed buffer ; and since we don't know the compressed size of the requested decompressed length...) and fixes cacheflush (was done at the wrong moment, on a slightly wrong range)
Attachment #586936 - Attachment is obsolete: true
Comment on attachment 587687 [details] [diff] [review]
part 7 - Use a custom Elf linker for libraries given with an absolute path name.

Carrying over r+
Attachment #587687 - Flags: review+
Comment on attachment 587688 [details] [diff] [review]
part 8 - Allow to load Elf files from a Zip archive.

Carrying over r+
Attachment #587688 - Flags: review+
Grah, there was a missing hunk
Attachment #587688 - Attachment is obsolete: true
Attachment #587694 - Flags: review+
This allows valgrind to happily find symbols and allows to implement what we currently do with the android debug intent.
Attachment #588892 - Flags: review?(taras.mozilla)
Attachment #587380 - Attachment is obsolete: true
Essentially, this is the same as what has been done in bug 687446, without relying on /proc/self/auxv (which happens not to be readable when android:debuggable is set to false (iow, all mozilla builds)) and without crashing under valgrind.
Attachment #588894 - Flags: review?(taras.mozilla)
This hooks the new linker in APKOpen.cpp.
Attachment #588896 - Flags: review?(blassey.bugs)
This disables the old linker on android native ui (thus enabling the new one). This is the part that is supposed to be backed out if things go wrong, the rest can stay.
Attachment #588898 - Flags: review?(khuey)
Comment on attachment 588896 [details] [diff] [review]
part 11 - Hook the new linker in Android initialization

Review of attachment 588896 [details] [diff] [review]:
-----------------------------------------------------------------

we should probably remove the lib cache code since its not used anymore. It would make your patch a bit cleaner.
Attachment #588896 - Flags: review?(blassey.bugs) → review+
(In reply to Brad Lassey [:blassey] from comment #95)
> we should probably remove the lib cache code since its not used anymore. It
> would make your patch a bit cleaner.

The patch series doesn't remove the old linker, and the old linker still uses it. The old linker is also needed for the xul builds, because the new linker doesn't support separate processes yet.
Comment on attachment 588892 [details] [diff] [review]
part 9 - Allow to temporarily extract Elf files from a Zip archive for e.g. valgrind

if (extract && (extract[0] == '1') && (extract[1] == '\0')) <-- use strncmp. I'm not even sure why we care about contents of the env var, should just check for existence?

I think AutoAny should be AutoClean. We should add a class like this to gecko.
Attachment #588892 - Flags: review?(taras.mozilla) → review+
Comment on attachment 588894 [details] [diff] [review]
part 10 - Allow debug symbols to be found under gdb without extracted libraries

Seems like you should use your new Auto traits thing for singleton forgetting and possibly GenericMappedPtr.

I think mwu should review the debug section magic in ElfLoader::InitDebugger()
Attachment #588894 - Flags: review?(taras.mozilla)
Attachment #588894 - Flags: review?(mwu)
Attachment #588894 - Flags: review+
(In reply to Taras Glek (:taras) from comment #97)
> Comment on attachment 588892 [details] [diff] [review]
> part 9 - Allow to temporarily extract Elf files from a Zip archive for e.g.
> valgrind
> 
> if (extract && (extract[0] == '1') && (extract[1] == '\0')) <-- use strncmp.
> I'm not even sure why we care about contents of the env var, should just
> check for existence?

Checking for existence can lead to people getting unexpected behaviour when setting VAR= or VAR=0.

> I think AutoAny should be AutoClean. We should add a class like this to
> gecko.

I'll file a MFBT bug, then.

(In reply to Taras Glek (:taras) from comment #98)
> Comment on attachment 588894 [details] [diff] [review]
> part 10 - Allow debug symbols to be found under gdb without extracted
> libraries
> 
> Seems like you should use your new Auto traits thing for singleton
> forgetting and possibly GenericMappedPtr.

I wanted to, but since the MappedPtr stores more than one value, this is jumping through many hoops for limited added value.
Attachment #586931 - Attachment is obsolete: true
The previous version I had sent didn't address comment 75.
Attachment #589438 - Attachment is obsolete: true
(In reply to Mike Hommey [:glandium] from comment #101)
> Created attachment 589441 [details] [diff] [review]
> part 5 - Initial Elf Loader, wrapping around dlopen/dladdr/dlsym/dlclose.
> 
> The previous version I had sent didn't address comment 75.

This comment was meant for a new part 7, not for part 5.
Attachment #587687 - Attachment is obsolete: true
Attachment #587694 - Attachment is obsolete: true
Comment on attachment 589441 [details] [diff] [review]
part 5 - Initial Elf Loader, wrapping around dlopen/dladdr/dlsym/dlclose.

Carrying over r+
Attachment #589441 - Flags: review+
Adjusted to changes in part 8, and renamed AutoAny to AutoClean.
Comment on attachment 589442 [details] [diff] [review]
part 7 - Use a custom Elf linker for libraries given with an absolute path name.

Carrying over r+
Attachment #588892 - Attachment is obsolete: true
Attachment #589442 - Flags: review+
Attachment #589445 - Attachment is obsolete: true
Comment on attachment 589444 [details] [diff] [review]
part 8 - Allow to load Elf files from a Zip archive.

Carrying over r+
Attachment #589444 - Flags: review+
Comment on attachment 589446 [details] [diff] [review]
part 9 - Allow to temporarily extract Elf files from a Zip archive for e.g. valgrind

Carrying over r+
Attachment #589446 - Flags: review+
Comment on attachment 588894 [details] [diff] [review]
part 10 - Allow debug symbols to be found under gdb without extracted libraries

Review of attachment 588894 [details] [diff] [review]:
-----------------------------------------------------------------

The InitDebugger part looks ok to me, assuming the comments about how it works is accurate.

::: mozglue/linker/ElfLoader.cpp
@@ +20,5 @@
> +#endif
> +
> +#ifndef PAGE_MASK
> +#define PAGE_MASK (~ (PAGE_SIZE - 1))
> +#endif

Hm, this should only be needed for building on normal Linux systems, right? Those would probably want sysconf but I guess it doesn't matter much.

@@ +504,5 @@
> +    if (phdr->p_type == PT_LOAD && phdr->p_offset == 0)
> +      base -= phdr->p_vaddr;
> +    if (phdr->p_type == PT_DYNAMIC) {
> +      dyns.Init(base + phdr->p_vaddr, phdr->p_filesz);
> +    }

Probably want to remove the curly braces here.
Attachment #588894 - Flags: review?(mwu) → review+
(In reply to Michael Wu [:mwu] from comment #111)
> Comment on attachment 588894 [details] [diff] [review]
> part 10 - Allow debug symbols to be found under gdb without extracted
> libraries
> 
> Review of attachment 588894 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> The InitDebugger part looks ok to me, assuming the comments about how it
> works is accurate.
> 
> ::: mozglue/linker/ElfLoader.cpp
> @@ +20,5 @@
> > +#endif
> > +
> > +#ifndef PAGE_MASK
> > +#define PAGE_MASK (~ (PAGE_SIZE - 1))
> > +#endif
> 
> Hm, this should only be needed for building on normal Linux systems, right?
> Those would probably want sysconf but I guess it doesn't matter much.

Using sysconf only matters when being completely cross-architecture, which the linker isn't, since it only supports x86, x64 and arm, all of which use 4k pages. If (when?) I port it to other architectures, I'll switch to use sysconf.
And yet another fixup (double oops)
https://hg.mozilla.org/integration/mozilla-central/rev/8297a76e0552

This wasn't caught because there is no test for the places migration :(
Disabled the new linker until I figure out what is wrong on monday, since the previous fixup was apparently not enough.
https://hg.mozilla.org/mozilla-central/rev/244711942710
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Keywords: dev-doc-needed
Fixed for good, and re-enabled:
https://hg.mozilla.org/mozilla-central/rev/fd8fc492279c
https://hg.mozilla.org/mozilla-central/rev/758005504cab
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Depends on: 720621
Depends on: 720737
[Approval Request Comment]
This is the new Elf linker for Android (which also works for Linux, but is not meant to be used in releases there, at least for now). It paves the way for bug 686805, which should give us some nice startup improvements.
There are essentially two parts to this bug, with different impacts:
- part 1 to 4 is a Zip reader class that is being used by the new Elf linker, but also replaces the Zip reading stuff that was in APKOpen.cpp. This landed 2 weeks ago and saw no problem besides build errors what were fixed after landing. It had the side effect of dropping RSS.
- part 5 to 11 is the Elf linker itself, and only adds code. It doesn't change anything in the old linker. Part 12 is the part that enables it, and it can be disabled by backing out this part at any time.

I'm going to attach a few refreshed versions (with the landed fixups, or with context changes for aurora) of a few parts.
Attachment #591500 - Flags: review+
Attachment #591500 - Flags: approval-mozilla-aurora?
Attachment #587378 - Flags: approval-mozilla-aurora?
Attachment #586919 - Flags: approval-mozilla-aurora?
Attachment #589441 - Flags: approval-mozilla-aurora?
Attachment #584750 - Flags: approval-mozilla-aurora?
Attachment #589442 - Flags: approval-mozilla-aurora?
Attachment #589444 - Flags: approval-mozilla-aurora?
Attachment #589446 - Flags: approval-mozilla-aurora?
Attachment #588894 - Flags: approval-mozilla-aurora?
Depends on: 721099
Here is a list of all direct and indirect dependencies of this bug that need to land beforehand (and thus require aurora approval if this one has aurora approval):
- bug 701371
- bug 709776
- bug 708570
- bug 717906
- bug 712284
- bug 714029
- bug 712579
- bug 716825
- bug 719253

The refreshed part 11 depends on bug 713228, because of conflicts between the two patches (and because bug 713228 landed first on m-c). The original version of part 11 + a fixup may be used on aurora if bug 713228 is not taken.
Attachment #584750 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #586919 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #587378 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #588894 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #589441 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #589442 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #589444 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #589446 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #591500 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #591502 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #591504 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #591505 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [inbound] → [inbound][qa-]
Comment on attachment 588896 [details] [diff] [review]
part 11 - Hook the new linker in Android initialization

>Bug 683127 part 11 - Hook the new linker in Android initialization
>diff --git a/configure.in b/configure.in
>@@ -9025,17 +9025,27 @@ if test -z "$MOZ_NATIVE_NSPR"; then
>+    if test -n "$MOZ_LINKER" -a -z "$MOZ_OLD_LINKER" -a $ac_cv_func_dladdr = no ; then

This line causes |c:/mozilla/inbound/configure: line 25335: test: too many arguments| in my local win7 build, MozillaBuild 1.6. (Albeit carries on past the warning)
Depends on: 725517
Depends on: 725736
Depends on: 729883
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: