Closed Bug 859782 Opened 11 years ago Closed 6 years ago

Firefox cannot start without /proc (chroot)

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- wontfix
firefox-esr60 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox62 --- fixed

People

(Reporter: harry.percival, Assigned: richard)

Details

(Keywords: crash, Whiteboard: [tor 20283])

Attachments

(1 file, 7 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130326150557

Steps to reproduce:

attempt to start firefox in a sandboxed (chrooted) environment, where /proc is not available


Actual results:

firefox crashes on startup, with error message:

ABORT: Recursive layout module initialization: file /builds/slave/rel-m-rel-l64_bld-000000000000/build/layout/build/nsLayoutModule.cpp, line 372

here's someone else reporting the same problem: http://kevinlocke.name/bits/2013/02/22/firefox-fails-in-odd-ways-without-proc/


Expected results:

 I gather this is relatively new behaviour... If at all possible, it would be lovely if Firefox were to just gracefully degrade in some way, and survive a missing /proc.
OS: Windows 7 → Linux
Severity: normal → critical
Component: Untriaged → General
Keywords: crash
Summary: Firefox cannot start without /proc → Firefox cannot start without /proc (chroot)
harry hasn't responded
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → INCOMPLETE
I think this is more likely to be not-a-bug than not-reproducible.
Status: RESOLVED → UNCONFIRMED
Resolution: INCOMPLETE → ---
Attached patch 859782_00.patch (obsolete) — Splinter Review
Patch which uses __libc_stack_end to calculate the stack base for the main thread on Linux to avoid reading /proc/self/maps
Attachment #8966695 - Flags: review?(aswan)
Comment on attachment 8966695 [details] [diff] [review]
859782_00.patch

I'm not an appropriate reviewer for this patch.  I would suggest moving the Product/Component to something like Core:Javascript Engine and then bugzilla should suggest some appropriate reviewers.
Your patch also appears to be against a very old version of mozilla-central, it will need to be rebased sooner or later if its going to land, I would also suggest doing that before requesting review.
Attachment #8966695 - Flags: review?(aswan)
Component: General → JavaScript Engine
Product: Firefox → Core
Version: 20 Branch → unspecified
Whoops, this patch was actually against mozilla-unified.  Pulling down mozilla-central and will rebase and update the patch.  Thanks!
Attached patch 859782_01.patch (obsolete) — Splinter Review
Updated patch against mozilla-central.

Firefox uses the current stack frame address and the stack size
as a sort of heuristic for various things in the javascript
engine.  The js::GetNativeStackBaseImpl() function is used to
get the base stack address (ie the address from which the stack
grows, so this can be either the first or last memory address of
the stack memory space depending on the CPU architecture).

On Linux, this function is implemented using the pthreads APIs.
For non-main threads, the queried thread info is stored in
memory.  The main thread does not have this information on hand,
so it gets the stack memory range via the /proc/self/maps file
( see glibc's pthread_get_attr_np.c ).

Fortunately (per discussions with the firefox devs in #jsapi)
the base address only needs to be approximation.  In reality,
environment variables, args, and other things are stored in space
between the end/beginning of the mapped stack memory and the 'top'
of the stack space used by stack frames.

We can get the top of this usable stack from __libc_stack_end,
which is a void* set by glibc during program initialization.
Non-main threads still get their stack-base through the usual
pthreads APIs.

This patch creates a specific implementation of 
js::GetNativeStackBaseImpl() for non-android Linux using the
described __libc_stack_end read.
Attachment #8966695 - Attachment is obsolete: true
Attachment #8966748 - Flags: review?(luke)
Attachment #8966748 - Flags: review?(kvijayan)
Attachment #8966748 - Flags: review?(jorendorff)
Comment on attachment 8966748 [details] [diff] [review]
859782_01.patch

Forwarding review to sfink (who happens to be taking a bit of time off, but is back tomorrow)
Attachment #8966748 - Flags: review?(sphink)
Attachment #8966748 - Flags: review?(luke)
Attachment #8966748 - Flags: review?(kvijayan)
Attachment #8966748 - Flags: review?(jorendorff)
Summary for sfink: Currently we call pthread_attr_getstack() to get the address range of the stack. Unfortunately, for the main thread, the implementation in glibc uses /proc/self/maps to do its job. Boo. However, for our purpose, we don't need a precise answer. We can use this alternative hack to get good-enough information without relying on /proc.

(If that is helpful, you might want to ask for a better comment!)
Comment on attachment 8966748 [details] [diff] [review]
859782_01.patch

Review of attachment 8966748 [details] [diff] [review]:
-----------------------------------------------------------------

Marking r+, but you'll probably have to upload an updated revision for checkin if you don't have commit privileges.

::: js/src/util/NativeStack.cpp
@@ +68,5 @@
> +static pid_t
> +gettid()
> +{
> +    return syscall(__NR_gettid);
> +}

Ugh. I thought we had a main thread utility function, but it looks like the closest is NS_IsMainThread from XPCOM. So I agree with you, this works.

@@ +74,5 @@
> +void*
> +js::GetNativeStackBaseImpl()
> +{
> +    // main thread, get stack base from __libc_stack_end rather than pthread APIs
> +    // to avoid filesystem calls /proc/self/maps

Make the comment a complete sentence, ending with a period, and also mention why we want to avoid filesystem calls (eg ".../proc/self/maps, which may not be available in a sandboxed configuration.").

@@ +78,5 @@
> +    // to avoid filesystem calls /proc/self/maps
> +    if(gettid() == getpid()) {
> +        void** pLibcStackEnd = (void**)dlsym(RTLD_DEFAULT, "__libc_stack_end");
> +        // if __libc_stack_end is not found, architecture specific frame pointer hopping will need
> +        // to be implemented

period at the end

@@ +79,5 @@
> +    if(gettid() == getpid()) {
> +        void** pLibcStackEnd = (void**)dlsym(RTLD_DEFAULT, "__libc_stack_end");
> +        // if __libc_stack_end is not found, architecture specific frame pointer hopping will need
> +        // to be implemented
> +        MOZ_ASSERT(pLibcStackEnd);

MOZ_RELEASE_ASSERT(pLibcStackEnd, "__libc_stack_end unavailable, unable to set up stack range for JS") or something.

@@ +81,5 @@
> +        // if __libc_stack_end is not found, architecture specific frame pointer hopping will need
> +        // to be implemented
> +        MOZ_ASSERT(pLibcStackEnd);
> +        void* stackBase = *pLibcStackEnd;
> +        MOZ_ASSERT(stackBase);

MOZ_RELEASE_ASSERT(stackBase, "invalid stack base, unable to set up stack range for JS") or similar.

@@ +86,5 @@
> +        // we don't need to fix stackBase, as it already roughly points to beginning of the stack
> +        return stackBase;
> +    }
> +    // non-main threads have the required info stored in memory, no filesystem calls are made
> +    else {

Add a period at the end of the comment.
Attachment #8966748 - Flags: review?(sphink) → review+
Whiteboard: [tor 20283]
Attached patch Updated patch based on feedback (obsolete) — Splinter Review
Attachment #8966748 - Attachment is obsolete: true
Attachment #8969059 - Flags: review+
Attachment #8969059 - Flags: checkin?
Comment on attachment 8969059 [details] [diff] [review]
Updated patch based on feedback

Just saw RyanVM's request to use checkin-needed instead.
Attachment #8969059 - Flags: checkin?
Pushed by nerli@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0ab0d909476f
Firefox cannot start without /proc (chroot) r=sphink
Keywords: checkin-needed
Comment on attachment 8969059 [details] [diff] [review]
Updated patch based on feedback

Review of attachment 8969059 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/util/NativeStack.cpp
@@ +78,5 @@
> +    // to avoid filesystem calls /proc/self/maps.  Non-main threads spawned with pthreads can read
> +    // this information directly from their pthread struct, but the main thread must go parse
> +    // /proc/self/maps to figure the mapped stack address space ranges.  We want to avoid reading
> +    // from /proc/ so that firefox can run in sandboxed environments where /proc may not be mounted
> +    if(gettid() == getpid()) {

I just saw this patch and I am not quite sure if this works. I think getpid() in the child process always returns 0 when sandboxed. https://searchfox.org/mozilla-central/source/security/sandbox/linux/SandboxFilter.cpp#568
Backout by ebalazs@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/22ed5e1657aa
Backed out changeset 0ab0d909476f for bustage in builds/worker/workspace/build/src/js/src/util/NativeStack.cpp on a CLOSED TREE
You want to needinfo the patch author.
Assignee: nobody → richard
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Flags: needinfo?(sdetar) → needinfo?(richard)
(In reply to Tom Schuster [:evilpie] from comment #15)
> Comment on attachment 8969059 [details] [diff] [review]
> Updated patch based on feedback
> 
> Review of attachment 8969059 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/util/NativeStack.cpp
> @@ +78,5 @@
> > +    // to avoid filesystem calls /proc/self/maps.  Non-main threads spawned with pthreads can read
> > +    // this information directly from their pthread struct, but the main thread must go parse
> > +    // /proc/self/maps to figure the mapped stack address space ranges.  We want to avoid reading
> > +    // from /proc/ so that firefox can run in sandboxed environments where /proc may not be mounted
> > +    if(gettid() == getpid()) {
> 
> I just saw this patch and I am not quite sure if this works. I think
> getpid() in the child process always returns 0 when sandboxed.
> https://searchfox.org/mozilla-central/source/security/sandbox/linux/
> SandboxFilter.cpp#568

That's only for getppid(), not getpid(). (I asked jld.)

But jld did also say

<jld> That *is* going to be a problem for bug 1151624, because my current plan is to make getpid() return the pid in the parent's namespace, but syscall(__NR_getpid) is still the real getpid.

which makes me a little worried that syscall(__NR_gettid) == getpid() might not return things in the same namespace at some point.

Ok, how about this -- I'll complain about some style issues here that I incorrectly ignored, and then punt the review over to jld. :-) Though evilpie may need to do his own complaining, because although I initially thought that the if { ...; return; } else { ... } should be switched to an early return without the else, I changed my mind and decided that what you already have better represents the logic (the mainthread case is not an exceptional case to get out of the way first, it's on equal footing.) But there's other stuff, and of course it exploded on landing. I can do try pushes of updated patches, if you don't have access?
Comment on attachment 8969059 [details] [diff] [review]
Updated patch based on feedback

Review of attachment 8969059 [details] [diff] [review]:
-----------------------------------------------------------------

evilpie, if there are additional style violations that I'm ignoring, please point them out.

I'll clear review for now until the build failures are fixed. (Maybe add unistd.h, and possibly #define _GNU_SOURCE ? Not sure.)

::: js/src/util/NativeStack.cpp
@@ +63,5 @@
>  
> +#elif defined(XP_LINUX) && !defined(ANDROID)
> +
> +# include <dlfcn.h>
> +# include <sys/syscall.h>

I probably should have pointed this out initially, but the #includes should really be up at the top with the rest. It's already doing a bunch of platform detection.
Attachment #8969059 - Flags: review+ → feedback?(evilpies)
Weird, I pushed the first revision of the patch through the try servers and they came back clean, though subsequent iteration I only built locally (Linux x64).  I'll push a new version once I find and fix the build issue.  I'll also move those includes ;)
Flags: needinfo?(richard)
Comment on attachment 8969059 [details] [diff] [review]
Updated patch based on feedback

Review of attachment 8969059 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/util/NativeStack.cpp
@@ +78,5 @@
> +    // to avoid filesystem calls /proc/self/maps.  Non-main threads spawned with pthreads can read
> +    // this information directly from their pthread struct, but the main thread must go parse
> +    // /proc/self/maps to figure the mapped stack address space ranges.  We want to avoid reading
> +    // from /proc/ so that firefox can run in sandboxed environments where /proc may not be mounted
> +    if(gettid() == getpid()) {

if ( [add space]

@@ +88,5 @@
> +        MOZ_RELEASE_ASSERT(stackBase, "invalid stack base, unable to setup stack range for JS");
> +        // We don't need to fix stackBase, as it already roughly points to beginning of the stack
> +        return stackBase;
> +    }
> +    // Non-main threads have the required info stored in memory, so no filesystem calls are made.

} else { 
// Non-main

@@ +98,5 @@
> +        // stackBase will be the *lowest* address on all architectures.
> +        void* stackBase = nullptr;
> +        size_t stackSize = 0;
> +        int rc = pthread_attr_getstack(&sattr, &stackBase, &stackSize);
> +        if(rc) {

if (rc)

@@ +99,5 @@
> +        void* stackBase = nullptr;
> +        size_t stackSize = 0;
> +        int rc = pthread_attr_getstack(&sattr, &stackBase, &stackSize);
> +        if(rc) {
> +            MOZ_CRASH();

Probably better to pass a reason string here.
Attached patch 859782_03.patch (obsolete) — Splinter Review
Attachment #8969059 - Attachment is obsolete: true
Attachment #8969059 - Flags: feedback?(evilpies)
Attachment #8969457 - Flags: checkin?
Patch applies!  I moved the includes and the gettid() function up with the rest of the includes, and did the same for the SOLARIS and AIX ucontext.h include for good measure.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=74f2b4b1c87ac3029be1f4a3f8a3043a6e2bda2d

EDIT: Just noticed the new feedback, updating the patch with the various white-space changes and message for the MOZ_CRASH
(In reply to Steve Fink [:sfink] [:s:] (PTO Apr9-12) from comment #19)
> But jld did also say
> 
> <jld> That *is* going to be a problem for bug 1151624, because my current
> plan is to make getpid() return the pid in the parent's namespace, but
> syscall(__NR_getpid) is still the real getpid.
> 
> which makes me a little worried that syscall(__NR_gettid) == getpid() might
> not return things in the same namespace at some point.

To clarify, the situation I'd be creating is: syscall(__NR_getpid) == 1, syscall(__NR_gettid) == 1 iff on the main thread, and getpid() != 1.  (This assumes I do in fact wind up interposing getpid(), rather than solving the pid confusion some other way, but so far that seems to be the least bad approach.)

And it's not a big problem if I have to replace the getpid() when I go to land that, so I think it's okay to leave it.



Ironically, when /proc doesn't exist none of that matters, because:
1. /proc/self doesn't exist, so my fake getpid() won't work; I'd need to detect that and turn off pid namespaces
2. /proc/self/ns/user doesn't exist, which currently means we assume the kernel was built with CONFIG_USER_NS=n and don't use or even directly probe for namespace support.  (This should be fixed, but that's another bug.)

But this code will also be used in the normal /proc-ful case, as I understand it (and that's a good thing, because
Attached patch 859782_04.patch (obsolete) — Splinter Review
Attachment #8969457 - Attachment is obsolete: true
Attachment #8969457 - Flags: checkin?
Fixed spacing, added assert string per Tom's feedback.
Comment on attachment 8969457 [details] [diff] [review]
859782_03.patch

Review of attachment 8969457 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/util/NativeStack.cpp
@@ +70,5 @@
>      return static_cast<char*>(context.uc_stack.ss_sp) +
>          context.uc_stack.ss_size;
>  }
>  
> +#elif defined(XP_LINUX) && !defined(ANDROID)

This condition should be checking for glibc, I think — other Linux libcs won't necessarily have this symbol.  For example, musl doesn't seem to.
Attachment #8969457 - Attachment is obsolete: false
Comment on attachment 8969457 [details] [diff] [review]
859782_03.patch

(Fixing mid-air collision.)
Attachment #8969457 - Attachment is obsolete: true
Attached patch 859782_05.patch (obsolete) — Splinter Review
Added explicit __GLIBC__ check in addition existing LINUX and !ANDROID checks for implementation per JLD's suggestion
Attachment #8969492 - Attachment is obsolete: true
Attachment #8979368 - Flags: review?(jld)
Attachment #8979368 - Flags: review?(evilpies)
Comment on attachment 8979368 [details] [diff] [review]
859782_05.patch

Review of attachment 8979368 [details] [diff] [review]:
-----------------------------------------------------------------

I just looked at the style here. Personally I would usually add a blank line before comments to give a bit more breathing room.

::: js/src/util/NativeStack.cpp
@@ +88,5 @@
> +        void* stackBase = *pLibcStackEnd;
> +        MOZ_RELEASE_ASSERT(stackBase, "invalid stack base, unable to setup stack range for JS");
> +        // We don't need to fix stackBase, as it already roughly points to beginning of the stack
> +        return stackBase;
> +    } else {

Remove else block. (no else after return)
Attachment #8979368 - Flags: review?(evilpies) → review+
Attached patch 859782_06.patch (obsolete) — Splinter Review
Added new lines before comments, removed else block
Attachment #8982017 - Flags: review?(evilpies)
Attachment #8982017 - Flags: review?(evilpies) → review+
Attachment #8979368 - Flags: review?(jld) → review+
Keywords: checkin-needed
Attachment #8979368 - Attachment is obsolete: true
Please attach a patch which includes commit information (name, email, descriptive commit message) and re-request checkin.
Flags: needinfo?(richard)
Keywords: checkin-needed
Attached patch 859782_07.patchSplinter Review
Attachment #8982017 - Attachment is obsolete: true
Flags: needinfo?(richard)
Keywords: checkin-needed
Can this patch be landed even tho it has no review+?
Flags: needinfo?(richard)
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e8d4a24e47a9
Firefox cannot start without /proc (chroot). r=sfink,evilpie,jld
Keywords: checkin-needed
(There were no changes to the patch since evilpie's review, no functional changes since mine, and the changes made in the sfink's r+-with-changes have been made (and I'm assuming the minor changes made after that wouldn't require re-review).)
Flags: needinfo?(richard)
https://hg.mozilla.org/mozilla-central/rev/e8d4a24e47a9
Status: ASSIGNED → RESOLVED
Closed: 10 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
You need to log in before you can comment on or make changes to this bug.