Closed Bug 1066760 Opened 5 years ago Closed 5 years ago

Make abort crashes easier to diagnose

Categories

(Firefox for Android :: General, defect)

All
Android
defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 35

People

(Reporter: jchen, Assigned: jchen)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 2 obsolete files)

Right now when abort() is called, the stack shows up as somewhere inside libmozalloc.so. Most of the time, Socorro can unwind the stack and/or retrieve the abort message. However, the Play Store doesn't have that capability, leading to, e.g., bug 1060009.
I think we'll get better results in the Play Store if we turned abort() calls into Java exceptions. Even on crash-stats, we'll see individual aborts instead of everything lumped into a crash in libmozalloc.so.
Attachment #8488831 - Flags: review?(snorp)
Also, a lot of times we don't get a stack trace on Android when calling abort(). However, we can change the abort message to include the address that's calling abort() to help diagnosis.
Attachment #8488837 - Flags: review?(snorp)
Attachment #8488837 - Flags: review?(snorp) → review+
Add lib base address to the __wrap_dladdr output and fall back to using the system dladdr.
Attachment #8490943 - Flags: review?(mh+mozilla)
Added a check for SIGSEGV handler being inside libxul (which we assume is the crash reporter). In that case we return from abortThroughJava early and go through MOZ_CRASH directly. Otherwise, in the case of the crash reporter not yet being initialized, we redirect the abort through Java.

Glandium, can you take a look too?
Attachment #8488831 - Attachment is obsolete: true
Attachment #8488831 - Flags: review?(snorp)
Attachment #8490946 - Flags: review?(snorp)
Attachment #8490946 - Flags: review?(mh+mozilla)
New patch that uses dladdr.
Attachment #8488837 - Attachment is obsolete: true
Attachment #8490948 - Flags: review+
Comment on attachment 8490946 [details] [diff] [review]
Redirect mozalloc_abort through Java exception handling (v2)

Review of attachment 8490946 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozglue/android/APKOpen.cpp
@@ +137,5 @@
> +    }
> +
> +    Dl_info info = {};
> +    if ((sigact.sa_flags & SA_SIGINFO) &&
> +        __wrap_dladdr(reinterpret_cast<void*>(sigact.sa_sigaction), &info) &&

I think this should work, but it may be better to just use ElfLoader::Singleton.GetHandleByPtr(). You can then simply check handle && !handle->IsSystemElf() exactly like we do in the SEGV handler.
Attachment #8490946 - Flags: review?(snorp) → review+
Comment on attachment 8490943 [details] [diff] [review]
Add base address and fall back to system dladdr in __wrap_dladdr (v1)

Review of attachment 8490943 [details] [diff] [review]:
-----------------------------------------------------------------

snorp said to ask froydnj
Attachment #8490943 - Flags: review?(nfroyd)
Comment on attachment 8490943 [details] [diff] [review]
Add base address and fall back to system dladdr in __wrap_dladdr (v1)

Review of attachment 8490943 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense.

::: mozglue/linker/ElfLoader.cpp
@@ +96,2 @@
>    info->dli_fname = handle->GetPath();
> +  info->dli_fbase = handle->GetBase();

Might also be prudent to set dli_sname and dli_saddr to 0.  Then again, since we weren't setting them before...
Attachment #8490943 - Flags: review?(nfroyd) → review+
Comment on attachment 8490943 [details] [diff] [review]
Add base address and fall back to system dladdr in __wrap_dladdr (v1)

Nathan's review is enough here.
Attachment #8490943 - Flags: review?(mh+mozilla)
Comment on attachment 8490946 [details] [diff] [review]
Redirect mozalloc_abort through Java exception handling (v2)

Review of attachment 8490946 [details] [diff] [review]:
-----------------------------------------------------------------

snorp's review is enough.

::: mozglue/linker/ElfLoader.h
@@ +333,5 @@
> +protected:
> +  SEGVHandler();
> +  ~SEGVHandler();
> +
> +private:

Why are you moving this?
Attachment #8490946 - Flags: review?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #10)
> Comment on attachment 8490946 [details] [diff] [review]
> Redirect mozalloc_abort through Java exception handling (v2)
> 
> ::: mozglue/linker/ElfLoader.h
> @@ +333,5 @@
> > +protected:
> > +  SEGVHandler();
> > +  ~SEGVHandler();
> > +
> > +private:
> 
> Why are you moving this?

I moved ElfLoader::__wrap_sigaction to public so we can use it inside mozglue to find the real sigsegv handler.
(In reply to Jim Chen [:jchen :nchen] from comment #11)
> I moved ElfLoader::__wrap_sigaction to public so we can use it inside
> mozglue to find the real sigsegv handler.

The real sigsegv handler is not an interesting information. All it will tell you is that the handler is in libmozglue. That won't tell you if gecko is loaded and initialized.
(In reply to Mike Hommey [:glandium] from comment #12)
> (In reply to Jim Chen [:jchen :nchen] from comment #11)
> > I moved ElfLoader::__wrap_sigaction to public so we can use it inside
> > mozglue to find the real sigsegv handler.
> 
> The real sigsegv handler is not an interesting information. All it will tell
> you is that the handler is in libmozglue. That won't tell you if gecko is
> loaded and initialized.

Sorry, by "real sigsegv handler" I meant the one that's wrapped by the faulty.lib one. If we call the system sigaction it's going to give us the faulty.lib one, but if we call __wrap_sigaction, it's going to give us the sigsegv handler that faulty.lib is wrapping. We use the wrapped handler to check if the crash reporter is (likely) loaded or not.
Ah yes, makes sense.
You need to log in before you can comment on or make changes to this bug.