Closed Bug 941044 Opened 11 years ago Closed 10 years ago

After Australis update, experiencing crash but no crash reporter

Categories

(Firefox :: General, defect)

x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mconley, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, Whiteboard: [Australis:P-])

Attachments

(1 file)

This one is pretty disturbing - after catlee updated to a recent Nightly, his browser crashed twice, and each time the crash reporter didn't come up. It's unclear if this is related to the Australis merge, but I'm adding this as blocking australis-merge just in case. If anybody else experiences this, please mention it here.
Slight correction to this: the behaviour I see is that I apply the update, restart firefox, crash, restart firefox, and then it stays running fine. I've hit this both for yesterday's update and today's.
Also, there's no evidence of this in about:crashes.
Happened again today. I'm having problems catching this in gdb...any tips for following firefox process across restarts?
(In reply to Chris AtLee [:catlee] from comment #3)
> Happened again today. I'm having problems catching this in gdb...any tips
> for following firefox process across restarts?
Ehsan worked on background updates - I wonder if he ever had to do something similar. Let's ask.
Flags: needinfo?(ehsan)
(There is no usable gdb on Windows) Can you try to see if the crash me add-on brings up the crash dialog? For following the process through restarts on Windows, the easiest way is to add a long sleep somewhere at the beginning of XRE_main(), and attach visual studio when Firefox is restarted before the sleep comes to an end.
Flags: needinfo?(ehsan) → needinfo?(catlee)
I'm running a linux64 build, so gdb should work! I do get the crash dialog when using crashme.
Flags: needinfo?(catlee)
There are already a number of (known) ways that Firefox can crash without triggering the crash reporter. So I'd suspect this isn't Australis having broken the reporter, but it's quite possible that Australis has exposed/caused one of these kinds of crashes.
(Sorry for some reason I thought this is about Windows!)
P1 on figuring this out, I guess...
Whiteboard: [Australis:P1]
Any luck getting a stacktrace this morning, catlee?
Flags: needinfo?(catlee)
Attached file firefox.log
Caught this today. Here's the output of 'thread apply all bt' (attached). The process died with SIGSEGV.
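For anyone else trying to collect the same data, here is a minimal sketch of capturing 'thread apply all bt' non-interactively. It assumes gdb is installed and that the target process is still alive and not already being traced; the function names and the log path argument are hypothetical, and this is not necessarily how the attached log was produced.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that dumps every thread's backtrace for pid."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", "-p", str(pid)]


def capture_backtraces(pid, log_path):
    """Run gdb against a live process and write its output to log_path."""
    with open(log_path, "w") as log:
        # Attaching may require ptrace permission, depending on system settings.
        return subprocess.run(
            gdb_backtrace_command(pid), stdout=log, stderr=subprocess.STDOUT
        ).returncode
```

Something like capture_backtraces(pid, "firefox.log") would write the traces to a file that can be attached to the bug.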
Flags: needinfo?(catlee)
Attachment #8341718 - Attachment mime type: text/x-log → text/plain
Which OS(es) is this happening with?
(In reply to Chris AtLee [:catlee] from comment #11)
> Created attachment 8341718 [details]
> firefox.log
>
> Caught this today. Here's the output of 'thread apply all bt' (attached).
> The process died with SIGSEGV.
If Thread 1 is the crashing thread then this is a JS engine crash in the JIT.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #12)
> Which OS(es) is this happening with?
I'm running 64-bit linux (debian testing/unstable)
Untracking for Australis, I don't think it's likely that this is actually Australis-specific (see comment 7), especially given the lack of widespread reports of similar problems.
Whiteboard: [Australis:P1] → [Australis:P-]
This is still happening to me several times a day. Anything else I can do to help debug?
(In reply to Chris AtLee [:catlee] from comment #16)
> This is still happening to me several times a day. Anything else I can do to
> help debug?
Seeing as people said this might be JIT-related, have you tried turning off some of the JIT prefs for chrome and/or content?
Alternatively, you could check if you can reproduce on some of the older (archived) UX builds and/or otherwise get an idea of the regression range. Probably want to start with the UX builds just before merging to see if those had the problem while the regular nightlies produced at the same time were OK.
This happens very frequently immediately after typing in my master password.
So it's likely the stack trace I posted here is irrelevant due to bug 957729. I'll re-run with JS_DISABLE_SLOW_SCRIPT_SIGNALS=1 and see what happens next time.
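A minimal sketch of such a re-run, for anyone who wants to reproduce it without a debugger attached: launch the browser with the variable set and report whether the process exited normally or was killed by a signal. The function name is hypothetical and the Firefox path is whatever the local install uses. This only watches the process it starts directly, so it may not follow a restart that replaces the process.

```python
import os
import signal
import subprocess


def run_with_env(cmd, extra_env):
    """Run cmd with extra_env added to the environment; describe how it ended."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(cmd, env=env)
    if result.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        signum = -result.returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "killed by " + name
    return "exited with status %d" % result.returncode
```

Calling run_with_env(["/path/to/firefox"], {"JS_DISABLE_SLOW_SCRIPT_SIGNALS": "1"}) would then return a one-line summary once the browser goes away.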
So re-running with the env var set, I get this crash:
Program terminated with signal 5, Trace/breakpoint trap.
#0 0x00007fee832b91f1 in _dl_debug_state () from /lib64/ld-linux-x86-64.so.2
#1 0x00007fee832b057a in ?? () from /lib64/ld-linux-x86-64.so.2
#2 0x00007fee832b224b in ?? () from /lib64/ld-linux-x86-64.so.2
#3 0x00007fee832bc8bc in ?? () from /lib64/ld-linux-x86-64.so.2
#4 0x00007fee832b8806 in ?? () from /lib64/ld-linux-x86-64.so.2
#5 0x00007fee832bc339 in ?? () from /lib64/ld-linux-x86-64.so.2
#6 0x00007fee821de722 in do_dlopen (ptr=0x7fedfe9fd4e0) at dl-libc.c:87
#7 0x00007fee832b8806 in ?? () from /lib64/ld-linux-x86-64.so.2
#8 0x00007fee821de7bf in dlerror_run (operate=operate@entry=0x7fee821de6e0 <do_dlopen>, args=args@entry=0x7fedfe9fd4e0) at dl-libc.c:46
#9 0x00007fee821de831 in __GI___libc_dlopen_mode (name=name@entry=0x7fedfe9fd510 "libnss_mdns4.so.2", mode=mode@entry=-2147483647) at dl-libc.c:163
#10 0x00007fee821b7bc8 in nss_load_library (ni=<optimized out>) at nsswitch.c:399
#11 nss_load_library (ni=0x7fee714c8500, ni=0x7fee714c8500) at nsswitch.c:368
#12 0x00007fee821b836f in __GI___nss_lookup_function (ni=0x7fee714c8500, fct_name=fct_name@entry=0x7fee82228cd7 "gethostbyname4_r") at nsswitch.c:507
#13 0x00007fee821835e9 in gaih_inet (name=name@entry=0x7fee278f83b8 "fls-devo.vipinteg.amazon.com", service=<optimized out>, req=req@entry=0x7fedfe9fddb0, pai=pai@entry=0x7fedfe9fdc30, naddrs=naddrs@entry=0x7fedfe9fdc28) at ../sysdeps/posix/getaddrinfo.c:840
#14 0x00007fee82186a14 in __GI_getaddrinfo (name=0x7fee278f83b8 "fls-devo.vipinteg.amazon.com", service=<optimized out>, hints=0x7fedfe9fddb0, pai=0x7fedfe9fdda8) at ../sysdeps/posix/getaddrinfo.c:2473
#15 0x00007fee81ced4d9 in PR_GetAddrInfoByName () from /home/catlee/minefield/libnspr4.so
#16 0x00007fee7d461b64 in nsHostResolver::ThreadFunc(void*) () from /home/catlee/minefield/libxul.so
#17 0x00007fee81ceac06 in _pt_root () from /home/catlee/minefield/libnspr4.so
#18 0x00007fee83095e0e in start_thread (arg=0x7fedfe9fe700) at pthread_create.c:311
#19 0x00007fee821a80fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Hmm, according to http://sysprogs.com/blog/?p=51, _dl_debug_state() should be an empty function in the dynamic library loading stack that exists so that GDB can hook in and be notified that a shared library has been loaded. That could mean the stack in comment 21 is an artifact of having GDB attached, rather than the actual crash; it could also be that something else is wrong with our attempt to load libnss_mdns4.so.2.
Chris, to rule out the possibility that this problem is caused by a broken Nightly update, could you move aside (and save) your existing Firefox install directory and install a fresh copy?
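One rough way to check the second possibility outside of GDB is to ask the dynamic loader for the library directly from a small script. This is only a sketch: the helper name is hypothetical, and it tests loading in the Python process, not inside Firefox, so a success here does not prove that the load works in the browser.

```python
import ctypes


def try_load(library_name):
    """Attempt to load library_name; return (loaded, error_message)."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        # The loader's own error text usually says why the load failed.
        return False, str(exc)
    return True, ""
```

Calling try_load("libnss_mdns4.so.2") would report whether the loader can find and open the library on that machine, along with the loader's error text if it cannot.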
Flags: needinfo?(catlee)
Haven't hit this in a while. Magical fixes FTW?
Flags: needinfo?(catlee)
The crash reporter seems to be working as expected in the latest Release and trunk builds (except OSX). I will go ahead and mark this report worksforme.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME