19.90 KB, patch
|Details | Diff | Splinter Review|
1.56 KB, patch
|Details | Diff | Splinter Review|
19.93 KB, patch
|Details | Diff | Splinter Review|
It is #2 top crasher in Fennec 4.0b3 for the last week. Signature libc.so@0xb538 UUID f03fb650-fa24-4202-a089-f0e512110104 Time 2011-01-04 11:59:57.565009 Uptime 8 Install Age 438247 seconds (5.1 days) since version was first installed. Product Fennec Version 4.0b3 Build ID 20101221205132 Branch 1.9 OS Linux OS Version 0.0.0 Linux 126.96.36.199 #1 Tue Nov 9 00:50:41 KST 2010 armv7l CPU arm CPU Info Crash Reason SIGSEGV Crash Address 0x2b7fdc App Notes samsung GT-P1000 samsung/GT-P1000/GT-P1000/GT-P1000:2.2/FROYO/JPJK2:user/release-keys Frame Module Signature [Expand] Source 0 libc.so libc.so@0xb538 1 libc.so libc.so@0xc94c 2 libxul.so std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_ stl_tree.h:914 3 libc.so libc.so@0xcfe8 4 libc.so libc.so@0x13484 5 libxul.so base::CloseSuperfluousFds ipc/chromium/src/base/process_util_posix.cc:138 6 libxul.so base::LaunchApp stl_tree.h:243 7 libxul.so mozilla::ipc::GeckoChildProcessHost::PerformAsyncLaunch ipc/glue/GeckoChildProcessHost.cpp:564 8 libxul.so RunnableMethod<mozilla::ipc::GeckoChildProcessHost, bool , Tuple2<std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, base::ProcessArchitecture> >::Run ipc/chromium/src/base/tuple.h:400 9 libxul.so MessageLoop::RunTask ipc/chromium/src/base/message_loop.cc:344 10 libxul.so MessageLoop::DeferOrRunPendingTask ipc/chromium/src/base/message_loop.cc:354 11 libxul.so MessageLoop::DoWork ipc/chromium/src/base/message_loop.cc:451 12 libxul.so base::MessagePumpLibevent::Run ipc/chromium/src/base/message_pump_libevent.cc:310 13 libxul.so MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:220 14 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:512 15 libxul.so base::Thread::ThreadMain ipc/chromium/src/base/thread.cc:159 16 libxul.so ThreadFunc ipc/chromium/src/base/platform_thread_posix.cc:28 17 libc.so libc.so@0x111b3 18 libc.so libc.so@0x10ca3 More reports at: http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=libc.so%400xb538
With combined signatures, it is #1 top crasher in Fennec 4.0b3 for the last week.
(In reply to comment #1) > With combined signatures, it is #1 top crasher in Fennec 4.0b3 for the last > week. Can you find it in the nightlies? I couldn't when I looked.
> Can you find it in the nightlies? I couldn't when I looked. You can sort out crash reports by version (click on the arrow on the right of version column). See: bp-9fbea702-7245-4de3-9972-ad3c52101221 According to the low crash daily rate in nightlies, it could be fixed in recent nightlies. But I don't think so because there are only 220 ADU (Active Daily User) in nightlies and 22,000 ones in Beta 3 (ratio of 100).
I added other crash signatures. More reports at: http://crash-stats.mozilla.com/query/query?product=Fennec&range_value=4&range_unit=weeks&query_search=signature&query_type=startswith&query=libc.so@0xb&build_id=&process_type=any&hang_type=any&do_query=1
Ahh... my previous search in the nightly crash stats didn't go back far enough. The last crash report from a nightly build with CloseSuperfluousFds in the stack was from Dec 19th. I'm optimistically thinking that this has solved itself.
marking as fixed since we haven't seen a crash since the 12/19 build. Please reopen if you see it again.
I saw one in Fennec 4.0b4pre/20110109: bp-c6c92ae0-ce08-40c5-9f7b-60ca82110109
Is this currently a topcrasher? cjones, can you think of why this might be crashing? It appears to be calling opendir("/self/proc/fd").
> Is this currently a topcrasher? No. There have been only one crash in Fennec 4.0b4pre for the last week.
I don't know which part of the backtrace to believe. Crashing at std::_Rb_tree::_M_insert_ suggests OOM to me, and nothing else really. Otherwise, no ideas.
(In reply to comment #10) > I don't know which part of the backtrace to believe. Crashing at > std::_Rb_tree::_M_insert_ suggests OOM to me, and nothing else really. > Otherwise, no ideas. Just hit that one - on startup only application running was fennec.
With combined signatures, it is a virtual #13 top crasher in Firefox for mobile over the last 3 days.
is bug 663494 a duplicate of this bug?
With the dlmalloc_stats crash signature, it is #2 top crasher in 6.0 and #1 in 7.0 Beta.
In 7.0.1, there are only dlmalloc_stats (Samsung devices), ThreadFunc crash signatures. Crashes happen at startup. It's probably a dupe of bug 663494 (same stack traces) about energy saving power down.
http://code.google.com/p/chromium/issues/detail?id=36678 http://git.chromium.org/gitweb/?p=chromium.git;a=blob;f=base/process_util_posix.cc#l334 According to this, the problem is the use of dynamic memory allocation in that function. This is line Chris Jones' comment 10.
Right-o, good find!
Hmm, apparently we already knew about that issue, fixed it, and you reviewed it, in bug 680190.
The crash is still happening in Fennec 9.0, so either it's something else, or the fixes in 680190 aren't enough. The crash happens in opendir, or thereabouts. That function is both locking and malloc-ing stuff. If I get bug 680190 correctly, the malloc aren't actually a problem. But the locking would be, right? If we are in opendir in a thread at the same time as the LaunchApp is called, we're dead? http://nv-tegra.nvidia.com/gitweb/?p=android/platform/bionic.git;a=blob;f=libc/unistd/opendir.c
Created attachment 565205 [details] [diff] [review] Patch 1. Avoid malloc and locking after fork
Created attachment 565206 [details] [diff] [review] Patch 2. Remove the hacks from the previous fix
This updates our IPC code to be more in sync with the current Chromium code, which avoids dynamic memory allocation (and the opendir) between the fork() and execvp() calls. This should also allow us to remove the prefork/postfork jemalloc hacks. I'm not positive we're really hitting the problem case (and hence, if this will resolve the bug), but unless anyone has a better idea... Try run: https://tbpl.mozilla.org/?tree=Try&rev=cb9a9ce90ddc
Hi Gian-Carlo, (In reply to Gian-Carlo Pascutto (:gcp) from comment #21) > Created attachment 565205 [details] [diff] [review] [diff] [details] [review] > Patch 1. Avoid malloc and locking after fork Is this just merging in upstream changes? I can rubber-stamp those, but if there's new code included in this patch, I need to review that.
Comment on attachment 565206 [details] [diff] [review] Patch 2. Remove the hacks from the previous fix >diff --git a/ipc/chromium/src/base/process_util_linux.cc b/ipc/chromium/src/base/process_util_linux.cc >--- a/ipc/chromium/src/base/process_util_linux.cc >+++ b/ipc/chromium/src/base/process_util_linux.cc It's OK to remove the hacks from here since the helper function always fork()+exec()s, but we need to keep the general workaround code in jemalloc for situations in which we just fork() in a jemalloc-using process.
>Is this just merging in upstream changes? I can rubber-stamp those, but if >there's new code included in this patch, I need to review that. It's merging in upstream changes, but it's not a direct update. I made the patch that is in the chrome bug mentioned earlier on our codebase, and edited it wherever appropriate to avoid bringing in more updates (notably, it was using a LOG define that we don't seem to have yet). If I messed up, it's more likely to be in process_util_* since the other files were mostly direct copypasting.
Comment on attachment 565205 [details] [diff] [review] Patch 1. Avoid malloc and locking after fork Alright, rs=me.
Created attachment 566217 [details] [diff] [review] Patch 2. Remove the hacks from the previous fix Only remove pre/postfork usage in fixed code path.
Can we get this nominated for being pushed to Aurora and Beta please?
- There are no tests or easy STR for this. It's a race condition that can happen when launching the app. The analysis for the fix here is based on reading the crashing code. - It's been on m-c for 2 weeks. - No performance impact. We want it because it seems to fix a topcrasher. If I read Socorro correctly we haven't seen any after the fix landed. Because of that, I think it's low risk. - No impact to API.
Comment on attachment 565205 [details] [diff] [review] Patch 1. Avoid malloc and locking after fork [Triage Comment] We're denying these patches for Beta because * This is not a recent regression * The patch looks non-trivial * It's too late in the 8 cycle for m-c-only tested code to land
Comment on attachment 565205 [details] [diff] [review] Patch 1. Avoid malloc and locking after fork We'll actually take this on mozilla-aurora (top crasher, importing 3rd party released code) but would like a rollup patch. Please attach one and ask for mozilla-aurora approval. Thanks!
Created attachment 571310 [details] [diff] [review] Patch 3. Folded patch of previous 2 patches
Comment on attachment 571310 [details] [diff] [review] Patch 3. Folded patch of previous 2 patches [triage comment] Approved for aurora. Please land today if at all possible.
Does this crash only affect Mobile?
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #39) > Does this crash only affect Mobile? AFAIK, yes, this is Mobile-only, I guess Android-only.