Closed
Bug 622992
Opened 14 years ago
Closed 13 years ago
Fennec crash in base::CloseSuperfluousFds at startup
Categories
(Core :: IPC, defect)
Tracking
()
RESOLVED
FIXED
mozilla10
People
(Reporter: scoobidiver, Assigned: gcp)
References
Details
(Keywords: crash, topcrash, Whiteboard: [mobile-crash])
Crash Data
Attachments
(3 files, 1 obsolete file)
19.90 KB,
patch
|
cjones
:
review+
christian
:
approval-mozilla-aurora-
akeybl
:
approval-mozilla-beta-
|
Details | Diff | Splinter Review |
1.56 KB,
patch
|
cjones
:
review+
christian
:
approval-mozilla-aurora-
akeybl
:
approval-mozilla-beta-
|
Details | Diff | Splinter Review |
19.93 KB,
patch
|
christian
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
It is #2 top crasher in Fennec 4.0b3 for the last week. Signature libc.so@0xb538 UUID f03fb650-fa24-4202-a089-f0e512110104 Time 2011-01-04 11:59:57.565009 Uptime 8 Install Age 438247 seconds (5.1 days) since version was first installed. Product Fennec Version 4.0b3 Build ID 20101221205132 Branch 1.9 OS Linux OS Version 0.0.0 Linux 2.6.32.9 #1 Tue Nov 9 00:50:41 KST 2010 armv7l CPU arm CPU Info Crash Reason SIGSEGV Crash Address 0x2b7fdc App Notes samsung GT-P1000 samsung/GT-P1000/GT-P1000/GT-P1000:2.2/FROYO/JPJK2:user/release-keys Frame Module Signature [Expand] Source 0 libc.so libc.so@0xb538 1 libc.so libc.so@0xc94c 2 libxul.so std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_ stl_tree.h:914 3 libc.so libc.so@0xcfe8 4 libc.so libc.so@0x13484 5 libxul.so base::CloseSuperfluousFds ipc/chromium/src/base/process_util_posix.cc:138 6 libxul.so base::LaunchApp stl_tree.h:243 7 libxul.so mozilla::ipc::GeckoChildProcessHost::PerformAsyncLaunch ipc/glue/GeckoChildProcessHost.cpp:564 8 libxul.so RunnableMethod<mozilla::ipc::GeckoChildProcessHost, bool , Tuple2<std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, base::ProcessArchitecture> >::Run ipc/chromium/src/base/tuple.h:400 9 libxul.so MessageLoop::RunTask ipc/chromium/src/base/message_loop.cc:344 10 libxul.so MessageLoop::DeferOrRunPendingTask ipc/chromium/src/base/message_loop.cc:354 11 libxul.so MessageLoop::DoWork ipc/chromium/src/base/message_loop.cc:451 12 libxul.so base::MessagePumpLibevent::Run ipc/chromium/src/base/message_pump_libevent.cc:310 13 libxul.so MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:220 14 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:512 15 libxul.so base::Thread::ThreadMain ipc/chromium/src/base/thread.cc:159 16 libxul.so ThreadFunc ipc/chromium/src/base/platform_thread_posix.cc:28 17 libc.so libc.so@0x111b3 18 libc.so libc.so@0x10ca3 More reports at: http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=libc.so%400xb538
Reporter | ||
Updated•14 years ago
|
tracking-fennec: --- → ?
Reporter | ||
Comment 1•14 years ago
|
||
With combined signatures, it is #1 top crasher in Fennec 4.0b3 for the last week.
Summary: Fennec crash [@ libc.so@0xb538 ] → Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ]
Comment 2•14 years ago
|
||
(In reply to comment #1) > With combined signatures, it is #1 top crasher in Fennec 4.0b3 for the last > week. Can you find it in the nightlies? I couldn't when I looked.
Reporter | ||
Comment 3•14 years ago
|
||
> Can you find it in the nightlies? I couldn't when I looked. You can sort out crash reports by version (click on the arrow on the right of version column). See: bp-9fbea702-7245-4de3-9972-ad3c52101221 According to the low crash daily rate in nightlies, it could be fixed in recent nightlies. But I don't think so because there are only 220 ADU (Active Daily User) in nightlies and 22,000 ones in Beta 3 (ratio of 100).
Reporter | ||
Comment 4•14 years ago
|
||
I added other crash signatures. More reports at: http://crash-stats.mozilla.com/query/query?product=Fennec&range_value=4&range_unit=weeks&query_search=signature&query_type=startswith&query=libc.so@0xb&build_id=&process_type=any&hang_type=any&do_query=1
Summary: Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] → Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0xb43c ]
Comment 5•14 years ago
|
||
Ahh... my previous search in the nightly crash stats didn't go back far enough. The last crash report from a nightly build with CloseSuperfluousFds in the stack was from Dec 19th. I'm optimistically thinking that this has solved itself.
Updated•14 years ago
|
blocking2.0: --- → ?
tracking-fennec: ? → 2.0+
Comment 6•14 years ago
|
||
marking as fixed since we haven't seen a crash since the 12/19 build. Please reopen if you see it again.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
|
blocking2.0: ? → final+
Resolution: FIXED → WORKSFORME
Reporter | ||
Comment 7•14 years ago
|
||
I saw one in Fennec 4.0b4pre/20110109: bp-c6c92ae0-ce08-40c5-9f7b-60ca82110109
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment 8•14 years ago
|
||
Is this currently a topcrasher? cjones, can you think of why this might be crashing? It appears to be calling opendir("/self/proc/fd").
Reporter | ||
Comment 9•14 years ago
|
||
> Is this currently a topcrasher?
No. There have been only one crash in Fennec 4.0b4pre for the last week.
Updated•14 years ago
|
blocking2.0: final+ → -
tracking-fennec: 2.0+ → ---
I don't know which part of the backtrace to believe. Crashing at std::_Rb_tree::_M_insert_ suggests OOM to me, and nothing else really. Otherwise, no ideas.
Reporter | ||
Comment 11•13 years ago
|
||
4.0b6pre crash: bp-84f4fa86-3634-4d1c-b8e6-02cb92110302 bp-e4181d9b-031b-4339-a606-152142110305
Summary: Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0xb43c ] → Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0x12c0e ]
Comment 12•13 years ago
|
||
(In reply to comment #10) > I don't know which part of the backtrace to believe. Crashing at > std::_Rb_tree::_M_insert_ suggests OOM to me, and nothing else really. > Otherwise, no ideas. Just hit that one - on startup only application running was fennec.
Reporter | ||
Comment 13•13 years ago
|
||
With combined signatures, it is a virtual #13 top crasher in Firefox for mobile over the last 3 days.
Updated•13 years ago
|
Crash Signature: [@ libc.so@0xb538 ]
[@ libc.so@0xb45c ]
[@ libc.so@0xb40c ]
[@ libc.so@0xb47c ]
[@ libc.so@0xb548 ]
[@ libc.so@0xb64c ]
[@ libc.so@0xb52c ]
[@ libc.so@0xb4ce ]
[@ libc.so@0xb42c ]
[@ libc.so@0xb66c ]
[@ libc.so@0xb57c ]
[@ libc.so@0x12c0e ]
is bug 663494 a duplicate of this bug?
Crash Signature: [@ libc.so@0xb538 ]
[@ libc.so@0xb45c ]
[@ libc.so@0xb40c ]
[@ libc.so@0xb47c ]
[@ libc.so@0xb548 ]
[@ libc.so@0xb64c ]
[@ libc.so@0xb52c ]
[@ libc.so@0xb4ce ]
[@ libc.so@0xb42c ]
[@ libc.so@0xb66c ]
[@ libc.so@0xb57c ]
[@ libc.so@0x12c0e ] → [@ libc.so@0xb538 ]
[@ libc.so@0xb45c ]
[@ libc.so@0xb40c ]
[@ libc.so@0xb47c ]
[@ libc.so@0xb548 ]
[@ libc.so@0xb64c ]
[@ libc.so@0xb52c ]
[@ libc.so@0xb4ce ]
[@ libc.so@0xb42c ]
[@ libc.so@0xb66c ]
[@ libc.so@0xb57c ]
[@ libc.so@0x12c0e ]
Reporter | ||
Updated•13 years ago
|
Summary: Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0x12c0e ] → Fennec crash in base::CloseSuperfluousFds
Updated•13 years ago
|
Crash Signature: [@ libc.so@0xb538 ]
[@ libc.so@0xb45c ]
[@ libc.so@0xb40c ]
[@ libc.so@0xb47c ]
[@ libc.so@0xb548 ]
[@ libc.so@0xb64c ]
[@ libc.so@0xb52c ]
[@ libc.so@0xb4ce ]
[@ libc.so@0xb42c ]
[@ libc.so@0xb66c ]
[@ libc.so@0xb57c ]
[@ libc.so@0x12c0e ] → ]
[@ libc.so@0x12e2e ] [@ libc.so@0xb538 ]
[@ libc.so@0xb45c ]
[@ libc.so@0xb40c ]
[@ libc.so@0xb47c ]
[@ libc.so@0xb548 ]
[@ libc.so@0xb64c ]
[@ libc.so@0xb52c ]
[@ libc.so@0xb4ce ]
[@ libc.so@0xb42c ]
[@ libc.so@0xb66c ]
[@ libc.so@0xb57c ]
…
Reporter | ||
Comment 15•13 years ago
|
||
With the dlmalloc_stats crash signature, it is #2 top crasher in 6.0 and #1 in 7.0 Beta.
tracking-fennec: --- → ?
Crash Signature: ]
[@ libc.so@0x12e2e ] → ]
[@ libc.so@0x12e2e ]
[@ dlmalloc_stats ]
Updated•13 years ago
|
tracking-fennec: ? → +
Updated•13 years ago
|
Whiteboard: [mobile-crash]
Reporter | ||
Updated•13 years ago
|
Crash Signature: ]
[@ libc.so@0x12e2e ]
[@ dlmalloc_stats ] → ]
[@ libc.so@0x12e2e ]
[@ dlmalloc_stats ]
[@ ThreadFunc ]
Reporter | ||
Comment 16•13 years ago
|
||
In 7.0.1, there are only dlmalloc_stats (Samsung devices), ThreadFunc crash signatures. Crashes happen at startup. It's probably a dupe of bug 663494 (same stack traces) about energy saving power down.
Summary: Fennec crash in base::CloseSuperfluousFds → Fennec crash in base::CloseSuperfluousFds at startup
Assignee | ||
Comment 17•13 years ago
|
||
http://code.google.com/p/chromium/issues/detail?id=36678 http://git.chromium.org/gitweb/?p=chromium.git;a=blob;f=base/process_util_posix.cc#l334 According to this, the problem is the use of dynamic memory allocation in that function. This is line Chris Jones' comment 10.
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → gpascutto
Right-o, good find!
Assignee | ||
Comment 19•13 years ago
|
||
Hmm, apparently we already knew about that issue, fixed it, and you reviewed it, in bug 680190.
Assignee | ||
Comment 20•13 years ago
|
||
The crash is still happening in Fennec 9.0, so either it's something else, or the fixes in 680190 aren't enough. The crash happens in opendir, or thereabouts. That function is both locking and malloc-ing stuff. If I get bug 680190 correctly, the malloc aren't actually a problem. But the locking would be, right? If we are in opendir in a thread at the same time as the LaunchApp is called, we're dead? http://nv-tegra.nvidia.com/gitweb/?p=android/platform/bionic.git;a=blob;f=libc/unistd/opendir.c
Assignee | ||
Comment 21•13 years ago
|
||
Attachment #565205 -
Flags: review?(jones.chris.g)
Assignee | ||
Comment 22•13 years ago
|
||
Attachment #565206 -
Flags: review?(jones.chris.g)
Assignee | ||
Comment 23•13 years ago
|
||
This updates our IPC code to be more in sync with the current Chromium code, which avoids dynamic memory allocation (and the opendir) between the fork() and execvp() calls. This should also allow us to remove the prefork/postfork jemalloc hacks. I'm not positive we're really hitting the problem case (and hence, if this will resolve the bug), but unless anyone has a better idea... Try run: https://tbpl.mozilla.org/?tree=Try&rev=cb9a9ce90ddc
Hi Gian-Carlo, (In reply to Gian-Carlo Pascutto (:gcp) from comment #21) > Created attachment 565205 [details] [diff] [review] [diff] [details] [review] > Patch 1. Avoid malloc and locking after fork Is this just merging in upstream changes? I can rubber-stamp those, but if there's new code included in this patch, I need to review that.
Comment on attachment 565206 [details] [diff] [review] Patch 2. Remove the hacks from the previous fix >diff --git a/ipc/chromium/src/base/process_util_linux.cc b/ipc/chromium/src/base/process_util_linux.cc >--- a/ipc/chromium/src/base/process_util_linux.cc >+++ b/ipc/chromium/src/base/process_util_linux.cc It's OK to remove the hacks from here since the helper function always fork()+exec()s, but we need to keep the general workaround code in jemalloc for situations in which we just fork() in a jemalloc-using process.
Attachment #565206 -
Flags: review?(jones.chris.g)
Assignee | ||
Comment 26•13 years ago
|
||
>Is this just merging in upstream changes? I can rubber-stamp those, but if
>there's new code included in this patch, I need to review that.
It's merging in upstream changes, but it's not a direct update. I made the patch that is in the chrome bug mentioned earlier on our codebase, and edited it wherever appropriate to avoid bringing in more updates (notably, it was using a LOG define that we don't seem to have yet). If I messed up, it's more likely to be in process_util_* since the other files were mostly direct copypasting.
Comment on attachment 565205 [details] [diff] [review] Patch 1. Avoid malloc and locking after fork Alright, rs=me.
Attachment #565205 -
Flags: review?(jones.chris.g) → review+
Assignee | ||
Comment 28•13 years ago
|
||
Only remove pre/postfork usage in fixed code path.
Attachment #565206 -
Attachment is obsolete: true
Attachment #566217 -
Flags: review?(jones.chris.g)
Updated•13 years ago
|
Attachment #566217 -
Flags: review?(jones.chris.g) → review+
Comment 29•13 years ago
|
||
passed try: https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=d29c3097da42 landed on m-i: https://hg.mozilla.org/integration/mozilla-inbound/rev/de0be83bf880 https://hg.mozilla.org/integration/mozilla-inbound/rev/fed6a1e02b7d
Comment 30•13 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/fed6a1e02b7d https://hg.mozilla.org/mozilla-central/rev/de0be83bf880
Status: REOPENED → RESOLVED
Closed: 14 years ago → 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
Can we get this nominated for being pushed to Aurora and Beta please?
Assignee | ||
Updated•13 years ago
|
Attachment #565205 -
Flags: approval-mozilla-beta?
Attachment #565205 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•13 years ago
|
Attachment #566217 -
Flags: approval-mozilla-beta?
Attachment #566217 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 33•13 years ago
|
||
- There are no tests or easy STR for this. It's a race condition that can happen when launching the app. The analysis for the fix here is based on reading the crashing code. - It's been on m-c for 2 weeks. - No performance impact. We want it because it seems to fix a topcrasher. If I read Socorro correctly we haven't seen any after the fix landed. Because of that, I think it's low risk. - No impact to API.
Comment 34•13 years ago
|
||
Comment on attachment 565205 [details] [diff] [review] Patch 1. Avoid malloc and locking after fork [Triage Comment] We're denying these patches for Beta because * This is not a recent regression * The patch looks non-trivial * It's too late in the 8 cycle for m-c-only tested code to land
Attachment #565205 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Updated•13 years ago
|
Attachment #566217 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #566217 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Comment 35•13 years ago
|
||
Comment on attachment 565205 [details] [diff] [review] Patch 1. Avoid malloc and locking after fork We'll actually take this on mozilla-aurora (top crasher, importing 3rd party released code) but would like a rollup patch. Please attach one and ask for mozilla-aurora approval. Thanks!
Attachment #565205 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Assignee | ||
Comment 36•13 years ago
|
||
Attachment #571310 -
Flags: approval-mozilla-aurora?
Comment 37•13 years ago
|
||
Comment on attachment 571310 [details] [diff] [review] Patch 3. Folded patch of previous 2 patches [triage comment] Approved for aurora. Please land today if at all possible.
Attachment #571310 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 38•13 years ago
|
||
http://hg.mozilla.org/releases/mozilla-aurora/rev/4c79fc728cc3
status-firefox9:
--- → fixed
Reporter | ||
Updated•13 years ago
|
Crash Signature: ]
[@ libc.so@0x12e2e ]
[@ dlmalloc_stats ]
[@ ThreadFunc ] → ]
[@ libc.so@0x12e2e ]
[@ dlmalloc_stats ]
[@ ThreadFunc ]
[@ std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_ ]
Comment 39•13 years ago
|
||
Does this crash only affect Mobile?
Comment 40•13 years ago
|
||
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #39) > Does this crash only affect Mobile? AFAIK, yes, this is Mobile-only, I guess Android-only.
You need to log in
before you can comment on or make changes to this bug.
Description
•