Fennec crash in base::CloseSuperfluousFds at startup

RESOLVED FIXED in Firefox 9

Status

()

Core
IPC
--
critical
RESOLVED FIXED
7 years ago
4 years ago

People

(Reporter: Scoobidiver (away), Assigned: gcp)

Tracking

({crash, topcrash})

Trunk
mozilla10
ARM
Android
crash, topcrash
Points:
---

Firefox Tracking Flags

(firefox9 fixed, blocking2.0 -, fennec+)

Details

(Whiteboard: [mobile-crash], crash signature)

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

7 years ago
It is #2 top crasher in Fennec 4.0b3 for the last week.

Signature	libc.so@0xb538
UUID	f03fb650-fa24-4202-a089-f0e512110104
Time 	2011-01-04 11:59:57.565009
Uptime	8
Install Age	438247 seconds (5.1 days) since version was first installed.
Product	Fennec
Version	4.0b3
Build ID	20101221205132
Branch	1.9
OS	Linux
OS Version	0.0.0 Linux 2.6.32.9 #1 Tue Nov 9 00:50:41 KST 2010 armv7l
CPU	arm
CPU Info	
Crash Reason	SIGSEGV
Crash Address	0x2b7fdc
App Notes 	samsung GT-P1000
samsung/GT-P1000/GT-P1000/GT-P1000:2.2/FROYO/JPJK2:user/release-keys

Frame 	Module 	Signature [Expand] 	Source
0 	libc.so 	libc.so@0xb538 	
1 	libc.so 	libc.so@0xc94c 	
2 	libxul.so 	std::_Rb_tree<int, int, std::_Identity<int>, std::less<int>, std::allocator<int> >::_M_insert_ 	stl_tree.h:914
3 	libc.so 	libc.so@0xcfe8 	
4 	libc.so 	libc.so@0x13484 	
5 	libxul.so 	base::CloseSuperfluousFds 	ipc/chromium/src/base/process_util_posix.cc:138
6 	libxul.so 	base::LaunchApp 	stl_tree.h:243
7 	libxul.so 	mozilla::ipc::GeckoChildProcessHost::PerformAsyncLaunch 	ipc/glue/GeckoChildProcessHost.cpp:564
8 	libxul.so 	RunnableMethod<mozilla::ipc::GeckoChildProcessHost, bool , Tuple2<std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, base::ProcessArchitecture> >::Run 	ipc/chromium/src/base/tuple.h:400
9 	libxul.so 	MessageLoop::RunTask 	ipc/chromium/src/base/message_loop.cc:344
10 	libxul.so 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/message_loop.cc:354
11 	libxul.so 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:451
12 	libxul.so 	base::MessagePumpLibevent::Run 	ipc/chromium/src/base/message_pump_libevent.cc:310
13 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:220
14 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:512
15 	libxul.so 	base::Thread::ThreadMain 	ipc/chromium/src/base/thread.cc:159
16 	libxul.so 	ThreadFunc 	ipc/chromium/src/base/platform_thread_posix.cc:28
17 	libc.so 	libc.so@0x111b3 	
18 	libc.so 	libc.so@0x10ca3 

More reports at:
http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=libc.so%400xb538
(Reporter)

Updated

7 years ago
tracking-fennec: --- → ?
(Reporter)

Comment 1

7 years ago
With combined signatures, it is #1 top crasher in Fennec 4.0b3 for the last week.
Summary: Fennec crash [@ libc.so@0xb538 ] → Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ]
(Reporter)

Updated

7 years ago
Keywords: topcrash
(In reply to comment #1)
> With combined signatures, it is #1 top crasher in Fennec 4.0b3 for the last
> week.

Can you find it in the nightlies? I couldn't when I looked.
(Reporter)

Comment 3

7 years ago
> Can you find it in the nightlies? I couldn't when I looked.
You can sort out crash reports by version (click on the arrow on the right of version column). See: bp-9fbea702-7245-4de3-9972-ad3c52101221

According to the low crash daily rate in nightlies, it could be fixed in recent nightlies. But I don't think so because there are only 220 ADU (Active Daily User) in nightlies and 22,000 ones in Beta 3 (ratio of 100).
(Reporter)

Comment 4

7 years ago
I added other crash signatures.

More reports at:
http://crash-stats.mozilla.com/query/query?product=Fennec&range_value=4&range_unit=weeks&query_search=signature&query_type=startswith&query=libc.so@0xb&build_id=&process_type=any&hang_type=any&do_query=1
Summary: Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] → Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0xb43c ]
Ahh... my previous search in the nightly crash stats didn't go back far enough. The last crash report from a nightly build with CloseSuperfluousFds in the stack was from Dec 19th. I'm optimistically thinking that this has solved itself.
blocking2.0: --- → ?
tracking-fennec: ? → 2.0+
marking as fixed since we haven't seen a crash since the 12/19 build. Please reopen if you see it again.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
blocking2.0: ? → final+
Resolution: FIXED → WORKSFORME
(Reporter)

Comment 7

6 years ago
I saw one in Fennec 4.0b4pre/20110109: bp-c6c92ae0-ce08-40c5-9f7b-60ca82110109
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Is this currently a topcrasher? cjones, can you think of why this might be crashing? It appears to be calling opendir("/self/proc/fd").
(Reporter)

Comment 9

6 years ago
> Is this currently a topcrasher?
No. There have been only one crash in Fennec 4.0b4pre for the last week.
blocking2.0: final+ → -
tracking-fennec: 2.0+ → ---
I don't know which part of the backtrace to believe.  Crashing at std::_Rb_tree::_M_insert_ suggests OOM to me, and nothing else really.  Otherwise, no ideas.
(Reporter)

Comment 11

6 years ago
4.0b6pre crash:
bp-84f4fa86-3634-4d1c-b8e6-02cb92110302
bp-e4181d9b-031b-4339-a606-152142110305
Summary: Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0xb43c ] → Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0x12c0e ]
(In reply to comment #10)
> I don't know which part of the backtrace to believe.  Crashing at
> std::_Rb_tree::_M_insert_ suggests OOM to me, and nothing else really. 
> Otherwise, no ideas.

Just hit that one - on startup only application running was fennec.
(Reporter)

Comment 13

6 years ago
With combined signatures, it is a virtual #13 top crasher in Firefox for mobile over the last 3 days.
Crash Signature: [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0e ]
is bug 663494 a duplicate of this bug?
Crash Signature: [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0e ] → [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0e ]
(Reporter)

Updated

6 years ago
Summary: Fennec crash [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ][@ libc.so@0xb57c ] [@ libc.so@0x12c0e ] → Fennec crash in base::CloseSuperfluousFds
Crash Signature: [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0e ] → ] [@ libc.so@0x12e2e ] [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb&hellip;
(Reporter)

Comment 15

6 years ago
With the dlmalloc_stats crash signature, it is #2 top crasher in 6.0 and #1 in 7.0 Beta.
tracking-fennec: --- → ?
Crash Signature: [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0&hellip; → [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0&hellip;
tracking-fennec: ? → +
Whiteboard: [mobile-crash]
(Reporter)

Updated

6 years ago
Crash Signature: [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0&hellip; → [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0&hellip;
(Reporter)

Comment 16

6 years ago
In 7.0.1, there are only dlmalloc_stats (Samsung devices), ThreadFunc crash signatures.
Crashes happen at startup.
It's probably a dupe of bug 663494 (same stack traces) about energy saving power down.
Summary: Fennec crash in base::CloseSuperfluousFds → Fennec crash in base::CloseSuperfluousFds at startup
(Assignee)

Comment 17

6 years ago
http://code.google.com/p/chromium/issues/detail?id=36678

http://git.chromium.org/gitweb/?p=chromium.git;a=blob;f=base/process_util_posix.cc#l334

According to this, the problem is the use of dynamic memory allocation in that function. This is line Chris Jones' comment 10.
(Assignee)

Updated

6 years ago
Assignee: nobody → gpascutto
Right-o, good find!
(Assignee)

Comment 19

6 years ago
Hmm, apparently we already knew about that issue, fixed it, and you reviewed it, in bug 680190.
(Assignee)

Comment 20

6 years ago
The crash is still happening in Fennec 9.0, so either it's something else, or the fixes in 680190 aren't enough.

The crash happens in opendir, or thereabouts. That function is both locking and malloc-ing stuff. If I get bug 680190 correctly, the malloc aren't actually a problem. But the locking would be, right? If we are in opendir in a thread at the same time as the LaunchApp is called, we're dead?

http://nv-tegra.nvidia.com/gitweb/?p=android/platform/bionic.git;a=blob;f=libc/unistd/opendir.c
(Assignee)

Comment 21

6 years ago
Created attachment 565205 [details] [diff] [review]
Patch 1. Avoid malloc and locking after fork
Attachment #565205 - Flags: review?(jones.chris.g)
(Assignee)

Comment 22

6 years ago
Created attachment 565206 [details] [diff] [review]
Patch 2. Remove the hacks from the previous fix
Attachment #565206 - Flags: review?(jones.chris.g)
(Assignee)

Comment 23

6 years ago
This updates our IPC code to be more in sync with the current Chromium code, which avoids dynamic memory allocation (and the opendir) between the fork() and execvp() calls. This should also allow us to remove the prefork/postfork jemalloc hacks.

I'm not positive we're really hitting the problem case (and hence, if this will resolve the bug), but unless anyone has a better idea...

Try run:
https://tbpl.mozilla.org/?tree=Try&rev=cb9a9ce90ddc
Hi Gian-Carlo,

(In reply to Gian-Carlo Pascutto (:gcp) from comment #21)
> Created attachment 565205 [details] [diff] [review] [diff] [details] [review]
> Patch 1. Avoid malloc and locking after fork

Is this just merging in upstream changes?  I can rubber-stamp those, but if there's new code included in this patch, I need to review that.
Comment on attachment 565206 [details] [diff] [review]
Patch 2. Remove the hacks from the previous fix

>diff --git a/ipc/chromium/src/base/process_util_linux.cc b/ipc/chromium/src/base/process_util_linux.cc
>--- a/ipc/chromium/src/base/process_util_linux.cc
>+++ b/ipc/chromium/src/base/process_util_linux.cc

It's OK to remove the hacks from here since the helper function always
fork()+exec()s, but we need to keep the general workaround code in
jemalloc for situations in which we just fork() in a jemalloc-using
process.
Attachment #565206 - Flags: review?(jones.chris.g)
(Assignee)

Comment 26

6 years ago
>Is this just merging in upstream changes?  I can rubber-stamp those, but if 
>there's new code included in this patch, I need to review that.

It's merging in upstream changes, but it's not a direct update. I made the patch that is in the chrome bug mentioned earlier on our codebase, and edited it wherever appropriate to avoid bringing in more updates (notably, it was using a LOG define that we don't seem to have yet). If I messed up, it's more likely to be in process_util_* since the other files were mostly direct copypasting.
Comment on attachment 565205 [details] [diff] [review]
Patch 1. Avoid malloc and locking after fork

Alright, rs=me.
Attachment #565205 - Flags: review?(jones.chris.g) → review+
(Assignee)

Comment 28

6 years ago
Created attachment 566217 [details] [diff] [review]
Patch 2. Remove the hacks from the previous fix

Only remove pre/postfork usage in fixed code path.
Attachment #565206 - Attachment is obsolete: true
Attachment #566217 - Flags: review?(jones.chris.g)
Attachment #566217 - Flags: review?(jones.chris.g) → review+
passed try:
https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=d29c3097da42

landed on m-i:
https://hg.mozilla.org/integration/mozilla-inbound/rev/de0be83bf880
https://hg.mozilla.org/integration/mozilla-inbound/rev/fed6a1e02b7d
https://hg.mozilla.org/mozilla-central/rev/fed6a1e02b7d
https://hg.mozilla.org/mozilla-central/rev/de0be83bf880
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
(Assignee)

Updated

6 years ago
Duplicate of this bug: 663494
Can we get this nominated for being pushed to Aurora and Beta please?
(Assignee)

Updated

6 years ago
Attachment #565205 - Flags: approval-mozilla-beta?
Attachment #565205 - Flags: approval-mozilla-aurora?
(Assignee)

Updated

6 years ago
Attachment #566217 - Flags: approval-mozilla-beta?
Attachment #566217 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 33

6 years ago
- There are no tests or easy STR for this. It's a race condition that can happen when launching the app. The analysis for the fix here is based on reading the crashing code.
- It's been on m-c for 2 weeks.
- No performance impact. We want it because it seems to fix a topcrasher. If I read Socorro correctly we haven't seen any after the fix landed. Because of that, I think it's low risk.
- No impact to API.
Comment on attachment 565205 [details] [diff] [review]
Patch 1. Avoid malloc and locking after fork

[Triage Comment]
We're denying these patches for Beta because
* This is not a recent regression
* The patch looks non-trivial
* It's too late in the 8 cycle for m-c-only tested code to land
Attachment #565205 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

Updated

6 years ago
Attachment #566217 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

Updated

6 years ago
Attachment #566217 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-

Comment 35

6 years ago
Comment on attachment 565205 [details] [diff] [review]
Patch 1. Avoid malloc and locking after fork

We'll actually take this on mozilla-aurora (top crasher, importing 3rd party released code) but would like a rollup patch. Please attach one and ask for mozilla-aurora approval. Thanks!
Attachment #565205 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
(Assignee)

Comment 36

6 years ago
Created attachment 571310 [details] [diff] [review]
Patch 3. Folded patch of previous 2 patches
Attachment #571310 - Flags: approval-mozilla-aurora?

Comment 37

6 years ago
Comment on attachment 571310 [details] [diff] [review]
Patch 3. Folded patch of previous 2 patches

[triage comment]

Approved for aurora. Please land today if at all possible.
Attachment #571310 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Comment 38

6 years ago
http://hg.mozilla.org/releases/mozilla-aurora/rev/4c79fc728cc3
status-firefox9: --- → fixed
(Reporter)

Updated

6 years ago
Crash Signature: [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0&hellip; → [@ libc.so@0xb538 ] [@ libc.so@0xb45c ] [@ libc.so@0xb40c ] [@ libc.so@0xb47c ] [@ libc.so@0xb548 ] [@ libc.so@0xb64c ] [@ libc.so@0xb52c ] [@ libc.so@0xb4ce ] [@ libc.so@0xb42c ] [@ libc.so@0xb66c ] [@ libc.so@0xb57c ] [@ libc.so@0x12c0&hellip;
Does this crash only affect Mobile?

Comment 40

6 years ago
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #39)
> Does this crash only affect Mobile?

AFAIK, yes, this is Mobile-only, I guess Android-only.
Duplicate of this bug: 683799
You need to log in before you can comment on or make changes to this bug.