intermittent TEST-UNEXPECTED-FAIL | testANRReporter | ANR reporter wrote one ping - got 0, expected 1

RESOLVED FIXED in Firefox 34

Status

defect
RESOLVED FIXED
5 years ago

People

(Reporter: dbaron, Assigned: jchen)

Tracking

({intermittent-failure})

unspecified
Firefox 36
ARM
Android
Points:
---

Firefox Tracking Flags

(firefox33 wontfix, firefox34 fixed, firefox35 fixed, firefox36 fixed, firefox-esr31 unaffected)

Details

Attachments

(1 attachment)

New intermittent failure:

https://tbpl.mozilla.org/php/getParsedLog.php?id=48024327&tree=Mozilla-Inbound
Android 4.0 Panda mozilla-inbound opt test robocop-1 on 2014-09-13 04:29:03 PDT for push b4735c318a46
slave: panda-0367

TEST-UNEXPECTED-FAIL | testANRReporter | ANR reporter wrote one ping - got 0, expected 1
0 ERROR Exception caught during test! - junit.framework.AssertionFailedError: TEST-UNEXPECTED-FAIL | testANRReporter | ANR reporter wrote one ping - got 0, expected 1
TEST-UNEXPECTED-FAIL | testANRReporter | Exception caught - junit.framework.AssertionFailedError: TEST-UNEXPECTED-FAIL | testANRReporter | ANR reporter wrote one ping - got 0, expected 1
Return code: 1
Sigh... On pre-JB Android, there's a bug in Bionic's fork implementation that can cause a hang when launching processes, so this patch makes us avoid launching processes if we can.
Attachment #8509738 - Flags: review?(snorp)
Comment on attachment 8509738 [details] [diff] [review]
Try to not launch processes on pre-JB devices because of Android bug (v1)

Review of attachment 8509738 [details] [diff] [review]:
-----------------------------------------------------------------

What kind of fork bug? It's just a system call, so seems like it's kind of hard to screw up.
Attachment #8509738 - Flags: review?(snorp) → review+
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #6)
> Comment on attachment 8509738 [details] [diff] [review]
> Try to not launch processes on pre-JB devices because of Android bug (v1)
> 
> Review of attachment 8509738 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> What kind of fork bug? It's just a system call, so seems like it's kind of
> hard to screw up.

See http://stackoverflow.com/a/9273186. Bionic does some additional work right after a fork() call, and that work can allocate memory. Allocating memory in the child after fork() is as unsafe as allocating inside a signal handler: if another thread held the allocator lock at the moment of the fork, the child can hang.
Assignee: nobody → nchen
Status: NEW → ASSIGNED
Comment on attachment 8509738 [details] [diff] [review]
Try to not launch processes on pre-JB devices because of Android bug (v1)

Actually gonna retract my r+ here. We have our own fork() in BionicGlue.cpp, so maybe it's fixable there? What's the nature of the problem?
Attachment #8509738 - Flags: review+ → review-
Ah, I see the comment now. We don't do any allocation (AFAICT) in our own fork(), so this must be some other issue.
Maybe one of the atfork() handlers is mallocing?
Our native fork() calls are fine AFAIK, but Android's fork() calls (e.g. through the Java ProcessBuilder class or Runtime.exec method) go directly to Bionic, so these Java calls are susceptible to the bug.
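For illustration only, the idea of avoiding Java-side process launches on pre-JB devices could be sketched as below. This is not the attached patch. The class and method names (ProcessLaunchGuard, canLaunchProcesses, readCommandOutput) are hypothetical, and the SDK level is passed in as a plain int so the sketch runs off-device. The only assumption taken from Android itself is that Jelly Bean (4.1) is API level 16.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not code from the patch in this bug.
final class ProcessLaunchGuard {
    // API level of Android 4.1 (Jelly Bean), the same value as
    // android.os.Build.VERSION_CODES.JELLY_BEAN.
    static final int JELLY_BEAN_API_LEVEL = 16;

    private ProcessLaunchGuard() {}

    // Treat Java-side process launches as safe only on Jelly Bean or later,
    // because ProcessBuilder and Runtime.exec fork through Bionic directly.
    static boolean canLaunchProcesses(int sdkInt) {
        return sdkInt >= JELLY_BEAN_API_LEVEL;
    }

    // Runs the command and returns its output lines, or an empty list
    // without launching anything when the device is pre-JB.
    static List<String> readCommandOutput(int sdkInt, String... command)
            throws IOException {
        if (!canLaunchProcesses(sdkInt)) {
            return Collections.emptyList();
        }
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            process.destroy();
        }
        return lines;
    }
}
```

On a device, a caller would presumably pass android.os.Build.VERSION.SDK_INT and a command such as "logcat", "-d". The trade-off is the one discussed in this bug: pre-JB devices lose the process output entirely in exchange for never hitting the hang.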
(In reply to Jim Chen [:jchen] from comment #11)
> Our native fork() calls are fine AFAIK, but Android's fork() calls (e.g.
> through the Java ProcessBuilder class or Runtime.exec method) go directly to
> Bionic, so these Java calls are susceptible to the bug.

Ah, that's true. Ugh. Don't use Java to read the logcat then?
I really don't want to lose logcat for everything pre-JB. That's a lot of devices.
Comment on attachment 8509738 [details] [diff] [review]
Try to not launch processes on pre-JB devices because of Android bug (v1)

Review of attachment 8509738 [details] [diff] [review]:
-----------------------------------------------------------------

Apparently we already don't collect logcat for pre-JB devices, so I guess this is alright.
Attachment #8509738 - Flags: review- → review+
https://hg.mozilla.org/mozilla-central/rev/63a25da1a5a0
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 36
James, I see that Jim is on PTO for a couple weeks. Can you please request Aurora approval on this (and Beta if also affected - it isn't clear to me if it is or not)? Thanks!
Flags: needinfo?(snorp)
Comment on attachment 8509738 [details] [diff] [review]
Try to not launch processes on pre-JB devices because of Android bug (v1)

Approval Request Comment
[Feature/regressing bug #]: 826053
[User impact if declined]: Occasional deadlocks when submitting ANR reports
[Describe test coverage new/current, TBPL]: caught by robocop
[Risks and why]: Low risk, only causes us to not submit logcat data on pre-JB devices
[String/UUID change made/needed]: None
Flags: needinfo?(snorp)
Attachment #8509738 - Flags: approval-mozilla-beta?
Attachment #8509738 - Flags: approval-mozilla-aurora?
Comment on attachment 8509738 [details] [diff] [review]
Try to not launch processes on pre-JB devices because of Android bug (v1)

Beta+
Aurora+
Attachment #8509738 - Flags: approval-mozilla-beta?
Attachment #8509738 - Flags: approval-mozilla-beta+
Attachment #8509738 - Flags: approval-mozilla-aurora?
Attachment #8509738 - Flags: approval-mozilla-aurora+