Frequent ASan browser_whitelist6.js | application terminated with exit code 9, with or without | application timed out after 330 seconds with no output

RESOLVED FIXED in Firefox 28

Status


Toolkit
Add-ons Manager
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: philor, Assigned: decoder)

Tracking

({intermittent-failure})

Trunk
mozilla28
x86_64
Linux
intermittent-failure
Points:
---

Firefox Tracking Flags

(firefox26 unaffected, firefox27 unaffected, firefox28 fixed, firefox-esr24 unaffected, b2g-v1.2 unaffected)

Details

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
This started at some point during the period leading into the multiple-day OOM tree closure. I had hidden ASan browser-chrome because it was constantly failing, and at some unknown time this failure joined the constant failures.

Without a timeout:

https://tbpl.mozilla.org/php/getParsedLog.php?id=30088295&tree=Mozilla-Inbound
Ubuntu ASAN VM 12.04 x64 mozilla-inbound opt test mochitest-browser-chrome on 2013-11-04 09:03:21 PST for push 07ea7b11adef
slave: tst-linux64-ec2-144

10:37:04     INFO -  INFO TEST-END | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/xpinstall/browser_whitelist6.js | finished in 841ms
10:37:04     INFO -  TEST-INFO | checking window state
10:41:34     INFO -  [Parent 2386] WARNING: waitpid failed pid:2435 errno:10: file /builds/slave/m-in-l64-asan-0000000000000000/build/ipc/chromium/src/base/process_util_posix.cc, line 254
10:41:36     INFO -  [Parent 2386] WARNING: waitpid failed pid:2435 errno:10: file /builds/slave/m-in-l64-asan-0000000000000000/build/ipc/chromium/src/base/process_util_posix.cc, line 254
10:41:36     INFO -  [Parent 2386] WARNING: Failed to deliver SIGKILL to 2435!(3).: file /builds/slave/m-in-l64-asan-0000000000000000/build/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc, line 118
10:42:03     INFO -  [Parent 2386] WARNING: waitpid failed pid:2491 errno:10: file /builds/slave/m-in-l64-asan-0000000000000000/build/ipc/chromium/src/base/process_util_posix.cc, line 254
10:42:05     INFO -  [Parent 2386] WARNING: waitpid failed pid:2491 errno:10: file /builds/slave/m-in-l64-asan-0000000000000000/build/ipc/chromium/src/base/process_util_posix.cc, line 254
10:42:05     INFO -  [Parent 2386] WARNING: Failed to deliver SIGKILL to 2491!(3).: file /builds/slave/m-in-l64-asan-0000000000000000/build/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc, line 118
10:42:13  WARNING -  TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/xpinstall/browser_whitelist6.js | application terminated with exit code 9
10:42:13     INFO -  INFO | runtests.py | Application ran for: 1:28:02.905489

With a timeout:

https://tbpl.mozilla.org/php/getParsedLog.php?id=30091852&tree=Mozilla-Inbound
Ubuntu ASAN VM 12.04 x64 mozilla-inbound opt test mochitest-browser-chrome on 2013-11-04 10:54:15 PST for push 7384e5f3eb7a
slave: tst-linux64-ec2-439

12:08:58     INFO -  INFO TEST-END | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/xpinstall/browser_whitelist6.js | finished in 880ms
12:08:58     INFO -  TEST-INFO | checking window state
12:09:01     INFO -  TEST-INFO | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/xpinstall/browser_whitelist6.js | Console message: [JavaScript Error: "A promise chain failed to handle a rejection.
12:09:01     INFO -  Date: Mon Nov 04 2013 12:08:54 GMT-0800 (PST)
12:09:01     INFO -  Full Message: Unix error 17 during operation makeDir (File exists)"]
12:14:32  WARNING -  TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/xpinstall/browser_whitelist6.js | application timed out after 330 seconds with no output
12:14:40     INFO -  Can't trigger Breakpad, just killing process
12:14:40     INFO -  Failed to kill process 2269: [Errno 3] No such process
12:14:40  WARNING -  TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/xpinstall/browser_whitelist6.js | application terminated with exit code 9
12:14:40     INFO -  INFO | runtests.py | Application ran for: 1:15:57.791279

Although this has gotten a little better since its low point, I expect it to be bad enough that we'll want to disable this test on ASan.
(Reporter)

Comment 3

5 years ago
I must be new here, trying to figure out why this bug that I didn't put the intermittent-failure keyword on doesn't get suggested, and then wondering what happened to all the (mis)starring that I did earlier, calling instances of this bug 926674 instead.
Keywords: intermittent-failure
Dave, would you mind taking a look at why this test suddenly started failing extremely frequently on ASan runs? :decoder will be able to assist with any ASan-specific questions.
Flags: needinfo?(dtownsend+bugmail)
My best guess is that there isn't anything wrong with this test. The logs show that it passes successfully each time, and it is only after that that things go wrong. I think the problem is that this is the very last test in the run and it is getting tagged by a failure to shut down. A simple way to test that would be to disable the test and see whether browser_whitelist5 suddenly starts showing up as an intermittent failure.
Flags: needinfo?(dtownsend+bugmail)
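The reasoning in the comment above (the test logs TEST-END before the failure is reported against it) can be checked mechanically. This is a minimal sketch, assuming log lines shaped like the excerpts in the description; the function name failed_after_test_end is hypothetical and not part of any Mozilla tool.

```python
def failed_after_test_end(log_lines, test_name):
    """Return True if the first TEST-UNEXPECTED-FAIL for test_name
    appears after a TEST-END line for the same test.

    Hypothetical helper for illustration; it only does substring
    matching on lines shaped like the log excerpts in this bug.
    """
    seen_end = False
    for line in log_lines:
        if test_name not in line:
            continue  # ignore lines about other tests or the harness
        if "TEST-END" in line:
            seen_end = True
        elif "TEST-UNEXPECTED-FAIL" in line:
            return seen_end
    return False  # no failure was reported for this test


# Sample lines abbreviated from the first log in the description.
sample = [
    "10:37:04     INFO -  INFO TEST-END | browser_whitelist6.js | finished in 841ms",
    "10:37:04     INFO -  TEST-INFO | checking window state",
    "10:42:13  WARNING -  TEST-UNEXPECTED-FAIL | browser_whitelist6.js | application terminated with exit code 9",
]
result = failed_after_test_end(sample, "browser_whitelist6.js")
```

If this returns True for most failing runs, the failure is more likely being blamed on whichever test happens to run last than caused by the test itself.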
(Reporter)

Comment 56

5 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=30492285&tree=Mozilla-Inbound, which got into calling the current test "Shutdown" and then silently signal 9'ed, tends to agree that it's just near-or-at-shutdown.
(Assignee)

Comment 93

5 years ago
Here's a try run that forces the low-memory configuration onto the mid-memory slaves:

https://tbpl.mozilla.org/?tree=Try&rev=7e6d5f51c919

All green with 10 retriggers. I conclude that it's an OOM issue and we should not apply the mid-memory configuration to the current slaves.

How much memory do the test slaves have? 3 or 4 GB?
Flags: needinfo?(catlee)
(Assignee)

Comment 113

5 years ago
Created attachment 8333547 [details] [diff] [review]
try-asan-oom

This patch essentially removes the mid-memory configuration and forces the slaves to run on the low-memory config instead, even if they have 4 GB of RAM. If they have less, then we can lower the bar in the patch further, but for now I assumed 4 GB. This is green on all tests and also kills the intermittent.
Assignee: nobody → choller
Status: NEW → ASSIGNED
Attachment #8333547 - Flags: review?(ted)
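For readers who have not opened the attachment, the idea of choosing the ASan configuration from the machine's RAM can be sketched as follows. This is not the attached patch: the 4 GB threshold comes from the comment above, but the function names, the constant, and the specific ASAN_OPTIONS values are assumptions made for illustration (quarantine_size and malloc_context_size are ASan runtime flags, but the values here are placeholders).

```python
# Hypothetical threshold: machines with at most 4 GB get the low-memory config.
LOW_MEMORY_THRESHOLD_KB = 4 * 1024 * 1024

# Placeholder option string, not the values used by the real harness.
LOW_MEMORY_ASAN_OPTIONS = "quarantine_size=50331648:malloc_context_size=5"


def total_memory_kb(meminfo_path="/proc/meminfo"):
    """Return MemTotal in kB from a Linux meminfo file, or None if unavailable."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1])
    except (OSError, ValueError, IndexError):
        return None
    return None


def asan_options_for(mem_kb):
    """Pick an ASAN_OPTIONS string for a machine with mem_kb of RAM.

    There is deliberately no mid-memory tier: anything at or below the
    threshold uses the low-memory options, everything else (or an
    unknown size) uses ASan's defaults.
    """
    if mem_kb is not None and mem_kb <= LOW_MEMORY_THRESHOLD_KB:
        return LOW_MEMORY_ASAN_OPTIONS
    return ""


# A machine with 3.75 GB of RAM (3.75 * 1024 * 1024 kB) falls under the threshold.
options = asan_options_for(3932160)
```

A caller would export the result as the ASAN_OPTIONS environment variable before launching the browser, for example env["ASAN_OPTIONS"] = asan_options_for(total_memory_kb()).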
(Reporter)

Comment 144

5 years ago
Comment on attachment 8333547 [details] [diff] [review]
try-asan-oom

The fun thing about being nearly completely broken is that you can just push any sort of changes with any sort of review, since you can hardly make things worse.
Attachment #8333547 - Flags: review?(ted) → review+
https://hg.mozilla.org/mozilla-central/rev/95813fcf6a62
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
status-b2g-v1.2: --- → unaffected
status-firefox26: --- → unaffected
status-firefox27: --- → unaffected
status-firefox28: --- → fixed
status-firefox-esr24: --- → unaffected
The Ubuntu EC2 instances are m1.medium, which have 3.75 GB of memory.
Flags: needinfo?(catlee)