Closed Bug 991274 Opened 10 years ago Closed 10 years ago

Perma-fail on Linux64 slaves - test_hosted.xul,test_packaged.xul,test_packaged_launch.xul | Error during test: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileOutputStream.init]" nsresult: "0x80520012

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
critical

Tracking

(firefox29 unaffected, firefox30 fixed, firefox31 fixed, firefox-esr24 unaffected)

RESOLVED FIXED
Tracking Status
firefox29 --- unaffected
firefox30 --- fixed
firefox31 --- fixed
firefox-esr24 --- unaffected

People

(Reporter: RyanVM, Assigned: rail)

References

Details

(Keywords: intermittent-failure, Whiteboard: [buildduty])

Attachments

(1 file)

This started happening today on non-spot linux64 EC2 slaves across multiple trees. No clue what's different between spot and non-spot that would be causing this, but it appears to be perma-failing. The "ImportError: No module named pygtk" doesn't appear to be relevant as there are other failing logs where it doesn't appear.

https://tbpl.mozilla.org/php/getParsedLog.php?id=37156862&tree=Mozilla-Inbound

Ubuntu VM 12.04 x64 mozilla-inbound debug test mochitest-other on 2014-04-02 11:46:04 PDT for push d07b3f034bd5
slave: tst-linux64-ec2-016

12:18:27     INFO -  3961 INFO TEST-START | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul
12:18:27     INFO -  ++DOMWINDOW == 115 (0xe55e750) [pid = 2429] [serial = 5319] [outer = 0x3c48e90]
12:18:27     INFO -  Couldn't convert chrome URL: chrome://mochitests/tests/SimpleTest/test.css
12:18:27     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/netwerk/base/src/nsIOService.cpp, line 594
12:18:27     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/layout/style/Loader.cpp, line 1998
12:18:27     INFO -  3962 INFO TEST-INFO | dumping last 6 message(s)
12:18:27     INFO -  3963 INFO TEST-INFO | if you need more context, please use SimpleTest.requestCompleteLog() in your test
12:18:27     INFO -  3964 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | NativeApp object created
12:18:27     INFO -  3965 INFO TEST-INFO | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | Test update for an uninstalled application
12:18:27     INFO -  3966 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | Exception thrown
12:18:27     INFO -  3967 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | Profile directory created
12:18:27     INFO -  3968 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | profiles.ini file created
12:18:27     INFO -  3969 INFO TEST-INFO | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | Test installation
12:18:27     INFO -  3970 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | Error during test: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileOutputStream.init]"  nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)"  location: "JS frame :: resource://gre/components/nsINIProcessor.js :: INIProcessor.prototype.writeFile :: line 147"  data: no]
12:18:27     INFO -  3971 INFO TEST-INFO | MEMORY STAT vsize after test: 2186780672
12:18:27     INFO -  3972 INFO TEST-INFO | MEMORY STAT residentFast after test: 736190464
12:18:27     INFO -  3973 INFO TEST-END | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted.xul | finished in 487ms
12:18:28     INFO -  ++DOMWINDOW == 116 (0x960b0b0) [pid = 2429] [serial = 5320] [outer = 0x3c48e90]
12:18:28     INFO -  3974 INFO TEST-START | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted_launch.xul
12:18:28     INFO -  ++DOMWINDOW == 117 (0xd29c630) [pid = 2429] [serial = 5321] [outer = 0x3c48e90]
12:18:28     INFO -  Couldn't convert chrome URL: chrome://mochitests/tests/SimpleTest/test.css
12:18:28     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/netwerk/base/src/nsIOService.cpp, line 594
12:18:28     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/layout/style/Loader.cpp, line 1998
12:18:28     INFO -  3975 INFO TEST-INFO | dumping last 3 message(s)
12:18:28     INFO -  3976 INFO TEST-INFO | if you need more context, please use SimpleTest.requestCompleteLog() in your test
12:18:28     INFO -  3977 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted_launch.xul | NativeApp object created
12:18:28     INFO -  3978 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted_launch.xul | Profile directory created
12:18:28     INFO -  3979 INFO TEST-INFO | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted_launch.xul | Test installation
12:18:28     INFO -  3980 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted_launch.xul | Error during test: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileOutputStream.init]"  nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)"  location: "JS frame :: resource://gre/components/nsINIProcessor.js :: INIProcessor.prototype.writeFile :: line 147"  data: no]
12:18:28     INFO -  3981 INFO TEST-INFO | MEMORY STAT vsize after test: 2186555392
12:18:28     INFO -  3982 INFO TEST-INFO | MEMORY STAT residentFast after test: 736235520
12:18:28     INFO -  3983 INFO TEST-END | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_hosted_launch.xul | finished in 422ms
12:18:28     INFO -  ++DOMWINDOW == 118 (0xddf9160) [pid = 2429] [serial = 5322] [outer = 0x3c48e90]
12:18:28     INFO -  3984 INFO TEST-START | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul
12:18:28     INFO -  ++DOMWINDOW == 119 (0x5b6d580) [pid = 2429] [serial = 5323] [outer = 0x3c48e90]
12:18:28     INFO -  Couldn't convert chrome URL: chrome://mochitests/tests/SimpleTest/test.css
12:18:28     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/netwerk/base/src/nsIOService.cpp, line 594
12:18:28     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/layout/style/Loader.cpp, line 1998
12:18:29     INFO -  3985 INFO TEST-INFO | dumping last 6 message(s)
12:18:29     INFO -  3986 INFO TEST-INFO | if you need more context, please use SimpleTest.requestCompleteLog() in your test
12:18:29     INFO -  3987 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | NativeApp object created
12:18:29     INFO -  3988 INFO TEST-INFO | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | Test update for an application that isn't installed
12:18:29     INFO -  3989 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | Exception thrown
12:18:29     INFO -  3990 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | Profile directory created
12:18:29     INFO -  3991 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | profiles.ini file created
12:18:29     INFO -  3992 INFO TEST-INFO | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | Test installation
12:18:29     INFO -  3993 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | Error during test: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileOutputStream.init]"  nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)"  location: "JS frame :: resource://gre/components/nsINIProcessor.js :: INIProcessor.prototype.writeFile :: line 147"  data: no]
12:18:29     INFO -  3994 INFO TEST-INFO | MEMORY STAT vsize after test: 2186555392
12:18:29     INFO -  3995 INFO TEST-INFO | MEMORY STAT residentFast after test: 736264192
12:18:29     INFO -  3996 INFO TEST-END | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged.xul | finished in 423ms
12:18:29     INFO -  ++DOMWINDOW == 120 (0xdfb1050) [pid = 2429] [serial = 5324] [outer = 0x3c48e90]
12:18:29     INFO -  3997 INFO TEST-START | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged_launch.xul
12:18:29     INFO -  ++DOMWINDOW == 121 (0x64d7470) [pid = 2429] [serial = 5325] [outer = 0x3c48e90]
12:18:29     INFO -  Couldn't convert chrome URL: chrome://mochitests/tests/SimpleTest/test.css
12:18:29     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/netwerk/base/src/nsIOService.cpp, line 594
12:18:29     INFO -  [Parent 2429] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-l64-d-0000000000000000000/build/layout/style/Loader.cpp, line 1998
12:18:29     INFO -  3998 INFO TEST-INFO | dumping last 3 message(s)
12:18:29     INFO -  3999 INFO TEST-INFO | if you need more context, please use SimpleTest.requestCompleteLog() in your test
12:18:29     INFO -  4000 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged_launch.xul | NativeApp object created
12:18:29     INFO -  4001 INFO TEST-PASS | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged_launch.xul | Profile directory created
12:18:29     INFO -  4002 INFO TEST-INFO | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged_launch.xul | Test installation
12:18:29     INFO -  4003 INFO TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged_launch.xul | Error during test: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileOutputStream.init]"  nsresult: "0x80520012 (NS_ERROR_FILE_NOT_FOUND)"  location: "JS frame :: resource://gre/components/nsINIProcessor.js :: INIProcessor.prototype.writeFile :: line 147"  data: no]
12:18:29     INFO -  4004 INFO TEST-INFO | MEMORY STAT vsize after test: 2186620928
12:18:29     INFO -  4005 INFO TEST-INFO | MEMORY STAT residentFast after test: 736276480
12:18:29     INFO -  4006 INFO TEST-END | chrome://mochitests/content/chrome/toolkit/webapps/tests/test_packaged_launch.xul | finished in 441ms
This appears to be due to disk space and related to bug 988262.  The remaining instances with only 15GB space are being recreated during this weekend's tree-closing window (TCW).
this shouldn't be a disk issue, since bug 988262 is closed (see https://bugzilla.mozilla.org/show_bug.cgi?id=988262#c17).

I can't parse what is the real error in the logs and what does it mean... :/
Can we run mochitest-other on spot instances only for now?
Flags: needinfo?(rail)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #6)
> Can we run mochitest-other on spot instances only for now?

Yes, but tomorrow spot instances will be in the same state (we regenerate the AMIs every night).
Flags: needinfo?(rail)
Attached patch pkg.diffSplinter Review
Sounds like a puppet issue. I dumped the pakacges on spot and on-demand instances and they look a bit different. See the attached diff/
See also bug 975216
Sounds like bug 975216 didn't upgrade all generated packages properly and our test slaves used to have packages with different versions...
Depends on: 975216
Can you be a little more descriptive about what happened?
Sure.

Our old on-demand instances (used to generate spot AMIs) were generated ages ago. The used to use libxcb1 and libxcb*0 1.8.1-1. Bug 975216 upgraded libxcb1 only.

When I regenerated new AMIs apt-get installed libxcb1 1.8.1-2ubuntu2.1mozilla1 and all corresponding versions oncluding libxcb-*0 what broke the tests (that's my theory, yet to be proven).
OK, so in a way this traces back to inconsistent Ubuntu base images?
(In reply to Dustin J. Mitchell [:dustin] from comment #14)
> OK, so in a way this traces back to inconsistent Ubuntu base images?

Not really. The base image is the same.

We have 2 different paths:

1) 
* use the base image
* apply the puppet manifests to the version *before* we upgraded the version of that library
* apply the puppet manifests to the current version

This would geneerate in a different package list comparing to

2)
* use the base image
* apply the puppet manifests to the current version
FTR, I disabled the spot AMI generation job, so the spot instances shouldn't be affected by this tomorrow.
Assignee: nobody → rail
Hmm, I just checked tst-linux64-ec2-002 (from comment #23) and it has the same package list as one of the spot instances...
So, the issue is not related to the package list mismatch since downgrading the packages didn't help.

Ryan, can we find someone who can help debugging this failure? The logs are not that helpful to tell what's going on here...
Flags: needinfo?(ryanvm)
Pretty sure anyone who can help is already CCed...
Flags: needinfo?(ryanvm)
The installation is failing while creating a INI configuration file (http://mxr.mozilla.org/mozilla-central/source/toolkit/webapps/LinuxNativeApp.js#259).
nsIINIParserWriter::writeFile creates the file (http://mxr.mozilla.org/mozilla-central/source/xpcom/ds/nsINIProcessor.js#144) before writing.
The problem here is that the file isn't created.

The file is created in a directory under the /tmp directory. Is it possible that the /tmp volume is full?
(In reply to Marco Castelluccio [:marco] from comment #29)
> Is it possible that the /tmp volume is full?

Could be, but the machines have ~12G of empty disk space at the beginning, what should be quite enough.
(In reply to Rail Aliiev [:rail] from comment #30)
> (In reply to Marco Castelluccio [:marco] from comment #29)
> > Is it possible that the /tmp volume is full?
> 
> Could be, but the machines have ~12G of empty disk space at the beginning,
> what should be quite enough.

You could add the following line in LinuxNativeApp.js#261:
> dump("Space available: " + webappINIfile.diskSpaceAvailable);
It doesn't look like a disk space issue. I just connected to tst-linux64-spot-402 which runs the test for https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=783c8af4658b. I can see the TEST-UNEXPECTED-FAIL messages in the logs but the vm has plenty of disk space (9G).
Summary: Perma-fail on Linux64 non-spot slaves - test_hosted.xul,test_packaged.xul,test_packaged_launch.xul | Error during test: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileOutputStream.init]" nsresult: "0x80520012 → Perma-fail on Linux64 slaves - test_hosted.xul,test_packaged.xul,test_packaged_launch.xul | Error during test: [Exception... "Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFileOutputStream.init]" nsresult: "0x80520012
Another thing we like to do in our tests to then break our tests, besides actually filling the disk, is to create the maximum number of files in a single directory, however high that may be.
All trees closed over this; current failure rate isn't sustainable.
Severity: normal → blocker
(In reply to Rail Aliiev [:rail] from comment #51)
> It doesn't look like a disk space issue. I just connected to
> tst-linux64-spot-402 which runs the test for
> https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=783c8af4658b. I can see
> the TEST-UNEXPECTED-FAIL messages in the logs but the vm has plenty of disk
> space (9G).

Is there a separate partition for "/tmp"? How much space is reserved for this partition, if there is one?
On my system I've plenty of space, but "/tmp" is small and I need to clean it often when running tests.
I've had another idea, could you access the test machine and see how many directories (with a name that starts with samplehostedapp) in /tmp are created?
Depends on: 994216
Severity: blocker → critical
Filed 994216 to track fixes. This bug will continue to be the tbpl honeypot.
(In reply to Marco Castelluccio [:marco] from comment #78)
> 
> Is there a separate partition for "/tmp"? How much space is reserved for
> this partition, if there is one?

/tmp is not mounted as a separate partition. 

(In reply to Marco Castelluccio [:marco] from comment #86)
> I've had another idea, could you access the test machine and see how many
> directories (with a name that starts with samplehostedapp) in /tmp are
> created?

I was watching# of files (find /tmp | wc -l) one of the slaves and there were ~950 files there when it started failing.
the fix for this is tracked in bug 994216
/me declares victory
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: