Closed Bug 915879 Opened 11 years ago Closed 10 years ago

Intermittent test_packaged_app_update.html | Test timed out.

Categories

(Core Graveyard :: DOM: Apps, defect)

x86
Windows 8
defect
Not set
normal

Tracking

(firefox27 unaffected, firefox28 wontfix, firefox29 wontfix, firefox30 fixed, firefox31 fixed, firefox32 fixed, firefox-esr24 unaffected, b2g-v1.2 unaffected, b2g-v1.3 disabled, b2g-v1.3T disabled, b2g-v1.4 fixed, b2g-v2.0 fixed)

RESOLVED FIXED
mozilla31
Tracking Status
firefox27 --- unaffected
firefox28 --- wontfix
firefox29 --- wontfix
firefox30 --- fixed
firefox31 --- fixed
firefox32 --- fixed
firefox-esr24 --- unaffected
b2g-v1.2 --- unaffected
b2g-v1.3 --- disabled
b2g-v1.3T --- disabled
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: RyanVM, Assigned: marco)

References

Details

(Keywords: intermittent-failure)

Attachments

(2 files, 1 obsolete file)

Not sure how relevant those System JS errors are, but they look troubling.

https://tbpl.mozilla.org/php/getParsedLog.php?id=27777825&tree=Mozilla-Central

WINNT 6.2 mozilla-central debug test mochitest-2 on 2013-09-12 10:14:26 PDT for push d5fc994ca2ed
slave: t-w864-ix-003

10:23:30     INFO -  System JS : ERROR resource://gre/modules/Webapps.jsm:1087
10:23:30     INFO -                       NS_ERROR_FILE_IS_LOCKED: Component returned failure code: 0x8052000e (NS_ERROR_FILE_IS_LOCKED) [nsIFile.moveTo]
10:23:31     INFO -  WARNING: 1 sort operation has occurred for the SQL statement '0x18580060'.  See https://developer.mozilla.org/En/Storage/Warnings details.: file e:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/storage/src/mozStoragePrivateHelpers.cpp, line 110
10:23:31     INFO -  WARNING: 1 sort operation has occurred for the SQL statement '0x18581060'.  See https://developer.mozilla.org/En/Storage/Warnings details.: file e:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/storage/src/mozStoragePrivateHelpers.cpp, line 110
10:24:15     INFO -  System JS : ERROR resource://gre/modules/FormHistory.jsm:376
10:24:15     INFO -                       NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]
10:24:15     INFO -  WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file e:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/toolkit/components/places/Database.cpp, line 453
10:24:15     INFO -  WARNING: NS_ENSURE_TRUE(mDB) failed: file e:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/toolkit/components/places/nsNavHistory.cpp, line 298
10:24:15     INFO -  System JS : ERROR resource://gre/modules/PlacesUtils.jsm:1815
10:24:15     INFO -                       NS_ERROR_XPC_GS_RETURNED_FAILURE: Component returned failure code: 0x80570016 (NS_ERROR_XPC_GS_RETURNED_FAILURE) [nsIJSCID.getService]
10:24:15     INFO -  WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file e:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/toolkit/components/places/Database.cpp, line 453
10:24:15     INFO -  WARNING: NS_ENSURE_TRUE(mDB) failed: file e:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/toolkit/components/places/nsNavHistory.cpp, line 298
10:26:01     INFO -  881 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/apps/tests/test_packaged_app_update.html | Test timed out.
10:26:01     INFO -  INFO | automation.py | Launching: C:\slave\test\build\tests\bin\screenshot.exe c:\users\cltbld~1.t-w\appdata\local\temp\mozilla-test-fail_ru6avv
10:26:02     INFO -  SCREENSHOT: <see log>
10:26:02     INFO -  882 INFO TEST-END | /tests/dom/apps/tests/test_packaged_app_update.html | finished in 304110ms
Is it possible that we're trying to move the manifest.webapp file from the temp directory while it's still being written?
The writeFile operation is async, but we never really wait for it to finish. Maybe when applyDownload is tring to move the file, the file is still being written to.

I couldn't find much documentation about the NS_ERROR_FILE_IS_LOCKED error, though. So this is just a wild guess.
Ok, if that's failing on windows, that's because a file is being moved while it's open. I fixed a bug with that before landing (because it failed on Windows all the time)... seems we have another one with that.
:amac, is this something you could look into? Sounds like you have an idea of what's going on here, per your earlier comment. If not, let me know and we'll look for other resources to dig in here.

Thanks!
Assignee: nobody → amac
:amac, any updates here? Please let me know if you won't be able to spend time on this test failure. Thanks.
(In reply to Johnny Stenback (:jst, jst@mozilla.com) from comment #144)
> :amac, any updates here? Please let me know if you won't be able to spend
> time on this test failure. Thanks.

Sorry, I didn't see this comment (or your previous one)... amongst all the fail mails. I can give it a shot, although I don't have a WINNT build environment to try.
There seem to be two different problems that have been bundled on this bug. The original problem is still happening sporadically on some Windows builds (and I think it's caused because we have a race condition where we're trying to move a file we've read before closing it). The last fail from this problem is on comment #113.

There's another problem (which I believe is a different one) which started happening on Dec 5th (comment 18) which only happens on B2G ICS emulator (opt). Windows builds seem to not be failing anymore on Mozilla Central (all the latest Windows failures are on M-A). 

I'm building an ICS emulator build to see if I can reproduce the problem locally. I don't know if it's worth trying to fix the original problem, unless it starts happening again on M-C.
Antonio, thanks for the investigation here! I agree that if the windows failures truly disappeared on m-c it's questionable whether it's worth investigating that any more. Ryan, thoughts on that, and also on whether we should just disable these tests on windows on aurora to stop the spew? Are we spinning any b2g releases off of current aurora?
Aurora is destined to be B2G v1.3. That said, I'm inclined to agree that the B2G failures are the bigger issue. I would suggest we focus there and get the fix uplifted and then re-evaluate. Depending on what the problem turns out to be, we may get the Windows fix for free as well if it's some sort of race or other timing issue.
Sorry about the delay in updating this. I'm having problems running the emulator mochitests locally, and since that's the only environment this seems to be currently failing that's making diagnosing the problem hard. When running the tests I get an error saying the device is configured to use a proxy that's rejecting connection... both running them with mach and executing runtestsb2g.py
Disabled for ongoing failures.

https://hg.mozilla.org/integration/b2g-inbound/rev/164ac10c8f7e
Whiteboard: [test disabled on B2G][leave open]
He, I was trying to get one of my test builds (with lots of logs enabled) to fail on the try server since I cannot run the emulator tests locally (dunno why, they just don't want to run inside of a VM, I can run the emulator but not the tests).

Unassigning it from myself. In any case, I think that bug 903291 should help with this kind of failures on the app tests.

Can you take a look at asking to reenable this when that bug lands, Fernando?
Assignee: amac → nobody
Flags: needinfo?(ferjmoreno)
Sure, I'll leave the needinfo on me as a reminder.
Blocks: 997886
Attached patch Patch (obsolete) — Splinter Review
With this patch, the test hasn't failed for several runs (13): https://tbpl.mozilla.org/?tree=Try&rev=8650a7467be2
Attachment #8412207 - Flags: review?(fabrice)
Comment on attachment 8412207 [details] [diff] [review]
Patch

Review of attachment 8412207 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Nice, that's actually more readable. But do you know why we were failing before?

::: dom/apps/src/Webapps.jsm
@@ +1968,5 @@
> +      let updateDeferred = Promise.defer();
> +
> +      updateSvc.checkForUpdate(Services.io.newURI(manifest.fullAppcachePath(), null, null),
> +                               aApp.localId, false,
> +                               (aSubject, aTopic) => updateDeferred.resolve(aTopic));

please add the 3rd observer parameter (aData) even if it's unused.
Attachment #8412207 - Flags: review?(fabrice) → review+
Attached patch PatchSplinter Review
(In reply to Fabrice Desré [:fabrice] from comment #262)
> r=me
> Nice, that's actually more readable. But do you know why we were failing
> before?

The test was failing in applyDownload, while trying to move the manifest file (http://hg.mozilla.org/mozilla-central/annotate/5fa1fb553007/dom/apps/src/Webapps.jsm#l1449). So I thought the problem could be that we were still writing the file when we tried to move it.
Let's hope this is really the problem, the try run looks promising!
Attachment #8412207 - Attachment is obsolete: true
Attachment #8412796 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/27207b8dc546
Keywords: checkin-needed
Whiteboard: [test disabled on B2G][leave open] → [leave open]
The latest failure is weird and looks unrelated to the other failures. Looks like the test timed out (with the custom timeout that we should remove) but then finished correctly...

Looks like the original intermittent failure was fixed.
Looking at the log, it looks like the custom 30 s timeout elapses while the test is still correctly running. I think the removal of the 30 s timeout (that will be done in bug 947565) will fix this new failure too.
Bug 947565 is fixed, the latest failure here should be fixed too.
Assignee: nobody → mar.castelluccio
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Marco, can you please request Aurora/Beta approval on this when you get a chance? Thanks :)
Flags: needinfo?(ferjmoreno) → needinfo?(mar.castelluccio)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #273)
> Marco, can you please request Aurora/Beta approval on this when you get a
> chance? Thanks :)

Sure, but I'm not sure how the request works. Should I first write a patch to Aurora/Beta and then ask for approval or ask for approval and then write the patch?
Flags: needinfo?(mar.castelluccio) → needinfo?(ryanvm)
If the patch needs rebasing, go ahead and do so, then request approval on that. Thanks!
Flags: needinfo?(ryanvm)
This is already in Aurora :D
Target Milestone: mozilla32 → mozilla31
touche!
(In reply to Marco Castelluccio [:marco] from comment #276)
> This is already in Aurora :D

Still needs approval for beta!
Flags: needinfo?(mar.castelluccio)
FWIW, I took a look at this and it's going to need some rebasing for beta uplift. Still worth doing, IMO :)
Attached patch Patch for BetaSplinter Review
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
User impact if declined: dom/apps/tests intermittently failing
Testing completed (on m-c, etc.): It's been on m-c and m-a for more than a week
Risk to taking this patch (and alternatives if risky): Low risk, most of the patch is modifying code indentation.
String or IDL/UUID changes made by this patch: None.

Note that this patch also contains the patch from bug 999220 (that fixed the last issue with file I/O in Webapps.jsm).
Attachment #8420144 - Flags: approval-mozilla-beta?
Flags: needinfo?(mar.castelluccio)
Attachment #8420144 - Attachment description: wait_for_writefile → Patch for Beta
Attachment #8420144 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.