Closed
Bug 782641
Opened 12 years ago
Closed 11 years ago
Intermittent test_nodb_pluschanges.js | test failed (with xpcshell return code: 0) | Timeout - See following stack | head.js | exception thrown from do_timeout callback: 2147500036
Categories
(Firefox :: Search, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: emorley, Assigned: Yoric)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file, 3 obsolete files)
2.00 KB,
patch
|
Gavin
:
review+
|
Details | Diff | Splinter Review |
Rev3 WINNT 6.1 fx-team pgo test xpcshell on 2012-08-13 17:16:12 PDT for push 84324c323762 slave: talos-r3-w7-039 https://tbpl.mozilla.org/php/getParsedLog.php?id=14355004&tree=Fx-Team { TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_nodb_pluschanges.js | Forcing flush *** Search: SRCH_SVC_saveSortedEngineList: starting *** Search: SRCH_SVC_saveSortedEngineList: something to do *** Search: epsCommit: start *** Search: epsCommit: (re)setting timer *** Search: Lazy_go: starting *** Search: Lazy_go: creating timer *** Search: SRCH_SVC_saveSortedEngineList: done *** Search: _buildCache: Writing to cache file. *** Search: Lazy_flush: starting *** Search: epsWriteCommit: start TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_nodb_pluschanges.js | Commit complete TEST-PASS | c:/talos-slave/test/build/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_nodb_pluschanges.js | [null : 66] true == true TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_nodb_pluschanges.js | Parsing metadata *** Search: epsWriteCommit: done 0 *** Search: epsWriteCommit: done 0 TEST-UNEXPECTED-FAIL | c:/talos-slave/test/build/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_nodb_pluschanges.js | Timeout - See following stack: JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 451 JS frame :: c:/talos-slave/test/build/xpcshell/tests/toolkit/components/search/tests/xpcshell/test_nodb_pluschanges.js :: <TOP_LEVEL> :: line 99 JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 121 TEST-INFO | (xpcshell/head.js) | exiting test TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\head.js | exception thrown from do_timeout callback: 2147500036 - See following stack: JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 451 JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 123 }
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 3•12 years ago
|
||
Since this is reported only under Windows, I suspect that this is due to a file being improperly closed.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•12 years ago
|
Keywords: intermittent-failure
Updated•12 years ago
|
Whiteboard: [orange]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 15•11 years ago
|
||
Self-reviewing a trivial typo fix. Not 100% sure it will fix the current bug, but it's worth a try.
Assignee: nobody → dteller
Attachment #731189 -
Flags: review+
Assignee | ||
Updated•11 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 16•11 years ago
|
||
Alternative fix, equally trivial, as suggested by gavin.
Attachment #731189 -
Attachment is obsolete: true
Attachment #731233 -
Flags: review+
Comment 17•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/3f725a32ba89
Flags: in-testsuite-
Keywords: checkin-needed
Assignee | ||
Comment 18•11 years ago
|
||
After more analysis, I believe that the fix will not prevent the intermittent test failures. I have the strong impression that this is yet another case of Windows not removing files when it claims to have removed them, and rather producing "Access denied" errors when one attempts to do anything that could touch that file. The problem would appear in |writeCommit|.
Comment 19•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/3f725a32ba89
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 22
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•11 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: Firefox 22 → ---
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 24•11 years ago
|
||
I'll see if an active wait can ensure that we always wait long enough that Windows flushes its buffers.
Assignee | ||
Comment 25•11 years ago
|
||
Ok, here's a version of the test suite that spins until the file is effectively removed. Something of a hack, but it seems to do the trick.
Attachment #731233 -
Attachment is obsolete: true
Attachment #750424 -
Flags: review?(gavin.sharp)
Comment 26•11 years ago
|
||
Can you explain what the problem is exactly, and why this fixes it? I don't understand based on earlier comments here.
Flags: needinfo?(dteller)
Assignee | ||
Comment 27•11 years ago
|
||
Windows has an interesting implementation of file removal. After the system call has returned, there is a brief interval during which the file *name* is unavailable, i.e. any system call involving this file name will fail. There is no system call to inform clients that the file name has become available again. In the testsuite, we cleanup some files that may be leftover from other tests, then proceed to call code that immediately recreates these files. This causes the behavior mentioned above. The fix simply loops until the file name is available before proceeding.
Flags: needinfo?(dteller)
Comment 28•11 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] from comment #27) > In the testsuite, we cleanup some files that may be leftover from other > tests, then proceed to call code that immediately recreates these files. > This causes the behavior mentioned above. > > The fix simply loops until the file name is available before proceeding. How does calling open("I do not exist") do this? Shouldn't be you be calling open() on the filename that we depend on being available instead?
Comment 29•11 years ago
|
||
(sorry, I meant "read", not "open")
Assignee | ||
Comment 30•11 years ago
|
||
Might be related to me confusing two patches for distinct bugs :)
Attachment #750424 -
Attachment is obsolete: true
Attachment #750424 -
Flags: review?(gavin.sharp)
Attachment #750701 -
Flags: review?(gavin.sharp)
Comment 31•11 years ago
|
||
Comment on attachment 750701 [details] [diff] [review] Hacking around Windows file system oddities This makes much more sense! IIRC this problem has come up a lot before, I forget how it was solved in other cases. A while(true)exists() loop seems dangerous - if the removal fails for some reason, this could iloop. I guess polling less frequently and adding a callback would require more extensive test changes...
Assignee | ||
Comment 32•11 years ago
|
||
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #31) > Comment on attachment 750701 [details] [diff] [review] > Hacking around Windows file system oddities > > This makes much more sense! > > IIRC this problem has come up a lot before, I forget how it was solved in > other cases. A while(true)exists() loop seems dangerous - if the removal > fails for some reason, this could iloop. Well, I assumed that we would timeout. I can add a 1second timeout, if necessary. > I guess polling less frequently and > adding a callback would require more extensive test changes... It would. If you deem it necessary, though, I can rewrite the tests to make them somewhat async.
Comment 33•11 years ago
|
||
Hmm, interestingly this problem seems to no longer occur. Did we asyncify something that makes this problem less likely to occur in practice now?
Comment 34•11 years ago
|
||
Comment on attachment 750701 [details] [diff] [review] Hacking around Windows file system oddities I guess this is fine as a test-hack, but let's not land it unless we need to.
Attachment #750701 -
Flags: review?(gavin.sharp) → review+
Assignee | ||
Comment 35•11 years ago
|
||
Let's WONTFIX this for the moment, then.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → WONTFIX
Updated•11 years ago
|
Resolution: WONTFIX → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•