Closed
Bug 909933
Opened 10 years ago
Closed 10 years ago
Intermittent browser_thumbnails_update.js | thumbnail is still green - Got 255,0,0, expected 0,255,0
Categories
(Firefox :: Tabbed Browser, defect)
Tracking
()
RESOLVED
DUPLICATE
of bug 916497
People
(Reporter: RyanVM, Unassigned)
Details
(Keywords: intermittent-failure)
Attachments
(1 file)
14.76 KB,
patch
|
markh
:
review+
|
Details | Diff | Splinter Review |
https://tbpl.mozilla.org/php/getParsedLog.php?id=27078953&tree=Mozilla-Inbound Windows XP 32-bit mozilla-inbound debug test mochitest-browser-chrome on 2013-08-27 12:24:24 PDT for push 1dce2c72ea15 slave: t-xp32-ix-102 13:31:01 INFO - TEST-PASS | chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js | c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpahnfov\thumbnails\eae6197cdeceaf2b806a2119a1bf8ebe.png should exist 13:31:02 INFO - ++DOMWINDOW == 4 (06685558) [serial = 4] [outer = 05457BE8] 13:31:02 INFO - ++DOMWINDOW == 5 (0669D488) [serial = 5] [outer = 05457BE8] 13:31:02 INFO - TEST-PASS | chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js | modified time should be >= now 13:31:02 INFO - ++DOCSHELL 23A97628 == 44 [id = 6315] 13:31:02 INFO - ++DOMWINDOW == 308 (158E79F8) [serial = 16140] [outer = 00000000] 13:31:02 INFO - ++DOMWINDOW == 309 (1DA2E4F0) [serial = 16141] [outer = 158E79F8] 13:31:02 INFO - [Parent 3068] WARNING: NS_ENSURE_TRUE(mMutable) failed: file e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/netwerk/base/src/nsSimpleURI.cpp, line 266 13:31:02 INFO - [Parent 3068] WARNING: NS_ENSURE_TRUE(mMutable) failed: file e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/netwerk/base/src/nsSimpleURI.cpp, line 266 13:31:02 INFO - ++DOMWINDOW == 310 (35E33B88) [serial = 16142] [outer = 158E79F8] 13:31:02 INFO - --DOCSHELL 1EA17A40 == 43 [id = 6308] 13:31:02 INFO - --DOCSHELL 1D93C350 == 42 [id = 6307] 13:31:02 INFO - --DOCSHELL 245E4420 == 41 [id = 6306] 13:31:02 INFO - --DOCSHELL 0BB3BAB0 == 40 [id = 6305] 13:31:02 INFO - --DOCSHELL 1EBA7F58 == 39 [id = 6304] 13:31:02 INFO - --DOCSHELL 20561520 == 38 [id = 6303] 13:31:02 INFO - --DOCSHELL 0D444A00 == 37 [id = 6302] 13:31:02 INFO - --DOCSHELL 150CC970 == 36 [id = 6301] 13:31:02 INFO - --DOCSHELL 13210A70 == 35 [id = 6300] 13:31:02 INFO - --DOCSHELL 14B7F8A0 == 34 [id = 6299] 13:31:02 INFO - --DOCSHELL 36838F80 == 33 [id = 6298] 13:31:02 INFO - --DOCSHELL 3772E4A0 == 32 [id = 6297] 13:31:02 INFO - --DOCSHELL 14639098 == 31 [id = 6296] 13:31:02 INFO - --DOCSHELL 137A85E0 == 30 [id = 6295] 13:31:02 INFO - --DOCSHELL 21211FE8 == 29 [id = 6294] 13:31:02 INFO - --DOMWINDOW == 309 (218F2E40) [serial = 16040] [outer = 00000000] [url = about:privatebrowsing] 13:31:02 INFO - --DOMWINDOW == 308 (36FE3C40) [serial = 16047] [outer = 00000000] [url = chrome://global/content/mozilla.xhtml] 13:31:02 INFO - --DOMWINDOW == 307 (23D07D90) [serial = 16039] [outer = 00000000] [url = about:blank] 13:31:02 INFO - --DOCSHELL 230DB190 == 28 [id = 6293] 13:31:02 INFO - --DOCSHELL 1E305C28 == 27 [id = 6292] 13:31:02 INFO - --DOCSHELL 0B9FB100 == 26 [id = 6291] 13:31:02 INFO - --DOCSHELL 37CC0B28 == 25 [id = 6290] 13:31:02 INFO - --DOCSHELL 1FE8E1D8 == 24 [id = 6289] 13:31:02 INFO - --DOCSHELL 128A3FB8 == 23 [id = 6288] 13:31:02 INFO - --DOCSHELL 22B24160 == 22 [id = 6287] 13:31:02 INFO - --DOMWINDOW == 306 (23F0F0D8) [serial = 16050] [outer = 00000000] [url = about:neterror?e=deniedPortAccess&u=http%3A//127.0.0.1%3A1/&c=UTF-8&d=This%20address%20uses%20a%20network%20port%20which%20is%20normally%20used%20for%20purposes%20other%20than%20Web%20browsing.%20Firefox%20has%20canceled%20the%20request%20for%20your%20protection.] 13:31:02 INFO - --DOMWINDOW == 305 (1FAB95D8) [serial = 16030] [outer = 00000000] [url = about:blank] 13:31:02 INFO - --DOMWINDOW == 304 (15B2DB60) [serial = 16044] [outer = 00000000] [url = http://mochi.test:8888/browser/toolkit/components/thumbnails/test/thumbnails_background.sjs?%7B%22setGreenCookie%22%3Atrue%7D] 13:31:02 INFO - --DOMWINDOW == 303 (13ED0EC0) [serial = 16043] [outer = 00000000] [url = http://mochi.test:8888/browser/toolkit/components/thumbnails/test/thumbnails_background.sjs?%7B%22setGreenCookie%22%3Atrue%7D] 13:31:02 INFO - [Parent 3068] WARNING: NS_ENSURE_TRUE(mMutable) failed: file e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/netwerk/base/src/nsSimpleURI.cpp, line 266 13:31:02 WARNING - TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js | thumbnail is still green - Got 255,0,0, expected 0,255,0 13:31:02 INFO - Stack trace: 13:31:02 INFO - JS frame :: chrome://mochikit/content/browser-test.js :: test_is :: line 560 13:31:02 INFO - JS frame :: chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js :: errorResponseUpdateTest/</< :: line 105 13:31:02 INFO - JS frame :: chrome://mochitests/content/browser/toolkit/components/thumbnails/test/head.js :: <TOP_LEVEL> :: line 162 13:31:02 INFO - JS frame :: chrome://mochikit/content/browser-test.js :: <TOP_LEVEL> :: line 589 13:31:02 INFO - native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0 13:31:02 INFO - TEST-INFO | chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js | Running subtest goodResponseUpdateTest
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 6•10 years ago
|
||
The log in comment 0 shows that subtests are getting interleaved, so something bad's happening: > 13:31:01 INFO - TEST-PASS | chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js | c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpahnfov\thumbnails\eae6197cdeceaf2b806a2119a1bf8ebe.png should exist > 13:31:02 INFO - TEST-PASS | chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js | modified time should be >= now > 13:31:02 WARNING - TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/toolkit/components/thumbnails/test/browser_thumbnails_update.js | thumbnail is still green - Got 255,0,0, expected 0,255,0 Lines 1 and 3 are from capIfStaleErrorResponseUpdateTest, but line 2 is from capIfStaleGoodResponseUpdateTest. The latter runs after the former. Investigating...
Assignee: nobody → adw
Status: NEW → ASSIGNED
Comment 7•10 years ago
|
||
Oops, this test has been updated since comment 0, so I was comparing an old log to current code. The logs after comment 0 show a similar but different error, and now I don't see what the problem could be. A green thumbnail is on disk that's somehow overwritten by a red thumbnail, but that should not happen, because the bg service never overwrites any files, and the fg service doesn't overwrite when the response is not a 2xx. All the failures are on Windows, though... I remember hearing over the years about flakiness on Windows due to files not being "completely" written when they should be, causing random test failures. We're just relying on OS.File, so there ought to be similar OS.File intermittent Windows failures if that were the problem, but bug searches don't reveal any, open or closed. A shot in the dark: PageThumbsStorage.writeData passes `flush: false` to OS.File.writeAtomic. Maybe passing true would make a difference, but reading the writeAtomic implementation [1], I don't see how, and none of the writeAtomic tests pass `flush` anyway [2]. So I don't know what. Maybe some logging, and changing the thumbnail colors per subtest to be sure that they aren't interfering with each other. [1] http://mxr.mozilla.org/mozilla-central/source/toolkit/components/osfile/modules/osfile_shared_front.jsm#383 [2] http://mxr.mozilla.org/mozilla-central/source/toolkit/components/osfile/tests/mochi/worker_test_osfile_front.js#184
Comment 8•10 years ago
|
||
(In reply to Drew Willcoxon :adw from comment #7) > and the fg > service doesn't overwrite when the response is not a 2xx. And now it shouldn't overwrite one for 2 days, and the failure is the second capture of that URL IIRC. > A shot in the dark: PageThumbsStorage.writeData passes `flush: false` to > OS.File.writeAtomic. Maybe passing true would make a difference, but > reading the writeAtomic implementation [1], I don't see how, and none of the > writeAtomic tests pass `flush` anyway [2]. writeAtomic taking that param looks like a copy-paste error to me :) It closes the handle, so there's no need to flush (it makes sense for plain write() though) > So I don't know what. Maybe some logging, and changing the thumbnail colors > per subtest to be sure that they aren't interfering with each other. Yeah - I had a look and couldn't rationalize it either :) That .sjs is fragile too - resetting the state second time around is a problem, so logging there would help too - pity it's not more regular, then you wouldn't have trouble hitting it on try :) IMO it wouldn't really hurt if we checked into trunk one or 2 logStringMessage (sp?) calls as a b/g capture started and completed.
Comment 9•10 years ago
|
||
This makes the test more rigorous. It tests in a consistent manner all possible and valid combinations of capture method, response status, whether the file previously exists, and whether the file is stale. It ensures page colors are as expected after page load, not only the colors of thumbnails that capture those pages. It ensures thumbnail files don't exist before each subtest, and each subtest removes its file. It generates a unique color each time the sjs loads. It doesn't use timestamps anymore since page colors are sufficient. I don't know if this will fix the failure. Maybe it'll introduce others. But I think it's sturdier than what we have now. We can add logging later if it's not enough.
Attachment #809555 -
Flags: review?(mhammond)
Comment 10•10 years ago
|
||
Comment on attachment 809555 [details] [diff] [review] update.js rewrite Review of attachment 809555 [details] [diff] [review]: ----------------------------------------------------------------- IMO this change has made the test significantly harder to read/rationalize, with the main advantage being that if the tests somehow run in parallel it will be more obvious. I'm very skeptical that this is the reason behind the oranges and if they do exist for other reasons, I don't think this is going to make it much easier to see why. That said though, there are enough good changes here that I really can't object - but fingers-crossed this actually helps oranges and doesn't create new ones. I'd like to suggest that this gets pushed to try and (say) 10 retriggers are done on each platform before checking in, just so we can be confident we aren't causing more problems than we are solving here. ::: toolkit/components/thumbnails/test/browser_thumbnails_update.js @@ +109,5 @@ > + let captureExpected = (capIfStale && > + ((responseOK && (!exists || stale)) || > + (!responseOK && !exists))) || > + (!capIfStale && > + (responseOK || !exists)); this is a very complex expression to wrap your head around. I think I'd prefer to see it written as something like: if (capIfStale) captureExpected = responseOK ? !exists || stale : !exists; else captureExpected = responseOK || !exists; as I can mentally decode that far easier than the original. I also think we need some more info() call sprinkled here - if this test still has the same oranges as before, it seems it will be hard to trace the logic purely from the logs.
Attachment #809555 -
Flags: review?(mhammond) → review+
Comment 11•10 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=30dc8b9b87b2 So far one failure: https://tbpl.mozilla.org/php/getParsedLog.php?id=28429753&tree=Try Again, a fresh thumbnail is on disk, but apparently the subsequent captureAndStoreIfStale overwrites it. I'm really starting to suspect Windows file system flakiness. I'll add some logging to PageThumbs and try again.
Comment 12•10 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=f2137d089490 One failure: https://tbpl.mozilla.org/php/getParsedLog.php?id=28490547&tree=Try This time, a thumbnail is on disk, we load a 401 page, call captureAndStore, which writes the new thumbnail with noOverwrite=true, but still the old thumbnail is overwritten with the errored thumbnail. There's no writeData "The file exists" error like there is during test 3, when noOverwrite=true due to a 401 and the file exists. Otherwise everything is working as expected during the test. I'm going to have to add logging to OS.File now because this here is crazy.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•10 years ago
|
Blocks: fxdesktopbacklog
Whiteboard: [triage]
Updated•10 years ago
|
Whiteboard: [triage]
Updated•10 years ago
|
Whiteboard: p=0
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 26•10 years ago
|
||
Seeing this more frequently with mochitest-bc running in chunks.
Flags: firefox-backlog?
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•10 years ago
|
Flags: firefox-backlog? → firefox-backlog+
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•10 years ago
|
No longer blocks: fxdesktopbacklog
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 40•10 years ago
|
||
Patch in bug 916497 touches code used by this test and the most recent occurrence is the day that landed.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
Updated•10 years ago
|
Assignee: adw → nobody
Flags: firefox-backlog+ → firefox-backlog-
Whiteboard: p=0
You need to log in
before you can comment on or make changes to this bug.
Description
•