Closed Bug 863738 Opened 13 years ago Closed 6 years ago

Intermittent test_cookies_persistence.js | test failed (with xpcshell return code: 0) | 4 == 0 | head_cookies.js | caught exception 2147500036 - See following stack:

Categories

(Core :: Networking: Cookies, defect, P3)

x86_64
macOS
defect

Tracking

()

RESOLVED INVALID

People

(Reporter: RyanVM, Unassigned)

Details

(Keywords: intermittent-failure, Whiteboard: [test disabled][leave open][necko-backlog])

https://tbpl.mozilla.org/php/getParsedLog.php?id=21964245&tree=Mozilla-Central Rev4 MacOSX Lion 10.7 mozilla-central opt test xpcshell on 2013-04-18 09:17:41 PDT for push fa5d5fccbc11 slave: talos-r4-lion-042 09:32:44 INFO - TEST-INFO | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js | running test ... 09:32:45 WARNING - TEST-UNEXPECTED-FAIL | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js | test failed (with xpcshell return code: 0), see following log: 09:32:45 INFO - >>>>>>> 09:32:45 INFO - TEST-INFO | (xpcshell/head.js) | test 1 pending 09:32:45 INFO - TEST-INFO | (xpcshell/head.js) | test 2 pending 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 121] 1 == 1 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 124] 2 == 2 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 127] 3 == 3 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 130] 4 == 4 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 121] 1 == 1 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 124] 2 == 2 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 127] 3 == 3 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [do_set_cookies : 130] 4 == 4 09:32:45 INFO - TEST-INFO | (xpcshell/head.js) | test 2 finished 09:32:45 INFO - TEST-INFO | (xpcshell/head.js) | running event loop 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [_observer.prototype.observe : 72] cookie-db-closed == cookie-db-closed 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js | [do_run_test : 52] 4 == 4 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js | [do_run_test : 53] 0 == 0 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js | [do_run_test : 60] 4 == 4 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js | [do_run_test : 61] 0 == 0 09:32:45 INFO - TEST-PASS | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | [_observer.prototype.observe : 72] cookie-db-closed == cookie-db-closed 09:32:45 WARNING - TEST-UNEXPECTED-FAIL | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js | 4 == 0 - See following stack: 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/head.js :: do_throw :: line 472 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/head.js :: do_report_result :: line 574 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/head.js :: _do_check_eq :: line 584 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/head.js :: do_check_eq :: line 591 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/test_cookies_persistence.js :: do_run_test :: line 67 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js :: do_run_generator :: line 47 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js :: _observer.prototype.observe :: line 78 09:32:45 INFO - native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0 09:32:45 INFO - TEST-INFO | (xpcshell/head.js) | exiting test 09:32:45 WARNING - TEST-UNEXPECTED-FAIL | /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js | caught exception 2147500036 - See following stack: 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/head.js :: do_throw :: line 472 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js :: do_run_generator :: line 50 09:32:45 INFO - JS frame :: /Users/cltbld/talos-slave/test/build/tests/xpcshell/tests/extensions/cookie/test/unit/head_cookies.js :: _observer.prototype.observe :: line 78 09:32:45 INFO - native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0 09:32:45 INFO - TEST-INFO | (xpcshell/head.js) | exiting test
I'm inclined to disable the test at this point due to the frequency and lack of traction in getting it investigated and fixed. Your thoughts, Patrick?
Flags: needinfo?(mcmanus)
ryanvm, :bagder will take a pass at this
Flags: needinfo?(mcmanus)
I've been running this single test in a loop on my mac here since several hours. It is on 1653 laps now and counting and not once has it failed for me. I'll leave it running longer but it may need something else to trigger. The log says Mac OS 10.7 for example and my mac is on 10.9. Could also be other things that differ. Do we even run the tests on other versions? Can I view that somehow?
Flags: needinfo?(ryanvm)
All of the recent failures have been on our OSX 10.6 rev4 mini slaves. According to the manifest, we're running this test across all platforms. http://mxr.mozilla.org/mozilla-central/source/extensions/cookie/test/unit/xpcshell.ini Worst case, you can always try getting a loaner slave to test on. https://wiki.mozilla.org/ReleaseEngineering/How_To/Request_a_slave
Flags: needinfo?(ryanvm)
Given the current Windows failure rate, this needs an owner Really Soon Now or we'll need to skip the test.
Flags: needinfo?(jduell.mcbugs)
Steve: Monica used to own cookies--is there anyone to take her place at this point?
Flags: needinfo?(jduell.mcbugs) → needinfo?(sworkman)
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #415) > Steve: Monica used to own cookies--is there anyone to take her place at this > point? We don't have an owner yet - I'm emailing her separately about it. Chris or Tanvi, can one of you take a look at this please? Daniel already took a look and couldn't reproduce it (comment #201). I'd like an idea of what's failing so we can know how to proceed. Looks like there are some pref changes - if it's testing non-default cases, then disabling the test should have a lower impact.
Flags: needinfo?(tanvi)
Flags: needinfo?(sworkman)
Flags: needinfo?(mozilla)
I can't reproduce at the moment, this needs more investigation. The only thing I spotted when starring at the testcode is a copy/paste error, but that's not responsible for the problem we are facing here: > var httpchannel1 = channel1.QueryInterface(Ci.nsIHttpChannelInternal); > - var httpchannel2 = channel1.QueryInterface(Ci.nsIHttpChannelInternal); > + var httpchannel2 = channel2.QueryInterface(Ci.nsIHttpChannelInternal); > httpchannel1.forceAllowThirdPartyCookie = true; > httpchannel2.forceAllowThirdPartyCookie = true; Maybe Tanvi has any ideas!
Flags: needinfo?(mozilla)
(In reply to Christoph Kerschbaumer [:ckerschb] from comment #471) > Maybe Tanvi has any ideas! Leaving my needinfo flag, as I will take a look next week. I have to wrap up a couple other bugs for Firefox 42 first. Ryan, looks like the test failures have all of a sudden spiked? Is that ture? If so, do you want to temporarily disable the test, or is it okay to leave as is until we have more info next week?
(In reply to Tanvi Vyas [:tanvi] from comment #498) > (In reply to Christoph Kerschbaumer [:ckerschb] from comment #471) > > Maybe Tanvi has any ideas! > > Leaving my needinfo flag, as I will take a look next week. I have to wrap > up a couple other bugs for Firefox 42 first. > > Ryan, looks like the test failures have all of a sudden spiked? Is that > ture? If so, do you want to temporarily disable the test, or is it okay to > leave as is until we have more info next week?
Flags: needinfo?(ryanvm)
(In reply to Tanvi Vyas [:tanvi] from comment #498) > Ryan, looks like the test failures have all of a sudden spiked? Is that > ture? If so, do you want to temporarily disable the test, or is it okay to > leave as is until we have more info next week? I'd prefer we disable now so we don't have to deal with it on Aurora post-uplift as well.
Flags: needinfo?(ryanvm)
Whiteboard: [test disabled][leave open]
The recent update is for when the test is run on Windows 8 64 bit. There are also a few Windows 7 bugs. This test is also failing recently and is also windows only - https://bugzilla.mozilla.org/show_bug.cgi?id=1156277 The line that is failing is https://mxr.mozilla.org/mozilla-central/source/extensions/cookie/test/unit/test_cookies_persistence.js#60 [1]. This same test succeeds on line 52, so maybe it is something about how we close the profile the second time: // Again, but don't wait for the async close to complete. This should always // work, since we blocked on close above and haven't kicked off any writes // since then. I'm not sure what this means. needinfo'ing dwitte to see if he can shed some light here. Also cc'ing mak since he wrote the other failing test. [1] TEST-UNEXPECTED-FAIL | extensions/cookie/test/unit/test_cookies_persistence.js | do_run_test - [do_run_test : 60] 0 == 4
Flags: needinfo?(dwitte)
Flags: needinfo?(tanvi)
I haven't read the code, because I wrote it about 5 years ago, but based on the comment I'm guessing it means that whatever it's trying to test at that point should succeed because 1) it depends on the implementation's db connection being closed and 2) the test or implementation is known to have already waited for the close to succeed, and hasn't reopened the connection for writes since.
Flags: needinfo?(dwitte)
(In reply to dwitte@gmail.com from comment #509) > I haven't read the code, because I wrote it about 5 years ago, but based on > the comment I'm guessing it means that whatever it's trying to test at that > point should succeed because 1) it depends on the implementation's db > connection being closed and 2) the test or implementation is known to have > already waited for the close to succeed, and hasn't reopened the connection > for writes since. So if we wait for the close to succeed and then load profile, this test succeeds. And if we don't wait for the close to succeed on Windows 8, the test fails when it didn't before. mak, can you look at this bug along with https://bugzilla.mozilla.org/show_bug.cgi?id=1156277?
The problem here is that openUnsharedDatabase returns NS_ERROR_STORAGE_BUSY (that is basically SQLITE_BUSY) when trying to open the database, at that point cookies think the database is corrupt, it calls TryInitDB(true), that basically throws away the database and creates a new one. The new database doesn't have the cookies so the test fails. Now the question is why we get a SQLITE_BUSY, it's clear another connection has a table lock, and considering when this happens, it's quite clear it's the previous connection that we asyncClosed without waiting. Whether it's worth spending time trying to figure what is causing a table lock to be taken (may be some internal Sqlite change or something that exists from a long time and it just happens more often due to a different thread scheduling), depends on how important is to retain this behavior. Afaik we don't support restartable profiles from a long time, so why is it important to test that closing and reopening a profile in the same tick works? I'd personally just throw away the idea that this thing should work. It's clear from this bug history that it's not realiable. Afaict, it's an almost impossible case in real-life. A possible (but I can't bet on that) workaround for this could be to set PRAGMA busy_timeout by default on all Storage connections, but we preferred to not do that cause it may cause main-thread jank, and since here we are failing on init, we could not do it just for cookies. So it's a no-go. Another possibility is to not throw away the database if we get NS_ERROR_STORAGE_BUSY, indeed I'm not sure why we do that, ideally we should only throw away the file if we get NS_ERROR_FILE_CORRUPTED. I didn't work on this code, so I'm not sure why it's doing that. But then what would be the solution, an endless while loop trying to open? That doesn't sound too good. But also losing all the cookies for a busy error doesn't sound nice. (In reply to Tanvi Vyas [:tanvi] from comment #511) > mak, can you look at this bug along with > https://bugzilla.mozilla.org/show_bug.cgi?id=1156277? I don't think there is any relation with bug 1156277, I will investigate that on Try with some added logging to figure at which point it fails, but that test doesn't cause a profile restart like this one and thus it's unrelated.
Whiteboard: [test disabled][leave open] → [test disabled][leave open][necko-backlog]
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Priority: P3 → P1
Priority: P1 → P3

No intermittent failures in the last 3 years.

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.