Closed Bug 956211 Opened 6 years ago Closed 5 years ago

Intermittent TEST-UNEXPECTED-FAIL | crashtests/694165-1.xhtml | load failed: null

Categories

(Core :: Networking, defect)

x86_64
macOS
defect
Not set

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: KWierso, Unassigned)

References

Details

(Keywords: intermittent-failure, regression)

https://tbpl.mozilla.org/php/getParsedLog.php?id=32488544&tree=Mozilla-Central
slave: talos-r4-snow-119



14:09:56     INFO -  REFTEST TEST-START | file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/694165-1.xhtml
14:09:56     INFO -  REFTEST TEST-LOAD | file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/694165-1.xhtml | 2503 / 2602 (96%)
14:09:56     INFO -  ++DOMWINDOW == 28 (0x1446c7b18) [pid = 889] [serial = 5666] [outer = 0x11d5a82c8]
14:10:28     INFO -  --DOMWINDOW == 27 (0x14872fe28) [pid = 889] [serial = 5633] [outer = 0x0] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E]
14:10:28     INFO -  --DOMWINDOW == 26 (0x148aa4f78) [pid = 889] [serial = 5632] [outer = 0x0] [url = file:///builds/slave/talos-slave/test/build/tests/reftest/tests/gfx/tests/crashtests/893572-4.html]
14:10:28     INFO -  --DOMWINDOW == 25 (0x1258f8088) [pid = 889] [serial = 5661] [outer = 0x0] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E]
14:10:28     INFO -  --DOMWINDOW == 24 (0x1436a7d38) [pid = 889] [serial = 5659] [outer = 0x0] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E]
14:10:30     INFO -  --DOMWINDOW == 23 (0x1488ac578) [pid = 889] [serial = 5658] [outer = 0x0] [url = file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/256-width.ico]
14:10:30     INFO -  --DOMWINDOW == 22 (0x144b1bcc8) [pid = 889] [serial = 5660] [outer = 0x0] [url = file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/256-height.ico]
14:10:52     INFO -  --DOMWINDOW == 21 (0x14283fd98) [pid = 889] [serial = 5634] [outer = 0x0] [url = file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/ownerdiscard.html]
14:11:12     INFO -  --DOMWINDOW == 20 (0x140741228) [pid = 889] [serial = 5662] [outer = 0x0] [url = http://localhost:49207/1388699285659/5/delayedframe.sjs]
14:15:09     INFO -  REFTEST TEST-UNEXPECTED-FAIL | file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/694165-1.xhtml | load failed: null
14:15:09     INFO -  REFTEST INFO | Saved log: START file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/694165-1.xhtml
14:15:09     INFO -  REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E
14:15:09     INFO -  REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E
14:15:09     INFO -  REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/694165-1.xhtml
14:15:09     INFO -  REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/694165-1.xhtml
14:15:09     INFO -  REFTEST INFO | Loading a blank page
14:15:20     INFO -  ++DOMWINDOW == 21 (0x151e219b8) [pid = 889] [serial = 5667] [outer = 0x11d5a82c8]
14:15:37     INFO -  REFTEST TEST-END | file:///builds/slave/talos-slave/test/build/tests/reftest/tests/image/test/crashtests/694165-1.xhtml
Any ideas on this, Dan? I'm contemplating disabling the test due to the frequency.
Flags: needinfo?(dholbert)
No idea.

The test is a bit extreme (it loads a data URI that's ~2.2 million characters long), but it loads in 2 seconds on my linux box.

If we do disable it, can we disable it on MacOSX Mountain Lion 10.8 only?  It looks like pretty much all of the failures are on that platform (aside from a bundle of WinXP failures from May).

I'm going to try to load it on my Mac Mini & see if anything weird happens...
Flags: needinfo?(dholbert)
My mac mini (running OS X 10.10 Yosemite) loads the testcase in ~2 seconds, too.  So, no obvious problems there, from viewing it in a desktop browser.
From some debugging with RyanVM, this seems to have been caused by a networking cache2 change.

FACTS:
  (A) The first report of this on Mac OS X 10.8 was on Try, in comment 37. Based on gps's Try archive, that Try push included these 3 network-cache-related bugs:
   Bug 1053517 - Enable the new HTTP cache in automation.
   Bug 1054418 - Rewrite AppCacheUtils.jsm to use HTTP Cache v2 APIs
   Bug 1054425 - cache2: leak in CacheFileMetadata::WriteMetadata

So, that makes cache2 seem a bit suspicious.

 (B) The first report of this on *Aurora* on 10.8 was in comment 40, for which RyanVM gave me this likely regression range:
https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=209550&tochange=209619
  That Aurora push includes this commit:
> b529a7e31a98	Michal Novotny — Bug 1042192 - Use Cache2 I/O thread for callbacks from CacheFileIOManager. r=sworkman, a=sledru
  ...and nothing else from that push stood out to me as being likely to be related.

 (C) The first report of this on *Inbound* on 10.8 was in comment 38, which was several days after *that same bug* (bug 1042192) landed there.


So: This is looking a lot like a regression from bug 1042192, or from something cache2 related at least.  This makes some sense as a network-cache-related bug, since the testcase loads a huge data URI, which could easily stress the network cache.

Moving over to Core|Networking. Can't poke around much more at the moment; it would be nice to get to the bottom of this (since the failures since August do seem to be a regression), but worst-case, I'd be OK with marking the test as skip-on-mac, since it's a kind of extreme testcase that doesn't resemble real web content.
Component: SVG → Networking
Marking as likely-regression from bug 1042192, for the issues starting in August, per comment 155 parts (B) & (C).
Blocks: 1042192
Keywords: regression
David, is it possible that this test would benefit from changes similar to the ones you made in bug 891347?
Flags: needinfo?(dbaron)
I don't see how; the fundamental problem there was a test calling window.location.reload(), which seems like it would confuse the harness.  (We should probably do something to block that... filed bug 1128714.)

I don't see any tests here doing that.
Flags: needinfo?(dbaron)
Inactive; closing (see bug 1180138).
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.