Open Bug 1336075 Opened 7 years ago Updated 1 year ago

Intermittent dom/tests/browser/browser_largeAllocation_non_win32.js | Test timed out -

Categories

(Core :: DOM: Core & HTML, defect, P2)


Tracking Status
firefox70 --- fix-optional
firefox71 - fix-optional

People

(Reporter: intermittent-bug-filer, Unassigned)

Details

(Keywords: bulk-close-intermittents, intermittent-failure, leave-open, Whiteboard: [stockwell disabled])

Attachments

(2 files, 2 obsolete files)

Priority: -- → P5
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
Status: REOPENED → RESOLVED
Closed: 7 years ago → 6 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
There have been 67 failures in the last 7 days.

Failures per platform and build type:
- osx-10-10 / debug & opt: 29
- linux64 / opt, pgo, debug: 21
- linux32 / opt & debug: 15
- windows10-64 / opt & pgo: 2

Recent relevant log file and snippet with the failure:
https://treeherder.mozilla.org/logviewer.html#?job_id=194666867&repo=mozilla-inbound&lineNumber=17100


00:33:50     INFO - Console message: [JavaScript Error: "Polling for changes failed: Server error 404 Not Found: "JSON.parse: unexpected character at line 1 column 1 of the JSON data"." {file: "resource://services-settings/remote-settings.js" line: 721}]
00:33:50     INFO - remoteSettingsFunction/remoteSettings.pollChanges@resource://services-settings/remote-settings.js:721:13
00:33:50     INFO - async*notify@jar:file:///Users/cltbld/tasks/task_1534571222/build/application/Firefox%20NightlyDebug.app/Contents/Resources/omni.ja!/components/RemoteSettingsComponents.js:24:5
00:33:50     INFO - TM_notify/<@jar:file:///Users/cltbld/tasks/task_1534571222/build/application/Firefox%20NightlyDebug.app/Contents/Resources/omni.ja!/components/nsUpdateTimerManager.js:197:48
00:33:50     INFO - TM_notify@jar:file:///Users/cltbld/tasks/task_1534571222/build/application/Firefox%20NightlyDebug.app/Contents/Resources/omni.ja!/components/nsUpdateTimerManager.js:244:7
00:33:50     INFO - 
00:33:50     INFO - Buffered messages finished
00:33:50     INFO - TEST-UNEXPECTED-FAIL | dom/tests/browser/browser_largeAllocation_non_win32.js | Test timed out - 
00:33:50     INFO - GECKO(2236) | MEMORY STAT | vsize 4436MB | residentFast 415MB | heapAllocated 86MB
00:33:50     INFO - TEST-OK | dom/tests/browser/browser_largeAllocation_non_win32.js | took 360091ms
Flags: needinfo?(overholt)
Could be a red herring but the remote-settings.js thing seems relevant.

Nika, can you take a look next week?
Flags: needinfo?(overholt) → needinfo?(nika)
Priority: -- → P2
Attached patch: Skipped test on linux and mac (obsolete)
Added a patch in case we decide to disable the test.
Attachment #9003720 - Flags: review?(jmaher)
Comment on attachment 9003720 [details] [diff] [review]
Skipped test on linux and mac

Review of attachment 9003720 [details] [diff] [review]:
-----------------------------------------------------------------

what a long skip-if line, but this looks accurate
Attachment #9003720 - Flags: review?(jmaher) → review+
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Pushed by ebalazs@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9b254e372c54
Disable browser_largeAllocation_non_win32.js on linux and mac for frequent failures. r=jmaher
Keywords: checkin-needed
Backout by ebalazs@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7e3cfbe2be22
Backed out changeset 9b254e372c54 for "=" missing in the skip-if line. CLOSED TREE
Attached file: Skipped test on linux and mac (obsolete)
Attachment #9003720 - Attachment is obsolete: true
Attachment #9003774 - Flags: review?(jmaher)
Comment on attachment 9003774 [details]
Skipped test on linux and mac

I cannot believe I saw os = 'mac' in the review and thought it was ok, thanks for backing it out and getting a real patch.
Attachment #9003774 - Flags: review?(jmaher) → review+
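
For illustration, the kind of slip that caused the backout (a single `=` where `==` was intended in a skip-if condition) can be caught mechanically. This is only a minimal sketch and not part of the manifest tooling; `find_bare_equals` and the sample conditions are hypothetical, and the check does not try to ignore `=` characters inside quoted strings.

```python
import re
from typing import List

# Matches a single "=" that is not part of "==", "!=", "<=" or ">=".
# Illustrative only: this is not the parser the real manifest tooling uses.
BARE_EQUALS = re.compile(r"(?<![=!<>])=(?!=)")


def find_bare_equals(condition: str) -> List[int]:
    """Return the character offsets of every stray single '=' in a condition."""
    return [match.start() for match in BARE_EQUALS.finditer(condition)]


if __name__ == "__main__":
    # Hypothetical conditions, not the exact lines from the backed-out patch.
    for cond in ("os = 'mac' || os == 'linux'", "os == 'mac' || os == 'linux'"):
        offsets = find_bare_equals(cond)
        status = "suspicious '=' at {}".format(offsets) if offsets else "ok"
        print("{!r}: {}".format(cond, status))
```

A check like this could run before review so a malformed condition is flagged without relying on someone spotting it by eye.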
Update:
There have been 107 failures in the last 7 days and 257 failures in the last 21 days.
The failures occur on all builds except asan.
Failures per platform:

- osx-10-10: 54
- linux32: 28
- linux64: 23
- linux64-nightly: 2 

Recent relevant log file:
https://treeherder.mozilla.org/logviewer.html#?job_id=196991085&repo=mozilla-inbound&lineNumber=20506

18:14:52     INFO - Buffered messages logged at 18:11:52
18:14:52     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 1
18:14:52     INFO - Buffered messages logged at 18:12:39
18:14:52     INFO - Console message: [JavaScript Error: "Polling for changes failed: Server error 404 Not Found: "JSON.parse: unexpected character at line 1 column 1 of the JSON data"." {file: "resource://services-settings/remote-settings.js" line: 721}]
18:14:52     INFO - remoteSettingsFunction/remoteSettings.pollChanges@resource://services-settings/remote-settings.js:721:13
18:14:52     INFO - async*notify@jar:file:///Users/cltbld/tasks/task_1535761999/build/application/Firefox%20NightlyDebug.app/Contents/Resources/omni.ja!/components/RemoteSettingsComponents.js:24:5
18:14:52     INFO - TM_notify/<@jar:file:///Users/cltbld/tasks/task_1535761999/build/application/Firefox%20NightlyDebug.app/Contents/Resources/omni.ja!/components/nsUpdateTimerManager.js:192:48
18:14:52     INFO - TM_notify@jar:file:///Users/cltbld/tasks/task_1535761999/build/application/Firefox%20NightlyDebug.app/Contents/Resources/omni.ja!/components/nsUpdateTimerManager.js:239:7
18:14:52     INFO - 
18:14:52     INFO - Buffered messages finished
18:14:52     INFO - TEST-UNEXPECTED-FAIL | dom/tests/browser/browser_largeAllocation_non_win32.js | Test timed out - 
18:14:52     INFO - GECKO(2180) | MEMORY STAT | vsize 4421MB | residentFast 392MB | heapAllocated 78MB
18:14:52     INFO - TEST-OK | dom/tests/browser/browser_largeAllocation_non_win32.js | took 450074ms
Added the patch with a different name than the obsolete one.
Assignee: nobody → nbeleuzu
Attachment #9003774 - Attachment is obsolete: true
Attachment #9007540 - Flags: review?(jmaher)
Comment on attachment 9007540 [details] [diff] [review]
Disable test on mac and linux

Review of attachment 9007540 [details] [diff] [review]:
-----------------------------------------------------------------

as a note, this only runs on osx/opt, and win10 after this patch lands
Attachment #9007540 - Flags: review?(jmaher) → review+
Pushed by rgurzau@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e8b79d20b287
Disable browser_largeAllocation_non_win32.js on linux and mac for frequent failures. r=jmaher
Keywords: checkin-needed
Component: DOM → DOM: Core & HTML

There are 26 failures in the last 7 days on macosx1014-64-shippable opt:
https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-08-30&endday=2019-09-06&tree=trunk&bug=1336075

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=265256378&repo=autoland&lineNumber=4841

[task 2019-09-06T00:25:40.216Z] 00:25:40 INFO - TEST-UNEXPECTED-FAIL | dom/tests/browser/browser_largeAllocation_non_win32.js | Test timed out -
[task 2019-09-06T00:25:40.216Z] 00:25:40 INFO - GECKO(2030) | JavaScript error: resource://testing-common/PromiseTestUtils.jsm, line 112: uncaught exception: Object
[task 2019-09-06T00:25:40.216Z] 00:25:40 INFO - Console message: [JavaScript Error: "uncaught exception: Object" {file: "resource://testing-common/PromiseTestUtils.jsm" line: 112}]
[task 2019-09-06T00:25:40.217Z] 00:25:40 INFO - GECKO(2030) | MEMORY STAT | vsize 7736MB | residentFast 360MB | heapAllocated 90MB
[task 2019-09-06T00:25:40.217Z] 00:25:40 INFO - TEST-OK | dom/tests/browser/browser_largeAllocation_non_win32.js | took 180043ms
[task 2019-09-06T00:25:40.217Z] 00:25:40 INFO - Not taking screenshot here: see the one that was previously logged
[task 2019-09-06T00:25:40.217Z] 00:25:40 INFO - TEST-UNEXPECTED-FAIL | dom/tests/browser/browser_largeAllocation_non_win32.js | Found a tab after previous test timed out: about:blank -
[task 2019-09-06T00:25:40.217Z] 00:25:40 INFO - checking window state
[task 2019-09-06T00:25:40.217Z] 00:25:40 INFO - TEST-START | dom/tests/browser/browser_localStorage_e10s.js
[task 2019-09-06T00:25:40.229Z] 00:25:40 INFO - Not taking screenshot here: see the one that was previously logged
[task 2019-09-06T00:25:40.230Z] 00:25:40 INFO - Buffered messages logged at 00:25:40
[task 2019-09-06T00:25:40.230Z] 00:25:40 INFO - Entering test bound
[task 2019-09-06T00:25:40.230Z] 00:25:40 INFO - Buffered messages finished
[task 2019-09-06T00:25:40.230Z] 00:25:40 INFO - TEST-UNEXPECTED-FAIL | dom/tests/browser/browser_localStorage_e10s.js | uncaught exception - ReferenceError: ok is not defined at observer@chrome://mochitests/content/browser/dom/tests/browser/helper_largeAllocation.js:17:7
[task 2019-09-06T00:25:40.230Z] 00:25:40 INFO - _insertBrowser@chrome://browser/content/tabbrowser.js:2361:24

Hsin, what could be done about this?

Flags: needinfo?(htsai)

Hi Nika, could you please give us a hint about what we should do next? Also, can you think of any recent changes that might have caused the rise in the failure rate over the past week? Thank you.

Flags: needinfo?(htsai)
Flags: needinfo?(nika)

Oops ... NI for comment 106.

Flags: needinfo?(nika)

Not sure where in here we're actually running into the issue, but it appears like it might be somewhere in this range: https://searchfox.org/mozilla-central/rev/588814f2edddf0e132d77d326ddae50911e8bad1/dom/tests/browser/helper_largeAllocation.js#250-260 (probably waiting for the expectProcessCreated https://searchfox.org/mozilla-central/rev/588814f2edddf0e132d77d326ddae50911e8bad1/dom/tests/browser/helper_largeAllocation.js#258). I'm not entirely sure why that would be happening and I don't have a ton of time to look into it.

It also, interestingly, looks like it's always happening on macos. We never run the largeAllocation header on macos (only ever on 32-bit windows), so if this becomes an issue, we can probably turn off the test.

Flags: needinfo?(nika)

40 failures were associated with this bug in the last 7 days: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-10-01&endday=2019-10-08&tree=trunk&bug=1336075

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=270178854&repo=autoland&lineNumber=4193

This failed on: macosx1014-64-shippable, windows10-64, windows10-64-qr debug and opt build types.

:Nika Layzell, can we disable this?

Flags: needinfo?(nika)

If this isn't happening on 32-bit Windows (doesn't look like it is), we should disable the tests. Can we do architecture/OS-specific disabling?

(Nika and I briefly chatted about this)

Flags: needinfo?(nika)
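
On the architecture/OS-specific question: skip-if conditions are boolean expressions over platform facts, so a combined OS and processor check can be expressed directly. The snippet below is a rough model of how such a condition evaluates, not the real manifest parser. `should_skip` is a hypothetical helper, the key names `os` and `processor` follow the skip-if line shown in the `mach test-info` output in this bug, and the `"x86_64"` value for 64-bit builds is an assumption.

```python
import re


def should_skip(condition: str, platform: dict) -> bool:
    """Evaluate a manifest-style boolean condition against platform facts.

    Simplified model: '||', '&&' and '!' are rewritten to Python operators,
    then the expression is evaluated with the platform keys as the only names.
    A key missing from `platform` raises NameError.
    """
    expr = condition.replace("||", " or ").replace("&&", " and ")
    # Rewrite '!' negation but leave the '!=' operator untouched.
    expr = re.sub(r"!(?!=)", " not ", expr)
    return bool(eval(expr, {"__builtins__": {}}, dict(platform)))


if __name__ == "__main__":
    win32 = {"os": "win", "processor": "x86"}
    win64 = {"os": "win", "processor": "x86_64"}  # assumed value for 64-bit
    mac = {"os": "mac", "processor": "x86_64"}

    # Hypothetical condition: skip everywhere except 32-bit Windows.
    only_win32 = '!(os == "win" && processor == "x86")'
    for name, facts in (("win32", win32), ("win64", win64), ("mac", mac)):
        print(name, "skipped" if should_skip(only_win32, facts) else "runs")
```

The sketch relies on `eval` with builtins removed purely for brevity; it should only ever be fed trusted condition strings.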

(In reply to Cosmin Sabou [:CosminS] from comment #117)

For the win32 failures there is Bug 1586957, newly filed. https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-10-01&endday=2019-10-08&tree=trunk&bug=1586957

That is a different test.

(In reply to Andrew Overholt [:overholt] from comment #115)

If this isn't happening on 32-bit Windows (doesn't look like it is), we should disable the tests. Can we do architecture/OS-specific disabling?

This test is already skipped on many platforms, including 32-bit Windows.

$ ./mach test-info tests dom/tests/browser/browser_largeAllocation_non_win32.js
===== dom/tests/browser/browser_largeAllocation_non_win32.js =====
Found dom/tests/browser/browser_largeAllocation_non_win32.js in source control.
dom/tests/browser/browser_largeAllocation_non_win32.js found in manifest dom/tests/browser/browser.ini
  flavor: browser-chrome
  skip-if: fission || !e10s || (os == "win" && (processor == "x86" || processor == "aarch64")) || (verify && debug && (os == 'linux')) || (os == 'linux') || (os == 'mac' && debug)
Querying ActiveData...
Found records matching 'dom/tests/browser/browser_largeAllocation_non_win32.js' in ActiveData.

Test results for dom/tests/browser/browser_largeAllocation_non_win32.js on mozilla-central,mozilla-inbound,autoland between 2019-10-01 and 2019-10-08
linux64/asan-opt-e10s:                        0 failures (   161 skipped) in    161 runs
linux64/ccov-debug-e10s:                      0 failures (    29 skipped) in     29 runs
linux64/debug-e10s-service-worker:            0 failures (    25 skipped) in     25 runs
linux64/debug-e10s:                           0 failures (   161 skipped) in    161 runs
linux64/debug-fis:                            0 failures (   265 skipped) in    265 runs
linux64/opt-e10s:                             0 failures (   177 skipped) in    177 runs
linux64/opt-fis:                              0 failures (    25 skipped) in     25 runs
macosx1014-64-shippable/opt-e10s:            24 failures (     0 skipped) in    215 runs
macosx1014-64/debug-e10s:                     0 failures (   318 skipped) in    318 runs
windows10-64-qr/debug-e10s-qr:                0 failures (     0 skipped) in    363 runs
windows10-64-qr/opt-e10s-qr:                  0 failures (     0 skipped) in     24 runs
windows10-64-qr/opt-fis-qr:                   0 failures (    27 skipped) in     27 runs
windows10-64-shippable-qr/opt-e10s-qr:        0 failures (     0 skipped) in    155 runs
windows10-64-shippable/opt-e10s:              0 failures (     0 skipped) in    148 runs
windows10-64-shippable/opt-e10s:              0 failures (     0 skipped) in     23 runs
windows10-64/asan-opt-e10s:                   1 failures (     0 skipped) in    424 runs
windows10-64/ccov-debug-e10s:                 0 failures (     0 skipped) in     22 runs
windows10-64/debug-e10s:                     10 failures (     0 skipped) in    273 runs
windows10-64/opt-e10s:                        0 failures (     0 skipped) in     26 runs
windows10-64/opt-fis:                         0 failures (    26 skipped) in     26 runs
windows10-aarch64/opt-e10s:                   0 failures (    24 skipped) in     24 runs
windows7-32-shippable/opt-e10s:               0 failures (   151 skipped) in    151 runs
windows7-32/debug-e10s:                       0 failures (   162 skipped) in    162 runs
windows7-32/opt-e10s:                         0 failures (    24 skipped) in     24 runs
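
To compare platforms, the raw counts above can be turned into failure rates over the runs that actually executed. The following is a small sketch under two assumptions: that every row has the shape shown above, and that skipped runs are included in the run total (which matches the fully skipped platforms, where both numbers are equal). `failure_rates` is a hypothetical helper and only a few rows are copied in.

```python
import re

# A few rows copied from the `mach test-info` output above.
REPORT = """\
macosx1014-64-shippable/opt-e10s:            24 failures (     0 skipped) in    215 runs
windows10-64/asan-opt-e10s:                   1 failures (     0 skipped) in    424 runs
windows10-64/debug-e10s:                     10 failures (     0 skipped) in    273 runs
windows7-32/debug-e10s:                       0 failures (   162 skipped) in    162 runs
"""

ROW = re.compile(r"^(\S+):\s+(\d+) failures \(\s*(\d+) skipped\) in\s+(\d+) runs$")


def failure_rates(report: str) -> dict:
    """Map each platform to failures / executed runs, or None if nothing ran."""
    rates = {}
    for line in report.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # ignore lines that are not result rows
        failures, skipped, runs = (int(g) for g in match.group(2, 3, 4))
        executed = runs - skipped  # assumes skipped runs are counted in runs
        rates[match.group(1)] = failures / executed if executed else None
    return rates


if __name__ == "__main__":
    for platform, rate in failure_rates(REPORT).items():
        shown = "not run" if rate is None else "{:.1%}".format(rate)
        print("{:<40} {}".format(platform, shown))
```

Read this way, the macosx1014-64-shippable opt failures stand out at roughly one failure in nine executed runs, while the 32-bit Windows rows contribute nothing because the test never runs there.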
Attachment #9099657 - Attachment description: Bug 1336075 - Disable test on mac and wind debug due to frequent failures. r=#intermittent-reviewers → Bug 1336075 - Disable test on mac and windows debug due to frequent failures. r=#intermittent-reviewers

(In reply to Geoff Brown [:gbrown] from comment #118)

(In reply to Andrew Overholt [:overholt] from comment #115)

If this isn't happening on 32-bit Windows (doesn't look like it is), we should disable the tests. Can we do architecture/OS-specific disabling?

This test is already skipped on many platforms, including 32-bit Windows.

OK, thanks for confirming. That's worrying since this feature is only for 32-bit Windows :)

Can we get some quick manual testing done here to get a regression range for when this broke? It worked in 53, at least (see bug 1331083). The test plan is available here: https://wiki.mozilla.org/QA/Large-Allocation_header.

Flags: qe-verify?
Flags: needinfo?(tmaity)
Flags: needinfo?(overholt)

Might be something to keep an eye on for 71.

(In reply to Andrew Overholt [:overholt] from comment #121)


Hi Bogdan,
As noted in the comments above, there's a chance the Large-Allocation header has stopped working, and we're seeing a corresponding increase in OOMs (bug 1584266 and bug 1584232). Since you've been helping with testing for bug 1584232, could you please help us get a regression range here? We consider it urgent.

Flags: needinfo?(bogdan.maris)

(In reply to Hsin-Yi Tsai [:hsinyi] from comment #124)


Okay, Alphan seems to have found an STR: it's broken on the latest nightly but worked on Fx59. He will try to narrow down the range. We will ask for help later if needed.

Flags: needinfo?(bogdan.maris)
Priority: P2 → P1

(In reply to Hsin-Yi Tsai [:hsinyi] from comment #125)


See https://bugzilla.mozilla.org/show_bug.cgi?id=1584266#c13
There's no STR to determine whether Large-Allocation is broken on a given version. The Bingo Blitz app mentioned in the test plan no longer seems to use this header.

Finding a regression range from an automated test is not something Manual QA has done before. Over the last few days I also tried working with other QA managers, but without exact steps we couldn't help much here.

Flags: needinfo?(tmaity)

After talking to a few people, it seems we should likely work toward un-shipping Large-Allocation. I'm not sure how much we need to worry about these intermittents, so maybe we should just turn the tests off.

Priority: P1 → P2
Assignee: nbeleuzu → nobody
Status: REOPENED → NEW
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d31698ac0559
Disable test on mac and windows debug due to frequent failures. r=gbrown

Untracking for 71 since this is now a P2.

Severity: normal → S3

Changing qe-verify? to qe-verify+.

Flags: qe-verify? → qe-verify+