Closed Bug 1582318 Opened 1 year ago Closed 8 months ago

Intermittent [fission] netwerk/cookie/test/browser/browser_sharedWorker.js | Test timed out -

Categories

(Core :: Networking: Cookies, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla77
Fission Milestone M4.1
Tracking Status
firefox-esr68 --- unaffected
firefox75 --- disabled
firefox76 --- disabled
firefox77 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: kmag)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [stockwell disabled])

Crash Data

Attachments

(3 files, 2 obsolete files)

Filed by: nerli [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=267342554&repo=mozilla-central
Full log: https://queue.taskcluster.net/v1/task/dOvHh4Q_SvGI-fee47NUVA/runs/0/artifacts/public/logs/live_backing.log


[task 2019-09-18T23:08:58.564Z] 23:08:58 INFO - TEST-PASS | netwerk/cookie/test/browser/browser_sharedWorker.js | SharedWorker is allowed - true == true -
[task 2019-09-18T23:08:58.564Z] 23:08:58 INFO - Leaving test bound
[task 2019-09-18T23:08:58.564Z] 23:08:58 INFO - Entering test bound
[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - Starting SharedWorker: ({fromBehavior:2, toBehavior:2, fromPermission:1, toPermission:0})
[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - Console message: [JavaScript Error: "Loading Worker from “https://example.com/browser/netwerk/cookie/test/browser/a.js” was blocked because of a disallowed MIME type (“text/html”)."]
[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - Console message: [JavaScript Error: "TypeError: process is null" {file: "resource://gre/modules/ProcessSelector.jsm" line: 56}]
[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - Buffered messages finished
[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - TEST-UNEXPECTED-FAIL | netwerk/cookie/test/browser/browser_sharedWorker.js | Test timed out -
[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - GECKO(484) | MEMORY STAT | vsize 2104154MB | vsizeMaxContiguous 64964897MB | residentFast 245MB | heapAllocated 82MB
[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - TEST-OK | netwerk/cookie/test/browser/browser_sharedWorker.js | took 45037ms
[task 2019-09-18T23:08:58.566Z] 23:08:58 INFO - Not taking screenshot here: see the one that was previously logged
[task 2019-09-18T23:08:58.566Z] 23:08:58 INFO - TEST-UNEXPECTED-FAIL | netwerk/cookie/test/browser/browser_sharedWorker.js | Found a tab after previous test timed out: about:blank -
[task 2019-09-18T23:08:58.566Z] 23:08:58 INFO - checking window state

For more information about Fission, see https://wiki.mozilla.org/Project_Fission/Enabling_Tests_with_Fission

Flags: needinfo?(valentin.gosu)
Regressed by: 1580750

[task 2019-09-18T23:08:58.565Z] 23:08:58 INFO - Console message: [JavaScript Error: "TypeError: process is null" {file: "resource://gre/modules/ProcessSelector.jsm" line: 56}]
Not sure what to make of this error here.

Kris, any idea why this happens? Or do you know who to ask? Thanks!

Flags: needinfo?(valentin.gosu) → needinfo?(kmaglione+bmo)

It looks like there's a race where we clear a ContentParent's scriptable helper before we mark it as dead and remove it from the pool. My best guess is that it's happening when we hit this branch: https://searchfox.org/mozilla-central/rev/7531325c8660cfa61bf71725f83501028178cbb9/dom/ipc/ContentParent.cpp#1428-1456

If you can try to confirm that, I can probably fix it.
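
To make the suspected ordering problem concrete, here is a minimal toy model in Python. It is only a sketch of the hypothesis in this comment, not Gecko code: `FakeContentParent`, `select_process`, `begin_shutdown_racy`, and `finish_shutdown` are hypothetical names invented for illustration, and the real logic lives in C++ (`ContentParent.cpp`) and JS (`ProcessSelector.jsm`). The assumption modelled is that the helper is cleared in one step while the process is only marked dead and removed from the pool in a later step, so a selector running in between can still see the half-torn-down entry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeContentParent:
    """Hypothetical stand-in for a pooled content process (not the Gecko class)."""
    name: str
    scriptable_helper: Optional[str]  # None models a cleared scriptable helper
    is_dead: bool = False


def select_process(pool: List[FakeContentParent]) -> FakeContentParent:
    """Toy selector: takes the first pooled entry and requires its helper."""
    candidate = pool[0]
    if candidate.scriptable_helper is None:
        # Loosely analogous to the "process is null" TypeError seen in the log.
        raise TypeError("process is null")
    return candidate


def begin_shutdown_racy(pool: List[FakeContentParent], proc: FakeContentParent) -> None:
    """Step 1 (assumed ordering): clear the helper but leave proc in the pool."""
    proc.scriptable_helper = None


def finish_shutdown(pool: List[FakeContentParent], proc: FakeContentParent) -> None:
    """Step 2 (assumed ordering): mark the process dead and remove it from the pool."""
    proc.is_dead = True
    pool.remove(proc)


def demo_race() -> str:
    """Run a selection in the window between step 1 and step 2."""
    pool = [
        FakeContentParent("a", "helper-a"),
        FakeContentParent("b", "helper-b"),
    ]
    begin_shutdown_racy(pool, pool[0])
    try:
        select_process(pool)
    except TypeError as exc:
        return str(exc)
    return "ok"
```

In this model the failure only appears when selection happens between the two shutdown steps, which would be consistent with the test failing intermittently rather than on every run.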

Flags: needinfo?(kmaglione+bmo)

(In reply to Kris Maglione [:kmag] from comment #3)

It looks like there's a race where we clear a ContentParent's scriptable helper before we mark it as dead and remove it from the pool. My best guess is that it's happening when we hit this branch: https://searchfox.org/mozilla-central/rev/7531325c8660cfa61bf71725f83501028178cbb9/dom/ipc/ContentParent.cpp#1428-1456

If you can try to confirm that, I can probably fix it.

I wasn't able to reproduce it locally, so I can't confirm if that's what's happening or not. 🙁

There have been 38 failures within the last 7 days:

  • 10 failures on Windows 10 x64 QuantumRender opt
  • 9 failures on Linux x64 opt/debug
  • 19 failures on Windows 10 x64 opt

Recent log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=269850408&repo=mozilla-central&lineNumber=8348

[task 2019-10-04T22:09:02.676Z] 22:09:02 INFO - Leaving test bound
[task 2019-10-04T22:09:02.676Z] 22:09:02 INFO - Entering test bound
[task 2019-10-04T22:09:02.677Z] 22:09:02 INFO - Starting SharedWorker: ({fromBehavior:2, toBehavior:2, fromPermission:1, toPermission:0})
[task 2019-10-04T22:09:02.677Z] 22:09:02 INFO - Console message: [JavaScript Error: "TypeError: process is null" {file: "resource://gre/modules/ProcessSelector.jsm" line: 56}]
[task 2019-10-04T22:09:02.678Z] 22:09:02 INFO - Buffered messages finished
[task 2019-10-04T22:09:02.679Z] 22:09:02 INFO - TEST-UNEXPECTED-FAIL | netwerk/cookie/test/browser/browser_sharedWorker.js | Test timed out -
[task 2019-10-04T22:09:02.680Z] 22:09:02 INFO - GECKO(18243) | MEMORY STAT | vsize 2935MB | residentFast 292MB | heapAllocated 96MB
[task 2019-10-04T22:09:02.681Z] 22:09:02 INFO - TEST-OK | netwerk/cookie/test/browser/browser_sharedWorker.js | took 45016ms
[task 2019-10-04T22:09:02.681Z] 22:09:02 INFO - Not taking screenshot here: see the one that was previously logged
[task 2019-10-04T22:09:02.682Z] 22:09:02 INFO - TEST-UNEXPECTED-FAIL | netwerk/cookie/test/browser/browser_sharedWorker.js | Found a tab after previous test timed out: about:blank -
[task 2019-10-04T22:09:02.682Z] 22:09:02 INFO - checking window state

Flags: needinfo?(nhnguyen)
Whiteboard: [stockwell needswork:owner]
Flags: needinfo?(nhnguyen) → needinfo?(valentin.gosu)

I haven't had any breakthroughs into why this happens.
Kris, any idea on how we should proceed?

Flags: needinfo?(valentin.gosu) → needinfo?(kmaglione+bmo)
Attachment #9100745 - Attachment is obsolete: true
Pushed by rmaries@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/30bc2bf7941f
Disable browser_sharedWorker.js on fission. r=mccr8
Keywords: leave-open
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]

There is 1 failure associated with this bug in the last 7 days. These are occurring on linux64 debug builds and windows10-64 opt builds.

Tentatively moving all bugs whose summaries mention "Fission" (or other Fission-related keywords) but are not assigned to a Fission Milestone to the "?" triage milestone.

This will generate a lot of bugmail, so you can filter your bugmail for the following UUID and delete them en masse:

0ee3c76a-bc79-4eb2-8d12-05dc0b68e732

Fission Milestone: --- → ?
Fission Milestone: ? → M4.1

Hi Andrew, this test was re-enabled on Fission at some point and it's failing again. Should we disable it?

Flags: needinfo?(continuation)

Sure.

Flags: needinfo?(continuation)

Valentin, are you working on this bug? I saw an email from ckerschb saying you would be working on it.

Flags: needinfo?(valentin.gosu)
Priority: -- → P2

Not currently working on it. I investigated a bit a while ago, but I didn't get anywhere. Ultimately I didn't manage to confirm if kmag's guess in comment 3 is accurate.

Flags: needinfo?(valentin.gosu)
Assignee: nobody → apavel
Whiteboard: [stockwell needswork:owner] → [stockwell disabled]
Pushed by dvarga@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/993bacbf2731
disable browser_sharedWorker.js on fission r=mccr8

(In reply to Pulsebot from comment #31)

Pushed by dvarga@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/993bacbf2731
disable browser_sharedWorker.js on fission r=mccr8

Does this disable the wrong test case? It disables the browser_sharedWorker.js in the folder 'browser/components/originattributes/test/browser/', but I think 'netwerk/cookie/test/browser/browser_sharedWorker.js' is the one that needs to be disabled.

Flags: needinfo?(apavel)

(In reply to Tim Huang[:timhuang] from comment #34)

(In reply to Pulsebot from comment #31)

Pushed by dvarga@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/993bacbf2731
disable browser_sharedWorker.js on fission r=mccr8

Does this disable the wrong test case? It disables the browser_sharedWorker.js in the folder 'browser/components/originattributes/test/browser/', but I think 'netwerk/cookie/test/browser/browser_sharedWorker.js' is the one that needs to be disabled.

Thank you for catching that. I've made the necessary changes; please take a look.

Flags: needinfo?(apavel)
Attachment #9111113 - Attachment is obsolete: true
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c1bcadb739a9
disable correct browser_sharedWorker.js test case on fission r=timhuang

(In reply to Kris Maglione [:kmag] from comment #3)

It looks like there's a race where we clear a ContentParent's scriptable helper before we mark it as dead and remove it from the pool. My best guess is that it's happening when we hit this branch: https://searchfox.org/mozilla-central/rev/7531325c8660cfa61bf71725f83501028178cbb9/dom/ipc/ContentParent.cpp#1428-1456

If you can try to confirm that, I can probably fix it.

Tentatively assigning this test bug to kmag because he said he would confirm whether this is a ContentParent race.

Assignee: apavel → kmaglione+bmo

kmag said he'll check today whether this is a ContentParent race or a process selection bug.

Attachment #9137857 - Attachment description: Bug 1582318: Ignore shutting-down processes in process selectors. r=nika → Bug 1582318: Remove shutting-down processes from pool immediately. r=nika
Pushed by maglione.k@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/a0732358fb87
Remove shutting-down processes from pool immediately. r=nika
Pushed by maglione.k@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/001575dc6d78
Remove shutting-down processes from pool immediately. r=nika
Regressions: 1629477
Regressions: 1628661
Pushed by maglione.k@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/11a09ee4618c
Remove shutting-down processes from pool immediately. r=nika
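
The attachment was retitled from "Ignore shutting-down processes in process selectors" to "Remove shutting-down processes from pool immediately". The difference between those two strategies can be sketched with a toy model. This is an illustration under stated assumptions, not the landed patch: `FakeProcess`, `FakePool`, `select_naive`, and `select_filtering` are hypothetical names, and the real pool and selectors are more involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeProcess:
    """Hypothetical pooled process entry (not a Gecko type)."""
    name: str
    shutting_down: bool = False


@dataclass
class FakePool:
    """Hypothetical process pool offering two shutdown strategies."""
    processes: List[FakeProcess] = field(default_factory=list)

    def begin_shutdown_keep_in_pool(self, proc: FakeProcess) -> None:
        # Strategy A: flag the process but leave it pooled, so every
        # selector has to remember to skip flagged entries.
        proc.shutting_down = True

    def begin_shutdown_remove_now(self, proc: FakeProcess) -> None:
        # Strategy B: flag the process and drop it from the pool at once,
        # so selectors never see it at all.
        proc.shutting_down = True
        self.processes.remove(proc)


def select_naive(pool: FakePool) -> Optional[FakeProcess]:
    """Selector that trusts the pool contents without filtering."""
    return pool.processes[0] if pool.processes else None


def select_filtering(pool: FakePool) -> Optional[FakeProcess]:
    """Selector that skips processes flagged as shutting down."""
    live = [p for p in pool.processes if not p.shutting_down]
    return live[0] if live else None
```

In this sketch, strategy A is only safe when every selector filters, whereas strategy B keeps even a naive selector from picking a shutting-down process. That is one plausible reading of why the patch title changed, not a statement of the author's reasoning.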

In this crash report: https://crash-stats.mozilla.org/report/index/594d270c-c8d8-4295-91e2-5fbd80200416 the crash reason is MOZ_DIAGNOSTIC_ASSERT(p->mScriptableHelper).

Crash Signature: [@ mozilla::dom::ContentParent::GetUsedBrowserProcess]
Status: NEW → RESOLVED
Closed: 8 months ago
Flags: needinfo?(kmaglione+bmo)
Resolution: --- → FIXED
Target Milestone: --- → mozilla77