Open Bug 1421183 Opened 6 years ago Updated 2 years ago

Intermittent browser/base/content/test/tabs/browser_reload_deleted_file.js | leaked 1 window(s) until shutdown [url = about:newtab]

Categories: Firefox :: Tabbed Browser, defect, P5
People: Reporter: intermittent-bug-filer, Unassigned
Details: Keywords: intermittent-failure, leave-open; Whiteboard: [stockwell disabled]
Attachments: 1 file

Filed by: csabou [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=148075262&repo=autoland

https://queue.taskcluster.net/v1/task/Jmc7SPC0QIWZysFBWQ93vQ/runs/0/artifacts/public/logs/live_backing.log

23:30:02    ERROR - TEST-UNEXPECTED-FAIL | browser/base/content/test/tabs/browser_reload_deleted_file.js | leaked 1 window(s) until shutdown [url = about:newtab]
23:30:02     INFO - TEST-INFO | browser/base/content/test/tabs/browser_reload_deleted_file.js | windows(s) leaked: [pid = 846] [serial = 71]
23:30:02     INFO - runtests.py | Application ran for: 0:01:34.395856
23:30:02     INFO - zombiecheck | Reading PID log: /var/folders/qj/h929bfh57qxczfm_qgtb6vt000000w/T/tmpHT4Llipidlog
This test has failed 37 times in the last 7 days. The failures are split between OS X and Linux and occur only on debug builds.

This is a recent log from OS X:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=150339278&lineNumber=7473

And here is a part of the log:
INFO - GECKO(866) | nsStringStats
14:52:11     INFO - GECKO(866) |  => mAllocCount:         375687
14:52:11     INFO - GECKO(866) |  => mReallocCount:        37479
14:52:11     INFO - GECKO(866) |  => mFreeCount:          375687
14:52:11     INFO - GECKO(866) |  => mShareCount:         415281
14:52:11     INFO - GECKO(866) |  => mAdoptCount:           6329
14:52:11     INFO - GECKO(866) |  => mAdoptFreeCount:       6329
14:52:11     INFO - GECKO(866) |  => Process ID: 866, Thread ID: 140735134696192
14:52:11     INFO - TEST-INFO | Main app process: exit 0
14:52:11    ERROR - TEST-UNEXPECTED-FAIL | browser/base/content/test/tabs/browser_reload_deleted_file.js | leaked 1 window(s) until shutdown [url = about:newtab]
14:52:11     INFO - TEST-INFO | browser/base/content/test/tabs/browser_reload_deleted_file.js | windows(s) leaked: [pid = 869] [serial = 77]
14:52:11     INFO - runtests.py | Application ran for: 0:01:34.037769
14:52:11     INFO - zombiecheck | Reading PID log: /var/folders/l4/0zk2mdz15m9ddcx3t9j4t40m00000w/T/tmpamMUswpidlog

Hi :Dao: Can you please take a look at this bug?
Flags: needinfo?(dao+bmo)
Whiteboard: [stockwell needswork]
This test was added in bug 1327942. Redirecting needinfo to Bob and Gijs.
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(dao+bmo)
Flags: needinfo?(bobowencode)
There seem to have been a few "leaked 1 window(s) until shutdown [url = about:newtab]" intermittent failures that started recently.

These appear to have started around the 28th Nov, possibly on autoland.
I've done some retriggers on the merge back to inbound before it started there:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=eb30901ed24e317a89c5a62e5bd6bd126d7920aa
Flags: needinfo?(bobowencode)
Looks like it may well have been introduced to inbound on that merge.
Just done some more retriggers on the previous changeset.
(In reply to Bob Owen (:bobowen) from comment #4)
> There seems to have been a few "leaked 1 window(s) until shutdown [url =
> about:newtab]" intermittent failures that started recently.
> 
> These appear to have started around the 28th Nov, possibly on autoland.
> I've done some retriggers on the merge back to inbound before it start there:
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-
> inbound&revision=eb30901ed24e317a89c5a62e5bd6bd126d7920aa

https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=eb30901ed24e317a89c5a62e5bd6bd126d7920aa&selectedJob=150495541 is orange, and the previous push is green ( https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=7aa92cc6daba597de911ba6c11193367612dd3ce ) - but the ratio of orange on osx only might mean that you'd need more retriggers to be sure it was green...

:jmaher, do we have good tools for this type of thing yet? :-(

If this was caused by the merge, it could potentially be a result of bug 1414745. Samael, does that seem plausible? Any idea how those changes could cause leaks given what the test is doing? ( https://dxr.mozilla.org/mozilla-central/source/browser/base/content/test/tabs/browser_reload_deleted_file.js )
Flags: needinfo?(sawang)
Flags: needinfo?(jmaher)
Flags: needinfo?(gijskruitbosch+bugs)
:gijs, unfortunately we don't have any tools for finding root causes or bisecting. That hasn't been a focus for us. I imagine that within a short time next year we will wrap up the triage handoff and can focus more on improving the tools.
Flags: needinfo?(jmaher)
(In reply to Bob Owen (:bobowen) from comment #7)
> I'd homed in on that change as well, it was certainly there or before:
> https://treeherder.mozilla.org/#/
> jobs?repo=autoland&revision=6dd2a65b4c3e80d69bcc3821869fe209ccf9ccbc&filter-
> searchStr=osx+debug+browser-chrome
> 
> No failure on the lone before yet:
> https://treeherder.mozilla.org/#/
> jobs?repo=autoland&revision=2d5916de8b6d49a1875aca61bf080f85334265f7&filter-
> searchStr=osx+debug+browser-chrome

Yeah, seems pretty certain it was that change. Hopefully Samael can help figure out why that change would have had this impact on the test.
Over the last 7 days there are 30 failures present for this bug.

The failures occur on Linux, Linux x64 and OSX 10.10

This is an example of a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=154545819&lineNumber=4180

[task 2018-01-06T12:58:10.396Z] 12:58:10     INFO - TEST-INFO | Main app process: exit 0
[task 2018-01-06T12:58:10.398Z] 12:58:10    ERROR - TEST-UNEXPECTED-FAIL | browser/base/content/test/tabs/browser_reload_deleted_file.js | leaked 1 window(s) until shutdown [url = about:newtab]
[task 2018-01-06T12:58:10.400Z] 12:58:10     INFO - TEST-INFO | browser/base/content/test/tabs/browser_reload_deleted_file.js | windows(s) leaked: [pid = 1461] [serial = 97]
[task 2018-01-06T12:58:10.401Z] 12:58:10     INFO - runtests.py | Application ran for: 0:02:54.310921
[task 2018-01-06T12:58:10.403Z] 12:58:10     INFO - zombiecheck | Reading PID log: /tmp/tmpxzXsTapidlog
[task 2018-01-06T12:58:10.405Z] 12:58:10     INFO - ==> process 1315 launched child process 1338
[task 2018-01-06T12:58:10.407Z] 12:58:10     INFO - ==> process 1315 launched child process 1370
[task 2018-01-06T12:58:10.409Z] 12:58:10     INFO - ==> process 1315 launched child process 1413
[task 2018-01-06T12:58:10.411Z] 12:58:10     INFO - ==> process 1315 launched child process 1428
[task 2018-01-06T12:58:10.413Z] 12:58:10     INFO - ==> process 1315 launched child process 1461
[task 2018-01-06T12:58:10.414Z] 12:58:10     INFO - ==> process 1315 launched child process 1486
[task 2018-01-06T12:58:10.417Z] 12:58:10     INFO - ==> process 1315 launched child process 1514
[task 2018-01-06T12:58:10.418Z] 12:58:10     INFO - ==> process 1315 launched child process 1538
[task 2018-01-06T12:58:10.419Z] 12:58:10     INFO - ==> process 1315 launched child process 1562
[task 2018-01-06T12:58:10.420Z] 12:58:10     INFO - ==> process 1315 launched child process 1587
[task 2018-01-06T12:58:10.421Z] 12:58:10     INFO - ==> process 1315 launched child process 1616
[task 2018-01-06T12:58:10.422Z] 12:58:10     INFO - ==> process 1315 launched child process 1641
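
In the excerpt above, the leaked window's pid (1461) also appears in the list of launched child processes, which suggests the leak is in a child process rather than the main one. A minimal sketch of that cross-check follows; the regular expressions and the `leaks_in_children` helper are illustrative assumptions, not part of the test harness, and the sample lines are copied from the log with the timestamps trimmed.

```python
import re

# Lines copied from the log excerpt above (timestamps trimmed).
LOG = """\
INFO - TEST-INFO | browser/base/content/test/tabs/browser_reload_deleted_file.js | windows(s) leaked: [pid = 1461] [serial = 97]
INFO - ==> process 1315 launched child process 1428
INFO - ==> process 1315 launched child process 1461
INFO - ==> process 1315 launched child process 1486
"""

# Hypothetical patterns matching the two line shapes shown in the log.
LEAK_RE = re.compile(r"windows\(s\) leaked: \[pid = (\d+)\] \[serial = (\d+)\]")
CHILD_RE = re.compile(r"process (\d+) launched child process (\d+)")


def leaks_in_children(log):
    """Return the leaking pids that were launched as child processes."""
    leaked = {int(m.group(1)) for m in LEAK_RE.finditer(log)}
    children = {int(m.group(2)) for m in CHILD_RE.finditer(log)}
    return sorted(leaked & children)


print(leaks_in_children(LOG))
```

An empty result would instead point at the parent process, assuming the zombiecheck list is complete.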
Flags: needinfo?(dao+bmo)
I don't really have any clue yet...
Assignee: nobody → sawang
Flags: needinfo?(sawang)
Sorry, we'll need someone else to take this on.
Assignee: freesamael → nobody
Flags: needinfo?(dao+bmo)
Blocks: 1414745
Mike, I hate to do this, but looks like Samael isn't around and this test is increasing in frequency. You reviewed the change in bug 1414745, any idea how that could be responsible for this (intermittent) leak? :-(
Flags: needinfo?(mconley)
(I tried looking myself, but I don't really see anything - we do create browser filter instances in the code that generates the TabRemotenessChange event that BTU.waitForErrorPage depends on, and we call waitForErrorPage in the test... but I don't see how that would ever trigger a leak, and presumably in the content process, too...)
I had been wondering whether it's related to the preload browser, but I hadn't figured it out.
There have been 37 total failures in the last week.
Removing [stockwell disable-recommended] tag, as orange factor shows only 105 failures in the last 21 days.
The failures occur only on the debug build type.

Occurrences per platform:
- linux64-stylo-disabled: 17
- Linux x64: 9
- OS X 10.10: 9
- Linux: 2
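
As a quick sanity check, the per-platform counts above add up to the 37 total failures reported for the week. The dictionary below only restates the numbers from the list; the variable names are illustrative.

```python
# Per-platform failure counts, copied from the list above.
counts = {
    "linux64-stylo-disabled": 17,
    "Linux x64": 9,
    "OS X 10.10": 9,
    "Linux": 2,
}

total = sum(counts.values())
print(total)

# Share of failures per platform, as a rounded percentage.
for platform, n in counts.items():
    print(f"{platform}: {round(100 * n / total)}%")
```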

Here is a recent log file and a snippet with the error:
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=161190293&lineNumber=4711

[task 2018-02-08T22:51:38.828Z] 22:51:38     INFO - TEST-INFO | Main app process: exit 0
[task 2018-02-08T22:51:38.831Z] 22:51:38    ERROR - TEST-UNEXPECTED-FAIL | browser/base/content/test/tabs/browser_reload_deleted_file.js | leaked 1 window(s) until shutdown [url = about:newtab]
[task 2018-02-08T22:51:38.832Z] 22:51:38     INFO - TEST-INFO | browser/base/content/test/tabs/browser_reload_deleted_file.js | windows(s) leaked: [pid = 1392] [serial = 94]

:mconley, have you had the chance to look at this?
Whiteboard: [stockwell disable-recommended] → [stockwell needswork]
Hi! I made a patch to disable this frequently failing test. Could you please review it and check that I've done it correctly? Thank you!

Here's the output with qdiff:

+skip-if = (debug && os == 'mac') || (debug && os == 'linux' && bits == 64) #Bug 1421183, disabled on Linux/OSX for leaked windows
 [browser_tabswitch_updatecommands.js]
 [browser_viewsource_of_data_URI_in_file_process.js]
 [browser_visibleTabs_bookmarkAllTabs.js]
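
To see which configurations the skip-if condition above disables, here is a minimal sketch. It is not how the real manifest parser evaluates expressions: `should_skip` and the `configs` table are hypothetical, and the `&&` / `||` operators are simply translated to Python and passed to `eval`.

```python
# Simplified stand-in for manifest expression evaluation (an assumption,
# not the harness's real parser): translate && / || to Python and eval.
CONDITION = "(debug && os == 'mac') || (debug && os == 'linux' && bits == 64)"


def should_skip(condition, info):
    expr = condition.replace("&&", " and ").replace("||", " or ")
    # Names such as `debug`, `os` and `bits` are resolved from `info` only.
    return bool(eval(expr, {"__builtins__": {}}, dict(info)))


# Hypothetical platform descriptions, for illustration only.
configs = {
    "OS X debug": {"debug": True, "os": "mac", "bits": 64},
    "Linux x64 debug": {"debug": True, "os": "linux", "bits": 64},
    "Linux 32-bit debug": {"debug": True, "os": "linux", "bits": 32},
    "Linux x64 opt": {"debug": False, "os": "linux", "bits": 64},
}

for name, info in configs.items():
    state = "skipped" if should_skip(CONDITION, info) else "runs"
    print(f"{name} -> {state}")
```

Under these assumptions the test is skipped on OS X debug and 64-bit Linux debug, and still runs on 32-bit Linux debug and on opt builds, because the condition requires `bits == 64` on Linux.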
Attachment #8950048 - Flags: review?(jmaher)
Attachment #8950048 - Flags: review?(gbrown)
Comment on attachment 8950048 [details] [diff] [review]
disable_bug1421183.patch

Review of attachment 8950048 [details] [diff] [review]:
-----------------------------------------------------------------

That looks fine - thanks!
Attachment #8950048 - Flags: review?(jmaher)
Attachment #8950048 - Flags: review?(gbrown)
Attachment #8950048 - Flags: review+
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/46a579031752
Disabled browser/base/content/test/tabs/browser_reload_deleted_file.js on Linux/OSX for frequent leaked windows. r=gbrown
Keywords: leave-open
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]

Apologies - declaring needinfo bankruptcy on needinfos more than 2 years old.

Flags: needinfo?(mconley)
Severity: normal → S3