Closed Bug 1506032 Opened 11 months ago Closed 4 months ago

Intermittent devtools/client/framework/test/browser_toolbox_window_reload_target.js | Test timed out -

Categories

(DevTools :: Framework, defect, P5)

defect

Tracking

(firefox69 fixed)

RESOLVED FIXED
Firefox 69
Tracking Status
firefox69 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: RaulGurzau)

References

(Regression)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disabled])

Attachments

(2 files, 1 obsolete file)

Filed by: ebalazs [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=210739543&repo=mozilla-inbound

https://queue.taskcluster.net/v1/task/agEKxMUXToOmCLxfbZm3og/runs/0/artifacts/public/logs/live_backing.log

TEST-PASS | devtools/client/framework/test/browser_toolbox_window_reload_target.js | Detected the right number of reloads in the page - 
21:41:33     INFO - Select tool inspector
21:41:33     INFO - Buffered messages logged at 21:28:02
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 9
21:41:33     INFO - Buffered messages logged at 21:29:32
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 8
21:41:33     INFO - Buffered messages logged at 21:31:02
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 7
21:41:33     INFO - Buffered messages logged at 21:32:32
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 6
21:41:33     INFO - Buffered messages logged at 21:34:02
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 5
21:41:33     INFO - Buffered messages logged at 21:35:32
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 4
21:41:33     INFO - Buffered messages logged at 21:37:02
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 3
21:41:33     INFO - Buffered messages logged at 21:38:32
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 2
21:41:33     INFO - Buffered messages logged at 21:40:02
21:41:33     INFO - Longer timeout required, waiting longer...  Remaining timeouts: 1
21:41:33     INFO - Buffered messages finished
21:41:33     INFO - TEST-UNEXPECTED-FAIL | devtools/client/framework/test/browser_toolbox_window_reload_target.js | Test timed out - 
21:41:33     INFO - GECKO(2122) | ++DOMWINDOW == 30 (0x121c2b000) [pid = 2122] [serial = 1100] [outer = 0x1210a6600]
21:41:34     INFO - Removing tab.
21:41:34     INFO - Waiting for event: 'TabClose' on [object XULElement].
21:41:34     INFO - Got event: 'TabClose' on [object XULElement].
21:41:34     INFO - Tab removed and finished closing

This is almost perma-failing everywhere, reason why it fails beta sims too.

I tried to bisect it but didn't find the real cuplrit. The Failures start on the 30th of May according to OF https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-05-26&endday=2019-06-02&tree=trunk&bug=1506032

The higher failure rate is on the 31st of May and looked like it started with bug 1307227 (https://hg.mozilla.org/integration/autoland/rev/020a3fc71e39#l5.3) but backing out did not solve the issue: https://treeherder.mozilla.org/#/jobs?repo=try&revision=35851011aba06d1b82b892c81494f416ce65ae47&group_state=expanded&selectedJob=249632967

I went backwards and the first classification was on the 27th of May here: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&searchStr=macosx1010-64%2Cdebug%2Cmochitests%2Ctest-macosx1010-64%2Fdebug-mochitest-devtools-chrome-e10s-1%2Cm%28dt1%29&tochange=b89b3cda594adf29317f5d3ff9d615a9d3dcde11&fromchange=5eb6bcb8b9e484c5a169e9f7f663d1826d828615&group_state=expanded&selectedJob=248504330 with Bug 1376646

That's not the culprit either: https://treeherder.mozilla.org/#/jobs?repo=try&revision=07bfc17002c12e01fdbaab0f0bd4c3c8b4671a5e&selectedJob=249643893&group_state=expanded

The test moves chunks and I went backwards on both trees but did not find other occurrences before ^
I'll post the disabling patch, however, Sebastian, could you please take a look?

Flags: needinfo?(pbrosset) → needinfo?(aryx.bugmail)

The test times out because DevTools fails hard when switching to the inspector in the undocked mode. So nothing works afterwards and the test just times out.

Here's the screenshot taken at that point: https://taskcluster-artifacts.net/bvUbSq-8Rw6IAdm2RsbDzw/0/public/test_info//mozilla-test-fail-screenshot_KOkWPa.png
And another one on macOS: https://taskcluster-artifacts.net/ADBGjKW3Sw6Cmcg3Gqt5aA/0/public/test_info/mozilla-test-fail-screenshot_9ZCeJh.png
And one on Windows: https://taskcluster-artifacts.net/batAZ3u_QY6FwIWyIRGC-g/0/public/test_info/mozilla-test-fail-screenshot_p9pldz.png

Looks like DevTools fails shortly after having switched to undocked mode, as on Windows and Linux, the content of the window is missing. Only on macOS the toolbar appears.

I searched for bugs that landed in DevTools around the dates mentioned in comment 24, but couldn't find any that seemed like they would have caused this.
Might be due to a change in the frame swapping code perhaps, but I have nothing to prove this.

There doesn't seem to be any error messages in the logs.

I was able to reproduce the issue locally on: macOS, with an optimized (artifact) build, and with the --verify option. It's frequent but not permanent locally for me.
No error messages there either, nothing that rings a bell.

Daisuke: by any chance, would you have an idea for where this would be coming from?

Flags: needinfo?(dakatsuka)

It seems to have gotten frequent with the patch for bug 1554848: https://treeherder.mozilla.org/#/jobs?repo=autoland&tochange=a2ec4ac648cfd975770679be53525cb104368400&fromchange=e0f0a7707eb3a88805fe8769967aae7c51837249&searchStr=devtools%2Clinux%2Cx64%2Casan&group_state=expanded&selectedJob=249675383

The first 4 occurrences were on macOS which is not representing the platform distribution when it got frequent later.

Flags: needinfo?(aryx.bugmail)
Regressed by: 1554848

Hello! I have investigated a bit.
At first, although I tested the build which reverted the patch of bug 1554848, I could not reproduce this failure in my environment.
However, as Patrick said, I could reproduce in artifact build.

I am not sure exactly the behavior yet though, due to the timing of reloading the document in the test, the promise in getRootNode() is sometimes overwritten in _createRootNodePromise() ,then never resolve. This caused the timeout.
Also, as the bug 1554848 affected to the loading timing, I changed the loading event at here from load to DOMContentLoaded.
This change seems to work well in my environment.
I threw a patch to the try server, I'll confirm the result.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8326ca9b7079f8e067fcc5c213fa94709e4ee252

Flags: needinfo?(dakatsuka)

Oh, oranges occurred from Linux os. Will investigate.

Pushed by rgurzau@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4eb5b585f0f2
disabled browser_toolbox_window_reload_target.js on win, linux and mac r=aryx
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 69
Assignee: nobody → rgurzau
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---

I have proceeded to investigate more.
When reload the document, the two events of documentUnload and newRoot are fired. However, these can happen at the same time, or newRoot can happen late a bit. newRoot event which is caused by previous reloading is fired during selecting the inspector, the inspector processes this event. Then overwrite the promise getRootNode() refers. This seems to cause the timeout.
I threw the patch to try, will confirm again.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4a74c55a2874c95f5e4f6c4186bd9225cee2ced8

Attachment #9069261 - Attachment is obsolete: true

I tested 10 times on linux though, the try was green.
Thus, I sent a patch to review. Thanks!

Pushed by dakatsuka@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f82c331cf98a
Wait for documentUnload and newRoot events are fired. r=pbro
Status: REOPENED → RESOLVED
Closed: 4 months ago4 months ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.