Open Bug 1438663 Opened 5 years ago Updated 17 days ago

Intermittent browser/components/extensions/test/browser/test-oop-extensions/browser_ext_popup_focus.js | Test timed out -

Categories

(WebExtensions :: General, defect, P2)

Unspecified
All
defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: intermittent-bug-filer, Assigned: rpl)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell])

Attachments

(2 files, 1 obsolete file)

This is a regression from bug 1324255. Shane, please take a look.
Blocks: 1324255
Flags: needinfo?(mixedpuppy)
Looking into it, haven't been able to reproduce yet.
Assignee: nobody → mixedpuppy
Flags: needinfo?(mixedpuppy)
This has hit the failure rate for disable-recommended tag with 175 failures from 15ht February until now.

https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1438663&startday=2018-02-15&endday=2018-02-18&tree=all

Shane, do you still want to try and fix this or should we disable it for now? Thanks.
Flags: needinfo?(mixedpuppy)
Lets go ahead and disable.  I was able to finally reproduce it last week but will have to track it down.
Flags: needinfo?(mixedpuppy)
Priority: -- → P1
Attached patch disable_bug1438663.patch (obsolete) — Splinter Review
Hello! Made a patch for disablng this test for now as it has a really big number of failures. Please look over it to see if it's ok. Thank you.
Attachment #8952645 - Flags: review?(aryx.bugmail)
Comment on attachment 8952645 [details] [diff] [review]
disable_bug1438663.patch

Review of attachment 8952645 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/extensions/test/browser/browser-common.ini
@@ +119,4 @@
>  [browser_ext_popup_background.js]
>  [browser_ext_popup_corners.js]
>  [browser_ext_popup_focus.js]
> +skip-if = true # Bug 1438663

Please use the 'disabled' keyword https://dxr.mozilla.org/mozilla-central/source/testing/mozbase/docs/manifestparser.rst#54-68
Attachment #8952645 - Flags: review?(aryx.bugmail) → review-
Modified it. Thanks.
Attachment #8952645 - Attachment is obsolete: true
Attachment #8952652 - Flags: review?(aryx.bugmail)
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/391296e5323f
Disable browser/components/extensions/test/browser/test-oop-extensions/browser_ext_popup_focus.js for frequent failures. r=aryx
Keywords: leave-open
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Product: Toolkit → WebExtensions
Moving to p3 because no activity for at least 24 weeks.
See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P1 → P3
Assignee: mixedpuppy → nobody

fully disabled test

Severity: normal → S2
OS: Unspecified → All
Priority: P3 → P2

I took a look into this, the test case was intermittent also for me locally (on a linux64 opt artifact build) and for me the test task related to the browserAction wasn't failing anymore if I remove the part that opens a new tab, which makes me suspect that the tab being loaded may have been intermittently stealing the focus that the browserAction popup does expect.

I attached the changes I applied locally (which also include some additional calls to browser.test.log to make it slightly clearer based just on the logs where the test got stuck) and pushed to try here:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4162a2b72dde29f13514e8b8f34413b43bc28640

Let's see how it goes in the try push, if it looks green enough I'll open the phabricator revision for review (I currently marked it as work-in-progress to defer a review to once we have verified if it does help also when the test runs on the build infrastructure as it did locally).

Assignee: nobody → lgreco
Status: NEW → ASSIGNED

The test seems to be still failing in the linux-opt webrender TV job from time to time, but I'm not sure that the intermittency rate is as high as it used to be when the test has been disabled on all platforms and build types (e.g. I tried to rerun the linux opt and linux debug TV jobs a bunch of times and it did never fail).

Nevertheless the fact that I does still fail intermittently in webrender (and I can reproduce that locally too, not that often but often enough) makes me think that there is still some kind of race that we are able to trigger with this test.

I haven't been able to record a failure with rr yet, but I did:

  • run the test with the addition of MOZ_LOG="Focus:5"
  • then compared the nsFocusManager logs emitted when the test pass just fine with the ones emitted when the test get stuck because we never get the "focus" event on the extension popup side and fails for a timeout

Based on those logs I got the impression that we may be early exiting from the nsFocusManager::SetFocusInner method and so I looked a bit into what we are doing inside that method and there is an inline comment that looked suspicious (quoting it from here):

// XXX This is wrong for `<iframe mozbrowser>` and for XUL
// `<browser remote="true">`. See:
// https://searchfox.org/mozilla-central/rev/8a63fc190b39ed6951abb4aef4a56487a43962bc/dom/base/nsFrameLoader.cpp#229-232

I'm not 100% sure yet if we are actually hitting that part and if it is part of what is triggering the issue, I'm going to give a look next asap.

Attachment #9223602 - Attachment description: WIP: Bug 1438663 - Fix browserAction focus test intermittency in browser_ext_popup_focus.js and renable it. r?mixedpuppy! → Bug 1438663 - Fix browserAction focus test intermittency in browser_ext_popup_focus.js and renable it. r?mixedpuppy!

After looking more deeply into the intermittent behavior (and having to resort to some "printf debugging" to get some additional trace logs to figure out what was happening when the panel wasn't focused) I think that I have figure out the actual underlying issue:

  • when the panel isn't focused, nsFocusManager::SetFocus is early exiting before it would actually focus the panel, because the popup frame is detected to still have view visibility set to nsViewVisibility_kHide.

  • The popup frame visibility is set to nsViewVisibility_kShow by nsMenuPopupFrame::LayoutPopup

  • when the test pass successfully (and the popup panel focused as expected):

    • nsMenuPopupFrame::LayoutPopup did set the visibility to nsViewVisibility_kShow
    • then "Extension:GrabFocus" message is sent to the panel (from ExtensionPopups.jsm)
    • and finaly nsFocusManager::SetFocus is being called in response to Services.focus.focusWindow = content.window; (from ext-browser-content.js handler for the "Extension:GrabFocus" message)
  • when the test gets stuck and fails (and the popup panel is never focused):

    • "Extension:GrabFocus" message is sent to the panel (from ExtensionPopups.jsm)
    • nsFocusManager::SetFocus is being called in response to Services.focus.focusWindow = content.window;
    • nsMenuPopupFrame::LayoutPopup did set the visibility to nsViewVisibility_kShow too late and the panel doesn't grab the focus

To fix the intermittency, we could explicitly wait for the chrome document to be fully flushed before sending the "Extension:GrabFocus" message (from ExtensionPopups.jsm).

I've applied this change and the test file was finally passing consistenty with --verify (with and without WebRender enabled), as I was expecting given the analysis described above.

I've just updated the patch accordingly (the changes to the test file should now be purely readability changes, they shouldn't have an impact on the intermittency in practice).

Let's double-check that a new push to try that this version of the patch is making the test stable also when running on the build infrastructure:

Pushed by luca.greco@alcacoop.it:
https://hg.mozilla.org/integration/autoland/rev/dc642224b715
Fix browserAction focus test intermittency in browser_ext_popup_focus.js and renable it. r=mixedpuppy
Whiteboard: [stockwell disabled] → [stockwell]

Closing as fixed, it doesn't seem we got any new occurrence of this intermittent from when we re-enabled the test one week ago. This can be re-opened if that will happen again in the near future.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Keywords: leave-open
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
You need to log in before you can comment on or make changes to this bug.