47 bytes, text/x-phabricator-request
|Details | Review|
Various Python tests for Marionette open chrome pages with a custom URL pointing to example XUL file. Right now each of those tests use a trigger callback to open the window. This should be centralized, so it will happen automatically when calling `open_window()` with a URL. Also right now all of those tests are flaky because they don't wait for the chrome window to have focus and being activated. This causes test failures like bug 1504201. Changing this now will also make my patch for bug 1504756 a bit simpler.
I faced a problem for our tests. As Gijs noted on bug 1506796 when we open a new browser or chrome window in general we should wait until the window has received focus. The problem here by is that if our Marionette tests are getting run in the background, we won't get the `focus` nor `activate` event. That even with the focusmanager test mode turned on. Neil, could it be that we don't fire those events yet when the newly opened window is in the background, and the focus manager test mode turned on? If yes, where would be the best place to fire those events?
I could workaround that problem by explicitly calling `focus()` right after opening the window: > let win = window.openDialog(url, null, "chrome,centerscreen"); > win.focus();
New try build with workaround in-place: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3d54aef1537023050ccc8d6c8190755ef0a730f5
The current implemenation for opening new chrome windows via the WindowManager mixin class is racy because it doesn't wait for the newly opened window to have focus and being activated first. The patch adds a new method to the WindowManager class, which is waits for those events. It can then be used across all existent Marionette unit tests.
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/autoland/rev/51abb63ca246 [marionette] Allow to open a chrome window via the WindowManager mixin class. r=ato
Backout by email@example.com: https://hg.mozilla.org/integration/autoland/rev/1bfd0fb3b0d7 Backed out changeset 51abb63ca246 for frequent marionette failures
Backed out changeset 51abb63ca246 (Bug 1507803) for frequent marionette failures Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=pending%2Crunning%2Csuccess%2Ctestfailed%2Cbusted%2Cexception&searchStr=04db67d46d878f734f52c518deaa36bde4766cbe&revision=51abb63ca2464f588229752a5c6070f459a74e6b&selectedJob=212464098 Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=212464098&repo=autoland&lineNumber=14880 Backout: https://hg.mozilla.org/integration/autoland/rev/1bfd0fb3b0d740be0ea913164dce90263aa822d1
With my patch landed there is a crash of Firefox: 17:55:59 INFO - Thread 0 (crashed) 17:55:59 INFO - 0 XUL!mozilla::a11y::FocusManager::ProcessFocusEvent(mozilla::a11y::AccEvent*) [nsCOMPtr.h:51abb63ca2464f588229752a5c6070f459a74e6b : 919 + 0x0] 17:55:59 INFO - rax = 0x1f00e5179b0df7b1 rdx = 0x00007fff8990779d 17:55:59 INFO - rcx = 0x0000000000000000 rbx = 0x000000012cf65d90 17:55:59 INFO - rsi = 0x0000020600000032 rdi = 0x0000000000000000 17:55:59 INFO - rbp = 0x00007fff5ae4bc60 rsp = 0x00007fff5ae4bc10 17:55:59 INFO - r8 = 0x00007fff8990767d r9 = 0x00007fff8990767e 17:55:59 INFO - r10 = 0x00007faf43028b20 r11 = 0x00007fff750c6740 17:55:59 INFO - r12 = 0x0000000123d5fe40 r13 = 0x0000000123d5fe40 17:55:59 INFO - r14 = 0x000000012cf65d00 r15 = 0x0000000000000000 17:55:59 INFO - rip = 0x000000010a0db3b9 17:55:59 INFO - Found by: given as instruction pointer in context 17:55:59 INFO - 1 XUL!mozilla::a11y::EventQueue::ProcessEventQueue() [EventQueue.cpp:51abb63ca2464f588229752a5c6070f459a74e6b : 308 + 0xb] 17:55:59 INFO - rbp = 0x00007fff5ae4bca0 rsp = 0x00007fff5ae4bc70 17:55:59 INFO - rip = 0x000000010a0da9a2 17:55:59 INFO - Found by: previous frame's frame pointer 17:55:59 INFO - 2 XUL!mozilla::a11y::NotificationController::WillRefresh(mozilla::TimeStamp) [NotificationController.cpp:51abb63ca2464f588229752a5c6070f459a74e6b : 909 + 0x8] 17:55:59 INFO - rbp = 0x00007fff5ae4be90 rsp = 0x00007fff5ae4bcb0 17:55:59 INFO - rip = 0x000000010a0e4ad0 17:55:59 INFO - Found by: previous frame's frame pointer 17:55:59 INFO - 3 XUL!nsRefreshDriver::Tick(mozilla::TimeStamp) [nsRefreshDriver.cpp:51abb63ca2464f588229752a5c6070f459a74e6b : 1878 + 0x9] 17:55:59 INFO - rbp = 0x00007fff5ae4c140 rsp = 0x00007fff5ae4bea0 17:55:59 INFO - rip = 0x0000000108eb1538 17:55:59 INFO - Found by: previous frame's frame pointer 17:55:59 INFO - 4 XUL!mozilla::RefreshDriverTimer::TickRefreshDrivers(mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) [nsRefreshDriver.cpp:51abb63ca2464f588229752a5c6070f459a74e6b : 326 + 0x8] 17:55:59 INFO - rbp = 0x00007fff5ae4c170 rsp = 0x00007fff5ae4c150 17:55:59 INFO - rip = 0x0000000108eb8ef3 17:55:59 INFO - Found by: previous frame's frame pointer 17:55:59 INFO - 5 XUL!mozilla::RefreshDriverTimer::Tick(mozilla::TimeStamp) [nsRefreshDriver.cpp:51abb63ca2464f588229752a5c6070f459a74e6b : 319 + 0xb] 17:55:59 INFO - rbp = 0x00007fff5ae4c1e0 rsp = 0x00007fff5ae4c180 17:55:59 INFO - rip = 0x0000000108eb8d99 17:55:59 INFO - Found by: previous frame's frame pointer 17:55:59 INFO - 6 XUL!mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp) [nsRefreshDriver.cpp:51abb63ca2464f588229752a5c6070f459a74e6b : 760 + 0xb] 17:55:59 INFO - rbp = 0x00007fff5ae4c220 rsp = 0x00007fff5ae4c1f0 17:55:59 INFO - rip = 0x0000000108eba567 17:55:59 INFO - Found by: previous frame's frame pointer Kartikaya, I wonder if that is somewhat related to bug 1306896, even not identical with the frames above `mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver`.
Flags: needinfo?(hskupin) → needinfo?(kats)
As it looks like it is not related, but most likely caused by the extra call to `focus()` right after opening the new window, which I used as workaround for getting focus at all for that window with the focusmanager testmode enabled. Maybe this crash only occurs because we initially don't set the focus on the window. Neil will know more about it, but will only be back by next week. Note that I cannot reproduce the crash locally at all with a debug build of Firefox on MacOS.
The focus crash actually happened only once. Otherwise I can see hangs of the harness in `TestTimeouts.test_reset_timeout` which I cannot explain right now. I just triggered another try push only for MacOS debug to see if that is still a problem.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=30d1962e29eba6e60bb8e7d4d012aacab234f7c6 So the hang is still there, and I can finally reproduce it at least with running the headless tests. I will start narrowing it down now.
The very interesting fact is that this failure can ONLY be reproduced the very first time I run a Firefox installation. When I re-use the same application bundle for the next test job locally, it passes all fine. As such I have to remove it, and install again from the DMG file. Next step is to find the minimum amount of tests to run to actually trigger this probably hang of Firefox.
Also note that killing the Marionette job with Ctrl+C does NOT kill the Firefox process! But I can kill it just fine via the command line with `kill -9 pid`. Once I have a reduced testcase it will be easier to debug, and I will provide a fix on a separate bug.
Locally an unexpected update of Firefox while the test is running makes the test hang. I filed that as bug 1508726. This would also explain why the initial Mn jobs after the build is done are fine, but are failing for retriggers as done later in the day. Because at this time new nightlies are most likely out, and as such would cause an update which means the same condition as I see locally. Also I wouldn't have found this if bug 1508836 wouldn't be present. It currently allows the CI debug build from autoland to update to an opt Nightly build on mozilla-central. One of both we need to be fixed, and I assume the latter will be faster and easier.
Depends on: 1508836
Now with the hang fixed via bug 1508836, and not allowing CI builds to update via the nightly channel, the focus manager crash as noted earlier is remaining. It's tracked on bug 1509234 now.
(In reply to Henrik Skupin (:whimboo) from comment #14) > Also note that killing the Marionette job with Ctrl+C does NOT kill the > Firefox process! But I can kill it just fine via the command line with `kill > -9 pid`. Once I have a reduced testcase it will be easier to debug, and I > will provide a fix on a separate bug. The reason for this simply is that Firefox gets upgraded and restarts itself during start-up. That gives the process a new process id, which makes it unreachable for mozprocess to kill. As such we see this 1000s job timeout.
For the missing focus / activate events I filed bug 1509380 now. So I moved the ni? for Neil.
A workaround which seems to be fine is to wait with calling `win.focus()` until the next tick: https://treeherder.mozilla.org/#/jobs?repo=try&revision=bd595c43f7ff5fb922f40d3ddfe44508aee6e319&searchStr=debug Here a try build for all platforms to see if we can get this landed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9749bd765484d0622f4aa7b21d6552ab51f238d1
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/autoland/rev/0f57247530fd [marionette] Allow to open a chrome window via the WindowManager mixin class. r=ato
You need to log in before you can comment on or make changes to this bug.