JavaScript error: resource://gre/modules/Prompter.jsm, line 642: NS_ERROR_NOT_AVAILABLE: prompt aborted by user
Categories
(Firefox :: Security, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox76 | --- | unaffected |
firefox77 | + | wontfix |
firefox78 | --- | fixed |
People
(Reporter: whimboo, Assigned: pbz)
References
(Regression)
Details
(Keywords: regression)
Attachments
(1 file)
There are new test failures for Marionette and webdriver spec tests. They all have the following failure in common:
[task 2020-04-17T19:37:17.347Z] 19:37:17 INFO - JavaScript error: resource://gre/modules/Prompter.jsm, line 642: NS_ERROR_NOT_AVAILABLE: prompt aborted by user
Looks like a regression from bug 1615588.
Reporter | ||
Comment 1•5 years ago
|
||
Reporter | ||
Updated•5 years ago
|
Assignee | ||
Comment 2•5 years ago
|
||
I'll look into this. Thanks!
Updated•5 years ago
|
Assignee | ||
Comment 3•5 years ago
|
||
I think this might a side effect of regular teardown. When a tab / browser is destroyed it calls abortPrompt
on CommonDialog.jsm
which causes Prompter.jsm to throw here. We should probably differentiate this more and only throw for actual user prompt aborts like navigation.
Reporter | ||
Comment 4•5 years ago
|
||
Paul, do you have an update? This problem is causing a fair amount of intermittent failures, some even reaching the top intermittent list. Thanks!
Assignee | ||
Comment 5•5 years ago
•
|
||
Hi! Sorry for the delay, I was busy with 77 work. I'll take a look today and tomorrow and will keep you updated here. As a quick fix it might be possible to just ignore these exceptions, since they don't really indicate a problem in the test.
I'm still a bit confused why this throws more often with my patch, given that the old code also threw on prompt close / teardown. It's also interesting that there are barely any failures on linux.
Assignee | ||
Comment 6•5 years ago
|
||
I suspect that this became intermittent because I fixed a bug with my patch. The for loop here was faulty https://searchfox.org/mozilla-central/rev/74bd4471abc8545c9a12424b33c6d90889b864aa/browser/actors/PromptParent.jsm#123 it did not iterate over the right object and thus never aborted any prompts => no exception thrown on teardown. Now with the fixed for
loop it throws an exception when the browser (+ PromptParent window actor) is destroyed.
Since throwing is the correct behavior I suggest that we ignore the exception in these tests or make sure to always fully close all prompts before doing any kind of teardown. I'll run a few experiments on try and see how I can fix this.
Reporter | ||
Comment 7•5 years ago
|
||
Paul, can you please explain when exactly the error is thrown? When some code calls accept()
or dismiss()
on the prompt object?
Reporter | ||
Comment 8•5 years ago
|
||
If that is the case, the code affected in Marionette is:
And hereby we always issue a click event on the appropriate buttons.
Reporter | ||
Comment 9•5 years ago
|
||
Note that this problem mainly happens for MacOS debug and Windows 7 opt builds. Why don't we see it on Linux or other kind of builds. This feel kinda in-consequent if we expect this error to be thrown all the time. Then I would expect a perma failure.
Reporter | ||
Comment 10•5 years ago
|
||
Please note that I'm not able to reproduce this failure locally even not with testing/web-platform/tests/webdriver/tests/execute_script/execute.py
running in a loop for 50 times.
Reporter | ||
Comment 11•5 years ago
|
||
Ok, so what's interesting here are those lines:
[task 2020-04-30T04:43:43.213Z] 04:43:43 INFO - PID 1678 | 1588221823183 webdriver::server DEBUG -> POST /session/c0b0a745-6571-5844-9016-25179d3d44cd/execute/sync {"args": [], "script": "window.alert('Hello'); return 1;"}
[task 2020-04-30T04:43:43.214Z] 04:43:43 INFO - PID 1678 | 1588221823185 Marionette DEBUG 0 -> [0,342,"WebDriver:ExecuteScript",{"args":[],"script":"window.alert('Hello'); return 1;"}]
[task 2020-04-30T04:43:43.279Z] 04:43:43 INFO - PID 1678 | 1588221823250 Marionette TRACE Received observer notification tabmodal-dialog-loaded
[task 2020-04-30T04:43:43.280Z] 04:43:43 INFO - PID 1678 | 1588221823275 Marionette DEBUG 0 <- [1,342,null,{"value":null}]
[task 2020-04-30T04:43:43.298Z] 04:43:43 INFO - PID 1678 | 1588221823288 Marionette TRACE Received event DOMModalDialogClosed
[task 2020-04-30T04:43:43.317Z] 04:43:43 INFO - PID 1678 | JavaScript error: resource://gre/modules/Prompter.jsm, line 1178: NS_ERROR_NOT_AVAILABLE: prompt aborted by user
[task 2020-04-30T04:43:43.317Z] 04:43:43 INFO - PID 1678 | 1588221823312 webdriver::server DEBUG <- 200 OK {"value":null}
[task 2020-04-30T04:43:43.318Z] 04:43:43 INFO - PID 1678 | 1588221823314 webdriver::server DEBUG -> POST /session/c0b0a745-6571-5844-9016-25179d3d44cd/alert/accept {}
[task 2020-04-30T04:43:43.335Z] 04:43:43 INFO - PID 1678 | 1588221823319 Marionette DEBUG 0 -> [0,343,"WebDriver:AcceptDialog",{}]
[task 2020-04-30T04:43:43.335Z] 04:43:43 INFO - PID 1678 | 1588221823321 Marionette DEBUG 0 <- [1,343,{"error":"no such alert","message":"","stacktrace":"WebDriverError@chrome://marionette/content/error.js:175:5\nNoSuchA ... t@chrome://marionette/content/server.js:249:9\n_onJSONObjectReady/<@chrome://marionette/content/transport.js:501:20\n"},null]
The call to WebDriver:ExecuteScript
causes a tabmodal-dialog-loaded
event, which means the alert
prompt has been opened. Then the command within Marionette returns just fine. But why is that followed with DOMModalDialogClosed
immediately? There is clearly NO user interaction here. Closing the alert happens around 33ms later via WebDriver:AcceptDialog
.
In working cases it looks like:
6:53.63 pid:70610 1588230028090 Marionette DEBUG 0 -> [0,100,"WebDriver:ExecuteScript",{"args":[],"script":"window.alert('Hello'); return 1;"}]
6:53.67 pid:70610 1588230028121 Marionette TRACE Received observer notification tabmodal-dialog-loaded
6:53.67 pid:70610 1588230028123 Marionette DEBUG 0 <- [1,100,null,{"value":null}]
6:53.67 pid:70610 1588230028133 webdriver::server DEBUG <- 200 OK {"value":null}
6:53.67 pid:70610 1588230028134 webdriver::server DEBUG -> POST /session/f3874870-b931-864e-8b16-47dfe6bb409c/alert/accept {}
6:53.68 pid:70610 1588230028136 Marionette DEBUG 0 -> [0,101,"WebDriver:AcceptDialog",{}]
6:53.68 pid:70610 1588230028141 Marionette TRACE Received event DOMModalDialogClosed
6:53.68 pid:70610 1588230028143 Marionette TRACE Received DOM event DOMModalDialogClosed for [object XULFrameElement]
6:53.69 INFO STDOUT: PASSED
6:53.70 pid:70610 1588230028155 Marionette DEBUG 0 <- [1,101,null,{"value":null}]
Reporter | ||
Comment 12•5 years ago
|
||
Same for this Marionette test:
[task 2020-04-30T07:33:36.963Z] 07:33:36 INFO - 1588232016960 Marionette DEBUG 48 -> [0,2,"WebDriver:ExecuteAsyncScript",{"scriptTimeout":null,"newSandbox":true,"args":["alert","foo alert"],"filename":"Z:\\task ... .return_value = null;\n window.return_value = window[arguments[0]](arguments[1]);","sandbox":"default","line":45}]
[task 2020-04-30T07:33:36.970Z] 07:33:36 INFO - 1588232016967 Marionette TRACE Received observer notification tabmodal-dialog-loaded
[task 2020-04-30T07:33:36.970Z] 07:33:36 INFO - 1588232016968 Marionette DEBUG 48 <- [1,2,null,{"value":null}]
[task 2020-04-30T07:33:36.971Z] 07:33:36 INFO - 1588232016969 Marionette TRACE Received event DOMModalDialogClosed
[task 2020-04-30T07:33:36.974Z] 07:33:36 INFO - JavaScript error: resource://gre/modules/Prompter.jsm, line 1178: NS_ERROR_NOT_AVAILABLE: prompt aborted by user
[task 2020-04-30T07:33:36.977Z] 07:33:36 INFO - 1588232016973 Marionette DEBUG 48 -> [0,3,"WebDriver:GetTitle",{}]
[task 2020-04-30T07:33:36.977Z] 07:33:36 INFO - 1588232016974 Marionette DEBUG 48 <- [1,3,null,{"value":"Marionette Test"}]
The tab modal gets opened and immediately closed. As such the call to WebDriver:GetTitle
fails because it expects an alert
prompt to be open.
So something clearly closes the opened tab modal prompt.
CC'ing both Johann and Matthew because they reviewed those patches, and maybe we can spot that problem earlier.
Reporter | ||
Comment 13•5 years ago
|
||
I feel that something causes us to run onPromptClose
or forceCleanup
immediately:
Assignee | ||
Comment 14•5 years ago
•
|
||
The exception is thrown when abortPrompt is called on CommonDialog.jsm. This happens in a couple cases:
- "unload" and "TabClose" events here
- Switching away from a tab with an active beforeunload prompt here
- The PromptParent window actor calls abortPrompt on destroy and when the content fires
pagehide
There could be a race condition with how we handle pagehide
events. Since they are fired in the content process, in the new implementation they are sent to the parent via an async actor message. The old implementation handled this with an event listener in Prompter.jsm in the content process. So given that browser (+ JSWindowActor pair) is reused it could be side effects from a preceding test.
I've seen a similar issue in another test: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=300037356&repo=try&lineNumber=14207
Reporter | ||
Comment 15•5 years ago
|
||
For those affected tests we don't do any of those above mentioned cases.
Comment 16•5 years ago
|
||
Henrik if it is decided to backout based on convo here https://bugzilla.mozilla.org/show_bug.cgi?id=1629147#c8, please contact sherrifs on duty on riot in #sheriffs, or by ni the sheriffing alias. thank you.
Comment 17•5 years ago
|
||
A few general questions before we rush into a backout:
- How many tests is this causing to fail per day/week?
- Are these tests tier 1 or tier 2?
- What's the urgency to back out quickly instead of waiting another week for a better resolution?
We would obviously like to avoid backing out a patch that has landed in central a while ago and now lives beneath a bunch of other work.
Comment 18•5 years ago
|
||
(In reply to Johann Hofmann [:johannh] from comment #17)
A few general questions before we rush into a backout:
- How many tests is this causing to fail per day/week?
- Are these tests tier 1 or tier 2?
- What's the urgency to back out quickly instead of waiting another week for a better resolution?
We would obviously like to avoid backing out a patch that has landed in central a while ago and now lives beneath a bunch of other work.
This caused 59 total failures in the last 7 days in bug 1629147, tier1.
In bug 1631112, there are 15 total failures in the last 7 days, tier1.
We usually backout if there is too much noise on the trees, or if the backout was requested. We (sheriffs) can wait for a patch if this is actively being worker on. Leaving the ni for Henrik if he has something to add here.
Reporter | ||
Comment 19•5 years ago
|
||
Nothing more to add from my side beside that I would like to see this fixed sooner than later. And yes, it passed already a week but not because it was reported that late. So hopefully we can this sort out soon.
Assignee | ||
Comment 21•5 years ago
|
||
I'm looking to get it fixed this week.
Currently running with verbose logging on try and also seeing if I can reproduce it locally, on a macOS and Windows VM. I have a suspect but I want to be able to reproduce it reliably before landing a fix.
Reporter | ||
Comment 22•5 years ago
|
||
For verbose logging of Marionette you can add -vv --gecko-log -
to the mach command, which will enable trace logging. For wdspec tests use --webdriver-arg=-vv
.
Assignee | ||
Comment 23•5 years ago
•
|
||
I was able to reproduce it locally on macOS. It looks like it is a race condition with the pagehide
event. So far the try runs look good. I'll submit a patch later today.
Reporter | ||
Updated•5 years ago
|
Assignee | ||
Comment 24•5 years ago
|
||
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 25•5 years ago
|
||
Comment 26•5 years ago
|
||
bugherder |
Assignee | ||
Updated•5 years ago
|
Reporter | ||
Comment 28•4 years ago
|
||
All the Marionette and Webdriver tests are looking fine now! Thanks a lot.
Paul, can you please request uplift so we can also get this fixed for 77 beta? Maybe we should even track the release for it?
Assignee | ||
Comment 29•4 years ago
•
|
||
Glad its fixed!
We haven't really seen issues anywhere outside of the test environment so I'd rather have this ride the train for now. Please let me know if you have concerns about that.
Updated•4 years ago
|
Updated•4 years ago
|
Reporter | ||
Comment 30•4 years ago
|
||
(In reply to Paul Zühlcke [:pbz] from comment #29)
Glad its fixed!
We haven't really seen issues anywhere outside of the test environment so I'd rather have this ride the train for now. Please let me know if you have concerns about that.
Sorry that I missed this comment and an important note: One thing you did not know is that Marionette and geckodriver are tools, which are part of several web testing frameworks. As such they are not only used by ourselves in CI but also heavily by web developers via the Selenium ecosystem to test web pages, and to automate certain tasks. While we do not know what exactly gets tested, it's fair to assume that there will be a lot of tests for user prompts (alert
, confirm
, prompt
) and also including beforeunload
.
Given the amount of intermittent failures we had with this bug being present, we can assume that with not fixing it for Firefox 77 a lot of users will also face it. Also with not having a preference to turn the new behavior off, there would even be no workaround for users, or even us to somewhat hide the failure by implementing a temporary workaround in geckodriver.
As such we may have to re-evaluate the importance of the attached patch and that we should ship it with Firefox 77. I know it is very late and we have the 77.0 RC3 build out. But nevertheless I have to bring this up for consideration by the release drivers. Not sure who will be best to be notified here, so setting needinfo for a couple of folks.
Comment 31•4 years ago
|
||
Sorry but I can't help. i am no longer involved in release management
Comment 32•4 years ago
|
||
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #30)
Sorry that I missed this comment and an important note: One thing you did not know is that Marionette and geckodriver are tools, which are part of several web testing frameworks. As such they are not only used by ourselves in CI but also heavily by web developers via the Selenium ecosystem to test web pages, and to automate certain tasks. While we do not know what exactly gets tested, it's fair to assume that there will be a lot of tests for user prompts (
alert
,confirm
,prompt
) and also includingbeforeunload
.Given the amount of intermittent failures we had with this bug being present, we can assume that with not fixing it for Firefox 77 a lot of users will also face it. Also with not having a preference to turn the new behavior off, there would even be no workaround for users, or even us to somewhat hide the failure by implementing a temporary workaround in geckodriver.
As such we may have to re-evaluate the importance of the attached patch and that we should ship it with Firefox 77. I know it is very late and we have the 77.0 RC3 build out. But nevertheless I have to bring this up for consideration by the release drivers. Not sure who will be best to be notified here, so setting needinfo for a couple of folks.
Hi Henrik,
Unfortunately this is way too late to include it in Tuesday's release with all of our European QA on PTO on Monday, I am going to track it for a potential 77 dot release, if you can monitor external feedback for this issue, that would be good. We could add a mention in our release notes as a known issue if you can provide the wording.
Updated•4 years ago
|
Updated•4 years ago
|
Reporter | ||
Comment 33•4 years ago
|
||
(In reply to Pascal Chevrel:pascalc from comment #32)
Unfortunately this is way too late to include it in Tuesday's release with all of our European QA on PTO on Monday, I am going to track it for a potential 77 dot release, if you can monitor external feedback for this issue, that would be good. We could add a mention in our release notes as a known issue if you can provide the wording.
I cannot explain the underlying details in Firefox. Paul would be the best person here.
Assignee | ||
Comment 34•4 years ago
•
|
||
(In reply to Paul Zühlcke [:pbz] from comment #14)
There could be a race condition with how we handle
pagehide
events. Since they are fired in the content process, in the new implementation they are sent to the parent via an async actor message. The old implementation handled this with an event listener in Prompter.jsm in the content process. So given that browser (+ JSWindowActor pair) is reused it could be side effects from a preceding test.
Given that removing the pagehide window actor message fixed the issue this was most likely the cause for the intermittent tests.
That was the switch from an event listener in the content process here: https://searchfox.org/mozilla-central/rev/74bd4471abc8545c9a12424b33c6d90889b864aa/toolkit/components/prompts/src/Prompter.jsm#608
to the async window actor message here: https://searchfox.org/mozilla-central/rev/3075dbd453f011aaf378bcac6b2700dccfcf814c/browser/actors/PromptChild.jsm#18
TLDR: In 77 there could be situations in the test environment where the pagehide
event of the previous page could cause a prompt spawned in the new page to be closed early. However, only if the window (+ the associated window actor) of the old page is reused for the new page.
Reporter | ||
Comment 35•4 years ago
|
||
Thanks Paul! I have added an item to the developer release notes of Marionette:
https://wiki.developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/77#WebDriver_conformance_Marionette
Also Paul mentioned that it might be better to not fix it in a dot release of Firefox 77 but let it ride the train. Lets see how many users of Webdriver are affected here, before we actually escalate the problem. I will keep an eye on feedback from the Selenium community.
Updated•4 years ago
|
Description
•