Closed Bug 1719150 Opened 3 years ago Closed 3 years ago

Changing search engine keyword to an already used keyword and then selecting another search engine makes Firefox unresponsive

Categories

(Core :: DOM: Events, defect)

Firefox 89
defect

Tracking

()

VERIFIED FIXED
96 Branch
Tracking Status
firefox-esr91 --- verified
firefox89 --- wontfix
firefox93 --- wontfix
firefox94 --- wontfix
firefox95 --- verified
firefox96 --- verified

People

(Reporter: kmoz, Assigned: emilio)

Details

(Keywords: hang, regression)

Attachments

(4 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0

Steps to reproduce:

  1. Go to about:preferences#search Search Shortcuts
  2. Change the keyword of a search engine (A) to a keyword that is already used
  3. Select another search engine than search engine (A). Now a dialog opens, that informs that this keyword is already used.
  4. Close the dialog. Firefox has become unresponsive.

Actual results:

Firefox freezes

Expected results:

The already used keyword is removed and Firefox does not freeze

To reproduce it I had to do some keyboard acrobatics: I already had my mouse pointer ready to click on another search engine before changing the keyword on the first search engine.

The Bugbug bot thinks this bug should belong to the 'Firefox::Search' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Search

I can reproduce this issue on my machine Ubuntu 20.04.1 LTS(X11) using Fx 89.0.2 but you should follow the steps bellow:

  1. Go to about:preferences#search Search Shortcuts
  2. Change the keyword of a search engine (A) to a keyword that is already used but in the same time focus the mouse pointer on the another search engine
  3. Press Enter to select the focused search engine from above . Now a dialog opens, that informs that this keyword is already used.
  4. Close the dialog.
    Actual result: Firefox has become unresponsive.
    If this is not the right component please feel free to change it.
Severity: -- → S3
Status: UNCONFIRMED → NEW
Component: Search → Preferences
Ever confirmed: true

I'm having trouble reproducing this. Can you run mozregression to find when this regressed?

Flags: needinfo?(raluca.popovici)

This is my results from my mozregression run:

2021-07-27T19:18:31.358000: DEBUG : Found commit message:
Backed out 8 changesets (bug 1714933) for causing build bustages in 1659595.js CLOSED TREE

Backed out changeset 2f72875b300f (bug 1714933)
Backed out changeset a42831b46643 (bug 1714933)
Backed out changeset c7ccabae5816 (bug 1714933)
Backed out changeset a4f4e4026174 (bug 1714933)
Backed out changeset fca960ad267b (bug 1714933)
Backed out changeset 62bdba475376 (bug 1714933)
Backed out changeset f168a2d1d391 (bug 1714933)
Backed out changeset 85af7c63b9c5 (bug 1714933)

2021-07-27T19:18:31.359000: DEBUG : Did not find a branch, checking all integration branches
2021-07-27T19:18:31.363000: INFO : The bisection is done.
2021-07-27T19:18:31.364000: INFO : Stopped

Has Regression Range: --- → yes
Flags: needinfo?(raluca.popovici)
Regressed by: 1714933
Flags: needinfo?(jaws)

Hmm.. I doubt ICU changes would have caused this. I looked at the changesets that landed next to the ICU change and bug 1713334 looks like it could be a reasonable candidate. @saschanaz, can you take a look?

Flags: needinfo?(jaws) → needinfo?(krosylight)
Regressed by: 1713334
No longer regressed by: 1714933

That mozregression result pointing version 91 doesn't really make sense when the issue happens on version 89 per comment #0 and comment #3. I can't reproduce it on Firefox 89/90 nor on Nightly 92 on Windows, could you check it happens on Nightly?

Flags: needinfo?(krosylight)
Flags: needinfo?(raluca.popovici)

I have been able to reproduce this on Firefox 90 on Linux. Although it didn't work with a search engine that had a predefined keyword (only noticed that now). I also feel like it got harder to hit this crash, I had to try a few times.

I also tried in Firefox Nightly on Linux (from the archlinux user repository) and it didn't work anymore with the method I described in comment #0.

But I could reproduce it like this:

  1. Add two search engines via the search field next to the address bar (when you are on a page that supports this).
  2. Set a keyword for the first search engine.
  3. Set the same keyword for the second but don't confirm yet.
  4. Confirm by clicking on the other search engine.

Result: Firefox Nightly freezes.

Hey saschanaz, are you able to reproduce given comment 8?

Flags: needinfo?(krosylight)

No. On Nightly 93 and Firefox 90, I see an error message "You have chosen a keyword that is currently in use by “Daum”. Please select another."

My steps:

  1. Get a fresh Firefox 90 from mozregression
  2. Add search engines from https://daum.com and https://github.com
  3. Move to about:preferences#search
  4. On Search Shortcuts, double click the empty cell next to Daum and type "@test"
  5. Double click the empty cell next to GitHub and type "@test" again
  6. Click Daum
  7. An error message opens, as expected.

I still think it's nothing to do with bug 1713334 since it's a version 91 thing and this reportedly happens on version 89 and 90.

Flags: needinfo?(krosylight)

Raluca, re-requesting needinfo to see if you can run mozregression to find when this may have regressed prior to Firefox 89?

Flags: needinfo?(raluca.popovici)

I think you actually canceled the ni 👀

Flags: needinfo?(raluca.popovici)
Attached image mozr.jpg

This is as far as i could go

last good build is 2020-4-16
first bad build is 2020-4-17

that would be firefox nightly 77.0a1

Mozregression gave me an error (screeshot attached) and after i hit ok, it just closed.

´please let me know if this helps somehow.

This pushlog should cover those dates, https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a66e5e777cd16f0ee44e2c13eaab328854a9da5e&tochange=853b0e791775ce726149209092a003ed5f001b0c

I don't see anything in there that jumps out to me. I would expect the bug may exist in Editor or maybe some Layout code (though less likely).

@emilio, does anything in this pushlog stand out to you?

Flags: needinfo?(emilio)

Bug 1628288 perhaps? But not really other than that... Might be worth doing a manual bisection.

Flags: needinfo?(emilio)

Alternatively, a perf profile that shows where time is spent could be useful.

Hi Kmoz, thank you for your continued help here with this issue.

Would you be able to follow the steps at https://firefox-source-docs.mozilla.org/performance/reporting_a_performance_problem.html to record a performance profile of this issue and share it here?

Flags: needinfo?(kmoz)

I'll gladly do that. But I have no access to a desktop for the next four weeks, so you'll have to wait for that long (for me at least).

No longer regressed by: 1713334

Hey kmoz,

It's been about 4 weeks since your last comment. Will you have access to your desktop machine soon?

Attached file ff-nightly-log

Good thing you reminded me, my reminder seems to have failed.

Sadly I have not been able to create a profile, though I tried. Here's what happens:

$ MOZ_PROFILER_SHUTDOWN=/home/kmoz/world_writable/profile firefox-nightly
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
/usr/lib/libicudata.so.69: unable to generate file identifier
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
[1]    117260 killed     MOZ_PROFILER_SHUTDOWN=/home/kmoz/world_writable/profile firefox-nightly

But there is no file profile file after killing firefox-nightly. I also tried the normal way with activating the profiler and then triggering the bug, but after force quitting there seems to be no profile (no new tab opening after restart, at least).

I have run this: MOZ_LOG=prof:5 firefox-nightly and triggered the bug. I attached the output log (Hoping that works).

Flags: needinfo?(kmoz)

(In reply to kmoz from comment #21)

Good thing you reminded me, my reminder seems to have failed.

Sadly I have not been able to create a profile, though I tried. Here's what happens:

... elided ...

But there is no file profile file after killing firefox-nightly. I also tried the normal way with activating the profiler and then triggering the bug, but after force quitting there seems to be no profile (no new tab opening after restart, at least).

I have run this: MOZ_LOG=prof:5 firefox-nightly and triggered the bug. I attached the output log (Hoping that works).

Hi kmoz, sorry I'm a little confused.

Looking at your comment #8, when you launch Firefox in that setting, you should then go to https://profiler.firefox.com/ to enable the Firefox profiler button. Then you can click on the button to start Capturing, then click again to stop recording. The new tab will be opened with the profile. There shouldn't be a restart required.

Does that help?

Flags: needinfo?(raluca.popovici) → needinfo?(kmoz)

I was a little unclear. As Firefox crashes when I trigger this bug, I have to restart. If I don't restart, Firefox just stays frozen (everything, no UI elements react to clicks) and no new tab with a profile ever opens because I cannot stop capturing.

That's why I tried with the MOZ_PROFILER_SHUTDOWN environment variable but it also doesn't didn't work.

Flags: needinfo?(kmoz)

Oh! If you're getting a crash, that might actually be easier than gathering a profile. If you crash, then restart Firefox and go to about:crashes, can you paste the link to the most recent submitted crash report? (You might have to scroll by any unsubmitted crash reports to get to them).

Flags: needinfo?(kmoz)

Sorry, I really need to get my terminology straight. It's not a crash, it becomes unresponsive/freezes. I have to manually quit Firefox (kill it on linux, force quit on MacOS). So no new entries under about:crashes, I checked...

Flags: needinfo?(kmoz)

I just reproduced the bug on the latest Firefox Nightly on MacOS and attached the log MacOS generated after force quitting the unresponsive Firefox Nightly.

Thanks, kmoz.

Looking at that trace, it looks like we're hanging here somewhere in the intersection of layout and graphics (building displaylists in order to find the right frame at a particular coordinate).

Hey dholbert, does anything from that trace jump out at you that might explain what kmoz is hitting here?

Flags: needinfo?(dholbert)

(In reply to Mike Conley (:mconley) (:⚙️) from comment #27)

Hey dholbert, does anything from that trace jump out at you that might explain what kmoz is hitting here?

Not yet, no. But I was able to reproduce locally (first try, using Nightly with a fresh profile). The key is picking a different search engine at the same time as you confirm the one that you were editing.

I was able to generate a crash report, by doing kill -11 [pid] for the hanging process:. Here's my crash report:
bp-047435ea-db2e-40a6-a287-be1440211005

Here's a pernosco recording of a session that ends up hanging due to this bug (and then I use kill -11 to crash the process during the hang):
https://pernos.co/debug/v1OxVkSCN6PuaSEAXJdtiw/index.html

dholbert. Hrm... I'm afraid my Pernosco skills are still pretty shoddy. Are we dealing with some kind of deadlock in the displaylist code?

Hey fgriffith, is there somebody on Layout who might have some cycles to help us with this?

Flags: needinfo?(fgriffith)

Will try to look. From what I see, I don't see a deadlock, only we getting stuck dispatching delayed events.

Flags: needinfo?(fgriffith) → needinfo?(emilio)

So the relevant issue seems to be here in PresShell::EventHandler::MaybeDiscardOrDelayMouseEvent.

We're dispatching delayed events, and the event is targeted to a document that has events suppressed, but mPresShell is not the target frame's presShell, which seems to be the assumption for the loop to terminate.

So we're basically always dispatching an event and delaying it, then trying to dispatch it again in a loop. The suppressed event is going to about:preferences, but mPresShell->mDocument is browser.xhtml.

I've let some notes in the document, but I think I know of a reasonable fix, let's see what Olli thinks.

Component: Preferences → DOM: Events
Product: Firefox → Core

Otherwise if we get an event targeting a different suppressed pres shell
while we're unsuppressing we might get stuck delaying the event over and
over, see the links in comment 33.

Not sure how to get a repro for this, the patch is written based on the
pernosco session, but suggestions welcome.

Assignee: nobody → emilio
Status: NEW → ASSIGNED
Flags: needinfo?(emilio)
Flags: needinfo?(dholbert)

Thanks for picking this up, emilio!

(In reply to Emilio Cobos Álvarez (:emilio) from comment #34)

Not sure how to get a repro for this, the patch is written based on the
pernosco session, but suggestions welcome.

I just reconfirmed that I can reproduce the bug in my local debug+opt mozilla-central build -- it reproduced on the first try. I also tried in a "full debug" build (with --disable-optimize), and it seems a bit harder to repro there (there might be some aspect of the race condition which is harder to trip without optimizations); but I confirmed it there as well after a few minutes of repeating the STR.

Then I rebuilt my debug+opt build with emilio's patch, and I spent another few minutes trying to reproduce in my debug+opt build, and was unable to repro.

So from my testing, it looks like this does indeed fix the issue. \o/

Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/462095e0f360
Push delayed mouse event to the right pres shell. r=smaug
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 96 Branch

The patch landed in nightly and beta is affected.
:emilio, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(emilio)

Comment on attachment 9248723 [details]
Bug 1719150 - Push delayed mouse event to the right pres shell. r=smaug

Beta/Release Uplift Approval Request

  • User impact if declined: comment 0
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: comment 0 and other comments in the bug have STR
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Relatively simple change.
  • String changes made/needed: none
Flags: needinfo?(emilio)
Attachment #9248723 - Flags: approval-mozilla-beta?
Flags: qe-verify+

I'll take this one in beta once this has been verified in Nightly by QA.

Reproduced the issue in Beta 95 using Ubuntu 20.04.2 LTS, X11.
Verified - Fixed in latest Nightly 96.0a1 (2021-11-04) (build id: 20211104214758), using the same OS, Ubuntu 20.04.

Comment on attachment 9248723 [details]
Bug 1719150 - Push delayed mouse event to the right pres shell. r=smaug

Approved for 95 beta 4, thanks.

Attachment #9248723 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Verified - Fixed in latest Beta 95.0a4 (build id: 20211107219147), using Ubuntu 20.04, Windows 10 and macOS 10.15.

QA Whiteboard: [qa-triaged]

Should we consider this for ESR uplift too?

Flags: needinfo?(emilio)

Comment on attachment 9248723 [details]
Bug 1719150 - Push delayed mouse event to the right pres shell. r=smaug

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: prevents UI hang
  • User impact if declined: prevents UI hang
  • Fix Landed on Version: 95
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Relatively simple fix.
  • String or UUID changes made by this patch: none
Flags: needinfo?(emilio)
Attachment #9248723 - Flags: approval-mozilla-esr91?
Attachment #9248723 - Flags: approval-mozilla-esr91? → approval-mozilla-esr91+

Verified - Fixed in latest 91.4.0esr (build id: 20211124155212), using Ubuntu 20.04, Windows 10 and macOS 10.15.

Status: RESOLVED → VERIFIED
QA Whiteboard: [qa-triaged]
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: