Closed Bug 1691813 Opened 5 years ago Closed 4 years ago

Browser hangs for a few seconds when scrolling and right-clicking on a view source tab after session restoration

Categories

(Core :: Disability Access APIs, defect, P1)

Desktop
Windows 10
defect

Tracking

()

RESOLVED FIXED
91 Branch
Performance Impact medium
Tracking Status
firefox-esr78 --- unaffected
firefox85 --- unaffected
firefox86 --- wontfix
firefox87 --- wontfix
firefox88 --- wontfix
firefox89 --- wontfix
firefox90 --- wontfix
firefox91 --- fixed

People

(Reporter: alice0775, Assigned: eeejay)

References

(Regression)

Details

(4 keywords)

Attachments

(1 file)

Reproducible: always

Steps to reproduce:
0. Activate acceciblity tools such as ATOK(Japanese IME) insights

  1. Enabe "Restore previous session" from about:preferences

  2. Open https://ftp.mozilla.org/pub/firefox/nightly/2021/01/

  3. Right click on the page, Choose "View Page Source"

  4. Quit beowser

  5. Start the browser
    6-1. Right click so that context menu will appear
    6-2. Click elsewhere to hide the context
    6-3. Scroll 50-100px

  6. Repeat step 6-1 to step 6-3 several times
    --- browser does not hang as expected --- browser get hang up for long period. BUG!

Profiler log: https://share.firefox.dev/3p2ECcV

Component: Performance → Disability Access APIs
Whiteboard: [qf]

This looks quite serious. The content process hangs for five seconds.

Whiteboard: [qf] → [qf:p2:responsiveness]
Severity: -- → S2
Priority: -- → P1

Regression window(*1):

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=69ae5f925fbc67d86a6fda1ed0a4b02ae8fc4588&tochange=02d875d4f7d769cd03be3175490377dc600355fe

*1,
On the good build, Though it takes time for page displayed at the session restored.
On the bad build, Session restore is fast, but this bug appears.

Keywords: regression
Regressed by: 1676966
Has Regression Range: --- → yes

I can't seem to reproduce this. I'm also puzzled as to why this would be more likely to happen when restoring tabs than not.

If I read this profile right (and I still struggle with the profiler, so I might not be reading it correctly), we're spending a lot of time in DocAccessible::GetAccessibleOrDescendant called by DocAccessible::PruneOrInsertSubtree, which seems to be triggered by layout reframing (RestyleManager). One thing that does occur to me is that this document contains a lot of span elements inside a pre element and a11y strips out the spans. If GetAccessibleOrDescendant gets called a lot on these spans, it's going to get to the container (the pre element). Then, it's going to walk through the (9061) children looking for a matching node in their ancestry. Since PruneOrInsertSubtree is often recursive, that probably ends up being O(n^2), particularly if the pre element gets reframed.

I wonder if it'd make sense to somehow track nodes which have descendant Accessibles? We'd have to update that when inserting and removing from the document, though, and I worry that the ancestry walking necessary to do this might become its own source of perf problems.

Eitan, thoughts?

(In reply to James Teh [:Jamie] from comment #3)

I'm also puzzled as to why this would be more likely to happen when restoring tabs than not.

Ah. The regression window in comment 2 points to bug 1676966. I guess reflowing after character maps becoming available would explain the change in behaviour (see comment 3 for why I think reflow might be slow for a11y here). That said, I don't think there are any characters on that page that should require font fallback? :jfkthame, any idea why this might get triggered here? Does reflow after async character map availability reflow all documents or only affected ones?

Flags: needinfo?(jfkthame)

At that point, we reflow all documents -- because we don't know which document(s) may have encountered font fallback during their previous layout, and will now render more correctly. I.e. we don't know whether any given document will be affected or not. (It's a lot like the situation if the user modifies their system font configuration (e.g. installs a new font) while the browser is running; we have to reflow everything, because we don't know which documents might have wanted to use that font if it had been available to them.)

For the character map loading case, we could in principle keep track during reflow of whether the document encounters any font fallbacks that are affected by cmaps not being available, set a flag on the document (or the prescontext or something), and then only reflow "potentially-affected" documents when the loading completes. However, it would involve a lot of extra plumbing to connect the current prescontext to the code way down in gfx that performs font matching during textrun construction; currently we don't know at that level what document we're dealing with. But maybe we should look into that, if the global reflow is so disruptive.

(Worth noting that in general this occurs a maximum of once per session; once the character maps are fully loaded, it won't happen again.)

Flags: needinfo?(jfkthame)

Thanks. That's really helpful info.

It's pretty bad that reflow causes such nasty perf for a11y in this case, so I think we'll explore options for dealing with this in a11y first. Good to know there's another option if we truly can't fix this in a11y, though.

Set release status flags based on info from the regressing bug 1676966

This seems pretty bad. Anything we can do to get this un-stalled?

Flags: needinfo?(jteh)

Eitan, see comment 3. I thought I needinfoed you, but apparently I didn't. 😳

Flags: needinfo?(jteh) → needinfo?(eitan)

Jamie, did you make any progress? Did you talk with Eitan off-bugzilla or are you still waiting for his reply?

Is there a way to force this kind of reflow on a document, not just in session restore?

Flags: needinfo?(jfkthame)

Installing a new font in the OS (or removing one) while the browser is running should trigger something similar, I believe.

Flags: needinfo?(jfkthame)

Thanks! Not sure if I am reproducing this issue. But I definitely am able to reproduce a profile with multi-second jank with a similar stack as above.

Assignee: nobody → eitan
Flags: needinfo?(eitan)

This speeds up reflows on pages like view source that have a single
giant container. But this maybe slows down other cases? I'm not 100%
sure.

Pushed by eisaacson@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/af44502e176a Use TreeWalker to find accessible descendants. r=Jamie

Fixed. thank you!

Flags: needinfo?(eitan)
Pushed by eisaacson@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/87c6360e23c2 Use TreeWalker to find accessible descendants. r=Jamie
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch
Regressions: 1716860
No longer regressions: 1716860
Flags: needinfo?(jteh) → in-testsuite+
Performance Impact: --- → P2
Whiteboard: [qf:p2:responsiveness]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: