[meta] Laggy text input on Twitter direct messages (we flush layout too often, and reflow is slow)

RESOLVED FIXED

Status

()

P3
normal
RESOLVED FIXED
2 years ago
4 months ago

People

(Reporter: overholt, Unassigned)

Tracking

(Depends on: 1 bug, Blocks: 1 bug, {meta, perf})

Trunk
meta, perf
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:meta])

Attachments

(2 attachments)

While in a conversation on Twitter with paint flashing on, I see a lot of painting areas earlier in the conversation when typing. After a number of messages back and forth (maybe 15?), input latency (i.e. letters I type showing up on the screen) gets bad; maybe this is related?
Ehsan noticed Twitter doing some sort of updating while we were simulating a conversation so maybe it's not entirely Gecko's fault.
Did you get a profile for this?
Flags: needinfo?(overholt)
Ehsan did.
Flags: needinfo?(overholt) → needinfo?(ehsan)

Comment 4

2 years ago
I lost the link to the profile unfortunately, but it wasn't showing anything interesting.  Although I captured yet another profile later that did show a jank but unfortunately the updater had switched up my build under the hood and the symbols were corrupted so the profile was unreadable.  I suggest someone should spend some time reprofiling, but it looked like there was some expensive synchronous layout flush involved.
Flags: needinfo?(ehsan)
Here's a profile I got just now: http://bit.ly/2sBRXNT. Ehsan said it's slow reflow with flexbox and tables.
Blocks: 1341750
Component: Layout: Web Painting → Layout
Summary: Overpainting while typing in Twitter direct messages → Laggy text input on Twitter direct messages
Flags: needinfo?(dholbert)
Flags: needinfo?(dbaron)
Hmm, but we're flushing when handling some IME stuff. Could we not flush there?
Flags: needinfo?(masayuki)
Though, reflow is so super slow that any layout flush at any point would make the input handling janky.

Comment 8

2 years ago
http://searchfox.org/mozilla-central/rev/714606a8145636d93b116943d3a65a6a49d2acf8/editor/libeditor/HTMLAnonymousNodeEditor.cpp#319 could probably be changed to keep track of whether the table editing / absolute position editing UI is currently displayed for the editor instead and avoid flushing if it isn't.
(In reply to Olli Pettay [:smaug] from comment #6)
> Hmm, but we're flushing when handling some IME stuff. Could we not flush
> there?
(In reply to Olli Pettay [:smaug] from comment #7)
> Though, reflow is so super slow that any layout flush at any point would
> make the input handling janky.

When some text is inputted, we need to notify main process of new text, selection range and character position around new selection. For computing the character positions, ContentEventHandler needs to flush pending layout (painting is not necessary, though, reflow is needed). If we do it a lot during an input event handling, we should stop doing it except the last one. However, if it's not, I have no idea. If we put off it, IME may show its windows, e.g., navigation window, suggest window and candidate window, at odd position. That makes users feel buggy.

The worst scenario is, web apps modify DOM nodes after every text input asynchronously. Then, at least two reflow are required.
Flags: needinfo?(masayuki)
I wonder how bad it would be to notify parent process once per animation frame tick?
In the profile is looks like we end up handling two or three input events between refresh driver ticks and all them do slow flush.
Flags: needinfo?(masayuki)
Posted file testcase
(In reply to Olli Pettay [:smaug] from comment #10)
> I wonder how bad it would be to notify parent process once per animation
> frame tick?
> In the profile is looks like we end up handling two or three input events
> between refresh driver ticks and all them do slow flush.

Honestly, not sure. I think that IMM, TSF and GTK are fine because IME windows are adjust with every new notification. But on macOS, we cannot notify IME of layout change. On the other hand, in e10s mode, we already notify requested character position before handling focused remote process. So, I guess that it's no problem.

The trigger of kicking ContentEventHandler is IMEContentObserver::TryToFlushPendingNotifications() which is called by PresShell::HandleEventInternal() after dispatching a DOM event.
https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla18IMEContentObserver30TryToFlushPendingNotificationsEv&redirect=false

I don't know how to call it from refresh driver, though.
Flags: needinfo?(masayuki)
This is just showing were we flush and shouldn't.
Attachment #8880349 - Attachment description: this show 4 places were we flush and shouldn't if possible. → this show 4 places where we flush and shouldn't if possible.
Ah, but we cannot remove the call of IMEContentObserver::TryToFlushPendingNotifications() in PresShell::HandleEventInternal() if it's in the main process because if it's in chrome process, we don't use ContentCache. Therefore, any query events will fail instead of return dummy (cached) values.
We should basically try to move all the flushing here to happen around tick time, I think, but
ensure that IME handling doesn't regress.
Would that make sense from IME point of view?
That patch is just a simple thing showing where the issues are, not any proposal how they should be fixed.
(In reply to Olli Pettay [:smaug] from comment #15)
> We should basically try to move all the flushing here to happen around tick
> time, I think, but
> ensure that IME handling doesn't regress.
> Would that make sense from IME point of view?

Yes, of course.

But not using ContentCache when main process content has focus may be a problem. That means that we may need to different handling if it's in main process or content process. If we could use ContentCache even in main process, IME related code won't need to check if it's in main process. But I'm still not sure, where should ContentCache be.
In other words, when main process's content has focus, IME handling in widget still depends on synchronous content update for every input event.
That is probably fine. We could special case child process handling to be async.
(the patch shows also an issue in editor and in selection handling)
Depends on: 1375484
Depends on: 1375487
Depends on: 1375491
Filed bugs about the flushes, reflow part needs to be still analyzed.
Summary: Laggy text input on Twitter direct messages → [meta] Laggy text input on Twitter direct messages (we flush layout too often, and reflow is slow)

Updated

2 years ago
Depends on: 1375493
Marking [qf] because smaug thinks this is indicative of general input latency issues.
Whiteboard: [qf]

Updated

2 years ago
Whiteboard: [qf] → [qf:meta]
Once bug 1375491 is in Nightlies, could you Andrew retest, and possibly get a new profile if there are still issues.
Flags: needinfo?(overholt)
Depends on: 1377253
I spun off bug 1377253 for improving the flex layout time here.
Flags: needinfo?(dholbert)
(In reply to Andrew Overholt [:overholt] from comment #21)
> Marking [qf] because smaug thinks this is indicative of general input
> latency issues.

it seems much better now!
Flags: needinfo?(overholt)
Priority: -- → P3

Updated

11 months ago
See Also: → bug 1253955
I think we can call this fixed -- bug 1375491 fixed most of it (reducing the number of reflows), and per bug 1377253 comment 11, the reflows that we are doing seem to be pretty short now.

If anyone's still seeing layout-related jank when typing in a twitter Direct Message, though, then please reopen bug 1377253 (ideally with a perf-html.io performance profile and info that might help with reproducing, e.g. whether backscroll/emoji/images/whatever seem to be related.)
Status: NEW → RESOLVED
Last Resolved: 4 months ago
Flags: needinfo?(dbaron)
Resolution: --- → FIXED
Keywords: meta
You need to log in before you can comment on or make changes to this bug.