Add an event queue to nsThread for input events and annotate input IPC messages to use it

NEW
Assigned to

Status

()

Core
DOM: Events
P1
normal
3 months ago
4 days ago

People

(Reporter: stone, Assigned: stone, NeedInfo)

Tracking

(Blocks: 2 bugs)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(6 attachments, 15 obsolete attachments)

2.54 KB, patch
smaug
: review+
smaug
: feedback+
Details | Diff | Splinter Review
16.58 KB, patch
smaug
: review+
Details | Diff | Splinter Review
3.19 KB, patch
smaug
: review+
Details | Diff | Splinter Review
17.53 KB, patch
smaug
: feedback+
kats
: feedback+
Details | Diff | Splinter Review
4.90 KB, patch
stone
: feedback+
Details | Diff | Splinter Review
48.09 KB, patch
Details | Diff | Splinter Review
Comment hidden (empty)
(Assignee)

Updated

3 months ago
Assignee: nobody → sshih
Priority: -- → P1
(Assignee)

Comment 1

3 months ago
Created attachment 8851955 [details] [diff] [review]
input_events_queue_wip.patch

I think putting input events in a higher queue could improve the UI responsiveness. However, the tradeoff is we may delay the others when users continuously operate the devices and let OS fires input events quickly.

Still unclear to me that how to schedule high priority events, input events, and normal priority events. Just follow current logic to handle input events after high priority events in the draft patch.
(Assignee)

Comment 2

3 months ago
Found that there are some problems about can't find actor for input events if we handle input events early. I think it's because we try to dispatch input events to PBrowserChild actor while the event to create it is still in the normal priority queue.

Wondering should be also put those events which are used to create actors or something like initializations in higher priority queue than input events?
(Assignee)

Comment 3

3 months ago
oops, it may be hard to define what events are related to initialization of content and may be risky to change the event order. Another possible solution is only dispatching input events to content after PBrowserContent actor is created.

Updated

3 months ago
Flags: needinfo?(wmccloskey)
(Assignee)

Comment 4

3 months ago
Did some tests on Linux are found
1. The interval of putting vsync event in the queue isn't always 16.6ms. (It is about 16.6ms in average but sometimes the interval is much more than 16.6ms. e.g. navigating to other pages)
2. In general, up to 4 input events are put in the queue in a frame. (It depends on #1)
3. The interval of putting input event in the queue varied (from 1 to hundreds ms)

Thinking about using the time of a frame as the time budget to process events. We should process vsync event immediately when run out of the budget.

For better user responsiveness, we should process input events quickly. However, consume normal events after consuming all input events might not be a good idea since the normal events are delayed when the user continuously moves the mouse or presses the keyboard. Then we won't have a chance to coalesce repeated input events.

To prevent the input events delays after user stop controlling input devices, maybe we can define a deadline (e.g. 5 frames) that all input events should be consumed.

Working on a wip and plan to test it with some pages with poor performance to see if anything should be taken into consideration.

Comment 5

3 months ago
We'll want to try out different options. One possibility is (which I think is close to what blink does) process vsync, then normal events, then input events and then again vsync
I don't see reason to process input events asap, like vsync, but during the nearest possible frame - but doesn't matter when there. If we want to coalesce events more, processing as late as possible might be good.
(Assignee)

Comment 6

3 months ago
Wondering do we have some test cases that might be good for testing different options?

Comment 7

3 months ago
I'm not aware.

We may have enough telemetry probes to check whether events are handled faster or so.
(Assignee)

Comment 8

3 months ago
Created attachment 8857419 [details] [diff] [review]
bug-1351148-wip.patch

Using a preference 'content.main_thread.prioritization.option' to control different options to change the priority of events.

0: no event prioritization.
1: handle (limited) input events in each frame.
2: handle input events after normal events.

Thinking about how to test the impacts of different options.
Attachment #8851955 - Attachment is obsolete: true
(Assignee)

Comment 9

2 months ago
(In reply to Ming-Chou Shih [:stone] from comment #8)
> Created attachment 8857419 [details] [diff] [review]
> bug-1351148-wip.patch
> 
> Using a preference 'content.main_thread.prioritization.option' to control
> different options to change the priority of events.
> 
> 0: no event prioritization.
> 1: handle (limited) input events in each frame.
> 2: handle input events after normal events.
> 
> Thinking about how to test the impacts of different options.

Using Hasal test framework to tested three options with a test case which loads a big google doc and simulates user page downs. Observed that with the preference=2, UA doesn't response in the beginning and then pages down quickly after few seconds.
(Assignee)

Comment 10

2 months ago
Simulated heavy loads by creating busy loop in a keydown listener and observed that
The 2nd solution could get the shortest response time to user inputs.
The 3rd solution could coalesce the maximum repeated events.

Wondering that is there any other option we can try on it?
(Assignee)

Comment 11

2 months ago
(In reply to Ming-Chou Shih [:stone] from comment #10)
> Simulated heavy loads by creating busy loop in a keydown listener and
> observed that
> The 2nd solution could get the shortest response time to user inputs.
> The 3rd solution could coalesce the maximum repeated events.
> 
> Wondering that is there any other option we can try on it?

Found there are some discussions about user acceptable delay. Wondering do we have such data? Maybe we can try the option that uses the longest acceptable delay as the deadline to process input events.

Also wondering should I refine the patch and turn on it on nightly to collect more feedbacks?

https://ux.stackexchange.com/questions/82485/whats-the-longest-acceptable-delay-before-an-interaction-starts-to-feel-unnatur
https://www.nngroup.com/articles/powers-of-10-time-scales-in-ux/
https://en.wikipedia.org/wiki/Input_lag

Comment 12

2 months ago
(In reply to Ming-Chou Shih [:stone] from comment #8)
> Created attachment 8857419 [details] [diff] [review]
> bug-1351148-wip.patch
> 
> Using a preference 'content.main_thread.prioritization.option' to control
> different options to change the priority of events.
> 
> 0: no event prioritization.
> 1: handle (limited) input events in each frame.
> 2: handle input events after normal events.

Could you explain these a bit more?


I'd like to see data about processing input events right after vsync and right before.
Blink is moving to the model where input events are processed right before vsync.
(Assignee)

Comment 13

2 months ago
(In reply to Olli Pettay [:smaug] from comment #12)
> > 0: no event prioritization.
> > 1: handle (limited) input events in each frame.
> > 2: handle input events after normal events.
> 
> Could you explain these a bit more?
The idea of option 1 is let input events doesn't delay more than N frames. After handling the vsync evnet, it dynamically calculated the number of input events that should be consumed in this frame (all queued input events number / N frames). It first consumes the vsync event, then few input events, then normal events.

Option 2 is simply handling input events after all normal events.

Now I had another option (addressed in comment #10) that consumes one input event per M frames. The concept is that handling input events before the user feels no response of UA.

> 
> I'd like to see data about processing input events right after vsync and
> right before.
> Blink is moving to the model where input events are processed right before
> vsync.
I'll add this option and do some tests. But I afraid it's hard to figure out a good test case that can show the difference of these options.
(Assignee)

Comment 14

2 months ago
Maybe we could check which option get the minimum difference between it's generated and processed.
(Assignee)

Comment 15

2 months ago
(In reply to Ming-Chou Shih [:stone] from comment #14)
> Maybe we could check which option get the minimum difference between it's
> generated and processed.
I mean minimum difference of timestamp.

Comment 16

2 months ago
(2) sounds like a case we definitely don't want.
vsync handling should be very prioritized, and input events probably a bit less.


Btw, currently in parent process we process input events up to 10 ms before processing Gecko events.
http://searchfox.org/mozilla-central/rev/6e1c138a06a80f2bb3167d9dac050cc0fd4fb39e/widget/nsBaseAppShell.cpp#248,272
That code is very old and may not make sense, but something to keep in mind.
(Assignee)

Comment 17

2 months ago
0: no event prioritization.
1: handle (limited) input events in each frame.
2: handle input events after normal events.
3: handle input events per M frames.
4: handle input events before and after vsync events.

(Manually) tested with 200 pages google doc and hold page down key, the maximum delay of input events are
1: 9.431062s
3: 37.141377s (can accumulate more delays)
4: 20.734358s

(3) is too weak to the case that user produces mass input events.

(Manually) tested navigating a web page e.g. http://www.bbc.com/
No obvious difference between (1) and (4).

Plan to use automated tests to compare (1) and (4)

Comment 18

2 months ago
(4) should be two different cases.
Either right before or right after vsync

Not sure google doc case is particularly good for testing.
We need some average times from normal browsing.

Comment 19

2 months ago
Or perhaps especially average times for typing, say, on Facebook.
(Assignee)

Comment 20

2 months ago
4: Process input event before vsync event
5: Process input event after vsync event

Manually test navigating http://www.bbc.com/news and get
(0) Max input delay 0.024296s, Average input delay 0.001853s
(1) Max input delay 0.542883s, Average input delay 0.114777s
(4) Max input delay 0.557435s, Average input delay 0.143875s
(5) Max input delay 0.549214s, Average input delay 0.131457s

Automatically test navigating http://www.bbc.com/news (10 times) and get
(0) Max input delay 0.548622s, Average input delay 0.007708s, Average time to finish test: 20716.326530612245
(1) Max input delay 2.347677s, Average input delay 0.173236s, Average time to finish test: 21363.265306122448
(4) Max input delay 2.747682s, Average input delay 0.194134s, Average time to finish test: 22322.448979591842
(5) Max input delay 2.464956s, Average input delay 0.201493s, Average time to finish test: 21057.142857142859

Comment 21

2 months ago
Navigation isn't perhaps the most interesting case, but how input events are handled once a page has been loaded. Say, typing to some input/textarea element.
(Assignee)

Comment 22

2 months ago
(In reply to Olli Pettay [:smaug] from comment #21)
> Navigation isn't perhaps the most interesting case, but how input events are
> handled once a page has been loaded. Say, typing to some input/textarea
> element.

I'm working on it but it may take a while to create an automated test.
(Assignee)

Comment 23

2 months ago
All these options dispatch input events with a different order to the vsync events. They are impacted when the vsync events fired to content are delayed. Found that the duration of vsync event varies on Linux and may delay for few seconds in some use cases.
(Assignee)

Comment 24

2 months ago
(Automatically) tested typing some words in google translator.
(0) Max input delay 0.082386s, Average input delay 0.00913s, Average time to finish test: 17767.346938775s
(1) Max input delay 0.636969s, Average input delay 0.24956s, Average time to finish test: 15432.653061224s
(4) Max input delay 8.811999s, Average input delay 4.62573s, Average time to finish test: 20334.693877551s
(5) Max input delay 8.856763s, Average input delay 4.49851s, Average time to finish test: 20469.387755102s
(Assignee)

Comment 25

2 months ago
(In reply to Ming-Chou Shih [:stone] from comment #23)
> All these options dispatch input events with a different order to the vsync
> events. They are impacted when the vsync events fired to content are
> delayed. Found that the duration of vsync event varies on Linux and may
> delay for few seconds in some use cases.

Need to check if this happens only on Linux or it also happens on other platforms. We may need to fix it or consider other option that doesn't rely on vsync event.
(Assignee)

Comment 26

2 months ago
(In reply to Ming-Chou Shih [:stone] from comment #24)
> (Automatically) tested typing some words in google translator.
> (0) Max input delay 0.082386s, Average input delay 0.00913s, Average time to
> finish test: 17767.346938775s
> (1) Max input delay 0.636969s, Average input delay 0.24956s, Average time to
> finish test: 15432.653061224s
> (4) Max input delay 8.811999s, Average input delay 4.62573s, Average time to
> finish test: 20334.693877551s
> (5) Max input delay 8.856763s, Average input delay 4.49851s, Average time to
> finish test: 20469.387755102s
Looks like consuming only one input event in a frame is not able to consume all input events on time. Plan to profile the intervals of real input events to double check it.

Comment 27

2 months ago
What are we btw measuring here. I think we need to measure from input event to the next paint.
(Assignee)

Comment 28

2 months ago
6: Consume all input events before vsync
7: Consume all input events after vsync

Add testing results of (6) and (7) and measure the last input event to the next paint for comparison.
(0) Max input delay 0.082386s, Average input delay 0.00913s, Time to finish: 17767.346938775s, Last input to next sync: 0.053598s
(1) Max input delay 0.636969s, Average input delay 0.24956s, Time to finish: 15432.653061224s, Last input to next sync: 0.003019s
(4) Max input delay 8.811999s, Average input delay 4.62573s, Time to finish: 20334.693877551s, Last input to next sync: 0.015905s
(5) Max input delay 8.856763s, Average input delay 4.49851s, Time to finish: 20469.387755102s, Last input to next sync: 1.183101s
(6) Max input delay 0.059571s, Average input delay 0.00193s, Time to finish: 16548.979591836s, Last input to next sync: 0.000009s
(7) Max input delay 0.060438s, Average input delay 0.00199s, Time to finish: 16553.061224489s, Last input to next sync: 1.200592s

Comment 29

2 months ago
What is the difference between 4 and 6 and between 5 and 7? Is it that 4 and 5 are just about processing one event?
(Assignee)

Comment 30

2 months ago
(In reply to Olli Pettay [:smaug] from comment #29)
> What is the difference between 4 and 6 and between 5 and 7? Is it that 4 and
> 5 are just about processing one event?
Yes. (4) and (5) only process one input event per frame.
The units on these numbers is seconds?  So max input delay of (6) is ~60ms?  And with no prioritization in case (0) max input delay is ~80ms?
(Assignee)

Comment 32

2 months ago
(In reply to Ben Kelly [not reviewing due to deadline][:bkelly] from comment #31)
> The units on these numbers is seconds?  So max input delay of (6) is ~60ms? 
> And with no prioritization in case (0) max input delay is ~80ms?

Yes. It's the timestamp when processing the input event on the content process - the timestamp when it's created on parent process.

Updated

2 months ago
Blocks: 1361067
(Assignee)

Comment 33

2 months ago
In the case of typing lots of words, handling too many input events in a frame may cause visually not smooth.

Comment 34

2 months ago
Some sources say that 335 characters per minute is what a professional typist may hit.
That is 5.6 per second, and 0.09 per animation frame. So I don't think we should be worried about handling too many key events
Mousemoves and touchmoves are the very high frequency input events.
(In reply to Olli Pettay [:smaug] from comment #34)
> Some sources say that 335 characters per minute is what a professional
> typist may hit.
> That is 5.6 per second, and 0.09 per animation frame.

This seems kind of low; maybe 335 chars/min (~= 60 words/min) for a sustained amount of time, but it's certainly possible to type much faster, more in the range of 90 words/min ~= 450 chars/min, or even higher.

Comment 36

2 months ago
That is still just 0.125 per frame, in other words very very low number comparing to possible mousemoves or touchmoves.
(Assignee)

Comment 37

2 months ago
Fixed some bugs of the implementation and re-tested them with the test case of typing some text in a textarea.
0: no event prioritization.
1: Consume all input events before the next N (=3) frames.
6: Consume all input events before vsync
7: Consume all input events after vsync
8: Consume all input events a bit earlier before vsync

0: max input delay=0.080807, average input delay=0.002954, average of last input to next sync=0.022376
1: max input delay=0.061146, average input delay=0.002286, average of last input to next sync=0.000073
6: max input delay=0.032695, average input delay=0.001306, average of last input to next sync=0.000047
7: max input delay=0.467425, average input delay=0.136106, average of last input to next sync=0.001153
8: max input delay=0.060646, average input delay=0.009806, average of last input to next sync=0.000270

Comment 38

2 months ago
Even mousemoves seem to be quite rare in child process (we do compress them in IPC layer).
I get usually only max 2 mousemoves per frame on linux.

data:text/html,<script>var c = 0; window.onmousemove = function() { ++c; }; function f() { if (c) {console.log(c); c = 0;} window.requestAnimationFrame(f); }; f(); </script>

We don't seem to compress touchmoves, so might be wort to test them a bit.

Comment 39

2 months ago
On windows I seem to get up to 4 mousemoves per frame, using laptop's touchpad.
(Assignee)

Comment 40

2 months ago
Found that it's caused by a bug in my implementation. When input events are added after vsync, then we may not receive vsync nor consume input events. Fixed it by setup a deadline to process input events when adding input events in the queue and checking if there isn't a deadline.

If we always consume input events in the current frame or in the next frame, we won't accumulate too many input events. Wondering should I worry about the case that some event takes a long time?

I'm doing more tests to observe if there are other factors should be taken into consideration.
(Assignee)

Comment 41

2 months ago
Found some problems when reviewing the testing results and did more analysis.

We won't always get the next vsync event immediately after handling input events. I think it's probably because there are no observers in nsRefreshDriver and the active timer is null. Moreover, we don't know which input event will trigger repaint and flush. It makes us harder to guess how earlier we should start to handle input events without delay the vsync event.

In the case of user triggers some input events while the browser isn't busy on something, we can't process input events before or after vsync event because there is no scheduled event. If we handle it immediately, then there are no obvious difference between the testing results of all options (when testing with normal use cases).

Here are the new test results of typing some words in a text area.
Option 0 (no prioritization)
Option 1 (handle input events after vsync. handle all queued input events evenly in the next N frames (N=5))
Option 6 (handle input events right before vsync)
Option 7 (handle input events right after vsync)
Option 8 (handle input events a little bit earlier before the next vsync)
0: max input delay=0.0441, average input delay=0.0021, average of last input to next vsync=0.0051
1: max input delay=0.0788, average input delay=0.0133, average of last input to next vsync=0.0116
6: max input delay=0.0925, average input delay=0.0183, average of last input to next vsync=0.0010
7: max input delay=0.0428, average input delay=0.0090, average of last input to next vsync=0.0036
8: max input delay=0.1462, average input delay=0.0137, average of last input to next vsync=0.0054

About option 6
Handling input events also take time and delay the vsync event.

About option 8
The time duration between last input to the next vsync event (0.0054s) is longer than options 6 and 7. With an inaccurate assumption of cost time for input events, it will delay the input event to the next frame and induce more delay.

Comment 42

2 months ago
Yes, we get vsync only if it has been requested. So only if there has been some layout/styling changes, or some animation happening or explicit requestAnimationFrame calls.

I would expect option 6 or 8 (probably 8) to lead to best results, and in case refresh driver isn't active, we could just process input events as soon as possible.

Comment 43

2 months ago
Could you test also wheel scrolling? 
I'm particularly interested in the Google Spreadsheets case, link is in
https://bugzilla.mozilla.org/show_bug.cgi?id=1338802#c0

Does option 6 or 8 affect positively to that? I'm hoping they would cause more wheel events to be merged to one.
Flags: needinfo?(sshih)

Comment 44

2 months ago
(and by testing I mean just some manual testing whether scrolling feels different.)
(Assignee)

Comment 45

2 months ago
(In reply to Olli Pettay [:smaug] from comment #44)
> (and by testing I mean just some manual testing whether scrolling feels
> different.)
Will scroll on the google spreadsheet slow? I tested with touchpad, scrolled the page with mouse cursor focus on a field of comment column. I can't find an obvious difference between option 6 and no event prioritization. Not sure if I miss anything when testing.
Flags: needinfo?(sshih)
(Assignee)

Comment 46

a month ago
Tested and profiled keyboard events for analysis.

Option 0 (no prioritization)
Option 6 (handle input events right before vsync)
Option 7 (handle input events right after vsync)
Option 8 (handle input events a little bit earlier before the next vsync)

Tested with simple use case (typing some words in a text area)
0: max input delay=0.0843s, average input delay=0.0036s, last input to next vsync=0.0066s, average interval of vsync=0.0327s
6: max input delay=0.0844s, average input delay=0.0033s, last input to next vsync=0.0055s, average interval of vsync=0.0318s
7: max input delay=0.0873s, average input delay=0.0032s, last input to next vsync=0.0065s, average interval of vsync=0.0311s
8: max input delay=0.0618s, average input delay=0.0021s, last input to next vsync=0.0062s, average interval of vsync=0.0301s
Option 6 gets the minimal duration from the last input to the next vsync event.
Option 8 gets the minimal input delays

Tested with the test case that triggers a lot of input events while UA is busy in rendering the page.
0: max input delay=0.3739s, average input delay=0.1363s, last input to next vsync=0.0123s, average interval of vsync=0.0737s
6: max input delay=0.3246s, average input delay=0.0918s, last input to next vsync=0.0000s, average interval of vsync=0.0806s
7: max input delay=0.4184s, average input delay=0.1232s, last input to next vsync=0.0107s, average interval of vsync=0.0267s
8: max input delay=0.4019s, average input delay=0.1095s, last input to next vsync=0.0166s, average interval of vsync=0.0383s
Option 6 gets the minimal duration from the last input to the next vsync event.
Option 6 and 8 get less input delays
Option 7 and 8 get less interval of vsync events
(Assignee)

Comment 47

a month ago
Revised option 8 with minimal and maximum values of reserved time for executing input events before each vsync event. Had a try and found some test failures. Analyzing.
(Assignee)

Comment 48

a month ago
We have different mechanisms to asynchronously synthesize input events and use some runnables which are supposed to be executed before or after the input events.

e.g.
browser_bug629172.js dispatches an asynchronous message to chrome process. Then the message is forwarded to content process. The message adds an input event listener to drive the test. Then the test case synthesizes input events to chrome process. Enable event prioritization changes the order of the asynchronous message and the input event, which causes the test failed.
Yeah, that isn't surprising. I would expect this work to break some tests, which then need to be fixed.

Updated

a month ago
Blocks: 1366881
Stone told me ha has a WIP quite ready for feedback. He is investigating the try failures to see if something major missed in his patches. Once he's done with that, he will be asking for feedback from more peers.
One thing we need to figure out is how to detect idle time when we have pending input events, since there probably is some idle time after refresh driver tick but before input events will be handled.

Also, a recent idle handling issue, bug 1368286, might be relevant here too - need to ensure HasPendingEvents works as expected.

Happy to help here.
(Assignee)

Comment 52

19 days ago
Created attachment 8875243 [details] [diff] [review]
WIP Part1 - Add include header to TimerThread.h to fix compile errors
Attachment #8857419 - Attachment is obsolete: true
(Assignee)

Comment 53

19 days ago
Created attachment 8875245 [details] [diff] [review]
WIP Part2 - Add a priority queue for input events
(Assignee)

Comment 54

19 days ago
Created attachment 8875248 [details] [diff] [review]
WIP Part3 - Synthesize native input events with priority

The test helper_touch_action_regions.html uses nsDOMWindowUtils to synthesize native input events and creates some runnables to continue the test. It expects the runnables to synthesize native input events are processed first, then the runnables to continue the test, and finally the input events forwarded from chrome to content. Enabling event prioritization may change the execution order. Wraps those runnables to synthesize input events as priority=input and dispatches those runnables to continue the test with priority=input to make sure the execution order is as expected.
(Assignee)

Comment 55

19 days ago
Created attachment 8875250 [details] [diff] [review]
WIP Part4 - Revise those test cases that have some tasks have to be processed before or after the synthesized key events
(Assignee)

Comment 56

19 days ago
Created attachment 8875251 [details] [diff] [review]
WIP Part5 - Revise test_bug1261673.html to continue the test after processing the wheel event.

Enabling event prioritization on content process may change the order of synthesized input event and the posted message. Instead of posting a message, uses the event listener to resolve it.
(Assignee)

Comment 57

19 days ago
Created attachment 8875253 [details] [diff] [review]
WIP Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

When the event prioritization is enabled on content process, we'll reserve some time to process input events in each frame. In that case, the synthesized input events delay the execution of normal events a lot and cause the test timeout.
(Assignee)

Comment 58

19 days ago
Created attachment 8875263 [details] [diff] [review]
WIP Part2 - Add a priority queue for input events
Attachment #8875245 - Attachment is obsolete: true
(Assignee)

Comment 59

19 days ago
(In reply to Olli Pettay [:smaug] from comment #51)
> One thing we need to figure out is how to detect idle time when we have
> pending input events, since there probably is some idle time after refresh
> driver tick but before input events will be handled.
> 
> Also, a recent idle handling issue, bug 1368286, might be relevant here too
> - need to ensure HasPendingEvents works as expected.
> 
> Happy to help here.

Thanks for the information. I'll rebase my patches and check the patch of 1368286.
It seems like if we have input events maybe we shouldn't process any idle runnables.

Also, see this change chrome made:

https://bugs.chromium.org/p/chromium/issues/detail?id=653352
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #60)
> It seems like if we have input events maybe we shouldn't process any idle
> runnables.
Why? If we have say one input event pending and we're going to fire it at time tick - 1, we may have
15ms for nothing. 

In case there are no pending ticks, then input events should be handled before idle.
(In reply to Olli Pettay [:smaug] from comment #61)
> (In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #60)
> > It seems like if we have input events maybe we shouldn't process any idle
> > runnables.
> Why? If we have say one input event pending and we're going to fire it at
> time tick - 1, we may have
> 15ms for nothing. 

I guess it depends how much we trust idle callbacks to actually stay within their deadline bound.
(Assignee)

Comment 63

18 days ago
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #62)
> (In reply to Olli Pettay [:smaug] from comment #61)
> > (In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #60)
> > > It seems like if we have input events maybe we shouldn't process any idle
> > > runnables.
> > Why? If we have say one input event pending and we're going to fire it at
> > time tick - 1, we may have
> > 15ms for nothing. 
> 
> I guess it depends how much we trust idle callbacks to actually stay within
> their deadline bound.

Maybe we could use time tick - 1 as soft criteria to process pending input events. But we also process input events if there are no other pending events, then we process idle callbacks. So that we still process all pending events before idle callbacks as current behavior.
(Assignee)

Comment 64

18 days ago
Created attachment 8875653 [details] [diff] [review]
WIP Part2 - Add a priority queue for input events
Attachment #8875263 - Attachment is obsolete: true
(Assignee)

Comment 65

17 days ago
Created attachment 8876086 [details] [diff] [review]
WIP Part2 - Add a priority queue for input events
Attachment #8875653 - Attachment is obsolete: true
(Assignee)

Updated

14 days ago
Attachment #8875653 - Flags: feedback?(bugs)
(Assignee)

Updated

14 days ago
Attachment #8875653 - Flags: feedback?(bugs)
(Assignee)

Updated

14 days ago
Attachment #8876086 - Flags: feedback?(bugs)
(Assignee)

Comment 66

12 days ago
Created attachment 8877380 [details] [diff] [review]
WIP Part4 - Revise those test cases that have some tasks have to be processed before or after the synthesized key events

Fixed more failures
Attachment #8875250 - Attachment is obsolete: true
(Assignee)

Comment 67

12 days ago
Created attachment 8877381 [details] [diff] [review]
WIP Part3 - Synthesize native input events with priority

Rebased
Attachment #8875248 - Attachment is obsolete: true
(Assignee)

Comment 68

12 days ago
Created attachment 8877382 [details] [diff] [review]
WIP Part1 - Add include header to TimerThread.h to fix compile errors

Rebased
Attachment #8875243 - Attachment is obsolete: true
(Assignee)

Comment 69

12 days ago
Comment on attachment 8877382 [details] [diff] [review]
WIP Part1 - Add include header to TimerThread.h to fix compile errors

This patch fixes the compile errors of missing the include header.
Attachment #8877382 - Flags: feedback?(bugs)
(Assignee)

Comment 70

12 days ago
Created attachment 8877386 [details] [diff] [review]
WIP Part3 - Synthesize native input events with priority

The test helper_touch_action_regions.html uses nsDOMWindowUtils to synthesize native input events and creates some runnables to trigger the test. It expects the runnables which synthesize native input events are processed first, then the runnables to continue the test, and finally the input events are forwarded from chrome process to content process. Enabling event prioritization may change the execution order. Wraps those runnables to synthesize native input events as priority=input and dispatches those runnables to continue the test with priority=input to make sure the execution order is as expected.
Attachment #8877381 - Attachment is obsolete: true
(Assignee)

Updated

12 days ago
Attachment #8877386 - Flags: feedback?(bugs)
(Assignee)

Comment 71

12 days ago
Comment on attachment 8877380 [details] [diff] [review]
WIP Part4 - Revise those test cases that have some tasks have to be processed before or after the synthesized key events

This patch revises the test cases that synthesize input events to chrome process and spawn a task on the content process to check the element attributes. Enabling event prioritization may change the order of the input events that are forwarded from chrome to content and the spawned tasks to check testing results.
Attachment #8877380 - Flags: feedback?(bugs)
(Assignee)

Comment 72

12 days ago
Comment on attachment 8875251 [details] [diff] [review]
WIP Part5 - Revise test_bug1261673.html to continue the test after processing the wheel event.

Enabling event prioritization on content process may change the order of synthesized input event and the posted message. Instead of posting a message, uses the event listener to resolve it.
Attachment #8875251 - Flags: feedback?(bugs)
(Assignee)

Comment 73

12 days ago
Created attachment 8877390 [details] [diff] [review]
WIP Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

When the event prioritization is enabled, we'll reserve some time to process input events in each frame. In that case, the synthesized input events delay the execution of normal events a lot and cause the test timeout.
Attachment #8875253 - Attachment is obsolete: true
Attachment #8877390 - Flags: feedback?(bugs)
Ehsan and I were testing input latency on Twitter direct messages and he was wondering if your patches help here, Stone. I filed the overpainting I'm seeing there as bug 1373023.
Flags: needinfo?(sshih)
I'll try to get to this tomorrow.
(Assignee)

Comment 76

11 days ago
(In reply to Andrew Overholt [:overholt] from comment #74)
> Ehsan and I were testing input latency on Twitter direct messages and he was
> wondering if your patches help here, Stone. I filed the overpainting I'm
> seeing there as bug 1373023.

Had a try without my patches but can not find an obvious input latency. Wondering if any steps I missed or it's because of different hardware?
Flags: needinfo?(sshih)

Updated

11 days ago
Attachment #8875251 - Flags: feedback?(bugs) → feedback+
Comment on attachment 8875251 [details] [diff] [review]
WIP Part5 - Revise test_bug1261673.html to continue the test after processing the wheel event.

I guess r+
Attachment #8875251 - Flags: review+
(In reply to Ming-Chou Shih [:stone] from comment #76)
> (In reply to Andrew Overholt [:overholt] from comment #74)
> > Ehsan and I were testing input latency on Twitter direct messages and he was
> > wondering if your patches help here, Stone. I filed the overpainting I'm
> > seeing there as bug 1373023.
> 
> Had a try without my patches but can not find an obvious input latency.
> Wondering if any steps I missed or it's because of different hardware?

Perhaps. Can you point me at a Windows (preferably PGO, non-debug) build with your patches so I can try it locally?
Flags: needinfo?(sshih)
Comment on attachment 8877390 [details] [diff] [review]
WIP Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

Maybe someone more familiar could give feedback, or even review this.

One thing to remember that Promise scheduling will change eventually, so
perhaps safer to do something like
setTimeout(resolve) in Promises, not just resolve().
Attachment #8877390 - Flags: feedback?(bugs) → feedback?(pbrosset)
Comment on attachment 8877382 [details] [diff] [review]
WIP Part1 - Add include header to TimerThread.h to fix compile errors

not sure why, but fine.
Attachment #8877382 - Flags: feedback?(bugs) → review+
Comment on attachment 8877380 [details] [diff] [review]
WIP Part4 - Revise those test cases that have some tasks have to be processed before or after the synthesized key events

I don't understand the naming here. What is sync? I see a promise which is then resolved.
Would you rename the functions to something like
waitForNativeMouseEvent etc. With that, r+
Attachment #8877380 - Flags: feedback?(bugs) → review+
Comment on attachment 8876086 [details] [diff] [review]
WIP Part2 - Add a priority queue for input events


>+++ b/dom/ipc/PContent.ipdl
>@@ -326,19 +326,19 @@ both:
>     // type PopupIPCTabContext.  The parent checks that if the opener is a
>     // browser element, the context is also for a browser element.
>     //
>     // If |sameTabGroupAs| is non-zero, the new tab should go in the same
>     // TabGroup as |sameTabGroupAs|. This parameter should always be zero
>     // for PBrowser messages sent from the child to the parent.
>     //
>     // Keep the last 3 attributes in sync with GetProcessAttributes!
>-    async PBrowser(TabId tabId, TabId sameTabGroupAs,
>-                   IPCTabContext context, uint32_t chromeFlags,
>-                   ContentParentId cpId, bool isForBrowser);
>+    prio(high) async PBrowser(TabId tabId, TabId sameTabGroupAs,
>+                              IPCTabContext context, uint32_t chromeFlags,
>+                              ContentParentId cpId, bool isForBrowser);
This change has nothing to do with this bug, so please remove.


>@@ -417,16 +419,17 @@ TabChild::TabChild(nsIContentChild* aMan
>     }
>   };
> 
>   // preloaded TabChild should not be added to child map
>   if (mUniqueId) {
>     MOZ_ASSERT(NestedTabChildMap().find(mUniqueId) == NestedTabChildMap().end());
>     NestedTabChildMap()[mUniqueId] = this;
>   }
>+  nsThreadManager::get().EnableMainThreadEventPrioritization();
This looks quite odd place to call this.
Should be somewhere in ContentChild

>@@ -1335,16 +1338,17 @@ TabChild::RecvHandleTap(const GeckoConte
> {
>   nsCOMPtr<nsIPresShell> presShell = GetPresShell();
>   if (!presShell) {
>     return IPC_OK();
>   }
>   if (!presShell->GetPresContext()) {
>     return IPC_OK();
>   }
>+  AutoTimeDurationHelper helper;
It is error prone to add AutoTimeDurationHelper to each prio(input) implementation.
I would do the similar thing while handling input events in thread level

>+++ b/ipc/chromium/src/chrome/common/ipc_message.h
billm should give feedback to ipc internal stuff

>@@ -41,18 +41,20 @@ class Message : public Pickle {
> 
>   enum NestedLevel {
>     NOT_NESTED = 1,
>     NESTED_INSIDE_SYNC = 2,
>     NESTED_INSIDE_CPOW = 3
>   };
> 
>   enum PriorityValue {
>-    NORMAL_PRIORITY,
>-    HIGH_PRIORITY,
>+    NORMAL_PRIORITY = 0,
>+    INPUT_PRIORITY = 1,
>+    VSYNC_PRIORITY = 2,
>+    HIGH_PRIORITY = 3,
I don't understand what is VSYNC vs HIGH. Do we really need them both? I would expect high being enough
and keep its behavior as it is. Let's not make this bug more complicated than it is.



>+++ b/layout/base/nsRefreshDriver.cpp
>@@ -253,16 +253,26 @@ public:
>       return aDefault;
>     }
> 
>     idleEnd = idleEnd - TimeDuration::FromMilliseconds(
>       nsLayoutUtils::IdlePeriodDeadlineLimit());
>     return idleEnd < aDefault ? idleEnd : aDefault;
>   }
> 
>+  Maybe<TimeStamp> GetNextTickHint()
>+  {
>+    MOZ_ASSERT(NS_IsMainThread());
>+    if (LastTickSkippedAnyPaints()) {
>+      return Nothing();
>+    }
As bug 1373085 hints, LastTickSkippedAnyPaints() check is wrong.


>+    TimeStamp nextTick = MostRecentRefresh() + GetTimerRate();
>+    return nextTick < TimeStamp::Now() ? Nothing() : Some(nextTick);
>+  }
I think we could just always do this, and not do anything with LastTickSkippedAnyPaints...
but I haven't yet read how this value is used.

>+PrioritizedEventQueue::GetEvent(bool aMayWait, nsIRunnable** aEvent,
>+                                unsigned short* aPriority,
>+                                MutexAutoLock& aProofOfLock)
>+{
>+  MOZ_ASSERT(mIsReadyForPrioritization);
>+  bool retVal = false;
>+  do {
>+    // Always process high priority events first. Assuming there are only few
>+    // high priority events.
>+    retVal = mHighQueue->GetEvent(false, aEvent, aProofOfLock);
>+    if (*aEvent) {
>+      SetPriorityIfNotNull(aPriority, nsIRunnablePriority::PRIORITY_HIGH);
>+      return retVal;
>+    }
>+
>+    // Use mProcessVsyncQueueRunnable to prevent the vsync events from consuming
>+    // all cpu time on slow hardware and causing starvation.
>+    if (mProcessVsyncQueueRunnable) {
>+      mProcessVsyncQueueRunnable = false;
>+      retVal = mVsyncQueue->GetEvent(false, aEvent, aProofOfLock);
>+      MOZ_ASSERT(*aEvent);
>+      mHandleInputDeadline =
>+        EventStatistics::Get()
>+          .GetHandleInputDeadline(mInputQueue->Count(aProofOfLock));
How does this work? We take vsync from queue, but don't process it. Then we call GetHandleInputDeadline, which returns something related to the next tick, 
which is something we're just about to handle, since we have vsync. I think I'm missing something here


>+      return retVal;
>+    }
>+    mProcessVsyncQueueRunnable = mVsyncQueue->HasPendingEvent(aProofOfLock);
>+
>+    uint32_t pendingInputCount = mInputQueue->Count(aProofOfLock);
>+    if (pendingInputCount > 0 && TimeStamp::Now() > mHandleInputDeadline) {
...oh, deadline mean something else I thought. It is not how long to handle, but only that after it we should handle input events.
But I still don't understand why we call GetHandleInputDeadline just before handling vsync.
GetNextTickHint return almost always something very close to Now(). Maybe I'm missing something.


>+    // We don't want to wait if there are some high priority events in the
>+    // queue.
>+    bool reallyMayWait = aMayWait && !mProcessVsyncQueueRunnable &&
>+                         pendingInputCount == 0;
The comment talks about high prio, but check vsync

> nsThread::nsChainedEventQueue::GetEvent(bool aMayWait, nsIRunnable** aEvent,
>                                         unsigned short* aPriority,
>-                                        mozilla::MutexAutoLock& aProofOfLock)
>+                                        MutexAutoLock& aProofOfLock)
> {
>+  if (mSecondaryQueue && mSecondaryQueue->IsReadyForPrioritization()) {
I'm having trouble to see why we have still this vsync queue in nsThread::nsChainedEventQueue


>+nsThread::nsChainedEventQueue::PutEvent(already_AddRefed<nsIRunnable> aEvent,
>+                                        MutexAutoLock& aProofOfLock)
>+{
>+  if (mSecondaryQueue) {
>+    mSecondaryQueue.get()->PutEvent(Move(aEvent), aProofOfLock);
>+  } else {
>+    mNormalQueue->PutEvent(Move(aEvent), aProofOfLock);
>+  }
> }
... oh, secondary queue implements all the prioritization, and normal queue won't be used then at all.
Could we perhaps merge PrioritizedEventQueue to nsChainedEventQueue, and just not use input event queue if it isn't enabled for that thread?


Looks reasonable. I would simplify this a bit, and support only those prios we actually need here.
However, I don't understand GetHandleInputDeadline usage.
Attachment #8876086 - Flags: feedback?(bugs) → feedback+

Updated

10 days ago
Attachment #8877386 - Flags: feedback?(bugs) → feedback+
Comment on attachment 8877390 [details] [diff] [review]
WIP Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

Review of attachment 8877390 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, but I don't know this test or code very well. Alex wrote it, but he's on PTO until end of next week. So I'll flag jryans in the meantime. If this can wait for a little more than a week, I suggest flagging ochameau for review.
Attachment #8877390 - Flags: feedback?(pbrosset)
Attachment #8877390 - Flags: feedback?(jryans)
Attachment #8877390 - Flags: feedback+
(Assignee)

Comment 84

10 days ago
(In reply to Andrew Overholt [:overholt] from comment #78)
> (In reply to Ming-Chou Shih [:stone] from comment #76)
> > (In reply to Andrew Overholt [:overholt] from comment #74)
> > > Ehsan and I were testing input latency on Twitter direct messages and he was
> > > wondering if your patches help here, Stone. I filed the overpainting I'm
> > > seeing there as bug 1373023.
> > 
> > Had a try without my patches but can not find an obvious input latency.
> > Wondering if any steps I missed or it's because of different hardware?
> 
> Perhaps. Can you point me at a Windows (preferably PGO, non-debug) build
> with your patches so I can try it locally?

Please let me know if this doesn't work for you. Thanks.
https://treeherder.mozilla.org/#/jobs?repo=try&author=sshih@mozilla.com&selectedJob=107663461
Flags: needinfo?(sshih)
Comment on attachment 8877390 [details] [diff] [review]
WIP Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

Review of attachment 8877390 [details] [diff] [review]:
-----------------------------------------------------------------

Hmm, so this might be okay, but I don't really follow what the actual issue you encountered was and why these changes fix it.

Can you please add some comments to the test to explain why you've done what you have, so it's clear to next reader why the version is needed?
Attachment #8877390 - Flags: feedback?(jryans) → feedback-
(Assignee)

Comment 86

7 days ago
Created attachment 8879059 [details] [diff] [review]
Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

Added some comments and revised the issue addressed in comment#79
Attachment #8877390 - Attachment is obsolete: true
(Assignee)

Comment 87

7 days ago
Comment on attachment 8879059 [details] [diff] [review]
Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

I'd like to request your feedback if the comments are clear? Thanks.
Attachment #8879059 - Flags: feedback?(jryans)
(Assignee)

Comment 88

7 days ago
Created attachment 8879061 [details] [diff] [review]
Part2 - Add a priority queue for input events
Attachment #8876086 - Attachment is obsolete: true
(Assignee)

Comment 89

7 days ago
(In reply to Olli Pettay [:smaug] from comment #82)
> Comment on attachment 8876086 [details] [diff] [review]
> WIP Part2 - Add a priority queue for input events
> 
> 
> >+++ b/dom/ipc/PContent.ipdl
> >@@ -326,19 +326,19 @@ both:
> >     // type PopupIPCTabContext.  The parent checks that if the opener is a
> >     // browser element, the context is also for a browser element.
> >     //
> >     // If |sameTabGroupAs| is non-zero, the new tab should go in the same
> >     // TabGroup as |sameTabGroupAs|. This parameter should always be zero
> >     // for PBrowser messages sent from the child to the parent.
> >     //
> >     // Keep the last 3 attributes in sync with GetProcessAttributes!
> >-    async PBrowser(TabId tabId, TabId sameTabGroupAs,
> >-                   IPCTabContext context, uint32_t chromeFlags,
> >-                   ContentParentId cpId, bool isForBrowser);
> >+    prio(high) async PBrowser(TabId tabId, TabId sameTabGroupAs,
> >+                              IPCTabContext context, uint32_t chromeFlags,
> >+                              ContentParentId cpId, bool isForBrowser);
> This change has nothing to do with this bug, so please remove.
Since the input events may preempt the normal events, I have to put it as higher priority than the input events to ensure the PBrowserChild is created before dispatching any events to it.

> 
> 
> >@@ -417,16 +419,17 @@ TabChild::TabChild(nsIContentChild* aMan
> >     }
> >   };
> > 
> >   // preloaded TabChild should not be added to child map
> >   if (mUniqueId) {
> >     MOZ_ASSERT(NestedTabChildMap().find(mUniqueId) == NestedTabChildMap().end());
> >     NestedTabChildMap()[mUniqueId] = this;
> >   }
> >+  nsThreadManager::get().EnableMainThreadEventPrioritization();
> This looks quite odd place to call this.
> Should be somewhere in ContentChild
Moved to ContentChild::RecvPBrowserConstructor.

> 
> >@@ -1335,16 +1338,17 @@ TabChild::RecvHandleTap(const GeckoConte
> > {
> >   nsCOMPtr<nsIPresShell> presShell = GetPresShell();
> >   if (!presShell) {
> >     return IPC_OK();
> >   }
> >   if (!presShell->GetPresContext()) {
> >     return IPC_OK();
> >   }
> >+  AutoTimeDurationHelper helper;
> It is error prone to add AutoTimeDurationHelper to each prio(input)
> implementation.
> I would do the similar thing while handling input events in thread level
Revised and handled it before nsThread run a event with priority = input.

> 
> >+++ b/ipc/chromium/src/chrome/common/ipc_message.h
> billm should give feedback to ipc internal stuff
> 
> >@@ -41,18 +41,20 @@ class Message : public Pickle {
> > 
> >   enum NestedLevel {
> >     NOT_NESTED = 1,
> >     NESTED_INSIDE_SYNC = 2,
> >     NESTED_INSIDE_CPOW = 3
> >   };
> > 
> >   enum PriorityValue {
> >-    NORMAL_PRIORITY,
> >-    HIGH_PRIORITY,
> >+    NORMAL_PRIORITY = 0,
> >+    INPUT_PRIORITY = 1,
> >+    VSYNC_PRIORITY = 2,
> >+    HIGH_PRIORITY = 3,
> I don't understand what is VSYNC vs HIGH. Do we really need them both? I
> would expect high being enough
> and keep its behavior as it is. Let's not make this bug more complicated
> than it is.
It's about ensuring PBrowserChild is created before dispatching any events to it. Can't put the PBrwoserChild creation message in the same queue as the vsync events since We may yield to normal events.
> 
> 
> 
> >+++ b/layout/base/nsRefreshDriver.cpp
> >@@ -253,16 +253,26 @@ public:
> >       return aDefault;
> >     }
> > 
> >     idleEnd = idleEnd - TimeDuration::FromMilliseconds(
> >       nsLayoutUtils::IdlePeriodDeadlineLimit());
> >     return idleEnd < aDefault ? idleEnd : aDefault;
> >   }
> > 
> >+  Maybe<TimeStamp> GetNextTickHint()
> >+  {
> >+    MOZ_ASSERT(NS_IsMainThread());
> >+    if (LastTickSkippedAnyPaints()) {
> >+      return Nothing();
> >+    }
> As bug 1373085 hints, LastTickSkippedAnyPaints() check is wrong.
> 
> 
> >+    TimeStamp nextTick = MostRecentRefresh() + GetTimerRate();
> >+    return nextTick < TimeStamp::Now() ? Nothing() : Some(nextTick);
> >+  }
> I think we could just always do this, and not do anything with
> LastTickSkippedAnyPaints...
> but I haven't yet read how this value is used.
Revised.
> 
> >+PrioritizedEventQueue::GetEvent(bool aMayWait, nsIRunnable** aEvent,
> >+                                unsigned short* aPriority,
> >+                                MutexAutoLock& aProofOfLock)
> >+{
> >+  MOZ_ASSERT(mIsReadyForPrioritization);
> >+  bool retVal = false;
> >+  do {
> >+    // Always process high priority events first. Assuming there are only few
> >+    // high priority events.
> >+    retVal = mHighQueue->GetEvent(false, aEvent, aProofOfLock);
> >+    if (*aEvent) {
> >+      SetPriorityIfNotNull(aPriority, nsIRunnablePriority::PRIORITY_HIGH);
> >+      return retVal;
> >+    }
> >+
> >+    // Use mProcessVsyncQueueRunnable to prevent the vsync events from consuming
> >+    // all cpu time on slow hardware and causing starvation.
> >+    if (mProcessVsyncQueueRunnable) {
> >+      mProcessVsyncQueueRunnable = false;
> >+      retVal = mVsyncQueue->GetEvent(false, aEvent, aProofOfLock);
> >+      MOZ_ASSERT(*aEvent);
> >+      mHandleInputDeadline =
> >+        EventStatistics::Get()
> >+          .GetHandleInputDeadline(mInputQueue->Count(aProofOfLock));
> How does this work? We take vsync from queue, but don't process it. Then we
> call GetHandleInputDeadline, which returns something related to the next
> tick, 
> which is something we're just about to handle, since we have vsync. I think
> I'm missing something here
That's a bug. I had revised the patch to fetch the deadline in the first time we are going to handle an input event in each frame.
> 
> 
> >+      return retVal;
> >+    }
> >+    mProcessVsyncQueueRunnable = mVsyncQueue->HasPendingEvent(aProofOfLock);
> >+
> >+    uint32_t pendingInputCount = mInputQueue->Count(aProofOfLock);
> >+    if (pendingInputCount > 0 && TimeStamp::Now() > mHandleInputDeadline) {
> ...oh, deadline mean something else I thought. It is not how long to handle,
> but only that after it we should handle input events.
> But I still don't understand why we call GetHandleInputDeadline just before
> handling vsync.
> GetNextTickHint return almost always something very close to Now(). Maybe
> I'm missing something.
> 
> 
> >+    // We don't want to wait if there are some high priority events in the
> >+    // queue.
> >+    bool reallyMayWait = aMayWait && !mProcessVsyncQueueRunnable &&
> >+                         pendingInputCount == 0;
> The comment talks about high prio, but check vsync
Revised.
> 
> > nsThread::nsChainedEventQueue::GetEvent(bool aMayWait, nsIRunnable** aEvent,
> >                                         unsigned short* aPriority,
> >-                                        mozilla::MutexAutoLock& aProofOfLock)
> >+                                        MutexAutoLock& aProofOfLock)
> > {
> >+  if (mSecondaryQueue && mSecondaryQueue->IsReadyForPrioritization()) {
> I'm having trouble to see why we have still this vsync queue in
> nsThread::nsChainedEventQueue
> 
> 
> >+nsThread::nsChainedEventQueue::PutEvent(already_AddRefed<nsIRunnable> aEvent,
> >+                                        MutexAutoLock& aProofOfLock)
> >+{
> >+  if (mSecondaryQueue) {
> >+    mSecondaryQueue.get()->PutEvent(Move(aEvent), aProofOfLock);
> >+  } else {
> >+    mNormalQueue->PutEvent(Move(aEvent), aProofOfLock);
> >+  }
> > }
> ... oh, secondary queue implements all the prioritization, and normal queue
> won't be used then at all.
> Could we perhaps merge PrioritizedEventQueue to nsChainedEventQueue, and
> just not use input event queue if it isn't enabled for that thread?
Revised.
> 
> 
> Looks reasonable. I would simplify this a bit, and support only those prios
> we actually need here.
> However, I don't understand GetHandleInputDeadline usage.
GetHandleInputDeadline is when we should start processing input events. It's the next tick - predicted time cost for handling input events.
(Assignee)

Updated

7 days ago
Attachment #8879061 - Flags: review?(bugs)
(In reply to Ming-Chou Shih [:stone] from comment #89)
> Since the input events may preempt the normal events, I have to put it as
> higher priority than the input events to ensure the PBrowserChild is created
> before dispatching any events to it.
oh, I see.


> > >+  nsThreadManager::get().EnableMainThreadEventPrioritization();
> > This looks quite odd place to call this.
> > Should be somewhere in ContentChild
> Moved to ContentChild::RecvPBrowserConstructor.
Why so late? Why not when ContentChild is created?
Comment on attachment 8879059 [details] [diff] [review]
Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

Review of attachment 8879059 [details] [diff] [review]:
-----------------------------------------------------------------

Hmm, it's still pretty mysterious for future readers I feel, as it seems quite hard to hard to know in advance that this approach would work when the previous one did not.

But, the test is trying to force a race, which is a bit tricky to do anyway, so I guess it's acceptable to need to subtle tweaks like this.

::: devtools/client/framework/test/browser_toolbox_races.js
@@ +43,5 @@
> +        toggle();
> +        toggle();
> +      } else {
> +        gDevTools.off("toolbox-created", onCreated);
> +        setTimeout(resolve);

Why is `setTimeout` needed?  Is that important for either the test or the input prioritization?  Could you just call `resolve` directly?

@@ +60,5 @@
> +        toggle();
> +        toggle();
> +      } else {
> +        gDevTools.off("toolbox-destroyed", onDestroyed);
> +        setTimeout(resolve);

Same question from above.

@@ +80,4 @@
>    info("Toggled the toolbox 3 times");
>  
>    // Now wait for the 3rd toolbox to be fully ready before closing it.
>    // We close the last toolbox manually, out of the first while() loop to

First while loop removed, please update this comment.
Attachment #8879059 - Flags: feedback?(jryans) → feedback+
Comment on attachment 8879061 [details] [diff] [review]
Part2 - Add a priority queue for input events

># HG changeset patch
># User Stone Shih <sshih@mozilla.com>
># Date 1490082252 -28800
>#      二  3月 21 15:44:12 2017 +0800
># Node ID 258278bb68bc2667b37d42da50fa99ea36a2f247
># Parent  2941f7832a5efcc6c16e92e53f4c87e167790e6c
>Add a priority queue for input events
>
>MozReview-Commit-ID: EXxpViVt3Cd
>
>diff --git a/dom/ipc/ContentChild.cpp b/dom/ipc/ContentChild.cpp
>--- a/dom/ipc/ContentChild.cpp
>+++ b/dom/ipc/ContentChild.cpp
>@@ -1609,16 +1609,17 @@ ContentChild::RecvPBrowserConstructor(PB
> 
>   static bool hasRunOnce = false;
>   if (!hasRunOnce) {
>     hasRunOnce = true;
>     MOZ_ASSERT(!gFirstIdleTask);
>     RefPtr<CancelableRunnable> firstIdleTask = NewCancelableRunnableFunction(FirstIdle);
>     gFirstIdleTask = firstIdleTask;
>     NS_IdleDispatchToCurrentThread(firstIdleTask.forget());
>+    nsThreadManager::get().EnableMainThreadEventPrioritization();
This feels still very late place to call this.
Shouldn't you add prioritize_main_thread_events.enable pref to ContentPrefs.cpp and then
in ContentChild::InitXPCOM call EnableMainThreadEventPrioritization



So you don't change ParentProcessVsyncNotifier to use vsync priority. It keeps using high priority which
gets now totally different meaning.

>+// Control the event prioritization on content main thread
>+#ifdef NIGHTLY_BUILD
>+pref("prioritize_main_thread_events.enable", true);
>+#else
>+pref("prioritize_main_thread_events.enable", false);
>+#endif
Could we call the pref something else, since this doesn't affect to vsync, and it isn't prioritizing all the main thread
events.
prioritized_input_events.enabled ?
(I know it doesn't cover 'high', but that should be used normally)



>+
>+// The maximum and minimum time (milliseconds) we reserve for handling input
>+// events in each frame.
>+pref("prioritize_main_thread_events.input_duration.max", 8);
>+pref("prioritize_main_thread_events.input_duration.min", 1);
Then these could be something like prioritized_input_events.duration.max prioritized_input_events.duration.min

>+
>+// The default amount of time (milliseconds) required for handling a input
>+// event.
>+pref("prioritize_main_thread_events.default_duration_per_input", 1);
prioritized_input_events.default_duration_per_event


>+// The number of processed input events we use to predict the amount of time
>+// required to process the following input events.
>+pref("prioritize_main_thread_events.input_count_for_prediction", 9);
prioritized_input_events.count_for_prediction

>+class EventStatistics
Could you call this InputEventStatistics, and also the file names

>+  public:
>+    TimeDurationCircularBuffer(uint32_t aSize, TimeDuration& aDefaultValue)
>+      : mSize(aSize)
>+      , mCurrentIndex(0)
>+    {
>+      mSize = mSize == 0 ? sInputCountForPrediction : mSize;
>+      for (int16_t index = 0; index < mSize; ++index) {
>+        mBuffer.AppendElement(aDefaultValue);
>+        mTotal += aDefaultValue;
>+      }
>+    }
>+    void Insert(TimeDuration& aDuration)
>+    {
>+      mTotal += (aDuration - mBuffer[mCurrentIndex]);
>+      mBuffer[mCurrentIndex++] = aDuration;
>+      if (mCurrentIndex == mSize) {
>+        mCurrentIndex = 0;
>+      }
>+    }
>+    TimeDuration GetMean();
Could you add a new line between methods. Would ease reading.

>+class MOZ_RAII AutoTimeDurationHelper final
>+{
>+public:
>+  AutoTimeDurationHelper()
>+  {
>+    mHappened = TimeStamp::Now();
>+  }
>+
>+  ~AutoTimeDurationHelper()
>+  {
>+    EventStatistics::Get().UpdateInputDuration(TimeStamp::Now() - mHappened);
>+  }
>+
>+private:
>+  TimeStamp mHappened;
mStartTime


>+    // Use mProcessVsyncQueueRunnable to prevent the vsync events from consuming
>+    // all cpu time and causing starvation.
>+    if (mProcessVsyncQueueRunnable) {
>+      MOZ_ASSERT(mVsyncQueue->HasPendingEvent(aProofOfLock));
>+      retVal = mVsyncQueue->GetEvent(false, aEvent, aProofOfLock);
>+      MOZ_ASSERT(*aEvent);
>+      SetPriorityIfNotNull(aPriority, nsIRunnablePriority::PRIORITY_HIGH);
We just took event from vsync queue, but assign priority HIGH here. Should be VSYNC


>+    mProcessVsyncQueueRunnable = mVsyncQueue->HasPendingEvent(aProofOfLock);
>+
>+    uint32_t pendingInputCount = mInputQueue->Count(aProofOfLock);
>+    if (pendingInputCount > 0) {
>+      if (mHandleInputDeadline.IsNull()) {
>+        mHandleInputDeadline =
>+          EventStatistics::Get()
>+            .GetHandleInputDeadline(mInputQueue->Count(aProofOfLock));
>+      }
>+      if (TimeStamp::Now() > mHandleInputDeadline) {
I think use of "deadline" here and elsewhere is surprising, since it is the start time for input event handling.
Could you use mInputHandlingStartTime or such, and similar for the GetHandleInputDeadline() method?


>+    // We don't want to wait if there are some high priority runnables or input
>+    // events in the queue.
>+    bool reallyMayWait = aMayWait && !mProcessVsyncQueueRunnable &&
>+                         pendingInputCount == 0;
Fix the comment. It talks about high priority, yet the code is about vsync.


>+nsThread::nsChainedEventQueue::GetNonPrioritizedEvent(bool aMayWait,
>+                                                      nsIRunnable** aEvent,
>+                                                      unsigned short* aPriority,
>+                                                      MutexAutoLock& aProofOfLock)
>+{
>+  bool retVal = false;
>+  // Get an event from mNormalQueue since all events are put in it when event
>+  // prioritization is disabled.
>+  do {
>+    retVal = mNormalQueue->GetEvent(aMayWait, aEvent, aProofOfLock);
>+    if (*aEvent) {
>+      if (aPriority) {
>+        SetPriorityIfNotNull(aPriority, nsIRunnablePriority::PRIORITY_NORMAL);
>+      }
>+      return retVal;
>+    }
>+  } while (aMayWait);
>   return retVal;
> }
Oh, also vsync get set behind the new flag. I don't think we want that. We are already shipping vsync prioritization.

Looking rather good.
Some of this code need to move around once we have TabGroup based scheduling working, but I don't think we need to wait that work in this particular bug. That work and this are very much overlapping and whatever
lands later needs to do some major merging.
Attachment #8879061 - Flags: review?(bugs) → review-
(Assignee)

Updated

6 days ago
Attachment #8877386 - Flags: review?(bugmail)
(Assignee)

Comment 93

6 days ago
Created attachment 8879455 [details] [diff] [review]
Part6 - Revise browser_toolbox_races.js to avoid jam content process by the synthesized input events

Revised the comments
Attachment #8879059 - Attachment is obsolete: true
Attachment #8879455 - Flags: feedback+
(Assignee)

Comment 94

6 days ago
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #91)
> Comment on attachment 8879059 [details] [diff] [review]
> Part6 - Revise browser_toolbox_races.js to avoid jam content process by the
> synthesized input events
> 
> Review of attachment 8879059 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Hmm, it's still pretty mysterious for future readers I feel, as it seems
> quite hard to hard to know in advance that this approach would work when the
> previous one did not.
> 
> But, the test is trying to force a race, which is a bit tricky to do anyway,
> so I guess it's acceptable to need to subtle tweaks like this.
> 
> ::: devtools/client/framework/test/browser_toolbox_races.js
> @@ +43,5 @@
> > +        toggle();
> > +        toggle();
> > +      } else {
> > +        gDevTools.off("toolbox-created", onCreated);
> > +        setTimeout(resolve);
> 
> Why is `setTimeout` needed?  Is that important for either the test or the
> input prioritization?  Could you just call `resolve` directly?
> 
It's addressed in comment#79

> @@ +60,5 @@
> > +        toggle();
> > +        toggle();
> > +      } else {
> > +        gDevTools.off("toolbox-destroyed", onDestroyed);
> > +        setTimeout(resolve);
> 
> Same question from above.
> 
> @@ +80,4 @@
> >    info("Toggled the toolbox 3 times");
> >  
> >    // Now wait for the 3rd toolbox to be fully ready before closing it.
> >    // We close the last toolbox manually, out of the first while() loop to
> 
> First while loop removed, please update this comment.
Revised.
(Assignee)

Comment 95

6 days ago
Created attachment 8879458 [details] [diff] [review]
Part2 - Add a priority queue for input events
Attachment #8879061 - Attachment is obsolete: true
(Assignee)

Comment 96

6 days ago
(In reply to Olli Pettay [:smaug] from comment #92)
> >diff --git a/dom/ipc/ContentChild.cpp b/dom/ipc/ContentChild.cpp
> >--- a/dom/ipc/ContentChild.cpp
> >+++ b/dom/ipc/ContentChild.cpp
> >@@ -1609,16 +1609,17 @@ ContentChild::RecvPBrowserConstructor(PB
> > 
> >   static bool hasRunOnce = false;
> >   if (!hasRunOnce) {
> >     hasRunOnce = true;
> >     MOZ_ASSERT(!gFirstIdleTask);
> >     RefPtr<CancelableRunnable> firstIdleTask = NewCancelableRunnableFunction(FirstIdle);
> >     gFirstIdleTask = firstIdleTask;
> >     NS_IdleDispatchToCurrentThread(firstIdleTask.forget());
> >+    nsThreadManager::get().EnableMainThreadEventPrioritization();
> This feels still very late place to call this.
> Shouldn't you add prioritize_main_thread_events.enable pref to
> ContentPrefs.cpp and then
> in ContentChild::InitXPCOM call EnableMainThreadEventPrioritization
Revised. It seems we don't need to add the preference to ContentPrefs.cpp since ContentChild::InitXPCOM already sets the preferences.
> 
> 
> So you don't change ParentProcessVsyncNotifier to use vsync priority. It
> keeps using high priority which
> gets now totally different meaning.
I missed it. Revised.

> >+// Control the event prioritization on content main thread
> >+#ifdef NIGHTLY_BUILD
> >+pref("prioritize_main_thread_events.enable", true);
> >+#else
> >+pref("prioritize_main_thread_events.enable", false);
> >+#endif
> Could we call the pref something else, since this doesn't affect to vsync,
> and it isn't prioritizing all the main thread
> events.
> prioritized_input_events.enabled ?
> (I know it doesn't cover 'high', but that should be used normally)
Revised.

> >+// The maximum and minimum time (milliseconds) we reserve for handling input
> >+// events in each frame.
> >+pref("prioritize_main_thread_events.input_duration.max", 8);
> >+pref("prioritize_main_thread_events.input_duration.min", 1);
> Then these could be something like prioritized_input_events.duration.max
> prioritized_input_events.duration.min
Revised.

> >+// The default amount of time (milliseconds) required for handling a input
> >+// event.
> >+pref("prioritize_main_thread_events.default_duration_per_input", 1);
> prioritized_input_events.default_duration_per_event
Revised.

> >+// The number of processed input events we use to predict the amount of time
> >+// required to process the following input events.
> >+pref("prioritize_main_thread_events.input_count_for_prediction", 9);
> prioritized_input_events.count_for_prediction
Revised.

> >+class EventStatistics
> Could you call this InputEventStatistics, and also the file names
Revised.

> >+  public:
> >+    TimeDurationCircularBuffer(uint32_t aSize, TimeDuration& aDefaultValue)
> >+      : mSize(aSize)
> >+      , mCurrentIndex(0)
> >+    {
> >+      mSize = mSize == 0 ? sInputCountForPrediction : mSize;
> >+      for (int16_t index = 0; index < mSize; ++index) {
> >+        mBuffer.AppendElement(aDefaultValue);
> >+        mTotal += aDefaultValue;
> >+      }
> >+    }
> >+    void Insert(TimeDuration& aDuration)
> >+    {
> >+      mTotal += (aDuration - mBuffer[mCurrentIndex]);
> >+      mBuffer[mCurrentIndex++] = aDuration;
> >+      if (mCurrentIndex == mSize) {
> >+        mCurrentIndex = 0;
> >+      }
> >+    }
> >+    TimeDuration GetMean();
> Could you add a new line between methods. Would ease reading.
Revised.

> >+class MOZ_RAII AutoTimeDurationHelper final
> >+{
> >+public:
> >+  AutoTimeDurationHelper()
> >+  {
> >+    mHappened = TimeStamp::Now();
> >+  }
> >+
> >+  ~AutoTimeDurationHelper()
> >+  {
> >+    EventStatistics::Get().UpdateInputDuration(TimeStamp::Now() - mHappened);
> >+  }
> >+
> >+private:
> >+  TimeStamp mHappened;
> mStartTime
Revised.

> >+    // Use mProcessVsyncQueueRunnable to prevent the vsync events from consuming
> >+    // all cpu time and causing starvation.
> >+    if (mProcessVsyncQueueRunnable) {
> >+      MOZ_ASSERT(mVsyncQueue->HasPendingEvent(aProofOfLock));
> >+      retVal = mVsyncQueue->GetEvent(false, aEvent, aProofOfLock);
> >+      MOZ_ASSERT(*aEvent);
> >+      SetPriorityIfNotNull(aPriority, nsIRunnablePriority::PRIORITY_HIGH);
> We just took event from vsync queue, but assign priority HIGH here. Should
> be VSYNC
Revised.

> >+    mProcessVsyncQueueRunnable = mVsyncQueue->HasPendingEvent(aProofOfLock);
> >+
> >+    uint32_t pendingInputCount = mInputQueue->Count(aProofOfLock);
> >+    if (pendingInputCount > 0) {
> >+      if (mHandleInputDeadline.IsNull()) {
> >+        mHandleInputDeadline =
> >+          EventStatistics::Get()
> >+            .GetHandleInputDeadline(mInputQueue->Count(aProofOfLock));
> >+      }
> >+      if (TimeStamp::Now() > mHandleInputDeadline) {
> I think use of "deadline" here and elsewhere is surprising, since it is the
> start time for input event handling.
> Could you use mInputHandlingStartTime or such, and similar for the
> GetHandleInputDeadline() method?
Revised.

> >+    // We don't want to wait if there are some high priority runnables or input
> >+    // events in the queue.
> >+    bool reallyMayWait = aMayWait && !mProcessVsyncQueueRunnable &&
> >+                         pendingInputCount == 0;
> Fix the comment. It talks about high priority, yet the code is about vsync.
Revised.

> >+nsThread::nsChainedEventQueue::GetNonPrioritizedEvent(bool aMayWait,
> >+                                                      nsIRunnable** aEvent,
> >+                                                      unsigned short* aPriority,
> >+                                                      MutexAutoLock& aProofOfLock)
> >+{
> >+  bool retVal = false;
> >+  // Get an event from mNormalQueue since all events are put in it when event
> >+  // prioritization is disabled.
> >+  do {
> >+    retVal = mNormalQueue->GetEvent(aMayWait, aEvent, aProofOfLock);
> >+    if (*aEvent) {
> >+      if (aPriority) {
> >+        SetPriorityIfNotNull(aPriority, nsIRunnablePriority::PRIORITY_NORMAL);
> >+      }
> >+      return retVal;
> >+    }
> >+  } while (aMayWait);
> >   return retVal;
> > }
> Oh, also vsync get set behind the new flag. I don't think we want that. We
> are already shipping vsync prioritization.
I revised the patch to merge the GetXXXEvent functions into one and found a problem, that is the PBrowserConstructor may preempt SetXPCOMProcessAttributes and cause some problems. I have to add the high priority events in mNormalQueue before input event prioritization is enabled. The other solution is put SetXPCOMProcessAttributes as high priority. But I don't know if there are other dependencies or other risks about it.

> Looking rather good.
> Some of this code need to move around once we have TabGroup based scheduling
> working, but I don't think we need to wait that work in this particular bug.
> That work and this are very much overlapping and whatever
> lands later needs to do some major merging.
Comment on attachment 8877386 [details] [diff] [review]
WIP Part3 - Synthesize native input events with priority

Review of attachment 8877386 [details] [diff] [review]:
-----------------------------------------------------------------

The changes to the test seem reasonable to me, given the comments in the commit message and what you added to the test. Thanks! I don't think I'm qualified to review how the changes to nsThread* fit into the overall scheme so I'll give this a f+ instead of a r+. Maybe smaug can do the final review.
Attachment #8877386 - Flags: review?(bugmail) → feedback+
(Assignee)

Comment 98

5 days ago
Created attachment 8879907 [details] [diff] [review]
Part2 - Add a priority queue for input events

Found a bug about a queued high priority event (before input event prioritization is enabled) is preempted by another newly created high priority event (after input event prioritization is enabled) and caused assert in [1].

Before the input event prioritization is enabled, we can't put high priority events in mHighQueue to prevent PBrowserConstructor preempt SetXPCOMProcessAttributes. After enabling input event prioritization, we have to consume all queued events in mNormalQueue then we can start prioritizing events.

Revised the patch and renamed mIsReadyToPrioritizeInputEvents to mIsReadyToPrioritizeEvents.

[1] http://searchfox.org/mozilla-central/rev/714606a8145636d93b116943d3a65a6a49d2acf8/ipc/glue/MessageChannel.cpp#1810
Attachment #8879458 - Attachment is obsolete: true
(In reply to Ming-Chou Shih [:stone] from comment #84)
> (In reply to Andrew Overholt [:overholt] from comment #78)
> > (In reply to Ming-Chou Shih [:stone] from comment #76)
> > > (In reply to Andrew Overholt [:overholt] from comment #74)
> > > > Ehsan and I were testing input latency on Twitter direct messages and he was
> > > > wondering if your patches help here, Stone. I filed the overpainting I'm
> > > > seeing there as bug 1373023.
> > > 
> > > Had a try without my patches but can not find an obvious input latency.
> > > Wondering if any steps I missed or it's because of different hardware?
> > 
> > Perhaps. Can you point me at a Windows (preferably PGO, non-debug) build
> > with your patches so I can try it locally?
> 
> Please let me know if this doesn't work for you. Thanks.
> https://treeherder.mozilla.org/#/jobs?repo=try&author=sshih@mozilla.
> com&selectedJob=107663461

Thanks! We tried it but it looks like slow reflows are getting the better of us: bug 1373023.
You need to log in before you can comment on or make changes to this bug.