Closed Bug 1363829 Opened 7 years ago Closed 7 years ago

consider using a single nsITimer for setTimeout() in windows

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla55

Project Flags:

Performance Impact

low

Tracking Flags:

Tracking

Status

firefox55

---

fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

(Depends on 1 open bug, Regressed 2 open bugs)

Details

Attachments

(17 files, 61 obsolete files)

hmm 7 years ago Ben Kelly [:bkelly, not reviewing] 14.94 KB, patch		Details \| Diff \| Splinter Review
P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.16 KB, patch		Details \| Diff \| Splinter Review
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 10.14 KB, patch		Details \| Diff \| Splinter Review
P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 15.25 KB, patch		Details \| Diff \| Splinter Review
P4 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 6.95 KB, patch		Details \| Diff \| Splinter Review
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 7.46 KB, patch		Details \| Diff \| Splinter Review
P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.21 KB, patch		Details \| Diff \| Splinter Review
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 8.95 KB, patch		Details \| Diff \| Splinter Review
P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 15.58 KB, patch		Details \| Diff \| Splinter Review
P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 4.01 KB, patch		Details \| Diff \| Splinter Review
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 7.92 KB, patch		Details \| Diff \| Splinter Review
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 20.55 KB, patch		Details \| Diff \| Splinter Review
P7 Remove the Timeout nsITimer code. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.67 KB, patch		Details \| Diff \| Splinter Review
P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.21 KB, patch		Details \| Diff \| Splinter Review
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 8.95 KB, patch		Details \| Diff \| Splinter Review
P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 20.18 KB, patch		Details \| Diff \| Splinter Review
P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 4.01 KB, patch		Details \| Diff \| Splinter Review
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 8.85 KB, patch		Details \| Diff \| Splinter Review
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 20.58 KB, patch		Details \| Diff \| Splinter Review
P7 Remove the Timeout nsITimer code. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.81 KB, patch		Details \| Diff \| Splinter Review
P8 Remove TimeoutManager back pressure mechanism. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 14.50 KB, patch		Details \| Diff \| Splinter Review
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 9.34 KB, patch		Details \| Diff \| Splinter Review
P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.21 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 8.95 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 20.18 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 4.01 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 10.01 KB, patch		Details \| Diff \| Splinter Review
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 26.49 KB, patch		Details \| Diff \| Splinter Review
P7 Remove the Timeout nsITimer code. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.83 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P8 Remove TimeoutManager back pressure mechanism. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 14.53 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
P9 Make docshell chrome tests wait for next paint to avoid races. r=smaug 7 years ago Ben Kelly [:bkelly, not reviewing] 3.38 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
P10 Make image map reftest wait for first paint before triggering focus/blur. r=dbaron 7 years ago Ben Kelly [:bkelly, not reviewing] 2.69 KB, patch	dbaron : review+	Details \| Diff \| Splinter Review
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 9.87 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 25.76 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P11 Make debugger mochitests perform a full CC in addition to the forced GC during test cleanup. r=jryans 7 years ago Ben Kelly [:bkelly, not reviewing] 1.28 KB, patch	jryans : review+	Details \| Diff \| Splinter Review
P12 Prevent setTimeout() from firing early by removing TimeoutExecutor flex time. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 4.30 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P13 Creat a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 14.86 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P14 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 219 bytes, patch		Details \| Diff \| Splinter Review
P14 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 1.55 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 10.53 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 25.96 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P7 Remove the Timeout nsITimer code. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 6.92 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P12 Create a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 14.71 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P13 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 2.11 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P12 Create a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 15.25 KB, patch		Details \| Diff \| Splinter Review
P12 Expose nsITimer::PreciseInitWithCallback() that takes a TimeDuration delay. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 15.68 KB, patch	froydnj : review-	Details \| Diff \| Splinter Review
P13 Use the nsITimer::PreciseInitWithCallback() instead of rounding. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 1.71 KB, patch		Details \| Diff \| Splinter Review
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 16.03 KB, patch		Details \| Diff \| Splinter Review
P13 Use the nsITimer::InitWithCallbackAndTimeDuration() instead of rounding. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 2.48 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 18.60 KB, patch		Details \| Diff \| Splinter Review
P14 Make browser_bug462289.js test use setTimeout(f, 0) like it did before bug 1102017. r=dao 7 years ago Ben Kelly [:bkelly, not reviewing] 2.77 KB, patch	dao : review+	Details \| Diff \| Splinter Review
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 17.89 KB, patch		Details \| Diff \| Splinter Review
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 17.87 KB, patch		Details \| Diff \| Splinter Review
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 9.54 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P15 Expose nsITimer::GetAllowedEarlyFiringMicroseconds(). r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 6.45 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P16 Allow setTimeout() to fire early based on nsIThread::GetAllowedEarlyFiringMicroseconds(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 4.23 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review
P17 Delay browser_net_simple-request-data.js teardown to allow all events to fire. r=Honza 7 years ago Ben Kelly [:bkelly, not reviewing] 1.21 KB, patch		Details \| Diff \| Splinter Review
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza 7 years ago Ben Kelly [:bkelly, not reviewing] 8.74 KB, patch		Details \| Diff \| Splinter Review
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza 7 years ago Ben Kelly [:bkelly, not reviewing] 9.56 KB, patch		Details \| Diff \| Splinter Review
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza 7 years ago Ben Kelly [:bkelly, not reviewing] 10.75 KB, patch	Honza : review+	Details \| Diff \| Splinter Review
P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 5.21 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 8.95 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 20.18 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 4.01 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 10.53 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 25.96 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P7 Remove the Timeout nsITimer code. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 6.92 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P8 Remove TimeoutManager back pressure mechanism. r=smaug 7 years ago Ben Kelly [:bkelly, not reviewing] 14.50 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P9 Make docshell chrome tests wait for next paint to avoid races. r=smaug 7 years ago Ben Kelly [:bkelly, not reviewing] 3.38 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P10 Make image map reftest wait for first paint before triggering focus/blur. r=dbaron 7 years ago Ben Kelly [:bkelly, not reviewing] 2.69 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P11 Make debugger mochitests perform a full CC in addition to the forced GC during test cleanup. r=jryans 7 years ago Ben Kelly [:bkelly, not reviewing] 1.28 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 9.52 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P12 Expose nsITimer::InitHighResolutionWithCallback() method. r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 9.52 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P13 Use the nsITimer::InitHighResolutionWithCallback() instead of rounding. r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 2.47 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P14 Make browser_bug462289.js test use setTimeout(f, 0) like it did before bug 1102017. r=dao 7 years ago Ben Kelly [:bkelly, not reviewing] 2.77 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P15 Expose nsITimer::GetAllowedEarlyFiringMicroseconds(). r=froydnj 7 years ago Ben Kelly [:bkelly, not reviewing] 6.31 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P16 Allow setTimeout() to fire early based on nsIThread::GetAllowedEarlyFiringMicroseconds(). r=ehsan 7 years ago Ben Kelly [:bkelly, not reviewing] 4.23 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza 7 years ago Ben Kelly [:bkelly, not reviewing] 10.75 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Description

•

7 years ago

Currently we register a separate nsITimer for every setTimeout()/setInterval() call in nsGlobalWindow.  This is unnecessary since we only really need to be woken up for the *next* timer.  We can then schedule another nsITimer for the next one and so on.

This would be beneficial because we still have some performance issues with large numbers of nsITimers.  In particular, trying to cancel many timers so we can reschedule them triggers a lot of mutex contention with the timer thread.  This can block the main thread.  We do this sort of thing when switching tabs and rescheduling throttled timers due to the window coming to the foreground.

If we make this change we would be relying on the ThrottledEventQueue less.  The queue would still be useful to yield between timers from many windows in the same TabGroup, but timer floods would not back up the queue any more.  So the back pressure mechanism would likely need to change to look at the TimeoutManager's list of timers instead.

I'm going to see if I can get this into FF55.

Nathan Froyd [:froydnj]

Comment 1

•

7 years ago

(In reply to Ben Kelly [not reviewing due to deadline][:bkelly] from comment #0)
> This would be beneficial because we still have some performance issues with
> large numbers of nsITimers.  In particular, trying to cancel many timers so
> we can reschedule them triggers a lot of mutex contention with the timer
> thread.  This can block the main thread.

Would it be equally feasible to add APIs for canceling multiple timers in one go instead?

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 2

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #1)
> Would it be equally feasible to add APIs for canceling multiple timers in
> one go instead?

We could, but I think it would be strictly worse than what I am proposing.  Keeping the current many-timers approach with bulk cancellation would:

1) Use more memory for nsITimers objects
2) Lock the timer thread mutex more frequently for insertions
3) Generate more runnables for timeouts firing near the same time
4) Require building the list of timers to cancel, which would probably not scale well

Using a single timer per window avoids all these issues.  It just does less work for the same functional behavior.

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 3

•

7 years ago

Ehsan asked me to put a profile showing the problem:

https://perfht.ml/2r2LQRy

This is from running the stress test at:

https://people-mozilla.org/~bkelly/timer-flood/index.html

Search for "CriticalSectionContended".

Cycle collection is also bad here, but its unclear to me if there is anything we can do to improve analyzing so many cycle collectable timer objects.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 4

•

7 years ago

Attached patch hmm (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1365111

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

No longer depends on: 1365111

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 5

•

7 years ago

Attached patch P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8867897 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 6

•

7 years ago

Attached patch P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 7

•

7 years ago

Attached patch P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 8

•

7 years ago

Attached patch P4 Add an executor class to represent the single timer executing timers for a window. r=ehsan (obsolete) — Details — Splinter Review

This is still a work-in-progress.  I have the executor written and some cleanup done, but I still have to make TimeoutManager use the executor instead of InitTimer().

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 9

•

7 years ago

Attached patch P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan (obsolete) — Details — Splinter Review

Attachment #8867964 - Attachment is obsolete: true

Andreas Farre [:farre]

Updated

•

7 years ago

Blocks: 1362322

Kannan Vijayan [:djvj]

Updated

•

7 years ago

Whiteboard: [qf] → [qf:p3]

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 10

•

7 years ago

Attached patch P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8867963 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 11

•

7 years ago

Attached patch P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan (obsolete) — Details — Splinter Review

Attachment #8867968 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 12

•

7 years ago

Attached patch P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8867965 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 13

•

7 years ago

Attached patch P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8867966 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 14

•

7 years ago

Attached patch P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 15

•

7 years ago

Attached patch P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 16

•

7 years ago

Attached patch P7 Remove the Timeout nsITimer code. r=ehsan (obsolete) — Details — Splinter Review

These changes look pretty promising locally.  They greatly smooth out the servo tiles demo.  Lets get a build on some other platforms to see how it behaves there (and if it breaks tests):

https://treeherder.mozilla.org/#/jobs?repo=try&revision=838a4951ae9352a0cfa590e702cdbe0a3933198b

I still need to figure out what to do about the back pressure mechanism.  I either need to rewrite it not to depend on ThrottledEventQueue or just remove it.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 17

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=11eeab80e58f8ec063bc3f5a6781a7032380ac63

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 18

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ddd9b3eb8a8559e9697c12c1d2f683e4d8f786e1

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 19

•

7 years ago

Hopefully this clears up most of the orange:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b4f72aac2d49ef43c10c5fc5d8c7dc6e82d252a8

It looks like there are some tests, though, that race setTimeout(f, 0) against other behavior and expect setTimeout() to lose.  Since this patch speeds up setTimeout() these tests are failing.  I'll have to work through these things.

The reason setTimeout(f, 0) is much faster now is because we uses to schedule an nsITimer for it, but now we just dispatch a runnable to the current thread to execute the timer.  I believe this is correct behavior and its the tests that are broken in this case.

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1277657

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 20

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a88380abe62230380f5585afa7a184a130dd1031

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 21

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=dc573d3a4617d6479a74453f75a61dc51e501660

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 22

•

7 years ago

Trying to figure out why reftests are failing:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=96e6ccbb66fdec536f65a691b79da630af39f214

Andrew Overholt [:overholt]

Updated

•

7 years ago

Blocks: Speedometer_V2

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 23

•

7 years ago

Still some orange to figure out, but this is showing some improvements on Talos:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=bbf0763d07cb003559566318c5c7e0e00cf424aa&newProject=try&newRevision=53e3b97ff70e2dac6d888d9b1911cf0579bc0ea3&framework=1&showOnlyImportant=0&showOnlyConfident=1

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1366642

Jean Gong :jgong

Updated

•

7 years ago

Blocks: TimeToFirstPaint_FB

Jean Gong :jgong

Updated

•

7 years ago

No longer blocks: TimeToFirstPaint_FB

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1366783

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1366791

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 24

•

7 years ago

Attached patch P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8869226 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 25

•

7 years ago

Attached patch P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan (obsolete) — Details — Splinter Review

Attachment #8869227 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 26

•

7 years ago

Attached patch P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8869228 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 27

•

7 years ago

Attached patch P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8869229 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 28

•

7 years ago

Attached patch P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8869230 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 29

•

7 years ago

Attached patch P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8869231 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 30

•

7 years ago

Attached patch P7 Remove the Timeout nsITimer code. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8869232 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 31

•

7 years ago

Attached patch P8 Remove TimeoutManager back pressure mechanism. r=ehsan (obsolete) — Details — Splinter Review

Still some orange to fix, but getting close I think.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6059dfb0ec0599eb65aea49f7e65209e9273903a

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 32

•

7 years ago

Attached patch P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a84ef47ab7745af248978bb40ad61c2b9d499c04

Attachment #8870120 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 33

•

7 years ago

Unfortunately the talos test improvements were due to a bug where we would often fire timers earlier than before.  I've fixed that and now its looking like this bug is talos neutral.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 34

•

7 years ago

Attached patch P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan (obsolete) — Details — Splinter Review

The first few patches here will refactor TimeoutManager so that I can remove dependencies on the nsITimer and the specific Timeout in RunTimeout.

To start, this patch removes some dead code that used to involve passing nullptr to RunTimeout() for the Timeout arg.  This code was orphaned by my changes previously in bug 1303167.

Attachment #8870116 - Attachment is obsolete: true

Attachment #8870244 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 35

•

7 years ago

Attached patch P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan (obsolete) — Details — Splinter Review

Next, replace the Timeout argument to RunTimeout() with simply the deadline we wish to execute up to.  Most of the code in RunTimeout() only needs the deadline.

The code that checks to see if the we executed the target Timeout is removed.  This was really previously needed for cases where we did aggressive yielding.  Consider:

1. Timer t1 is scheduled
2. Timer t2 is scheduled
3. nsITimer for t2 fires, executes t1, and yields after only one timer
4. When t1 is executed its nsITimer is canceled.
5. No nsITimer ever fires again and t2 is never executed.

With the remaining patches in this bug this will no longer be a problem at all.  Also, after bug 1343912 we yield much less aggressively.  Its highly unlikely we will hit this condition even if someone were to run on just this patch and not the others in this bug.

Attachment #8870117 - Attachment is obsolete: true

Attachment #8870245 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 36

•

7 years ago

Attached patch P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan (obsolete) — Details — Splinter Review

Next, we need to change how the Timeout objects are held alive.  The nsITimer for each Timeout holds a ref and that will be going away.  Also, manual AddRef()/Release() is used for holding a ref while the timer is in the list which is error prone.

This patch simplifies everything by just using LinkedList<RefPtr<Timeout>> now that we support it.  This makes things much easier to reason about.

The ref held during the nsITimer callback is implemented as a self RefPtr<Timeout>.  This is also easier to understand, but it will be going away in a later patch anyway.

Attachment #8870118 - Attachment is obsolete: true

Attachment #8870248 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 37

•

7 years ago

Attached patch P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan (obsolete) — Details — Splinter Review

The last place the nsITimer is used is for extracting the actual scheduled delay when adjusting timers.  This patch just stores this value on the Timeout instead so we don't need to access the nsITimer.

Attachment #8870119 - Attachment is obsolete: true

Attachment #8870251 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 38

•

7 years ago

Attached patch P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan (obsolete) — Details — Splinter Review

This patch adds the bulk of the new execution code.  Its contained in a new TimeoutExecutor class:

* The TimeoutExecutor is can be scheduled in "Immediate" mode as a runnable or "Delayed" mode as an nsITimerCallback.
* There is the concept of "flex time" representing how far into the future we may fire a Timeout.  This matches the rounding performed by TimerThread today.
* If the next deadline has passed or less than "flex time" in the future then Immediate mode is triggered as a runnable dispatch.
* Otherwise the deadline is scheduled as a nsITimer in the future.
* The TimeoutExecutor provides a MaybeSchedule(deadline) method that can be called frequently.  It only reschedules the TimeoutExecutor if it sees an earlier deadline than its already scheduled for.
* The TimeoutExecutor must deal with duplicate execution calls.  The timer may still fire even if its canceled and we cannot cancel the nsIRunnable dispatch.

The next patch will actually hook this code up into TimeoutManager.

Attachment #8870147 - Attachment is obsolete: true

Attachment #8870254 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 39

•

7 years ago

Attached patch P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan (obsolete) — Details — Splinter Review

This integrates the TimeoutExecutor into the TimeoutManager.  This mainly involves calling MaybeSchedule() whenever we get a new Timeout or when we know the executor has been stopped.

The only places we need to explicitly stop the executor are:

1. If the next soonest Timeout is canceled with ClearTimeout().
2. If all the timeouts are canceled.
3. If the window is suspended.

Attachment #8870121 - Attachment is obsolete: true

Attachment #8870256 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 40

•

7 years ago

Attached patch P7 Remove the Timeout nsITimer code. r=ehsan (obsolete) — Details — Splinter Review

This removes the now unused Timeout nsITimer code.

Attachment #8870122 - Attachment is obsolete: true

Attachment #8870257 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 41

•

7 years ago

Attached patch P8 Remove TimeoutManager back pressure mechanism. r=ehsan (obsolete) — Details — Splinter Review

This removes the back pressure mechanism I added for FF52.  It is designed around every Timeout getting a runnable in the ThrottledEventQueue, but that is not longer the case.

I considered ways of implementing a new backpressure mechanism, but I decided not to for a few reasons:

1. We already provide some back pressure by limiting how many timers we will implement at once.  In particular, once you get many Timeout objects the cycle collector tends to limit things as well.
2. The old back pressure mechanism didn't work very well.  It prevented memory from growing rapidly, but it couldn't stop it for every work load.  It also triggered jank in this work loads due to ResetTimersForThrottleReduction().
3. The backpressue mechanism is complicated and hard to maintain.  The old one had to be tuned a few times as the browser changed.

Overall I think we are better off without a backpressure mechanism for now.  If I can think up a simple way to do it in the future we can always add it again.

My limited testing showed the CC tended to slow things down and memory growth slowed after 1+GB of Timeout objects.

Attachment #8870123 - Attachment is obsolete: true

Attachment #8870260 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 42

•

7 years ago

Attached patch P9 Make docshell chrome tests wait for next paint to avoid races. r=smaug (obsolete) — Details — Splinter Review

Olli, I asked you about this test last week, I think.  Basically, it assumes that we can call setTimeout(f, 0) and that various async work will be done.  This is racy and fails with the new setTimeout() here.  Since we dispatch a runnable instead of using nsITimer we can execute slightly faster.

This patch converts to use requestAnimationFrame() to wait for next paint.  A double rAF is used since paint has not occurred after the first one.  (I tried with a single rAF and it did not fix the test.)

Attachment #8870261 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 43

•

7 years ago

Attached patch P10 Make image map reftest wait for first paint before triggering focus/blur. r=dbaron (obsolete) — Details — Splinter Review

David, I think I asked you about this reftest a little in IRC.  It turned out not to be a problem with the reftest harness.

Instead, this reftest does a setTimeout(f, 0) on load and then tries to call focus() on an image map.  Unfortunately bug 1366642 means this won't work if first paint has not occurred yet.

This patch changes the reftest to use requestAnimationFrame() to wait for first paint before starting.

Attachment #8870262 - Flags: review?(dbaron)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 44

•

7 years ago

This try is basically the same as the code I just asked for review on.  It just has the round.patch separated from the P5 here.  For review I folded it into P5.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=47262140282a6aa0e5273b69d3c25b93c116a1e8

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 45

•

7 years ago

Talos results:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=94d7c5dfa2fbca52b417f0244dfd124b237b1acc&newProject=try&newRevision=47262140282a6aa0e5273b69d3c25b93c116a1e8&framework=1&showOnlyImportant=0&showOnlyConfident=1

Note, Joel suggested the bloom results are not significant.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 46

•

7 years ago

It seems this test is failing about 10% of the time with these patches:

https://dxr.mozilla.org/mozilla-central/source/browser/base/content/test/general/browser_bug462289.js

I'll try to figure out what is wrong there before anything lands.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 47

•

7 years ago

Comment on attachment 8870254 [details] [diff] [review]
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan

Sorry, I did not actually fold the rounding patch into here yet.

Attachment #8870254 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 48

•

7 years ago

Here is a profile with these patches of the timer-flood stress test:

https://perfht.ml/2qR48aC

Compare to comment 3.

There are still some long pauses, but they are all due to cycle collection now.  The only CriticalSectionContended sample is in a malloc call.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 49

•

7 years ago

What was the failure mode for the image map reftests in P10 without the patch?

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 50

•

7 years ago

(In reply to David Baron :dbaron: ⌚️UTC-4 from comment #49)
> What was the failure mode for the image map reftests in P10 without the
> patch?

It depended on the timing, but typically one of them would run their init() method before first paint.  This would cause the image page to miss the focus() and not trigger any rendering of the dotted outline.  This would then fail the image comparison.  Sometimes it was the -1.html output that was wrong and sometimes it was the -1-ref.html output that was wrong.

I put more description of the underlying problem in bug 1366642.

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 51

•

7 years ago

Here is an example of one of the failures:

https://treeherder.mozilla.org/logviewer.html#?job_id=100314787&repo=try&lineNumber=2929

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 52

•

7 years ago

Comment on attachment 8870261 [details] [diff] [review]
P9 Make docshell chrome tests wait for next paint to avoid races. r=smaug

rs+

Attachment #8870261 - Flags: review?(bugs) → review+

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 53

•

7 years ago

Comment on attachment 8870262 [details] [diff] [review]
P10 Make image map reftest wait for first paint before triggering focus/blur. r=dbaron

OK, r=dbaron, I guess.

I'm a little concerned that there might be tests that are no longer testing what they were originally designed to test, that you didn't detect through test failures, though.  Is there a way you could find such tests?

Attachment #8870262 - Flags: review?(dbaron) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 54

•

7 years ago

(In reply to David Baron :dbaron: ⌚️UTC-4 from comment #53)
> I'm a little concerned that there might be tests that are no longer testing
> what they were originally designed to test, that you didn't detect through
> test failures, though.  Is there a way you could find such tests?

Hmm, I'm not sure how if they don't fail in try.  To be clear, I believe this is a race in the test that has always been there due to the underlying image map issue.  Unfortunately I don't know a good way to find setTimeout() races in non-failing tests.

I will, however, do a new try without this patch to see if its still necessary with the final rounding changes I put in.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a86663f19f4cd74b32e4a3993c3b278373acb47

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 55

•

7 years ago

On my windows desktop with a opt non-pgo build the patches here make setTimeout(f, 0) about 5us faster.  A t-test shows this is statistically significant.

I need to adjust some of the delayed rounding code, though.  My tests show that the current patches fire about 500us later than before.  The old code did no rounding, so I am probably being too conservative here.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 56

•

7 years ago

Comment on attachment 8870256 [details] [diff] [review]
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan

Going to change this patch as well to fix some of the rounding issues.

Attachment #8870256 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 57

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #54)
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=9a86663f19f4cd74b32e4a3993c3b278373acb47

P9 and P10 are still necessary.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 58

•

7 years ago

Lets try with a different rounding strategy.  This gives us the most accurate firing time in my local tests.  For a 25ms timer:

current m-c: mean delay of 27.5ms with stdev of 2.4ms
this patch set: mean delay of 25.1ms with stdev of 0.7ms

I wonder if our tests can handle this improved accuracy:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e7192e9c9fac05998e9ebf559d851ee73898eb9

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 59

•

7 years ago

I need to investigate these two test failures:

devtools/client/debugger/test/mochitest/browser_dbg_worker-window.js
browser/base/content/test/general/browser_bug462289.js

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 60

•

7 years ago

Attached patch P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan (obsolete) — Details — Splinter Review

See comment 38 for the main description.

This update patch further adjusts how precise delays are converted to integral milliseconds for nsITimer.  The previous version of the patch was trying to do a more typical round operation while previous m-c truncated in many cases.

After doing some testing this code now favors rounding down pretty aggressively.  If we fire too early we then schedule another runnable or 0ms timer to get us to the right time.  This provides greatly improved accuracy at the cost of only a couple more runnables.  Even on a fast machine it does not fire large number of runnables.  Also, this cost won't be incurred if the browser is busy as we are unlikely to get the timer fired early in that case.

Attachment #8870254 - Attachment is obsolete: true

Attachment #8870575 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 61

•

7 years ago

Attached patch P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan (obsolete) — Details — Splinter Review

See comment 39 for the main description.

This updated patch just adjusted how RunTimeout() uses the now timestamp a bit.

Attachment #8870256 - Attachment is obsolete: true

Attachment #8870577 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 62

•

7 years ago

I'm going to make any additional changes as follow-on patches, so the current set can be reviewed.  I won't change them any more.  Sorry for the churn.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 63

•

7 years ago

Attached patch P11 Make debugger mochitests perform a full CC in addition to the forced GC during test cleanup. r=jryans (obsolete) — Details — Splinter Review

So, every time we fail browser_dbg_worker-window.js the WorkerPrivate is cycle collected after the test tab-leak-check.  Every time we succeed the WorkerPrivate is cycle collected first.

While I didn't trace the object graph completely, I believe this test is creating a cycle between the window and worker.  It depends on CC to fire in order to clean up properly.

In the past we happened to always get the CC to run before the end of the test, but it was not guaranteed.  With the new setTimeout() implementation sometimes the test finishes just before the CC and we fail.

Ryan, this patch expands the debugger test cleanup routine to perform a CC in addition to the existing forceGC().  I also made it wait for an exact GC to complete for good measure.

This seemed more robust for the overall test suite instead of trying to manually cut the cycle in this particular test.  We don't want to have to hunt these things down again in the future.  I don't think manually cutting cycles is the point of these tests.

Here's a try build with 20 retriggers that shows it seems to fix the problems:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f227fff8334685ffa5c73af25687286370dc242a

I believe its because of the race with the CC timer I described above, but I suppose it could also just be changing timing again.

Here's a full try build with the new patch set:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2cc9a071d92b9eaf9dced6f197125bd56e64def8

Attachment #8870627 - Flags: review?(jryans)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 64

•

7 years ago

I think just browser/base/content/test/general/browser_bug462289.js to sort out now.

J. Ryan Stinnett [:jryans] (Use needinfo, replies may be slow)

Comment 65

•

7 years ago

Comment on attachment 8870627 [details] [diff] [review]
P11 Make debugger mochitests perform a full CC in addition to the forced GC during test cleanup. r=jryans

Review of attachment 8870627 [details] [diff] [review]:
-----------------------------------------------------------------

We do similar things in other DevTools test directories, and you appear to have a good reason for it, so looks good to me.

Attachment #8870627 - Flags: review?(jryans) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 66

•

7 years ago

Ryan convinced me in another bug we should not be firing early as often as we are.  So I'm removing the "flex time" to fire into the future:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3435f35fee4504063c1b2dd2f22d37a8cf26f186

Using this tool to measure precision:

https://people-mozilla.org/~bkelly/timeout-precision?mode=delta&delay=10&count=100

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 67

•

7 years ago

Attached patch P12 Prevent setTimeout() from firing early by removing TimeoutExecutor flex time. r=ehsan (obsolete) — Details — Splinter Review

This patch removes the "flex time" concept introduced in P5.  Ryan convinced me its actually a compat problem to fire setTimeout() early.  Our old implementation does it, but my testing shows that most other browsers don't.

Let's fix this by removing the flex time and preventing timers from firing early.

I still round down our nsITimer delay by 100 microseconds in order to enable the high precision mode.  My next patch will avoid this in background tabs.

Again, for reference I am using this tool to measure our timeout precision:

https://people-mozilla.org/~bkelly/timeout-precision/?mode=delta&delay=10&count=100

On windows I get about 25% of the timeouts scheduled by this page firing early.  Chrome doesn't fire any early.  Edge fires maybe 2% early.

Attachment #8870952 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 68

•

7 years ago

Attached patch P13 Creat a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan (obsolete) — Details — Splinter Review

This patch introduces the concept of low and high precision.  In low precision we favor rounding up our nsITimer delay.  In high precision we favor rounding down our nsITimer delay.

The significance is that when we round down the nsITimer delay we may end up dispatching some additional 0-ms nsITimer or immediate runnables while we wait for the last few microseconds to count down to the deadline.  This is not that expensive, but lets avoid doing it on background tabs.  Who cares if they are a few milliseconds late.

Attachment #8870953 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 69

•

7 years ago

Note, I profiled the "high precision mode" where we may fire extra runnables until the deadline expires.  It doesn't even show up in the profile at all:

https://perfht.ml/2qW4ooP
https://perfht.ml/2qWvMTP

(no longer active)

Updated

•

7 years ago

Attachment #8870244 - Flags: review?(ehsan) → review+

(no longer active)

Comment 70

•

7 years ago

Comment on attachment 8870245 [details] [diff] [review]
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan

Review of attachment 8870245 [details] [diff] [review]:
-----------------------------------------------------------------

Fantastic!

Attachment #8870245 - Flags: review?(ehsan) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 71

•

7 years ago

Attached patch P14 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1632f5f3099eafcb0ce021fe69987a4212ae30a7

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 72

•

7 years ago

Attached patch P14 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1632f5f3099eafcb0ce021fe69987a4212ae30a7

Attachment #8871012 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 73

•

7 years ago

Full try on updated tip:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a55b82a5fdc7ca141690d5c08e89ac7a1167f91

Talos comparison build:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e0a2db5642d3573fc1bf528f641eaa1bec6995bd

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 74

•

7 years ago

Comment on attachment 8871014 [details] [diff] [review]
P14 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan

So it seems specialpowers in general and pushPrefEnv in particular use setTimeout(f, 0) to determine "completion".  Now that setTimeout(f, 0) runs slightly faster this causes some tests fail.

For example, this test becomes intermittent:

https://dxr.mozilla.org/mozilla-central/source/layout/base/tests/test_bug394057.html

I can't see anything in this test other than the pushPrefEnv that could be async.

This patch bumps up the setTimeout calls to a delay of 1 which will ensure an nsITimer is used.  It fixes the test linked here and at least one other.

I feel this patch is kind of lame, but I don't have a better solution since it seems these operations in specialpowers don't have good completion events to really base this stuff on.

Attachment #8871014 - Flags: review?(ehsan)

(no longer active)

Comment 75

•

7 years ago

Comment on attachment 8870248 [details] [diff] [review]
P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan

Review of attachment 8870248 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, I think this is a great improvement over the current state of things!

::: dom/base/Timeout.h
@@ +100,5 @@
>  
>    // The language-specific information about the callback.
>    nsCOMPtr<nsITimeoutHandler> mScriptHandler;
>  
> +  RefPtr<Timeout> mClosureSelfRef;

Nit: This isn't really a closure, unless I'm missing something?  How about simply calling it mSelfRef?

Attachment #8870248 - Flags: review?(ehsan) → review+

(no longer active)

Updated

•

7 years ago

Attachment #8870251 - Flags: review?(ehsan) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 76

•

7 years ago

(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #75)
> > +  RefPtr<Timeout> mClosureSelfRef;
> 
> Nit: This isn't really a closure, unless I'm missing something?  How about
> simply calling it mSelfRef?

This member is removed in P6 anyway, so I probably won't bother renaming this, if thats ok.

(no longer active)

Comment 77

•

7 years ago

Comment on attachment 8870577 [details] [diff] [review]
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan

Review of attachment 8870577 [details] [diff] [review]:
-----------------------------------------------------------------

This was a joy to review, thanks so much for doing this!

::: dom/base/TimeoutManager.cpp
@@ +1033,5 @@
>      return true;
>    }
>  
>    // Reschedule the OS timer. Don't bother returning any error codes if
>    // this fails since the callers of this method don't care about them.

This comment isn't accurate any more.

::: dom/base/TimeoutManager.h
@@ +219,5 @@
>  
>    // Each nsGlobalWindow object has a TimeoutManager member.  This reference
>    // points to that holder object.
>    nsGlobalWindow&             mWindow;
> +  RefPtr<TimeoutExecutor>     mExecutor;

Hmm, why is this an indirect member and not just an inline |TimeoutExecutor mExecutor;|?  I haven't yet looked at all of the remaining parts but if something else doesn't need the indirection please remove it.

Attachment #8870577 - Flags: review?(ehsan) → review+

(no longer active)

Comment 78

•

7 years ago

Comment on attachment 8870575 [details] [diff] [review]
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan

Review of attachment 8870575 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/TimeoutExecutor.cpp
@@ +220,5 @@
> +
> +NS_IMETHODIMP
> +TimeoutExecutor::SetName(const char* aName)
> +{
> +  return NS_ERROR_NOT_IMPLEMENTED;

Can you please implement this method similar to existing nsINamed implementations as a debug-only thing?

::: dom/base/TimeoutExecutor.h
@@ +22,5 @@
> +  nsCOMPtr<nsITimer> mTimer;
> +  TimeStamp mDeadline;
> +  const TimeDuration mFlexTime;
> +
> +  enum class Mode

Can you please add some comment about what this means?  I didn't really understand this well before reading the next part of the patch series, it would be helpful to have some description of the states the object can live in here.

@@ +56,5 @@
> +  void
> +  Shutdown();
> +
> +  nsresult
> +  MaybeSchedule(const TimeStamp& aDeadline);

I found the names of these methods a bit hard to follow, especially in the .cpp file.  I'd consider avoiding using "Maybe" in the names of the methods.  But that requires a lot of code changes, so it's probably not worth it!  I don't really feel that strongly.

Attachment #8870575 - Flags: review?(ehsan) → review+

(no longer active)

Comment 79

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #76)
> (In reply to :Ehsan Akhgari (super long backlog, slow to respond) from
> comment #75)
> > > +  RefPtr<Timeout> mClosureSelfRef;
> > 
> > Nit: This isn't really a closure, unless I'm missing something?  How about
> > simply calling it mSelfRef?
> 
> This member is removed in P6 anyway, so I probably won't bother renaming
> this, if thats ok.

Of course!  (I'm reviewing these as I go, sorry about missing stuff like this.)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 80

•

7 years ago

(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #77)
> This was a joy to review, thanks so much for doing this!

Thanks!

> Hmm, why is this an indirect member and not just an inline |TimeoutExecutor
> mExecutor;|?  I haven't yet looked at all of the remaining parts but if
> something else doesn't need the indirection please remove it.

In theory the execution runnable/timer can be scheduled when the nsGlobalWindow/TimeoutManager is destroyed.  It seemed easiest to handle this by making TimeoutExecutor implement the nsIRunnable/nsITimerCallback interfaces and be its own ref-counted object.

I don't think its too bad because we are saving a ton of heap allocations per nsITimer with the patches in this bug.  We don't have to allocate a separate set of nsTimer/nsTimerImpl/runnable objects for each Timeout like we used to.  Instead we just have a single TimeoutExecutor that stays alive and gets re-used.

(no longer active)

Comment 81

•

7 years ago

Comment on attachment 8870257 [details] [diff] [review]
P7 Remove the Timeout nsITimer code. r=ehsan

Review of attachment 8870257 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/Timeout.cpp
@@ -27,5 @@
>      mPopupState(openAllowed)
>  {
>  }
>  
>  Timeout::~Timeout()

And then this will go also.

::: dom/base/Timeout.h
@@ +30,4 @@
>    : public LinkedListElement<RefPtr<Timeout>>
>  {
>  public:
>    Timeout();

Nit: now you could also have |~Timeout() = default;|.

Attachment #8870257 - Flags: review?(ehsan) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 82

•

7 years ago

(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #78)
> > +NS_IMETHODIMP
> > +TimeoutExecutor::SetName(const char* aName)
> > +{
> > +  return NS_ERROR_NOT_IMPLEMENTED;
> 
> Can you please implement this method similar to existing nsINamed
> implementations as a debug-only thing?

I'm not sure what you mean, many (most?) are implemented with NS_ERROR_NOT_IMPLEMENTED:

https://dxr.mozilla.org/mozilla-central/source/dom/base/nsContentSink.cpp#1641
https://dxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#665
https://dxr.mozilla.org/mozilla-central/source/dom/events/EventStateManager.cpp#247
etc...

(no longer active)

Comment 83

•

7 years ago

Comment on attachment 8870260 [details] [diff] [review]
P8 Remove TimeoutManager back pressure mechanism. r=ehsan

Review of attachment 8870260 [details] [diff] [review]:
-----------------------------------------------------------------

This is fine with me, but I would like Olli to sign off on this since originally he reviewed your patches for it.  In general I agree that it's a complicated and hard to maintain feature and I trust your judgement on whether it serves us more than it hurts us.

Attachment #8870260 - Flags: review?(ehsan) → review?(bugs)

(no longer active)

Comment 84

•

7 years ago

Comment on attachment 8870952 [details] [diff] [review]
P12 Prevent setTimeout() from firing early by removing TimeoutExecutor flex time. r=ehsan

Review of attachment 8870952 [details] [diff] [review]:
-----------------------------------------------------------------

If your version control setup makes it easy, please fold this into part 5, so that we don't add this and take it away needlessly!

Attachment #8870952 - Flags: review?(ehsan) → review+

(no longer active)

Comment 85

•

7 years ago

Comment on attachment 8870953 [details] [diff] [review]
P13 Creat a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan

Review of attachment 8870953 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/TimeoutExecutor.h
@@ +16,5 @@
>  
> +enum class TimeoutPrecision : uint8_t {
> +  Low,
> +  High
> +};

The patch comment (which is great!) should probably go here.

Attachment #8870953 - Flags: review?(ehsan) → review+

(no longer active)

Comment 86

•

7 years ago

Comment on attachment 8871014 [details] [diff] [review]
P14 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan

Review of attachment 8871014 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/specialpowers/content/specialpowersAPI.js
@@ +707,5 @@
>      if (typeof window != 'undefined')
>        setTimeout(callback, 0);
>      // for mochitest-plain
>      else
> +      content.window.setTimeout(callback, 1);

Please add a scary comment here warning people that it's important for this to not go below 1!

@@ +721,5 @@
>           if (typeof window != 'undefined')
>             setTimeout(aCallback, 0);
>           // For mochitest-plain
>           else
> +           content.window.setTimeout(aCallback, 1);

Ditto.

Attachment #8871014 - Flags: review?(ehsan) → review+

(no longer active)

Comment 87

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #82)
> (In reply to :Ehsan Akhgari (super long backlog, slow to respond) from
> comment #78)
> > > +NS_IMETHODIMP
> > > +TimeoutExecutor::SetName(const char* aName)
> > > +{
> > > +  return NS_ERROR_NOT_IMPLEMENTED;
> > 
> > Can you please implement this method similar to existing nsINamed
> > implementations as a debug-only thing?
> 
> I'm not sure what you mean, many (most?) are implemented with
> NS_ERROR_NOT_IMPLEMENTED:
> 
> https://dxr.mozilla.org/mozilla-central/source/dom/base/nsContentSink.
> cpp#1641
> https://dxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.
> cpp#665
> https://dxr.mozilla.org/mozilla-central/source/dom/events/EventStateManager.
> cpp#247
> etc...

Oh... you're right, my memory is failing me.  Runnable::SetName() is *the exception* not the rule.  Sorry, my bad.

(no longer active)

Comment 88

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #80)
> (In reply to :Ehsan Akhgari (super long backlog, slow to respond) from
> comment #77)
> > This was a joy to review, thanks so much for doing this!
> 
> Thanks!
> 
> > Hmm, why is this an indirect member and not just an inline |TimeoutExecutor
> > mExecutor;|?  I haven't yet looked at all of the remaining parts but if
> > something else doesn't need the indirection please remove it.
> 
> In theory the execution runnable/timer can be scheduled when the
> nsGlobalWindow/TimeoutManager is destroyed.  It seemed easiest to handle
> this by making TimeoutExecutor implement the nsIRunnable/nsITimerCallback
> interfaces and be its own ref-counted object.

Oh, so it's lifetime can actually survive past the window/timeout manager... OK, that's true in theory.  Now that I think about it more, it may be better to actually not codify any lifetime assumptions in the code here and keep this a RefPtr.  But can you please add a comment explaining the lifetime of the member?

> I don't think its too bad because we are saving a ton of heap allocations
> per nsITimer with the patches in this bug.  We don't have to allocate a
> separate set of nsTimer/nsTimerImpl/runnable objects for each Timeout like
> we used to.  Instead we just have a single TimeoutExecutor that stays alive
> and gets re-used.

Yeah, the patches are improving things overall so I'm not very concerned about this.  :-)  And your argument about is actually a strong reason *not* to have this as an inline member.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 89

•

7 years ago

Attached patch P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8870575 - Attachment is obsolete: true

Attachment #8870952 - Attachment is obsolete: true

Attachment #8871460 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 90

•

7 years ago

Attached patch P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8870577 - Attachment is obsolete: true

Attachment #8871461 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 91

•

7 years ago

Attached patch P7 Remove the Timeout nsITimer code. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8870257 - Attachment is obsolete: true

Attachment #8871463 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 92

•

7 years ago

Attached patch P12 Create a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8870953 - Attachment is obsolete: true

Attachment #8871464 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 93

•

7 years ago

Attached patch P13 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan (obsolete) — Details — Splinter Review

I added a comment.  I also decided we should use 1ms throughout for the setTimeout() calls here.  I was confused and thought some of them were using nsITimer jsm thing, but they are all using the nsGlobalWindow version.

Attachment #8871465 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 94

•

7 years ago

Attached patch P12 Create a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan (obsolete) — Details — Splinter Review

Attachment #8871464 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 95

•

7 years ago

A couple more try builds:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=039531b582f3bb791a59836afd0007b064127662
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ed69269adedbc701a2cc57942c6c073dc509e187

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 96

•

7 years ago

Maybe P13 changes to specialpowers setTimeout is causing more trouble than it fixed.  Lets see what it looks like without that:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ef58152d396e2d5c2d44a11f95ccc568cd4f3aa8

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 97

•

7 years ago

Comment on attachment 8870260 [details] [diff] [review]
P8 Remove TimeoutManager back pressure mechanism. r=ehsan

I'm ok removing this. TimeoutManager has gotten so complicated that some code removals are good.

Attachment #8870260 - Flags: review?(bugs) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 98

•

7 years ago

Comment on attachment 8871465 [details] [diff] [review]
P13 Make pushPrefEnv() use setTimeout(1) since setTimeout(0) fires slightly faster now. r=ehsan

In retrospect I don't think this patch is necessary.

Attachment #8871465 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 99

•

7 years ago

Comment on attachment 8871468 [details] [diff] [review]
P12 Create a low precision mode and use it for scheduling TimeoutExecutor in background tabs. r=ehsan

I think I have a better solution than the "high precision mode" thing that does a short async runnable spin of the event loop.

Attachment #8871468 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 100

•

7 years ago

Attached patch P12 Expose nsITimer::PreciseInitWithCallback() that takes a TimeDuration delay. r=froydnj (obsolete) — Details — Splinter Review

The root cause of the precision problems in this bug are that we require an integer number of milliseconds be passed to nsITimer. This is an artificial limitation, though. The underlying implementation already supports a finer grained measurement.

This patch simply pipes TimeDuration into the nsITimer using a new PreciseInitWithCallback() method. I only bothered with the nsITimerCallback variant since that is what I needed. Others could be added if desired.

I also preserved TimeDuration throughout the TimerThread run loop, Monitor, and CondVar layers. This allows the precise delay to be passed down to the OS level call.

You may be asking, why do we need this now and not before?

To answer, consider this situation:

1. setTimeout(f, 1);
2. do some work for 500us
3. setTimeout(g, 1);

This should fire f() and g() 500us apart.

In our existing code this works because each setTimeout() gets its on nsITimer. The 1ms is preserved and pass through to nsITimer which converts it to a TimeStamp, etc.

With the patches in this bug there is only one nsITimer. We fire f() and then try to schedule a new nsITimer for g(). Its only 500us in the future, though! What do we do? We either fire early and spin for a while or fire late.

The best solution I've found is to preserve the accuracy through the nsITimer implementation. This avoids extra runnables and wakeups, while also improving our overall accuracy.

On my machine using this approach changed the output on this site:

https://people-mozilla.org/~bkelly/timeout-precision/?mode=delta&delay=10&count=100

From:

min:0.095 mean:0.640 median:0.525 max:6.860 stdev:0.727

To:

min:0.015 mean:0.842 median:0.052 max:5.055 stdev:1.589

As you can see, our median deviation from the target firing time is greatly improved.

Attachment #8871910 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 101

•

7 years ago

Attached patch P13 Use the nsITimer::PreciseInitWithCallback() instead of rounding. r=ehsan (obsolete) — Details — Splinter Review

Ehsan, this patch replaces the old patch to create a High/Low precision enumeration.  Instead of doing that we instead just use the new API to pass the precise delay to nsITimer.  This avoid the need to round at all.

Attachment #8871912 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 102

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0532cfb8b08ef9dbd8aa5e6bac190e2f60d40898

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8871912 - Flags: review?(ehsan)

Nathan Froyd [:froydnj]

Comment 103

•

7 years ago

Comment on attachment 8871910 [details] [diff] [review]
P12 Expose nsITimer::PreciseInitWithCallback() that takes a TimeDuration delay. r=froydnj

Review of attachment 8871910 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is OK; it would be nice if your explanation was in the commit message and/or found a nice comment in the source code.  I can't think of a good place to drop the comment in the source code, though.

r-'ing just for the wait() thing.  The naming thing is just complaining, but if you have a better idea for the name...InitWithCallbackAndTimeDuration?

::: xpcom/threads/CondVar.h
@@ -64,5 @@
>  #ifdef MOZILLA_INTERNAL_API
>      GeckoProfilerThreadSleepRAII sleep;
>  #endif //MOZILLA_INTERNAL_API
> -    if (aInterval == PR_INTERVAL_NO_TIMEOUT) {
> -      mImpl.wait(*mLock);

Can we please try and retain this behavior of calling the no-timeout wait?  The no-argument/PR_INTERVAL_NO_TIMEOUT case is overwhemingly the common case and involves significantly less work on all of our platforms.

::: xpcom/threads/nsITimer.idl
@@ +170,5 @@
> +   * @param aFunc      nsITimerCallback interface to call when timer expires
> +   * @param aDelay     The high precision interval
> +   * @param aType      Timer type per TYPE* consts defined above
> +   */
> +  [noscript] void preciseInitWithCallback(in nsITimerCallback aCallback,

I wish there was a different adjective to use to describe this.  We have "precise" in the TYPE* constants which has (had) a completely different meaning than the one here.  Unfortunately, I can't think of a different adjective to use here, and I think providing a function overload doesn't work unless we [binaryName] the existing initWithCallback...which just moves the naming issue somewhere else and makes the patch for this bug much bigger.

Attachment #8871910 - Flags: review?(nfroyd) → review-

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 104

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #103)
> ::: xpcom/threads/CondVar.h
> @@ -64,5 @@
> >  #ifdef MOZILLA_INTERNAL_API
> >      GeckoProfilerThreadSleepRAII sleep;
> >  #endif //MOZILLA_INTERNAL_API
> > -    if (aInterval == PR_INTERVAL_NO_TIMEOUT) {
> > -      mImpl.wait(*mLock);
> 
> Can we please try and retain this behavior of calling the no-timeout wait? 
> The no-argument/PR_INTERVAL_NO_TIMEOUT case is overwhemingly the common case
> and involves significantly less work on all of our platforms.

I can put this back, but the underlying primitive explicitly handles TimeDuration::Forever():

https://dxr.mozilla.org/mozilla-central/source/mozglue/misc/ConditionVariable_posix.cpp#126
https://dxr.mozilla.org/mozilla-central/source/mozglue/misc/ConditionVariable_windows.cpp#69

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 105

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #103)
> I think this is OK; it would be nice if your explanation was in the commit
> message and/or found a nice comment in the source code.  I can't think of a
> good place to drop the comment in the source code, though.

I'll put the explanation in a comment in the next patch where TimeoutExecutor uses this API.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 106

•

7 years ago

Attached patch P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj (obsolete) — Details — Splinter Review

Attachment #8871910 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 107

•

7 years ago

Comment on attachment 8871938 [details] [diff] [review]
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj

Review of attachment 8871938 [details] [diff] [review]:
-----------------------------------------------------------------

Updated per our IRC conversation.

Attachment #8871938 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 108

•

7 years ago

Attached patch P13 Use the nsITimer::InitWithCallbackAndTimeDuration() instead of rounding. r=ehsan (obsolete) — Details — Splinter Review

Updated with new method name and added explanatory comment.

Attachment #8871912 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8871938 - Attachment description: P12 Expose nsITimer::PreciseInitWithCallback() that takes a TimeDuration delay. r=froydnj → P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8871954 - Attachment description: Use the nsITimer::InitWithCallbackAndTimeDuration() instead of rounding. r=ehsan → P13 Use the nsITimer::InitWithCallbackAndTimeDuration() instead of rounding. r=ehsan

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 109

•

7 years ago

With instrumentation I see we still get 1 or 2 spurious wakeups when nsITimer fires early.  Unfortunately I don't think we can do any better right now without allowing early setTimeout() callbacks.  I think that is very undesirable from a web compat perspective.

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8871954 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8871014 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 110

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8d64a2b20f507a25cb4b9e5cb9ca12f1775f448d

Emanuel Hoogeveen [:ehoogeveen]

Comment 111

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #109)
> With instrumentation I see we still get 1 or 2 spurious wakeups when
> nsITimer fires early.  Unfortunately I don't think we can do any better
> right now without allowing early setTimeout() callbacks.  I think that is
> very undesirable from a web compat perspective.

Any idea why? Is this just the OS being wonky? To wit, Sleep(1) doesn't sleep for 1ms but rather the shortest supported period, which is usually 0.5ms (at least after calling timeBeginPeriod(1), as I believe we do). Sleep(0) almost does the same thing as SwitchToThread() (that is, yields the rest of its slice to other threads), except that it also allows I/O to execute. Sleep(n) with n >= 2 work as you would expect.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 112

•

7 years ago

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #111)
> Any idea why? Is this just the OS being wonky? To wit, Sleep(1) doesn't
> sleep for 1ms but rather the shortest supported period, which is usually
> 0.5ms (at least after calling timeBeginPeriod(1), as I believe we do).
> Sleep(0) almost does the same thing as SwitchToThread() (that is, yields the
> rest of its slice to other threads), except that it also allows I/O to
> execute. Sleep(n) with n >= 2 work as you would expect.

Uh, because I missed some bits in TImerThread that still use the old integer based interval logic.  If I fix those I stop getting early wake ups.

Also, FWIW, this code does not use sleep.  It waits on a condition variable.

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8871938 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 113

•

7 years ago

Attached patch P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj (obsolete) — Details — Splinter Review

Attachment #8871938 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 114

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #112)
> (In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #111)
> > Any idea why? Is this just the OS being wonky? To wit, Sleep(1) doesn't
> > sleep for 1ms but rather the shortest supported period, which is usually
> > 0.5ms (at least after calling timeBeginPeriod(1), as I believe we do).
> > Sleep(0) almost does the same thing as SwitchToThread() (that is, yields the
> > rest of its slice to other threads), except that it also allows I/O to
> > execute. Sleep(n) with n >= 2 work as you would expect.
> 
> Uh, because I missed some bits in TImerThread that still use the old integer
> based interval logic.  If I fix those I stop getting early wake ups.
> 
> Also, FWIW, this code does not use sleep.  It waits on a condition variable.

I'm going to have to investigate this a bit more.  I'm not sure how safe it is to try to change this logic given that underlying windows code is based on integer milliseconds:

https://dxr.mozilla.org/mozilla-central/source/mozglue/misc/ConditionVariable_windows.cpp#72

I don't want to accidentally make the TimerThread spin a bunch.

I think review of this will have to wait for next Wednesday after the holiday.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 115

•

7 years ago

Attached patch P14 Make browser_bug462289.js test use setTimeout(f, 0) like it did before bug 1102017. r=dao (obsolete) — Details — Splinter Review

Dao, I found that browser_bug462289.js fails frequently with these patches.  I saw that you changed this file to use executeSoon() instead of setTimeout(f, 0) in bug  1102017.  Was there a reason for that?

I spoke with Blake and he suggested the significant change bug 1102017 was waiting for TabSwitchDone and that the executeSoon() changes may not be strictly necessary.

If I switch the executeSoon() calls back to setTimeout(f, 0) the test is no longer intermittent.

Given that the timing of setTimeout is slightly different its possible this test is racing with a setTimeout somewhere else in browser chrome code.  Using setTimeout here would provide some ordering guarantees that executeSoon does not.  That is my best theory about why it works here.

Attachment #8871998 - Flags: review?(dao+bmo)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 116

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #112)
> (In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #111)
> > Any idea why? Is this just the OS being wonky? To wit, Sleep(1) doesn't
> > sleep for 1ms but rather the shortest supported period, which is usually
> > 0.5ms (at least after calling timeBeginPeriod(1), as I believe we do).
> > Sleep(0) almost does the same thing as SwitchToThread() (that is, yields the
> > rest of its slice to other threads), except that it also allows I/O to
> > execute. Sleep(n) with n >= 2 work as you would expect.
> 
> Uh, because I missed some bits in TImerThread that still use the old integer
> based interval logic.  If I fix those I stop getting early wake ups.

This was not why we were getting spurious wakeups.  The issue is that TimerThread will execute a timer up to have the resolution of a PRInterval early.  On windows PRInterval is 1ms, so this means TimerThread will fire a timer up to 500us early.

I'm not sure I feel comfortable changing that here.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 117

•

7 years ago

Or maybe it will fire 250us early.  In any case, its a fairly large number of microseconds that TimerThread explicitly decides to fire early on windows.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 118

•

7 years ago

Attached patch P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj (obsolete) — Details — Splinter Review

Attachment #8871996 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 119

•

7 years ago

Attached patch P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj (obsolete) — Details — Splinter Review

Attachment #8872001 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 120

•

7 years ago

Comment on attachment 8872003 [details] [diff] [review]
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj

Ok, this implements the changes we discussed in IRC:

1. Separate CondVar::Wait() in non-debug.
2. Change the name to InitWithCallbackAndTimeDuration()

I also changed TimerThread::Run() to use TimeDuration throughout.

Finally, I was using TimeDuration::Resolution() in TimerThread::Run().  I don't think this is equivalent to the halfMicroSecondsIntervalResolution, though.  I switched it back to use the TimerThread's specific half resolution calculation instead.

As I mentioned above, we do still get early wakeups from nsITimer with these changes on windows.  They are, however, greatly reduced.

Attachment #8872003 - Flags: review?(nfroyd)

(no longer active)

Comment 121

•

7 years ago

Comment on attachment 8871954 [details] [diff] [review]
P13 Use the nsITimer::InitWithCallbackAndTimeDuration() instead of rounding. r=ehsan

Review of attachment 8871954 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/TimeoutExecutor.cpp
@@ +70,5 @@
> +  // 3. setTimeout(g, 1);
> +  //
> +  // This should fire f() and g() 500us apart.
> +  //
> +  // In the past worked because each setTimeout() got its own nsITimer.  The 1ms

Nit: the first sentence seems like it's missing a word somewhere in the middle.

Attachment #8871954 - Flags: review?(ehsan) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 122

•

7 years ago

Comment on attachment 8872003 [details] [diff] [review]
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj

On further reflection I'm not sure this patch is a good idea.

Removing the PR_MicrosecondsToInterval() call here effectively removes some rounding:

https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/TimerThread.cpp#514

This patch results in use doing a floor to the nearest interval instead.  So its likely spinning the TimerThread.

I'll post a new patch that just uses TimeDuration to set the nsTimerImpl::mTimeout.  The logic for scheduling the wakeup will not change, though.

The end result is that the patch set in this bug will probably make us fire late a bit more frequently than we do today.

Attachment #8872003 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 123

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7a57205b7ef6be580e412548ab2709da9b238102

Dão Gottwald [:dao]

Updated

•

7 years ago

Attachment #8871998 - Flags: review?(dao+bmo) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 124

•

7 years ago

In the interest of trying to maintain our old behavior, lets see if we can allow early firing again.  This will only be limited to the amount allowed by TimerThread.  If our TimerThread ever becomes more accurate then this would automatically adjust.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=69f9ff7e02187f35180197dca6fa9fd798905161

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 125

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=75523509a49eddfb83f33a017a20afd917c8b14c

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 126

•

7 years ago

Ok, this latest try is looking good.  I don't see spurious wakeups locally.  The precision is about the same as we have had historically.

I'm going to clean these remaining patches up so maybe I can get this reviewed and landed on Wednesday.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 127

•

7 years ago

Attached patch P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj (obsolete) — Details — Splinter Review

Attachment #8872003 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 128

•

7 years ago

Attached patch P15 Expose nsITimer::GetAllowedEarlyFiringMicroseconds(). r=froydnj (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 129

•

7 years ago

Attached patch P16 Allow setTimeout() to fire early based on nsIThread::GetAllowedEarlyFiringMicroseconds(). r=ehsan (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1e0965a1e28e6d32514bff2c0dec99c56a9477d3

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 130

•

7 years ago

Try build based on m-c tip:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c0d86c9bfed1463aea422365537fc91357a3395

Talos comparison push:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3514b634dd5885723624fe119d0bea2115ea8ce2

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 131

•

7 years ago

There is a new persistent failure I need to investigate:

devtools/client/netmonitor/test/browser_net_simple-request-data.js

Looks like this test was just changed recently for bug 1242505.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 132

•

7 years ago

Paolo, do you have any advice on how to approach this error?  It started showing up with my patchset here after you landed bug 1242505.

https://treeherder.mozilla.org/logviewer.html#?job_id=102884059&repo=try&lineNumber=5879

I don't actually do anything specific with this test, but my code does slightly change setTimeout() timings.  I don't see any specific setTimeout() calls in this test though.

Flags: needinfo?(paolo.mozmail)

:Paolo Amadini

Comment 133

•

7 years ago

It seems the failure occurs in devtools/client/netmonitor/test/head.js line 408. If this is intermittent, it may be one of the test shutdown issues with devtools, and in bug 1242505 we just just whitelisted them in these three places:

https://dxr.mozilla.org/mozilla-central/search?q=%22tabTarget+is+null%22&redirect=true

If this is deterministic after your patch lands, then you should either debug and address the issue if it's easy to fix, or use expectUncaughtRejection in the individual test file.

Flags: needinfo?(paolo.mozmail)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 134

•

7 years ago

(In reply to :Paolo Amadini from comment #133)
> If this is deterministic after your patch lands, then you should either
> debug and address the issue if it's easy to fix, or use
> expectUncaughtRejection in the individual test file.

I'll do some more pushes, but it does not look deterministic to me.  For example, in my try build it triggered in win7 VM opt, but not in win7 opt.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 135

•

7 years ago

Attached patch P17 Delay browser_net_simple-request-data.js teardown to allow all events to fire. r=Honza (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=62d685c7f432a576f2c651c38edf51c776a07b33

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8872661 - Attachment description: Bug 1363829 P17 Delay browser_net_simple-request-data.js teardown to allow all events to fire. r=Honza → P17 Delay browser_net_simple-request-data.js teardown to allow all events to fire. r=Honza

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 136

•

7 years ago

Comment on attachment 8872422 [details] [diff] [review]
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj

Review of attachment 8872422 [details] [diff] [review]:
-----------------------------------------------------------------

Nathan, this patch is similar to the last one I submitted except:

1. It only includes the bits letting us pass a TimeDuration into nsITimer.
2. It does not change the Monitor or CondVar APIs since underneath they do not support better than millisecond resolution on all platforms.
3. It does not modify the main TimerThread::Run() loop logic at all.

Basically it preserves the rounding TimerThread::Run() that we have always done.  The benefit of the patch is to avoid rounding at the nsITimer API layer as well.

Attachment #8872422 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 137

•

7 years ago

Comment on attachment 8872423 [details] [diff] [review]
P15 Expose nsITimer::GetAllowedEarlyFiringMicroseconds(). r=froydnj

Review of attachment 8872423 [details] [diff] [review]:
-----------------------------------------------------------------

Nathan, this patch exposes an AllowedEarlyFiringTime() method on nsITimer.  The TimerThread internally decides to fire timers up to 250us early.  I'd like to make the setTimeout() logic aware of this value so it can avoid scheduling additional timers if its within this range of the deadline.

I did this as a method on nsITimer so that it can be adjusted if we ever support different timer implementations based on platform, etc.

Attachment #8872423 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 138

•

7 years ago

Comment on attachment 8872424 [details] [diff] [review]
P16 Allow setTimeout() to fire early based on nsIThread::GetAllowedEarlyFiringMicroseconds(). r=ehsan

Review of attachment 8872424 [details] [diff] [review]:
-----------------------------------------------------------------

Ehsan, this patch restores the ability of setTimeout() to fire slightly early.  Previous patches called this "flex time".  In this patch, however, the value is based on the actual calculated value from our nsITimer implementation.  If the underlying implementation changes then setTimeout() will automatically adjust as well.

AFAICT, we currently use 250us for this value on our main platforms today.

Attachment #8872424 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 139

•

7 years ago

Attached patch P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza (obsolete) — Details — Splinter Review

Jan, I was getting intermittent failures in browser_net_simple-request-data.js with my patches here.  They change the timing of things slightly which tends to reveal race conditions.

In this case browser_net_simple-request-data.js has a race condition in that waitForNetworkEvents() resolves its promise when it first receives the last expected event.  The STARTED_RECEIVING_RESPONSE event, however, has asynchronous steps that need to be completed before teardown.

This patch makes the test explicitly wait for all of the queued async test steps.

I also changed the cookies test assertions since they just don't seem accurate any more.  The .requestCookies and .responseCookies attributes don't have a .cookies attribute.  They are just arrays themselves.  I don't think they were getting picked up by the test previously because .once() doesn't catch exceptions.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=31ca9c4aaadb88c217b89818070be7040df50c80

Attachment #8872725 - Flags: review?(odvarko)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8872661 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 140

•

7 years ago

Rebased on tip:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8f82514e8f3b0ce6ab3f6e41fbe9b11f70dc48a7

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 141

•

7 years ago

Ugh, the browser_net_simple-request-data.js test is picking up an extra cookie from browser_net_filter* tests.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 142

•

7 years ago

Attached patch P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4e9bfc149ea936c5c1137d1ec40829999262b912

Attachment #8872725 - Attachment is obsolete: true

Attachment #8872725 - Flags: review?(odvarko)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 143

•

7 years ago

Comment on attachment 8872755 [details] [diff] [review]
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza

Jan, please seem comment 139 for the initial description.

This patch further adds a Service.cookies.removeAll() call to the test directories cleanup handler.  The test was failing in automation because a cookie was leaking from another test to this one.

Looks green on try now.

Attachment #8872755 - Flags: review?(odvarko)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 144

•

7 years ago

Latest try is looking pretty good:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f8eb8d48591c95155f932879b9fea8d1fd77a2dc

Assuming the last few reviews go ok I'm hoping to land this tomorrow.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 145

•

7 years ago

Comment on attachment 8872755 [details] [diff] [review]
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza

Unfortunately this test still gives periodic errors like this.

https://treeherder.mozilla.org/logviewer.html#?job_id=103177945&repo=try&lineNumber=4469

Attachment #8872755 - Flags: review?(odvarko)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 146

•

7 years ago

Attached patch P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b65ec1569770fe89a741d609d17d6a11f9378426

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Attachment #8872755 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 147

•

7 years ago

Comment on attachment 8872844 [details] [diff] [review]
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza

Jan, sorry for the flag spam here.  See comment 139 and comment 143 for the previous descriptions.

I had to further update the test here to check for null requestItem in the various waitUntil() polling functions.  It seems its possible for the poll to sometimes run fast enough to get an undefined item in the list.

Again, I don't think the test reported these kinds of errors before because once() does not catch exceptions.  This patch now converts them to promise rejections.

Attachment #8872844 - Flags: review?(odvarko)

Jan Honza Odvarko [:Honza] (always need-info? me)

Comment 148

•

7 years ago

Comment on attachment 8872844 [details] [diff] [review]
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza

Review of attachment 8872844 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.

R+ assuming Try is green

Thanks for the patch!

Honza

Attachment #8872844 - Flags: review?(odvarko) → review+

Nathan Froyd [:froydnj]

Comment 149

•

7 years ago

Comment on attachment 8872422 [details] [diff] [review]
P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj

Review of attachment 8872422 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the name change.

::: xpcom/threads/nsITimer.idl
@@ +172,5 @@
> +   * @param aType      Timer type per TYPE* consts defined above
> +   */
> +  [noscript] void InitWithCallbackAndTimeDuration(in nsITimerCallback aCallback,
> +                                                  [const] in TimeDuration aDelay,
> +                                                  in unsigned long aType);

Having had a chance to think about it a little more, WDYT about calling this InitHighResolutionWithCallback?  This name seems preferable to me, and is resilient in the face of TimeDuration renaming changes or similar.

Attachment #8872422 - Flags: review?(nfroyd) → review+

Nathan Froyd [:froydnj]

Comment 150

•

7 years ago

Comment on attachment 8872424 [details] [diff] [review]
P16 Allow setTimeout() to fire early based on nsIThread::GetAllowedEarlyFiringMicroseconds(). r=ehsan

Review of attachment 8872424 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/TimeoutExecutor.h
@@ +22,5 @@
>    nsCOMPtr<nsITimer> mTimer;
>    TimeStamp mDeadline;
>  
> +  // Limits how far we allow timers to fire into the future from their
> +  // deadline.  Starts of at zero, but is then adjusted when we start

Nit: "Starts *off* at zero..."

Nathan Froyd [:froydnj]

Comment 151

•

7 years ago

Comment on attachment 8872423 [details] [diff] [review]
P15 Expose nsITimer::GetAllowedEarlyFiringMicroseconds(). r=froydnj

Review of attachment 8872423 [details] [diff] [review]:
-----------------------------------------------------------------

These comments-as-a-previous-review are great, but they'd be better as proper commit messages...

::: xpcom/threads/nsITimer.idl
@@ +252,5 @@
> +  /**
> +   * The number of microseconds this nsITimer implementation can possibly
> +   * fire early.
> +   */
> +  [noscript] readonly attribute unsigned long allowedEarlyFiringMicroseconds;

Make this [noscript,infallible] so the C++ side can just call GetAllowedEarlyFiringMicroseconds()...we don't ever expect that this could fail, right?

Attachment #8872423 - Flags: review?(nfroyd) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 152

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #151)
> Make this [noscript,infallible] so the C++ side can just call
> GetAllowedEarlyFiringMicroseconds()...we don't ever expect that this could
> fail, right?

I tried doing [infallible], but that requires the interface to be [builtinclass].  That seemed to be problematic with the way nsTimer/nsTimerImpl are currently written.  I was trying not to change too much here, so I just skipped the infallible for now.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 153

•

7 years ago

Attached patch P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan — Details — Splinter Review

Attachment #8870244 - Attachment is obsolete: true

Attachment #8873196 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 154

•

7 years ago

Attached patch P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan — Details — Splinter Review

Attachment #8870245 - Attachment is obsolete: true

Attachment #8873197 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 155

•

7 years ago

Attached patch P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan — Details — Splinter Review

Attachment #8870248 - Attachment is obsolete: true

Attachment #8873198 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 156

•

7 years ago

Attached patch P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan — Details — Splinter Review

Attachment #8870251 - Attachment is obsolete: true

Attachment #8873200 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 157

•

7 years ago

Attached patch P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan — Details — Splinter Review

Attachment #8871460 - Attachment is obsolete: true

Attachment #8873202 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 158

•

7 years ago

Attached patch P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan — Details — Splinter Review

Attachment #8871461 - Attachment is obsolete: true

Attachment #8873203 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 159

•

7 years ago

Attached patch P7 Remove the Timeout nsITimer code. r=ehsan — Details — Splinter Review

Attachment #8871463 - Attachment is obsolete: true

Attachment #8873204 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 160

•

7 years ago

Attached patch P8 Remove TimeoutManager back pressure mechanism. r=smaug — Details — Splinter Review

Attachment #8870260 - Attachment is obsolete: true

Attachment #8873206 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 161

•

7 years ago

Attached patch P9 Make docshell chrome tests wait for next paint to avoid races. r=smaug — Details — Splinter Review

Attachment #8870261 - Attachment is obsolete: true

Attachment #8873207 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 162

•

7 years ago

Attached patch P10 Make image map reftest wait for first paint before triggering focus/blur. r=dbaron — Details — Splinter Review

Attachment #8870262 - Attachment is obsolete: true

Attachment #8873213 - Flags: review+

(no longer active)

Updated

•

7 years ago

Attachment #8872424 - Flags: review?(ehsan) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 163

•

7 years ago

Attached patch P11 Make debugger mochitests perform a full CC in addition to the forced GC during test cleanup. r=jryans — Details — Splinter Review

Attachment #8870627 - Attachment is obsolete: true

Attachment #8873240 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 164

•

7 years ago

Attached patch P12 Expose nsITimer::InitWithCallbackAndTimeDuration() method. r=froydnj (obsolete) — Details — Splinter Review

This includes the suggested rename.

Attachment #8872422 - Attachment is obsolete: true

Attachment #8873241 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 165

•

7 years ago

Attached patch P12 Expose nsITimer::InitHighResolutionWithCallback() method. r=froydnj — Details — Splinter Review

Rename the method in the commit message as well.

Attachment #8873241 - Attachment is obsolete: true

Attachment #8873242 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 166

•

7 years ago

Attached patch P13 Use the nsITimer::InitHighResolutionWithCallback() instead of rounding. r=ehsan — Details — Splinter Review

Attachment #8871954 - Attachment is obsolete: true

Attachment #8873243 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 167

•

7 years ago

Attached patch P14 Make browser_bug462289.js test use setTimeout(f, 0) like it did before bug 1102017. r=dao — Details — Splinter Review

Attachment #8871998 - Attachment is obsolete: true

Attachment #8873245 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 168

•

7 years ago

Attached patch P15 Expose nsITimer::GetAllowedEarlyFiringMicroseconds(). r=froydnj — Details — Splinter Review

Attachment #8872423 - Attachment is obsolete: true

Attachment #8873246 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 169

•

7 years ago

Attached patch P16 Allow setTimeout() to fire early based on nsIThread::GetAllowedEarlyFiringMicroseconds(). r=ehsan — Details — Splinter Review

Attachment #8872424 - Attachment is obsolete: true

Attachment #8873247 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 170

•

7 years ago

Attached patch P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza — Details — Splinter Review

Attachment #8872844 - Attachment is obsolete: true

Attachment #8873249 - Flags: review+

Pulsebot

Comment 171

•

7 years ago

Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2e778ec02f89
P1 Remove some deadcode in TimeoutManager related to processing pending timers. r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/7eec06a678cb
P2 Removing the Timeout argument to TimeoutManager::RunTimeout(). r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/60ce370ec87b
P3 Improve Timeout ref counting to avoide bare AddRef/Release calls. r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/aea41a0174ae
P4 Store the scheduled delay on Timeout instead of relying on the nsITimer to hold it. r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/9ac282f67370
P5 Add an executor class to represent the single timer executing timers for a window. r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/5214ce6bed25
P6 Use the TimeoutExecutor in TimeoutManager. r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/32c98a6c8750
P7 Remove the Timeout nsITimer code. r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/bd452eda2e83
P8 Remove TimeoutManager back pressure mechanism. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/a70379a2937c
P9 Make docshell chrome tests wait for next paint to avoid races. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd8ae8487d00
P10 Make image map reftest wait for first paint before triggering focus/blur. r=dbaron
https://hg.mozilla.org/integration/mozilla-inbound/rev/833f4fe14182
P11 Make debugger mochitests perform a full CC in addition to the forced GC during test cleanup. r=jryans
https://hg.mozilla.org/integration/mozilla-inbound/rev/79c1d7e0d539
P12 Expose nsITimer::InitHighResolutionWithCallback() method. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/1c99a1222200
P13 Use the nsITimer::InitHighResolutionWithCallback() instead of rounding. r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/9b5fb2901eac
P14 Make browser_bug462289.js test use setTimeout(f, 0) like it did before bug 1102017. r=dao
https://hg.mozilla.org/integration/mozilla-inbound/rev/d27d106fc45d
P15 Expose nsITimer::GetAllowedEarlyFiringMicroseconds(). r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/95421442eec4
P16 Allow setTimeout() to fire early based on nsIThread::GetAllowedEarlyFiringMicroseconds(). r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/7c6116ac71e9
P17 Make browser_net_simple-request-data.js wait for all async tests to complete before finishing. r=honza

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 172

•

7 years ago

Note, I did not measure any improvement on speedometer with this patch.  Dropping the dependency.

No longer blocks: Speedometer_V2

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 173

•

7 years ago

For future reference, here is a profile of:

https://mozdevs.github.io/servo-experiments/experiments/tiles/

On my mac book pro using 55.0a1 (2017-05-28) (64-bit):

https://perfht.ml/2soSOQz

Once a nightly spins with the patches from this bug I will compare.

Carsten Book [:Tomcat]

Comment 174

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/2e778ec02f89
https://hg.mozilla.org/mozilla-central/rev/7eec06a678cb
https://hg.mozilla.org/mozilla-central/rev/60ce370ec87b
https://hg.mozilla.org/mozilla-central/rev/aea41a0174ae
https://hg.mozilla.org/mozilla-central/rev/9ac282f67370
https://hg.mozilla.org/mozilla-central/rev/5214ce6bed25
https://hg.mozilla.org/mozilla-central/rev/32c98a6c8750
https://hg.mozilla.org/mozilla-central/rev/bd452eda2e83
https://hg.mozilla.org/mozilla-central/rev/a70379a2937c
https://hg.mozilla.org/mozilla-central/rev/cd8ae8487d00
https://hg.mozilla.org/mozilla-central/rev/833f4fe14182
https://hg.mozilla.org/mozilla-central/rev/79c1d7e0d539
https://hg.mozilla.org/mozilla-central/rev/1c99a1222200
https://hg.mozilla.org/mozilla-central/rev/9b5fb2901eac
https://hg.mozilla.org/mozilla-central/rev/d27d106fc45d
https://hg.mozilla.org/mozilla-central/rev/95421442eec4
https://hg.mozilla.org/mozilla-central/rev/7c6116ac71e9

Status: ASSIGNED → RESOLVED

Closed: 7 years ago

status-firefox55: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla55

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 175

•

7 years ago

I took some profiles before and after this bug landed on:

Tiles:
* https://mozdevs.github.io/servo-experiments/experiments/tiles/
* before:
* https://perfht.ml/2sx78Xn
* https://perfht.ml/2sxlsPv
* after:
* https://perfht.ml/2swX0xI
* https://perfht.ml/2rzrUbu

Here you can see in the "before" case that there are large 200ms to 500ms pauses. During this period there is a lot of setTimeout() and nsTimer runnables executing. About ~65ms to ~165ms are spent in setTimeout.

In the after case pauses are limited to ~80ms. Of these pauses setTimeout takes ~30ms.

I don't completely understand this result, but there are a couple things that could be helping here:

1) Our setTimeout() doesn't always lock with TimerThread any more. So we can execute more continuously on the main thread without forcing context switching.
2) We are dispatching many fewer timer events, so the possibility to intermix nsTimer runnables during these periods is reduced.
3) Processing timeouts faster allows us to drain the timeout list more quickly reducing the amount of timeouts we have to search in Insert().

Flood:
* https://people-mozilla.org/~bkelly/timer-flood/index.html
* before:
* https://perfht.ml/2sxkPFT
* https://perfht.ml/2sxdDtm
* after:
* https://perfht.ml/2rzqm1n
* https://perfht.ml/2rz3cIk

The main take away from these profiles is that we end up contending with the TimerThread a lot less. If you search for "Acquire" you can see how often mutexes show up in the profile (this is windows). There is a lot more contention in the "before" case.

Also, there is a huge amount of contention during back pressure reduction. This is one reason I chose to remove the back pressure mechanism for now.

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

7 years ago

Depends on: 1369440

(no longer active)

Comment 176

•

7 years ago

(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #175)
> I took some profiles before and after this bug landed on:
> 
> Tiles:
>   * https://mozdevs.github.io/servo-experiments/experiments/tiles/
>     * before:
>       * https://perfht.ml/2sx78Xn

This profile isn't a good example.  All of the cost here is bug 1369494 (around 5% is nsITimer related).

>       * https://perfht.ml/2sxlsPv
>     * after:
>       * https://perfht.ml/2swX0xI
>       * https://perfht.ml/2rzrUbu
> 
> Here you can see in the "before" case that there are large 200ms to 500ms
> pauses.  During this period there is a lot of setTimeout() and nsTimer
> runnables executing.  About ~65ms to ~165ms are spent in setTimeout.
> 
> In the after case pauses are limited to ~80ms.  Of these pauses setTimeout
> takes ~30ms.
> 
> I don't completely understand this result, but there are a couple things
> that could be helping here:
> 
> 1) Our setTimeout() doesn't always lock with TimerThread any more.  So we
> can execute more continuously on the main thread without forcing context
> switching.
> 2) We are dispatching many fewer timer events, so the possibility to
> intermix nsTimer runnables during these periods is reduced.
> 3) Processing timeouts faster allows us to drain the timeout list more
> quickly reducing the amount of timeouts we have to search in Insert().

Well the fact remains that setTimeout() is still freaking expensive!  In the second before profile, around 11-12% of the cost of setTimeout() is about creating the nsITimer, the rest is Timeouts::Insert()!  That function is *really* inefficient.

> Flood:
>   * https://people-mozilla.org/~bkelly/timer-flood/index.html
>     * before:
>       * https://perfht.ml/2sxkPFT
>       * https://perfht.ml/2sxdDtm
>     * after:
>       * https://perfht.ml/2rzqm1n
>       * https://perfht.ml/2rz3cIk
> 
> The main take away from these profiles is that we end up contending with the
> TimerThread a lot less. If you search for "Acquire" you can see how often
> mutexes show up in the profile (this is windows).  There is a lot more
> contention in the "before" case.

Indeed, that's exactly working as planned!  \o/

The other interesting difference between these two cases is that Timeouts::Insert() doesn't show up that much in this profile.  And the reason is that this test case won't cause you to end up having a long queue of timeouts.  This difference bothered me a bit, so I decided to do the math.  I broke in the JS debugger in the servo demo to see how many tiles I have.  I had a 91*50 grid.  That means we'll iterate over 4549 timeouts *in the last iteration of the loop*, and I bet the real number is much lower than that.  And this loop only runs one time in the beginning of each round of a new effect rolling over a new picture (so not frequently at all).  This really shouldn't be that slow, and we should barely see it in the profile...

> Also, there is a huge amount of contention during back pressure reduction. 
> This is one reason I chose to remove the back pressure mechanism for now.

Seems like it indeed was a good idea!

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 177

•

7 years ago

(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #176)
> >       * https://perfht.ml/2sxlsPv
> >     * after:
> >       * https://perfht.ml/2swX0xI
> >       * https://perfht.ml/2rzrUbu
> > 
> > Here you can see in the "before" case that there are large 200ms to 500ms
> > pauses.  During this period there is a lot of setTimeout() and nsTimer
> > runnables executing.  About ~65ms to ~165ms are spent in setTimeout.
> > 
> > In the after case pauses are limited to ~80ms.  Of these pauses setTimeout
> > takes ~30ms.
> > 
> > I don't completely understand this result, but there are a couple things
> > that could be helping here:
> > 
> > 1) Our setTimeout() doesn't always lock with TimerThread any more.  So we
> > can execute more continuously on the main thread without forcing context
> > switching.
> > 2) We are dispatching many fewer timer events, so the possibility to
> > intermix nsTimer runnables during these periods is reduced.
> > 3) Processing timeouts faster allows us to drain the timeout list more
> > quickly reducing the amount of timeouts we have to search in Insert().
> 
> Well the fact remains that setTimeout() is still freaking expensive!  In the
> second before profile, around 11-12% of the cost of setTimeout() is about
> creating the nsITimer, the rest is Timeouts::Insert()!  That function is
> *really* inefficient.

I think the other factor in the painting pauses on the tiles before was that it might have been triggering back pressure.  We started at 5000 timers and I think tiles does more than that.  So that could partially be why there are those pauses.

> The other interesting difference between these two cases is that
> Timeouts::Insert() doesn't show up that much in this profile.  And the
> reason is that this test case won't cause you to end up having a long queue
> of timeouts.  This difference bothered me a bit, so I decided to do the
> math.  I broke in the JS debugger in the servo demo to see how many tiles I
> have.  I had a 91*50 grid.  That means we'll iterate over 4549 timeouts *in
> the last iteration of the loop*, and I bet the real number is much lower
> than that.  And this loop only runs one time in the beginning of each round
> of a new effect rolling over a new picture (so not frequently at all).  This
> really shouldn't be that slow, and we should barely see it in the profile...

It seems like windows high resolution TimeStamp might be expensive to compare.  The CheckQPC thing seems pretty complex and we do it a lot.

Making Insert scale better in general, though, would be good.  We can discuss in bug 1369494.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

7 years ago

Depends on: 1369350

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1371020

Arjen de Jager

Updated

•

7 years ago

Depends on: 1378149

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1378394

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

7 years ago

Depends on: 1410930

Kestrel

Updated

•

5 years ago

Depends on: 1489463

Nobody; OK to take it and work on it

Updated

•

5 years ago

Component: DOM → DOM: Core & HTML

Kestrel

Updated

•

5 years ago

No longer depends on: 1489463

Regressions: 1489463

Markus Stange [:mstange]

Updated

•

3 years ago

Regressions: 1683560

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

2 years ago

Performance Impact: --- → P3

Whiteboard: [qf:p3]

Alice0775 White

Updated

•

1 year ago

Regressions: 1824472

You need to log in before you can comment on or make changes to this bug.