A loop in a task that yields promises is janky

RESOLVED INVALID

Status

()

RESOLVED INVALID
3 years ago
2 years ago

People

(Reporter: markh, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

3 years ago
In bug 1186714 we are seeing janky bookmarks backup. It turns out that this jank can be reproduced with a Task.jsm task that does, say:

> for (let i = 0; i < 100000; i++)
>  yield Promise.resolve();

which surprises me, and seemed to have surprised the bookmarks backup code too. A possible complication here is that Task.jsm uses Promise.jsm, which IIUC, has less sophisticated scheduling than DOM promises, so there's a chance that this will not be reproducible with DOM promises.

The jank can be reduced by having the task give up the event loop every now and then with code like:

> yield new Promise(resolve => {
>   Services.tm.currentThread.dispatch(resolve, Ci.nsIThread.DISPATCH_NORMAL);
> });

but the problems with that approach are that (a) the developer needs to identify a loop fits this pattern (eg, the bookmarks backup code isn't obviously an example of this) and (b) the optimal "every now and then" factor is impossible to know ahead of time as it will depend on the performance of each machine running the code.

It seem the ideal scenario would be that DOM promises magically solves it, or failing that, for Task.jsm to monitor and implement this itself.
DOM promises, by specification, cannot solve this. If I recall correctly, Promise.jsm was patched ~1 year ago to match this specification, which may have caused jank regressions such as what you describe.

So if there is a solution, it needs to be in Task.jsm.
Flags: needinfo?(paolo.mozmail)

Comment 2

3 years ago
Neither Task nor Promise are designed to yield to the operating system's event loop - they're even required not to do that in order to improve the performance of loops like the one in comment 0. The fact we used an event to implement the requirement that "then" must return before any handler invocation was a hack, used before we implemented the equivalent of Microtasks in the platform.

If you have a long running CPU-intensive operation that you want to throttle, you definitely have to use "dispatch" or even better a timer, like you would do with a callback-based approach. This can be based on actual time spent on the loop or number of iterations.

That said, I don't see why bookmark backups would be CPU intensive, as I'd expect most of the time to be spent on I/O off the main thread. Maybe the cause of unresponsive UI should be sought in redundant loops or leftover I/O on the main thread?
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Flags: needinfo?(paolo.mozmail)
Resolution: --- → INVALID
(Reporter)

Comment 3

3 years ago
INVALID is unfortunate IMO. Is there ever a case where browser code would use a task and *expect* jank (ie, would consider it acceptable)? I'd have thought any task that demonstrates jank (eg, bug 1186714) is clearly a bug and in every single case we should take steps to avoid it.

The work-around in that bug will mean that the operation is still janky for slow machines, and slower than it can be for fast machines. The lack of tooling support for an optimal fix means such a blunt work-around is likely to be used everywhere this pops up - it seems highly unlikely a reliable time-based solution will be re-implemented from scratch each time (especially given wall-time isn't relevant - something like PerformanceStats.jsm's jank monitor would be necessary)

Paolo, can you please clarify your thoughts here? Maybe you are saying we *should* have tooling, but it shouldn't be in Task.jsm? (I could accept that, although I don't understand in what cases people would explicitly choose to allow jank). Or are you saying we don't need tooling for this at all and everyone should (probably poorly) reinvent a local solution? Or something else?
Flags: needinfo?(paolo.mozmail)

Comment 4

3 years ago
Mark, I think you're conflating two distinct aspects here.

Task.jsm is not, and was never intended to be, a tool to remove jank. It is a tool to make code more expressive, so that it becomes easier to write (and read) asynchronous functions that enable the use of tools that remove jank, like OS.File, Sqlite.jsm, and so on.

(In reply to Mark Hammond [:markh] from comment #3)
> INVALID is unfortunate IMO. Is there ever a case where browser code would
> use a task and *expect* jank (ie, would consider it acceptable)?

There is no case where browser code would expect jank (ie, would consider it acceptable). Whatever flow control technique we place in the "and" clause (callbacks, Promise, Task, DOM events) becomes irrelevant. It does not mean we should instrument callbacks, Promise, Task, or DOM events to prevent people from writing janky code.

> The work-around in that bug will mean that the operation is still janky for
> slow machines, and slower than it can be for fast machines.

Hm, how can you prevent that, other than offloading work out of the main thread? Any throttling mechanism on the main thread would either make code slower or keep jank...

> The lack of
> tooling support for an optimal fix means such a blunt work-around is likely
> to be used everywhere this pops up - it seems highly unlikely a reliable
> time-based solution will be re-implemented from scratch each time
> (especially given wall-time isn't relevant - something like
> PerformanceStats.jsm's jank monitor would be necessary)
> 
> Maybe you are saying we
> *should* have tooling, but it shouldn't be in Task.jsm?

We have tooling to remove jank, though main thread throttling is not one of the canonical ones at present. If we have enough use cases for it, we could definitely implement that mechanism in a Throttle.jsm module. If a CPU bound operation is expected to be long-running, I don't think we need to use PerformanceStats.jsm or throttle based on CPU cycles - a fairly simple implementation based on wall-clock time would give similar results.

That said, I don't expect bookmark export to be a use case for a potential Throttle.jsm. That's because I expect I/O to take much more time than CPU operations, and I/O is already off-loaded to a worker thread (one of our canonical tools to prevent jank). I suggest you take an actual performance profile of the operation to see if what I just said is correct - if it's not, maybe that's the problem we need to solve.

If you don't think you have the time now, I'd suggest to land the workaround in bug 1186714 as is, and file a bug to investigate the actual cause of the jank.
Flags: needinfo?(paolo.mozmail)
(Reporter)

Comment 5

3 years ago
(In reply to :Paolo Amadini from comment #4)
> Mark, I think you're conflating two distinct aspects here.
> 
> Task.jsm is not, and was never intended to be, a tool to remove jank.

I do understand that :) But async APIs, in general, are. I'm surprised and impressed if the bookmarks backup code is obviously janky to your eyes ;)

> If you don't think you have the time now, I'd suggest to land the workaround
> in bug 1186714 as is, and file a bug to investigate the actual cause of the
> jank.

I think the actual cause is that it reduces to the snippet in comment 0, hence this bug. Everything you need to reproduce the issue is in that bug if you think the actual issue is something else.
(Reporter)

Comment 6

3 years ago
I think part of my/the problem is that I can't work out how to unroll the snippet in comment 0 to something using pure-promises while keeping the same characteristics. I believe the problem is unique to the expressiveness offered by generators/tasks (in ways I've not yet found the time to understand better)

Comment 7

3 years ago
(In reply to Mark Hammond [:markh] from comment #5)
> I think the actual cause is that it reduces to the snippet in comment 0,
> hence this bug. Everything you need to reproduce the issue is in that bug if
> you think the actual issue is something else.

Well, I'm arguing that it should resolve to something I/O bound, like:

for (let i = 0; i < 100000; i++)
  yield OS.File.stat("C:\\");

I ran that snippet in the Browser Console on Windows and it is not janky at all.

Unfortunately I don't have the time to debug the specific issue with bookmarks right now :-(

(In reply to Mark Hammond [:markh] from comment #6)
> I think part of my/the problem is that I can't work out how to unroll the
> snippet in comment 0 to something using pure-promises while keeping the same
> characteristics.

Untested:

let i = 0;
function a() {
  if (i++ > 10000) return;
  Promise.resolve().then(a);
}

This code may actually put more strain on the garbage collector but should be roughly equivalent to the Task in comment 0. It is also expected to not process any OS level events before it terminates.
You need to log in before you can comment on or make changes to this bug.