Make console output faster

RESOLVED FIXED in Firefox 34

Status

defect
RESOLVED FIXED
Opened: 5 years ago
Last modified: 10 months ago

People

(Reporter: rcampbell, Assigned: bgrins)

Tracking

(Depends on 1 bug, Blocks 1 bug, {perf})

unspecified
Firefox 35
Dependency tree / graph

Firefox Tracking Flags

(firefox32 wontfix, firefox33 wontfix, firefox34 fixed, firefox35 fixed)

Details

(URL)

Attachments

(6 attachments, 3 obsolete attachments)

(Reporter)

Description

5 years ago
We've recently done a bunch of output work on the console. We should measure its performance and improve if necessary.
(Reporter)

Updated

5 years ago
Depends on: 946336
(Reporter)

Updated

5 years ago
Depends on: 1026295
(Reporter)

Updated

5 years ago
Depends on: 951753
(Reporter)

Comment 1

5 years ago
did a quick measurement with the profiler against:

for (let i = 0; i < 10000; i++) console.log("%d hello %s", i, window.location, window);

results seem to point to pruneOutput() in webconsole.js:2123.

Raising the cap on results might actually improve performance when outputting large amounts of data.
(Reporter)

Updated

5 years ago
Summary: Measure Performance → Measure Console Performance
(In reply to Rob Campbell [:rc] (:robcee) from comment #1)
> Created attachment 8465491 [details]
> Screen Shot 2014-07-31 at 11.29.23.png
> 
> did a quick measurement with the profiler against:
> 
> for (let i = 0; i < 10000; i++) console.log("%d hello %s", i,
> window.location, window);
> 
> results seem to point to pruneOutput() in webconsole.js:2123.
> 
> Raising the cap on results might actually improve performance when
> outputting large amounts of data.

Or it might just be causing unnecessary reflows
(Assignee)

Comment 3

5 years ago
Posted patch console-timings.patch (obsolete) — Splinter Review
Using console.time I've broken the function into a few parts.  The time is split between outputting the messages, pruning the old ones, and scrolling into view.  All three seem pretty slow.

for (let i = 0; i < 10000; i++) console.log("%d hello %s", i, window.location, window);

Here is the output of running the above command with my patch applied (don't even try to run the patch with the Browser Console open, or else things get crazy because each log statement triggers another call):

console.time: 'Flush message queue outer' @ Fri Aug 01 2014 09:25:52 GMT-0500 (CDT)
console.time: 'Output the current batch of messages.' @ Fri Aug 01 2014 09:25:52 GMT-0500 (CDT)
console.timeEnd: 'Output the current batch of messages.' 41ms
console.time: 'Prune messages if needed' @ Fri Aug 01 2014 09:25:52 GMT-0500 (CDT)
console.timeEnd: 'Prune messages if needed' 23ms
console.time: 'scrollToVisible' @ Fri Aug 01 2014 09:25:52 GMT-0500 (CDT)
console.timeEnd: 'scrollToVisible' 1ms
console.time: 'finishing up' @ Fri Aug 01 2014 09:25:52 GMT-0500 (CDT)
console.timeEnd: 'finishing up' 0ms
console.timeEnd: 'Flush message queue outer' 66ms
console.time: 'Flush message queue outer' @ Fri Aug 01 2014 09:25:55 GMT-0500 (CDT)
console.time: 'Output the current batch of messages.' @ Fri Aug 01 2014 09:25:55 GMT-0500 (CDT)
console.timeEnd: 'Output the current batch of messages.' 195ms
console.time: 'Prune messages if needed' @ Fri Aug 01 2014 09:25:55 GMT-0500 (CDT)
console.timeEnd: 'Prune messages if needed' 103ms
console.time: 'scrollToVisible' @ Fri Aug 01 2014 09:25:55 GMT-0500 (CDT)
console.timeEnd: 'scrollToVisible' 72ms
console.time: 'finishing up' @ Fri Aug 01 2014 09:25:55 GMT-0500 (CDT)
console.timeEnd: 'finishing up' 0ms
console.timeEnd: 'Flush message queue outer' 370ms
console.time: 'Flush message queue outer' @ Fri Aug 01 2014 09:25:57 GMT-0500 (CDT)
console.time: 'Output the current batch of messages.' @ Fri Aug 01 2014 09:25:57 GMT-0500 (CDT)
console.timeEnd: 'Output the current batch of messages.' 184ms
console.time: 'Prune messages if needed' @ Fri Aug 01 2014 09:25:57 GMT-0500 (CDT)
console.timeEnd: 'Prune messages if needed' 185ms
console.time: 'scrollToVisible' @ Fri Aug 01 2014 09:25:58 GMT-0500 (CDT)
console.timeEnd: 'scrollToVisible' 77ms
console.time: 'finishing up' @ Fri Aug 01 2014 09:25:58 GMT-0500 (CDT)
console.timeEnd: 'finishing up' 0ms
console.timeEnd: 'Flush message queue outer' 449ms
console.time: 'Flush message queue outer' @ Fri Aug 01 2014 09:25:59 GMT-0500 (CDT)
console.time: 'Output the current batch of messages.' @ Fri Aug 01 2014 09:25:59 GMT-0500 (CDT)
console.timeEnd: 'Output the current batch of messages.' 184ms
console.time: 'Prune messages if needed' @ Fri Aug 01 2014 09:25:59 GMT-0500 (CDT)
console.timeEnd: 'Prune messages if needed' 185ms
console.time: 'scrollToVisible' @ Fri Aug 01 2014 09:25:59 GMT-0500 (CDT)
console.timeEnd: 'scrollToVisible' 85ms
console.time: 'finishing up' @ Fri Aug 01 2014 09:26:00 GMT-0500 (CDT)
console.timeEnd: 'finishing up' 0ms
console.timeEnd: 'Flush message queue outer' 456ms
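The per-phase numbers above come from bracketing each phase of the flush in console.time / console.timeEnd pairs. A minimal sketch of that instrumentation pattern (the phase labels match the output above; the function and its arguments are hypothetical, not the actual patch):

```javascript
// Hypothetical sketch of the console-timings.patch instrumentation: each
// phase of the message-queue flush is bracketed by console.time/timeEnd so
// the per-phase cost shows up directly in the console output.
function instrumentedFlush(phases) {
  console.time("Flush message queue outer");
  for (const [label, work] of phases) {
    console.time(label);
    work();
    console.timeEnd(label); // prints "<label>: Nms"
  }
  console.timeEnd("Flush message queue outer");
}

// Usage: pass the real phase functions in flush order.
instrumentedFlush([
  ["Output the current batch of messages.", () => { /* append batch */ }],
  ["Prune messages if needed", () => { /* prune */ }],
  ["scrollToVisible", () => { /* scroll */ }],
  ["finishing up", () => { /* bookkeeping */ }],
]);
```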
(Assignee)

Comment 4

5 years ago
(In reply to Rob Campbell [:rc] (:robcee) from comment #1)
> results seem to point to pruneOutput() in webconsole.js:2123.
> 
> Raising the cap on results might actually improve performance when
> outputting large amounts of data.

I think it is a trade-off: if we raise the limit, then each batch of messages will be larger (currently 200), which will shift more time into the 'Output the current batch of messages.' step (and adding a single message seems to take at least 1ms in my testing).

I do think the limit should be higher anyway, since sometimes messages can get lost if there are a bunch of logs.
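To put rough numbers on that trade-off, using the ~1ms-per-message lower bound observed above (the constant names here are hypothetical, not the real ones from webconsole.js):

```javascript
// Back-of-envelope cost of one flush as the batch size grows. ~1ms per
// appended message was the lower bound observed in comment 4.
const MS_PER_MESSAGE = 1;

function flushCostMs(batchSize) {
  return batchSize * MS_PER_MESSAGE;
}

console.log(flushCostMs(200));  // current batch size: roughly 200ms per flush
console.log(flushCostMs(1000)); // a raised limit makes each flush far jankier
```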
(Assignee)

Comment 5

5 years ago
It'd be interesting to see the results from the new profiler in which these lines were commented out:

    if (this._outputQueue.length > toDisplay && this._pruneOutputQueue()) {
      toDisplay = Math.min(this._outputQueue.length, toDisplay);
      shouldPrune = true;
    }

This would force all of the output to be rendered and prevent any from being pruned before it is added to the console so it would take away some timing variability from the results (plus it will cause some serious jank).

Rob, do you just run the profiler from the browser toolbox to take the measurement?
Flags: needinfo?(rcampbell)
(Reporter)

Comment 6

5 years ago
(In reply to Brian Grinstead [:bgrins] from comment #5)
> It'd be interesting to see the results from the new profiler in which these
> lines were commented out:
> 
>     if (this._outputQueue.length > toDisplay && this._pruneOutputQueue()) {
>       toDisplay = Math.min(this._outputQueue.length, toDisplay);
>       shouldPrune = true;
>     }
> 
> This would force all of the output to be rendered and prevent any from being
> pruned before it is added to the console so it would take away some timing
> variability from the results (plus it will cause some serious jank).
> 
> Rob, do you just run the profiler from the browser toolbox to take the
> measurement?

that's right. No magic here.
Flags: needinfo?(rcampbell)
(In reply to Brian Grinstead [:bgrins] from comment #5)
> It'd be interesting to see the results from the new profiler in which these
> lines were commented out:
> 
>     if (this._outputQueue.length > toDisplay && this._pruneOutputQueue()) {
>       toDisplay = Math.min(this._outputQueue.length, toDisplay);
>       shouldPrune = true;
>     }
> 
> This would force all of the output to be rendered and prevent any from being
> pruned before it is added to the console so it would take away some timing
> variability from the results (plus it will cause some serious jank).

This would really slow things down a lot. What we can do is use more doc fragments - we don't currently use them consistently; we should use them everywhere. We can also improve how we prune messages from the output (not sure about the output queue itself).
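The doc-fragment suggestion amounts to batching the DOM appends so the live output node (and layout) is touched once per flush instead of once per message. A minimal sketch, with hypothetical function and element names rather than the actual webconsole.js code:

```javascript
// Append a batch of messages via a DocumentFragment so the live DOM is
// mutated once, not once per message.
function appendBatch(outputNode, messages) {
  const doc = outputNode.ownerDocument;
  const frag = doc.createDocumentFragment();
  for (const msg of messages) {
    const node = doc.createElement("div");
    node.className = "message";
    node.textContent = msg;
    frag.appendChild(node); // off-DOM: no reflow per message
  }
  outputNode.appendChild(frag); // single insertion into the live tree
}
```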

We should also check the perf impact of marshalling objects and various console messages through the remote debugging protocol. I suspect this code has gotten slower over time as well.
(Assignee)

Comment 8

5 years ago
(In reply to Mihai Sucan [:msucan] from comment #7)
> (In reply to Brian Grinstead [:bgrins] from comment #5)
> > It'd be interesting to see the results from the new profiler in which these
> > lines were commented out:
> > 
> >     if (this._outputQueue.length > toDisplay && this._pruneOutputQueue()) {
> >       toDisplay = Math.min(this._outputQueue.length, toDisplay);
> >       shouldPrune = true;
> >     }
> > 
> > This would force all of the output to be rendered and prevent any from being
> > pruned before it is added to the console so it would take away some timing
> > variability from the results (plus it will cause some serious jank).
> 
> This would really slow things down a lot. What we can do is use more doc
> fragments - use of doc fragments currently is not constant, we should use
> them everywhere. We can also improve how we prune messages from output (not
> sure about the output queue itself).

Yes, I only meant that not pruning it while profiling may give a useful picture of what is slowest while generating the output.  Once the new profiler UI lands I'll give it a try and measure it with / without doc fragments.
Any word on this?
(Assignee)

Comment 10

5 years ago
I must be doing something wrong with the profiler session, because here are the results I see when running the command with the Browser Toolbox profiler running (nothing like comment 1, just a single entry for CS_RecordEvent in ConsoleAPIStorage.js).  Rob, do you have any special settings to get this to work?  Do I maybe need to build in debug mode?
Flags: needinfo?(rcampbell)
(In reply to Brian Grinstead [:bgrins] from comment #10)
> Created attachment 8471844 [details]
> Screenshot 2014-08-12 15.49.32.png
> 
> I must be doing something wrong with the profiler session, because here are
> the results I see when running the command with Browser Toolbox profiler
> running (nothing like Comment 1, just a single entry for CS_RecordEvent in
> ConsoleAPIStorage.js).  Rob, do you have any special settings to get this to
> work?  Do I maybe need to build in debug mode?

I see the same issue if I try to profile on any chrome page, such as about:app-manager or about:preferences.

However, Paul said it was working for him...  I also tried a new profile, but still don't get results on chrome pages.
Show gecko platform data = true.

Updated

5 years ago
Whiteboard: [status:planned]
(Assignee)

Comment 13

5 years ago
(In reply to Victor Porof [:vporof][:vp] from comment #12)
> Show gecko platform data = true.

That worked, thanks
Flags: needinfo?(rcampbell)
(Assignee)

Comment 14

5 years ago
Posted image scroll-perf.png
Something interesting I noticed when profiling this is that the -moz-box-flex: 1 on #output-container [0] causes amazingly slow scroll perf, especially when I remove the output pruning (which obviously we won't do, but it still highlights the issue).  It's not a huge deal once we re-add the pruning, but it gets worse as the number of allowed logs increases (which lowers the pressure on pruning).  Unfortunately, if I get rid of that rule, the messages won't span all the way across the console.

[0]: http://dxr.mozilla.org/mozilla-central/source/browser/themes/shared/devtools/webconsole.inc.css#143
Can someone own tracking this down?
(Assignee)

Comment 16

5 years ago
I will work on it
Status: NEW → ASSIGNED

Updated

5 years ago
Whiteboard: [status:planned] → [status:inflight]
(Assignee)

Updated

5 years ago
Summary: Measure Console Performance → Make console output faster
(Assignee)

Comment 17

5 years ago
Ideally we would be finishing this method in around 17ms to hit 60fps.  There are three main areas that are preventing that from even coming close:

1) _pruneOutputQueue, which removes statements that will not be printed from the queue - cleaning up the actors takes ~0.5ms per item, but it can have to run on thousands at a time - this was the slowest thing I noticed at first.  Note that simply increasing the limit doesn't really help, since that makes (2) much slower.
2) _outputMessageFromQueue - this function builds the actual message node and appends it to the DOM.  Each message seems to take 0.5-2ms (I'm sure it varies depending on the message), and it runs for 200 messages at a time (100-400ms).
3) Scroll calls - measuring scrollTop / scrollHeight inside this function is amazingly slow.  It's quite variable, but I've seen 200-400ms just to check scrollNode.scrollHeight after appending the messages.  There are also calls to Utils.scrollToVisible, and all of these are very slow.  This actually seems quite a bit better if I remove the -moz-box-flex: 1 on #output-container, but that causes the messages to not span the whole width.

In this patch I address (1) by just marking unprinted items for deletion inside of the normal output method, then doing cleanup only when there isn't anything else to do.  There is also a change to prevent one of the calls to scrollHeight when it won't be needed, and I suspect that we could remove other calls in certain cases (like if there are 200 messages to display, don't try to restore the previous scrollHeight since it won't make sense to do that - just go to the bottom of the output area).

Anyway, with this patch applied, when running `for (let i = 0; i < 10000; i++) console.log("%d hello %s", i, window.location, window);` I'm consistently getting 300-400ms per function call, as opposed to 500-800ms (it takes 4-5 calls to clear out the queue, and I'm sure the numbers vary greatly depending on the system).  It still hangs up a bit while trying to scroll during this time, but I think that can be addressed as described in (3).
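The deferred-cleanup approach to (1) can be sketched as follows: instead of destroying pruned items synchronously inside the hot flush path, mark them for deletion and release them only when the queue is idle. All names below are hypothetical; this is the shape of the idea, not the actual webconsole.js code:

```javascript
// Defer the expensive per-item actor cleanup (~0.5ms each) out of the hot
// flush path: prune() only marks items; the cost is paid later, when there
// is nothing left to output.
class OutputQueue {
  constructor() {
    this.queue = [];
    this.pendingCleanup = [];
  }

  prune(keep) {
    // Cheap marking pass; no actors are released here.
    while (this.queue.length > keep) {
      this.pendingCleanup.push(this.queue.shift());
    }
  }

  flush(batchSize, output, release = () => {}) {
    for (const item of this.queue.splice(0, batchSize)) {
      output(item);
    }
    if (this.queue.length === 0) {
      // Idle: now pay for the deferred cleanup in one go.
      this.pendingCleanup.forEach(release);
      this.pendingCleanup.length = 0;
    }
  }
}
```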
Assignee: nobody → bgrinstead
Attachment #8466215 - Attachment is obsolete: true
(Assignee)

Comment 18

5 years ago
I should clarify: we don't actually need to hit 17ms, but rather whatever OUTPUT_INTERVAL is set to (currently 50ms; I've lowered it to 30ms within the patch), since staying under that keeps the output coming at the same speed it normally would.
(Assignee)

Comment 19

5 years ago
Posted patch console-speedup-1.patch (obsolete) — Splinter Review
Panos, feel free to reassign the review if you think someone else should do it.  This patch speeds up console output, especially when messages are received in really large bursts, by moving the item destruction outside of the main loop in _flushOutputQueue and by doing less work with scrollHeight / scrollTop when possible.

I also did a few refactors around _initOutputTimer and _outputTimerInitialized that made the code easier for me to understand: basically, I moved all of the logic about whether a flush is scheduled into _initOutputTimer, so that it is safe to call at any time, and it is in charge of deciding whether to schedule a callback.
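The timer refactor described here can be sketched as an idempotent scheduler: callers just request a flush, and the function itself decides whether one is already pending. The 30ms interval comes from comment 18; the structure and names below are a hypothetical sketch, not the patch itself:

```javascript
// Idempotent flush scheduling: safe to call from anywhere, any number of
// times; at most one timer is ever pending.
const OUTPUT_INTERVAL = 30; // ms

function makeScheduler(flush, setTimer = setTimeout) {
  let pending = false;
  return function initOutputTimer() {
    if (pending) {
      return; // a flush is already scheduled; nothing to do
    }
    pending = true;
    setTimer(() => {
      pending = false;
      flush();
    }, OUTPUT_INTERVAL);
  };
}
```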
Attachment #8481533 - Attachment is obsolete: true
Attachment #8481594 - Flags: review?(past)
Comment on attachment 8481594 [details] [diff] [review]
console-speedup-1.patch

Review of attachment 8481594 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/webconsole/webconsole.js
@@ +2392,5 @@
>      let logLimit = Utils.logLimitForCategory(aCategory);
>      let messageNodes = this.outputNode.querySelectorAll(".message[category=" +
>                         CATEGORY_CLASS_FRAGMENTS[aCategory] + "]");
>      let n = Math.max(0, messageNodes.length - logLimit);
> +    [...messageNodes].slice(0, n).forEach(this.removeOutputMessage, this);

Since we are trying to squeeze out every bit of perf gain we can, it would be much faster to use a normal for loop here.

Here is a jsperf comparison: http://jsperf.com/iteratorvsfo

Other places where forEach is used could also see a gain, but here we have spreading + slicing + forEach, so the gain would be greater.
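For reference, the plain-for version the reviewer has in mind would look something like this (a sketch; whether it measurably beats the spread + slice + forEach chain for a ~200-element list is exactly the question debated in the following comments):

```javascript
// Equivalent of [...messageNodes].slice(0, n).forEach(this.removeOutputMessage, this)
// without the intermediate array allocations: iterate the NodeList directly.
function removeOldest(messageNodes, n, removeOutputMessage, thisArg) {
  for (let i = 0; i < n; i++) {
    removeOutputMessage.call(thisArg, messageNodes[i]);
  }
}
```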
(Assignee)

Comment 21

5 years ago
Comment on attachment 8481594 [details] [diff] [review]
console-speedup-1.patch

Review of attachment 8481594 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/devtools/webconsole/webconsole.js
@@ +2392,5 @@
>      let logLimit = Utils.logLimitForCategory(aCategory);
>      let messageNodes = this.outputNode.querySelectorAll(".message[category=" +
>                         CATEGORY_CLASS_FRAGMENTS[aCategory] + "]");
>      let n = Math.max(0, messageNodes.length - logLimit);
> +    [...messageNodes].slice(0, n).forEach(this.removeOutputMessage, this);

I'm not necessarily opposed to doing that, but it isn't going to be "much" faster, especially since the maximum size of this array is 200.  forEach is pretty well optimized (though I've definitely seen some real perf issues with `for of` - https://bugzilla.mozilla.org/show_bug.cgi?id=998344#c10).

The bigger bottleneck is definitely in _outputMessageFromQueue, which is the slowest part of _flushMessageQueue once this patch is applied - I'm interested in any ideas on how to speed that up.
Comment on attachment 8481594 [details] [diff] [review]
console-speedup-1.patch

Review of attachment 8481594 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work!

Regarding the discussion about forEach vs. traditional for loops, shu had described the reasons for the former being slower than the latter a while ago:

http://rfrn.org/~shu/2013/03/20/two-reasons-functional-style-is-slow-in-spidermonkey.html

I'm not sure if that is still accurate, but in any case we should be using profile data to guide our optimizations, not hunches, and in this case it doesn't look like that method contributes significantly to the performance problems.

::: browser/devtools/webconsole/webconsole.js
@@ +148,3 @@
>  
>  // The number of messages to display in a single display update. If we display
>  // too many messages at once we slow the Firefox UI too much.

While you are here: "we slow down the Firefox UI..."

@@ +2078,5 @@
>  
>      // Determine how many messages we can display now.
>      let toDisplay = Math.min(this._outputQueue.length, MESSAGES_IN_INTERVAL);
> +
> +    // If there aren't any messages to display right now, then take care of some

This comment isn't entirely accurate, we also do cleanup work if there are messages to display, but we are throttling updates.
Attachment #8481594 - Flags: review?(past) → review+
Posted file profile.json.zip
Here is a profile I captured from a build with this patch on http://htmlpad.org/console-perf. I didn't touch the browser after clicking the button to start the loop until I stopped the recording, and in that case it looks like the frontend code only accounts for around 12% of the CPU time. Backend code like the object previewers accounts for more than 50%, though.

If that is the main case we are trying to optimize, then perhaps we are in a good spot now. If not, let's define the benchmark we are measuring against.
(Assignee)

Comment 24

5 years ago
(In reply to Panos Astithas [:past] from comment #23)
> Created attachment 8482266 [details]
> profile.json.zip
> 
> Here is a profile I captured from a build with this patch on
> http://htmlpad.org/console-perf. I didn't touch the browser after clicking
> on the button to start the loop until I stopped the recording, and in that
> case it looks like the frontend code only accounts for around 12% of the CPU
> time. Backend code like object previewers account for more than 50% though.
>
> If that is the main case we are trying to optimize, then perhaps we are in a
> good spot now. If not, let's define the benchmark we are measuring against.

Agreed, and there may be some low hanging fruit with the backend perf compared to _flushMessageQueue now.  Do you see any obvious targets for optimization in the TypedArray function that your profile is pointing to (http://dxr.mozilla.org/mozilla-central/source/toolkit/devtools/server/actors/script.js#3775)?
Flags: needinfo?(past)
(Assignee)

Comment 25

5 years ago
One thing that's odd on my computer is that I can run two profiles and get very different results.  The first one I ran was still almost entirely frontend, and the second was closer to yours, with about 40% backend and 20% frontend.

At this point, it seems like the bulk of the frontend slowness remaining (at least with the current benchmark) is:

Messages.ConsoleGeneric.prototype<.render (console-output.js:1296), most of whose time is consumed by Widgets.URLString.prototype<.isURL (console-output.js:2206).
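One common way to cheapen a hot isURL-style check (offered as a hypothetical mitigation, not what the actual console-output.js code does) is to reject obvious non-URLs with a cheap test before attempting a full parse:

```javascript
// Cheap scheme pre-check before the expensive full parse. Most console
// strings fail the regex immediately and never reach the try/catch.
function isURL(str) {
  if (!/^[a-z][a-z0-9+.-]*:/i.test(str)) {
    return false; // no scheme prefix: cannot be an absolute URL
  }
  try {
    new URL(str);
    return true;
  } catch (e) {
    return false;
  }
}
```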
(Assignee)

Comment 26

5 years ago
Updated comments
Attachment #8481594 - Attachment is obsolete: true
Attachment #8482714 - Flags: review+
(Assignee)

Updated

5 years ago
Keywords: checkin-needed
OS: Mac OS X → All
Hardware: x86 → All
I don't see anything glaringly obvious, but I was wondering if we should tweak the order of the DebuggerServer.ObjectActorPreviewers.Object array or use a different mechanism altogether. Sadly the profile is somewhat confusing here (see bug 1061772), so I would want to examine a few more profiles to feel confident that we are looking at the right thing.
Flags: needinfo?(past)
(In reply to Panos Astithas [:past] from comment #28)
> I don't see anything glaringly obvious, but I was wondering if we should
> tweak the order in the DebuggerServer.ObjectActorPreviewers.Object array or
> use a different mechanism altogether.

Would be interesting to have telemetry probes to count how often we use each object previewer, and use that to guide the ordering. Of course, this needn't block anything else being discussed here :)

This is under the assumption that /choosing/ a previewer takes a significant amount of time; I'd love to hear whether that is actually the case, or whether time is mostly spent /creating/ the preview, or if it's evenly split, etc.
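The probe idea could be sketched like this: wrap the previewer list so each hit is counted, then use the counts to reorder. The first-claiming-previewer-wins protocol matches the description of DebuggerServer.ObjectActorPreviewers above, but the wrapper itself is hypothetical:

```javascript
// Count which previewer handles each object, to guide reordering the list
// so the most frequently matching previewers are tried first.
function makeCountingDispatch(previewers) {
  const hits = new Map(previewers.map((p) => [p.name, 0]));

  function preview(obj) {
    for (const p of previewers) {
      if (p(obj)) { // first previewer that claims the object wins
        hits.set(p.name, hits.get(p.name) + 1);
        return true;
      }
    }
    return false;
  }

  return { preview, hits };
}
```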
https://hg.mozilla.org/integration/fx-team/rev/505529c35c7f
Keywords: checkin-needed
Whiteboard: [status:inflight] → [status:inflight][fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/505529c35c7f
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Whiteboard: [status:inflight][fixed-in-fx-team] → [status:inflight]
Target Milestone: --- → Firefox 35
Interested to know how Joe feels about uplifting to Aurora. If we can help users I'd prefer to do it as quickly as possible.
Flags: needinfo?(jwalker)
(Assignee)

Comment 34

5 years ago
(In reply to Jeff Griffiths (:canuckistani) from comment #33)
> Interested to know how Joe feels about uplifting to Aurora. If we can help
> users I'd prefer to do it as quickly as possible.

Yes, I'll request uplift
Flags: needinfo?(jwalker)
(Assignee)

Comment 35

5 years ago
Comment on attachment 8482714 [details] [diff] [review]
console-speedup-2.patch

Approval Request Comment
[Feature/regressing bug #]: console.log
[User impact if declined]: Console output is very slow when logging thousands of messages at once, causing the browser UI to become unresponsive
[Describe test coverage new/current, TBPL]: Web console output is already well-tested, and this is a performance optimization patch
[Risks and why]: Risk is confined to sessions browsing with devtools open to the webconsole panel.  Any risk would most likely be performance-oriented, involving console API statements using parameters / options that we haven't profiled specifically.  That seems unlikely, as the patch improves things pretty dramatically for typical log statements that print a couple of strings and an object.
[String/UUID change made/needed]:
Attachment #8482714 - Flags: approval-mozilla-aurora?
(Assignee)

Updated

5 years ago
Whiteboard: [status:inflight] → [status:landedon]
Comment on attachment 8482714 [details] [diff] [review]
console-speedup-2.patch

Aurora+
Attachment #8482714 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Updated

4 years ago
See Also: → 1122483
(Assignee)

Updated

4 years ago
Blocks: 1156747

Updated

10 months ago
Product: Firefox → DevTools