Open Bug 744470 Opened 12 years ago Updated 2 years ago

Track phantom setTimeouts like slow sql

Categories

(Core :: DOM: Events, defect, P5)

x86_64
Windows 7
defect

Tracking

()

People

(Reporter: taras.mozilla, Unassigned)

References

Details

(Whiteboard: [Snappy:P2][telemetry-needed])

Attachments

(2 files)

setTimeouts have a function parameter. We should be able to either
a) extract the function file_location:line
or
b) get a stack backtrace at setTimeout set.

We should then report frequency + time taken by setTimeouts like we do with slow sql telemetry. Chrome ones we can identify, webpage ones we can be less specific about.'

At this point it doesn't matter fi we report this via telemetry or surface this for local debugging only. "Surprise/WTF" setTimeouts are a cause of jank that should be pursued vigilantly.
Grabbing the stack on each setTimeout call would be somewhat expensive, I suspect.  So that can be done in a special debug mode, but not via telemetry.

Extracting the file:line from the function is possible in some cases, but not in the common:

  setTimeout(function f() {}, N);

case.  :(
Oh, and I assume that this is different from bug 744315?
(In reply to Boris Zbarsky (:bz) from comment #1)
> Extracting the file:line from the function is possible in some cases, but
> not in the common:
> 
>   setTimeout(function f() {}, N);
> 
> case.  :(

Dumb question: Is that a spec limitation or a limitation of the current engine?
There can't be a spec limitation about file:line, because those aren't available via things the spec talks about.  So it's a pure implementation limitation.

Oh, the other one that would be hard to deal with is:

  setTimeout("foo()", N);
http://dxr.lanedo.com/mozilla-central/dom/base/nsGlobalWindow.cpp.html#l9348 shows that either you have an expression, in which case the handler has the filename and line number, or you have a scriptObject, which is JSObject *.

From the scriptObject, you can get the JSFunction * with fun = JS_GetObjectFunction(scriptObject).

From the function, you can get the JSScript * with JS_GetFunctionScript(cx, fun).

That requires a context, which you can get from handler->mContext->GetNativeContext()

From the JSScript*, you can use JS_GetScriptFilename and JS_GetScriptBaseLineNumber.

All these require jsdbgapi.h, which is already included in nsGlobalWindow.cpp.
> in which case the handler has the filename and line number

It does?  It doesn't right now, though I suppose we could add them if they're not too expensive to get.

> From the JSScript*, you can use JS_GetScriptFilename and JS_GetScriptBaseLineNumber.

Yes, but the JSScript line number for the case in comment 1 is either 0 or 1, no matter where the actual code is, iirc.
(In reply to Boris Zbarsky (:bz) from comment #4)
> There can't be a spec limitation about file:line, because those aren't
> available via things the spec talks about.  So it's a pure implementation
> limitation.
> 
> Oh, the other one that would be hard to deal with is:
> 
>   setTimeout("foo()", N);

My understanding is that our first concern is with setTimeout calls issued by FF itself, e.g. during tab switching or any other internal operation. In this context, using a string is not only weird, it is probably a coding error.

(In reply to Boris Zbarsky (:bz) from comment #6)
> > From the JSScript*, you can use JS_GetScriptFilename and JS_GetScriptBaseLineNumber.
> 
> Yes, but the JSScript line number for the case in comment 1 is either 0 or
> 1, no matter where the actual code is, iirc.

I believe that we can get the function name, if any is available. If so, in Mozilla code, even the file name + a line number of 0 or 1 is a useful information: it tells us where we can refine our code (by naming unnamed functions) to easily obtain more precise information.
> My understanding is that our first concern is with setTimeout calls issued by FF itself,

Yeah, you can probably assume those are not using strings.  ;)

> I believe that we can get the function name, if any is available

Yes, indeed.  Sounds good.
Whiteboard: [Snappy][telemetry-needed] → [Snappy:P2][telemetry-needed]
(In reply to Boris Zbarsky (:bz) from comment #6)
> > in which case the handler has the filename and line number
> 
> It does?  It doesn't right now, though I suppose we could add them if
> they're not too expensive to get.

Ok, I have no idea if they are actually filled in. I was just looking at the line in the code:

      handler->GetLocation(&filename, &lineNo);

lineNo will always be 0|1, I suppose, but at least the filename will be filled in.

> > From the JSScript*, you can use JS_GetScriptFilename and JS_GetScriptBaseLineNumber.
> 
> Yes, but the JSScript line number for the case in comment 1 is either 0 or
> 1, no matter where the actual code is, iirc.

Really? Why? I trust your recollection far more than mine, but I don't see why that script wouldn't know the actual line it came from.
> but I don't see why that script wouldn't know the actual line it came from.

The "script" for the function object created by the function expression is not the same as the "script" that called the function expression, iirc.  The information _could_ possibly be passed through, but it wasn't last I checked.
Attached patch wipSplinter Review
Steve, thanks for the tips. Worked like magic. Seems that we have position info for most setTimeouts.

Here is setTimeout jank from my current session.
13811ms https://etherpad.mozilla.org/DXRplan:540
9001ms https://mail.mozilla.com/zimbra/js/Startup1_1_all.js.zgz?v=120105102559:3269
4869ms chrome://chatzilla/content/static.js:1523
3738ms http://ajax.googleapis.com/ajax/libs/dojo/1.5/dojo/dojo.xd.js:14
3736ms https://si0.twimg.com/c/phoenix/en/bundle/t1-hogan-core.f760c184d1eaaf1cf27535473a7306ef.js:40
2596ms https://si0.twimg.com/c/phoenix/en/bundle/t1-hogan-core.f760c184d1eaaf1cf27535473a7306ef.js:20
2328ms https://mail.google.com/mail/u/1/?ui=2&view=js&name=main,tlist&ver=d0NxGttN08Y.en.&am=!3jbNbBCV_yxhRf2R88zeAZosLfR2vmrAHpqLF_XefdA7BT4OY32GrZULPVP_E18bOjPTpw:571
2294ms Unknown
2183ms chrome://chatzilla/content/static.js:1521
2169ms https://mail.google.com/mail/u/0/?ui=2&view=js&name=main,tlist&ver=d0NxGttN08Y.en.&am=!3jbNbBCV_yxhRf2RM8veAZmRbUDRlwvS_ngb353b4P_jvef78jLJIINT4gimZ_aEH2WzFQ&random=1334473886634:569
2013ms resource:///components/nsSessionStore.js:3086
1593ms http://cdn.taboolasyndication.com/libtrc/businessweek/libtrc.5-23-3-4-54915.js:1
662ms http://www.businessweek.com/assets/application-96f996779ae5a83fb214283d0e465244.js:26
644ms https://si0.twimg.com/c/phoenix/en/bundle/t1-hogan-more.350488d9f88b89f2f1c8c15502bdd73a.js:23
472ms http://www.google.com/reader/ui/4208243601-en-scroll.js?hl=en:199
453ms chrome://global/content/bindings/tabbox.xml:775
451ms https://etherpad.mozilla.org/DXRplan:843
438ms chrome://browser/content/browser.js:5117
341ms https://si0.twimg.com/c/phoenix/en/bundle/t1-hogan-more.350488d9f88b89f2f1c8c15502bdd73a.js:74
328ms http://static.chartbeat.com/js/chartbeat.js:1
266ms https://plus.google.com/_/apps-static/_/js/nw/nw_i/rt=h/ver=4Bwyl-cwnBg.en./sv=1/am=!Yl-tOOBqnkSJBf3CEbk-Qz_Vduxr257N_nsTkw/d=1/rs=AItRSTMfSeZ45rG2H3-BYaY_H2RYf6sd4g:210
203ms chrome://chatzilla/content/static.js:300
192ms chrome://chatzilla/content/static.js:2856
191ms chrome://global/content/bindings/autocomplete.xml:1019
160ms chrome://smoothwheel/content/smoothwheel.js:1352
123ms chrome://global/content/bindings/autocomplete.xml:1143
114ms https://si0.twimg.com/c/phoenix/en/bundle/t1-hogan-core.f760c184d1eaaf1cf27535473a7306ef.js:30
111ms https://etherpad.mozilla.org/DXRplan:500
94ms https://etherpad.mozilla.org/DXRplan:1
74ms chrome://chatzilla/content/static.js:857
73ms https://mail.mozilla.com/zimbra/js/Startup1_1_all.js.zgz?v=120105102559:6901
72ms chrome://chatzilla/content/output-window.js:64
67ms chrome://chatzilla/content/handlers.js:833
62ms http://bwbeta.disqus.com/embed.js:3
51ms https://etherpad.mozilla.org/static/compressed/03a78d64125e260de582793f8d305546.js:1057
Mmmm, nice! One thing I'm unsure about; are these _cumulative_ times over multiple calls, or the slowest call? Do these include GC/CC that happens to run?


I took a look at the chrome:// callers since they're not immediately obvious:

  453ms chrome://global/content/bindings/tabbox.xml:775

Looks like that's the <handler event="mousedown"> that you started off grumbling about... :)

  438ms chrome://browser/content/browser.js:5117

Looks to be delayedStartup()? Ouch.

  191ms chrome://global/content/bindings/autocomplete.xml:1019

I think that's |setTimeout(function(self) self.adjustHeight(), 0, this);|... This is probably all expensive layout adjustment.

  123ms chrome://global/content/bindings/autocomplete.xml:1143

// yield after each batch of items so that typing the url bar is responsive
setTimeout(function (self) { self._appendCurrentResult(); }, 0, this);


(And: lolchatzilla)
Yeah this is cumulative. It's not super-helpful without counts, etc. Just shows that this is a problem that needs to be surfaced better + dealt with.
(In reply to Boris Zbarsky (:bz) from comment #11)
> > but I don't see why that script wouldn't know the actual line it came from.
> 
> The "script" for the function object created by the function expression is
> not the same as the "script" that called the function expression, iirc.  The
> information _could_ possibly be passed through, but it wasn't last I checked.

(This is correct, but we do put a correct line number on the function script, not just 0 or 1.)
How expensive are these checks?  Can we roll them out in, let's say, profiling nightlies, to collect some hopefully interesting Telemetry reports?
(In reply to Ehsan Akhgari [:ehsan] from comment #17)
> How expensive are these checks?  Can we roll them out in, let's say,
> profiling nightlies, to collect some hopefully interesting Telemetry reports?

it seems to be a good local debugging aid..since most of the bad ones come from websites, it wont be much good to report those as unknown via telemetry. I think developer time is better spent on mitigating these via bug 715376
https://bugzilla.mozilla.org/show_bug.cgi?id=1472046

Move all DOM bugs that haven’t been updated in more than 3 years and has no one currently assigned to P5.

If you have questions, please contact :mdaly.
Priority: -- → P5
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: