Closed
Bug 461031
Opened 16 years ago
Closed 16 years ago
Investigate mobile responsiveness problems caused by not returning to the event loop
Categories
(Core :: DOM: Core & HTML, defect, P1)
Tracking
()
RESOLVED
FIXED
People
(Reporter: mrbkap, Unassigned)
References
Details
(4 keywords)
Attachments
(5 files, 2 obsolete files)
27.43 KB,
text/plain
|
Details | |
26.01 KB,
patch
|
jst
:
review+
jst
:
superreview+
|
Details | Diff | Splinter Review |
7.23 KB,
patch
|
sicking
:
review+
sicking
:
superreview+
sicking
:
approval1.9.1+
|
Details | Diff | Splinter Review |
14.28 KB,
patch
|
Details | Diff | Splinter Review | |
2.63 KB,
patch
|
mozilla+ben
:
review+
|
Details | Diff | Splinter Review |
Vlad wrote an instrumentation patch to see why Fennec running on VMs has serious responsiveness issues while loading "heavy" pages such as CNN.com. Part of his patch works by basically doing setTimout(0) and watching to see how long it takes before the setTimeouts fire. On a VM, he saw a lot of setTimeouts firing at regular intervals, followed by a massive slowdown (on the order of one second).
In a test run, he saw the following event being handled:
{BM(44)BM(10)ES(94)BM(103)BM(7)BM(54)BM(18)BM(37)BM(33)BM(11)BM(17)BM(10)BM(21)BM(62)BM(20)BM(16)BM(20)BM(7)BM(40)BM(17)BM(92)RP4(676)-E(0xb1a53de0,676)}
To translate: we're calling BuildModel a ton of times during the same event without ever returning to the event loop. The sum of all of the BuildModel calls is quite a lot and could explain the loss of responsiveness.
I realized that what this probably means is that we're executing a script with many calls to document.write(), each of which calls into nsParser::BuildModel. In this case, the parser won't ever interrupt (and even when it's done parsing, we won't get all the way back out to the event loop). We could verify this by adding a printf around nsParser::ResumeParse to make sure that we're not sticking around in ResumeParse forever but actually entering and exiting the parser code.
Updated•16 years ago
|
The instrumentation indicates that that is not what is happening. We were instrumenting how much time was spent while the scriptloader was calling EvaluateScript, it's the ES(94) number in the log. First of all it looks like we're only spending 94milliseconds executng script, out of 676ms spent in ResumeParse. Also, almost all calls to BuildModel happens after EvaluateScript finishes executing.
There is definitely a chance that we instrumeted things wrongly though, so don't take my word for it.
Reporter | ||
Comment 2•16 years ago
|
||
(In reply to comment #1)
> There is definitely a chance that we instrumeted things wrongly though, so
> don't take my word for it.
Nope, you're correct. I added a bunch of instrumentation here, and the path through ResumeParse that I missed was the inline script case. When we see an inline script, we stop tokenizing and BuildModel. In CNN's case, there are about 6 inline scripts in a row, some of which do document.write() (causing more rounds through BuildModel) and all of these rounds around BuildModel cause us to reset the "last time we were in the event loop" counter. Ugh.
I think Ben was going to try to patch this.
Comment 3•16 years ago
|
||
If I understand correctly, we code like the following to alert before 1000ms have elapsed?
<html>
<head>
<script>
function wait(ms) {
var start = new Date().getTime();
while (new Date().getTime() - start < ms)
/* empty */ ;
}
var start = new Date().getTime();
setTimeout(function() { alert(new Date().getTime() - start) }, 0);
for (var i = 0; i < 20; i++)
document.write("<scr"+"ipt>wait(50)</scr"+"ipt>");
</script>
</head>
<body></body>
</html>
Comment 4•16 years ago
|
||
(In reply to comment #3)
> If I understand correctly, we code like the following to alert before 1000ms
> have elapsed?
s/we code/we want code/
Reporter | ||
Comment 5•16 years ago
|
||
Yes.
Comment 6•16 years ago
|
||
Fix to come soon.
Reporter | ||
Comment 7•16 years ago
|
||
Wait, we can't fix that case. The browser can't interrupt the for loop and each of those scripts executes synchronously. Instead, replace the for loop with 20 inline scripts in the source.
Comment 8•16 years ago
|
||
No longer calling nsContentSink::WillProcessTokens (which resets mDelayTimerStart) in CNavDTD::BuildModel. This change partially implements mrbkap's suggestion, but it goes a bit too far, since it causes the timer never to be reset. In order to reset the timer less frequently (rather than never), I need to find other places to call WillProcessTokens (e.g., nsParser::OnDataAvailable).
Changed NS_MAX_TOKENS_DEFLECTED_IN_LOW_FREQ_MODE from 200 to 0. This is obviously too drastic. A value of 50 seems to produce good results on Vlad's XUL app and in the mobile browser. I was pandering to my somewhat unrealistic test case, which wastes time not by parsing tokens but by sitting in a javascript loop for 50ms at a time. So I need to (1) choose a more realistic value for the maximum number of tokens to deflect before sampling the clock and (2) rewrite my test so that it wastes time in a realistic way (i.e., by parsing a lot of tokens).
Have to run home now, but I wanted to put this out there so others can play with it (maybe on the n810 itself?).
Attachment #344709 -
Attachment is obsolete: true
Reporter | ||
Comment 9•16 years ago
|
||
Here's the log I generated (I'll attach the patch in a moment) to prove that we're seeing a lot of inline scripts (note: we only flush tokens for <script>).
Reporter | ||
Comment 10•16 years ago
|
||
So, the timer that was controlled by nsIHTMLContentSink::WillProcessTokens really wants to be timing the amount of time that we've spent since last running the event loop. Resetting the time from the DTD's BuildModel function was totally wrong for this, since we re-enter the DTD all the time and (for inline scripts) end up simply iterating again.
I also noticed, that when we implemented incremental XML parsing, we added yet *another* notification (nsIContentSink::WillTokenize) that is only used by the XML content sink for the purpose of calling WillProcessTokens.
I got rid of that in favor of a brand new notification, "WillParse" which is only called when we enter the parser from the event loop (the three functions where this happens are OnDataAvaiable, OnStopRequest and ContinueInterruptedParsing). I haven't tested this yet, but I do know it compiles and displays web pages correctly.
Attachment #345188 -
Flags: superreview?(jst)
Attachment #345188 -
Flags: review?(jst)
Reporter | ||
Comment 11•16 years ago
|
||
Oh, and the patch is gigantic because we have something like 7 or 8 implementations of nsI{,HTML}ContentSink in our tree.
Updated•16 years ago
|
Attachment #345188 -
Flags: superreview?(jst)
Attachment #345188 -
Flags: superreview+
Attachment #345188 -
Flags: review?(jst)
Attachment #345188 -
Flags: review+
Comment 12•16 years ago
|
||
Can we set this selectively for mobile?
Attachment #345027 -
Attachment is obsolete: true
Comment 13•16 years ago
|
||
Setting this differently for mobile makes a lot of sense...
The 200 value for desktop should probably be revisited too.
Reporter | ||
Comment 14•16 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/9d236c675d0b ... this isn't really fixed though, should I file a spinoff to track the remaining work needed?
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 15•16 years ago
|
||
yeah, please file another bug, since there is some work left to do here
Reporter | ||
Comment 16•16 years ago
|
||
For posterity's sake...
Comment 17•16 years ago
|
||
This landed before branching.
Attachment #345387 -
Flags: superreview+
Attachment #345387 -
Flags: review+
Attachment #345387 -
Flags: approval1.9.1+
Comment 18•16 years ago
|
||
The test included with my previous patch was a useful diagnostic, but actually enabling it seems like a bad idea, since its dependency on timeouts probably would cause intermittent mochitest failures.
Attachment #365826 -
Flags: review+
Assignee | ||
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•