Closed Bug 461031 Opened 17 years ago Closed 17 years ago

Investigate mobile responsiveness problems caused by not returning to the event loop

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: mrbkap, Unassigned)

References

Details

(4 keywords)

Attachments

(5 files, 2 obsolete files)

mochitest case 17 years ago Ben Newman (:bnewman) (:benjamn) 1.96 KB, patch		Details \| Diff \| Splinter Review
WIP proof-of-concept patch 17 years ago Ben Newman (:bnewman) (:benjamn) 5.85 KB, patch		Details \| Diff \| Splinter Review
Log of parser during cnn.com 17 years ago Blake Kaplan (:mrbkap) (inactive) 27.43 KB, text/plain		Details
More complete parser patch 17 years ago Blake Kaplan (:mrbkap) (inactive) 26.01 KB, patch	jst : review+ jst : superreview+	Details \| Diff \| Splinter Review
controlling max deflected tokens with a user preference 17 years ago Ben Newman (:bnewman) (:benjamn) 7.23 KB, patch	sicking : review+ sicking : superreview+ sicking : approval1.9.1+	Details \| Diff \| Splinter Review
Patch that I used to get the CNN log 16 years ago Blake Kaplan (:mrbkap) (inactive) 14.28 KB, patch		Details \| Diff \| Splinter Review
patch 345387 without buggy mochitest 16 years ago Ben Newman (:bnewman) (:benjamn) 2.63 KB, patch	mozilla+ben : review+	Details \| Diff \| Splinter Review

Blake Kaplan (:mrbkap) (inactive)

Reporter

Description

•

17 years ago

Vlad wrote an instrumentation patch to see why Fennec running on VMs has serious responsiveness issues while loading "heavy" pages such as CNN.com. Part of his patch works by basically doing setTimout(0) and watching to see how long it takes before the setTimeouts fire. On a VM, he saw a lot of setTimeouts firing at regular intervals, followed by a massive slowdown (on the order of one second). In a test run, he saw the following event being handled: {BM(44)BM(10)ES(94)BM(103)BM(7)BM(54)BM(18)BM(37)BM(33)BM(11)BM(17)BM(10)BM(21)BM(62)BM(20)BM(16)BM(20)BM(7)BM(40)BM(17)BM(92)RP4(676)-E(0xb1a53de0,676)} To translate: we're calling BuildModel a ton of times during the same event without ever returning to the event loop. The sum of all of the BuildModel calls is quite a lot and could explain the loss of responsiveness. I realized that what this probably means is that we're executing a script with many calls to document.write(), each of which calls into nsParser::BuildModel. In this case, the parser won't ever interrupt (and even when it's done parsing, we won't get all the way back out to the event loop). We could verify this by adding a printf around nsParser::ResumeParse to make sure that we're not sticking around in ResumeParse forever but actually entering and exiting the parser code.

Stuart Parmenter

Updated

•

17 years ago

Blocks: 441364

Stuart Parmenter

Updated

•

17 years ago

Flags: blocking1.9.1?

Keywords: mobile, perf, topperf

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 1

•

17 years ago

The instrumentation indicates that that is not what is happening. We were instrumenting how much time was spent while the scriptloader was calling EvaluateScript, it's the ES(94) number in the log. First of all it looks like we're only spending 94milliseconds executng script, out of 676ms spent in ResumeParse. Also, almost all calls to BuildModel happens after EvaluateScript finishes executing. There is definitely a chance that we instrumeted things wrongly though, so don't take my word for it.

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 2

•

17 years ago

(In reply to comment #1) > There is definitely a chance that we instrumeted things wrongly though, so > don't take my word for it. Nope, you're correct. I added a bunch of instrumentation here, and the path through ResumeParse that I missed was the inline script case. When we see an inline script, we stop tokenizing and BuildModel. In CNN's case, there are about 6 inline scripts in a row, some of which do document.write() (causing more rounds through BuildModel) and all of these rounds around BuildModel cause us to reset the "last time we were in the event loop" counter. Ugh. I think Ben was going to try to patch this.

Ben Newman (:bnewman) (:benjamn)

Comment 3

•

17 years ago

If I understand correctly, we code like the following to alert before 1000ms have elapsed? <html> <head> <script> function wait(ms) { var start = new Date().getTime(); while (new Date().getTime() - start < ms) /* empty */ ; } var start = new Date().getTime(); setTimeout(function() { alert(new Date().getTime() - start) }, 0); for (var i = 0; i < 20; i++) document.write("<scr"+"ipt>wait(50)</scr"+"ipt>"); </script> </head> <body></body> </html>

Ben Newman (:bnewman) (:benjamn)

Comment 4

•

17 years ago

(In reply to comment #3) > If I understand correctly, we code like the following to alert before 1000ms > have elapsed? s/we code/we want code/

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 5

•

17 years ago

Yes.

Ben Newman (:bnewman) (:benjamn)

Comment 6

•

17 years ago

Attached patch mochitest case (obsolete) — Details — Splinter Review

Fix to come soon.

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 7

•

17 years ago

Wait, we can't fix that case. The browser can't interrupt the for loop and each of those scripts executes synchronously. Instead, replace the for loop with 20 inline scripts in the source.

Ben Newman (:bnewman) (:benjamn)

Comment 8

•

17 years ago

Attached patch WIP proof-of-concept patch (obsolete) — Details — Splinter Review

No longer calling nsContentSink::WillProcessTokens (which resets mDelayTimerStart) in CNavDTD::BuildModel. This change partially implements mrbkap's suggestion, but it goes a bit too far, since it causes the timer never to be reset. In order to reset the timer less frequently (rather than never), I need to find other places to call WillProcessTokens (e.g., nsParser::OnDataAvailable). Changed NS_MAX_TOKENS_DEFLECTED_IN_LOW_FREQ_MODE from 200 to 0. This is obviously too drastic. A value of 50 seems to produce good results on Vlad's XUL app and in the mobile browser. I was pandering to my somewhat unrealistic test case, which wastes time not by parsing tokens but by sitting in a javascript loop for 50ms at a time. So I need to (1) choose a more realistic value for the maximum number of tokens to deflect before sampling the clock and (2) rewrite my test so that it wastes time in a realistic way (i.e., by parsing a lot of tokens). Have to run home now, but I wanted to put this out there so others can play with it (maybe on the n810 itself?).

Attachment #344709 - Attachment is obsolete: true

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 9

•

17 years ago

Attached file Log of parser during cnn.com — Details

Here's the log I generated (I'll attach the patch in a moment) to prove that we're seeing a lot of inline scripts (note: we only flush tokens for <script>).

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 10

•

17 years ago

Attached patch More complete parser patch — Details — Splinter Review

So, the timer that was controlled by nsIHTMLContentSink::WillProcessTokens really wants to be timing the amount of time that we've spent since last running the event loop. Resetting the time from the DTD's BuildModel function was totally wrong for this, since we re-enter the DTD all the time and (for inline scripts) end up simply iterating again. I also noticed, that when we implemented incremental XML parsing, we added yet *another* notification (nsIContentSink::WillTokenize) that is only used by the XML content sink for the purpose of calling WillProcessTokens. I got rid of that in favor of a brand new notification, "WillParse" which is only called when we enter the parser from the event loop (the three functions where this happens are OnDataAvaiable, OnStopRequest and ContinueInterruptedParsing). I haven't tested this yet, but I do know it compiles and displays web pages correctly.

Attachment #345188 - Flags: superreview?(jst)

Attachment #345188 - Flags: review?(jst)

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 11

•

17 years ago

Oh, and the patch is gigantic because we have something like 7 or 8 implementations of nsI{,HTML}ContentSink in our tree.

Johnny Stenback (:jst)

Updated

•

17 years ago

Attachment #345188 - Flags: superreview?(jst)

Attachment #345188 - Flags: superreview+

Attachment #345188 - Flags: review?(jst)

Attachment #345188 - Flags: review+

Ben Newman (:bnewman) (:benjamn)

Comment 12

•

17 years ago

Attached patch controlling max deflected tokens with a user preference — Details — Splinter Review

Can we set this selectively for mobile?

Attachment #345027 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Comment 13

•

17 years ago

Setting this differently for mobile makes a lot of sense... The 200 value for desktop should probably be revisited too.

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 14

•

17 years ago

http://hg.mozilla.org/mozilla-central/rev/9d236c675d0b ... this isn't really fixed though, should I file a spinoff to track the remaining work needed?

Status: NEW → RESOLVED

Closed: 17 years ago

Resolution: --- → FIXED

Stuart Parmenter

Comment 15

•

17 years ago

yeah, please file another bug, since there is some work left to do here

Blake Kaplan (:mrbkap) (inactive)

Reporter

Comment 16

•

16 years ago

Attached patch Patch that I used to get the CNN log — Details — Splinter Review

For posterity's sake...

Benjamin Smedberg

Comment 17

•

16 years ago

This landed before branching.

Flags: blocking1.9.1? → blocking1.9.1+

Keywords: fixed1.9.1

Priority: -- → P1

Stuart Parmenter

Updated

•

16 years ago

Blocks: 481566

Jonas Sicking (:sicking) No longer reading bugmail consistently

Updated

•

16 years ago

Attachment #345387 - Flags: superreview+

Attachment #345387 - Flags: review+

Attachment #345387 - Flags: approval1.9.1+

Ben Newman (:bnewman) (:benjamn)

Comment 18

•

16 years ago

Attached patch patch 345387 without buggy mochitest — Details — Splinter Review

The test included with my previous patch was a useful diagnostic, but actually enabling it seems like a bad idea, since its dependency on timeouts probably would cause intermittent mochitest failures.

Attachment #365826 - Flags: review+

timeless

Updated

•

16 years ago

Component: Content → DOM

QA Contact: content → general

Nobody; OK to take it and work on it

Assignee

Updated

•

6 years ago

Component: DOM → DOM: Core & HTML

You need to log in before you can comment on or make changes to this bug.