Open Bug 1151048 Opened 9 years ago Updated 23 days ago

Slow pageload on a page full of unbalanced document.write() calls that make tokenizer speculation fail over and over again

Tracking

()

Status:

REOPENED

Tracking Flags:

Tracking

Status

firefox40

---

fixed

People

(Reporter: kangax, Unassigned)

References

(Blocks 1 open bug,
URL
)

Details

Attachments

(2 files)

Minimalish testcase showing the performance problem 9 years ago Boris Zbarsky [:bzbarsky] 129.11 KB, text/html		Details
Disable speculative tokenization in the parser if it's failing too much 9 years ago Boris Zbarsky [:bzbarsky] 3.93 KB, patch	hsivonen : review+	Details \| Diff \| Splinter Review

Juriy "kangax" Zaytsev

Reporter

Description

•

9 years ago

The ECMAScript compatibility table is rendering very slow on Firefox (tested on 40.0a1 (2015-04-03))

https://kangax.github.io/compat-table/es6/

It's rendering significantly faster on Chrome.

Juriy "kangax" Zaytsev

Reporter

Updated

•

9 years ago

URL: https://kangax.github.io/compat-table...

Tom S [:evilpie]

Updated

•

9 years ago

Mentor: evilpies

Boris Zbarsky [:bzbarsky]

Comment 1

•

9 years ago

It renders instantaneously if I turn off JS (and in particular, if I save the final DOM state and then turn on JS it renders instantaneously).

So the real question is how the scripts on this page are interacting with the layout.

The console for this page it has a whole bunch of:

  An unbalanced tree was written using document.write() causing data from the network to
  be reparsed. For more information 
  https://developer.mozilla.org/en/Optimizing_Your_Pages_for_Speculative_Parsing

So basically, we're speculating the parse on this page, but it's repeatedly making us throw away the speculation and try again due to the way we handle the document.write() calls it makes.  So we effectively end up tokenizing the page O(N) times where N is the number of unbalanced document.write calls, since I bet there's one or more per row of the table.  Profile agrees: 30% of it is HTML tokenization, another 15% is other parser work.

So what do the strings passed to these document.write() calls look like?  Like this:

  <td class="yes current">Yes</td><td></td>
  <td class="no current">No</td><td></td>

What's the unbalanced part?  That's due to where these document.write() calls are happening.  Stripping down and simplifying the markup a bit, it looks like this:

  <tr><td><span>test description</span><script>document.write(testResult)</script></td>...

So in particular, the speculation is that the </td> after the </script> will close the <td> that began before the <script>.  But then the document.write() writes out a <td>, which closes the currently-open <td>, because the <td> end tag is optional when followed by <td> or <th>.  And the </td> after the </script> becomes misplaced garbage that gets ignored, I think (because by the time the parser sees it, the open tag is <tr>, not <td>).

So a simple experiment.  If I make a copy of this page and measure the time to DOMContentLoaded, I get times on the order of 6-7s.  If I just throw in a </td> before ever <script data-source="...">, that drops to 1s.  Chrome's time is about 1.1s either way.

Henri, can we just turn off speculation in pathological cases like this to avoid wasting time on it or something?

Component: Layout: Tables → HTML: Parser

Flags: needinfo?(hsivonen)

Summary: Table rendering very slow → Slow pageload on a page full of unbalanced document.write() calls that make tokenizer speculation fail over and over again

Boris Zbarsky [:bzbarsky]

Comment 2

•

9 years ago

Attached file Minimalish testcase showing the performance problem — Details

This testcase alerts somewhere in the range of 900ms for me.  If I just add a </td> before every <script>, it's 140ms.

Interestingly, doubling the number of rows doesn't seem to quadruple the time, just doubles it, so we're not actually ending up with O(N^2) behavior; just a much larger constant.

Henri Sivonen (:hsivonen)

Comment 3

•

9 years ago

(In reply to Not doing reviews right now from comment #1)
> Henri, can we just turn off speculation in pathological cases like this to
> avoid wasting time on it or something?

I think (untested!) we could do this:

 * Add a failure counter to nsHtml5StreamParser.
 * Increment the failure counter in nsHtml5StreamParser::ContinueAfterScripts whenever speculationFailed gets set to true.
 * Before the loop starts in nsHtml5StreamParser::ParseAvailableData, if mSpeculating is true and the failure counter exceeds some magic threshold, return early.

Any idea what the magic threshold should be?

But is there a reason to believe that this failure mode is common or bad enough that adding complexity is justified (as opposed to asking kangax not to use document.write like this on this particular page)?

(ni: bz for the questions.)

Flags: needinfo?(hsivonen) → needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 4

•

9 years ago

> Any idea what the magic threshold should be?

100 seems like a nice round number.

> is there a reason to believe that this failure mode is common or bad enough

It just seems pretty easy to trigger...  If the fix is simple enough, I think we should just fix it.  Alternately, we could try adding telemetry to see how many speculations we end up failing per document or something.

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 5

•

9 years ago

Attached patch Disable speculative tokenization in the parser if it's failing too much — Details — Splinter Review

Oddly, this changes nothing about my small testcase, but it definitely helps the original page.  That's the opposite of how small performance testcases usually go!

Attachment #8591008 - Flags: review?(hsivonen)

Boris Zbarsky [:bzbarsky]

Updated

•

9 years ago

Assignee: nobody → bzbarsky

Status: NEW → ASSIGNED

Henri Sivonen (:hsivonen)

Updated

•

9 years ago

Attachment #8591008 - Flags: review?(hsivonen) → review+

Boris Zbarsky [:bzbarsky]

Comment 6

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/d22bb27be448

Carsten Book [:Tomcat]

Comment 7

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/d22bb27be448

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

status-firefox40: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla40

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Comment 8

•

9 years ago

I did some quick verification for this today on Windows 7 x64 and Mac OS X 10.8.5, but I see almost no difference between Firefox 39.0 and Firefox 40.0 Beta 3.

Test results:
- Minimal testcase 
     - ~450-550 on both Mac & Windows with Firefox (39 & 40b3)
     - ~450-550 on Mac with Chrome
     - ~230-250 on Win with Chrome
- https://kangax.github.io/compat-table/es6/
     - ~11 seconds on both Mac & Windows with Firefox (39 & 40b3)
     - ~3 seconds on Mac & Windows with Chrome

Boris, any thoughts on this? Are you guys getting similar numbers?

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 9

•

9 years ago

Yeah, something is fishy here.  I could have sworn I saw a 2x or so speedup on the page itself, but I can't reproduce that now... and worse yet I never see us take the early return if !IsSpeculationEnabled().

I'm going to just reopen this, because it's clearly not fixed.  :(  Thank you for catching it!

Status: RESOLVED → REOPENED

Flags: needinfo?(bzbarsky)

Resolution: FIXED → ---

Target Milestone: mozilla40 → ---

Boris Zbarsky [:bzbarsky]

Comment 10

•

9 years ago

So it looks like the problem is that mSpeculating is always false when we enter nsHtml5StreamParser::ParseAvailableData.  Then we get to the mTreeBuilder->HasScript() bit, create the nsHtml5Speculation, then it fails and we set mSpeculating backto false.

I tried just early-returning from inside the mTreeBuilder->HasScript() block, but then we never manage to finish parsing.  :(

Boris Zbarsky [:bzbarsky]

Comment 11

•

9 years ago

And in fact, doing it then is too late: we've already tokenized mFirstBuffer.

At this point I'm really not sure what the right way to disable speculation is here.  It seems like we really want the tokenizer to stop if it gets to a <script> token and speculation is disabled...  Or something.

Going to needinfo myself again for now, since Henri's not accepting needinfo requests. But really, I need to talk to him about this.

Assignee: bzbarsky → nobody

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 12

•

9 years ago

Henri, thoughts on comment 10 and comment 11?

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Updated

•

9 years ago

Flags: needinfo?(hsivonen)

Juriy "kangax" Zaytsev

Reporter

Comment 13

•

9 years ago

Also found this ticket which is somewhat related but not really — https://bugzilla.mozilla.org/show_bug.cgi?id=1191995 That one is about poor scrolling performance. So it seems that this huge table of ours is really causing multiple issues (poor loading perf., poor after-loading perf.)

I'd also love to know if there's anything we (authors) can do to improve perf.

Boris Zbarsky [:bzbarsky]

Comment 14

•

9 years ago

Sure.  Don't output misnested stuff with document.write.  e.g. see comment 2 first paragraph.

Henri Sivonen (:hsivonen)

Updated

•

8 years ago

Blocks: 1298764

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Henri Sivonen (:hsivonen)

Comment 15

•

23 days ago

(In reply to Boris Zbarsky [:bzbarsky] from comment #12)

Henri, thoughts on comment 10 and comment 11?

I think we need to introduce some new state in https://searchfox.org/mozilla-central/rev/159929cd10b8fba135c72a497d815ab2dd5a521c/parser/html/nsHtml5StreamParser.cpp#2499 and then make it so that no data from network gets tokenized while in that state and ContinueAfterScripts exists that state.

Flags: needinfo?(hsivonen)

You need to log in before you can comment on or make changes to this bug.