nsScanner construction is 5% of setting innerHTML

RESOLVED FIXED in mozilla1.9beta3

Status

()

defect
P5
normal
RESOLVED FIXED
12 years ago
11 years ago

People

(Reporter: bzbarsky, Assigned: bent.mozilla)

Tracking

(Blocks 1 bug, {perf})

Trunk
mozilla1.9beta3
x86
macOS
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 3 obsolete attachments)

This part of the profile in bug 386769:

36   6276 nsScanner::nsScanner(nsAString_internal&, nsACString_internal&, int)
     5242 nsScanner::SetDocumentCharset(nsACString_internal const&, int)
      733 nsScanner::AppendToBuffer(nsAString_internal const&)
       29 nsScanner::AppendToBuffer(nsScannerBufferList::Buffer*, nsIRequest*)
       29 do_GetService(char const*, unsigned int*)
       27 nsCOMPtr::nsCOMPtr(nsGetServiceByContractIDWithError const&)
       22 nsCharsetConverterManager::GetUnicodeDecoderRaw

This charset stuff seems silly.
Keywords: perf

Comment 1

12 years ago
Nominating for blocking since this is a performance spinoff of 386769 
Flags: blocking1.9?
Assignee: nobody → jonas
Flags: blocking1.9? → blocking1.9+
Priority: -- → P5

Comment 2

12 years ago
Jonas you got anything here?
Assignee: jonas → bent.mozilla
Posted patch Patch, v1 (obsolete) — Splinter Review
This patch special cases simple text to avoid the parser overhead. Sadly I had to make my own version of FindCharInSet because that is only implemented on the concrete string classes...

Using the testcase from attachment 288941 [details] (bug 386769) I see the following improvements (average of 10 runs):

                  | before | after
------------------+--------+-------
innerHTML only    |   53.7 |  21.5 (2.49x faster)
innerHTML + style |   93.0 |  58.3 (1.60x faster)
innerHTML concat  |   72.5 |  69.7 (1.04x faster)
innerHTML copy    |  118.9 |  52.3 (2.27x faster)
Attachment #297391 - Flags: superreview?(bzbarsky)
Attachment #297391 - Flags: review?(jst)
Comment on attachment 297391 [details] [diff] [review]
Patch, v1

That looks wrong.  We do need to go through the parser to handle the various containment rules.  For example, "simple text" is not allowed inside some tags.

This bug was about simplifying the scanner setup for this case ("this" == innerHTML in general), not about trying to skip the parser altogether.  You can't do the latter, given how innerHTML is specified.
Attachment #297391 - Flags: superreview?(bzbarsky) → superreview-
Comment on attachment 297391 [details] [diff] [review]
Patch, v1

Oops, I attached this to the wrong bug (should have been bug 386769). But according to bz this patch is incorrect anyway.
Attachment #297391 - Attachment is obsolete: true
Attachment #297391 - Flags: review?(jst)
(In reply to comment #4)
> We do need to go through the parser to handle the various
> containment rules.
Could we perhaps bypass parser in some common, simple cases?
Like when setting text value to .innerHTML of a span or div?

(I thought I once wrote this comment already)
Could be, but it has absolutely nothing to do with this bug.
Posted patch Real patch, v1 (obsolete) — Splinter Review
Ok, this should be better. In my own profiles nsContentUtils::CreateContextualFragment spent 10.6% of its time in nsScanner::SetDocumentCharset. With this patch that's down to 2.5%.

Basic changes were:
1) prevent *3* GetService() calls per nsScanner object created
2) Change some string usage to copy less (both in comparisons and assignments).

I don't really see any other low-hanging stuff in here, but I'm definitely up for suggestions.
Attachment #298801 - Flags: review?(jst)
> I don't really see any other low-hanging stuff in here

This apparently didn't make its way into this bug, somehow....  The low-hanging fruit here is that for the case when we're parsing an nsAString the charset that's being set on the scanner is completely unused, since we never pump _bytes_ into the scanner.  So instead (or in addition to) optimizing the code we should be skipping it altogether.  That is, have an nsScanner constructor that skips all the charset stuff.  Or something.
Posted patch Real patch, v2 (obsolete) — Splinter Review
Ha, so that worked just fine. How does the rest of this look?
Attachment #298801 - Attachment is obsolete: true
Attachment #298868 - Flags: review?(bzbarsky)
Attachment #298801 - Flags: review?(jst)
I'm not qualified to review the intl parts, and the scanner part should really get looked at by someone other than me (since I suggested the patch).  I'd try mrbkap for that and smontagu for intl.
Comment on attachment 298868 [details] [diff] [review]
Real patch, v2

Ok, thanks bz.

mrbkap could you look over the scanner/parser stuff here?

smontagu, what do you think of these intl changes?
Attachment #298868 - Flags: review?(smontagu)
Attachment #298868 - Flags: review?(mrbkap)
Attachment #298868 - Flags: review?(bzbarsky)
Comment on attachment 298868 [details] [diff] [review]
Real patch, v2

>diff -NprU8 mozilla.6e203f810693/parser/htmlparser/src/nsParser.cpp mozilla.239a07582cb1/parser/htmlparser/src/nsParser.cpp
>   if (aURL) {
>-    nsCAutoString spec;
>+    nsCString spec;

What's this change for? Won't this make us simply have to malloc in more cases?

>diff -NprU8 mozilla.6e203f810693/parser/htmlparser/src/nsScanner.cpp mozilla.239a07582cb1/parser/htmlparser/src/nsScanner.cpp

I have a couple of nits here, I know they were here before, but might as well fix them now.

>-    // different, need to change it
>-    nsCAutoString charsetName;
>-    res = calias->GetPreferred(aCharset, charsetName);
>+  }
>+  // different, need to change it
>+  nsCString charsetName;

Please add a newline before the comment.

>+  if(NS_FAILED(res) && (kCharsetUninitialized == mCharsetSource) )

konstant == variable looks weird to my eyes, I'd rather see this swapped.

fwiw, the last part of the diff could have used a helping of -w, but whatever, looks good from the parser end of things.
Attachment #298868 - Flags: review?(mrbkap) → review+
Comment on attachment 298868 [details] [diff] [review]
Real patch, v2

r=me for intl.

Not necessarily this bug, but it might be good to find a better test for identifying single byte decoders than {name begins with "iso-8859"}
Attachment #298868 - Flags: review?(smontagu) → review+
Carrying mrbkap's and smontagu's r+ forward, mrbkap's comments addressed. That string change was unintentional and has been reverted.
Attachment #298868 - Attachment is obsolete: true
Attachment #299805 - Flags: superreview?(bzbarsky)
Attachment #299805 - Flags: review+
I doubt I'll be able to get to this before beta freeze.  Might be worth trying a different sr.
Attachment #299805 - Flags: superreview?(bzbarsky) → superreview+
Fixed.
Status: NEW → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9beta3

Comment 18

12 years ago
The patch for this broke my build, see bug 414677.
Depends on: 414677

Updated

11 years ago
No longer blocks: 386769

Updated

11 years ago
Blocks: 442348
You need to log in before you can comment on or make changes to this bug.