nsScanner construction is 5% of setting innerHTML

RESOLVED FIXED in mozilla1.9beta3

Status

()

P5
normal
RESOLVED FIXED
11 years ago
11 years ago

People

(Reporter: bzbarsky, Assigned: bent.mozilla)

Tracking

(Blocks: 1 bug, {perf})

Trunk
mozilla1.9beta3
x86
Mac OS X
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 3 obsolete attachments)

(Reporter)

Description

11 years ago
This part of the profile in bug 386769:

36   6276 nsScanner::nsScanner(nsAString_internal&, nsACString_internal&, int)
     5242 nsScanner::SetDocumentCharset(nsACString_internal const&, int)
      733 nsScanner::AppendToBuffer(nsAString_internal const&)
       29 nsScanner::AppendToBuffer(nsScannerBufferList::Buffer*, nsIRequest*)
       29 do_GetService(char const*, unsigned int*)
       27 nsCOMPtr::nsCOMPtr(nsGetServiceByContractIDWithError const&)
       22 nsCharsetConverterManager::GetUnicodeDecoderRaw

This charset stuff seems silly.
(Reporter)

Updated

11 years ago
Keywords: perf

Comment 1

11 years ago
Nominating for blocking since this is a performance spinoff of 386769 
Flags: blocking1.9?
Assignee: nobody → jonas
Flags: blocking1.9? → blocking1.9+
Priority: -- → P5

Comment 2

11 years ago
Jonas you got anything here?
Assignee: jonas → bent.mozilla
Created attachment 297391 [details] [diff] [review]
Patch, v1

This patch special cases simple text to avoid the parser overhead. Sadly I had to make my own version of FindCharInSet because that is only implemented on the concrete string classes...

Using the testcase from attachment 288941 [details] (bug 386769) I see the following improvements (average of 10 runs):

                  | before | after
------------------+--------+-------
innerHTML only    |   53.7 |  21.5 (2.49x faster)
innerHTML + style |   93.0 |  58.3 (1.60x faster)
innerHTML concat  |   72.5 |  69.7 (1.04x faster)
innerHTML copy    |  118.9 |  52.3 (2.27x faster)
Attachment #297391 - Flags: superreview?(bzbarsky)
Attachment #297391 - Flags: review?(jst)
(Reporter)

Comment 4

11 years ago
Comment on attachment 297391 [details] [diff] [review]
Patch, v1

That looks wrong.  We do need to go through the parser to handle the various containment rules.  For example, "simple text" is not allowed inside some tags.

This bug was about simplifying the scanner setup for this case ("this" == innerHTML in general), not about trying to skip the parser altogether.  You can't do the latter, given how innerHTML is specified.
Attachment #297391 - Flags: superreview?(bzbarsky) → superreview-
Comment on attachment 297391 [details] [diff] [review]
Patch, v1

Oops, I attached this to the wrong bug (should have been bug 386769). But according to bz this patch is incorrect anyway.
Attachment #297391 - Attachment is obsolete: true
Attachment #297391 - Flags: review?(jst)

Comment 6

11 years ago
(In reply to comment #4)
> We do need to go through the parser to handle the various
> containment rules.
Could we perhaps bypass parser in some common, simple cases?
Like when setting text value to .innerHTML of a span or div?

(I thought I once wrote this comment already)
(Reporter)

Comment 7

11 years ago
Could be, but it has absolutely nothing to do with this bug.
Created attachment 298801 [details] [diff] [review]
Real patch, v1

Ok, this should be better. In my own profiles nsContentUtils::CreateContextualFragment spent 10.6% of its time in nsScanner::SetDocumentCharset. With this patch that's down to 2.5%.

Basic changes were:
1) prevent *3* GetService() calls per nsScanner object created
2) Change some string usage to copy less (both in comparisons and assignments).

I don't really see any other low-hanging stuff in here, but I'm definitely up for suggestions.
Attachment #298801 - Flags: review?(jst)
(Reporter)

Comment 9

11 years ago
> I don't really see any other low-hanging stuff in here

This apparently didn't make its way into this bug, somehow....  The low-hanging fruit here is that for the case when we're parsing an nsAString the charset that's being set on the scanner is completely unused, since we never pump _bytes_ into the scanner.  So instead (or in addition to) optimizing the code we should be skipping it altogether.  That is, have an nsScanner constructor that skips all the charset stuff.  Or something.
Created attachment 298868 [details] [diff] [review]
Real patch, v2

Ha, so that worked just fine. How does the rest of this look?
Attachment #298801 - Attachment is obsolete: true
Attachment #298868 - Flags: review?(bzbarsky)
Attachment #298801 - Flags: review?(jst)
(Reporter)

Comment 11

11 years ago
I'm not qualified to review the intl parts, and the scanner part should really get looked at by someone other than me (since I suggested the patch).  I'd try mrbkap for that and smontagu for intl.
Comment on attachment 298868 [details] [diff] [review]
Real patch, v2

Ok, thanks bz.

mrbkap could you look over the scanner/parser stuff here?

smontagu, what do you think of these intl changes?
Attachment #298868 - Flags: review?(smontagu)
Attachment #298868 - Flags: review?(mrbkap)
Attachment #298868 - Flags: review?(bzbarsky)
Comment on attachment 298868 [details] [diff] [review]
Real patch, v2

>diff -NprU8 mozilla.6e203f810693/parser/htmlparser/src/nsParser.cpp mozilla.239a07582cb1/parser/htmlparser/src/nsParser.cpp
>   if (aURL) {
>-    nsCAutoString spec;
>+    nsCString spec;

What's this change for? Won't this make us simply have to malloc in more cases?

>diff -NprU8 mozilla.6e203f810693/parser/htmlparser/src/nsScanner.cpp mozilla.239a07582cb1/parser/htmlparser/src/nsScanner.cpp

I have a couple of nits here, I know they were here before, but might as well fix them now.

>-    // different, need to change it
>-    nsCAutoString charsetName;
>-    res = calias->GetPreferred(aCharset, charsetName);
>+  }
>+  // different, need to change it
>+  nsCString charsetName;

Please add a newline before the comment.

>+  if(NS_FAILED(res) && (kCharsetUninitialized == mCharsetSource) )

konstant == variable looks weird to my eyes, I'd rather see this swapped.

fwiw, the last part of the diff could have used a helping of -w, but whatever, looks good from the parser end of things.
Attachment #298868 - Flags: review?(mrbkap) → review+
Comment on attachment 298868 [details] [diff] [review]
Real patch, v2

r=me for intl.

Not necessarily this bug, but it might be good to find a better test for identifying single byte decoders than {name begins with "iso-8859"}

Updated

11 years ago
Attachment #298868 - Flags: review?(smontagu) → review+
Created attachment 299805 [details] [diff] [review]
Patch with mrbkap's comments addressed

Carrying mrbkap's and smontagu's r+ forward, mrbkap's comments addressed. That string change was unintentional and has been reverted.
Attachment #298868 - Attachment is obsolete: true
Attachment #299805 - Flags: superreview?(bzbarsky)
Attachment #299805 - Flags: review+
(Reporter)

Comment 16

11 years ago
I doubt I'll be able to get to this before beta freeze.  Might be worth trying a different sr.

Updated

11 years ago
Attachment #299805 - Flags: superreview?(bzbarsky) → superreview+
Fixed.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9beta3

Comment 18

11 years ago
The patch for this broke my build, see bug 414677.
(Reporter)

Updated

11 years ago
Depends on: 414677

Updated

11 years ago
No longer blocks: 386769

Updated

11 years ago
Blocks: 442348
You need to log in before you can comment on or make changes to this bug.