Optimize nsHTMLContentSerializer::AppendAndTranslateEntities

RESOLVED FIXED

Status

()

Core
DOM: Core & HTML
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: smaug, Assigned: smaug)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments, 2 obsolete attachments)

(Assignee)

Description

8 years ago
nsHTMLContentSerializer::AppendAndTranslateEntities shows up pretty
badly in GetInnerHTML profiles.
In particular, called from attribute-serialization code....
(Assignee)

Comment 2

8 years ago
That actually depends on the case, I think.
(Assignee)

Comment 3

8 years ago
I think I can cut the time spent in AppendAndTranslateEntities to 
half or so in the cases when we don't actually translate any entities.
Patch coming after some testing.
Assignee: nobody → Olli.Pettay
(Assignee)

Comment 4

8 years ago
Created attachment 479095 [details] [diff] [review]
WIP

A bit ugly approach to make a fast loop for the innerHTML case.
(Assignee)

Comment 5

8 years ago
test_bug424359-2.html fails with the patch.
(Assignee)

Comment 6

8 years ago
Created attachment 479393 [details] [diff] [review]
patch

This is really for GetInnerHTML. Helps especially attribute handling.
Attachment #479393 - Flags: review?(jonas)
(Assignee)

Comment 7

8 years ago
Comment on attachment 479393 [details] [diff] [review]
patch

Bah, I managed to do something wrong here.
Attachment #479393 - Flags: review?(jonas)
(Assignee)

Comment 8

8 years ago
Created attachment 479483 [details] [diff] [review]
patch

Better to not call -- on PRUint32 variable which is 0.
Attachment #479393 - Attachment is obsolete: true
Attachment #479483 - Flags: review?(jonas)
(Assignee)

Updated

8 years ago
Attachment #479483 - Flags: review?(laurent)

Comment 9

8 years ago
Comment on attachment 479483 [details] [diff] [review]
patch


>+      if (val <= kValNBSP && entityTable[val]) {
>+        // So there is something to replace. Iterate from
>+        // end to the first character which needs replament and
>+        // translate needed entities.
>+        nsAutoString str(aStr);

s/replament/replacement


Ok for me.
Attachment #479483 - Flags: review?(laurent) → review+
This still seems like it could be sped up quite a bit in the case when there are multiple characters in the string that needs to be turned into entities. Once you've found something that needs replacing, why not iterate from then on forwards and insert character by character as you go and replace on the fly?
(Assignee)

Comment 11

8 years ago
I could profile that case too.
But inserting character by character is not cheap!
(Assignee)

Comment 12

8 years ago
Created attachment 481791 [details]
lots of entities -testcase
(Assignee)

Comment 13

8 years ago
Created attachment 481794 [details] [diff] [review]
character by character

Ok, I overestimated the cost of Append.
(And it is getting faster in another bug.)
Attachment #479483 - Attachment is obsolete: true
Attachment #481794 - Flags: review?(jonas)
Attachment #479483 - Flags: review?(jonas)
(Assignee)

Comment 14

8 years ago
But actually, I need to test also less worst-case scenario.
(Assignee)

Comment 15

8 years ago
Comment on attachment 481794 [details] [diff] [review]
character by character

This is not good for not-worst-case test.
Attachment #481794 - Flags: review?(jonas)
(Assignee)

Updated

8 years ago
Attachment #481791 - Attachment is patch: false
Attachment #481791 - Attachment mime type: text/plain → text/html
(Assignee)

Comment 16

8 years ago
Created attachment 481796 [details]
less-entities -tescase
(Assignee)

Comment 17

8 years ago
Created attachment 481834 [details] [diff] [review]
patch
Attachment #481834 - Flags: review?(jonas)
Comment on attachment 481834 [details] [diff] [review]
patch

Actually, I realized that the way we usually do these types of things is that we first do one run-through and calculate the resulting length. Then we do a single allocation and then finally go through and convert the data. This in order to optimize for few allocations.

That's the strategy used for for example UTF8/16 conversion.

Anyhow, the current patch looks ok to me too.
Attachment #481834 - Flags: review?(jonas) → review+
(Assignee)

Comment 19

8 years ago
thanks for the review.

afaics iterating twice would cause a significant perf regression
in cases when there are no entities.
With this patch you already are. First iterating while looking for an entity, then iterating when doing the string copy.
(Assignee)

Comment 21

8 years ago
what "iterating when doing string copy"? Well, ok, if you think of memcpy
as iteration.
(Assignee)

Updated

8 years ago
Attachment #481834 - Flags: approval2.0?
Yes, for the simple case when there are no entities it would reduce to the same code being executed. Something like:

Iterate through the string to measure the size. If the resulting size is the length of the string just do an assign. Otherwise call SetCapacity and iterate and either write a single char or write an entity. Once the remaining size is equal to the number of remaining chars, do a single string-append call.

Alternatively:

Iterate through the string to measure the size and also remember the location of the first found entity. Call SetCapacity. Copy up to the first found entity. Iterate and either write a single char or write an entity. Once the remaining size is equal to the number of remaining chars, do a single string-append call.
Does this need need to be pushed?
(Assignee)

Comment 24

8 years ago
http://hg.mozilla.org/mozilla-central/rev/5066079367fe
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.