Closed Bug 599982 Opened 14 years ago Closed 14 years ago

Optimize nsHTMLContentSerializer::AppendAndTranslateEntities

Categories

(Core :: DOM: Core & HTML, defect)

x86
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: smaug, Assigned: smaug)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 2 obsolete files)

nsHTMLContentSerializer::AppendAndTranslateEntities shows up pretty
badly in GetInnerHTML profiles.
In particular, called from attribute-serialization code....
That actually depends on the case, I think.
I think I can cut the time spent in AppendAndTranslateEntities to 
half or so in the cases when we don't actually translate any entities.
Patch coming after some testing.
Assignee: nobody → Olli.Pettay
Attached patch WIPSplinter Review
A bit ugly approach to make a fast loop for the innerHTML case.
test_bug424359-2.html fails with the patch.
Attached patch patch (obsolete) — Splinter Review
This is really for GetInnerHTML. Helps especially attribute handling.
Attachment #479393 - Flags: review?(jonas)
Comment on attachment 479393 [details] [diff] [review]
patch

Bah, I managed to do something wrong here.
Attachment #479393 - Flags: review?(jonas)
Attached patch patch (obsolete) — Splinter Review
Better to not call -- on PRUint32 variable which is 0.
Attachment #479393 - Attachment is obsolete: true
Attachment #479483 - Flags: review?(jonas)
Attachment #479483 - Flags: review?(laurent)
Comment on attachment 479483 [details] [diff] [review]
patch


>+      if (val <= kValNBSP && entityTable[val]) {
>+        // So there is something to replace. Iterate from
>+        // end to the first character which needs replament and
>+        // translate needed entities.
>+        nsAutoString str(aStr);

s/replament/replacement


Ok for me.
Attachment #479483 - Flags: review?(laurent) → review+
This still seems like it could be sped up quite a bit in the case when there are multiple characters in the string that needs to be turned into entities. Once you've found something that needs replacing, why not iterate from then on forwards and insert character by character as you go and replace on the fly?
I could profile that case too.
But inserting character by character is not cheap!
Ok, I overestimated the cost of Append.
(And it is getting faster in another bug.)
Attachment #479483 - Attachment is obsolete: true
Attachment #481794 - Flags: review?(jonas)
Attachment #479483 - Flags: review?(jonas)
But actually, I need to test also less worst-case scenario.
Comment on attachment 481794 [details] [diff] [review]
character by character

This is not good for not-worst-case test.
Attachment #481794 - Flags: review?(jonas)
Attachment #481791 - Attachment is patch: false
Attachment #481791 - Attachment mime type: text/plain → text/html
Attached patch patchSplinter Review
Attachment #481834 - Flags: review?(jonas)
Comment on attachment 481834 [details] [diff] [review]
patch

Actually, I realized that the way we usually do these types of things is that we first do one run-through and calculate the resulting length. Then we do a single allocation and then finally go through and convert the data. This in order to optimize for few allocations.

That's the strategy used for for example UTF8/16 conversion.

Anyhow, the current patch looks ok to me too.
Attachment #481834 - Flags: review?(jonas) → review+
thanks for the review.

afaics iterating twice would cause a significant perf regression
in cases when there are no entities.
With this patch you already are. First iterating while looking for an entity, then iterating when doing the string copy.
what "iterating when doing string copy"? Well, ok, if you think of memcpy
as iteration.
Attachment #481834 - Flags: approval2.0?
Yes, for the simple case when there are no entities it would reduce to the same code being executed. Something like:

Iterate through the string to measure the size. If the resulting size is the length of the string just do an assign. Otherwise call SetCapacity and iterate and either write a single char or write an entity. Once the remaining size is equal to the number of remaining chars, do a single string-append call.

Alternatively:

Iterate through the string to measure the size and also remember the location of the first found entity. Call SetCapacity. Copy up to the first found entity. Iterate and either write a single char or write an entity. Once the remaining size is equal to the number of remaining chars, do a single string-append call.
Does this need need to be pushed?
http://hg.mozilla.org/mozilla-central/rev/5066079367fe
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: