[PERF] Tracking: Performance improvements plaintext paste and insert

NEW
Unassigned

Status

()

P3
major
19 years ago
6 months ago

People

(Reporter: mscott, Unassigned)

Tracking

(Depends on: 1 bug, Blocks: 1 bug, {perf})

Trunk
x86
Other
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [perf])

Attachments

(2 attachments)

(Reporter)

Description

19 years ago
On a debug windows PII-450 MHZ machine with 256 MB of RAM, it takes 3 minutes to
insert 130K of text into the compose window. That is, I'm replying to a 100K
text message and it's taking about that long before I see the body inserted into
the editor instance.

I've done a lot of analysis on where the hotspots were for this operation and
one of the big ones was here in WillInsertText.

I sent a more detailed message to Joe and some others but here's the gist of
where we can kill some low hanging fruit:

1) The method is taking an nsString which is over 130K (corresponding to the
message body) as an input argument and it isn't passing the nsString by
reference. This causes an extra allocation of the string.

2) This method extracts one line at a time, copying that line into a new
nsString before calling DoTextInsertion (which appears to be an expensive
operation). A couple things we do here, searching for the next line is actually
expensive using nsString::FindCharInSet on a string as large as this one. Can
DoTextInsertion take larger chunks than one line at time? Also, I replaced the
search code with FindChar which is a little faster and this helped a bit.
 Also, we don't need to copy each extracted line into a new nsString and passing
that string to DoTextInsertion. Instead, we could pass an index into the
original string into DoTextInsertion.

3) We make a couple copies of the input String in this method. At 130K per copy
they are expensive. I wonder if we can't somehow re-examine if they are all
necessary.

Thanks!
(Reporter)

Comment 1

19 years ago
Nominating for beta1, adding perf to the keyword.
Keywords: beta1, perf
(Reporter)

Comment 2

19 years ago
Just for analysis sake (I'm definetly not saying this is the right thing to do).
I modified nsTextEditRules::WillInsertText to insert the entire message body in
one step instead of trying to find a line, copy that line out into a new
nsString then insert that string into the document and repeat.

Time BEFORE: 3-4 minutes
Time AFTER: 20 seconds

The remaining 20 seconds is time spent in nsHTMLToTXTSinkStream::Write searching
for new line characters in this huge nsString.

This gives us an idea of how much of a performance win we are looking at in here.

Keep in mind that my numbers for a high end machine too.

Comment 3

19 years ago
m15; cc sfraser
Target Milestone: M15
(Reporter)

Comment 4

19 years ago
Actually I'm nominating this for beta1. I'm not sure we can triage it to m15
which is after beta1 until PDT determines if they will rule it PDT+.

Comment 5

19 years ago
*** Bug 26388 has been marked as a duplicate of this bug. ***

Comment 6

19 years ago
acceptin bug
Status: NEW → ASSIGNED

Updated

19 years ago
Whiteboard: [PDT+]

Comment 7

19 years ago
*** Bug 19273 has been marked as a duplicate of this bug. ***

Comment 8

19 years ago
pasting in my comments from email:

Scott MacGregor wrote:
> We've got a really bad performance problem [...]
> 1) nsTextEditRules:: WillInsertText and DoTextInsertion are costing us
> a lot. [...]  Mailnews is giving the editor instance a nsString,  
> in my example this string was 130K. WillInsertText makes several 
> copies of the string .

You correctly point out that this routine is slow.  We can certainly
speed it up, though if it's really taking 4 minutes for you then we may
be doomed.  

> First, we should pass the nsString by reference in the method
> signature for WillInsertText to reduce an extra copy of the string
> when calling the method. Second, we immediately copy the entire string
> into aOutString as part of the intialization work even though later we
> may not end up using that copied value. Can we avoid that?

I would like to get rid of the outString (I've wanted to do this
before).  In some cases outString simply makes no sense.  But there are
other cases where some old code depends on it.  I'm hopeful that we can
ditch it with a little work, though.

> However I think the most expensive part of this method is the while
> loop which determines how we call DoTextInsertion. [...] Ideally,
> it'd be great if DoTextInsertion could handle chunks larger than just
> one line a time.

Any way you slice it, I have to break it up into runs of text separated
by newlines.  This is because I have to use <break> nodes instead of
newlines, even when inside of preformatted text.  The selection & caret
code cant deal with blank lines caused by newlines in preformatted text,
but it can deal with break nodes.  

At the bottom level, we have to insert a separate text node for each run
of chars between breaks.  

> I'll file a bug so we can get the ball rolling on this part for beta1
> if we can.

I can cut down on string copying for beta1.  And we can tune the rest. 
But the basic problem (needing to scan through/break up huge text
strings) remains.  :-(

If you can talk the layout folks into generating frames for blank lines
caused by newlines in preformatted text, then we can avoid this problem.
(Reporter)

Comment 9

19 years ago
>If you can talk the layout folks into generating frames for blank lines caused
>by newlines in preformatted text, then we can avoid this problem.
I agree completely. I've been running with that little hack in my tree where i
just insert the entire string in WillInsertText instead of slicing line by line
and it is so much faster. Although I did it as just a hack, I don't see any side
effects! I must be missing something =)

In any case I will definetly help push on the layout folks with you after beta1
so you can get out of the business of reading out line by line.

Even if you have to insert one line at a time, you could get a pretty big
performance boost by not copying each line into a new nsString. If you just
passed a ptr into the string buffer and a length parameter into DoTextInsertion
instead of a new nsString you'd save a lot.

I briefly looked at doing this and it looks like it would involve changing
several method signatures underneath DoTextInsertion to also take a PRUnichar *
+ a length instead of a nsString. But I didn't see that any of these methods
were modifying the input nsString in any way so it may be as easy as just
changing the signatures.

This would help save a lot as then you are only bounded by:
1) the time it takes to search out the position of the next new line
2) the time it takes for the right frame creation and text insertion for each
line of text processed by DoTextInsertion.

Just a few thoughts...

Comment 10

19 years ago
Reassigning to kin@netscape.com. I'll take a look at rewriting some of the code to use nsSubsumeStr to see if that speeds some things up.
Assignee: jfrancis → kin
Status: ASSIGNED → NEW

Comment 11

19 years ago
Must have a fix by 03/03 or will punt for beta1.
Whiteboard: [PDT+] → [PDT+] Need fix by 03/03

Comment 12

19 years ago
Is this performance an end user issue?  If not, can we push this to post beta 1?
(Reporter)

Comment 13

19 years ago
bijals: yes it is. Please read my comments at the top of the bug report where I
report that inserting a 100K document into an ender widget takes over 4 minutes
on a high end machine. This is very common when replying to large emails. I'd
say this effects the end user =).

Updated

19 years ago
Whiteboard: [PDT+] Need fix by 03/03 → [PDT+] w/b minus on 03/03

Comment 14

19 years ago
Accepting bug.
Status: NEW → ASSIGNED

Comment 15

19 years ago
I think think the searching, dup'ing, and splitting of the 160k nsString is the 
least of our worries. I commented out the calls to InsertBreak() and 
DoTextInsertion() and found that the searching, dup'ing, and splitting of the 
160k string only took about 4 seconds of the total time ... I was able to get 
this down to less than a second by rewriting the code to use offsets 
and nsSubsumeStr.

I'm currently trying get Quantify to work to see where the hotspots in the 
InsertBreak() and DoTextInsertion() calls are.
(Reporter)

Comment 16

19 years ago
I agree completely with the numbers you saw =). I've been running with a hacked
patch in my tree where we always insert the entire nsString in one shot instead
of extracting line by line. I know this is supposed to be wrong but I've been
running with it in my tree all week and haven't seen anything bad happen.
Granted I'm only testing mailnew compose and not Editing documents.

Comment 17

19 years ago
I'll take a look at this.
Assignee: kin → sfraser
Status: ASSIGNED → NEW

Updated

19 years ago
Blocks: 29584

Comment 18

19 years ago
cc'ing akkana

Comment 19

19 years ago
OK, we've found 3-4 performance hotspots that contribute to slowness here:

1. The implementation of nsEditor::GetChildOffset() is slow; each call could
   take from 2ms upwards, depending on the number of children.

   We can fix this method to use nsIContent::IndexOf(), which cuts the call time
   down to around 0.021 ms

2. nsIDOMSelection::Collapse is being called a bunch of times, and some of those
   calls can be time-consuming; 1.7ms or more. Those that take significant time
   spent most of that time in selectFrames(), which does 2 CreateInstance calls
   and some iterating through frames. Some of this is avoidable.

3. The call to nsHTMLEditor::InsertFormattingForNode() on each new text node
   calls nsIDOMElement::SetAttribute(), which is slow for new nodes (because it
   calls GetPrimaryFrameFor() on an uncached node; see bug 29730). One call to 
   SetAttribute takes around 1.8ms, of which 1.5ms is in GetPrimaryFrameFor().

4. We don't currently recycle nsRange objects. We could.

5. nsEditor::InsertTextImpl() was playing funny games with inserting a " "
   string, then selecting it, then replacing it with the inserted text. This
   can probably be removed.

Comment 20

19 years ago
So with provisionary fixes to 1, 2, 3 and 5 we halved the time taken to insert 
1000 lines of text, from 20 seconds to around 10 seconds. This is not good 
enough, but a start.

Comment 21

19 years ago
simon and i thought the numbers here sounded suspicious, since kin was reporting 
4 minutes to paste with a 2000 line file (using the same lines we were using in 
the shorter test).  So I independently did the comarison in my tree.  I confirm 
Simon's numbers.  I did a 500 line test, and got 5.1 seconds with our speedups, 
and 9.5 seconds without them.  

With the full 2200+ lines in Kin's test file, my numbers were 57.5 seconds vs 	
130 seconds.  As Simon said, we can do better.

Comment 22

19 years ago
just filed bug 29737 on seelction being slow after you paste in the 2200 line 
file.  I suspect it's the same GetPrimaryFrameFor() problem, but I haven't 
investigated.

Comment 23

19 years ago
I've got pasting the 500 lines down to about 3 seconds, and 2200+ lines down to 
about 27 seconds on my PII 450 Mhz machine. I'll attatch the diffs that I'm 
using.

My diffs contain the following changes:

  1. GetChildOffset() calls nsIContent::IndexOf().
  2. Commented out the setting of " " in InsertTextImpl().
  3. Call InsertFormattingForNode() before InsertBefore() in 
CreateElementTxn::Do().
  4. Rewrite of nsHTMLEditRules::WillInsertText() and 
nsTextEditRules::WillInsertText() code to use nsSubsumeString() instead of 
duping and splitting.
  5. Reordered the CID checks in TransactionFactory::GetNewTransaction()so that 
the most commonly used during editing and pasting are at the top.
  6. nsDOMSelection::selectFrames() bails early if the selection is collapsed 
AND we are currently selecting the range passed in.

Comment 24

19 years ago
Created attachment 5957 [details] [diff] [review]
Patch set containing fixes from kin, mjudge, jfrancis, and sfraser

Comment 25

19 years ago
Some things I'm noticing:

1. Pasting in HTML Mail Compose and Composer are faster than pasting in 
PlainText Mail Compose. Pasting 500 lines in HTML Mail Compose and Composer is 
about 3 seconds, and 10 seconds in PlainText Mail.

2. In Composer, if I paste 2200+ lines, then undo or SelectAll-Delete, and then 
paste again, 10 seconds is added to the total amount of time it takes for the 
content to appear. If I repeat this again, another 10 seconds is added to the 
total time.

Comment 26

19 years ago
cc: troy, since I'm going to be mentioning GetPrimaryFrameFor
Status: NEW → ASSIGNED

Comment 27

19 years ago
I don't want to hear any mention of GetPrimaryFrameFor(). It's exactly the same 
issue as for BR frames. We don't add text frames to the map by default, because 
it drastically increases the size of the map, and text nodes are anonymous so 
normal DOM usage doesn't involve doing a content->frame lookup

If you want text nodes added to the map by default in editable documents, then 
go ahead

Comment 28

19 years ago
Here's a rundown of where I think we're spending time, after applying fixes
for the performance problems described above. These data are for pasting
a single line of text (text, then a <br>) into the editor (taken from the 
middle of pasting 100 lines). Times are microseconds.

Two main top-level calls themselves make up almost 100% of the total
pasting time:

                                                 Num calls      Time (us)
--------------------------------------------------------------------------
nsHTMLEditor::InsertBreak							1			1690
nsEditor::InsertTextImpl (enclosing call only)		1			1862
--------------------------------------------------------------------------
Total															
3352

Here are the (non-inclusive) calls that take significant time within
the above calls (and together account for around 50% of the time)

--------------------------------------------------------------------------
PresShell::ContentInserted (frame construction)    	2			434  (12%)
nsDOMSelection::Collapse							5			395  
(11%)
nsEditor::GetPriorNode
		( ->IsEditable -> GetPrimaryFrameFor)		3			359  (10%)
nsGenericDOMDataNode::ReplaceData
		( -> GetPrimaryFrameFor)					1			373	 
(10%)
InsertTextTxn::Init	(a string dup)					1           203  (6%)

Comment 29

19 years ago
Here are those data non-mangled:

                                                 Num calls      Time (us)
--------------------------------------------------------------------------
nsHTMLEditor::InsertBreak                           1           1690
nsEditor::InsertTextImpl (enclosing call only)      1           1862
--------------------------------------------------------------------------
Total                                                           3352

Here are the (non-inclusive) calls that take significant time within
the above calls (and together account for around 50% of the time)

--------------------------------------------------------------------------
PresShell::ContentInserted (frame construction)     2           434  (12%)
nsDOMSelection::Collapse                            5           395  (11%)
nsEditor::GetPriorNode
        ( ->IsEditable -> GetPrimaryFrameFor)       3           359  (10%)
nsGenericDOMDataNode::ReplaceData
        ( -> GetPrimaryFrameFor)                    1           373  (10%)
InsertTextTxn::Init (a string dup)                  1           203  (6%)

Comment 30

19 years ago
removing PDT+ for reconsideration.  I recommend we go with M15 for this.  We've 
identified a lot of performance improvements, but there is no silver bullet.  We 
need a lot of small and medium sized changes.  I'd feel better about running with 
this code throughout M15 + M16 and having it be in beta2.
Whiteboard: [PDT+] w/b minus on 03/03

Comment 31

19 years ago
i'm expirementing with a rewrite of much of the insert text codepath, designed to 
avoid the pitfalls that we have discovered with the current code.  Right now my 
first quick+dirty cut at the rewrite takes around the same amount of time as 
"fast" version of the old code (ie, the old code with all the improvements we 
have discovered).  

I'm trying to write the new version in such a manner as to avoid triggering 
expensive GetPrimaryFrameFor() calls and also avoid any selection related calls 
until the paste is completely finished.

Comment 32

19 years ago
I know Joe is rewriting the editor code to not call collapse so much ... but I 
went ahead and modified nsDOMSelection so that it caches the iterators used in 
selectFrames(). This seems to shave about a second off the time it takes to 
paste the 500 and 2250 line case. I'll attatch the updated diffs for 
nsRangeList.cpp.

Comment 33

19 years ago
Created attachment 6072 [details] [diff] [review]
Updated nsRangeList.cpp diff with cached iterators.

Comment 34

19 years ago
PDT-
Whiteboard: [PDT-]

Comment 35

19 years ago
reassigning to self
Assignee: sfraser → jfrancis
Status: ASSIGNED → NEW

Comment 36

19 years ago
moving to M17
Target Milestone: M15 → M17

Comment 37

19 years ago
per my recent work and re-profiling, adding dependencies on 5 bugs.
Status: NEW → ASSIGNED
Depends on: 35292, 35293, 35294, 35295, 35296

Comment 38

19 years ago
moving to future
Keywords: beta1
Whiteboard: [PDT-]
Target Milestone: M17 → Future

Comment 39

19 years ago
moving this back to m19 since it blocks 29584 -- I didn't see the dependency 
earlier
Target Milestone: Future → M19

Comment 40

19 years ago
35292 has been futured
35293 is futured (just now)
35294 is dependent on 43863 (which is futured) and 35294 is futured
35295 is nsbeta3+
35296 is nsbeta3+

since over half of the dependency bugs are futured, moving this to future
Target Milestone: M19 → Future

Updated

19 years ago
Blocks: 53252

Comment 41

19 years ago
Removing dependencies on 35292,35293,35294,35295,35296 since they really aren't 
dependencies. Bug 53252 is now the tracking bug for all areas that need tuning.
No longer blocks: 53252
No longer depends on: 35292, 35293, 35294, 35295, 35296

Updated

19 years ago
Blocks: 53252

Updated

19 years ago
Severity: normal → major
Keywords: rtm
Priority: P3 → P1
Whiteboard: [rtm+]
Target Milestone: Future → M19

Comment 42

19 years ago
removed future now that a parent paste bug has been opened, paste performance
must be fixed before rtm

Updated

19 years ago
Whiteboard: [rtm+] → [rtm+][p:1]

Comment 43

19 years ago
per beppe removing status whiteboard
Keywords: rtm
Whiteboard: [rtm+][p:1]
Target Milestone: M19 → Future

Comment 44

18 years ago
Mozilla 0.9 this for joe. Joe, if you disagree, please retriage.
Target Milestone: Future → mozilla0.9

Updated

18 years ago
Keywords: mailtrack

Updated

18 years ago
Target Milestone: mozilla0.9 → mozilla0.9.1

Comment 45

18 years ago
the core problem here is 35294.  adding dependency. note that bug is blocked by 
43863.  i'm pushing this off to 0.9.1, not because it isn't critical, but because 
i don't anticipate movement on the dependencies.

Updated

18 years ago
Target Milestone: mozilla0.9.1 → mozilla0.9.2

Updated

18 years ago
Whiteboard: [perf]

Updated

18 years ago
Target Milestone: mozilla0.9.2 → mozilla1.0

Comment 46

18 years ago
although Beth has moved this out, I think we will make some progress on this in 
092.

Comment 47

18 years ago
updating summary, milestone, dependencies
Depends on: 35292, 35293, 35294, 35295, 35296
Summary: Performance improvements for nsTextEditRules::WillInsertText → [PERF] Tracking: Performance improvements plaintext paste and insert
Target Milestone: mozilla1.0 → mozilla0.9.7

Comment 48

18 years ago
awaiting 35294 work.  

Updated

17 years ago
Target Milestone: mozilla0.9.7 → mozilla0.9.9

Comment 49

17 years ago
the swami says: things that will not land in 099!
Target Milestone: mozilla0.9.9 → mozilla1.0
Moving to Mozilla1.1. Engineers are overloaded working on higher priority bugs.
Target Milestone: mozilla1.0 → mozilla1.1

Comment 51

17 years ago
a lot of work has been done here.  We need to start from scratch on perf 
investigation since many of the hotspots have received attention.  On fix that 
hasn't yet landed is at the front of the first attached patch: kin speeded things 
up by reordering the switch statement in the editor factory method.

Comment 52

17 years ago
The trunk is the wave of the future!
Target Milestone: mozilla1.1alpha → mozilla1.1beta

Comment 53

17 years ago
The days of having a half dozen milestones out in front of us to divide bugs 
between seem to be gone, though I dont know why.  Lumping everything together as 
far out as I can.  I'll pull back things that I am working on as I go.
Target Milestone: mozilla1.1beta → mozilla1.2beta

Comment 54

17 years ago
migrating milestones now that there are more available.  i'll pull these back as
I work on them
Target Milestone: mozilla1.2beta → mozilla1.4beta
QA Contact: sujay → editor
Assignee: mozeditor → nobody
Status: ASSIGNED → NEW
Target Milestone: mozilla1.4beta → ---
old bug.  down priority to P3.
Priority: P1 → P3
You need to log in before you can comment on or make changes to this bug.