too much string appending in CSS Parser

NEW
Unassigned

Status

()

Core
CSS Parsing and Computation
P5
normal
18 years ago
6 years ago

People

(Reporter: dbaron, Unassigned)

Tracking

({memory-footprint, perf})

Trunk
Future
memory-footprint, perf
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

18 years ago
I've heard comments that the CSS parser does too much string appending (and it 
sure looks this way :-/.   **If this is true**, I think there's a relatively 
easy solution.

The reason it does so much appending to strings is that the CSSParser builds up 
a string representing the selector text one token at a time.  It would be much 
simpler, I think, if the tokenizer were able to store the buffer that it was 
tokenizing, which could then be retrieved by the parser if it wanted to know the 
text that it parsed since a certain point.

What I'm not sure of is whether the parser would ever want the scanner to be 
remembering two different things at once.  If not, the scanner could have 
methods such as |StartMemory|, |GetMemory|, and |ClearMemory|.  The memory could 
initially be just a mark in the buffer (currently 256 chars), but when the 
buffer was refreshed it would have to be a string storing old buffers.

A solution like this would probably make it easier to implement some of the 
cssText properties (that currently look unimplemented in nsCSSStyleRule.cpp.
It would hopefully allow nsCSSToken::AppendToString to be removed.

Does this sound reasonable? (assuming this is a real performance problem)
(Reporter)

Updated

18 years ago
Keywords: perf

Comment 1

18 years ago
This sounds reasonable to me. Using a string class for building up tokens seems 
reasonable too, provided the string class is efficient at appending. Is the 
problem that our strings are lousy when it comes to appending? Could we just set 
the capacity of the strings in advance and thus avoid the inevitable reallocs 
that result from growing strings char by char? That seems like a simpler change, 
provided our strings still support it. 

Has anyone seen string appends in the CSSParser showing up in quantify?
(Reporter)

Comment 2

18 years ago
When scc had the string changes in that made the allocation different, it was a 
performance problem.  It also seems to be a good percentage of the work the CSS 
parser is doing.  Harish also mentioned to me earlier that he had seen 
performance problems in the CSS parser (at some other time).  I'll ask him for 
more detail...

Comment 3

18 years ago
If Harish did some performance analysis before the new string classes landed, it 
is probably obsolete by now. I think it's a better idea to do it all over again, 
compare with the numbers that Harish obtained previously, and if there is still a 
problem, gather some details and ask Scott Collins how he can help us to help 
ourselves. Maybe he will modify the string classes for the benefit of everybody, 
or just tell you how to use the existing classes more efficiently in the parser. 
Get him involved anyhow (and tell him I sent you :-).
(Reporter)

Comment 4

18 years ago
I also wonder if these strings are causing bloat.  I saw a good bit of bloat
caused by CSSOM structures in one of beard's recent GC-generated bloat logs.  I
have the feeling they are built for CSSOM manipulation of the cssText attribute
(which may only work for some things anyway -- I'm not sure).  Is there any need
for these strings at all?

Assigning this to Marc since I'm not going to have time to work on this.  I'm
not really sure if it's serious or not.  Frankly, the bloat issue may be a
bigger issue than the performance, but I'm just not sure...

For the record, I did email scc, but never heard back.
Assignee: dbaron → attinasi
Keywords: footprint

Comment 5

18 years ago
Moving to Future until told otherwise...

To move this in from the Future, nominate for beta3 and explain why it is 
important enought for beta3/RTM of Netscape6. The powers that be will approve or 
deny it.
Status: NEW → ASSIGNED
Target Milestone: --- → Future
(Reporter)

Comment 6

18 years ago
Perhaps the best solution here is to generate the CSSText properties dynamically 

and get rid of all this string stuff in the CSS parser.

Comment 7

18 years ago
Pulling in from Future: performance work is back on the radar...
Target Milestone: Future → mozilla0.9
(Reporter)

Comment 8

17 years ago
Hewitt recently checked in some changes to generate the cssText for selectors
dynamically (since it wasn't generated at all before).  I don't think there's
too much work left to generate it all dynamically and get rid of all the
source storage.

Comment 9

17 years ago
Moving out to Mozilla 1.0 milestone - small benefit in terms of performance and 
footprint.
Target Milestone: mozilla0.9 → mozilla1.0

Comment 10

17 years ago
Moving to correct QA Contact- here ya go Ian!
QA Contact: ckritzer → ian
code level bug -> dbaron
QA Contact: ian → dbaron
(Reporter)

Comment 12

17 years ago
Daniel Glazman fixed part of this problem as bug 70995.
Reassigning to Pierre. 

Note: Marc's comment "small benefit in terms of performance and footprint", so
I'm setting milestone to future.

Assignee: attinasi → pierre
Status: ASSIGNED → NEW
Target Milestone: mozilla1.0 → Future

Comment 14

17 years ago
First we must check whether there is still something to improve.  Boris, do you 
want to look into it?
Status: NEW → ASSIGNED
(Reporter)

Comment 15

16 years ago
Assigning pierre's remaining Style System-related bugs to myself.
Assignee: pierre → dbaron
Status: ASSIGNED → NEW
So what are the outstanding issues here?

The scanner is still doing some appending to tokens, but those are using
nsAutoString for the mIdent, so for the most part that should be fine.  The one
exception I can see is comments... The part of nsCSSScanner::Next that calls
SetCapacity(2) on the comment token's mIdent is basically a no-op (we should
remove that, methinks).

I have to wonder whether we could win anything by not even creating comment
tokens....  That would certainly save us having to append the license comments
in the chrome CSS (which are over 80 chars long, so probably cause lots of
reallocs).

Anything else that misuses strings that I'm missing?
(Reporter)

Comment 17

16 years ago
Created attachment 94836 [details]
jprof profile of CSS parsing on http://www.people.fas.harvard.edu/~dbaron/home

This profile still shows a good bit of time spent doing string appending.
(Reporter)

Updated

16 years ago
Status: NEW → ASSIGNED

Updated

15 years ago
Blocks: 106356

Comment 18

13 years ago
As said in comment 16, the string handling is now done using nsAutoString, and furthermore the comments are now skipped (SkipCComment), so they are no longer a performance issue. See also bug 60290, where the comment skipping was realised.

Proposing to mark this closed as WONTFIX (due to bug 60290).
Status: ASSIGNED → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → WONTFIX
Reopening.  Please do NOT wontfix bugs without module owner permission (in this case, see comment 17).  At the very least, this bug is still being used to track bugs on specific instances of appending that can be eliminated.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
(Reporter)

Updated

11 years ago
Assignee: dbaron → nobody
Status: REOPENED → NEW
QA Contact: dbaron → style-system
(Reporter)

Updated

6 years ago
Priority: P3 → P5
You need to log in before you can comment on or make changes to this bug.