Open Bug 399223 Opened 14 years ago Updated 11 years ago

make aToken a member of nsCSSScanner instead of passing as function argument

Categories

(Core :: CSS Parsing and Computation, defect)

defect
Not set
normal

Tracking

()

ASSIGNED

People

(Reporter: alfredkayser, Assigned: alfredkayser)

References

Details

(Keywords: memory-footprint, perf)

Attachments

(1 file, 1 obsolete file)

nsCSSScanner/Parser: instead of passing aToken/aErrorCode as argument make these full members of nsCSSCanner.

aErrorCode and aToken are passed hunderds of times as arguments to functions, while both could be just as easily real members of nsCSSScanner. This makes function calling faster, and should make the parser slightly faster overall.

Total codesize saving is almost 3K, but more importantly it makes the code more readable.
Comment on attachment 284202 [details] [diff] [review]
V1: Make mErrorCode and mToken real members

The patch looks huge, but should be actually 'easy' to review.
Attachment #284202 - Flags: review?(dbaron)
Summary: nsCSSScanner/Parser: instead of passing aToken/aErrorCode as argument make these full member of nsCSSCanner → make aToken/aErrorCode members of nsCSSScanner instead of function arguments
Assignee: nobody → alfredkayser
Status: NEW → ASSIGNED
Attachment #284202 - Flags: superreview?(bzbarsky)
I won't be able to look at this until May.
Version: unspecified → Trunk
Depends on: 452275
An equivalent to the aErrorCode part of this patch should be landed shortly from bug 452275.

I am not convinced the aToken change is appropriate.  It's already a member of the CSSParserImpl structure; moving it to the scanner would simplify a small number of functions in the scanner and complicate many more functions in the parser.  And there aren't that many callsites into the scanner; almost everywhere goes through CSSParserImpl::GetToken.
There are about a dozen functions that pass mToken (or mIdent) as parameter around, while it is also available in the structure. Some of these functions are used quite a lot (Next,NextUrl,ParseIdent,etc). And it does save a lot of code, and cpu cycles.

We could though to make the patch smaller though by splitting it into two steps:
address aErrorCode (through bug 452275) and then mToken.

By the way, thanks for taking up the patch on aErrorCode, which I submitted for review long time ago.
Comment on attachment 284202 [details] [diff] [review]
V1: Make mErrorCode and mToken real members

This is going to need merging to bug 452275.  :(
Attachment #284202 - Flags: superreview?(bzbarsky)
Attachment #284202 - Flags: review?(dbaron)
boris: I recommend you r- the existing patch and wait for me to have a good hard look at whether the rest of the changes are actually a good idea.

alfred: I doubt any code size or speed improvements from the aToken->mToken change are significant, so the only thing that matters to me is clarity, and as I said, I'm not convinced it's a win there.  I do intend to look at it though.
Within nsCSSScanner, aToken is passed around a bunch, but (unlike with aErrorCode) all the functions that receive it use it directly, instead of just passing it on down.  Therefore there will be no major clarity gain within nsCSSScanner from removing the argument.  I'd actually argue that having aToken be an out-parameter instead of a field in nsCSSScanner makes it more obvious where the result of tokenization ends up.

There are four places where CSSParserImpl calls nsCSSScanner and passes it the address of mToken: in the definitions of GetToken, GetURLToken, REPORT_UNEXPECTED_TOKEN, and REPORT_UNEXPECTED_TOKEN_P.  These are all called all over the place, but they do not themselves take mToken as an argument.  Therefore, the need to pass a token parameter to nsCSSScanner is invisible within nsCSSParser.cpp.

There are approximately 300 places within CSSParserImpl functions where the 'mToken' instance variable is referred to directly.  If mToken is moved to the scanner, all of these places must be rewritten to read mScanner.mToken instead.  This is a major clarity *loss*.

I'd bet the quoted 3% code size reduction in the original submission was almost entirely from the removal of aErrorCode, which was just being passed around through dozens of functions without being used.

So I am recommending we WONTFIX the remaining component of this bug.  Alfred, if you have a suggestion for how we could move mToken into the scanner *without* having to refer to it as mScanner.mToken in three hundred places in the parser, or performance / code size numbers that show a significant effect from your proposed change to mToken *alone* (i.e. if you redo the patch on top of the 452275 changes and still see an effect) I'm willing to change my mind.
(In reply to comment #8)
> Created an attachment (id=337962) [details]
> V2: Make mToken a parent of nsCSSScanner so that all can access it

I do not like this, it strikes me as *more* confusing to have things like "mScanner.IsSymbol('?')" than the previous iteration; but I would still like to see code size numbers.
Also, allow me to suggest an approach that wouldn't offend my sense of aesthetics as much: leave mToken as a member of CSSParserImpl, but have the nsCSSScanner instance hold a reference to it.  Like so:

  class nsCSSScanner {
     ...
     nsCSSToken& mToken;
     ...
     nsCSSScanner(); // do not implement
     nsCSSScanner(nsCSSToken& aToken) : mToken(aToken) { ... }
  }

  class CSSParserImpl : ... {
     nsCSSToken mToken;
     nsCSSScanner mScanner;
  }

  CSSParserImpl::CSSParserImpl()
    : mToken(), mScanner(mToken) ...

I still actually prefer having Next() and NextURL() take the current token as an out-parameter, because it makes the data flow more obvious, but I'd be willing to put up with the above if you can demonstrate significant speed and/or code size improvements from it.  If you can swing it, please post numbers for multiple architectures (i386/linux, i386/windows, i386/mac, x86-64, powerpc, and arm would all be interesting).
I will that approach. 

Note that the overall purpose of this is the overall cleanup of Parser/Scanner, so to allow several optmizations. However it was requested that I break this up into several smaller patches (aErrorCode, mToken, ParseNumber, NextUrl (which is really broken), removing the dynamic pushback buffer, etc). Each of these will have minimal impact (and minimal risk), but the total should be better.

Note 2: Either pass mToken as reference during creation of mScanner and use it throughout, or pass it as a reference with Next/NextURL. The latter could be nicer, as it would potentially allow to use several tokens in mScanner and would make it more clear that each Next/NextURL call gets a token (unless EOF or error).
(In reply to comment #11)
> Note that the overall purpose of this is the overall cleanup of Parser/Scanner,
> so to allow several optmizations. However it was requested that I break this up
> into several smaller patches (aErrorCode, mToken, ParseNumber, NextUrl (which
> is really broken), removing the dynamic pushback buffer, etc). Each of these
> will have minimal impact (and minimal risk), but the total should be better.

If you have a complete patch series to your goal state I would like to see it.  I am currently doing feature work on the parser and scanner and it would be helpful to know exactly what you have in mind.  (Already-submitted bug numbers are also good to know.)

> Note 2: Either pass mToken as reference during creation of mScanner and use it
> throughout, or pass it as a reference with Next/NextURL. The latter could be
> nicer

Isn't the latter just what we have now?
Indeed, but by storing a reference to aToken into mToken, it saves passing mToken to the other methods of nsCSSCanner.

Others bugs are: bug 337287, bug 399495, bug 399349.
Other things on my list:
Rewrite of ParseAndAppendEscape (incl. removal of EatNewline).
Removal of Dynamic PushbackBuffer.
Fix EatWhitespace to eat all whitespace (using IsWhitespace) and remove return value (is not used).
Reset the contents of mToken at start of Next/NextURL instead of everywhere.
Summary: make aToken/aErrorCode members of nsCSSScanner instead of function arguments → make aToken a member of nsCSSScanner instead of passing as function argument
Created bug 483971 for EatWhiteSpace and EatNewline.
You need to log in before you can comment on or make changes to this bug.