Closed Bug 314920 Opened 19 years ago Closed 19 years ago

nsCSSScanner/nsCSSParser: parse string directly instead of through a StringInputReader

Categories

(Core :: CSS Parsing and Computation, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: alfredkayser, Assigned: alfredkayser)

References

Details

Attachments

(2 files, 3 obsolete files)

nsCSSParser parses a lot of string through the use of StringInputReader.
But nsCSSScanner actually can easily be adjusted to directly parse from the stream. 

This saves a 'NS_NewStringUnicharInputStream(getter_AddRefs(input), &aBuffer, PR_FALSE);', as well as 'memcpy' from that aBuffer (via the stream) to the buffer of nsCSSScanner, while the nsCSSScanner could just as easily directoy parse from the aBuffer itself.
This patch modifies the nsCSSScanner to also accept a PRUnichar *buffer as input for the scanning, so that it can directly scan those characters, instead of memcpy'ing them from a StringInputStream wrapped around that buffer.
Attachment #201744 - Flags: review?(dbaron)
Attached patch Slightly smaller edition (obsolete) — Splinter Review
Attachment #201744 - Attachment is obsolete: true
Attachment #201745 - Flags: review?(dbaron)
Attachment #201744 - Flags: review?(dbaron)
Summary: nsCSSScanner/nsCCSParser: parse string directly instead of through a StringInputReader → nsCSSScanner/nsCSSParser: parse string directly instead of through a StringInputReader
Comment on attachment 201745 [details] [diff] [review]
Slightly smaller edition

>Index: nsCSSParser.cpp
>+  nsAString::const_iterator iter;
>+  aBuffer.BeginReading(iter);
>+  mScanner.Init(iter.get(), aBuffer.Length(), aSheetURI, aLineNumber);

This is against the rules for nsAString.  You should just use PromiseFlatString (although what you really need is just PromiseSingleFragmentString), since most of the time that shouldn't copy.  You may even be able to change the signatures on the methods so they use nsSubstring instead of nsAString (which is a good change if it's possible do make that change all the way back to the origins of the strings without touching frozen interfaces).

>Index: nsCSSScanner.cpp

>@@ -234,47 +229,80 @@ nsCSSScanner::~nsCSSScanner()
>   NS_IF_RELEASE(gStringBundle);
> #endif
> }
> 
> void nsCSSScanner::Init(nsIUnicharInputStream* aInput, nsIURI* aURI,
>                         PRUint32 aLineNumber)
> {
>   NS_PRECONDITION(aInput, "Null input stream pointer");
>-  NS_PRECONDITION(!mInput, "Should not have an existing input stream!");
>+  NS_PRECONDITION(!mInputStream, "Should not have an existing input stream!");

This precondition should check mReadPointer as well (which you should probably null in the constructor (in addition to in Close, where you already do) so that you can check it).



>+void nsCSSScanner::Init(const PRUnichar * aBuffer, PRInt32 aCount, 
>+                        nsIURI* aURI, PRUint32 aLineNumber)
>+{
>+  NS_PRECONDITION(aBuffer, "Null string");
>+  NS_PRECONDITION(!aBuffer, "Should not have an existing input string!");

You clearly didn't test this in a debug build.  The assertion about current state should probably be the same as the other Init.

>+  // Read directoy from the (String) buffer

// Read directly from the buffer

(spell "directly" right, and don't say "(String)")

r=dbaron with those changes; please ask bzbarsky to sr once you attach a revised patch.
Attachment #201745 - Flags: review?(dbaron) → review+
Attached patch V2: With comments from dbaron (obsolete) — Splinter Review
Replaced the aBuffer.BeginReading with PromiseFlatString.
(Note that nsUnicharInputStream also does a 'BeginReading(iter)'?)
Replacing the nsAString with nsSubstring in the API would impact the callers too much however. (May be a separate bug to change this?)

Precondition checks: made one mScanner::Init member to combine the logic of checking the parameters and initialisation status more clearly.

And fixed the spellings of the comments.
Attachment #201745 - Attachment is obsolete: true
Attachment #202348 - Flags: superreview?(bzbarsky)
Comment on attachment 202348 [details] [diff] [review]
V2: With comments from dbaron

>Index: nsCSSParser.cpp
> CSSParserImpl::InitScanner(const nsAString& aBuffer, nsIURI* aSheetURI,
>+  mScanner.Init(nsnull, PromiseFlatString(aBuffer).get(), 
>+                aBuffer.Length(), aSheetURI, aLineNumber);

That won't do the right thing if PromiseFlatString actually has to allocate a temporary... I think you want something mode like:

const nsAFlatString& flat = PromiseFlatString(aBuffer);
mScanner.Init(nsnull, flat.get(), flat.Length(), aSheetURI, aLineNumber)l

to make sure that the temporary sticks around long enough if it gets created.

>Index: nsCSSScanner.cpp
>+void nsCSSScanner::Init(nsIUnicharInputStream* aInput, 
>+                        const PRUnichar * aBuffer, PRInt32 aCount, 
>+                        nsIURI* aURI, PRUint32 aLineNumber)
>+{
>+  NS_PRECONDITION(!mInputStream, "Should not have an existing input stream!");
>+  NS_PRECONDITION(!mReadPointer, "Should not have an existing input buffer!");

Please also add:

NS_PRECONDITION(!aInput || !aBuffer, "Shouldn't have both input and buffer!");
NS_PRECONDITION(!aInput || aCount == 0, "Shouldn't have count with a stream");

sr=bzbarsky with those fixed.  Update the patch and I'll land it?  This should be a nice win on some of our DHTML stuff!
Attachment #202348 - Flags: superreview?(bzbarsky) → superreview+
Blocks: 229391
> const nsAFlatString& flat = PromiseFlatString(aBuffer);
> mScanner.Init(nsnull, flat.get(), flat.Length(), aSheetURI, aLineNumber);
Done.

> Please also add:
> NS_PRECONDITION(!aInput || !aBuffer, "Shouldn't have both input and buffer!");
> NS_PRECONDITION(!aInput || aCount == 0, "Shouldn't have count with a stream");

I placed these after the 'if (aInput)' statement as follows:
+  if (aInput) {
+    NS_PRECONDITION(!aBuffer, "Shouldn't have both input and buffer!");
+    NS_PRECONDITION(aCount == 0, "Shouldn't have count with a stream");
Attachment #202348 - Attachment is obsolete: true
Note, bug 311566 will give more speedup of CSS scanning/parsing.
Assignee: dbaron → alfredkayser
Fixed.  This seemed to help Tdhtml a bit (0.5%, maybe?).  Helps a bit more than that on some other testcases, though.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Hmm.  Is it me, or did this cause some CSS to not be parsed right?  For example, in a build with this patch:

data:text/html,<div style="width: 50%; border: 1px solid black">Text</div>

ends up with no border and a div that is auto width... (and the style declaration is empty).
But it doesn't seem to be consistent... :(
The data:text/html,<div style="width: 50%; border: 1px solid black">Text</div> thing works for me in my local SeaMonkey build with (only) this change applied.

Could it be that the flattened string still doesn't live long enough (as the nsAFlastString& flat reference only lives during the initialisation)?
In which case the reference needs to be a member of CSSParserImpl?
> Could it be that the flattened string still doesn't live long enough (as the
> nsAFlastString& flat reference only lives during the initialisation)?

Oh, duh.  Of course.  Yes.  That reference needs to be kept alive across the entire parsing process.

The simplest way to do that is probably to just make InitScanner take a const nsSubstring& and have the callers flatten the string if they need to.  I believe all caller scopes cover the entirety of the parsing process...

This is probably what causes bug 316040.
Depends on: 316040
Moved the nsaFlatString reference to callers of InitScanner to keep the reference (and its buffer) longer
Comment on attachment 202673 [details] [diff] [review]
Followon patch to keep reference to flattened string somewhat longer

r+sr=bzbarsky; I'll land this once the tree reopens...
Attachment #202673 - Flags: superreview+
Attachment #202673 - Flags: review+
Comment on attachment 202673 [details] [diff] [review]
Followon patch to keep reference to flattened string somewhat longer

Checked this in.
Depends on: 316088
Depends on: 316085
nsAFlatString is just a typedef for nsString, and at this point you should just use nsString and avoid typing nsAFlatString.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: