Closed Bug 541496 Opened 15 years ago Closed 13 years ago

Always provide strings rather than streams to the CSS parser

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla6

People

(Reporter: zwol, Assigned: zwol)

References

Details

Attachments

(10 files, 16 obsolete files)

proposed patch 15 years ago Zack Weinberg (:zwol) 72.91 KB, patch		Details \| Diff \| Splinter Review
patch v1a (rediff, delete a now-unused macro) 14 years ago Zack Weinberg (:zwol) 74.48 KB, patch		Details \| Diff \| Splinter Review
Tp4 stylesheet size distribution 14 years ago Zack Weinberg (:zwol) 61.17 KB, image/png		Details
v2: avoid a copy for inline style 14 years ago Zack Weinberg (:zwol) 88.49 KB, patch		Details \| Diff \| Splinter Review
v2a: avoid a copy for inline style 14 years ago Zack Weinberg (:zwol) 89.16 KB, patch	bzbarsky : review-	Details \| Diff \| Splinter Review
v3 part 1: assume infallible malloc 14 years ago Zack Weinberg (:zwol) 48.22 KB, patch		Details \| Diff \| Splinter Review
v3 part 2: don't use the low-level error to signal namespace lookup failure 14 years ago Zack Weinberg (:zwol) 11.21 KB, patch		Details \| Diff \| Splinter Review
v3 part 3: don't use streams 14 years ago Zack Weinberg (:zwol) 47.58 KB, patch		Details \| Diff \| Splinter Review
better handling of namespace lookup failure [checkin: comment 33] 14 years ago Zack Weinberg (:zwol) 8.81 KB, patch	zwol : review+	Details \| Diff \| Splinter Review
part 0: kill nsIUnicharStreamListener [checkin: comment 58] 13 years ago Zack Weinberg (:zwol) 20.77 KB, patch	hsivonen : review+	Details \| Diff \| Splinter Review
part 1 (INCOMPLETE): revamp nsIUnicharStreamLoader 13 years ago Zack Weinberg (:zwol) 19.22 KB, patch		Details \| Diff \| Splinter Review
part 0a: whitespace cleanup on style/Loader.cpp [checkin: comment 58] 13 years ago Zack Weinberg (:zwol) 31.19 KB, patch	dbaron : review+	Details \| Diff \| Splinter Review
part 1: revamp nsIUnicharStreamLoader and change css::Loader to use it 13 years ago Zack Weinberg (:zwol) 53.71 KB, patch		Details \| Diff \| Splinter Review
part 1 v2: revamp nsIUnicharStreamLoader and change css::Loader to use it 13 years ago Zack Weinberg (:zwol) 55.90 KB, patch		Details \| Diff \| Splinter Review
v4 part 1 13 years ago Zack Weinberg (:zwol) 6.78 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v4 part 2 13 years ago Zack Weinberg (:zwol) 8.36 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v4 part 3 13 years ago Zack Weinberg (:zwol) 9.48 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v4 part 4 13 years ago Zack Weinberg (:zwol) 25.89 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v4 part 5 13 years ago Zack Weinberg (:zwol) 9.06 KB, patch	bzbarsky : review-	Details \| Diff \| Splinter Review
v4 part 6 13 years ago Zack Weinberg (:zwol) 13.88 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v5 part 1 13 years ago Zack Weinberg (:zwol) 11.13 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v5 part 2 13 years ago Zack Weinberg (:zwol) 8.38 KB, patch	zwol : review+	Details \| Diff \| Splinter Review
v5 part 3 13 years ago Zack Weinberg (:zwol) 10.14 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v5 part 4 13 years ago Zack Weinberg (:zwol) 26.08 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v5 part 5 13 years ago Zack Weinberg (:zwol) 8.30 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
v5 part 6 13 years ago Zack Weinberg (:zwol) 9.10 KB, patch	zwol : review+	Details \| Diff \| Splinter Review

Zack Weinberg (:zwol)

Assignee

Description

•

15 years ago

The CSS parser currently can process either a string (nsAString, specifically) or a stream (nsIUnicharInputStream).  Almost all entry points pass a string.  It would simplify the parser slightly, and the scanner considerably, if we always passed a string; this would also facilitate lazy generation of CSS error messages.

Note that the CSSLoader already reads the entire stylesheet from the network before invoking the parser, so there is no loss of parallelism.

Zack Weinberg (:zwol)

Assignee

Comment 1

•

15 years ago

Attached patch proposed patch (obsolete) — Details — Splinter Review

Attachment #423093 - Flags: superreview?(dbaron)

Attachment #423093 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 2

•

15 years ago

If we do this, we can also remove the pushback buffer, right?  I assume that will happen in followups?

I assume the main drawback here is having to create that contiguous buffer for the string; something we avoided doing before?  Have we tried gathering data on how big that buffer tends to be in practice?

Zack Weinberg (:zwol)

Assignee

Comment 3

•

15 years ago

(In reply to comment #2)
> If we do this, we can also remove the pushback buffer, right?  I assume that
> will happen in followups?

I hadn't thought of that.  It would be nice, but the main read buffer is treated as constant, and I don't know if we always push back exactly the characters we just read.  I can probably ensure that the scanner does, but the parser uses Pushback() too... Definitely for a followup.

> I assume the main drawback here is having to create that contiguous buffer for
> the string; something we avoided doing before?

You know, I'm not sure we *were* avoiding that before, given that we always wait till OnStreamComplete to start parsing - isn't there a copy in the necko cache at least?

> Have we tried gathering data on how big that buffer tends to be in practice?

That would be an interesting question, yeah.  Is that a query you can run on the Google database, cevans?  (Size distribution of CSS files.)

Zack Weinberg (:zwol)

Assignee

Updated

•

15 years ago

Blocks: 516091

Boris Zbarsky [:bzbarsky]

Comment 4

•

15 years ago

> I don't know if we always push back exactly the characters we just read

Hmm.  In theory we might not, and more importantly what we read is not what's in the buffer (e.g. a lone '\r' in the buffer).  So yeah, something to look into later.

> You know, I'm not sure we *were* avoiding that before, given that we always
> wait till OnStreamComplete to start parsing - isn't there a copy in the necko
> cache at least?

There isn't.  nsUnicharStreamLoader dumps the incoming data into effectively a list of 4096 byte buffers (an nsIPipe, to be exact).  Then the converter input stream converts data into its internal buffer as it goes; every time we ask it to fill it just fills that internal buffer.

Zack Weinberg (:zwol)

Assignee

Comment 5

•

14 years ago

Attached patch patch v1a (rediff, delete a now-unused macro) (obsolete) — Details — Splinter Review

CSS_BUFFER_SIZE was used only by code deleted by this patch, so it should go too.

Attachment #423462 - Flags: superreview?(dbaron)

Attachment #423462 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 6

•

14 years ago

(In reply to comment #4)
> nsUnicharStreamLoader dumps the incoming data into effectively a
> list of 4096 byte buffers (an nsIPipe, to be exact).  Then the converter input
> stream converts data into its internal buffer as it goes; every time we ask it
> to fill it just fills that internal buffer.

Okay, so we had the whole thing in memory but not as one contiguous block and not pre-converted, I get it now.

My *real* goal in wanting the entire sheet in one big buffer is to enable lazy calculation of line numbers for error messages; so I hope this turns out to be an acceptable tradeoff.

Boris Zbarsky [:bzbarsky]

Comment 7

•

14 years ago

I'd still like to see some numbers for the buffer sizes involved, if just from our own Tp run (on try).

Zack Weinberg (:zwol)

Assignee

Comment 8

•

14 years ago

I have just pushed my queue to try with a patch on top to report buffer sizes, and will analyze the results tomorrow morning.  Here are the sorted results for browser startup + Gmail + LiveJournal.

CSS::ParseSheet: 78 bytes
CSS::ParseSheet: 102 bytes
CSS::ParseSheet: 184 bytes
CSS::ParseSheet: 258 bytes
CSS::ParseSheet: 318 bytes
CSS::ParseSheet: 346 bytes
CSS::ParseSheet: 490 bytes
CSS::ParseSheet: 1226 bytes
CSS::ParseSheet: 1336 bytes
CSS::ParseSheet: 1340 bytes
CSS::ParseSheet: 1522 bytes
CSS::ParseSheet: 1924 bytes
CSS::ParseSheet: 2212 bytes
CSS::ParseSheet: 2552 bytes
CSS::ParseSheet: 2962 bytes
CSS::ParseSheet: 3624 bytes
CSS::ParseSheet: 4636 bytes
CSS::ParseSheet: 4908 bytes
CSS::ParseSheet: 5004 bytes
CSS::ParseSheet: 5086 bytes
CSS::ParseSheet: 5258 bytes
CSS::ParseSheet: 5350 bytes
CSS::ParseSheet: 5706 bytes
CSS::ParseSheet: 5770 bytes
CSS::ParseSheet: 6120 bytes
CSS::ParseSheet: 7982 bytes
CSS::ParseSheet: 8010 bytes
CSS::ParseSheet: 9006 bytes
CSS::ParseSheet: 10216 bytes
CSS::ParseSheet: 10368 bytes
CSS::ParseSheet: 10572 bytes
CSS::ParseSheet: 10654 bytes
CSS::ParseSheet: 10836 bytes
CSS::ParseSheet: 12096 bytes
CSS::ParseSheet: 12148 bytes
CSS::ParseSheet: 12884 bytes
CSS::ParseSheet: 14988 bytes
CSS::ParseSheet: 16516 bytes
CSS::ParseSheet: 21254 bytes
CSS::ParseSheet: 22712 bytes
CSS::ParseSheet: 24270 bytes
CSS::ParseSheet: 30686 bytes
CSS::ParseSheet: 32082 bytes
CSS::ParseSheet: 46172 bytes
CSS::ParseSheet: 50536 bytes
CSS::ParseSheet: 69298 bytes
CSS::ParseSheet: 106626 bytes
CSS::ParseSheet: 106700 bytes
CSS::ParseSheet: 250828 bytes
CSS::ParseSheet: 250828 bytes

Zack Weinberg (:zwol)

Assignee

Comment 9

•

14 years ago

It looks like the Opera folks did just the analysis we want here: http://dev.opera.com/articles/view/mama-css-quantities-and-sizes/ Their presentation isn't very readable, though - I'll try to make some graphs.

Zack Weinberg (:zwol)

Assignee

Comment 10

•

14 years ago

I'm not having any luck getting meaningful graphs out of the MAMA summary tables -- they've been too heavily binned.  I did find this graph from another, smaller analysis:

http://triin.net/archive/kool/webstat/figure-20.png
(from http://triin.net/2006/06/12/CSS )

which is consistent with MAMA's assertion that the average <link>ed stylesheet is about 8000 bytes long and the average inline (<style> or aggregate style="") stylesheet is about 1000 bytes long.  More importantly, both data sets agree that the size falls off very fast above the average, although there are some rare huge outliers of all three types (machine-generated HTML with over a megabyte of inline <style> content, for instance)

Zack Weinberg (:zwol)

Assignee

Comment 11

•

14 years ago

Attached image Tp4 stylesheet size distribution — Details

Here's a size distribution profile (in bytes of UTF-16) for Tp4's style sheets.  For some reason the PNG came out ridiculously huge, you will want to view it at 50% or so.

It's a little hard to tell because I couldn't persuade my plotting program to give me sensible X axis labels, but the majority of sheets are quite small.  Here's quantiles:

  25%   50%   75%   90% 
  720  3414 14336 43850 

There are, however, outliers to 120,000 bytes.

Attachment #423093 - Attachment is obsolete: true

Attachment #423093 - Flags: superreview?(dbaron)

Attachment #423093 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 12

•

14 years ago

> machine-generated HTML with over a megabyte of inline <style> content, for
> instance

Wait.  For that case we actually have an nsAString that we wrap in a stream.  See http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsStyleLinkElement.cpp#296

How about we just change LoadInlineStyle to pass an nsAString all the way through instead of messing with streams; your code looks like it'll just make another copy of the string instead of using it directly.

I bet some of those big sheets you saw before are also inline ones (certainly the gmail one is).

Zack Weinberg (:zwol)

Assignee

Comment 13

•

14 years ago

Attached patch v2: avoid a copy for inline style (obsolete) — Details — Splinter Review

(In reply to comment #12)
> 
> How about we just change LoadInlineStyle to pass an nsAString all the way
> through instead of messing with streams; your code looks like it'll just make
> another copy of the string instead of using it directly.
> 
> I bet some of those big sheets you saw before are also inline ones (certainly
> the gmail one is).

That turns out to be quite easy, although I have to bump the nsICSSLoader IID.  We are still doing one copy in nsContentUtils::GetNodeTextContent, I think, but I don't think it's practical to avoid that one even when the style element has just one text node child (which should be the norm) because of the way nsTextFragment works.  Revised patch attached.  There is some more piggyback nsSubstring->nsAString conversion and whitespace cleanup.

Attachment #423612 - Flags: superreview?(dbaron)

Attachment #423612 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Updated

•

14 years ago

Attachment #423462 - Attachment is obsolete: true

Attachment #423462 - Flags: superreview?(dbaron)

Attachment #423462 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 14

•

14 years ago

Attached patch v2a: avoid a copy for inline style (obsolete) — Details — Splinter Review

Of course I remember to prune the header file list in nsStyleLinkingElement.cpp only after posting the patch.

Attachment #423612 - Attachment is obsolete: true

Attachment #423614 - Flags: superreview?(dbaron)

Attachment #423614 - Flags: review?(bzbarsky)

Attachment #423612 - Flags: superreview?(dbaron)

Attachment #423612 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Updated

•

14 years ago

Blocks: 543151

Boris Zbarsky [:bzbarsky]

Comment 15

•

14 years ago

Comment on attachment 423614 [details] [diff] [review]
v2a: avoid a copy for inline style

>diff --git a/layout/style/nsCSSLoader.cpp b/layout/style/nsCSSLoader.cpp
>   SheetLoadData(CSSLoaderImpl* aLoader,
>-		const nsSubstring& aTitle,
>+		const nsAString& aTitle,

It looks like there are tabs around here for some reason (both before and after
your change).  Might be worth it to clean those out (as a followup).

>   NS_IMETHOD LoadInlineStyle(nsIContent* aElement,
>-			     nsIUnicharInputStream* aStream,
>+           const nsAString& aBuffer,

This, on the otherhand, is ending up mis-indented as a result of going from tabs to spaces.

>+++ b/layout/style/nsCSSParser.cpp
> CSSParserImpl::ParseColorString(const nsAString& aBuffer,

The behavior of this function is possibly changing here (e.g. it's changing to return NS_ERROR_DOM_SYNTAX_ERR if !NS_ColorNameToRGB() instead of returning NS_OK and not setting *aColor).  Then again, can that call return false here?

I'm not sure what all the mLowLevelError changes have to do with this bug (at first glance they're unrelated).  And they look wrong because parsers get recycled.  And I don't see us resetting mLowLevelError to a success code anywhere; nsCSSSCanner::Init used to do it for the scanner.

>+++ b/xpcom/io/nsStreamUtils.cpp
>+NS_ConsumeStream(nsIUnicharInputStream *stream, PRUint32 maxCount,
>+    PRUint32 chunk = 4096;
>+
>+    while (maxCount) {
>+	if (chunk < PR_UINT32_MAX/2)
>+	    chunk *= 2;

This doesn't look correct to me.

In particular, Read() makes no guarantees about actually reading as many bytes as you ask for.  So say maxCount is 64000 and |stream| hands out data in 4KB chunks (which is exactly what will happen with the converter stream, I think).  Then we'll double |chunk| 16 times while reading the stream.  Since we start at 2^12 bytes for |chunk|, it will end up with string lengths of 2^28 bytes, or 250MB or so, to read that 64KB stream.

What really needs to be doubled here, presumably, is the string capacity when we need to resize, right?  But string capacity growth already uses a doubling algorithm; see nsTSubstring_CharT::MutatePrep.  So all we really need to do is start the string off with a capacity of 4096 and then make sure to SetLength() before our reads; the string code will handle the doubling as needed.

I can see doubling chunk sizes until we get to the point that |n| ends up less than |toread| if desired, but I suspect 4KB chunks are as big as we'll typically be able to get our of Read.

>+extern NS_COM nsresult
>+NS_ConsumeStream(nsIUnicharInputStream *aSource, PRUint32 aMaxCount,
>+		 nsAString &aBuffer);

Some tabs snuck in here.

Attachment #423614 - Flags: review?(bzbarsky) → review-

Zack Weinberg (:zwol)

Assignee

Updated

•

14 years ago

Attachment #423614 - Flags: superreview?(dbaron)

Zack Weinberg (:zwol)

Assignee

Comment 16

•

14 years ago

(In reply to comment #15)
> It looks like there are tabs around here for some reason (both before and
> after your change).  Might be worth it to clean those out (as a followup).

Noted.

> This, on the otherhand, is ending up mis-indented as a result of going from
> tabs to spaces.

Ditto.

> >+++ b/layout/style/nsCSSParser.cpp
> > CSSParserImpl::ParseColorString(const nsAString& aBuffer,
> 
> The behavior of this function is possibly changing here (e.g. it's changing to
> return NS_ERROR_DOM_SYNTAX_ERR if !NS_ColorNameToRGB() instead of returning
> NS_OK and not setting *aColor).  Then again, can that call return false here?

I'll look into it.

> I'm not sure what all the mLowLevelError changes have to do with this bug (at
> first glance they're unrelated).  And they look wrong because parsers get
> recycled.  And I don't see us resetting mLowLevelError to a success code
> anywhere; nsCSSSCanner::Init used to do it for the scanner.

The scanner only ever set the low-level error when a stream read failed, so after this change, it is more properly a parser flag.  And in the original
patch sequencing, parsers didn't get recycled anymore at this point.  This is
what I get for shuffling the queue around so heavily.

In the *current* revision (only on my laptop at this point) this patch is the last of a set of three that, collectively, eliminate the low-level error flag altogether.

> >+++ b/xpcom/io/nsStreamUtils.cpp
> >+NS_ConsumeStream(nsIUnicharInputStream *stream, PRUint32 maxCount,
> >+    PRUint32 chunk = 4096;
> >+
> >+    while (maxCount) {
> >+	if (chunk < PR_UINT32_MAX/2)
> >+	    chunk *= 2;
> 
> This doesn't look correct to me.
> 
> In particular, Read() makes no guarantees about actually reading as many bytes
> as you ask for.  So say maxCount is 64000 and |stream| hands out data in 4KB
> chunks (which is exactly what will happen with the converter stream, I think). 
> Then we'll double |chunk| 16 times while reading the stream.  Since we start at
> 2^12 bytes for |chunk|, it will end up with string lengths of 2^28 bytes, or
> 250MB or so, to read that 64KB stream.

I forget who it was, but someone else already tripped over this problem.  My current code looks like this:

+NS_ConsumeStream(nsIUnicharInputStream *stream, PRUint32 maxCount,
+                nsAString &result)
+{
+    nsresult rv = NS_OK;
+    result.Truncate();
+
+    // nsIUnicharInputStream does not provide Available(), so allocate
+    // exponentially larger chunks, starting with 8192 bytes.  Note
+    // that some implementations will only produce 4096 bytes per call
+    // to Read().
+    PRUint32 chunk = 4096;
+
+    while (maxCount) {
+       if (chunk < PR_UINT32_MAX/2)
+           chunk *= 2;
+
+        // resize result buffer
+        PRUint32 toread = PR_MIN(chunk, maxCount);
+        PRUint32 length = result.Length();
+        result.SetLength(length + toread);
+        if (result.Length() != (length + toread))
+            return NS_ERROR_OUT_OF_MEMORY;
+
+       // repeatedly read until we run out of room
+       while (length < result.Length()) {
+           PRUnichar *buf = result.BeginWriting() + length;
+           PRUint32 n;
+
+           rv = stream->Read(buf, result.Length() - length, &n);
+           if (NS_FAILED(rv) || n == 0)
+               return rv;
+
+           length += n;
+           maxCount -= n;
+       }
+    }
+
+    return rv;
+}

Maybe I can get rid of the inner loop, based on what you say here...

> What really needs to be doubled here, presumably, is the string capacity when
> we need to resize, right?  But string capacity growth already uses a doubling
> algorithm; see nsTSubstring_CharT::MutatePrep.  So all we really need to do is
> start the string off with a capacity of 4096 and then make sure to SetLength()
> before our reads; the string code will handle the doubling as needed.

... the idea was to make sure we were reading as much as we possibly could at each go, but we could just as well take advantage of knowing that nsConverterInputStream works in 4096-byte blocks, and that would be simpler code-wise.  What do you think?

zw

Boris Zbarsky [:bzbarsky]

Comment 17

•

14 years ago

I think we should assume 4096-byte blocks for now, and as a followup make the unichar input stream hand out a length estimate (which is likely to be bigger than the actual final length, but is the best it can do, in general).

Zack Weinberg (:zwol)

Assignee

Comment 18

•

14 years ago

Attached patch v3 part 1: assume infallible malloc (obsolete) — Details — Splinter Review

I've revised the "no streams" patch with two new lead-in patches.  This is the first.  Now that infallible operator new seems to have stuck (finally!) we can throw out all of the out-of-memory checks in the CSS parser.  I was considering sending this in ahead of that just because the out-of-memory checks were incomplete and not handled well by higher levels of the parser.

Attachment #423614 - Attachment is obsolete: true

Attachment #430431 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 19

•

14 years ago

Attached patch v3 part 2: don't use the low-level error to signal namespace lookup failure (obsolete) — Details — Splinter Review

GetNamespaceIDForPrefix was abusing the "low-level error" flag to communicate a sticky error indicator to ParseSelectorString.  This patch makes it use its own bit for that instead.

Zack Weinberg (:zwol)

Assignee

Updated

•

14 years ago

Attachment #430432 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 20

•

14 years ago

Attached patch v3 part 3: don't use streams (obsolete) — Details — Splinter Review

And here's the piece that corresponds to the original patch.  I fixed NS_ConsumeStream as discussed above, I fixed the indentation (I think) and I threw out the logic changes in ParseColorString for now.

The point of adding part 1 and 2 ahead of this change is simply that this patch can now eliminate the low-level error flag altogether.  Many of the parser entry points are now infallible (callers mostly ignore the potential error return already) but I'm leaving the external API as is for now.

Attachment #430435 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 21

•

14 years ago

Comment on attachment 430431 [details] [diff] [review]
v3 part 1: assume infallible malloc

>+++ b/layout/style/nsCSSParser.cpp
>+    NS_NewCSSStyleRule(&rule, nsnull, declaration);
>     *aResult = rule;

Why do you need |rule|?  Why not just NS_NewCSSStyleRule directly into aResult?

>@@ -1717,21 +1708,17 @@ CSSParserImpl::GatherMedia(nsMediaList* 
>-      nsresult rv = aMedia->AppendQuery(query);

AppendQuery doesn't allocate via operator new, so is still fallible.

>@@ -1749,21 +1736,16 @@ CSSParserImpl::ParseMediaQueryExpression
>   nsMediaExpression *expr = aQuery->NewExpression();

NewExpression is still fallible.

>@@ -1822,21 +1804,16 @@ CSSParserImpl::ParseMediaQueryExpression
>         nsRefPtr<nsCSSValue::Array> a = nsCSSValue::Array::Create(2);

I'm hoping explicit calls to |::operator new| (which is what this is) are infallible?  Please check with cjones.

>@@ -1955,20 +1925,17 @@ CSSParserImpl::ParseGroupRule(nsICSSGrou
>-  if (!PushGroup(aRule)) {

This happens to be ok, since it uses nsCOMArray, unlike the earlier callsites that use nsTArray...  I really dislike that fragility.  I would prefer to assume that array grow operations are fallible until all our allocation is infallible.  That's assuming that array grow operations really become infallible.

>@@ -2272,20 +2221,16 @@ CSSParserImpl::SkipUntilStack(nsAutoTArr
>-      // Just handle out-of-memory by parsing incorrectly.  It's
>-      // highly unlikely we're dealing with a legitimate style sheet
>-      // anyway.

aStack.AppendElement is still fallible.

>@@ -3077,20 +3005,16 @@ CSSParserImpl::ParsePseudoSelector(PRInt
>   nsCOMPtr<nsIAtom> pseudo = do_GetAtom(buffer);

I'm not sure I'm happy depending on the exact allocator atom guts use...

>@@ -4881,20 +4784,16 @@ CSSParserImpl::ParseImageRect(nsCSSValue
>   nsCSSValueGradientStop* stop = aGradient->mStops.AppendElement();
>-  if (!stop) {

AppendElement is fallible.

>@@ -7575,21 +7404,18 @@ CSSParserImpl::ParseFunctionInternals(co
>-    if (!aOutput.AppendElement(newValue)) {

That's still fallible.

r- for the array issues above.

Attachment #430431 - Flags: review?(bzbarsky) → review-

Boris Zbarsky [:bzbarsky]

Comment 22

•

14 years ago

Comment on attachment 430432 [details] [diff] [review]
v3 part 2: don't use the low-level error to signal namespace lookup failure

>+++ b/layout/style/nsCSSParser.cpp
CSSParserImpl::ParseSelectorString(const
>+  if (prefixErr)
>+    return NS_ERROR_DOM_NAMESPACE_ERR;
>+  else

No need for else after return.

Attachment #430432 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 23

•

14 years ago

Comment on attachment 430431 [details] [diff] [review]
v3 part 1: assume infallible malloc

And fr this part 1 patch, it might be good if dbaron OKed it.

Boris Zbarsky [:bzbarsky]

Comment 24

•

14 years ago

Is it possible to do an interdiff from the last thing I reviewed to part 3?

Zack Weinberg (:zwol)

Assignee

Comment 25

•

14 years ago

Comment on attachment 430431 [details] [diff] [review]
v3 part 1: assume infallible malloc

I'm gonna push back really **** the r-.  I claim that there is no point in doing this patch unless we can remove *all* of the null-checks after memory allocation (since the ultimate point here is to remove mLowLevelError and change most of the parser entry points to return void).  I also claim that the existing null-checks do more harm than good, because they conflate a sticky error state (out of memory) with a recoverable error state (syntax error) -- higher layers only see the PR_FALSE return from some parser function, assume a syntax error, and try to recover from that, when we should be bailing out completely.  But to do that reliably we'd have to check the low-level error in every parse-failure path, which is a silly thing to implement when we're trying to get rid of oom checks anyway.

Attachment #430431 - Flags: superreview?(dbaron)

Attachment #430431 - Flags: review?(bzbarsky)

Attachment #430431 - Flags: review-

Boris Zbarsky [:bzbarsky]

Comment 26

•

14 years ago

I understand why we want the patch.  I just think that currently its behavior will be that on OOM we try to scribble somewhere near address 0 (at least; I haven't carefully audited the codepaths that use nsTArray appends here) instead of just dying in a guaranteed safe way.  If we could guarantee that there are no null returns from those nsTArrays, then I would be fine with the change.  If we decide that such attempts to scribble near 0 (and crash as needed) are ok, then I might be fine with the change too.

Chris, what's the long-term plan for nsTArray?  Does it plan to become infallible?  Will it offer fallible and infallible grow methods?  Something else?

Zack Weinberg (:zwol)

Assignee

Comment 27

•

14 years ago

IIRC (from the frame poisoning stuff), all current OSes start up with a fairly large region of address space reserved near 0 (at least a megabyte), but will allow the application to trim that down to one page (4096 bytes) with directed-address allocations.  I didn't test whether that area would get used for undirected allocations.

It seems to me that most uses of nsTArray want infallible-resize semantics.

Boris Zbarsky [:bzbarsky]

Comment 28

•

14 years ago

This is reserved in the "can't read" sense, not just "can't write", yes?

Zack Weinberg (:zwol)

Assignee

Comment 29

•

14 years ago

(In reply to comment #28)
> This is reserved in the "can't read" sense, not just "can't write", yes?

Yup.

(In reply to comment #24)
> Is it possible to do an interdiff from the last thing I reviewed to part 3?

I'm not sure I understand you here.  You want to see part 3 relative to part 2 without part 1?  That would take some fixups to compile...

Boris Zbarsky [:bzbarsky]

Comment 30

•

14 years ago

> Yup.

Hmm.  In that case I'm probably ok with assuming nsTArray infallible here as long as we don't write to (or read from) memory pointed to by pointer members of the things we'd be allocating that way.  Or as long as we can guarantee that nsTArray infallible will land before this ever ships (including in an alpha).

> You want to see part 3 relative to part 2 without part 1?

No, I want to see how part 3 differs from "v2a", and in particular which parts I need to focus on reviewing there and which I have already reviewed...

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 31

•

14 years ago

But if an array grow fails in the current code, and you assume it succeeds, will you end up reading/writing past the end of the array?

Zack Weinberg (:zwol)

Assignee

Comment 32

•

14 years ago

(In reply to comment #30)
> > You want to see part 3 relative to part 2 without part 1?
> 
> No, I want to see how part 3 differs from "v2a", and in particular which parts
> I need to focus on reviewing there and which I have already reviewed...

interdiff v2a part3 comes out completely wrong - it shows a bunch of things that did *not* change and misses several things that did.  I don't know why.

But maybe we should just table part 3 until we decide what to do about part 1, because if I need to back off on part 1, part 3 has to change considerably.

(In reply to comment #31)
> But if an array grow fails in the current code, and you assume it succeeds,
> will you end up reading/writing past the end of the array?

It looks like we will, since nsTArray doesn't invalidate the array if an EnsureCapacity() call fails.  Bother.

Zack Weinberg (:zwol)

Assignee

Comment 33

•

14 years ago

Attached patch better handling of namespace lookup failure [checkin: comment 33] — Details — Splinter Review

I pushed "v3 part 2" by itself, with the requested change from comment 22, and taking out the nsCSSScanner pieces (which would have broken compilation in the absence of part 1).

http://hg.mozilla.org/mozilla-central/rev/5feacacc0d09

Attachment #430432 - Attachment is obsolete: true

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 34

•

14 years ago

(In reply to comment #26)
> Chris, what's the long-term plan for nsTArray?  Does it plan to become
> infallible?  Will it offer fallible and infallible grow methods?  Something
> else?

Good question!  I don't believe that's been discussed yet.  The STL-ification project wants mozilla::Vector with an infallible STL-like API; maybe a solution is to fork nsTArray into infallible mozilla::Vector and fallible mozilla::Fallible(?)Vector.

Zack Weinberg (:zwol)

Assignee

Updated

•

14 years ago

Attachment #430431 - Flags: superreview?(dbaron)

Attachment #430431 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 35

•

14 years ago

Comment on attachment 430435 [details] [diff] [review]
v3 part 3: don't use streams

clearing review requests on patches I know will have to be revised before landing.

Attachment #430435 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #430432 - Flags: review+

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #430667 - Flags: review+

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #430431 - Attachment is obsolete: true

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #430435 - Attachment is obsolete: true

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #430667 - Attachment description: v3 part 2 as landed [checkin: comment 33] → better handling of namespace lookup failure [checkin: comment 33]

Zack Weinberg (:zwol)

Assignee

Comment 36

•

13 years ago

Attached patch part 0: kill nsIUnicharStreamListener [checkin: comment 58] — Details — Splinter Review

Have finally come back around to this bug, & while mucking with ns(I)UnicharStreamLoader, I noticed that nsIUnicharStream*Listener* is only used to define a set of random callbacks from the old HTML parser - not the new one - and there is no user of these callbacks anywhere in the tree.

So I kill it.

Attachment #530171 - Flags: review?(hsivonen)

Zack Weinberg (:zwol)

Assignee

Comment 37

•

13 years ago

Attached patch part 1 (INCOMPLETE): revamp nsIUnicharStreamLoader (obsolete) — Details — Splinter Review

This is as far as I've gotten on the actual project here.  I rework nsIUnicharStreamLoader so that it produces the entire contents of the decoded stream as a nsString, rather than soaking up the data in a nsPipe which is then fed to an nsConverterInputStream.

Obviously this breaks the build, since it doesn't update css::Loader (the sole user of nsIUnicharStreamLoader, except for a test case).  However, before I do any more work I'd like to know if this is barking up the right tree and whether I have understood how to use nsIUnicodeDecoder correctly - the conversion loop (in WriteSegmentFun) was mostly copied from nsConverterInputStream and I'm not sure I got it right.

Attachment #530173 - Flags: feedback?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 38

•

13 years ago

In general this seems like the right approach, and what we discussed.  Was there a specific concern about a particular tree up which you feel you're barking?  ;)

I haven't reviewed the details of the OnStartRequest/OnStopRequest/OnDataAvailable changes but two comments:

1)  You don't have to guesstimate a 2x expansion (esp. because in the common
    cases it's 1x).  Use GetMaxLength() on the decoder instead.
2)  You should probably check that SetCapacity succeeded.

Other than those, this looks about right at first glance.

Boris Zbarsky [:bzbarsky]

Comment 39

•

13 years ago

Henri, do you have time to look at the unicode conversion code here?  You've dealt with unicode decoders most recently, I think.

Boris Zbarsky [:bzbarsky]

Updated

•

13 years ago

Attachment #530173 - Flags: feedback?(bzbarsky) → feedback+

Zack Weinberg (:zwol)

Assignee

Comment 40

•

13 years ago

(In reply to comment #38)
> In general this seems like the right approach, and what we discussed.  Was
> there a specific concern about a particular tree up which you feel you're
> barking?  ;)

I don't fully understand OnDataAvailable and ReadSegments; it's possible that I did something wrong there, as well as in the use of the UnicodeDecoder.  Especially, I'm not sure the mixture of Read and ReadSegments in OnDataAvailable is safe, but there could be other problems.

Boris Zbarsky [:bzbarsky]

Comment 41

•

13 years ago

Ah.  Mixing Read and ReadSegments should be ok.

Boris Zbarsky [:bzbarsky]

Comment 42

•

13 years ago

And in particular, nothing jumped out at me as obviously wrong.  I didn't check all the arithmetic details, of course....

Zack Weinberg (:zwol)

Assignee

Comment 43

•

13 years ago

(In reply to comment #38)

> 1)  You don't have to guesstimate a 2x expansion (esp. because in the common
>     cases it's 1x).  Use GetMaxLength() on the decoder instead.
> 2)  You should probably check that SetCapacity succeeded.

The common case (for the sole user, CSS) is ASCII becoming UTF-16, which is a 2x expansion.  But I will make these changes.

Boris Zbarsky [:bzbarsky]

Comment 44

•

13 years ago

It's a 2x expansion in number of bytes.  But the srcLen is measured in bytes while the dstLen is measured in PRUnichars.  And in the common case there are just as many PRUnichars as bytes (or fewer, for UTF-8).

Zack Weinberg (:zwol)

Assignee

Comment 45

•

13 years ago

Of course.  How silly of me.

Henri Sivonen (:hsivonen)

Comment 46

•

13 years ago

Comment on attachment 530171 [details] [diff] [review]
part 0: kill nsIUnicharStreamListener [checkin: comment 58]

Let's get rid this it seems that Google Desktop is now abandoned anyway, so this API isn't supported in the new parser anyway.

Attachment #530171 - Flags: review?(hsivonen) → review+

Henri Sivonen (:hsivonen)

Comment 47

•

13 years ago

(In reply to comment #39)
> Henri, do you have time to look at the unicode conversion code here?  You've
> dealt with unicode decoders most recently, I think.

I tried to look at the code carefully, and it seems to me it ends up doing the same thing as the code I assume to be correct:
http://mxr-test.konigsberg.mozilla.org/mozilla-central/source/parser/html/nsHtml5StreamParser.cpp#577

Aside: I think some day, we should change the decoder interface to suck less by having an interface more like http://download.oracle.com/javase/1.4.2/docs/api/java/nio/charset/CharsetDecoder.html

Zack Weinberg (:zwol)

Assignee

Comment 48

•

13 years ago

Attached patch part 0a: whitespace cleanup on style/Loader.cpp [checkin: comment 58] — Details — Splinter Review

layout/style/Loader.cpp has a *lot* of trailing whitespace in it.  As the next patch makes extensive changes to this file, I thought it might be a good time to get rid of all of that, but obviously this needs to be done separately from changes that need more of a review. :)

Attachment #530173 - Attachment is obsolete: true

Attachment #530385 - Flags: review?(dbaron)

Zack Weinberg (:zwol)

Assignee

Comment 49

•

13 years ago

Attached patch part 1: revamp nsIUnicharStreamLoader and change css::Loader to use it (obsolete) — Details — Splinter Review

This is the meat of the change.  I am now fairly confident that the changes in netwerk/ are correct, but it probably deserves a thorough going-over anyway.  The changes to css::Loader are *not* correct in at least one regard: the revised handling of synchronous sheet loads triggers a bunch of assertions...

###!!! ASSERTION: Should have an inner whose principal has not yet been set: '!mInner->mPrincipalSet', file /home/zack/src/mozilla/S-mc/layout/style/nsCSSStyleSheet.cpp, line 1136
###!!! ASSERTION: Bad loading table: 'mLoadingDatas.Get(&key, &loadingData) && loadingData == aLoadData', file /home/zack/src/mozilla/S-mc/layout/style/Loader.cpp, line 1632
###!!! ASSERTION: This is unsafe! Fix the caller!: 'Error', file /home/zack/src/mozilla/S-mc/content/events/src/nsEventDispatcher.cpp, line 534

(I'm not sure whether the last one of those is actually this change's fault.)  I'm posting this for review anyway because I've not been able to figure out how to do the synchronous load correctly.

One more patch to follow, but it'll be posted to bug 543151.

Attachment #530388 - Flags: review?(dbaron)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

13 years ago

Attachment #530385 - Flags: review?(dbaron) → review+

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 50

•

13 years ago

Comment on attachment 530388 [details] [diff] [review]
part 1: revamp nsIUnicharStreamLoader and change css::Loader to use it

bz is a better reviewer for the loader than I am

Attachment #530388 - Flags: review?(dbaron) → review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 51

•

13 years ago

One issue I just realiazed for the unichar stream loader.  It needs to make the callback in all cases; the new code seems to not do that.

We should probably use a void string to represent the cases we used a null stream for before.

Boris Zbarsky [:bzbarsky]

Comment 52

•

13 years ago

Er, yes.  And the synchronous stuff looks broken.  Spinning the event loop there is bad; that's why you get the asserts.  You should probably just NS_OpenURI as before, take the resulting nsIInputStream, read it into an nsCAutoString (possibly preallocated to the right size) using something like WriteSegmentToCString (copy that from nsBinaryStream.cpp) and then NS_ConvertUTFtoUTF16 to get an nsAString.

Boris Zbarsky [:bzbarsky]

Comment 53

•

13 years ago

Though that will behave differently from what we have now when the input is not actually valid UTF-8.  Maybe you want to just factor out the streamloader code that converts a incoming segments to UTF16 and use it both places?

Zack Weinberg (:zwol)

Assignee

Comment 54

•

13 years ago

(In reply to comment #52)
>  Spinning the event loop there is bad; that's why you get the asserts.

Doesn't a synchronous NS_OpenURI spin the event loop?  Not in exactly the same place within the function, but still.

It looks to me like the asserts are all about not setting up the principal quite right.

(In reply to comment #51)
> One issue I just realiazed for the unichar stream loader.  It needs to make the
> callback in all cases; the new code seems to not do that.

I thought I set it up so it would (with an empty string, if no data was received).

Boris Zbarsky [:bzbarsky]

Comment 55

•

13 years ago

> Doesn't a synchronous NS_OpenURI spin the event loop? 

It depends on the channel implementation.  If the channel has an underlying stream it can just hand out (say file://, data:, chrome://, etc, etc), then no.

> It looks to me like the asserts are all about not setting up the principal
> quite right.

The asserts happen because while spinning the event loop for a sync load of URI X you reenter this code with a request for another sync load of URI X and end up getting the same stylesheet object and setting a principal on it a second time. But the key problem is the spinning the event loop and reentering bit.

> I thought I set it up so it would (with an empty string, if no data was
> received).

In nsUnicharStreamLoader::OnStopRequest if DetermineCharset returns error you never call OnStreamComplete.  You also don't null out various things and so forth.....

Zack Weinberg (:zwol)

Assignee

Comment 56

•

13 years ago

Attached patch part 1 v2: revamp nsIUnicharStreamLoader and change css::Loader to use it (obsolete) — Details — Splinter Review

I've got it to the point where it doesn't throw any assertions while loading the UA sheets, but instead it crashes hard inside nsFrame::BoxReflow after spitting out some other assertions:

WARNING: Subdocument container has no content: file /home/zack/src/mozilla/S-mc/layout/base/nsDocumentViewer.cpp, line 2380
###!!! ASSERTION: Must be a box frame!: '!mScrollCornerBox || mScrollCornerBox->IsBoxFrame()', file /home/zack/src/mozilla/S-mc/layout/generic/nsGfxScrollFrame.cpp, line 3328
###!!! ASSERTION: A box layout method was called but InitBoxMetrics was never called: 'metrics', file /home/zack/src/mozilla/S-mc/layout/generic/nsFrame.cpp, line 7199

Program received signal SIGSEGV, Segmentation fault.
nsFrame::BoxReflow (this=0x7fffe1c174b8, aState=..., aPresContext=
    0x7fffe339e800, aDesiredSize=..., aRenderingContext=0x7fffe3170af0, aX=0, 
    aY=6000, aWidth=0, aHeight=0, aMoveFrame=<value optimized out>)
(gdb) bt
#0  nsFrame::BoxReflow (this=0x7fffe1c174b8, aState=..., 
    aPresContext=0x7fffe339e800, aDesiredSize=..., 
    aRenderingContext=0x7fffe3170af0, aX=0, aY=6000, aWidth=0, aHeight=0, 
    aMoveFrame=<value optimized out>)
    at /home/zack/src/mozilla/S-mc/layout/generic/nsFrame.cpp:7108
#1  0x00007ffff52cd727 in nsFrame::DoLayout (this=0x7fffe1c174b8, aState=...)
    at /home/zack/src/mozilla/S-mc/layout/generic/nsFrame.cpp:6899
#2  0x00007ffff54b6898 in nsIFrame::Layout (this=0x7fffe1c174b8, aState=...)
    at /home/zack/src/mozilla/S-mc/layout/xul/base/src/nsBox.cpp:559
#3  0x00007ffff54bc64a in nsBoxFrame::LayoutChildAt (aState=..., 
    aBox=0x7fffe1c174b8, aRect=...)
    at /home/zack/src/mozilla/S-mc/layout/xul/base/src/nsBoxFrame.cpp:2033
#4  0x00007ffff52e7f7e in LayoutAndInvalidate (aState=..., 
    aBox=0x7fffe1c174b8, aRect=..., aScrollbarIsBeingHidden=0)
    at /home/zack/src/mozilla/S-mc/layout/generic/nsGfxScrollFrame.cpp:3268
#5  0x00007ffff52ec32b in nsGfxScrollFrameInner::LayoutScrollbars (
    this=0x7fffe1c171e0, aState=..., aContentArea=..., aOldScrollArea=...)
    at /home/zack/src/mozilla/S-mc/layout/generic/nsGfxScrollFrame.cpp:3352
#6  0x00007ffff52f03b5 in nsHTMLScrollFrame::Reflow (this=0x7fffe1c17158, 
    aPresContext=<value optimized out>, aDesiredSize=..., aReflowState=..., 
    aStatus=@0x7fffffffc840)
    at /home/zack/src/mozilla/S-mc/layout/generic/nsGfxScrollFrame.cpp:916
#7  0x00007ffff52b9046 in nsContainerFrame::ReflowChild (
    this=<value optimized out>, aKidFrame=0x7fffe1c17158, 
    aPresContext=0x7fffe339e800, aDesiredSize=..., aReflowState=..., aX=0, 
    aY=0, aFlags=0, aStatus=@0x7fffffffc840, aTracker=0x0)
    at /home/zack/src/mozilla/S-mc/layout/generic/nsContainerFrame.cpp:959
#8  0x00007ffff5375d6f in ViewportFrame::Reflow (this=0x7fffe331bbc8, 
    aPresContext=0x7fffe339e800, aDesiredSize=..., aReflowState=..., 
    aStatus=@0x7fffffffc840)
    at /home/zack/src/mozilla/S-mc/layout/generic/nsViewportFrame.cpp:293
#9  0x00007ffff523e981 in PresShell::DoReflow (this=0x7fffe3391c00, 
    target=0x7fffe331bbc8, aInterruptible=0)
    at /home/zack/src/mozilla/S-mc/layout/base/nsPresShell.cpp:7735

this has me pretty well stumped, although I assume it's still an issue with the UA sheet not getting set up correctly.  (note: this is without the additional patch in bug 543151.)

Attachment #530388 - Attachment is obsolete: true

Attachment #530525 - Flags: review?(bzbarsky)

Attachment #530388 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 57

•

13 years ago

xul.css styles scrollcorners as -moz-box.  So at a guess, that either didn't load or didn't fully load.

Were there any CSS parse errors before the assertion point?  Can you step through what happens when xul.css is parsed?

The way the attached patch calls the streamloader is not quite right, but it only matters in error cases, so I doubt it would cause the issue above...

For what it's worth, it may be worthwhile to break the patch up into pieces to test them independently (e.g. push to try after every patch in the sequence).  Specifically:

1)  Make the unichar stream loader return a string but then have the CSSLoader
    just wrap a StringUnicharInputStream around it and pass that to the CSS
    parser.
2)  Change the CSS parser API to add an overload for parsing a full sheet from
    a string.  Convert the non-sync case in CSSLoader to use the new API.
3)  Convert inline style parsing to use the new API.
4)  Convert the paranoid sink to the new API.
5)  Convert sync sheet loads to use the unichar stream loader or something.
6)  Remove the old parsing API.

This would incidentally be easier to review too...

Oh, and could we please put the Accept header change in a separate bug?

Zack Weinberg (:zwol)

Assignee

Comment 58

•

13 years ago

I checked in patches 0 and 0a:
http://hg.mozilla.org/mozilla-central/rev/b0e5ffb6bfc7
http://hg.mozilla.org/mozilla-central/rev/b754c3a38759

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #530173 - Flags: feedback+

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #530525 - Attachment is obsolete: true

Attachment #530525 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #530171 - Attachment description: part 0: kill nsIUnicharStreamListener → part 0: kill nsIUnicharStreamListener [checkin: comment 58]

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #530385 - Attachment description: part 0a: whitespace cleanup on style/Loader.cpp → part 0a: whitespace cleanup on style/Loader.cpp [checkin: comment 58]

Zack Weinberg (:zwol)

Assignee

Comment 59

•

13 years ago

Update: having broken up the patch a bit, I have everything except sync sheet loads providing strings to the parser, but I also have a weird assertion failure in the leaktests:

localhost.localdomain - - [06/May/2011 12:11:45] "GET /screen.css HTTP/1.1" 200 -
###!!! ASSERTION: unexpected progress values: 'progress <= progressMax', file ../../../../netwerk/protocol/http/nsHttpChannel.cpp, line 4124

It appears that nsHttpChannel's idea of how much of 'screen.css' has been processed (sp. mLogicalOffset) has gotten out of sync with the input stream pump's idea of same.  This happens in the patch that changes nsIUnicharStreamLoader, and the affected loads are all CSS, so I deduce that the revised stream loader is handling OnDataAvailable incorrectly, but I don't see how.

Will post updated patch set to this point shortly.

Boris Zbarsky [:bzbarsky]

Comment 60

•

13 years ago

Oh, I'd missed that, sorry.  The onDataAvailable contract is that that the callee will read all the data from aInputStream.  The last attached patch on this bug will only read at most 4096 bytes, so if more than that is available things won't work right.

That might be the sync sheet issue too, since I bet this ended up reading only the first 4KB of all the chrome stylesheets.

Instead of passing 4096 to ReadSegments there you should pass aCount minus however much you've already read.

Zack Weinberg (:zwol)

Assignee

Comment 61

•

13 years ago

(In reply to comment #60)
> 
> Instead of passing 4096 to ReadSegments there you should pass aCount minus
> however much you've already read.

Oh, is *that* what that argument means?  It sounded like it was a limit on the size of each individual *chunk* passed to the callback function.

Zack Weinberg (:zwol)

Assignee

Comment 62

•

13 years ago

(In reply to comment #60)
> Oh, I'd missed that, sorry.  The onDataAvailable contract is that that the
> callee will read all the data from aInputStream.

Is that true even if onDataAvailable returns a failure code?

Boris Zbarsky [:bzbarsky]

Comment 63

•

13 years ago

> Oh, is *that* what that argument means?

Yeah, it's the total amount to read.  The amount passed to the callback function on each call to the callback is up to the stream.

> Is that true even if onDataAvailable returns a failure code?

No.  Only if you return success.  Should have been clearer about that.

Zack Weinberg (:zwol)

Assignee

Comment 64

•

13 years ago

Thanks for clarifying.  I think I've got that part working now, but I ran into a more ... elemental ... problem with using the UnicharStreamLoader for sync loads too: nsFileInputStream doesn't implement ReadSegments.

I am presently testing a patch that implements nsFileInputStream::ReadSegments, but MXR tells me that there are rather a lot of input stream classes that don't implement ReadSegments.  I don't really know which of them we could wind up using - either for a sync load or an async one - but I'm wondering if I shouldn't deal with this in nsUnicharStreamLoader, and if so, what the tidiest way to do it would be.

Zack Weinberg (:zwol)

Assignee

Comment 65

•

13 years ago

Attached patch v4 part 1 (obsolete) — Details — Splinter Review

Here we go with the broken-up patch series.  This works well enough to pass reftests on my local computer; try is running the full testsuite; there are still some places where I'm not sure the code is right, which I will call out below.

I did the breakup in a different order than the one bz suggested in comment 57 so that I did not have to introduce any new uses of streams.  This first patch just adds a new nsCSSParser entry point that parses an entire sheet, but takes a string rather than a stream.

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #530876 - Flags: review?(dbaron)

Zack Weinberg (:zwol)

Assignee

Comment 66

•

13 years ago

Attached patch v4 part 2 (obsolete) — Details — Splinter Review

This piece changes nsHTMLParanoidFragmentSink to use the new API and also does some en passant #include-pruning.  I'm not sure who the appropriate reviewer is - dbaron, please feel free to bounce it to someone else.

Attachment #530877 - Flags: review?(dbaron)

Zack Weinberg (:zwol)

Assignee

Comment 67

•

13 years ago

Attached patch v4 part 3 (obsolete) — Details — Splinter Review

And this part uses the new parser entry point for "normal" use of inline style.  Again, please feel free to redirect review to someone more appropriate.

I have high confidence in the patches up to this point.

Attachment #530878 - Flags: review?(dbaron)

Zack Weinberg (:zwol)

Assignee

Comment 68

•

13 years ago

Attached patch v4 part 4 (obsolete) — Details — Splinter Review

Rewrite ns(I)UnicharStreamLoader to produce a string rather than a stream, and use it for async (non-chrome, that is) style loads.

Potentially a problem here is what I mentioned in comment 64: the revised streamloader assumes that ReadSegments works, but there are a lot of streams that don't implement it.  I don't know what can reasonably be expected of the type of stream we get on the async path.

Attachment #530879 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 69

•

13 years ago

Attached patch v4 part 5 (obsolete) — Details — Splinter Review

This piece sends sync stylesheet loads through mostly the same code path as async ones.

Needing careful checking in this piece are the new NS_FeedStreamListener helper function, and the implementation of nsFileInputStream::ReadSegments that I added.  Also, is it safe to poke the content charset on a channel, as done here?

Attachment #530880 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 70

•

13 years ago

Attached patch v4 part 6 (obsolete) — Details — Splinter Review

This final piece removes no-longer-used stream-taking methods from css::Loader and nsCSSParser.  nsCSSScanner can now be cleaned up as well - that will be done in bug 543151.

I am not going to be around much this coming week (travel) and may have very little time for Mozilla hacking after that until mid-August (summer job).

Attachment #530881 - Flags: review?(dbaron)

Boris Zbarsky [:bzbarsky]

Comment 71

•

13 years ago

> nsFileInputStream doesn't implement ReadSegments.

Indeed.  On purpose.  We could deal with it by wrapping in buffered streams as needed (by first testing whether the stream we're given is buffered).  This would just happen on the sync load path in the CSSLoader.

> I don't know what can reasonably be expected of the type of stream we get on
> the async path.

They will implement readSegments, as long as the caller is not breaking the API contract.  nsIStreamListener.idl documents this.

I'll take a careful look through the patches on Monday.  Given the above discussion, I doubt there's much that will need changing, so if you don't have time to address the review comments, if any, just let me know and I'll deal with it.  Thanks for working on this!

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 72

•

13 years ago

It might make more sense if bzbarsky reviews the whole patch series here...

Zack Weinberg (:zwol)

Assignee

Comment 73

•

13 years ago

No objection from me.

Boris Zbarsky [:bzbarsky]

Comment 74

•

13 years ago

Yeah, that's certainly my plan.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

13 years ago

Attachment #530876 - Flags: review?(dbaron) → review?(bzbarsky)

Attachment #530877 - Flags: review?(dbaron) → review?(bzbarsky)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

13 years ago

Attachment #530878 - Flags: review?(dbaron) → review?(bzbarsky)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

13 years ago

Attachment #530881 - Flags: review?(dbaron) → review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 75

•

13 years ago

Comment on attachment 530876 [details] [diff] [review]
v4 part 1

Instead of adding and then later removing a second copy of all the guts of ParseSheet, how about we do:

  template<typename T>
  ParseSheet(T aInput, ....);

for the existing CSSParserImpl code, then have the exact body we have right now and have nsCSSParer::Parse (the old signature) call ParseSheet.  Then in the last patch you can remove the templating and just leave the |const nsAString&| version.

r=me with that.

Attachment #530876 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 76

•

13 years ago

Comment on attachment 530877 [details] [diff] [review]
v4 part 2

r=me

Attachment #530877 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 77

•

13 years ago

Comment on attachment 530878 [details] [diff] [review]
v4 part 3

Again, I'd prefer a templated version of ParseSheet to a copy-and-remove...

r=me with that.

Attachment #530878 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 78

•

13 years ago

Comment on attachment 530879 [details] [diff] [review]
v4 part 4

>+++ b/netwerk/base/src/nsUnicharStreamLoader.cpp
>+nsUnicharStreamLoader::Init(nsIUnicharStreamLoaderObserver *aObserver)
>+  mDecoder = nsnull;
>+  mContext = nsnull;
>+  mChannel = nsnull;

Those should all be null already.  No need to null them again.

>+  mBuffer.Truncate();
>+  mRawData.Truncate();

Also no need for that.

>+  mRawData.SetCapacity(512);
>+
>   return NS_OK;

Here, on the other hand, you should check whether the capacity is actually 512.  If it isn't, return NS_ERROR_OUT_OF_MEMORY so this object won't be used.

>+nsUnicharStreamLoader::OnStopRequest(nsIRequest *aRequest,
>+  if (NS_FAILED(rv)) {
>+    // Call the observer but pass it no data.
>+    // XXX Should we pass rv instead of aStatus?
>+    nsAutoString dummy;
>+    mObserver->OnStreamComplete(this, mContext, aStatus, dummy);

Nix dummy and just pass EmptyString() as the last argument.

And yes, I'd say pass rv here.  And document this behavior in the idl.  That would make this case consistent with the case when we end up with an error from DetermineCharset in OnDataAvailable.

Also, maybe the code should look more like this:

  nsresult rv = NS_OK;
  if (mRawData.Length() > 0 && NS_SUCCEEDED(aStatus)) {
    NS_ABORT_IF_FALSE(mBuffer.Length() == 0,
                      "should not have both decoded and raw data");
    rv = DetermineCharset();
  }

and then your code as now, so that if aStatus is a failure we don't try to determine the charset for no reason.

>+nsUnicharStreamLoader::DetermineCharset()
>+  NS_ABORT_IF_FALSE(IsASCII(mCharset), "non-ASCII character set name");

Sadly, you can't do that.  mCharset can come from web or css content; so it could be all sorts of broken....  Just take that line out.

>+++ b/netwerk/base/src/nsUnicharStreamLoader.h
>+  nsCString                                mRawData;
>+  nsString                                 mBuffer;

Document what those members are for, please.

r=me with the nits

Attachment #530879 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 79

•

13 years ago

Comment on attachment 530880 [details] [diff] [review]
v4 part 5

Poking the content charset is fine.

I'd really rather not implement nsFileInputStream::ReadSegments here.  I also think that we shouldn't add NS_FeedStreamListener (which doesn't actually handle streams that don't have all the data available up front, by the way).  Since this code is all in content-ish land anyway, nsSyncLoadService::PushSyncStreamToListener will do what we want here and is already used and tested.  Bonus points for fixing that to correctly handle errors from OnStartRequest and OnDataAvailable, though!

Attachment #530880 - Flags: review?(bzbarsky) → review-

Boris Zbarsky [:bzbarsky]

Comment 80

•

13 years ago

Comment on attachment 530881 [details] [diff] [review]
v4 part 6

r=me, though I expect this will be smaller with the templating approach.

Attachment #530881 - Flags: review?(bzbarsky) → review+

Zack Weinberg (:zwol)

Assignee

Comment 81

•

13 years ago

(In reply to comment #75)
>   template<typename T>
>   ParseSheet(T aInput, ....);

I thought of that but was worried about weird template problems.  I'll try it.

> >+nsUnicharStreamLoader::Init(nsIUnicharStreamLoaderObserver *aObserver)
> >+  mDecoder = nsnull;
> >+  mContext = nsnull;
> >+  mChannel = nsnull;
> 
> Those should all be null already.  No need to null them again.

The old Init method may have been doing that too, I don't remember.  Will change.

> >+  mRawData.SetCapacity(512);
> >+
> >   return NS_OK;
> 
> Here, on the other hand, you should check whether the capacity is actually
> 512.  If it isn't, return NS_ERROR_OUT_OF_MEMORY so this object won't be
> used.

I thought string resizing was infallible these days?  But ok.

> Nix dummy and just pass EmptyString() as the last argument.

I learned something today!

> And yes, I'd say pass rv here.  And document this behavior in the idl.  That
> would make this case consistent with the case when we end up with an error
> from DetermineCharset in OnDataAvailable.

Ok.

...
> and then your code as now, so that if aStatus is a failure we don't try to
> determine the charset for no reason.

Ok.

> >+nsUnicharStreamLoader::DetermineCharset()
> >+  NS_ABORT_IF_FALSE(IsASCII(mCharset), "non-ASCII character set name");
> 
> Sadly, you can't do that.  mCharset can come from web or css content; so it
> could be all sorts of broken....  Just take that line out.

That was in the old version (as an ASSERTION rather than an ABORT_IF_FALSE); I saw it as enforcing a requirement for the OnDetermineCharset hook to sanitize its output.  I don't mind removing it tho.

> >+++ b/netwerk/base/src/nsUnicharStreamLoader.h
> >+  nsCString                                mRawData;
> >+  nsString                                 mBuffer;
> 
> Document what those members are for, please.

Ok.

> I'd really rather not implement nsFileInputStream::ReadSegments here.  I
> also think that we shouldn't add NS_FeedStreamListener (which doesn't
> actually handle streams that don't have all the data available up front, by
> the way).  Since this code is all in content-ish land anyway,
> nsSyncLoadService::PushSyncStreamToListener will do what we want here and is
> already used and tested.  Bonus points for fixing that to correctly handle
> errors from OnStartRequest and OnDataAvailable, though!

More handy things I didn't know about!  Ok, provided it turns out that I can include the necessary header from layout/style/.

I will try to produce revisions by Sunday.

Boris Zbarsky [:bzbarsky]

Comment 82

•

13 years ago

> I thought string resizing was infallible these days?

Not yet (and possibly not ever).

> That was in the old version (as an ASSERTION rather than an ABORT_IF_FALSE)

I was young and foolish.  ;)

> Ok, provided it turns out that I can include the necessary header from
> layout/style/.

You definitely can.  Worst case, add an entry to LOCAL_INCLUDEs in layout/style/Makefile.in.

Zack Weinberg (:zwol)

Assignee

Comment 83

•

13 years ago

I've made all the requested revisions and pushed the result to try.  I have to run now - will upload revised patches later.

Zack Weinberg (:zwol)

Assignee

Comment 84

•

13 years ago

Attached patch v5 part 1 — Details — Splinter Review

The templatized ParseSheet() worked except for a C++ wart.  You can't have a template function whose callers pass the argument-with-template-type as a pointer in some cases and a reference in others; if you try, by using bare 'T' as the argument type, C++ will try to pass T by *value* when it should've passed by reference.  So I had to change the nsIUnicharInputStream overload to take a reference too.  This is a little weird, but it's temporary, so I think we can live with it.  I also renamed Parse() to ParseSheet(), even though that overload goes away in part 6, because otherwise templatizing Loader::ParseSheet in part 3 wouldn't work.

Attachment #530876 - Attachment is obsolete: true

Attachment #532784 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 85

•

13 years ago

Attached patch v5 part 2 — Details — Splinter Review

No substantive changes in this piece, carrying r=bz forward.

Attachment #532786 - Flags: review+

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

Attachment #530877 - Attachment is obsolete: true

Zack Weinberg (:zwol)

Assignee

Comment 86

•

13 years ago

Attached patch v5 part 3 — Details — Splinter Review

Only substantive change is switching to templatized Loader::ParseSheet.

Attachment #530878 - Attachment is obsolete: true

Attachment #532787 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 87

•

13 years ago

Attached patch v5 part 4 — Details — Splinter Review

Changes as requested in comment 78.

Attachment #530879 - Attachment is obsolete: true

Attachment #532791 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 88

•

13 years ago

Attached patch v5 part 5 — Details — Splinter Review

Uses nsSyncLoadService::PushSyncStreamToListener as suggested in comment 79.  I think I've "fixed [PushSyncStreamToListener] to correctly handle errors from OnStartRequest and OnDataAvailable", too, but I don't know how one would test that.

Attachment #530880 - Attachment is obsolete: true

Attachment #532794 - Flags: review?(bzbarsky)

Zack Weinberg (:zwol)

Assignee

Comment 89

•

13 years ago

Attached patch v5 part 6 — Details — Splinter Review

No substantive changes, carrying r=bz forward (this is indeed a lot smaller, as predicted).

bz: I'd appreciate it if you could take this from here.  I'm probably not going to have any hacking time this week (and if I do, I want to spend it on bug 543151) and I definitely won't have time to watch the tree.

Attachment #530881 - Attachment is obsolete: true

Attachment #532795 - Flags: review+

Boris Zbarsky [:bzbarsky]

Comment 90

•

13 years ago

Comment on attachment 532784 [details] [diff] [review]
v5 part 1

r=me

Attachment #532784 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 91

•

13 years ago

Comment on attachment 532787 [details] [diff] [review]
v5 part 3

r=me

Attachment #532787 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 92

•

13 years ago

Comment on attachment 532791 [details] [diff] [review]
v5 part 4

r=me

Attachment #532791 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 93

•

13 years ago

Comment on attachment 532794 [details] [diff] [review]
v5 part 5

r=me.  Thanks for fixing the sync load service code!

Attachment #532794 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 94

•

13 years ago

Pushed to cedar:

  http://hg.mozilla.org/projects/cedar/rev/daeafb8f2b6e
  http://hg.mozilla.org/projects/cedar/rev/d4fa788766a1
  http://hg.mozilla.org/projects/cedar/rev/6380116c6eb7
  http://hg.mozilla.org/projects/cedar/rev/ca4404b5a114
  http://hg.mozilla.org/projects/cedar/rev/303c10f7ae5c
  http://hg.mozilla.org/projects/cedar/rev/d9d8cade3433

Zack, are you planning to work on bug 543151, or do you want someone to steal it?

Flags: in-testsuite-

Whiteboard: [fixed-in-cedar]

Zack Weinberg (:zwol)

Assignee

Comment 95

•

13 years ago

(In reply to comment #94)
>
> Zack, are you planning to work on bug 543151, or do you want someone to
> steal it?

There's a relatively up-to-date patch in there which might even be correct as is; I was going to try to find time tomorrow or Friday to verify that and maybe also break it up for ease of review, but if you'd like to take it over, please be my guest.  dbaron hasn't commented on it at all :/

Boris Zbarsky [:bzbarsky]

Comment 96

•

13 years ago

dbaron's on vacation and probably overloaded on reviews.  Please toss that patch in my queue too (not instead of), and we'll see whether I can get to him before he does?

Zack Weinberg (:zwol)

Assignee

Comment 97

•

13 years ago

Done.

Mounir Lamouri (:mounir)

Comment 98

•

13 years ago

Pushed:
http://hg.mozilla.org/mozilla-central/rev/daeafb8f2b6e
http://hg.mozilla.org/mozilla-central/rev/d4fa788766a1
http://hg.mozilla.org/mozilla-central/rev/6380116c6eb7
http://hg.mozilla.org/mozilla-central/rev/ca4404b5a114
http://hg.mozilla.org/mozilla-central/rev/303c10f7ae5c
http://hg.mozilla.org/mozilla-central/rev/d9d8cade3433

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Whiteboard: [fixed-in-cedar]

Target Milestone: --- → mozilla6

Zack Weinberg (:zwol)

Assignee

Updated

•

13 years ago

No longer blocks: 516091

Boris Zbarsky [:bzbarsky]

Updated

•

13 years ago

Depends on: 670553

:Paolo Amadini

Comment 99

•

13 years ago

This bug changed the nsIUnicharStreamLoader interface, but the change does not
appear in the "Firefox 6 for developers" page nor the use of the interface is
detected in the add-ons compatibility checker. The effect is that I received a
severe bug report for an add-on I'm maintaining, the day after the release of
Firefox 6; probably, users testing on Aurora and Beta were not enough to detect
the issue earlier, even if the add-on has more than 50K users. Fortunately I was
available to address the issue promptly.

What is the process to track these kind of changes? Is it up to the patch author
to update the documentation? I'm asking firstly because I made an interface
change in another bug and don't want it to be lost, and secondly to understand
if, from now on, I should plan my vacations according to the release calendar :-)

Zack Weinberg (:zwol)

Assignee

Comment 100

•

13 years ago

My apologies; we thought the CSS loader was the only user of nsIUnicharStreamLoader and didn't check add-ons.  I am not sure what the procedure is for getting interface changes put into the release notes and the compatibility checker; suggest you ask on dev-platform.

Alice0775 White

Updated

•

13 years ago

Depends on: 682706

:Paolo Amadini

Comment 101

•

13 years ago

(In reply to Zack Weinberg (:zwol) from comment #100)
> My apologies; we thought the CSS loader was the only user of
> nsIUnicharStreamLoader and didn't check add-ons.  I am not sure what the
> procedure is for getting interface changes put into the release notes and
> the compatibility checker; suggest you ask on dev-platform.

Thanks for the pointer, we discussed this on dev-platform and we now have a
procedure :-) It basically consists in adding a new "addon-compat" keyword.
Also, it's welcome to CC Jorge of the Add-ons Team on the bug as well.

You need to log in before you can comment on or make changes to this bug.