Last Comment Bug 229827 - CSS parser diagnostics should escape unprintable characters
: CSS parser diagnostics should escape unprintable characters
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: CSS Parsing and Computation (show other bugs)
: Trunk
: All All
: -- minor (vote)
: mozilla19
Assigned To: Zack Weinberg (:zwol)
:
:
Mentors:
http://bugzilla.mozilla.org/attachmen...
Depends on: 1203346 CVE-2008-5510 455839 663291
Blocks: 413958
  Show dependency treegraph
 
Reported: 2004-01-01 00:03 PST by Mats Palmgren (:mats)
Modified: 2015-09-10 07:18 PDT (History)
5 users (show)
ryanvm: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (18.77 KB, patch)
2011-06-14 13:11 PDT, Zack Weinberg (:zwol)
dbaron: review-
Details | Diff | Splinter Review
patch v2 (22.74 KB, patch)
2011-06-14 17:56 PDT, Zack Weinberg (:zwol)
no flags Details | Diff | Splinter Review
patch v3 (26.45 KB, patch)
2011-06-16 10:21 PDT, Zack Weinberg (:zwol)
no flags Details | Diff | Splinter Review
patch v4 (28.09 KB, patch)
2012-09-16 15:30 PDT, Zack Weinberg (:zwol)
dbaron: review+
Details | Diff | Splinter Review
patch v5 (28.21 KB, patch)
2012-11-16 12:35 PST, Zack Weinberg (:zwol)
zackw: review+
Details | Diff | Splinter Review

Description Mats Palmgren (:mats) 2004-01-01 00:03:49 PST
(This is a follow-up from bug 228856.)

Console error messages are truncated at an embedded NUL.
I did investigate it briefly, one problem is that the logging API takes
a PRUnichar* parameter so I think we need to fix this near the source in
nsCSSScanner/Parser (where we know how and what needs to be escaped).

We should probably escape other non-printable characters as well.
For example c\1olor gives this rather amusing error message:
CSS Error (file :6.24): Unknown property 'color'.  Declaration dropped.

STEPS TO REPRODUCE:
1. Load "Testcase #5" (attachment 138243 [details]) from bug 228856 in "viewer"
2. Look at the error messages on the console

ACTUAL BEHAVIOUR:
CSS Error (file :7.29): Unknown property 'c
CSS Error (file :8.29): Unknown property 'color
CSS Error (file :9.29): Unknown property '
CSS Error (file :10.17): Expected declaration but found '
CSS Error (file :11.20): Expected ':' but found '
CSS Error (file :12.22): Expected ':' but found '
CSS Error (file :1.22): Unrecognized at-rule or error parsing at-rule '@media
CSS Error (file :1.15): Unrecognized at-rule or error parsing at-rule '@me

EXPECTED BEHAVIOUR:
CSS Error (file :7.23): Unknown property 'c\000000olor'.  Declaration dropped.
CSS Error (file :8.23): Unknown property 'color\000000'.  Declaration dropped.
CSS Error (file :9.23): Unknown property '\000000color'.  Declaration dropped.
CSS Error (file :10.23): Unknown property '\000000color'.  Declaration dropped.
CSS Error (file :11.23): Unknown property 'col\000000or'.  Declaration dropped.
CSS Error (file :12.23): Unknown property 'color\000000'.  Declaration dropped.
CSS Error (file :1.9): Unrecognized at-rule or error parsing at-rule
'@media\000000'.
CSS Error (file :1.9): Unrecognized at-rule or error parsing at-rule
'@me\000000dia'.
Comment 1 Zack Weinberg (:zwol) 2008-11-05 13:43:05 PST
I have a bunch of CSS scanner/parser work in the 1.9.2 cycle so I'll get this at the same time.
Comment 2 Zack Weinberg (:zwol) 2008-12-17 13:19:30 PST
Retitling - this is about more than just NULs.
Comment 3 [not reading bugmail] 2010-05-28 16:32:16 PDT
Maybe this should depend/block on the HTml5 parser bug?
Comment 4 Zack Weinberg (:zwol) 2011-06-14 13:11:48 PDT
Created attachment 539297 [details] [diff] [review]
patch

Test uses the console listener features added to SpecialPowers/SimpleTest in bug 663291.  Code changes have structural dependencies on bug 455839 and its dependencies.
Comment 5 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2011-06-14 16:41:41 PDT
Comment on attachment 539297 [details] [diff] [review]
patch

Do these differ from nsStyleUtil::AppendEscapedCSSString and nsStyleUtil::AppendEscapedCSSIdent?  If so, please explain how (and why you can't add the needed functionality to the existing ones); if not, please use the existing ones.
Comment 6 Zack Weinberg (:zwol) 2011-06-14 17:23:36 PDT
(In reply to comment #5)
> Comment on attachment 539297 [details] [diff] [review] [review]
> patch
> 
> Do these differ from nsStyleUtil::AppendEscapedCSSString and
> nsStyleUtil::AppendEscapedCSSIdent?  If so, please explain how (and why you
> can't add the needed functionality to the existing ones); if not, please use
> the existing ones.

Wow, how did I miss those?  They appear not to escape U+007F through U+009F, which is probably just an oversight; also, AppendEscapedCSSString gives no control over which quote characters it uses, which nsCSSToken::AppendToString needs.  However, I think I can adapt them to fit.
Comment 7 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2011-06-14 17:42:07 PDT
Well, U+0080 through U+009F are identifier characters in CSS so they don't have to be escaped, but it's probably reasonable to escape them given that they've now been assigned and they're nonprintable.

U+007F was clearly an oversight, and it should be trivial to make existing tests fail by adding string or ident values to property_database.js with it.

Escaping all of those numerically in both functions is fine with me.
Comment 8 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2011-06-14 17:43:02 PDT
And AppendEscapedCSSString could easily take a quote character as a (defaulted) argument, though it should assert that it's either " or '.
Comment 9 Zack Weinberg (:zwol) 2011-06-14 17:56:30 PDT
Created attachment 539387 [details] [diff] [review]
patch v2

New patch.  Changes to nsStyleUtil::AppendEscapedCSS* are to allow more control over the surrounding quotes provided by AppendEscapedCSSString; escape characters U+007F through U+009F (only U+00A0 and greater are allowed in identifiers - I guess it would have been okay to leave them alone in strings, but it seems more consistent this way); remove the requirement for an nsString on the input side; and make use of the new .AppendPrintf convenience method.
Comment 10 Zack Weinberg (:zwol) 2011-06-15 08:32:05 PDT
(In reply to comment #7)
> Well, U+0080 through U+009F are identifier characters in CSS so they don't
> have to be escaped, but it's probably reasonable to escape them given that
> they've now been assigned and they're nonprintable.

http://www.w3.org/TR/CSS2/syndata.html#characters says only U+00A0 and higher are valid in identifiers.  Is this a change from 2.0?

> U+007F was clearly an oversight, and it should be trivial to make existing
> tests fail by adding string or ident values to property_database.js with it.

Ok, I can do that.

(In reply to comment #8)
> And AppendEscapedCSSString could easily take a quote character as a
> (defaulted) argument, though it should assert that it's either " or '.

I'll add the assertion.
Comment 11 Zack Weinberg (:zwol) 2011-06-16 10:21:53 PDT
Created attachment 539820 [details] [diff] [review]
patch v3

Now with assertion and additional property_database.js testing.  Also I decided serialization of BadString was too exotic a case to support directly in AppendEscapedCSSString; instead, nsCSSToken::AppendToString chops off the extra quote character.
Comment 12 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2011-07-23 16:30:41 PDT
spec bug noted in
http://lists.w3.org/Archives/Public/www-style/2011Jul/0398.html
Comment 13 Zack Weinberg (:zwol) 2012-09-16 15:30:28 PDT
Created attachment 661643 [details] [diff] [review]
patch v4

Dusting this off now we're finally in a position to land bug 663291 and therefore to test CSS error reporting in detail.

One substantive change from the previous version: AppendEscapedCSSIdent now handles identifiers beginning with {nmchar}-but-not-{nmstart} characters correctly in more cases, and this is tested.

Functional dependency on bug 663291 for tests.  Textual dependency on bug 455839 and, transitively, bug 516091.
Comment 14 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2012-10-15 10:04:13 PDT
Comment on attachment 661643 [details] [diff] [review]
patch v4

> void
> ErrorReporter::ReportUnexpected(const char *aMessage,
>                                 const nsAString &aParam)
> {
>   if (!ShouldReportErrors()) return;
> 
>-  const PRUnichar *params[1] = { aParam.BeginReading() };
>+  nsAutoString qparam;
>+  nsStyleUtil::AppendEscapedCSSIdent(aParam, qparam);
>+  const PRUnichar *params[1] = { qparam.BeginReading() };
>+

While I'm here:  this should really be get() rather than BeginReading().  The BeginReading() API doesn't guarantee null-termination; this code is assuming null-termination (and in the case of the new code, guaranteeing it by constructing an nsAutoString).

But if the code requires null-termination, it should take const nsString& instead of const nsSubstring& (nsSubstring is the preferred name for nsAString now that nsAString is no longer abstract), and just use get().  The BeginReading() business is bogus.
Comment 15 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2012-10-15 10:17:36 PDT
Comment on attachment 661643 [details] [diff] [review]
patch v4

#include comment 14

> /* static */ void
>-nsStyleUtil::AppendEscapedCSSIdent(const nsString& aIdent, nsAString& aReturn)
>+nsStyleUtil::AppendEscapedCSSIdent(const nsAString& aIdent, nsAString& aReturn)

>+  for (; in != end; ++in) {
>+    PRUnichar ch = *in;
>+    if (ch < 0x20 || (0x7F <= ch && ch < 0xA0)) {
>+      // Escape U+0000 through U+001F and U+007F through U+009F numerically.
>+      aReturn.AppendPrintf("\\%hX ", *in);
>     } else {
>+      // Escape ASCII non-identifier printables as a backslash plus
>+      // the character.
>+      if (ch != '_' && ch != '-' &&
>+          (ch < '0' || '9' < ch) &&
>+          (ch < 'A' || 'Z' < ch) &&
>+          (ch < 'a' || 'z' < ch) &&
>+          ch < 0xA0) {

I think this last line would be clearer as "ch < 0x80".  It doesn't make a difference since 0x80 <= ch < 0xA0 never enters this branch because of the if above.

r=dbaron with that

(I didn't look at the new test closely; let me know if you want me to.)
Comment 16 Zack Weinberg (:zwol) 2012-10-20 20:09:56 PDT
(In reply to David Baron [:dbaron] from comment #15)
> I think this last line would be clearer as "ch < 0x80".  It doesn't make a
> difference since 0x80 <= ch < 0xA0 never enters this branch because of the
> if above.

OK, will change.

> (I didn't look at the new test closely; let me know if you want me to.)

Well, if you have a better idea for how to test this code than "intimate knowledge of how to evoke parser errors that contain user-controlled text" I would love to hear it, but we don't change that _that_ often, so I'm pretty happy with things as is.
Comment 17 Zack Weinberg (:zwol) 2012-10-21 07:37:08 PDT
(In reply to David Baron [:dbaron] from comment #14)
> >-  const PRUnichar *params[1] = { aParam.BeginReading() };
> >+  nsAutoString qparam;
> >+  nsStyleUtil::AppendEscapedCSSIdent(aParam, qparam);
> >+  const PRUnichar *params[1] = { qparam.BeginReading() };
> >+
> 
> While I'm here:  this should really be get() rather than BeginReading(). 
> The BeginReading() API doesn't guarantee null-termination; this code is
> assuming null-termination (and in the case of the new code, guaranteeing it
> by constructing an nsAutoString).
>
> But if the code requires null-termination, it should take const nsString&
> instead of const nsSubstring& (nsSubstring is the preferred name for
> nsAString now that nsAString is no longer abstract), and just use get(). 
> The BeginReading() business is bogus.

I'm terminally confused about the various string classes.  I'll make the suggested changes here and in the patch for bug 455839.
Comment 18 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2012-10-21 07:46:46 PDT
Does https://wiki.mozilla.org/Gecko:Overview#String help?
Comment 19 Zack Weinberg (:zwol) 2012-10-21 07:58:16 PDT
(In reply to Zack Weinberg (:zwol) from comment #16)
> (In reply to David Baron [:dbaron] from comment #15)
> > I think this last line would be clearer as "ch < 0x80".  It doesn't make a
> > difference since 0x80 <= ch < 0xA0 never enters this branch because of the
> > if above.
> 
> OK, will change.

On actually looking at it again, I think this is even better:

      // Escape ASCII non-identifier printables as a backslash plus
      // the character.
      if (ch < 0x7F &&
          ch != '_' && ch != '-' &&
          (ch < '0' || '9' < ch) &&
          (ch < 'A' || 'Z' < ch) &&
          (ch < 'a' || 'z' < ch)) {
        aReturn.Append(PRUnichar('\\'));
      }

Putting the 'ch < 0x7F' test up front gives parallel structure with the comment, and using 0x7F instead of 0x80 makes it match the if-block above.

> Does https://wiki.mozilla.org/Gecko:Overview#String help?

It clarifies what the classes presently are, but I am still somewhat mystified as to which one to choose in any given situation.

On an unrelated note, it occurs to me that since the parser uses macro stringification for all its error messages anyway, we could wrap them in NS_LITERAL_STRING and avoid runtime conversion from ASCII to UTF-16 for the stringbundle lookups.  I have this dim memory that we discussed this possibility once before, but I don't remember what you said about it.
Comment 20 Zack Weinberg (:zwol) 2012-11-16 12:35:26 PST
Created attachment 682580 [details] [diff] [review]
patch v5

Final version of patch.

https://hg.mozilla.org/integration/mozilla-inbound/rev/b242651c3c1b
Comment 21 Ryan VanderMeulen [:RyanVM] 2012-11-16 15:53:56 PST
Unfortunately, something in the push for this bug and bug 663291 caused mochitest-1/3/4 failures (some perma, some intermittent). Backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/20fcf80a5242

Mochitest-1 (Intermittent)
https://tbpl.mozilla.org/php/getParsedLog.php?id=17116584&tree=Mozilla-Inbound
59145 ERROR TEST-UNEXPECTED-FAIL | /tests/content/events/test/test_bug489671.html | monitorConsole | number of messages - got 2, expected 3

Mochitest-4 (Frequent)
https://tbpl.mozilla.org/php/getParsedLog.php?id=17114235&tree=Mozilla-Inbound
851 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/workers/test/test_errorPropagation.html | monitorConsole | extra message | {"message":"[JavaScript Error: \"Error: expectedError\" {file: \"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js\" line: 48}]","errorMessage":"Error: expectedError","sourceName":"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js","sourceLine":"","lineNumber":48,"columnNumber":0,"category":"Web Worker","windowID":173,"isScriptError":true,"isWarning":false,"isException":false,"isStrict":false}
852 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/workers/test/test_errorPropagation.html | monitorConsole | extra message | {"message":"[JavaScript Error: \"Error: expectedError\" {file: \"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js\" line: 48}]","errorMessage":"Error: expectedError","sourceName":"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js","sourceLine":"","lineNumber":48,"columnNumber":0,"category":"Web Worker","windowID":173,"isScriptError":true,"isWarning":false,"isException":false,"isStrict":false}
853 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/workers/test/test_errorPropagation.html | monitorConsole | extra message | {"message":"[JavaScript Error: \"Error: expectedError\" {file: \"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js\" line: 48}]","errorMessage":"Error: expectedError","sourceName":"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js","sourceLine":"","lineNumber":48,"columnNumber":0,"category":"Web Worker","windowID":173,"isScriptError":true,"isWarning":false,"isException":false,"isStrict":false}
854 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/workers/test/test_errorPropagation.html | monitorConsole | extra message | {"message":"[JavaScript Error: \"Error: expectedError\" {file: \"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js\" line: 48}]","errorMessage":"Error: expectedError","sourceName":"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js","sourceLine":"","lineNumber":48,"columnNumber":0,"category":"Web Worker","windowID":173,"isScriptError":true,"isWarning":false,"isException":false,"isStrict":false}
855 ERROR TEST-UNEXPECTED-FAIL | /tests/dom/workers/test/test_errorPropagation.html | monitorConsole | extra message | {"message":"[JavaScript Error: \"Error: expectedError\" {file: \"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js\" line: 48}]","errorMessage":"Error: expectedError","sourceName":"http://mochi.test:8888/tests/dom/workers/test/errorPropagation_worker.js","sourceLine":"","lineNumber":48,"columnNumber":0,"category":"Web Worker","windowID":173,"isScriptError":true,"isWarning":false,"isException":false,"isStrict":false}

Mochitest-5 (Perma)
https://tbpl.mozilla.org/php/getParsedLog.php?id=17114186&tree=Mozilla-Inbound
157805 ERROR TEST-UNEXPECTED-FAIL | /tests/parser/htmlparser/tests/mochitest/test_bug672453.html | monitorConsole | [7].errorMessage value - got An unsupported character encoding was declared for the HTML document using a meta tag. The declaration was ignored., strictly expected A meta tag was used to declare a character encoding the does not encode the Basic Latin range roughly like US-ASCII. The declaration was ignored.
Comment 22 Zack Weinberg (:zwol) 2012-11-16 19:18:05 PST
Re-landed: https://hg.mozilla.org/integration/mozilla-inbound/rev/b11550b854e8 . See bug 663291 for failure analysis.
Comment 23 Ryan VanderMeulen [:RyanVM] 2012-11-17 05:12:24 PST
https://hg.mozilla.org/mozilla-central/rev/b11550b854e8

Note You need to log in before you can comment on or make changes to this bug.