store any implied characters needed to properly serialize tokens at EOF, on nsCSSScanner

RESOLVED FIXED in mozilla28

Status

()

Core
CSS Parsing and Computation
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: heycam, Assigned: heycam)

Tracking

(Blocks: 1 bug)

unspecified
mozilla28
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qa-])

Attachments

(1 attachment, 1 obsolete attachment)

For things like:

  <style>p { var-a: a 'b</style>

we need to store the token stream of var-a as "a 'b'".  So we need the scanner to remember what characters are required at EOF for the final token to be serialized properly.
Would be useful for

  supportsRule.conditionText = '(content: "a';

to do the right thing, too.
Created attachment 801475 [details] [diff] [review]
patch
Assignee: nobody → cam
Status: NEW → ASSIGNED
Attachment #801475 - Flags: review?(dbaron)
Comment on attachment 801475 [details] [diff] [review]
patch

I'm not sure I follow the change to ScanIdent.  Is the idea that we want to act as though there's a double-backslash (i.e., an escaped backslash) when there's a backslash followed by EOF?  Those seem different, since one is an ident and one is a delim.  It also doesn't really seem like the right thing to do for backslash followed by EOF; I feel like the WG actually discussed this issue and resolved on something.  In fact, http://lists.w3.org/Archives/Public/www-style/2013Jun/0683.html says:

   RESOLVED: \[EOF] turns into U+FFFD except when inside a string, in which
             case it just gets dropped.

which avoids the problem.  And you implemented that in bug 880150 -- except I'm inclined to think that patch was incomplete since the aToken.mSymbol assignment to Peek() in that !GatherText() clause in ScanIdent seems wrong, since that means we're treating it as a backslash-delim rather than as dropping it (which I think would mean returning false, although that needs checking).

Also, it seems like ScanIdent should have an assertion inside that !GatherText() that aToken.mSymbol is a backslash.

Do you have tests for this in the variables patches?

Other than the issue of what to do in ScanIdent (and the lack of tests), I think this is fine, but I'd like to look at what you end up with there, so marking review-.
Attachment #801475 - Flags: review?(dbaron) → review-
(In reply to David Baron [:dbaron] (needinfo? me) from comment #3)
> I'm not sure I follow the change to ScanIdent.  Is the idea that we want to
> act as though there's a double-backslash (i.e., an escaped backslash) when
> there's a backslash followed by EOF?  Those seem different, since one is an
> ident and one is a delim.  It also doesn't really seem like the right thing
> to do for backslash followed by EOF; I feel like the WG actually discussed
> this issue and resolved on something.  In fact,
> http://lists.w3.org/Archives/Public/www-style/2013Jun/0683.html says:
> 
>    RESOLVED: \[EOF] turns into U+FFFD except when inside a string, in which
>              case it just gets dropped.
> 
> which avoids the problem.  And you implemented that in bug 880150 -- except
> I'm inclined to think that patch was incomplete since the aToken.mSymbol
> assignment to Peek() in that !GatherText() clause in ScanIdent seems wrong,
> since that means we're treating it as a backslash-delim rather than as
> dropping it (which I think would mean returning false, although that needs
> checking).

You're right that we shouldn't be setting eImpliedEOFCharacters_Backslash in there (and so I think there aren't any cases where we need eImpliedEOFCharacters_Backslash at all).  A lone backslash would have successfully be turned into U+FFFD by GatherEscape inside GatherText, which would have returned true.  GatherText fails only if there are no characters to consume (which won't be the case here, since we've already Peek()ed to determine we should call ScanIdent) or if it scans a backslash that does not begin a valid escape.  That's the case where we're assigning Peek() to aToken.mSymbol, so I think that part is correct.

> Also, it seems like ScanIdent should have an assertion inside that
> !GatherText() that aToken.mSymbol is a backslash.

Good idea.

> Do you have tests for this in the variables patches?

I thought so but apparently not.  I'll add some subtests that have backslashes just before EOF.

> Other than the issue of what to do in ScanIdent (and the lack of tests), I
> think this is fine, but I'd like to look at what you end up with there, so
> marking review-.

It's hard to test this patch in isolation, since it's only storing some internal state.  Take a look at the variables tests when I update them.
Thinking some more, wouldn't we need to imply a U+FFFD character after the backslash, to make sure that we correctly scan it as U+FFFD when followed by more characters?  For example if we had:

  <style>p { var-c:var(a)var(b); var-b:1; var-a:\</style>

we can't just insert /**/ between the backslash and the 1, as that would then re-scan as a delim followed by a number.  If we got the computed value of var-c I think it should be "\<U+FFFD>/**/1".
And then, when we are in a string, since need to drop the backslash, I guess we need to have another enum value to mean drop that character rather than add any additional characters?
Created attachment 816308 [details] [diff] [review]
patch (v1.1)

This renames ImpliedEOFCharacters to EOFCharacters, adding a new enum to mean "the final backslash should be dropped" and also one for "'fffd' must be appended".  EOFCharacters is now a bitfield, as the combinations of possible characters was getting high.  I moved CSSParserImpl::AppendImpliedEOFCharacters from a Variables patch to this patch, and put it on nsCSSScanner.
Attachment #801475 - Attachment is obsolete: true
Attachment #816308 - Flags: review?(dbaron)
For some tests that exercise this, see

  layout/style/test/test_css_supports_variables.html
  layout/style/test/test_variable_serialization_computed.html
  layout/style/test/test_variable_serialization_specified.html

in the "Part 26: Tests. (v1.1)" patch on bug 773296, which I will attach in a moment.
Comment on attachment 816308 [details] [diff] [review]
patch (v1.1)

r=dbaron
Attachment #816308 - Flags: review?(dbaron) → review+
https://hg.mozilla.org/mozilla-central/rev/957d85b31ff3
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28

Updated

4 years ago
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.