The default bug view has changed. See this FAQ.

test_ident_escaping.html and test_parse_ident.html fail if Firefox is built on DBCS locales

RESOLVED FIXED in mozilla6

Status

()

Core
CSS Parsing and Computation
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: emk, Assigned: emk)

Tracking

Trunk
mozilla6
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Assignee)

Description

6 years ago
Microsoft C++ compiler doesn't support UTF-8 without BOM.
So nsCssScanner.cpp is treated as Shift_JIS source on Japanese locale.
Shift_JIS converter will eat second byte regardless the second byte is valid or not.
So lines 117, 119, 121 of nsCssScanner.cpp will be "commented out" because line terminators are eaten by last charcters in the previous lines.
https://hg.mozilla.org/mozilla-central/annotate/af907a37dcf0/layout/style/nsCSSScanner.cpp#l116
For example, 'ß' is encoded 'C3 9F' in UTF-8. '9F' is a valid first byte of Shift_JIS. so it eats next byte (0A). Therefore the next line is treated as if it is a part of the previous comment line.
This causes test_ident_escaping.html and test_parse_ident.html test failure.

I suppose this will also occur on Other DBCS locales.
(Assignee)

Comment 1

6 years ago
Created attachment 526606 [details] [diff] [review]
patch

Add dummy characters to prevent line terminators from being eaten.
Attachment #526606 - Flags: review?(dbaron)
(Assignee)

Updated

6 years ago
Summary: test_ident_escaping.html and test_parse_ident.html fails if Firefox is built on DBCS locales → test_ident_escaping.html and test_parse_ident.html fail if Firefox is built on DBCS locales
Comment on attachment 526606 [details] [diff] [review]
patch

I'd prefer if you remove the non-ASCII literals entirely, perhaps by replacing them with numbers like this:

// `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
   0,  XSI,XSI,XSI,XSI,XSI,XSI,SI, SI, SI, SI, SI, SI, SI, SI, SI,
// p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, 0,  0,  0,  0,  0,
// U+008*
   0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
// U+009*
   0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
// U+00A*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00B*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00C*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00D*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00E*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00F*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,

But r=dbaron either way.  Thanks for figuring this one out.  (Thankfully, I don't think this should affect our official builds -- only builds that developers build on Japanese (etc.) systems.)
Attachment #526606 - Flags: review?(dbaron) → review+
(Assignee)

Comment 3

6 years ago
Created attachment 526610 [details] [diff] [review]
patch

Added comments and static asserts
Attachment #526606 - Attachment is obsolete: true
Attachment #526610 - Flags: review?(dbaron)
(Assignee)

Comment 4

6 years ago
Sorry, mid-air collision occured. I'll fix a patch in accordance with review comment.
The PR_STATIC_ASSERT (and the removal of the explicit length) is a good idea -- keep that.
(Assignee)

Comment 6

6 years ago
Created attachment 526613 [details] [diff] [review]
patch for check in
Attachment #526610 - Attachment is obsolete: true
Attachment #526610 - Flags: review?(dbaron)
(Assignee)

Comment 7

6 years ago
(In reply to comment #2)
> (Thankfully, I
> don't think this should affect our official builds -- only builds that
> developers build on Japanese (etc.) systems.)
Right. It's the reason this bug was not caught by official automated test.
Keywords: checkin-needed

Updated

6 years ago
Assignee: nobody → VYV03354
Pushed in cedar:
http://hg.mozilla.org/projects/cedar/rev/aada30c4e0d3
Status: NEW → ASSIGNED
Keywords: checkin-needed
OS: Windows 7 → All
Hardware: x86 → All
Whiteboard: [fixed in cedar]
Version: unspecified → Trunk
(In reply to comment #7)
> (In reply to comment #2)
> > (Thankfully, I
> > don't think this should affect our official builds -- only builds that
> > developers build on Japanese (etc.) systems.)
> Right. It's the reason this bug was not caught by official automated test.

Well, locale-specific bugs that depend on the locale the user is *running* the build on (such as bug 646882 or bug 416581 comment 6) also aren't caught by our tests.
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/aada30c4e0d3
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in cedar]
Target Milestone: --- → mozilla6
It doesn't really matter here, but it would be nice to not have to worry about putting non-ASCII characters in comments.  Is there a way we could force MSVC++ to treat all our C++ source files as UTF-8 regardless of the OS locale in use?  (From inside the build system, presumably.)
(Assignee)

Comment 12

6 years ago
The only way is inserting a BOM which will make non-MS compilers unhappy :(
http://connect.microsoft.com/VisualStudio/feedback/details/341454/
You need to log in before you can comment on or make changes to this bug.