Closed Bug 650653 Opened 14 years ago Closed 14 years ago

test_ident_escaping.html and test_parse_ident.html fail if Firefox is built on DBCS locales

Categories

(Core :: CSS Parsing and Computation, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla6

People

(Reporter: emk, Assigned: emk)

Details

Attachments

(1 file, 2 obsolete files)

Microsoft C++ compiler doesn't support UTF-8 without BOM.
So nsCssScanner.cpp is treated as Shift_JIS source on Japanese locale.
Shift_JIS converter will eat second byte regardless the second byte is valid or not.
So lines 117, 119, 121 of nsCssScanner.cpp will be "commented out" because line terminators are eaten by last charcters in the previous lines.
https://hg.mozilla.org/mozilla-central/annotate/af907a37dcf0/layout/style/nsCSSScanner.cpp#l116
For example, 'ß' is encoded 'C3 9F' in UTF-8. '9F' is a valid first byte of Shift_JIS. so it eats next byte (0A). Therefore the next line is treated as if it is a part of the previous comment line.
This causes test_ident_escaping.html and test_parse_ident.html test failure.

I suppose this will also occur on Other DBCS locales.
Attached patch patch (obsolete) — Splinter Review
Add dummy characters to prevent line terminators from being eaten.
Attachment #526606 - Flags: review?(dbaron)
Summary: test_ident_escaping.html and test_parse_ident.html fails if Firefox is built on DBCS locales → test_ident_escaping.html and test_parse_ident.html fail if Firefox is built on DBCS locales
Comment on attachment 526606 [details] [diff] [review]
patch

I'd prefer if you remove the non-ASCII literals entirely, perhaps by replacing them with numbers like this:

// `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
   0,  XSI,XSI,XSI,XSI,XSI,XSI,SI, SI, SI, SI, SI, SI, SI, SI, SI,
// p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, 0,  0,  0,  0,  0,
// U+008*
   0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
// U+009*
   0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
// U+00A*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00B*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00C*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00D*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00E*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,
// U+00F*
   SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI, SI,

But r=dbaron either way.  Thanks for figuring this one out.  (Thankfully, I don't think this should affect our official builds -- only builds that developers build on Japanese (etc.) systems.)
Attachment #526606 - Flags: review?(dbaron) → review+
Attached patch patch (obsolete) — Splinter Review
Added comments and static asserts
Attachment #526606 - Attachment is obsolete: true
Attachment #526610 - Flags: review?(dbaron)
Sorry, mid-air collision occured. I'll fix a patch in accordance with review comment.
The PR_STATIC_ASSERT (and the removal of the explicit length) is a good idea -- keep that.
Attachment #526610 - Attachment is obsolete: true
Attachment #526610 - Flags: review?(dbaron)
(In reply to comment #2)
> (Thankfully, I
> don't think this should affect our official builds -- only builds that
> developers build on Japanese (etc.) systems.)
Right. It's the reason this bug was not caught by official automated test.
Keywords: checkin-needed
Assignee: nobody → VYV03354
Pushed in cedar:
http://hg.mozilla.org/projects/cedar/rev/aada30c4e0d3
Status: NEW → ASSIGNED
Keywords: checkin-needed
OS: Windows 7 → All
Hardware: x86 → All
Whiteboard: [fixed in cedar]
Version: unspecified → Trunk
(In reply to comment #7)
> (In reply to comment #2)
> > (Thankfully, I
> > don't think this should affect our official builds -- only builds that
> > developers build on Japanese (etc.) systems.)
> Right. It's the reason this bug was not caught by official automated test.

Well, locale-specific bugs that depend on the locale the user is *running* the build on (such as bug 646882 or bug 416581 comment 6) also aren't caught by our tests.
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/aada30c4e0d3
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in cedar]
Target Milestone: --- → mozilla6
It doesn't really matter here, but it would be nice to not have to worry about putting non-ASCII characters in comments.  Is there a way we could force MSVC++ to treat all our C++ source files as UTF-8 regardless of the OS locale in use?  (From inside the build system, presumably.)
The only way is inserting a BOM which will make non-MS compilers unhappy :(
http://connect.microsoft.com/VisualStudio/feedback/details/341454/
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: