Bugzilla

Comment 5

•

23 years ago

Franck, could you attach the two testcases to the bug report?  Thanks.

Hixie (not reading bugmail)

Comment 6

•

23 years ago

Reassigned to ftang.  Franck, please attach the two testcases to the bug report 
and reassign the bug to me.

Related bugs are bug 66190 and bug 63502

Assignee: pierre → ftang

Target Milestone: Future → ---

Frank Tang

Reporter

Comment 7

•

23 years ago

pierre- just visit
1. http://ftang/ftang/css2/kanji/bug.xml
2. http://ftnag/ftang/css2/kanji/correct.xml

Assignee: ftang → pierre

Updated

•

23 years ago

Summary: loading stylesheet in xml by using <?xml-stylesheet do not listen to charset parameter → loading stylesheet in xml by using <?xml-stylesheet do not listen to charset parameter [IMPORT]

Comment 8

•

23 years ago

Frank, please attach the pages, as there are people outside of Netscape who might
be interested in this bug.

Comment 9

•

23 years ago

Attached file ftang's bug.xml

Comment 10

•

23 years ago

Attached file ftang's correct.xml

Comment 11

•

23 years ago

Attached file ftang's bug.css

Comment 12

•

23 years ago

Attached file ftang's correct.css

Comment 13

•

23 years ago

Boris: another charset bug... Do you want to take it?

Target Milestone: --- → mozilla1.0

Assignee

Comment 14

•

23 years ago

um.... Let me wrap up my other ones first.. I have no idea where to even start
on this one.  But I'll keep it in mind.  :)

Comment 15

•

23 years ago

Using build 2001100903 win32.

Neither testcase works. Is @charset also broken?

Asa Dotzler [:asa]

Updated

•

23 years ago

Blocks: 104166

Comment 16

•

23 years ago

Attached file Corrected test cases in .zip file format. Use these 4 files instead. @charset is working.

Comment 17

•

23 years ago

I think the files were converted when they were uploaded with a
non-Japanese browser encoding, or something like that. I zipped up the
original 4 files and attached them above. Unarchive the file with WinZip and
you should see @charset working with the correct.xml & correct.css files.

Keywords: nsbeta1

Comment 18

•

23 years ago

ok, with the zip attachment it works.

Updated

•

23 years ago

Summary: loading stylesheet in xml by using <?xml-stylesheet do not listen to charset parameter [IMPORT] → [charset]loading stylesheet in xml by using <?xml-stylesheet do not listen to charset parameter [IMPORT]

Assignee

Comment 19

•

23 years ago

Frank, I have the fix for bug 72658 in my tree, so the included testcase
worksforme (since the document charset and the stylesheet charset are the same).

I can disable that code while I work on this, but could you possibly create a
stylesheet in a _different_ charset from the document for testing and
verification purposes?

Comment 20

•

23 years ago

Attached file Additional test cases .zip file for this bug. Almost all the test cases have element names in UTF-8 or Shift_JIS Japanese. Read 'attachcomments.txt' for explanation. Some test cases with Japanese element names are included for Bug 72658 also.

Comment 21

•

23 years ago

Explanation of test cases from "attachcomments.txt" file:

The following test should be conducted with default browser encoding set to
Western (ISO-8859-1), Edit | Prefs | Navigator | Languages. auto-detection must
be OFF.

** For test cases 1-4: Element names are in Japanese: all XML & CSS files are in
Shift_JIS Japanese.

1. shiftjisA.xml/shiftjisa.css -- stylesheet charset; no @charset in .css file.
(Patch for 72658 or patch for this bug should work)

2. shiftjisB.xml/shiftjisb.css -- no stylesheet charset; @charset in .css file.
(should work now with no patches)

3. shiftjisC.xml/shiftjisc.css -- stylesheet charset; @charset in .css file.
(should work now with no patches)

4. shiftjisD.xml/shiftjisd.css -- no stylesheet charset; no @charset in .css
file. (Only the patch for 72658 can fix this problem.)


** All CSS files in the following tests are encoded in UTF-8. XML files are
either in Shift_JIS Japanese or UTF-8.

5. utf8a.xml/utf8a.css -- XML in Shift_JIS; stylesheet charset=UTF-8; no
@charset in .css file. (Color style works because the element names are in
ASCII. Character display is incorrect. Only Patch for this bug can fix the
latter problem.)

6. utf8b.xml/utf8b.css -- XML in Shift_JIS; stylesheet charset=UTF-8; no
@charset in .css file. (NO styling applied. Unlike 5, element names are UTF-8
Japanese in .css file. Only patch for this bug can fix it.)

7. utf8c.xml/utf8c.css -- XML in Shift_JIS; no stylesheet charset; @charset
exists in .css file. Element names in UTF-8 Japanese in .css file. (This should
work now without any patches)

8. utf8d.xml/utf8d.css -- XML doc in UTF-8 but no encoding declaration; no
stylesheet charset; no @charset in .css file. Element names in UTF-8 Japanese in
.css file. (Only the patch for 72658 can fix this problem.)

9. utf8e.xml/utf8e.css -- XML doc in UTF-8 but no encoding declaration; no
stylesheet charset; @charset=UTF-8 in .css file. Element names in UTF-8 Japanese
in .css file. (This should work now without any patches.)

Test cases 5 & 6 can be viewed correctly only with the fix for this
bug.

Test cases 4 & 8 can be viewed correctly only with the fix for Bug 72658.

Test case 1 can be fixed with the patch for this bug or Bug 72658.

** These test cases also show that Mozilla can handle non-ASCII element
names in CSS definitions. (IE6 cannot currently.) Mozilla can also
handle non-ASCII attribute names, values, and IDs in CSS definitions 
but these are not in the current test cases.
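
Test cases 1, 3, 5 and 6 above depend on the "stylesheet charset", that is, a charset pseudo-attribute on the <?xml-stylesheet ...?> processing instruction. As a rough illustration only, extracting that value from the instruction's data could look like the sketch below. The helper names (isSpace, pseudoAttribute) are hypothetical and this is not the Mozilla implementation; the exact processing instructions used in the attached test files are also assumed, not quoted.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: XML whitespace test.
static bool isSpace(char c) {
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// Hypothetical helper, not Mozilla code: returns the value of pseudo-attribute
// `name` from the data part of an xml-stylesheet processing instruction,
// e.g. href="a.css" type="text/css" charset="Shift_JIS".
std::optional<std::string> pseudoAttribute(std::string_view data,
                                           std::string_view name) {
    std::size_t pos = 0;
    while (pos < data.size()) {
        // Skip whitespace before the next pseudo-attribute name.
        while (pos < data.size() && isSpace(data[pos])) ++pos;
        const std::size_t nameStart = pos;
        while (pos < data.size() && data[pos] != '=' && !isSpace(data[pos])) ++pos;
        const std::string_view key = data.substr(nameStart, pos - nameStart);

        // Allow optional whitespace around '='.
        while (pos < data.size() && isSpace(data[pos])) ++pos;
        if (pos >= data.size() || data[pos] != '=') return std::nullopt;
        ++pos;
        while (pos < data.size() && isSpace(data[pos])) ++pos;

        // The value must be quoted with ' or ".
        if (pos >= data.size() || (data[pos] != '"' && data[pos] != '\''))
            return std::nullopt;
        const char quote = data[pos++];
        const std::size_t valueEnd = data.find(quote, pos);
        if (valueEnd == std::string_view::npos) return std::nullopt;

        if (key == name) return std::string(data.substr(pos, valueEnd - pos));
        pos = valueEnd + 1;
    }
    return std::nullopt;  // pseudo-attribute not present
}
```

For a case like shiftjisB.xml, which has no stylesheet charset, such a helper would return an empty optional and the loader would have to rely on @charset or on the document's own encoding.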

Assignee

Comment 22

•

23 years ago

Thanks for the testcases!

My build currently passes all of them except utf8d.xml/utf8d.css

At a guess, this is because the stylesheet is loaded _before_ we've done charset
sniffing on the XML document (I assume that's how we get the XML doc's charset).

In particular, we ask the document for its charset in that case and the document
tells us that it's in ISO-8859-1....

Assignee

Comment 23

•

23 years ago

Attached patch Proposed patch (works correctly on all of the attached testcases)

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 24

•

23 years ago

I'll take this one after all... :)

Patch fixes this and also bug 72658 and bug 83207

Assignee: pierre → bzbarsky

Keywords: patch, review

Comment 25

•

23 years ago

The comment below is not correct, since the default charset of an XML document
is UTF-8. I would advise deleting ", falling back to ISO-8859-1".

+    // NOTE: the SetCharset method will always get the preferred
+    // charset from the charset passed in unless it is the
+    // emptystring, which causes the default charset (that of the
+    // document, falling back to ISO-8859-1) to be set

Assignee

Comment 26

•

23 years ago

Hmm..  Perhaps I should clarify that to:

"that of the document, falling back to ISO-8859-1 if no document is present"

But that being said, would UTF-8 be a more reasonable fallback for the default
charset if we have absolutely no other way of getting it?

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

23 years ago

Blocks: 83207

Comment 27

•

23 years ago

The default charset for XML is UTF-8, I have no idea what the default charset
for CSS would be. See if the spec has anything to say. If not, I think
ISO-8859-1 is good for CSS.

Comment 28

•

23 years ago

according to http://www.w3.org/TR/REC-CSS2/syndata.html#q23 

<quote>
When a style sheet resides in a separate file, user agents must observe the
following priorities when determining a document's character encoding (from
highest priority to lowest):

1. An HTTP "charset" parameter in a "Content-Type" field.
2. The @charset at-rule.
3. Mechanisms of the language of the referencing document (e.g., in HTML, the
"charset" attribute of the LINK element).

</quote>

Assignee

Comment 29

•

23 years ago

Yes. And 

4.  Use the document's character encoding

What's the fallback in case all of 1-4 fail, though?  (yes, we _do_ have a case
in which this is necessary due to other issues that are sort of outside the
scope of this bug, imo).

Comment 30

•

23 years ago

Oops! Quoted the wrong part. Wanted to quote this part:
<quote>
For transmission and storage, these characters must be encoded by a character
encoding that supports the set of characters available in US-ASCII (e.g., ISO
8859-x, SHIFT JIS, etc.).
</quote>

It doesn't say what the default should be, though. I'm wondering whether it
should be the same as however Mozilla treats HTML pages.

Assignee

Comment 31

•

23 years ago

Ok.... tracing through the code, the _only_ time that we actually need that #5
fallback is when we are loading the agent sheets.  There are ways to restructure
the code that would make this fallback unnecessary, as I said.  Not going to do
it as part of this patch.

But our internal sheets are fine in ISO-8859-1. So we can just leave it at that.
So, with my proposed change to that comment, reviews?
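
The lookup order discussed in comments 28 to 31 (the three CSS2 priorities, then the document's encoding, then the ISO-8859-1 fallback for agent sheets) can be sketched roughly as follows. This is only an illustration of the ordering, not the code in the patch; CharsetHints and resolveSheetCharset are hypothetical names, and how each hint is obtained is left out.

```cpp
#include <initializer_list>
#include <optional>
#include <string>

// Hypothetical inputs gathered by a loader; an empty optional (or an empty
// string) means "not specified".
struct CharsetHints {
    std::optional<std::string> httpCharset;      // 1. HTTP Content-Type charset
    std::optional<std::string> atCharsetRule;    // 2. @charset rule in the sheet
    std::optional<std::string> linkCharset;      // 3. charset given by the referencing document
    std::optional<std::string> documentCharset;  // 4. the document's own encoding
};

// Returns the first hint that is present, highest priority first.
std::string resolveSheetCharset(const CharsetHints& hints) {
    for (const auto* candidate : {&hints.httpCharset, &hints.atCharsetRule,
                                  &hints.linkCharset, &hints.documentCharset}) {
        if (candidate->has_value() && !(*candidate)->empty()) return **candidate;
    }
    // 5. Last resort, as discussed for internal agent sheets (an assumption
    //    taken from this thread, not from the CSS2 text).
    return "ISO-8859-1";
}
```

With only documentCharset set to UTF-8 (the situation test case 8 is meant to exercise), this returns UTF-8; the ISO-8859-1 branch is reached only when no hint exists at all.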

Comment 32

•

23 years ago

> What's the fallback in case all of 1-4 fail, though?  
> (yes, we _do_ have a case in which this is necessary 
> due to other issues that are sort of outside the
> scope of this bug, imo).

I meant testcase #8 to prove that whatever document
encoding the browser has determined should propagate
into CSS files that are unlabeled (in terms of charset/encoding).
You should just check what the final document
encoding is and then use that for CSS, too.
My intent was that that encoding should be UTF-8, as
required by XML 1.0, NOT ISO-8859-1.

Assignee

Comment 33

•

23 years ago

Yep.  That's what my patch does.  The XML document was actually reporting its
own encoding incorrectly.  That's what my change to nsXMLDocument.cpp fixes.
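
To illustrate the point about utf8d.xml: an XML document without an encoding declaration should report UTF-8, not ISO-8859-1. A deliberately simplified sketch of that rule is below. It is not the nsXMLDocument.cpp change; xmlDeclaredCharset is a hypothetical name, and the sketch assumes an ASCII-compatible byte stream, ignoring byte order marks, UTF-16 and any HTTP label.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: returns the encoding named in the XML declaration at
// the start of `head`, or "UTF-8" when no encoding is declared.
std::string xmlDeclaredCharset(std::string_view head) {
    constexpr auto npos = std::string_view::npos;

    // An XML declaration starts with "<?xml" followed by whitespace, which
    // distinguishes it from e.g. "<?xml-stylesheet".
    const bool hasDecl =
        head.size() > 5 && head.substr(0, 5) == "<?xml" &&
        (head[5] == ' ' || head[5] == '\t' || head[5] == '\r' || head[5] == '\n');

    if (hasDecl) {
        // Only search inside the declaration itself.
        const std::string_view decl = head.substr(0, head.find("?>"));
        const std::size_t key = decl.find("encoding");
        if (key != npos) {
            const std::size_t open = decl.find_first_of("\"'", key);
            if (open != npos) {
                const std::size_t close = decl.find(decl[open], open + 1);
                if (close != npos)
                    return std::string(decl.substr(open + 1, close - open - 1));
            }
        }
    }
    return "UTF-8";  // default when nothing is declared
}
```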

Comment 34

•

23 years ago

Replace fprintf(stderr) with a debug macro.  We have a couple of other instances 
of 'stderr' in nsCSSLoader that need to be removed.

Rename parameters to OnStreamComplete() as "a-Uppercase" (i.e. aContext, 
aString...)

Why in CSSLoaderImpl::SetCharset() do you look for "@charset" in 
strStyleDataUndecoded instead of in aStyleSheetData directly?

Comment 35

•

23 years ago

Comment on attachment 53578
Proposed patch (works correctly on all of the attached testcases)

r=pierre with minor changes above

Attachment #53578 - Flags: review+

Assignee

Comment 36

•

23 years ago

Oops. The stderr was not meant to be in there at all.  Removed.  :)

Parameters renamed.

aStyleSheetData is a char*.  It used to be an nsString, but I've changed that...
basically, the creation of the nsString moved from OnStreamComplete to SetCharset().

bug 80106 will address further improvements to how we parse @charset; that's
what I plan to work on once this is done... 

Or did I misunderstand the comment?
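
For readers following the question about looking for "@charset" in the undecoded data: the rule has to be found in the raw bytes before the sheet can be decoded. A minimal sketch of such a check on a char* buffer is below. It is not the CSSLoaderImpl::SetCharset() code from the patch; sniffAtCharset is a hypothetical name, and the sketch assumes an ASCII-compatible encoding and the exact form @charset "name"; at the very start of the sheet.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical helper: looks for a leading  @charset "name";  rule in the
// raw, still-undecoded bytes of a stylesheet.
std::optional<std::string> sniffAtCharset(const char* data, std::size_t length) {
    static const char kPrefix[] = "@charset \"";
    const std::size_t prefixLen = sizeof(kPrefix) - 1;

    // The rule must be the very first thing in the sheet.
    if (length < prefixLen || std::memcmp(data, kPrefix, prefixLen) != 0)
        return std::nullopt;

    // The charset name runs up to the closing quote.
    const char* nameStart = data + prefixLen;
    const void* found = std::memchr(nameStart, '"', length - prefixLen);
    if (!found) return std::nullopt;
    const char* nameEnd = static_cast<const char*>(found);

    // Require the terminating ';' right after the closing quote.
    if (nameEnd + 1 >= data + length || nameEnd[1] != ';') return std::nullopt;

    return std::string(nameStart, nameEnd);
}
```

Working on the byte buffer directly avoids having to guess an encoding just to read the rule that names the encoding.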

Marc Attinasi

Comment 37

•

23 years ago

Comment on attachment 53578
Proposed patch (works correctly on all of the attached testcases)

sr=attinasi

Attachment #53578 - Flags: superreview+