Closed
Bug 190302
Opened 23 years ago
Closed 22 years ago
any xsl transformations show wrong codepage, but draw normal page
Categories
(Core :: XSLT, enhancement)
RESOLVED
FIXED
People
(Reporter: andrew_v, Assigned: peterv)
Attachments
(1 file, 2 obsolete files)
3.91 KB,
patch
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021130
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021130
any xsl transformations show wrong codepage, but draw normal page
Reproducible: Always
Steps to Reproduce:
1. Do any XSL transformation with any encoding (the default value is "UTF-8")
2. Press Ctrl+I (Page Info) and look at the encoding shown: "Encoding= ISO-8859"
3. why?
Comment 1•23 years ago
Well, we should actually set the encoding to UTF-16, regardless of what the
stylesheet says, as that is what we do.
(Andrew, encoding is a serialisation issue, and Mozilla does not serialize its
output but generates content directly. Internally, all string data is encoded in
2 byte strings, which, AFAICT, is UCS2 (sp?), closest to UTF-16?)
I should look up our possibilities before rambling.
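To illustrate the point that an encoding only comes into play at serialization time, here is a minimal C++ sketch that turns one in-memory code point into UTF-8 and UTF-16BE bytes. The helper names `ToUtf8` and `ToUtf16BE` are hypothetical and are not Mozilla APIs; the sketch assumes a valid Unicode scalar value and does not reject surrogate code points.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: serialize one code point as UTF-8 bytes.
// Assumes cp is a valid Unicode scalar value (surrogates are not rejected).
std::vector<std::uint8_t> ToUtf8(char32_t cp) {
    assert(cp <= 0x10FFFF);
    std::vector<std::uint8_t> out;
    if (cp < 0x80) {
        out.push_back(static_cast<std::uint8_t>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<std::uint8_t>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// Hypothetical helper: serialize one code point as big-endian UTF-16 bytes.
std::vector<std::uint8_t> ToUtf16BE(char32_t cp) {
    assert(cp <= 0x10FFFF);
    std::vector<std::uint8_t> out;
    auto put = [&out](std::uint16_t unit) {
        out.push_back(static_cast<std::uint8_t>(unit >> 8));
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));
    };
    if (cp < 0x10000) {
        put(static_cast<std::uint16_t>(cp));
    } else {
        // Code points above the BMP become a surrogate pair.
        const char32_t v = cp - 0x10000;
        put(static_cast<std::uint16_t>(0xD800 | (v >> 10)));
        put(static_cast<std::uint16_t>(0xDC00 | (v & 0x3FF)));
    }
    return out;
}
```

The same character U+00E9 comes out as the bytes C3 A9 from `ToUtf8` and as 00 E9 from `ToUtf16BE`; the in-memory value is identical, only the serialized form differs.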
Severity: normal → enhancement
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Linux → All
Hardware: PC → All
Most of the time the codepage is pretty useless information, but in some cases
it is used as default encoding when loading linked resources such as
stylesheets. If that is done according to some spec we should allow control over
that even for XSLT generated pages.
However if that is done strictly because we want to be intelligent, then we
should probably set the codepage of the result document to be the same as that
of the source document.
Assignee | ||
Comment 3•22 years ago
Could be as simple as
mDocument->SetDocumentCharacterSet("UTF-16");
mDocument->SetDocumentCharacterSetSource(kCharsetFromOtherComponent);
There are 3 options though:
1) Use the stylesheet's hint
2) Use the source's codepage
3) Always use UTF-16
Right now I'm leaning to 1, but I'm not sure.
I think it mostly depends on if we indeed use it to load external resources such
as stylesheets. If we do then I think the most important thing is that we get
that part right, i.e. then we should use 2).
On the other hand, if the encoding in the document isn't used for anything other
than to show something in the pageinfo dialog, then 1) gets my vote since that is
what the author explicitly requests.
3) doesn't really make any sense to me since we're not using UTF-16. We're not
"really" using any encoding since encodings only exist in serialized documents,
which we of course don't have.
CC-ing bz to get his input on if/how the stylecode uses the document codepage.
Comment 5•22 years ago
The document charset is used in a few places:
1) When loading CSS sheets, it's used if there is no HTTP header listing the
   charset, no @charset rule in the sheet, and no charset attr on the element
or PI loading the sheet (in other words, 99% of the time).
2) When unescaping URIs (since the unescaped version of a URI is a byte array
which then needs to be converted to characters). This case is quite evil,
since we have no real way to tell whether a given URI string is coming from
the original XML document the XSLT is applied to or from the XSL sheet
itself. I'd use one or the other for the encoding there and maybe have an
NS_WARNING when they don't match.
3) When saving the document as "web page, complete" (serialization issue); in
fact when performing any sort of serialization to byte-stream.
4) Same as for stylesheets but for scripts loaded via <script src=""> (used if
there is no HTTP header, no charset attr on the loading element, and no
BOM).
There are probably other uses that I can't think of at the moment.
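The fallback described in point 1 can be sketched roughly as follows. This is a minimal illustration, not the style loader's actual code: `SheetCharsetHints` and `PickSheetCharset` are hypothetical names, and the relative order of the first three checks is an assumption. The only thing taken from the list above is that the document charset is used when none of the other hints is present.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical inputs for a single stylesheet load; none of these are Gecko types.
struct SheetCharsetHints {
    std::optional<std::string> httpHeader;  // charset listed in the HTTP header
    std::optional<std::string> atCharset;   // @charset rule inside the sheet
    std::optional<std::string> loaderAttr;  // charset attr on the element or PI
};

// Returns the first hint that is present, falling back to the document charset.
// The precedence among the three hints is assumed for the sake of the example.
std::string PickSheetCharset(const SheetCharsetHints& hints,
                             const std::string& documentCharset) {
    assert(!documentCharset.empty());  // a document always has some charset
    if (hints.httpHeader) return *hints.httpHeader;
    if (hints.atCharset) return *hints.atCharset;
    if (hints.loaderAttr) return *hints.loaderAttr;
    return documentCharset;
}
```

With no hints at all, the result is whatever charset the document carries, which is why the charset set on an XSLT result document matters for linked sheets and scripts.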
Summary: any xsl transformaions show wrong codepage, but draw normal page → any xsl transformations show wrong codepage, but draw normal page
Point three in comment 5 convinces me that we should use the encoding specified
in the stylesheet if one is specified.
But from the other points I think we should use either the stylesheet's or the
document's encoding as the default. I can't really say I have any arguments for
whether the document's or the stylesheet's encoding is more important.
Comment 7•22 years ago
Imo, if the sheet specifies an encoding we should just use that. If the author
is confused, that's the author's problem. This gives authors the freedom to not
specify an encoding if they want us to "just do something".
Blocks: 220687
This patch makes us use the encoding specified in the <xsl:output> element, or
fall back to the source document's encoding if none is specified.
We really should refactor the code in
txMozillaTextOutput::createResultDocument and
txMozillaXMLOutput::createResultDocument into a common function, but that's for
another bug.
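The rule the patch implements can be sketched in a few lines. This is only an illustration of the decision, not the attached patch: `ResultDocumentCharset` is a hypothetical helper, and treating an empty encoding string the same as a missing one is an assumption.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: choose the charset to record on the XSLT result document.
// outputEncoding models the encoding given on <xsl:output>, if any.
std::string ResultDocumentCharset(const std::optional<std::string>& outputEncoding,
                                  const std::string& sourceCharset) {
    assert(!sourceCharset.empty());  // the source document always has a charset
    // Assumption: an empty encoding value counts as "not specified".
    if (outputEncoding && !outputEncoding->empty()) {
        return *outputEncoding;
    }
    return sourceCharset;
}
```

For example, a stylesheet that requests "KOI8-R" yields "KOI8-R" regardless of the source, while a stylesheet with no encoding applied to a "windows-1251" source yields "windows-1251".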
Attachment #133706 -
Flags: superreview?(peterv)
Attachment #133706 -
Flags: review?(axel)
Comment 9•22 years ago
Comment on attachment 133706
patch to fix
<Pike> like, three Sets and one Get
<sicking> doh!
could you move the charset up in txMozillaXMLOutput, so that the call to
ResetWithSource is part of the patch? That makes it easier to grok.
Attachment #133706 -
Flags: review?(axel) → review-
Assignee | ||
Updated•22 years ago
Attachment #133706 -
Flags: superreview?(peterv)
Attachment #133706 -
Attachment is obsolete: true
Attachment #133776 -
Flags: superreview?(peterv)
Attachment #133776 -
Flags: review?(axel)
Assignee | ||
Updated•22 years ago
Attachment #133776 -
Flags: superreview?(peterv) → superreview+
Updated•22 years ago
Attachment #133776 -
Flags: review?(axel) → review+
checked in
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Attachment #133776 -
Attachment is obsolete: true