View Source after document.write tries to show (wyciwyg) UTF-16 using the previous page's charset

VERIFIED FIXED in mozilla1.9alpha8

Status

()

VERIFIED FIXED
12 years ago
9 years ago

People

(Reporter: jruderman, Assigned: smontagu)

Tracking

(4 keywords)

Trunk
mozilla1.9alpha8
x86
All
intl, qawanted, regression, testcase
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 +
in-testsuite ?

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

12 years ago
Steps to reproduce:
1. Load the testcase in Firefox trunk.
2. Click the button.
3. View Source (using Cmd+U).

Expected: foo

Result: ÿþf?o?o? (where each "?" is a question-mark-in-diamond replacement character, and "ÿþ" is probably a UTF-16 byte-order-mark FFFE being interpreted as U+00FF U+00FE).

This regression appeared in two steps:
- between 2006-09-08 and 2006-11-04, the "ÿþ" appeared.
- between 2007-01-02 and 2007-01-29, the "?"s appeared.
(Reporter)

Comment 1

12 years ago
Created attachment 266143 [details]
testcase
(Reporter)

Comment 2

12 years ago
When I load the testcase from Bugzilla, I get this instead:

??f?o?o?

I guess when I load it locally, Firefox tries to treat the UTF-16 as ISO-8859-1, but when I load it from Bugzilla, Firefox tries to treat the UTF-16 as UTF-8.  Those are the encodings shown by Page Info for the testcase before I click the button.
Summary: View Source after document.write tries to show UTF-16 as something else (wyciwyg) → View Source after document.write tries to show (wyciwyg) UTF-16 using the previous page's charset
This isn't a problem in Seamonkey, but I can definitely reproduce it in Firefox.  So chances are, it's something in the Firefox view-source UI.  nsHTMLDocument sees a user-forced charset on the docinfo, which is UTF-8 in the testcase.

Of course having smaller regression ranges (with hours) might help narrow down the problem....
Keywords: qawanted
For what it's worth, the relevant callstack seems to be:

#0  nsDocumentCharsetInfo::SetForcedCharset (this=0x34b50430, aCharset=0x34b7b2d0) at ../../../../mozilla/intl/chardet/src/nsDocumentCharsetInfo.cpp:59
#1  0x17934559 in nsDocShell::SetCharset (this=0x34b64930, aCharset=0x34b7b290 "UTF-8") at ../../../mozilla/docshell/base/nsDocShell.cpp:1241
#2  0x0136c708 in NS_InvokeByIndex_P (that=0x34b649e0, methodIndex=4, paramCount=1, params=0xbfffab54) at ../../../../../../../mozilla/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_unixish_x86.cpp:179
#3  0x12a435f4 in XPCWrappedNative::CallMethod (ccx=@0xbfffad90, mode=CALL_SETTER) at ../../../../../mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2245

with the JS stack being:

(gdb) jsstack 
0 [native frame]
1 viewSource(url = "wyciwyg://0/https://bugzilla.mozilla.org/attachment.cgi?id=266143") ["chrome://global/content/viewSource.js":90]
    loadFromURL = true
    arg = "charset=UTF-8"
    arrayArgComponents = charset,UTF-8
    docCharset = [xpconnect wrapped (nsISupports, nsIDocShell, nsIInterfaceRequestor, nsIDocCharset) @ 0x34b0dc90 (native @ 0x34b64934)]
    PageLoader = undefined
    loadFlags = undefined
    viewSrcUrl = undefined
    wraplonglinesPrefValue = undefined
    this = [object ChromeWindow @ 0x3f6860a0 (native @ 0x3f67689c)]
2 onLoadViewSource() ["chrome://global/content/viewSource.js":32]
    this = [object ChromeWindow @ 0x3f6860a0 (native @ 0x3f67689c)]

The code in question was first added on 2006-09-19.  ccing patch author and reviewers.  I seriously doubt that we want to do what that patch did here, myself...  We should only set the override if the original page had the override set.  Otherwise we should set the default, imo

I have no idea what might have happened in your other regression range.

On current trunk, the characterSet of a wyciwyg document will have nothing to do with what it should be parsed as (which is UTF-16), but that changed very very recently, outside of both of your regression ranges.  It _does_ mean that wyciwyg view-source is broken because of this charset-forcing, however.
Blocks: 286365
Flags: in-testsuite?
Flags: blocking-firefox3?
Keywords: intl
(Reporter)

Updated

12 years ago
Duplicate of this bug: 384313

Updated

12 years ago
Flags: blocking-firefox3? → blocking-firefox3+
Target Milestone: --- → Firefox 3 beta1
(Reporter)

Updated

12 years ago
Duplicate of this bug: 358673

Comment 7

12 years ago
Confirmed with Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a6pre) Gecko/20070623 Minefield/3.0a6pre on the testcase.
OS: Mac OS X → All
(Assignee)

Updated

11 years ago
Assignee: nobody → smontagu
(Assignee)

Comment 8

11 years ago
Created attachment 272912 [details] [diff] [review]
Patch

This does what bz suggests in comment 4, which fixes the bug as described.

However, it doesn't help in the following scenario:
1. Load the testcase.
2. Change the encoding (to Windows-1252, Shift_JIS or whatever)
3. Click the button.
4. View Source.

Thoughts?
> 2. Change the encoding (to Windows-1252, Shift_JIS or whatever)

Hmmm.... mDocumentCharsetInfo persists across document loads, right?  That makes it sound, to me, like the wrong place to store this information.

For example bfcache. If you force a charset on some page, then go back, the docshell has the forced charset flag, but the previous page wasn't parsed with that forced charset, and neither should its source be when viewed.

In other words, what you really want to do is ask the document for both its charset and its charset source, and only force if the source was forced.

And we might need to reset the source when doing a document.open, of course.... Probably a good idea anyway.
(Assignee)

Comment 10

11 years ago
(In reply to comment #9)

> In other words, what you really want to do is ask the document for both its
> charset and its charset source, and only force if the source was forced.

Just to be sure that I've understood and am not making the wrong assumptions, do you mean adding characterSetSource (or isCharacterSetForced) to nsIDOMNSDocument?
No, we don't want to do that.  That would expose it to web script, which doesn't seem desirable.

We want to either make the boolean prop on the charset info talk to the currently loaded document or add this to nsIDOMWindowUtils or something.

Updated

11 years ago
Target Milestone: Firefox 3 M7 → Firefox 3 M8
(Assignee)

Comment 12

11 years ago
Created attachment 273547 [details] [diff] [review]
Patch v.2

What I'm really asking for here is r=Mano on the toolkit part and r+sr=bz on dom and content.
Attachment #272912 - Attachment is obsolete: true
Attachment #273547 - Flags: superreview?(bzbarsky)
Attachment #273547 - Flags: review?(mano)
Comment on attachment 273547 [details] [diff] [review]
Patch v.2

>Index: dom/public/idl/base/nsIDOMWindowUtils.idl
>+  readonly attribute boolean isForcedCharset;

Perhaps |docCharsetIsForced| or something?

>Index: dom/src/base/nsDOMWindowUtils.cpp
>+    if (doc) {
>+      if (doc->GetDocumentCharacterSetSource() >= kCharsetFromParentForced)
>+        *aIsForced = PR_TRUE;

*aIsForced = doc && doc->GetDocumentCharacterSetSource() >= kCharsetFromParentForced;

with those nits, the core code looks good.
Attachment #273547 - Flags: superreview?(bzbarsky) → superreview+
Comment on attachment 273547 [details] [diff] [review]
Patch v.2

r=mano on the toolkit bits.
Attachment #273547 - Flags: review?(mano) → review-
(Assignee)

Comment 16

11 years ago
Created attachment 274984 [details] [diff] [review]
Patch for checkin with bz's nits fixed

The core code in this patch just adds one attribute and associated getter to nsIDOMWindowUtils, and also resets the charset source of the document when doing a document.open.
Attachment #274984 - Flags: superreview+
Attachment #274984 - Flags: review+
Attachment #274984 - Flags: approval1.9?
(Assignee)

Updated

11 years ago
Depends on: 391631
(Assignee)

Comment 17

11 years ago
Comment on attachment 274984 [details] [diff] [review]
Patch for checkin with bz's nits fixed

core code was split off and approved in bug 391631
Attachment #274984 - Flags: approval1.9?
(Assignee)

Comment 18

11 years ago
Toolkit part checked in.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED

Comment 20

11 years ago
This could have been avoided if the isForcedCharset had been added to the end of the parameter list instead of the middle!
Product: Firefox → Toolkit
Does this bug need any other qa work, or can qawanted be removed?

Verified Fixed.

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.3a1pre) Gecko/20100113 Minefield/3.7a1pre (.NET CLR 3.5.30729) ID:20100113050725
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.