View/Page Source renders the page as windows-1252, even when auto-detect has identified Shift_JIS and the page itself is displayed in Shift_JIS as expected.

To get Shift_JIS display in View/Page Source, one of the following is required:
- In the page display window, change View/Character Encoding to something other than Shift_JIS, then change it back to Shift_JIS
  => Disk Cache, charset: is changed from windows-1252 to Shift_JIS
- In the View/Page Source window, change View/Character Encoding to Shift_JIS
  => Disk Cache, charset: is changed from windows-1252 to Shift_JIS

[Build ID]
> Build identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0b12pre) Gecko/20110220 Firefox/4.0b12pre

[Test Web page]
> http://bugzilla-attachments.mozilla.gr.jp/attachment.cgi?id=3762
Written in Shift_JIS. The meta tag is intentionally commented out to test auto-detect.
<!-- <meta http-equiv="Content-Type" content="...; charset=Shift_JIS"> -->

[NSPR log(all:5) for load of above Web page]
> http://bugzilla-attachments.mozilla.gr.jp/attachment.cgi?id=4122

[Extracted NSPR log lines to show test procedure]
> http://bugzilla-attachments.mozilla.gr.jp/attachment.cgi?id=4123

Auto-Detect = Japanese (Japanese is not mandatory; the result is the same with other choices)

Cache clear, initial load
  Page is shown in Shift_JIS.
  View/Character Encoding: Shift_JIS is shown.
  At this step, Disk Cache, charset: windows-1252 is set.
  => View/Page Source shows in windows-1252

Change the auto-detect language choice in View/Character Encoding/Auto-Detect, e.g. from Japanese to Chinese
  Page is still shown in Shift_JIS.
  View/Character Encoding: Shift_JIS is still shown.
  At this step, Disk Cache, charset: windows-1252 is still set.
  => View/Page Source shows in windows-1252

Change the charset choice in View/Character Encoding to UTF-8
  Page is shown in UTF-8.
  View/Character Encoding: UTF-8 is shown.
  At this step, Disk Cache, charset: UTF-8 is set.
  => View/Page Source shows in UTF-8

If this web page is loaded by Fx3, the "double HTTP GET" phenomenon is always observed.
The first HTTP GET is for the initial load. The second HTTP GET is caused by the charset change made by auto-detect.

Same problem as bug 597820? In this bug's case, the only additional resource for which an HTTP GET is issued is the favicon. Is the favicon a subresource of this page?

I don't think "to hit the network again for the loads we started before finding the <meta>" is required in this bug's case. I think "save the charset detected by auto-detect in Disk Cache" is sufficient.
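The symptom reported above can be reproduced outside the browser as a rough illustration: the same Shift_JIS bytes read correctly with one decoder and turn into mojibake with the other. The sample string is an assumption (it is not taken from the test page), and Python's `cp1252` codec is used as a stand-in for windows-1252.

```python
# Illustration only: the sample text is an assumption, not from the test page.
text = "日本語"
raw = text.encode("shift_jis")  # bytes as a Shift_JIS page would carry them

# What the page display does once auto-detect has picked Shift_JIS.
as_page = raw.decode("shift_jis")

# What View/Page Source does per the report: decode as windows-1252.
# errors="replace" keeps the sketch from raising on bytes cp1252 leaves undefined.
as_source = raw.decode("cp1252", errors="replace")

print(as_page)
print(as_source)
```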
Note: mozilla.gr.jp sends the test page with the following header, as bugzilla.mozilla.org does for an attachment of a bug whose MIME type has no charset specification.
> Content-Type: text/html; charset=; name="testcase.html"
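For reference, a minimal sketch of why that header gives the browser nothing to work with: the `charset` parameter is present but empty. The helper name `charset_from_content_type` is hypothetical, the parsing is deliberately naive, and treating an empty value the same as a missing one is an assumption of this sketch, not a statement about what Gecko does.

```python
def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None if absent or empty."""
    # Naive split on ';'. Good enough for this header, but not a full parser:
    # a ';' inside a quoted parameter value would be mis-handled.
    for part in value.split(";")[1:]:
        name, sep, val = part.partition("=")
        if sep and name.strip().lower() == "charset":
            val = val.strip().strip('"')
            return val or None  # assumption: empty charset counts as "not specified"
    return None


header = 'text/html; charset=; name="testcase.html"'
print(charset_from_content_type(header))
```

With no usable charset in the header and the meta tag commented out, only auto-detect is left to decide the encoding.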
Hmm. We almost certainly don't run the charset sniffer for the view-source load, right? Arguably, we should store the sniffed charset in the shentry....

> Same problem as bug 597820?

The double-get thing? No.
(In reply to comment #2)
> Hmm. We almost certainly don't run the charset sniffer for the view-source
> load, right?

I don't know what the current code does. For the new code, I was planning on treating view-source: GETs like http: POSTs: sniffing only the first 1024 bytes and not allowing reloads. Does that make sense as the plan going forward?

> Arguably, we should store the sniffed charset in the shentry....

I had expected the charset to go into the history entry already, but I hadn't tested. :-(
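A rough sketch of what "sniffing only the first 1024 bytes" could look like, to make the plan concrete. This is not the parser's actual code: the function name, the regular expressions, and the comment stripping are all simplifying assumptions, and real HTML prescan rules are more involved.

```python
import re

SNIFF_LIMIT = 1024  # per the plan above: only the first 1024 bytes are examined

# Simplified patterns (assumptions of this sketch, not the real prescan rules).
_COMMENT = re.compile(rb"<!--.*?-->", re.DOTALL)
_META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def sniff_meta_charset(body: bytes):
    """Return a charset named by a <meta> in the first SNIFF_LIMIT bytes, else None."""
    head = _COMMENT.sub(b"", body[:SNIFF_LIMIT])  # ignore commented-out markup
    match = _META_CHARSET.search(head)
    return match.group(1).decode("ascii") if match else None


# The test page's meta tag is commented out, so nothing is found.
commented = b'<!-- <meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS"> -->'
print(sniff_meta_charset(commented))
```

Under this plan a page like the test case yields no charset from the prescan, and with reloads disallowed there is no later correction, which is why the charset used for the original display matters.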
> Does that make sense as the plan going forward?

Yes, but imo we should be propagating the charset that was used to view the page to view-source, which will look like a channel charset to the parser...

> I had expected the charset to go into the history entry already

Doesn't seem to. I think the right fix to this bug is to store the charset in the shentry and propagate it to the view-source load.
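The proposed fix can be modelled in a few lines. This is only a sketch of the data flow under stated assumptions: `HistoryEntry`, `load_page`, and `view_source_charset` are hypothetical names standing in for the session history entry and the two loads, and none of them correspond to real Gecko interfaces.

```python
from dataclasses import dataclass
from typing import Optional

FALLBACK_CHARSET = "windows-1252"  # what view-source ends up with today, per the report


@dataclass
class HistoryEntry:
    """Hypothetical stand-in for a session history entry (shentry)."""
    url: str
    charset: Optional[str] = None


def load_page(url: str, sniffed_charset: Optional[str]) -> HistoryEntry:
    """Display load: record the charset that was actually used to show the page."""
    return HistoryEntry(url=url, charset=sniffed_charset or FALLBACK_CHARSET)


def view_source_charset(entry: HistoryEntry) -> str:
    """View-source load: reuse the entry's charset as if the channel had supplied it."""
    return entry.charset or FALLBACK_CHARSET


entry = load_page("http://example.invalid/testcase.html", "Shift_JIS")
print(view_source_charset(entry))
```

The point of the design is that the view-source load never has to sniff on its own: it inherits whatever the display load settled on.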