Closed
Bug 162239
Opened 22 years ago
Closed 22 years ago
POST document could not inherit charset from previous page if the previous charset is from autodetection.
Categories
(Core :: Internationalization, defect, P1)
Core
Internationalization
Tracking
()
VERIFIED
FIXED
mozilla1.0.1
People
(Reporter: kazhik, Assigned: shanjian)
References
Details
(Keywords: intl, regression, topembed, Whiteboard: [adt1 RTM] [ETA 08/21])
Attachments
(1 file, 5 obsolete files)
10.26 KB,
patch
|
jst
:
review+
darin.moz
:
superreview+
chofmann
:
approval+
|
Details | Diff | Splinter Review |
Auto-detection doesn't work for POST document.
Steps to reproduce:
(1) Set Default Character Coding to EUC-JP or ISO-2022-JP
and turn on Japanese auto-detection.
(2) Access to <http://www.jra.go.jp/> and select "HTML".
(3) Push one of the five button in the left, "Shutsuba-hyo",
"Kyousou-seiseki"...
Actual result: The next page is displayed as garbage.
Expected result: Auto-detection should work.
Original report in Bugzilla-jp:
http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=2500
Comment 1•22 years ago
|
||
This is a regression and a serious bug for Non-English users.
It seems that this problem has arisen after 1.1 Beta.
This should be All/All because we confirm this on Windows, MacOS,
Linux, FreeBSD.
When there is no charset in HTTP header nor <meta> charset,
Auto-detect should work even if it is the case of POST.
Comment 2•22 years ago
|
||
Confirm it happens on all platforms, N7.0PR1 doesn't has this problem, and base
on the comment above, the regression is after mozilla 1.1 Beta.
Nominate as nsbeta1.
Comment 3•22 years ago
|
||
In fact, this regression is between 06-12 and 06-13 branchm which is I guess
caused by fixed of bug 143579/bug 144897.
And there is another bug 156824, I think it's the same problem.
Assignee | ||
Comment 4•22 years ago
|
||
I talked to yuying, and here is my understanding. We really need to disable
autodetection for posted page, that is very important and we won't back out that
change. The problem here is that the posted page does not inherit the charset
from its submittee. I'll take a look to see what can we do.
Assignee: yokoyama → shanjian
Summary: Auto-detection doesn't work for POST document → POST document could not inherit charset from previous page if the previous charset is from autodetection.
Comment 5•22 years ago
|
||
This should be evluated as a show stopper for MachV, and major embedors. Marking
it as [ADT1 RTM].
Comment 6•22 years ago
|
||
The current summary was changed to:
POST document could not inherit charset from previous page if the previous
charset is from autodetection.
Actually, it's not only for the previous page charset is from auto-detection but
also previous page with charset meta-tag while the submission result page has no
meta-charset, see bug 156824.
Updated•22 years ago
|
Whiteboard: [adt1 RTM] [ETA Needed] → [adt1 RTM] [ETA 08/20]
Assignee | ||
Comment 7•22 years ago
|
||
Assignee | ||
Comment 8•22 years ago
|
||
Explaination to the problem and fix:
For POST document, charset autodetection is disabled and charset of submitting
document should be used. The only charset we carry over from old doc to new doc
is default charset in the existing code. Clearly it is not a good idea to change
default charset, otherwise it will affect all pages loaded after that point. We
have hintCharset in nsIMarkupDocumentViewer which is not used for anything right
now. My solution is set this hint charset based on submitting document, and
utilized this hint during the new document load.
Status: NEW → ASSIGNED
Comment 9•22 years ago
|
||
Comment on attachment 95909 [details] [diff] [review]
patch
sr=jst
Attachment #95909 -
Flags: superreview+
Comment 10•22 years ago
|
||
This is a pretty bad regression. We should definitely mark this one as a show
stopper.
Updated•22 years ago
|
Keywords: mozilla1.0.1,
mozilla1.2
Assignee | ||
Comment 11•22 years ago
|
||
Assignee | ||
Comment 12•22 years ago
|
||
In the new patch, I lower the priority of inherited charset to meta charset and
http header charset. We already have some code there to prevent meta charset
from reloading a POST page. This way we can still honor meta charset and http
header charset if it is different from form document.
Comment 13•22 years ago
|
||
Comment on attachment 95960 [details] [diff] [review]
update patch which lower the priority of hint charset.
r=jkeiser, but two comments:
(a) please move nsCOMPtr<nsIDocument> document; into the if() right before you
use it (to reduce its scope)
(b) you don't need to declare hintCharset anymore.
This patch looks good :)
Attachment #95960 -
Flags: review+
Assignee | ||
Comment 14•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #95909 -
Attachment is obsolete: true
Attachment #95960 -
Attachment is obsolete: true
Assignee | ||
Comment 15•22 years ago
|
||
Comment on attachment 95969 [details] [diff] [review]
update take john's comment.
carry over review.
Attachment #95969 -
Flags: superreview+
Attachment #95969 -
Flags: review+
Assignee | ||
Updated•22 years ago
|
Attachment #95969 -
Flags: superreview+
Updated•22 years ago
|
Attachment #95969 -
Flags: superreview+
Comment 16•22 years ago
|
||
Comment on attachment 95969 [details] [diff] [review]
update take john's comment.
sr=dveditz
Comment 17•22 years ago
|
||
Comment on attachment 95969 [details] [diff] [review]
update take john's comment.
A few nits:
- In nsDocShell.cpp:
+ if (document) {
+ nsAutoString docCharset;
+ document->GetDocumentCharacterSet(docCharset);
...
This file uses 4-space indentation, so use 4 spaces here...
- In nsHTMLDocument.cpp:
@@ -955,7 +970,15 @@
// TryHintCharset and TryParentCharset. It should be always safe to try more
// sources.
if (! TryUserForcedCharset(muCV, dcInfo, charsetSource, charset)) {
- TryHintCharset(muCV, charsetSource, charset);
+ //check if current doc is from POST command
Add a space after the '//' to match other comments here.
sr=jst
Assignee | ||
Comment 18•22 years ago
|
||
A problem was found with the current approach. The use of hintCharset conflicts
with charset autodetection. I need to figure out a new patch.
Assignee | ||
Updated•22 years ago
|
Attachment #95969 -
Attachment is obsolete: true
Attachment #95969 -
Flags: superreview+
Attachment #95969 -
Flags: review+
Assignee | ||
Comment 19•22 years ago
|
||
Assignee | ||
Comment 20•22 years ago
|
||
Comment on attachment 96042 [details] [diff] [review]
new patch with a new defined charset field in document viewer
One more test, this patch does not seems to work in some situation.
Attachment #96042 -
Attachment is obsolete: true
Comment 21•22 years ago
|
||
In which situations does the patch not work? Pls elaborate ...
Whiteboard: [adt1 RTM] [ETA 08/20] → [adt1 RTM] [ETA 08/21]
Assignee | ||
Comment 22•22 years ago
|
||
Assignee | ||
Comment 23•22 years ago
|
||
about the patch:
PrevDocCharacterSet is added to document viewer, just as other charset source,
such as forcedCharset, hintCharset and defaultCharset. This field is set when a
document get a new charset. The charset is passed around. For POST document,
this information will be referenced.
Most of the charset is set in "nsHTMLDocument::StartDocumentLoad". There is one
exception. Parser may set charset if it found meta tag charset. We need to
record the charset in that case. That is done in
"HTMLContentSink::SetDocumentCharset". Parser notified the nsDocument about its
charset change in the same place.
Comment 24•22 years ago
|
||
Comment on attachment 96056 [details] [diff] [review]
this patch should work in all situation.
- In nsIMarkupDocumentViewer.idl:
*/
attribute PRInt32 hintCharacterSetSource;
+ /*
+ character set from prev document
+ */
+ attribute wstring prevDocCharacterSet;
+
Please fix this indentation mess, there are tabs all over this file, so be
careful.
sr=jst
Attachment #96056 -
Flags: superreview+
Assignee | ||
Comment 25•22 years ago
|
||
yuying found a problem in the test build.
Still investigating.
Assignee | ||
Comment 26•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #96056 -
Attachment is obsolete: true
Attachment #96056 -
Flags: superreview+
Comment 27•22 years ago
|
||
Comment on attachment 96093 [details] [diff] [review]
add sink charset updated in charset detector
>Index: docshell/base/nsIMarkupDocumentViewer.idl
>+ /*
>+ character set from prev document
>+ */
>+ attribute wstring prevDocCharacterSet;
>+
> //void GetCharacterSetHint(in wstring hintCharset, in PRInt32 charsetSource);
string-fu nit: you should use AString instead of wstring here. also please fix
the indentation and remove commented out function.
i haven't verified the logic of the patch itself. someone more familar with
the
parser, docshell, and the content sink should really do that.
Comment 28•22 years ago
|
||
Comment on attachment 96093 [details] [diff] [review]
add sink charset updated in charset detector
r/sr=darin
i spoke to shanjian via AIM, and he told me that he's fixed the indentation
problem, and we decided to hold off on converting the interface from wstring to
AString (since many other methods/attributes already use wstring). plus, it
keeps things simpler, which is a good thing for the 1.0 branch.
Attachment #96093 -
Flags: superreview+
Comment 29•22 years ago
|
||
Comment on attachment 96093 [details] [diff] [review]
add sink charset updated in charset detector
r/sr=jst
Attachment #96093 -
Flags: review+
Comment 30•22 years ago
|
||
adt1.0.1+ (on ADT's behalf) approval for checkin to the 1.0 branch, pending
Drivers' approval. pls check this in asap, then replace "mozilla1.0.1+" with the
"fixed1.0.1" keyword.
Keywords: adt1.0.1+
Comment 31•22 years ago
|
||
Pls Note: Teri Preston in QA tested a private build of the 1.0 brnach with this
fix, and verified there were "No new form control regressions found on this
build." Yuying Long, the assigned QA conatct, also verified on a private build
that patch works as designed, and found no regressions.
Comment 32•22 years ago
|
||
I carefully look at the patch and go over with shanjian about the logic. It
looks like his patch make sense and are right. Cannot think about other issue
now. We should land it.
Updated•22 years ago
|
Keywords: mozilla1.0.1 → mozilla1.0.1+
Comment 33•22 years ago
|
||
Comment on attachment 96093 [details] [diff] [review]
add sink charset updated in charset detector
a=chofmann for 1.0.1
Attachment #96093 -
Flags: approval+
Comment 34•22 years ago
|
||
we need to test the heck out of this one in branch builds as soon as this lands.
Assignee | ||
Comment 35•22 years ago
|
||
fix checked into 1.0 branch.
Keywords: mozilla1.0.1+ → fixed1.0.1
Assignee | ||
Comment 36•22 years ago
|
||
Yuying's test report:
I have verified this build with bug 162239, the results are good. Here are the
details:
1. Non charset meta-tag in page before submission, auto-detector works proper
with different options (Universal, Japanese...), then POST, get a display
correct page which is carry over from previous page:
e.g. http://www.jra.go.jp
2. With charset meta-tag in page before submission, and no charset meta-tag in
after form submission posted page, will carry over from previous page also:
e.g. http://babel/tests/browser/form/gb2312_simpleform.html
3. Before/After form submission has different charset, the submission result
page will take it's own page charset instead of previous page charset, the
purpose for test this is there might be some western pages would possible has
two different western charsets between before and after submission.
http://babel/tests/browser/form/gb2312_shiftJis.html
4. There is no change with a related method GET in a form, auto-detect will
keep work with this method. While POST doesn't enable the auto-detection which
is what we intentionally to do.
5. Some other related scenarios outside a form with auto-detection, charset
carry over through links, different language auto-detect options, different
language encoding settings...etc. No problem was found so far.
I'm comfortable with current results - the page display, marked charset, reload
condition...etc.., even though I found a separate problem (not critical) today
that has been there for a long time, and I'm going to file it as a new bug.
Assignee | ||
Comment 37•22 years ago
|
||
This patch change the way how charset is resolved for POST document. It should
have no impact on non-POST pages. When verify this bug, except the general
regression, please pay special attent to how the submitting docment get its
charset from. The POST doc's charset shouldn't be affected.
Comment 38•22 years ago
|
||
ylong: pls verify this as fixed on the 1.0 branch tomorrow. thanks!
Comment 39•22 years ago
|
||
Verified fixed in 08-21 branch build on all platforms. The result is same as in
comment #36.
Keywords: fixed1.0.1 → verified1.0.1
Assignee | ||
Comment 40•22 years ago
|
||
fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 41•22 years ago
|
||
Fixed verified in 09-12 trunk build / WinXP-SC.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•