Closed Bug 59268 Opened 25 years ago Closed 24 years ago

Autodetection should not override charset from cache and bookmark

Categories

(Core :: Internationalization, defect, P3)

x86
Windows 2000
defect

Tracking

()

VERIFIED WORKSFORME

People

(Reporter: ezh, Assigned: shanjian)

References

()

Details

(Keywords: intl)

Attachments

(1 file)

1. Set autodetect to Russian.
2. Load the URL.
3. Hit Reload.
See the Estonian - Estoniana, English - aaaEnglish?
Assignee: nhotta → shanjian
The Russian autodetector is detecting this page as Cyrillic/Russian (IBM-866). The page is tagged with a <META> tag as windows-1250: <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1250">. Can someone verify that all the text on this page is in one encoding? Some of the alt text displays strangely (Latin1 garbage?) when viewed as Windows-1250, but appears to display better as Cyrillic (I cannot read Cyrillic, but it looks like Cyrillic characters).
Only the alt="" text is Cyrillic. The main page is 1250.
This could be related to bug 56689. Auto detect should not affect pages with META charset.
Keywords: intl
I just found a reproducible test case in which autodetection overrides the meta tag. Hopefully this bug is caused by the same problem. How it happens: if a page has previously been loaded in the right charset, that charset is stored in the cache. The next time you visit the page, the meta charset is found, but because it is the same charset as the one in the cache, no reload happens and the charset information is still marked as coming from the cache. If the charset detector then detects a different result, it takes precedence over the charset from the cache and thus causes the problem. Possible solution: in nsMetaCharsetObserver.cpp, if a meta charset is found, the webshell can be notified even though the charset is the same. But this solution will cause an additional reload unless we do something in the webshell.
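The sequence described in this comment can be sketched as a small model. This is an illustration only, not the Mozilla source: the struct, the function names, the numeric priority values and the name kCharsetFromMetaTag are assumptions. Only kCharsetFromCache and kCharsetFromAutoDetection appear in this bug.

```cpp
#include <cassert>
#include <string>

// Hypothetical priorities: a larger value outranks a smaller one.
// The numeric values and kCharsetFromMetaTag are assumptions for illustration.
enum CharsetSource {
    kCharsetFromCache = 1,
    kCharsetFromAutoDetection = 2,  // ranked above the cache, as in the reported behavior
    kCharsetFromMetaTag = 3
};

struct DocCharset {
    std::string charset;
    CharsetSource source;
};

// Models the meta observer as described: if the META charset equals the
// current charset, nothing is reported and the source stays "from cache".
void OnMetaCharset(DocCharset& doc, const std::string& charset) {
    if (charset == doc.charset) return;
    doc = DocCharset{charset, kCharsetFromMetaTag};
}

// Models the detector: its result is applied only when autodetection
// outranks the source currently recorded for the document.
void OnDetectorResult(DocCharset& doc, const std::string& charset) {
    assert(!charset.empty());
    if (kCharsetFromAutoDetection > doc.source) {
        doc = DocCharset{charset, kCharsetFromAutoDetection};
    }
}

// Revisit a page whose correct charset is already in the cache.
DocCharset SimulateRevisit() {
    DocCharset doc{"windows-1250", kCharsetFromCache};
    OnMetaCharset(doc, "windows-1250");  // same as cache: source is not upgraded
    OnDetectorResult(doc, "IBM-866");    // detector disagrees and wins
    return doc;
}
```

In this model the revisit ends with the detector's charset, because the matching META tag never raises the recorded source above "from cache".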
Status: NEW → ASSIGNED
A better solution to this problem might be to lower the priority of charset autodetection. If we put kCharsetFromAutoDetection after kCharsetFromCache and kCharsetFromBookmarks, the problem will be resolved. Suppose page P is loaded using autodetection, and autodetection finds charset C1. The user finds that C1 is not the right one and manually chooses C2. If we remember C2 in either the cache or a bookmark, we should not let autodetection override it next time. Frank, what do you think?
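A minimal sketch of the proposed ordering, assuming that a larger enum value means higher priority. The numeric values, the struct and the function names are hypothetical; only the three kCharsetFrom... names come from this bug.

```cpp
#include <cassert>
#include <string>

// Hypothetical ordering after the proposed change: autodetection ranks
// below both the cache and bookmarks. The numeric values are assumptions.
enum CharsetSource {
    kCharsetFromAutoDetection = 1,
    kCharsetFromCache = 2,
    kCharsetFromBookmarks = 3
};

struct PageCharset {
    std::string charset;
    CharsetSource source;
};

// Replace the current charset only if the candidate's source outranks it.
// Returns true when the override happened.
bool TryOverride(PageCharset& page, const std::string& charset, CharsetSource source) {
    assert(!charset.empty());
    if (source <= page.source) return false;
    page.charset = charset;
    page.source = source;
    return true;
}

// Page P: the user manually chose C2, which was remembered in the cache.
// On the next visit the detector proposes C1 again.
std::string RevisitAfterManualChoice() {
    PageCharset page{"C2", kCharsetFromCache};
    TryOverride(page, "C1", kCharsetFromAutoDetection);  // rejected: lower priority
    return page.charset;
}
```

With this ordering the remembered choice C2 survives the next visit, since the detector's result no longer outranks a charset that came from the cache or a bookmark.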
I think shanjian is right.
Attached patch: proposed patch
sr=erik
fix checked in, modified summary to better describe the bug.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Summary: Autodetect codepage works strange... → Autodetection should not override charset from cache and bookmark
Need to reopen this bug. After I adjusted the priority between autodetection and cache, reloading or changing the autodetection setting no longer affects the page encoding, which is a serious problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Changed QA contact to ylong@netscape.com.
QA Contact: teruko → ylong
If this is a "serious problem", can we address it for nsbeta1 and set the target to M0.9.1? Adding nsbeta1 keyword.
Keywords: nsbeta1
My original fix caused the serious problem, but it has been backed out. I will only work on this if I have time. For this reason, this might not be an nsbeta1 bug.
Marking as nsbeta1-.
Keywords: nsbeta1 → nsbeta1-
I could not reproduce the problem any more. Resolving it as worksforme. Reopen if you still see the problem. I would appreciate it if you could add more information about how to reproduce the problem or create a better test case.
Status: REOPENED → RESOLVED
Closed: 25 years ago → 24 years ago
Resolution: --- → WORKSFORME
I could not reproduce this in the 2001-05-24-06 build.
Status: RESOLVED → VERIFIED