Closed Bug 655434 Opened 13 years ago Closed 3 years ago

Encoding auto-detection does not work for plain texts

Categories

(Core :: DOM: HTML Parser, defect, P5)

2.0 Branch
x86
All
defect

Tracking

RESOLVED FIXED
Tracking Status
firefox5 - wontfix
firefox6 - wontfix

People

(Reporter: mihailjp, Unassigned)

Details

(Keywords: regression, testcase)

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1

When I opened a Japanese-language plain text file encoded in Shift-JIS in a new tab, the encoding was not switched from the default (UTF-8 in my profile), even though the auto-detection feature was enabled for Japanese or universal. A lot of question marks were shown instead of the correct text.
This only happens in 4.0.x on Windows. On Linux, or in version 3.6.x, it works well. It happens with many non-English text files, but for some reason some text files are not affected by this bug.

Reproducible: Always

Steps to Reproduce:
1. Open a new tab.
2. Make sure that the character encoding auto-detection feature is enabled for the correct language.
3. Make sure that the current encoding is NOT the correct one for the plain text file which you will open in the next step.
4. Open a plain text file written in a non-English language (i.e. containing any characters other than 7-bit ASCII ones).

Actual Results:  
The text file is shown with the default or previously-selected wrong character encoding (UTF-8).

Expected Results:  
The text file should be rendered with the correct character encoding (Shift-JIS).
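
The symptom can be reproduced outside the browser. The following is a minimal Python sketch, assuming a hypothetical sample string (it is not taken from the attached files): Shift_JIS bytes decoded as UTF-8 yield U+FFFD replacement characters, which is presumably what appears as question marks in the tab.

```python
# Illustrative only: this is not Firefox code, just the same byte-level effect.
SAMPLE_TEXT = "日本語のテキスト"  # hypothetical sample, not from the attachments


def decode_with(data: bytes, encoding: str) -> str:
    """Decode bytes, substituting U+FFFD for sequences that are invalid."""
    return data.decode(encoding, errors="replace")


# Bytes as they would be stored in a Shift_JIS-encoded plain text file.
shift_jis_bytes = SAMPLE_TEXT.encode("shift_jis")

# Wrong encoding (the default in the reporter's profile) vs. the correct one.
wrong = decode_with(shift_jis_bytes, "utf-8")
right = decode_with(shift_jis_bytes, "shift_jis")

print("as UTF-8:    ", wrong)
print("as Shift_JIS:", right)
```

Only the second decode round-trips to the original text. The first one contains replacement characters because the Shift_JIS lead bytes are not valid UTF-8.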
Attached file EUC-JP
Attached file Shift_JIS
Attached file UTF-8
Confirmed on 
http://hg.mozilla.org/mozilla-central/rev/88fdbd974f82
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a1) Gecko/20110506 Firefox/6.0a1 ID:20110506030557

Works:
http://hg.mozilla.org/mozilla-central/rev/83c887dff0da
Mozilla/5.0 (Windows; U; Windows NT 6.1; WOW64; en-US; rv:1.9.3a5pre) Gecko/20100503 Minefield/3.7a5pre ID:20100503040502
Fails:
http://hg.mozilla.org/mozilla-central/rev/3a7920df7580
Mozilla/5.0 (Windows; U; Windows NT 6.1; WOW64; en-US; rv:1.9.3a5pre) Gecko/20100503 Minefield/3.7a5pre ID:20100503105056
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=83c887dff0da&tochange=3a7920df7580
Triggered by:
358113b3642e	Henri Sivonen — Bug 373864 - Enable the HTML5 parser by default. r+sr=jst.

It works if HTML5 is disabled.
Status: UNCONFIRMED → NEW
Component: General → HTML: Parser
Ever confirmed: true
OS: Windows 7 → All
Product: Firefox → Core
QA Contact: general → parser
Version: unspecified → 2.0 Branch
Keywords: regression
[STR]
1. Enable the auto-detection feature for Japanese or universal.
2. Save the attached files to local disk.
3. Open them in a new tab.

[Actual]
Auto-detection fails for all of the attached files.

[Expected]
Auto-detection should succeed, as it does in Firefox 3.6.
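
For illustration only, here is a naive trial-decoding sketch in Python that picks the first candidate encoding that decodes the bytes without error. This is not how Gecko's detector works. The function name, the candidate order, and the sample text are all assumptions. It only shows what a successful detection result would look like for three files encoded like the attachments.

```python
# Hypothetical sample text; the real attachments are not reproduced here.
SAMPLE = "日本語のテキスト"

# Assumed candidate order: UTF-8 is tried first because its validity
# rules reject most byte sequences produced by the legacy encodings.
CANDIDATES = ("utf-8", "euc_jp", "shift_jis")


def guess_encoding(data: bytes, candidates=CANDIDATES):
    """Return the first candidate that decodes data without error, else None."""
    for name in candidates:
        try:
            data.decode(name)  # strict mode raises on invalid sequences
        except UnicodeDecodeError:
            continue
        return name
    return None


# Encode the sample three ways and check what the sketch reports for each.
for name in CANDIDATES:
    print(name, "->", guess_encoding(SAMPLE.encode(name)))
```

Trial decoding is ambiguous in general, since the same bytes can be valid in more than one encoding, so the result depends on the candidate order. Real detectors typically also weigh how plausible the decoded text is.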
Regression window with HTML5 force-enabled:
user_pref("html5.enable", true);
user_pref("html5.parser.enable", true);

Works:
http://hg.mozilla.org/mozilla-central/rev/4be7f43c1de3
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a1pre) Gecko/20100121 Minefield/3.7a1pre ID:20100121202344
Fails:
http://hg.mozilla.org/mozilla-central/rev/97745a2b2de9
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a1pre) Gecko/20100122 Minefield/3.7a1pre ID:20100122045549
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4be7f43c1de3&tochange=97745a2b2de9
In local build:
build from d6bfc339e6e8 : fails
build from 4046e4cb0c6e : works
Triggered by:
d6bfc339e6e8	Henri Sivonen — Bug 537557 - Thread-unsafe refcounting in the HTML5 parser when chardet enabled. r=bnewman.

CC'ing
Ben Newman
Henri Sivonen
Severity: minor → normal
Keywords: testcase
FWIW, this will be fixed by bug 479959, but we'll probably want another fix for earlier release trains.
Depends on: 479959
We're not going to track this for Firefox 5.  We shipped it in 4, and given the area we'd like to see it bake on trunk for a little while before putting it into an Aurora release.
(In reply to comment #9)
> FWIW, this will be fixed by bug 479959, but we'll probably want another fix
> for earlier release trains.

nsDetectionAdaptor.cpp is gone on the trunk, so this bug can't be fixed by a simple tweak. Do we want to resurrect nsDetectionAdaptor.cpp, or shall we just wait for bug 479959 to fix this in a better way? It's virtually certain that bug 479959 won't make it into Firefox 6, because it depends on many other bugs that need to be fixed first, and even though they all have patches, they don't have reviews.
Release drivers aren't going to track this for Firefox 5. Unsetting the flag.
smontagu, do you think it's worthwhile to resurrect nsDetectionAdaptor.cpp or to just wait until bug 479959 lands (in Firefox 7 at the earliest)?

Bulk-downgrade of unassigned, >=5 years untouched DOM/Storage bugs' priority.

If you have reason to believe this is wrong (especially for the severity), please write a comment and ni :jstutte.

Severity: normal → S4
Priority: -- → P5

Fixed as part of the general local file encoding detection changes.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED