Closed Bug 849113 Opened 11 years ago Closed 9 years ago

Remove UI and HTML parser use of the "universal" encoding detector

Status:

RESOLVED FIXED

People

(Reporter: hsivonen, Assigned: hsivonen)

Details

(Whiteboard: [fixed by bug 805374 and bug 844115])

Attachments

(1 file, 2 obsolete files)

Stop using the "universal" detector in the HTML parser and remove the UI (obsolete), 11 years ago, Henri Sivonen (:hsivonen), 18.52 KB, patch
Stop using the "universal" detector in the HTML parser and remove the UI, with assertion counts adjusted (obsolete), 11 years ago, Henri Sivonen (:hsivonen), 18.78 KB, patch
Fix bitrot, 11 years ago, Henri Sivonen (:hsivonen), 19.10 KB, patch

Henri Sivonen (:hsivonen)

Assignee

Description

11 years ago

Bug 848842 makes fixing bug 844115 in one go infeasible. Let's remove the UI for activating the "universal" detector and stop using it in the HTML parser first.

Henri Sivonen (:hsivonen)

Assignee

Comment 1

11 years ago

Attached patch: Stop using the "universal" detector in the HTML parser and remove the UI (obsolete)

We have no locales enabling this by default.

Attachment #722781 - Flags: review?

Henri Sivonen (:hsivonen)

Assignee

Comment 2

11 years ago

Attached patch: Stop using the "universal" detector in the HTML parser and remove the UI, with assertion counts adjusted (obsolete)

Attachment #722781 - Attachment is obsolete: true

Attachment #722781 - Flags: review?

Attachment #723454 - Flags: review?(smontagu)

Simon Montagu :smontagu

Comment 3

11 years ago

Comment on attachment 723454
Stop using the "universal" detector in the HTML parser and remove the UI, with assertion counts adjusted

Review of attachment 723454:
-----------------------------------------------------------------

Sorry, I can't review something which I don't believe in. IMO bug 844115 and all its dependencies are WONTFIX, but I admit the possibility that I'm not objective because of the amount of time that I've invested in encoding detection.

Attachment #723454 - Flags: review?(smontagu)

Henri Sivonen (:hsivonen)

Assignee

Comment 4

11 years ago

(In reply to Simon Montagu from comment #3)
> Sorry, I can't review something which I don't believe in. IMO bug 844115 and
> all its dependencies are WONTFIX, but I admit the possibility that I'm not
> objective because of the amount of time that I've invested in encoding
> detection.

Can you propose how to solve the following set of problems?

We have a detector labeled as "universal". The idea of a universal detector is appealing, and from time to time people who don't know that the "universal" detector isn't actually universal turn it on by default for a localization (this has happened for Swedish and Traditional Chinese, both now reverted) or use it in new Gecko code even when a spec doesn't call for it (this happened with the File API, which per spec should use UTF-8 if there's no label).

Using the universal detector exposes non-obvious implementation-specific mystery behavior to the Web. There seems to be neither an effort to standardize the details of the behavior nor an effort to make the "universal" detector actually universal.

I think it's not okay to expose implementation-specific mystery behavior as part of the Web platform. I also think it's not okay to use the enticing label "universal" for something that isn't actually universal when people who see the label have no way of knowing that. Undoing changes made under the faulty assumptions that the detector is universal and that detection is good (as opposed to being a source of implementation-specific mystery) is always harder than making those changes in the first place.

(I'm saying that the "universal" detector is not universal, because it seems arbitrary that "universal" includes Hebrew and Thai but does not include Arabic and Vietnamese.)

Note that I am not proposing the removal of the CJK detection code that lives under the "universal" detector in the source tree. Also, if there is still a clear need for the Hebrew detector, I think we could have a detector labeled as a Hebrew detector in the menu.

Henri Sivonen (:hsivonen)

Assignee

Comment 5

11 years ago

Also, every time a Web author turns on a detector, that author gets an opportunity to publish Web content that depends on the implementation-specific behaviors of that detector. Consider how we'd feel if people were authoring content that depended on a set of mystery behaviors in IE or Chrome.

Henri Sivonen (:hsivonen)

Assignee

Comment 6

11 years ago

Attached patch: Fix bitrot

Attachment #723454 - Attachment is obsolete: true

Henri Sivonen (:hsivonen)

Assignee

Updated

9 years ago

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

Whiteboard: [fixed by bug 805374 and bug 844115]

Categories

(Core :: Internationalization, defect)
