Closed Bug 545059 Opened 14 years ago Closed 14 years ago

Remove support for bidi.controlstextmode pref?

Categories

(Core :: DOM: Core & HTML, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: sicking, Unassigned)

References

Details

Attachments

(3 files)

The form submission code has a lot of code to deal with bidi issues. First of all this surprises me since I thought that bidi was a display issue, not a content issue. It seems like we are actually modifying the values in the form controls while submitting under certain conditions. Most of this seems to be related to some magic "IBM864" encoding.

Is this stuff really valid/correct/still needed? Do other browsers do this? I'd be willing to do some testing if someone can help me with how to actually trigger this code and let me know what the correct behavior actually is.

I was unable to dig out history on this code. The tails go as far back as a layout/html/forms/src/nsFormFrame.cpp file, but I can't seem to find cvs history on this file as I can't find that src directory in bonsai.
I was able to coerce bonsai to give up the information.

This was checked in as part of bug 71339. Checkin comment was:

bug 71339; author=simon@softel.co.il; r=rods; sr=erik; changes from IBM bidi project (Hebrew, Arabic, etc); in ifdef IBMBIDI for now

So it appears this was part of the initial bidi landing.
My understanding is that this code, which converts between logical and visual Arabic, and between nominal and presentation forms, is intended for situations where e.g. a web app on a legacy server is expecting data in a different form from the way it is entered on a PC.

As you say, this was written as part of the initial IBM bidi support about 10 years ago, and I don't know if it's still needed. Maybe Mati can help us out here?
Is there any way to test what other browsers do?
Here's a patch to remove the relevant code, if we decide to do so.

Still looking for help with how to test this in other browsers.
So I started searching backwards to see how to trigger this code. Turns out it's controlled by the pref "bidi.controlstextmode". Searching for that eventually lead to bug 373319.

Most interesting in that bug is bug 373319 comment 4. This points out that it looks like this code is buggy and there is indeed no way to trigger the actual calls Conv_FE_06 or Conv_06_FE_WithReverse.

So it seems like this is dead code and we should just remove it.
Remove the rest of the code dealing with this preference.

Again, still looking for ways to test this in other browsers. Given that this is all preference based in firefox, I obviously can't test it the same way in IE.
Attachment #426107 - Flags: review?(smontagu)
(In reply to comment #2)
> As you say, this was written as part of the initial IBM bidi support about 10
> years ago, and I don't know if it's still needed. Maybe Mati can help us out
> here?

Sorry, but I have no recollection of this matter (which seems to be related to Arabic more than Hebrew).
This is an interesting issue.  Here is some information which can probably help to reach a good decision.

Firstly, I have never used this encoding myself.  I suspect that it is (/was) mostly used for Arabic, but it can be possible that Persian has an extended code page similar to IBM864 with support for the four characters that it has in addition to Arabic.

So, this is an old encoding, dating back to DOS days.  Here, you can find a full encoding table for IBM864: <http://www.fileformat.info/info/charset/IBM864/list.htm>.

What's interesting about this encoding is that it represents most characters in their "glyph" form, i.e., in their presentational form, instead of in their logical form[1].  Therefore, the web browser has to convert the characters that the user has entered from the logical form to their presentational form.  Also, reading the code, I suspect that this encoding stores characters in their visual order, not their logical order.  This was a very common thing back in the DOS days, in order to shift the burden of bidi resolution to the text editors instead of all programs which render the text.  This is why the code needs to reverse the RTL runs.

According to the WHATWG wiki, this encoding is supported both in Firefox and IE. <http://wiki.whatwg.org/wiki/Web_Encodings>  I would suspect that there are still legacy web servers around which use this encoding, even though I have never came across one myself.  My recommendation is to WONTFIX this bug.  But if we really intend to remove this support, we should remove support for the web encoding in the same time as well, so that we can say we don't support this encoding at all any more, instead of supporting it for page rendering, but not for form submission.  (This encoding is indeed available from View > Character Encoding > More Encodings > Middle Eastern > Arabic (IBM 864).)

[1] For those not familiar with how the Arabic script works, there are multiple glyphs (or presentational forms) for each Arabic letter, and the shape of the letter inside a word is determined by its preceding/following characters.  All modern encodings today store characters in their logical format (using unicode code points in the range of 06nn), and convert them to their presentational format (using code points in the range of FEnn) upon rendering, using a process called "Arabic shaping".  Older encodings had a tendency to store characters in their presentational format, again in order to shift the burden of Arabic shaping from rendering software to the editor software.
Summary: Is bidi code in form submission still needed? → Remove support for IBM864 encoding in form submission?
Ehsan: First of all, my proposal isn't to remove support for the IBM864 encoding. Just to remove the pile of code in nsFormSubmission that is triggered when the "bidi.controlstextmode" pref is set.

As indicated in bug 373319 comment 4, it turns out that due to bugs, most of this code doesn't actually do *anything* currently. The interaction between UnicodeToNewBytes and GetSubmitCharset is such that in almost all cases we never fall into the if-statements in UnicodeToNewBytes.

The other thing to note is that all of this stuff is triggered only when the bidi.controlstextmode pref is set to a non-default value. And we have no UI for setting this preference. So this will only affect people that have gone to about:config to tweak this preference. And again, even then it most of the time won't do anything due to above mentioned bugs.

Lastly, apparently even if you set this pref, things break badly and so you really don't want to do it for other reasons.
Summary: Remove support for IBM864 encoding in form submission? → Remove support for bidi.controlstextmode pref?
Hmm.  Yes, I didn't pay enough attention to this part of the problem.

So, AIUI, encoding conversions on IBM864 encoded forms has never been working, even for those users who have set bidi.controlstextmode.  Therefore, I see nothing wrong with removing all the related code from the tree for now.  The fact that we've gotten away with this problem for so many years without anyone hitting it in practice probably means that this bug has not been encountered by people (who at least would file a bug or raise the issue in other ways.)  And the less code we have in the tree to maintain, the better!  :-)
Comment on attachment 426074 [details] [diff] [review]
Patch to remove the code

Code review = me, but please don't check in just yet. I am in contact with some folks at IBM who can confirm whether this code is really as dead as I think it is.

By the way, I think you can remove a lot more stuff in nsBidiUtils.cpp -- after this I don't think ArabicShaping and all the macros and tables that it uses are called from anywhere -- but that can be a follow-up patch.
Attachment #426074 - Flags: review?(smontagu) → review+
Comment on attachment 426107 [details] [diff] [review]
Remove the rest of the pref

As above, code review=me but please wait for confirmation before checking in.
Attachment #426107 - Flags: review?(smontagu) → review+
This depends on the patches in bug 544698 anyway, so likely won't be able to check in for a few days.

In any event I'll wait for a green light from you before checking in.
Holy hell, there's a lot of code to support that preference.

This should remove what's left.
Attachment #426573 - Flags: review?(smontagu)
(In reply to comment #14)
> Holy hell, there's a lot of code to support that preference.

Back in the day we used this code when rendering Arabic text on platforms that didn't know how to do Arabic shaping. The other callers all disappeared with the new textframe and gfx/thebes code, and this was the last hold-out.
Attachment #426573 - Flags: review?(smontagu) → review+
OK, my contacts at IBM confirm that this code is not needed any more, so you have the green light to check in all the patches.

(For the record, there are products that use visual ordering in text widgets, but they typically do this via CSS. This makes a lot more sense for web developers than jumping through hoops to set a user pref, and what is more, it is another reason why we don't want the code path in form submission that depends on the pref and the charset, since even without the bugs it wouldn't be a reliable test for when widgets actually have visual ordering)
Comment on attachment 426107 [details] [diff] [review]
Remove the rest of the pref

>--- a/layout/generic/nsBlockFrame.cpp
>+++ b/layout/generic/nsBlockFrame.cpp

> PRBool
> nsBlockFrame::IsVisualFormControl(nsPresContext* aPresContext)
> {
>   // This check is only necessary on visual bidi pages, because most
>   // visual pages use logical order for form controls so that they will
>   // display correctly on native widgets in OSs with Bidi support.
>-  // So bail out if the page is not visual, or if the pref is
>-  // set to use visual order on forms in visual pages
>+  // So bail out if the page is not visual.

This comment could do with rephrasing. Make it:

We always use logical order on form controls, so that they will display correctly in native widgets in OSs with Bidi support.
If the page uses logical ordering we can bail out immediately, but on visual pages we need to drill up in content to detect whether this block is a descendant of a form control.
Checked in

http://hg.mozilla.org/mozilla-central/rev/3250b548e80d

Thanks for clearing that this wasn't needed!
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Component: HTML: Form Submission → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: