From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
BuildID: latest (22 March)

Phone numbers are not displayed correctly in Mozilla. For example, the number 03-5222903 is displayed as 5222903-03, and the number 1-778-692-8324 is displayed as 8324-692-778-1. Internet Explorer displays them correctly.

Reproducible: Always

Steps to Reproduce:
1. The source code lines are copied into the attached file checkingphones.html.
2. Browse the attached file with Mozilla.
3. Browse the attached file with Internet Explorer.
4. Compare the display in the two browsers.

Actual Results: Phone numbers are displayed incorrectly, as explained above. Hyphens between sections of the phone number cause incorrect reordering.

Expected Results: As displayed by Internet Explorer.
This is a classic example of a BIG problem that I discussed briefly with Erik, and a much better one than my own examples. It is a conflict between standards compliance and displaying existing web pages in the manner expected by their authors.

What is happening is that Mozilla is more compliant with the Unicode Bidi Algorithm than IE, but the page author has accommodated him or herself to IE's incorrect behaviour. If you look at the source in a non-bidi editor, you can see that all the telephone numbers are written backwards (i.e. the digits in each group are in logical order, but the number appears before the area code):

??: 5222903-03 <br>
???: 5241273-03 <br>
????? ???? ??????: 61-60-60-800-1 <br>
????? ???? ????"?: 8324-692-778-1

This means that IE, which treats the "-" (002D;HYPHEN-MINUS) character as a neutral, reverses the order of the numbers so that they display correctly in a right-to-left paragraph. But Mozilla, which follows the standard by treating the "-" character as a numeric terminator, displays the numbers in the order that they are written, and they come out wrong.

It would be very easy to change the character type definitions in Mozilla to make it behave like IE (see attached patch) and this would certainly result in more correct display of most currently existing (logical) Hebrew pages. Whether that is The Right Thing To Do is another question.
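The difference between the two treatments of "-" can be illustrated with a deliberately simplified Python sketch. This is not Mozilla's or IE's code. It assumes a right-to-left paragraph that contains only ASCII digits, hyphens and Hebrew letters, applies a cut-down version of the Unicode Bidi Algorithm's weak-type, neutral, level and reordering steps, and takes the class of the hyphen as a parameter ("ET" for the terminator treatment, "ON" for the neutral treatment). All function names are hypothetical.

```python
def classify(ch, hyphen_class):
    """Return a simplified bidi class for one character."""
    if ch in "0123456789":
        return "EN"              # European number
    if ch == "-":
        return hyphen_class      # "ET" (terminator) or "ON" (neutral)
    if "\u05d0" <= ch <= "\u05ea":
        return "R"               # Hebrew letter
    return "ON"                  # everything else is treated as neutral


def resolve_types(types):
    """Cut-down weak and neutral resolution for an RTL paragraph."""
    types = list(types)
    n = len(types)
    i = 0
    while i < n:
        if types[i] != "ET":
            i += 1
            continue
        j = i
        while j < n and types[j] == "ET":
            j += 1
        # A run of terminators next to a number joins the number;
        # otherwise it falls back to neutral.
        touches_number = (i > 0 and types[i - 1] == "EN") or (
            j < n and types[j] == "EN"
        )
        for k in range(i, j):
            types[k] = "EN" if touches_number else "ON"
        i = j
    # Assumption: no left-to-right letters are present, so every neutral
    # sits between right-to-left contexts (numbers count as R here).
    return ["R" if t == "ON" else t for t in types]


def visual_order(text, hyphen_class):
    """Return the left-to-right display order of `text` in an RTL paragraph."""
    types = resolve_types([classify(c, hyphen_class) for c in text])
    # Paragraph level is 1; numbers are raised to level 2.
    levels = [2 if t == "EN" else 1 for t in types]
    chars = list(text)
    for level in (2, 1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            # Reverse each maximal run at this level or higher.
            chars[i:j] = chars[i:j][::-1]
            levels[i:j] = levels[i:j][::-1]
            i = j
    return "".join(chars)


if __name__ == "__main__":
    for cls in ("ET", "ON"):
        print(cls, visual_order("5222903-03", cls))
```

In this model, "ET" fuses digits and hyphens into a single left-to-right number, so the text appears as typed (5222903-03), while "ON" leaves each hyphen as a right-to-left separator, so the digit groups are laid out from right to left (03-5222903).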
mark moz0.9.1 The goal for moz0.9 is to land bidi and turn it on. Move BIDI functional bug to moz0.9
The patch looks reasonable. r=ftang. We change the bidi type of two characters. It is a safe change.
Erik resigned. Reassigning all his bugs to ftang for now.
blizzard - can you super-review this one?
sr=blizzard We need to document this change in behaviour wrt standards compliance.
fix and check in
From: Al Vining <firstname.lastname@example.org> Sat 7:27 AM
Subject: Bug 73251 & bidi conformance

http://bugzilla.mozilla.org/show_bug.cgi?id=73251
Summary: "BIDI - Hebrew site (windows-1255) - incorrect display of telephone numbers with area code prefix"

Since Mozilla is "designed for standards compliance", could somebody explain fully why it is necessary (rather than convenient) to depart from the standard in this case? The summary seems to be misleading; it should read "... correct but unintended display ..."

As I understand it, this is a known bug in Microsoft's implementation that the Unicode consortium have decided that they are not going to adapt to, as per this mailing to the Unicode list:

| From: Jonathan Rosenne <email@example.com>
| Sent: 15 April 2001 05:12
| Subject: RE: Unicode Bidi algorithm dissention (was: something else
| entirely)
|
| There are various MS implementations. Internet Explorer 5, Windows
| 2000 and Word 2000 are intended to be without deviations, but older
| versions, such as Windows 95 and Word 97 have deviations. For Hebrew,
| the most well known concerns the Hyphen-Minus. I believe I have heard
| there are some specific Arabic deviations too.
|
| Microsoft had presented a paper to the UTC proposing some
| modifications to the character classifications. Regarding
| Hyphen-Minus their proposal was not accepted. They wanted to change it
| from European Terminator to European Separator.

I realize that there are various concessions to usage over standards in the mozilla tree, but can somebody tell me that this one
a) doesn't prevent people from using the correct form in the future, and
b) doesn't affect people who compose documents on other platforms (Mac, Java, commercial Unices) with compliant implementations.

Just curious ;)
Al.
From: Tzafrir Cohen <firstname.lastname@example.org> Sat 1:08 PM
Subject: Re: Bug 73251 & bidi conformance

IE is what most users have, and what most designers design for when they create a logical Hebrew page (at least at the moment). Thus whatever mozilla chooses to do here will be problematic for some. Add to that mozilla-composed documents.

Also: is it possible (and not costly) to choose a behaviour in this specific issue at runtime? If so: maybe allow users to decide about such issues? I.e.: add somewhere in the preferences a choice for "microsoft-compliant hyphen-minus", or maybe something more general.

(in the build I have here, from 10.5 , those changes are not applied, and I see no such thing as "bidi preferences")

-- Tzafrir

From: The Nussbaum family <email@example.com> Sun 1:30 AM
Subject: Re: Bug 73251 & bidi conformance

Tzafrir is correct that the majority of Hebrew users run IE, but there is a healthy minority of us running Netscape and not a few on the Mac. [I regularly help new Hebrew and Arabic Netscape users on Netscape macintosh newsgroups.] I thought the whole point of BIDI was that it would be universal and eliminate the problems we've had in the past with multiple non-standard Hebrew & Arabic implementations.

Janet

From: Al Vining <firstname.lastname@example.org> Mon 6:37 PM
Subject: Re: Bug 73251 & bidi conformance

The bug was filed against a preview build which was conformant, but the Microsoft-style behaviour has been in (?almost) all of the nightly builds with bidi support.

A pref would be interesting (and everyone loves hidden prefs...) but I'd rather that the first version that Netscape shipped (off 0.9.1??) was compliant and then if everyone said it was utterly unusable, then they could ship a (new! improved!) version with a pref later. Doing it the way we're doing it at the moment just seems defeatist for a project dedicated to internet standards, and will make worse the problem of existing pages.

To be honest, though, I don't read a word of Hebrew or Arabic, so it's just the principle of the thing (and hey, these kind of principles are fairly cheap).

Al.

From: email@example.com (Simon Montagu) 3:00 AM
Subject: Re: Bug 73251 & bidi conformance

Exactly. I regret departing from the standard, but unfortunately compliance would lead to Mozilla being *perceived* as inaccurate in its rendering of real-world documents. At the moment the vast majority of logical Hebrew web documents are composed with Microsoft tools, and designed to be rendered correctly by Microsoft tools.

>Add to that mozilla-composed documents.

No: the Mozilla composer uses the Mozilla layout engine, so the reordering in force during composing is the same as that in the browser.

>If so: maybe allow users to decide about such issues? I.e.: add somewhere in
>the preferences a choice for "microsoft-compliant hyphen-minus", or maybe
>something more general.

It's possible, and could be coded in a non-costly way. I don't think adding yet another preference will be hugely popular, do you?

>(in the build I have here, from 10.5 , those changes are not applied, and I
>see no such thing as "bidi preferences")

Bidi preferences are still under review, and not checked into the tree. See
http://bugzilla.mozilla.org/show_bug.cgi?id=79682
http://bugzilla.mozilla.org/show_bug.cgi?id=79676

Simon

From: Tzafrir Cohen <firstname.lastname@example.org> 6:29 AM
Subject: Re: Bug 73251 & bidi conformance

You can change Mozilla's behaviour now. In a couple of months hopefully some designers will already start designing for it. Does this mean that those pages will not be viewed properly by MSIE? Or that the composer will try to make pages that will be viewed right both by MSIE and by a "standard" bidi viewer? How exactly?

The thing is that I also want to make sure mozilla views properly bidi docs composed by non-MS tools.

From: Al Vining <email@example.com> 9:28 AM
Subject: Re: Bug 73251 & bidi conformance

Of course, this can become a self-fulfilling prophecy, but I understand the dilemma that you have. The bug calls for a release note, though, so I wonder if you know:

* Which are the non-compliant platforms? Is it just 95/98/Me or is it 2000 as well?
* Which are the compliant platforms? There are some results at http://crl.nmsu.edu/~mleisher/ucdata.html but I don't see any Mac results and the Windows ones only cover IE5.
* What are the workarounds for users on compliant platforms? Is it just a matter of typing the phone number in reverse order? Are there any other common scenarios (Social Security numbers, part numbers?) that people should be aware of? Is there any trickery with the <bdo> element that can make web pages look right with both implementations?

cheers,
Al.
More comments from the newsgroup:

------
From: firstname.lastname@example.org (Jonathan Rosenne)

As far as I know Microsoft products display various conflicting behaviors regarding the Hyphen-Minus.

I suggest that Mozilla implement the Unicode bidi algorithm conformingly, and in parallel a technical note should be published explaining how to produce HTML which will display correctly on all platforms. There may be additional issues to address. I am ready to edit such a note.

I guess if the phone number is preceded and followed by &lrm; it should be platform neutral. See for example http://www.qsm.co.il/
------

As another posting pointed out, it may be possible to use other separators to achieve the desired effect.

Reopening bug, with suggestion that checkin be reverted; apologies if I've overstepped the mark.
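The left-to-right mark workaround suggested above can be sketched in Python as follows. The helper names are hypothetical, and this is only an illustration of the idea: a strong left-to-right character on each side gives the digits and hyphens a left-to-right context regardless of how the hyphen is classified. It has not been checked against every browser discussed in this bug.

```python
LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK, an invisible strong LTR character


def wrap_ltr(number: str) -> str:
    """Surround a phone number with left-to-right marks (plain text)."""
    return f"{LRM}{number}{LRM}"


def wrap_ltr_html(number: str) -> str:
    """Same idea for HTML source, using the named entity instead."""
    return f"&lrm;{number}&lrm;"


if __name__ == "__main__":
    # The number is typed in its natural reading order, area code first.
    print(repr(wrap_ltr("03-5222903")))
    print(wrap_ltr_html("1-778-692-8324"))
```

With this approach the number is written in the order a reader expects, so the page source no longer depends on one browser's reordering of the hyphen.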
remove moz0.9.1 milestone, reassign back to email@example.com
Just noting the interaction of the suggested workaround here (use &lrm;) with bug 36163 (non-displaying characters display under certain OS / font combinations).
It seems like the discussion on this bug quickly went from the following quote from Simon Montagu (2001-03-26): It would be very easy to change the character type definitions in Mozilla to make it behave like IE (see attached patch) and this would certainly result in more correct display of most currently existing (logical) Hebrew pages. Whether that is The Right Thing To Do is another question. to the patch being checked in without any further discussion? This seems like the wrong thing to do if the Unicode Consortium has explicitly decided that this implementation by Microsoft is incorrect. If we're going to back it out (which I think we should), I think we should do so for 0.9.2 so we don't set a bad precedent -- otherwise it will be very hard to do so later. Does anyone else have other opinions on this issue?
Changing component to "Bidi.." and switching QA contact to firstname.lastname@example.org. Adding email@example.com.
I strongly believe that Mozilla should be as conformant as possible to the Unicode Bidirectional Algorithm. As far as I know, the Unicode Bidi team is working on conformance tests, which will be published publicly. Please remove any extra patch made for bidi compatibility with Internet Explorer. As far as I know, there are bidi incompatibility problems even between Microsoft products (like Windows 2000's Notepad vs. Internet Explorer 5.5), so selecting one among those many may be much worse than we may think.
Two more points: 1. This bug was addressed by changing intl/unicharutil/src/bidicattable.h itself ("DO NOT EDIT THIS DOCUMENT !!!"), rather than intl/unicharutil/tools/genbidicattable.pl which generates it. Unless it is commented in the perl script somewhere, it will be reverted the next time Unicode change the properties file and bidicattable.h is regenerated. 2. In news://news.mozilla.org/3B30F79C.5070205%40mozilla.org it is stated that "Netscape will be doing some kind of release off of the 0.9.2 branch". Which will lead to more backwards-compatibility issues if this "fix" is in that release.
firstname.lastname@example.org for the backing out. As tzafrir said, "whatever mozilla chooses to do here will be problematic for some." I doubt if we've heard the last of this issue :-)
Fix checked into trunk 2001-06-28 20:20 PDT. Adding nsBranch keyword.
Adding [PDT+] per conversation with chofmann.
Fix (backout) checked into branch 2001-06-30 12:00 PDT. I recommend we mark this bug as INVALID or WONTFIX, but I'll leave that to others.
*** Bug 88676 has been marked as a duplicate of this bug. ***
As I understand it, WONTFIX would be appropriate if we had decided to leave the non-standard-compliant behaviour in place, so INVALID is the correct resolution here.
*** Bug 113566 has been marked as a duplicate of this bug. ***
*** Bug 154011 has been marked as a duplicate of this bug. ***
*** Bug 170180 has been marked as a duplicate of this bug. ***
*** Bug 178420 has been marked as a duplicate of this bug. ***
Created attachment 109594 [details] A screen capture of various BiDi hyphenation situations Comments soon to follow.
Please take a look at the attached screen capture before reading this comment; it will really help in clarifying why I believe this bug should be reopened. So, without much further ado, here are the main reasons:

1. Gecko tries to follow the Unicode Bidirectional Algorithm, but the hyphenation standard in this algorithm is not relevant for PCs, simply because Hebrew keyboards do not include the makaf sign (a higher hyphen). This is far from only being an issue with operating systems such as Windows; it's an inherent limitation of virtually all PC hardware (except for Macs, and these are very rare in Israel). Not only is the makaf sign not marked on keyboards, standard OS key-maps do not support it. This leads us to the ubiquitous workaround of using a minus sign instead of a makaf, an inevitable choice.

2. Normally, a minus sign followed by numbers should be placed to the left of the number, in order to denote that the number is negative. This is fine with numbers and with algebraic sentences which use Latin characters. It should *never* apply to sequences (for lack of a better term) that combine a Hebrew letter, a hyphen and a number. Why? For two reasons:

2a. There are simply no algebraic or mathematical expressions that combine Hebrew letters and negative numbers.

2b. According to the rules of the language, a sequence that combines a Hebrew letter and a number should be separated by either a hyphen, a space or another punctuation mark. The number and the letter should never be adjacent (except for sequences such as alef1 [common names for school classes], but these never include a minus sign or a hyphen).

From the above it is clear that implementing such a fix only for sequences that include Hebrew letters would not affect any other combination. I can't think of a single case where such an implementation would result in problems with the character order. Not one.

3. Since makaf signs are not provided by Hebrew keyboards and users must resort to using minus signs, some Mozilla advocates suggest changing the order of characters. This suggestion has two major flaws:

3a. It is contrary to the logical and standard order in which people actually handwrite or type. The correct order is letter>hyphen>number, NOT letter>number>hyphen (see example #4 in the attached screen capture).

3b. Only users of Gecko-based browsers will be able to view the sequence as the writer intended; the vast majority of users will see the hyphen at the end of the sequence. This is less important, but annoying nevertheless. It is not my main point.

Please reconsider the status of this case. I'm afraid that without a proper solution for this issue, Mozilla has no future among Hebrew users.

Prog.
Might I add that in IE6/WinXP, attachment 28641 [details] is displayed in reverse order, just like in Mozilla.
I suggest you view the source of the page in a text editor. You'll see that the phone numbers are reversed: 8324-692-778-1 instead of 1-778-692-8324 Note that as long as the order of the numbers and hyphens is correct, there is no difference between the rendering of IE and Gecko. Both seem to do the job very well, as is evident in the first two posts of the following discussion: http://snurl.com/phone_hyphenation Therefore, for phone numbers, the status of this bug should be WORKSFORME. However, most of the dupes for this bug dealt with a different aspect of BiDi hyphenation, the one I described in Comment #32, so it would probably be best to reopen this bug and just edit the title. Prog.
Created attachment 109656 [details] Yet another screen capture, IE6 vs. Gecko (NS7.01)

Sorry for spamming this bug, this will be my last comment for today. The attachment shows both browsers displaying the original phone-number test (attachment 28641 [details]). It seems that Microsoft have decided to follow Mozilla's lead on this, since IE now behaves exactly the same (phone numbers are reversed, as in the original HTML code). So now that MS have solved part of this bug for us, let's focus on the remaining hyphenation issue.

Prog.
Mozilla is not about to implement its own version of the Unicode Bidi Algorithm. There is at least a case to be made for emulating Microsoft's non-standard behaviour, but new behaviour which is different from both Microsoft and the standard we don't need. This doesn't mean that I think the standard is perfect. Suggestions for changing the UBA should be sent to its author, Mark Davis (email@example.com), but I recommend raising them for discussion on firstname.lastname@example.org first. (http://www.w3.org/International/about.html#participation)
Why does everyone always seem to fall for the mental trap of "either one without the other, or the other without the one"? I see no reason why an _option_ should not be added to Mozilla to diverge from Unicode BiDi algorithm compliance in this case. I urge Simon to consider re-opening the bug. Besides, even regardless of the keyboard issue, it is unreasonable to require that makafs not be represented using minus, since there are 15 years of tradition of using minus for this purpose, and 40 (I think) years of tradition of using minus for hyphenation, instead of 'specialized' characters. Finally, if a divergence from the Unicode algorithm is made, I see no reason to adhere to what Microsoft is doing, rather than to what users expect to see (i.e. one expects text entered in the intuitive way to be displayed the intuitive way).
Having a user preference doesn't help document authors -- then they need to write documents that work for either setting of the preference.
(regarding comment 38) I disagree, David. If Hebrew page authors were to write so as to display correctly in browsers compliant with the Unicode BiDi algorithm, they would have to stop using the minus sign for hyphenation. This demand is unreasonable - moreover, it is discriminatory. No one would suggest that people authoring Latin content be _forced_ to switch to using unicode hyphenation/dash characters instead of minus characters. So why should Hebrew content authors be so forced? I submit that authors must write documents that work in at least one of these two settings (and we all know which they're gonna pick, so long as keyboards don't have the makaf-ili key).
That's a reasonable argument for changing the behavior completely, but not for making it a pref. Has anyone tried to convince the Unicode folks to change the standard? That's the appropriate venue for this discussion, not here.
you don't have to use Makaf to display letter+'-'+number correctly. you just need to write the characters in the correct order. it looks to me that whoever wrote attachment 28641 [details] was a little drunk. the number is simply written backwards. it is reversed in IE too. no need to fix anything, except the html code. work is being done by prog (see message above and CC list) with the Israeli Institute of Standards to add a Makaf and other hebrew characters to the standard hebrew keyboard.
> you don't have to use Makaf to display letter+'-'+number correctly. you just
> need to write the characters in the correct order.

That's right, this bug should be a WORKSFORME. See comment 34. The hyphenation issues, on the other hand, are still open (even if this bug is closed...)

> work is being done by prog (see message above and CC list) with the Israeli
> Institute of Standards to add a Makaf and other hebrew characters to the
> standard hebrew keyboard.

Even if my "campaign" succeeds, being able to access all those characters will only solve some issues but not this one. It is important IMHO to revive all those lost and forgotten Hebrew punctuation marks. It could also provide a good hyphenation solution for future texts. It will NOT however do squat for existing texts. In other words, virtually all texts will still render wrong (as in "not as the writer intended") in Mozilla for a long time to come.

I will consult with the Unicode mailing list, but I don't have much hope in this. Last I heard, Microsoft did the same and failed.

Prog.
> it looks to me that whoever wrote attachment 28641 [details] was a
> little drunk. the number is simply written backwards.

That is true. Perhaps the letter+minus+number issue should be made into a new bug.

> you don't have to use Makaf to display letter+'-'+number correctly.
> you just need to write the characters in the correct order.

That is not true. The correct order is letter, then minus, then number. The order letter, then number/s, then minus is totally wrong (despite our having been used to it in the bad old days of early MS-Office versions).
> Perhaps the letter+minus+number issue should be made into a new bug.

It will just be closed as a dupe of this one.

> The correct order is letter, then minus, then number. The
> order letter, then number/s, then minus is totally wrong (despite our having
> been used to it in the bad old days of early MS-Office versions).

This is the correct order, but using a minus is nothing more than a poor workaround to begin with. The fact that the two of you talk about where a minus "should" fit in these Hebrew sequences really shows how an old computer limitation managed to wipe the correct punctuation mark (the maqaf) out of modern Hebrew use, and out of the mind/acceptance/familiarity of users. Most people now think that a maqaf and a minus are the same and should look the same. IMO this is very unfortunate.

Prog.
If anyone is interested, a workaround for typing HebrewLetter+HyphenMinus+Number sequences can be found here: http://mozilla.org.il/board/viewtopic.php?p=1790#1790 In other related news, the issue of rendering *existing* texts that include such sequences is being discussed at the email@example.com mailing list: http://lists.w3.org/Archives/Public/www-international/2003JulSep/thread.html#84 Prog.
Reopening.

First, let's start with the *big* news: The Unicode Consortium is about to change the properties of the Minus, Plus and Solidus signs, in a way that will finally open the door to fixing this long-standing and irritating bug. The beta version of the Unicode data file (which contains the properties of all characters) already shows the following:

002B;PLUS SIGN;Sm;0;ES;;;;;N;;;;;
002D;HYPHEN-MINUS;Pd;0;ES;;;;;N;;;;;
002F;SOLIDUS;Po;0;CS;;;;;N;SLASH;;;;
(http://www.unicode.org/Public/4.0-Update1/UnicodeData-4.0.1d3b.txt )

As opposed to (Unicode 4.0):

002B;PLUS SIGN;Sm;0;ET;;;;;N;;;;;
002D;HYPHEN-MINUS;Pd;0;ET;;;;;N;;;;;
002F;SOLIDUS;Po;0;ES;;;;;N;SLASH;;;;
(http://www.unicode.org/Public/4.0-Update/UnicodeData-4.0.0.txt)

Plainly speaking, conforming to Unicode 4.0.1 would also mean that Hyphen-Minus will no longer be displayed on the wrong side of the number.

Now, for some reason, this bug has morphed from an invalid description of a broken HTML page to one that deals with HebrewLetter+HyphenMinus+Number sequences. The current summary is clearly misguided, so I'm changing it from:

"BIDI - Hebrew site (windows-1255) - incorrect display of telephone numbers with area code prefix"

to

"Hyphen/Minus and numbers switch places (appear to the left, instead of to the right in HebrewLetter+HyphenMinus+Number sequences)"

I'm also updating the Status Whiteboard according to the aforementioned advancements in this issue.

Prog.
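The change quoted above sits in the fifth semicolon-separated field of each UnicodeData line, which holds the bidirectional class. As a small illustration (a sketch only, with hypothetical function names, using just the six lines quoted in this comment rather than the full data files), the two versions can be compared in Python like this:

```python
UNICODE_4_0_0 = """\
002B;PLUS SIGN;Sm;0;ET;;;;;N;;;;;
002D;HYPHEN-MINUS;Pd;0;ET;;;;;N;;;;;
002F;SOLIDUS;Po;0;ES;;;;;N;SLASH;;;;
"""

UNICODE_4_0_1_BETA = """\
002B;PLUS SIGN;Sm;0;ES;;;;;N;;;;;
002D;HYPHEN-MINUS;Pd;0;ES;;;;;N;;;;;
002F;SOLIDUS;Po;0;CS;;;;;N;SLASH;;;;
"""


def bidi_classes(data: str) -> dict:
    """Map code point (hex string) to bidi class for UnicodeData-style lines."""
    classes = {}
    for line in data.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        # Field 0 is the code point, field 4 is the bidirectional class.
        classes[fields[0]] = fields[4]
    return classes


def changed_classes(old: str, new: str) -> dict:
    """Return {code point: (old class, new class)} for entries that differ."""
    before, after = bidi_classes(old), bidi_classes(new)
    return {
        cp: (before[cp], after[cp])
        for cp in before
        if cp in after and before[cp] != after[cp]
    }


if __name__ == "__main__":
    for cp, (old, new) in changed_classes(UNICODE_4_0_0, UNICODE_4_0_1_BETA).items():
        print(f"U+{cp}: {old} -> {new}")
```

For the quoted lines, this reports PLUS SIGN and HYPHEN-MINUS moving from ET to ES, and SOLIDUS moving from ES to CS.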
With the release of Unicode 4.0.1, the property change is now official:

Category: Weak
Type: ES
Description: European Number Separator
General Scope: Plus Sign, Minus Sign

For more information, see the latest revision of the Unicode Bidirectional Algorithm (2004-03-26):
http://www.unicode.org/reports/tr9

Prog.

PS. I'm assigning myself as QA contact instead of Gilad Ehven, who is no longer active in Mozilla.
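One way to see the property values that a current Unicode database assigns to these characters is Python's standard `unicodedata` module. This is only a quick check, not part of any Mozilla tooling: the result depends on the Unicode version bundled with the interpreter, and the wrapper function name is hypothetical.

```python
import unicodedata


def bidi_class(ch: str) -> str:
    """Return the bidirectional class the interpreter's Unicode data gives `ch`."""
    return unicodedata.bidirectional(ch)


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for ch in "+-/":
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {bidi_class(ch)}")
```

Any Python 3 interpreter ships Unicode data later than 4.0.1, so it should report ES for the plus sign and hyphen-minus and CS for the solidus.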
This means we need to grab the latest UCD to generate a new bidicattable.h, doesn't it? I would also suggest to verify the code in nsBidi.cpp (based on ICU4C).
(In reply to comment #46)
> Reopening.
>
> First, let's start with the *big* news: The Unicode Consortium is about to
> change the properties of the Minus, Plus and Solidus signs, in a way that will
> finally open the door to fixing this long standing and irritating bug.

Recycling old bug reports isn't good practice, especially one like this which has already been through several metamorphoses. I've filed bug 240943 on regenerating bidicattable.h from the current UCD.

I now have a problem. I can't resolve this back to INVALID because of the change in Unicode, so I've made it depend on bug 240943, and when that is fixed this can become WORKSFORME.
Thanks to Simon Montagu's patch to Bug 240943, this issue is now fixed on the 1.7 Branch. You can download a fixed build here: http://ftp.mozilla.org/pub/mozilla.org/mozilla/nightly/latest-1.7/mozilla-i586-pc-msvc.zip - Don't download the Installer build, it is currently busted (see Bug 243008). Thanks again Simon! Mozilla is so much nicer to use now :-) Prog.
Since I checked the same patch into the trunk as well, I guess this is now WORKSFORME.
Created attachment 148115 [details] Before and after testcase Here's a testcase for the characters with changed properties.
(In reply to comment #52)
> Created an attachment (id=148115)
> Before and after testcase
>
> Here's a testcase for the characters with changed properties.

By the way, IE shows the two testcases (in the attachment) wrongly. What standard is it compatible with? :-)
Heh, good question :) Maybe it will be different in IE7