mailto link generated with BiDi user name and email address gets mangled with timestamp

RESOLVED FIXED in Bugzilla 2.22

Status

Bugzilla
User Interface
RESOLVED FIXED
12 years ago
10 years ago

People

(Reporter: lɛʁi לערי ריינהארט, Assigned: Wurblzap)

Tracking

(Blocks: 1 bug)

2.21
Bugzilla 2.22
Bug Flags:
approval +
blocking2.22 -

Details

(Whiteboard: [Wanted for 2.22], URL)

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

12 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7

Hallo!

I could not find any bug for the component Bugzilla containing "BiDi".

http://landfill.bugzilla.org/bugzilla-tip/show_bug.cgi?id=3197#c1 is a typical example where the content of a field, which could have BiDi content including characters such as ().- ..., is *not* embedded according to the LTR / RTL orientation of the site / paragraph / span.

The general I18n solution is more complicated, but for landfill in English the best solution known to me, which is supported by most browsers, is to embed the mailto link generated with user name and email address in
<span dir="ltr" > and </span>.

There are many known workarounds:
<smontagu> "btw, you can probably work around by adding an LRM character to the end of your name (Unicode U+200E)"
but for a browser-independent tool such as Bugzilla it might be better not to use that, because some browsers have / might have problems with LRM.
Some browsers also have problems with
style="direction: ltr; unicode-bidi: embed;"
mentioned by smontagu.

This solution should / could be used wherever Bugzilla generates a link whose content could be BiDi.
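
A minimal sketch of the wrapping suggested above, written in Python rather than Bugzilla's Perl and not taken from any Bugzilla code. The helper name `mailto_link` and the exact markup are assumptions for illustration only; the point is simply that the whole link sits inside one `<span dir="ltr">` so that BiDi content in the name cannot reorder the text that follows it.

```python
from html import escape


def mailto_link(real_name: str, email: str) -> str:
    """Return a mailto link wrapped in an explicit left-to-right span.

    Hypothetical helper: the name and the markup are illustrative only.
    """
    # Escape both values so that user-supplied text cannot break the markup.
    href = "mailto:" + escape(email, quote=True)
    label = escape(real_name)
    # The dir="ltr" span keeps the link's direction independent of its content.
    return f'<span dir="ltr"><a href="{href}">{label}</a></span>'


if __name__ == "__main__":
    print(mailto_link("A & B", "a@example.org"))
```

Whether the span alone is enough in every browser is exactly what the rest of this report discusses, so this should be read as the reporter's proposal in code form, not as a verified fix.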

It seems that there are some bugs from product CORE related to this which were mentioned at irc://irc.mozilla.org/BiDi .

<smontagu> that shows that the bug with the underline is with both
<smontagu> and it's caused by bug 299065
...
<smontagu> the bug with the mangled ordering could also be fixed by style="direction: ltr; unicode-bidi: embed;"
...
<smontagu> I reported a similar bug with mail: bug 278713
<smontagu> and with Chatzilla, bug 278698

Hope that this description is enough.

best regards reinhardt [[user:gangleri]]

Reproducible: Always

Steps to Reproduce:
Can *not* be reproduced at https://bugzilla.mozilla.org/, where UTF-8 characters are changed to &#nnnn; notation, but
only at http://landfill.bugzilla.org/bugzilla-tip/ in its current configuration.
Actual Results:  
see above

Expected Results:  
see above
(Assignee)

Comment 1

12 years ago
Until we find a way how to cleverly BiDi-balance a string, we should remove all characters influencing BiDi, which seems to me would be what's mentioned in http://www.unicode.org/Public/UNIDATA/PropList.txt as Bidi_Control.

Taking.
Assignee: myk → wurblzap
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Windows XP → All
Hardware: PC → All
Target Milestone: --- → Bugzilla 2.22
(Assignee)

Comment 2

12 years ago
Created attachment 205832 [details] [diff] [review]
Patch
Attachment #205832 - Flags: review?
(Assignee)

Updated

12 years ago
Blocks: 320273
(Reporter)

Comment 3

12 years ago
Thanks Marc!

A more "general" example is available at
http://landfill.bugzilla.org/bugzilla-tip/show_bug.cgi?id=3254 [bug landfill 3254] == "Your real name" can break various page areas

To cover that please take a look at
bug 320273 BiDi: request for a "BiDi balancing function" to avoid BiDi overlapping between objects

Please insert dependencies if appropriate.
Done "midair" by wurblzap. Thanks! reinhardt [[user:gangleri]]

Comment 4

(In reply to comment #1)
> Until we find a way how to cleverly BiDi-balance a string, we should remove all
> characters influencing BiDi, which seems to me would be what's mentioned in
> http://www.unicode.org/Public/UNIDATA/PropList.txt as Bidi_Control.

I would suggest filtering all the characters listed at http://www.w3.org/TR/2003/NOTE-unicode-xml-20030613/#Charlist

Comment 5

(In reply to comment #4)
> I would suggest filtering all the characters listed at
> http://www.w3.org/TR/2003/NOTE-unicode-xml-20030613/#Charlist

And as a corollary to this, I think you should not filter U+200E LEFT-TO-RIGHT MARK and U+200F RIGHT-TO-LEFT MARK. 

(Assignee)

Comment 6

12 years ago
(In reply to comment #5)
> (In reply to comment #4)
> > I would suggest filtering all the characters listed at
> > http://www.w3.org/TR/2003/NOTE-unicode-xml-20030613/#Charlist

This makes sense to me.

> And as a corollary to this, I think you should not filter U+200E LEFT-TO-RIGHT
> MARK and U+200F RIGHT-TO-LEFT MARK. 

I don't know much about BiDi... Can you please elaborate why we should spare these from filtering?
Status: NEW → ASSIGNED

Comment 7

(In reply to comment #6)

> > And as a corollary to this, I think you should not filter U+200E LEFT-TO-RIGHT
> > MARK and U+200F RIGHT-TO-LEFT MARK. 
> 
> I don't know much about BiDi... Can you please elaborate why we should spare
> these from filtering?

These characters are like an invisible left-to-right and right-to-left character, and they are sometimes useful to ensure the correct ordering of bidi edge cases. Unlike the override codes, they don't generally have any effect on characters outside their immediate vicinity. It wouldn't be logical to ban the right-to-left mark if you permit right-to-left alphabetical characters.
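
A minimal sketch of the filtering discussed here, in Python rather than Bugzilla's Perl and working on decoded text instead of raw bytes. It assumes the characters to remove are the five embedding and override controls U+202A to U+202E, which are the code points matched by the byte pattern in the patch quoted in comment 10; U+200E and U+200F are deliberately left alone. The function name `strip_bidi_controls` is hypothetical.

```python
import re

# U+202A LRE, U+202B RLE, U+202C PDF, U+202D LRO, U+202E RLO.
# U+200E (LRM) and U+200F (RLM) are intentionally not in this range.
_BIDI_EMBEDDING_CONTROLS = re.compile("[\u202a-\u202e]")


def strip_bidi_controls(text: str) -> str:
    """Remove embedding/override controls but keep the directional marks."""
    return _BIDI_EMBEDDING_CONTROLS.sub("", text)


if __name__ == "__main__":
    sample = "abc\u202edef\u200f"
    # The override (U+202E) is dropped, the right-to-left mark (U+200F) survives.
    print(ascii(strip_bidi_controls(sample)))
```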
(Assignee)

Comment 8

12 years ago
Created attachment 206098 [details] [diff] [review]
Patch 1.1

Ok, this excludes the two said characters from filtering.
Attachment #205832 - Attachment is obsolete: true
Attachment #206098 - Flags: review?
Attachment #205832 - Flags: review?
(Assignee)

Comment 9

12 years ago
Comment on attachment 206098 [details] [diff] [review]
Patch 1.1

Simon, perhaps you can sign off on the list of filtered characters?
Attachment #206098 - Flags: review?(smontagu)

Updated

12 years ago
Attachment #206098 - Flags: review?(smontagu) → review+

Comment 10

12 years ago
Comment on attachment 206098 [details] [diff] [review]
Patch 1.1

>+                    # Do the replacing in a loop so that we don't get tricked
>+                    # by stuff like 0xe2 0xe2 0x80 0xae 0x80 0xae.
>+                    while ($var =~ s/\xe2\x80(\xaa|\xab|\xac|\xad|\xae)//g) {
>+                    }

I'm not convinced by this regexp, but maybe it's because I don't understand the problem well enough:

From what Marc said, UTF-8 uses 3 bytes for words. So assuming we want to remove the word |abc| and we pass the string |dab|cab|cxy|, the string should be left untouched, right? Because "abc" never appears in a single word. But this regexp will return |dxy| which is not what we want AFAIK.

I'm not going to grant or deny review, because in either case I may be wrong.
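
The reason the quoted patch repeats the substitution can be shown with a small Python sketch (Python instead of the patch's Perl; the function names are hypothetical). It uses the byte sequence named in the patch's own comment, 0xe2 0xe2 0x80 0xae 0x80 0xae, where removing the inner match leaves a new match behind.

```python
import re

# Same byte pattern as in the quoted patch: 0xe2 0x80 followed by 0xaa..0xae.
_PATTERN = re.compile(rb"\xe2\x80[\xaa-\xae]")


def strip_once(data: bytes) -> bytes:
    """Remove every match found in a single left-to-right pass."""
    return _PATTERN.sub(b"", data)


def strip_until_stable(data: bytes) -> bytes:
    """Repeat the removal until nothing changes, like the loop in the patch."""
    while True:
        stripped = _PATTERN.sub(b"", data)
        if stripped == data:
            return data
        data = stripped


if __name__ == "__main__":
    tricky = bytes.fromhex("e2e280ae80ae")
    print(strip_once(tricky).hex())          # a match is still left over
    print(strip_until_stable(tricky).hex())  # nothing is left
```

Note that the tricky input is not valid UTF-8, so the loop guards against malformed input rather than against anything that can occur inside well-formed text.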
(Assignee)

Comment 11

12 years ago
From http://www.ietf.org/rfc/rfc3629.txt; this means we won't cut across characters, because the escape bit sequences occur in escape bytes only:

The table below summarizes the format of these different octet types.
The letter x indicates bits available for encoding bits of the character number.

   Char. number range  |        UTF-8 octet sequence
      (hexadecimal)    |              (binary)
   --------------------+---------------------------------------------
   0000 0000-0000 007F | 0xxxxxxx
   0000 0080-0000 07FF | 110xxxxx 10xxxxxx
   0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
   0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
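
The table can be turned into a small Python sketch (not part of the patch; `classify_octet` is a hypothetical name). It shows that the three bytes of the pattern fall into different octet classes: 0xe2 can only start a three-octet character, while 0x80 and 0xae can only continue one, so in well-formed UTF-8 the sequence cannot begin in the middle of another character.

```python
def classify_octet(octet: int) -> str:
    """Classify one UTF-8 octet according to the RFC 3629 table above."""
    if (octet & 0x80) == 0x00:
        return "single"        # 0xxxxxxx
    if (octet & 0xC0) == 0x80:
        return "continuation"  # 10xxxxxx
    if (octet & 0xE0) == 0xC0:
        return "lead-2"        # 110xxxxx
    if (octet & 0xF0) == 0xE0:
        return "lead-3"        # 1110xxxx
    if (octet & 0xF8) == 0xF0:
        return "lead-4"        # 11110xxx
    return "invalid"


if __name__ == "__main__":
    # U+202E encoded as UTF-8 is e2 80 ae.
    for octet in "\u202e".encode("utf-8"):
        print(hex(octet), classify_octet(octet))
```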
(Assignee)

Updated

12 years ago
Flags: blocking2.22?

Comment 12

12 years ago
As far as I understand, this has always been a problem with Bugzilla, and it's not a regression, so I wouldn't hold up the release on it. However, it would definitely be nice to see it fixed before 2.22 if somebody does get to it.
Flags: blocking2.22? → blocking2.22-
Version: unspecified → 2.21
(Assignee)

Comment 13

12 years ago
Well... 2.22 is the first release to officially embrace UTF-8, so we might as well do it right.

I do think we have a chance to do ourselves a favour and start off with this potential source of embarrassment sealed :)

Comment 14

12 years ago
(In reply to comment #13)
Okay, I understand. I don't want to block the release on it, but I would definitely be happy to take it for 2.22 before we release, if the patch is properly reviewed in time. Perhaps ask somebody specific for a review (perhaps glob?), so that we can get the patch in.
(Assignee)

Comment 15

12 years ago
Comment on attachment 206098 [details] [diff] [review]
Patch 1.1

Byron, would you take a look at this? Maybe we can get it into 2.22. The BiDi specifics should be covered by smontagu's r+.
Attachment #206098 - Flags: review? → review?(bugzilla)

Comment 16

> Byron, would you take a look at this?

to be honest i'm with LpSolit on this, i don't really know enough about the issue to grant or deny the review :(

(Assignee)

Updated

12 years ago
Whiteboard: [Wanted for 2.22]

Comment 17

12 years ago
Perhaps somebody from the Mozilla team might have some idea? Are there any developers we have access to who would be able to offer us some guidance or a review?

Comment 18

12 years ago
Marc's patch is live on http://landfill.bugzilla.org/qa222my5 , see e.g. bug 22 there.

Comment 19

This is well out of my area of expertise.

Gerv

Comment 20

Is there any information I can provide to help along the review?

Comment 21

Comment on attachment 206098 [details] [diff] [review]
Patch 1.1

code looks good to me.  The process I'll go on smontagu's sayso :)
Attachment #206098 - Flags: review?(bugzilla) → review+
Flags: approval+

Comment 22

ok, the review mail that went out when I did that sucked bigtime. :)

This patch doesn't look like it fixes email, do we need to do that too?  (I assume so, but it could probably be handled on another bug if you want)

Comment 23

12 years ago
For those who didn't get the email, here is what it looked like:



Dave Miller <justdave@bugzilla.org> has granted Marc Schumann [[[[let's fix
Bug 319331 <wurblzap@gmail.com>'s request for review:
Bug 319331: mailto link generated with BiDi user name and email address gets
mangled with timestamp
https://bugzilla.mozilla.org/show_bug.cgi?id=319331

Attachment 206098 [details] [diff]: Patch 1.1
https://bugzilla.mozilla.org/attachment.cgi?id=206098&action=edit

------- Additional Comments from Dave Miller <justdave@bugzilla.org>
code looks good to me.	The process I'll go on smontagu's sayso  :) 
(Assignee)

Comment 24

12 years ago
Checking in Bugzilla/Template.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Template.pm,v  <--  Template.pm
new revision: 1.41; previous revision: 1.40
done

The follow-up bug for e-mail notifications is bug 324359.
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
(Reporter)

Comment 25

10 years ago
Hi! I assume that this is not fixed.

Please see Bug 406406 – remedy against BiDi interference caused by the (arbitrary) content of various fields,
which may relate to the same topic and may be a duplicate, but is still open (not live).

Best regards Reinhardt [[user:Gangleri]]
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 26

10 years ago
No, my mail looks fine, and so do the mailto links in Bugzilla.

Your new bug is a separate bug.
Status: REOPENED → RESOLVED
Last Resolved: 12 years ago → 10 years ago
Resolution: --- → FIXED