Closed Bug 944773 Opened 12 years ago Closed 10 years ago

XML tags and urls in right-to-left texts in verbatim

Tracking

(Not tracked)

Status:

RESOLVED WONTFIX

People

(Reporter: amir_farsi, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

It shows XML Error in a Persian text which XML long 12 years ago Amir Farsi 491.19 KB, image/jpeg		Details
XmlErrorByLongLTRinRTL.png 12 years ago Amir Farsi 370.05 KB, image/png		Details

Amir Farsi

Reporter

Description

•

12 years ago

User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36

Steps to reproduce:

The problem is with the formatting of any left-to-right text inside right-to-left text (especially code, URLs, tags, XML, and so on). I have this problem in Verbatim, which shows an XML tag error.

Actual results:

Verbatim shows an XML error on complex right-to-left texts. As a result, long and complex texts become untranslatable for right-to-left languages, especially Persian (Farsi).

Amir Farsi

Reporter

Updated

•

12 years ago

Severity: normal → major

(no longer active)

Updated

•

12 years ago

Component: fa / Persian → Verbatim

Product: Mozilla Localizations → Webtools

Version: unspecified → Trunk

Dwayne Bailey

Comment 1

•

12 years ago

If there are XML tag errors, then in most cases your XML tags are incorrect. This is hard to determine in RTL languages. Verbatim uses the standard textarea, so any potential RTL issues here affect everyone. The most useful keyboard shortcut in these cases is Ctrl+Shift+X, which lets you toggle between RTL and LTR. In LTR you will be able to correct any broken variables or tags.
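As an illustration of the kind of check that produces an "XML tag error", here is a minimal sketch that compares the tags in a source string with the tags in its translation. It is not Verbatim's actual implementation; the helper names `tags` and `tag_mismatch`, the regular expression, and the sample strings are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately simple pattern for XML-like tags.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")

def tags(text):
    """Return a multiset of the XML-like tags found in text."""
    return Counter(TAG_RE.findall(text))

def tag_mismatch(source, target):
    """Return (missing, extra): tags absent from or added to the target."""
    src, tgt = tags(source), tags(target)
    missing = list((src - tgt).elements())
    extra = list((tgt - src).elements())
    return missing, extra

source = 'Click <a href="x">here</a> to continue.'
good = 'برای ادامه <a href="x">اینجا</a> را کلیک کنید.'
bad = 'برای ادامه <a href="x">اینجا</a را کلیک کنید.'  # closing tag lost its ">"

print(tag_mismatch(source, good))
print(tag_mismatch(source, bad))
```

The check works on the logical character order of the string, so it is unaffected by how the text is displayed. That is why a tag can look right in an RTL textarea and still fail, or look wrong and still pass.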

Amir Farsi

Reporter

Comment 2

•

12 years ago

Yes, I used the buttons to switch between RTL and LTR, but that caused other problems while typing. Anyway, I think the problem is in the processing of LTR text embedded in RTL text. For example, I can edit texts that contain only one XML tag or URL, but when a text contains multiple XML tags, long XML strings, brackets, or other symbols such as {}, editing it in an RTL language is very hard. The RTL and LTR buttons do not solve the problem well. Also, I think that when the system changes the direction of the text to RTL, it treats all of the text as RTL. For example, the system changes { to } and } to {, which causes problems in long texts and in texts with multiple XML tags. I think Ehsan Akhgari will understand what my problem is. This problem exists in most applications, except professional word processors. Mr. Akhgari, if you can describe the problem, please describe it here. Thanks for your comment.
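The swapping of { and } described here is consistent with bracket mirroring: under the Unicode Bidi Algorithm, characters that carry the Bidi_Mirrored property are drawn with a mirrored glyph when they resolve to right-to-left. Whether that is exactly what happened in the reporter's strings has not been verified. The sketch below only lists which of the usual markup punctuation characters carry that property, using Python's standard `unicodedata` module; the helper name `is_mirrored` is made up for this example.

```python
import unicodedata

def is_mirrored(ch):
    """True if the character has the Unicode Bidi_Mirrored property."""
    return unicodedata.mirrored(ch) == 1

# Brackets commonly found in XML tags and placeholders.
for ch in "{}<>()[]":
    print(ch, unicodedata.name(ch), is_mirrored(ch))

# An ordinary letter is not mirrored.
print("a", is_mirrored("a"))
```

Mirroring changes only the glyph that is displayed. The stored character stays the same, so a { that looks like } on screen is still a { in the saved translation.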

Amir Farsi

Reporter

Comment 3

•

12 years ago

Attached image It shows XML Error in a Persian text which XML long (obsolete) — Details

I added a screenshot from errors and status of text.

Amir Farsi

Reporter

Updated

•

12 years ago

Attachment #8340598 - Attachment is obsolete: true

Attachment #8340598 - Attachment is patch: true

Attachment #8340598 - Attachment mime type: image/jpeg → text/plain

Dwayne Bailey

Updated

•

12 years ago

Attachment #8340598 - Attachment is patch: false

Dwayne Bailey

Updated

•

12 years ago

Attachment #8340598 - Attachment mime type: text/plain → image/jpeg

Dwayne Bailey

Comment 4

•

12 years ago

OK, I see you solved the bug. I found Ctrl+Shift+X useful to validate that. I think you had a stray > that caused the error. Can you please close the bug if it is solved?

Amir Farsi

Reporter

Comment 5

•

12 years ago

Attached image XmlErrorByLongLTRinRTL.png — Details

These keys only switch the direction of the whole text between RTL and LTR. But my problem is not the direction of the whole text in the translation text box. In Verbatim, the translation text box is RTL by default for RTL languages, so there is no need to switch from LTR to RTL. I removed the previous attachment because I uploaded it incorrectly. I think this new attachment describes the problem visually. My problem is getting the correct direction for LTR text inside RTL text.

(no longer active)

Comment 6

•

12 years ago

Amir, the problem that you're facing is very common. Basically, the Unicode Bidi Algorithm (UBA), which is what pretty much all software uses these days to lay out bidirectional text, specifies some characters as having strong directionality (examples: a and ب) while some others (such as < and >) have weak directionality. The directionality of those weak characters is basically determined by their surrounding characters (note that I'm over-simplifying here), which means that the angle bracket characters used in HTML tags will get the wrong direction if their preceding character has strong RTL directionality. It's impossible for a program to determine whether this heuristic will result in a bad rendering. It doesn't most of the time, but it does especially when working on bidi HTML source code.

You _can_ work around this problem by adding Unicode marker characters (RLM and LRM), which are invisible characters having strong RTL and LTR directionality respectively, but then those characters will end up in your HTML code and will potentially break the actual layout of the rendered HTML.

There is really no good solution here, I'm afraid.
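The character classes mentioned above can be inspected with Python's standard `unicodedata` module. In the standard's own terms, a is class L and ب is class AL (both strong), while < and > are class ON ("Other Neutral"), which is the group the comment loosely calls weak. The second half of the sketch shows why the LRM workaround leaks into the data: the marks are invisible but are real characters in the string. The helper names and the sample string are assumptions made for this example, not anything Verbatim provides.

```python
import re
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, strong LTR
RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, strong RTL

# Hypothetical, deliberately simple pattern for XML-like tags.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")

def bidi_classes(text):
    """Return the Unicode Bidi_Class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

def wrap_tags_with_lrm(text):
    """Put an LRM on each side of every tag (illustrative workaround only)."""
    return TAG_RE.sub(lambda m: LRM + m.group(0) + LRM, text)

def strip_marks(text):
    """Remove LRM/RLM again, for example before the string is saved."""
    return text.replace(LRM, "").replace(RLM, "")

print(bidi_classes("a"), bidi_classes("ب"), bidi_classes("<>"))
print(bidi_classes(LRM + RLM))  # ['L', 'R']

original = "متن <b>مهم</b> است"
marked = wrap_tags_with_lrm(original)
print(len(original), len(marked))       # the marked string is 4 characters longer
print(strip_marks(marked) == original)  # True
```

Stripping the marks before saving keeps them out of the stored HTML, but it also discards the display hint, so the text would render as before the next time it is opened for editing.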

Amir Farsi

Reporter

Comment 7

•

12 years ago

(In reply to :Ehsan Akhgari (needinfo? me!) from comment #6)

Thanks, Ehsan, for your very good description. It shows that the problem I reported is a common one, and how it happens. But Ehsan, there should be a solution for processing and rendering the directionality of bidirectional text. For example, characters should not have to take their direction from the surrounding characters; they could have separate direction processing. At the least, we could have separate direction processing for characters such as < and >, which may be part of HTML and other markup languages, if it were built into the rendering systems, software, and algorithms. I think one day the problems of bidirectional text must be solved. Ehsan, could you run an open source project in Mozilla for more research on solving this problem? Also, I used those Unicode marker characters, but they caused other problems and did not solve the problem.

(no longer active)

Comment 8

•

12 years ago

Well, the issue is that the Unicode Bidi Algorithm has not been designed to deal with things such as laying out bidi XML source code. It's mostly designed with the natural language use cases in mind, so it's not entirely a surprise that it falls down badly in this case. I've never seen a better proposal which can handle both natural language and these "programming" use cases. Also note that the UBA is implemented in pretty much every piece of software that displays text on any platform (at least those that are bidi aware), so replacing it with another algorithm is a huge challenge, if not impossible.

Jeff Beatty [:gueroJeff]

Comment 9

•

10 years ago

Resolving this bug: Ehsan has described the problem very well, this is a UBA-specific issue, and the cost of attempting to replace UBA with something else is too high.

Status: UNCONFIRMED → RESOLVED

Closed: 10 years ago

Resolution: --- → WONTFIX

Nobody; OK to take it and work on it

Assignee

Updated

•

10 years ago

Product: Webtools → Webtools Graveyard


Categories

(Webtools Graveyard :: Verbatim, defect)
