Closed Bug 195035 Opened 23 years ago Closed 12 years ago

google.com - bidirectional text displayed wrongly in source view

Categories

(Tech Evangelism Graveyard :: Other, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: punya, Assigned: momoi)

Details

(Keywords: top100, top500)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3b) Gecko/20030210
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3b) Gecko/20030210

In the source view for http://www.google.com/intl/ur/, parts of the text which should be displayed left-to-right are displayed right-to-left instead, for instance - </html> appears as <lmth/>. This happens at the end of the file, immediately after a large chunk of Urdu (right-to-left) text. There is no corresponding error in what is displayed by the browser.

Reproducible: Always

Steps to Reproduce:
1. Navigate to http://www.google.com/intl/ur/
2. View the source using View > Page Source
3. Read the last line, scrolling to the right if necessary

Actual Results: The source view displays left-to-right text in right-to-left order (e.g. <lmth/><ydob/>, in the middle of the last line of an html file)

Expected Results: It should display the left-to-right text in its correct order (e.g. </body></html>)
WFM with Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030224. Possibly fixed by bug 192919?
Bug 192919 is about misrendered html (rtl rendered as ltr) in the browser component (gecko), while this is about misrendered html source (i.e. plain text) in the source viewer (in this case, ltr rendered as rtl). It doesn't seem that it's been fixed.
I also see part of the source going the wrong way. Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030224 (nightly)
Confirmed using Mozilla 20030210 on WinXP
Status: UNCONFIRMED → NEW
Ever confirmed: true
Actually, view source *is* html rendered in the browser component, but this is not the same issue as bug 192919.
What is happening is this: the page source contains a single RLO (right-to-left override) character with no corresponding PDF (pop directional format). When viewing the HTML in the normal fashion, this doesn't matter because the override ends at the end of the block element. However, in the generated HTML that we use for view source, the elements that were block elements originally are no longer block elements, so the Bidi Algorithm produces incorrect results. Much the same thing happens viewing the source from IE, and for the same reason.
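For anyone who wants to check a page for this condition, here is a minimal Python sketch that reports embedding or override characters with no matching PDF. The function name and the sample string are made up for illustration (the sample is not the actual Google markup), and the sketch treats the whole input as one paragraph, ignoring the fact that a paragraph break also terminates an override.

```python
# Unicode bidi embedding/override controls and the character that closes them.
LRE, RLE, PDF, LRO, RLO = "\u202a", "\u202b", "\u202c", "\u202d", "\u202e"
OPENERS = {LRE: "LRE", RLE: "RLE", LRO: "LRO", RLO: "RLO"}


def unclosed_bidi_controls(text):
    """Return (offset, name) for each embedding/override never closed by a PDF."""
    stack = []
    for offset, ch in enumerate(text):
        if ch in OPENERS:
            stack.append((offset, OPENERS[ch]))
        elif ch == PDF and stack:
            # A PDF closes the most recently opened embedding or override.
            stack.pop()
    return stack


# Hypothetical sample: an RLO before some Urdu text, with no PDF after it.
sample = "<b>" + RLO + "\u0627\u0631\u062f\u0648</b></body></html>"
print(unclosed_bidi_controls(sample))  # [(3, 'RLO')]
```

An empty result means every override in the text is balanced.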
I recommend adding LRO...PDF characters around tags in the source view.
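As a rough illustration of that idea, the Python sketch below wraps each tag in a string of source text in LRO...PDF. This is not how the view-source code is implemented: the function name is hypothetical, and the regular expression is a naive stand-in for a real tokenizer (it will misfire on a ">" inside an attribute value or a comment).

```python
import re

LRO, PDF = "\u202d", "\u202c"  # left-to-right override, pop directional format

# Naive tag pattern, for illustration only; a real source viewer already
# knows where the tags are from its tokenizer.
TAG = re.compile(r"<[^<>]+>")


def isolate_tags(source):
    """Wrap every tag in LRO...PDF so its characters are laid out left to right."""
    return TAG.sub(lambda m: LRO + m.group(0) + PDF, source)


# ascii() makes the otherwise invisible control characters visible.
print(ascii(isolate_tags("</body></html>")))
```

Note that this only forces the characters inside each tag left to right. It does not close an earlier unmatched RLO, so the order of the tags relative to one another may still be affected.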
What about adding span { unicode-bidi: embed; } to viewsource.css?
To evangelism for now. The problem would be solved if that RLO wasn't there. It's unnecessary since the text in the element is right-to-left anyway, and embedding raw Bidi formatting characters in HTML is deprecated.
Assignee: mkaply → momoi
Component: BiDi Hebrew & Arabic → Asian
Product: Browser → Tech Evangelism
QA Contact: zach → ruixu
Version: Trunk → unspecified
Ian, Simon, could you please explain to me the costs, benefits, and implications of adding directionality chars to the source view? Either in a separate bug filed for this purpose or offline. I'm looking at things like: when is it needed, what does it do, what is the perf hit, what is the ram hit, etc.
-> evang500
Keywords: evang500
This is going to Other category. I will keep this bug, however.
Component: Asian → Other
QA Contact: ruixu → other
Kat, can you explain to me what I should say to Google to get them to change their content?
(In reply to comment #13)
> Kat, can you explain to me what I should say to Google to get them to change
> their content?

An expanded version of what I said in comment 6 and comment 9 :)

The Urdu translation of "I'm feeling lucky" appears in the HTML source with an RLO (right-to-left override) character at the beginning, but without the corresponding PDF (pop directional format) character at the end. The override character is unnecessary, because the Urdu text has right-to-left directionality already, and even if it were necessary, Unicode directional characters are deprecated in markup languages according to http://www.w3.org/TR/2003/NOTE-unicode-xml-20030613/#Bidi, and it would be more correct to use a <bdo> element to achieve the same effect. Since the RLO is only used in this one text on the whole page, I assume it's unintentional anyway.
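On the content side, the fix amounts to dropping the raw control character from the markup. A minimal Python sketch of that cleanup, assuming the page is available as a string (the function name and the sample markup are invented here, and the Urdu word is a placeholder rather than the real button text):

```python
# Embedding/override controls (LRE, RLE, PDF, LRO, RLO) mapped to None,
# which tells str.translate to delete them.
BIDI_CONTROLS = dict.fromkeys(map(ord, "\u202a\u202b\u202c\u202d\u202e"))


def strip_bidi_controls(markup):
    """Remove raw bidi embedding and override characters from markup."""
    return markup.translate(BIDI_CONTROLS)


# Hypothetical fragment with a stray RLO in front of right-to-left text.
dirty = "<b>\u202e\u0627\u0631\u062f\u0648</b>"
clean = strip_bidi_controls(dirty)
print(len(dirty), len(clean))  # 12 11
```

Where an override really is wanted, the markup equivalent would be a <bdo dir="rtl"> element around the text instead of the raw character.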
Keywords: top500
Summary: bidirectional text displayed wrongly in source view → google.com - bidirectional text displayed wrongly in source view
Keywords: top100
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
Product: Tech Evangelism → Tech Evangelism Graveyard