Closed Bug 1081742 Opened 10 years ago Closed 9 years ago

Reader mode doesn't support RTL/bidi text (e.g., Arabic)

Categories

(Firefox for Android Graveyard :: Reader View, defect)

32 Branch
Other
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Firefox 38

People

(Reporter: kafhaya66, Assigned: capella)

References

()

Details

Attachments

(2 files, 2 obsolete files)

User Agent: Mozilla/5.0 (Android; Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
Build ID: 20140923174357

Steps to reproduce:

firefox support arabic font very good


Actual results:

but when i add some of arabic page to read later, font change to left and not support right text alignment in read later list and also when l want to convert this page to pdf


Expected results:

 for arabic when add to read later it should to align right side of page and also when convert to pdf
Do you have an example page showcasing this? Maybe this is bug 828922.
Flags: needinfo?(kafhaya66)
Component: General → Reader Mode
Summary: right text alignment for arabic → right text alignment for arabic not adhered to in reader mode
Flags: needinfo?(kafhaya66)
Group: core-security
CC list accessible: false
Not accessible to reporter
Group: core-security
Spotted a user mentioning this in a review on Google Play.
Blocks: rtl-meta
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: right text alignment for arabic not adhered to in reader mode → Reader mode doesn't support RTL/bidi text (e.g., Arabic)
Just noting in here based on IRC conversation that I ran into a related problem where RTL text-selection in reader mode will unselect a first word in a sentence, e.g, http://cl.ly/1c3v2V24042x/video.mp4 (mentioned that perhaps element .direction is not maintained)
Attached patch bug1081742.diff (obsolete) — Splinter Review
This seems to affect desktop too.

I have a basic patch I threw together that seems to work for both platforms, but at this point it depends partially on backing out bug 785992.

So I'll need discussion on why we did that, a way to have nsIParserUtils honor the text direction, or any other more obvious fix I'm unaware of.

I've little experience in this area, so I'm asking bnicholson for feedback since I see his prior involvement with JSDOMParser, and I may have spotted a loosely (un)related issue in getStyle() / setStyle().
Attachment #8557405 - Flags: feedback?(bnicholson)
Attached patch bug1081742.diff (obsolete) — Splinter Review
Tweaked during testing.
Assignee: nobody → markcapella
Attachment #8557405 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8557405 - Flags: feedback?(bnicholson)
Attachment #8557515 - Flags: review?(bnicholson)
Comment on attachment 8557515 [details] [diff] [review]
bug1081742.diff

Additional suggested reviewers ...
Attachment #8557515 - Flags: review?(margaret.leibovic)
Attachment #8557515 - Flags: review?(jaws)
Comment on attachment 8557515 [details] [diff] [review]
bug1081742.diff

Review of attachment 8557515 [details] [diff] [review]:
-----------------------------------------------------------------

Removing jaws since Margaret and I are now peers of the Reader submodule (https://wiki.mozilla.org/Toolkit/Submodules). Shouldn't need 3 reviewers here.

::: toolkit/components/reader/AboutReader.jsm
@@ -588,5 @@
> -    let parserUtils = Cc["@mozilla.org/parserutils;1"].getService(Ci.nsIParserUtils);
> -    let contentFragment = parserUtils.parseFragment(article.content, Ci.nsIParserUtils.SanitizerDropForms,
> -                                                    false, articleUri, this._contentElement);
> -    this._contentElement.innerHTML = "";
> -    this._contentElement.appendChild(contentFragment);

This wasn't working since the sanitizer strips styles by default. You should be able to fix this by adding the SanitizerAllowStyle flag [1].

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/nsIParserUtils

To answer your question, the sanitizer probably isn't essential since the reader content is sandboxed, and we already strip unwanted elements in Readability.js. It doesn't hurt to have it if we can, though, since it will purge anything we may have missed.

::: toolkit/components/reader/content/JSDOMParser.js
@@ +578,5 @@
>        let attr = this.node.getAttribute("style");
>        if (!attr)
>          return undefined;
>  
> +      let styles = attr.split(";");

Ouch, nice catch.

::: toolkit/components/reader/content/Readability.js
@@ +431,5 @@
>          allElements[i]._index = i;
>        }
>  
>        /**
> +       * Readability returns static node lists, not live ones. When we remove

I think this comment was correct before. JSDOMParser isn't a real standards-compliant parser; things like "document.getElementsByTagName(...)" return a static array instead of a live NodeList. That means if any of these elements are added/removed from the document, the result will not change. This is a limitation of JSDOMParser, not Readability. The latter can still be used without the former, in which case this comment wouldn't apply.

@@ +913,4 @@
>        e.removeAttribute('style');
> +      if (textDirection !== undefined) {
> +        e.setAttribute("style", textDirection);
> +      }

So what's happening here is:
1) You're getting the style from the node
2) The style is being removed from the node
3) The style from #1 is being applied to node

In other words, this block isn't doing anything :) What you want is to keep the direction style while stripping everything else:

let style = e.getStyle("direction");
e.removeAttribute("style");
if (style) {
  e.setStyle("direction", style);
}

@@ +926,4 @@
>            cur.removeAttribute("style");
> +          if (textDirection !== undefined) {
> +            cur.setAttribute("style", textDirection);
> +          }

Same here. I'd recommend creating a _stripNonDirectionStyles() helper that can be called from both places.
Attachment #8557515 - Flags: review?(jaws)
Attachment #8557515 - Flags: review?(bnicholson)
Attachment #8557515 - Flags: feedback+
Thanks for all that ! Not sure how that comment got changed, accidental find/replace I'm guessing.

I'm not sure what you mean by |In other words, this block isn't doing anything|.
Both out intent seems to be to save the |direction| style, remove any other style k:v pairs such as "display:foo", "margin;foo", etc. end reset the saved direction style.

I'll look at your example closer, but adding a helper to JSDOMParser is "yeah let's!" :)

I appreciate the info re: You should be able to fix this by adding the SanitizerAllowStyle flag. TIL moment, and no need to backout useful code.
(In reply to Mark Capella [:capella] from comment #11)
> I'm not sure what you mean by |In other words, this block isn't doing
> anything|.
> Both out intent seems to be to save the |direction| style, remove any other
> style k:v pairs such as "display:foo", "margin;foo", etc. end reset the
> saved direction style.

Yeah, I could see that was your intent, but I don't think it was quite working that way. `let textDirection = e.getAttribute("style");` gets *all* styles for the element, not just the direction style. So when you reset the style after removing it, you were restoring all other k:v pairs with it.
ooops, there it is.
Attached patch bug1081742.diffSplinter Review
Cleaner :)
Attachment #8557515 - Attachment is obsolete: true
Attachment #8557515 - Flags: review?(margaret.leibovic)
Attachment #8558275 - Flags: review?(bnicholson)
Comment on attachment 8558275 [details] [diff] [review]
bug1081742.diff

Review of attachment 8558275 [details] [diff] [review]:
-----------------------------------------------------------------

Looks OK to me!

::: toolkit/components/reader/content/Readability.js
@@ +929,5 @@
> +   * Remove all style atrributes, preserving RTL/LTR text direction styles.
> +   *
> +   * @param Element
> +   * @return void.
> +   */ 

Nit: whitespace
Attachment #8558275 - Flags: review?(bnicholson) → review+
https://hg.mozilla.org/mozilla-central/rev/efd053a1586c
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 38
It looks that desktop reader mode still doesn't support RTL on this page:
http://hwzone.co.il/%D7%94%D7%A6%D7%A6%D7%94-%D7%9C%D7%9E%D7%95%D7%A6%D7%A8%D7%99-%D7%94%D7%A2%D7%99%D7%93%D7%9F-%D7%94%D7%97%D7%93%D7%A9-%D7%A9%D7%9C-amd-%D7%A9%D7%99%D7%92%D7%99%D7%A2%D7%95-%D7%91%D7%A9%D7%A0%D7%AA-201/

Kubuntu 15.04 64bit, Firefox 39.

Should I open a new bug?
Flags: needinfo?(markcapella)
Flags: needinfo?(markcapella)
(In reply to Kevin Brosnan [:kbrosnan] from comment #21)
> Works for me on a Nightly build.
> http://people.mozilla.org/~kbrosnan/tmp/1081742/rtl-reader.webm

Kevin, it's that I get and it's wrong.
The text should be aligned to right and RTL should be enabled (so the rows with both English and Hebrew text will appear in proper order).
Flags: needinfo?(kbrosnan)
mmm ... looks like maybe we had a conflict with the implementation of the new ReaderMode, when it moved to support desktop [0] bug 1134443  (?)

My original patch changed AboutReader.jsm, JSDOMParse.js, and Readability.js.

Bug 1134443 seems to have stepped on Readability.js changes ... looking further ...

[0] http://hg.mozilla.org/mozilla-central/diff/2bbb4ac2a49e/toolkit/components/reader/content/Readability.js
The "lost code" in comment #23 is interesting, but seems un-related.

This particular page seems to be setting the body direction via .css [1]

I now recall an earlier irc conversation [2] for this issue, and a tracking bug [3] - "Reader View picks wrong direction for some RTL pages."


[1] view-source:http://hwzone.co.il/wp-content/themes/HWzone%20Theme/style.css?v=0.0753
[2] http://logs.glob.uno/?c=mozilla%23introduction&s=10+Jun+2015&e=10+Jun+2015&h=bidi#c221144
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1173548
Flags: needinfo?(kbrosnan)
No longer blocks: rtl-meta
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: