Reader Mode: Optimize readability check

RESOLVED FIXED in Firefox 16

Status

()

Firefox for Android
General
RESOLVED FIXED
5 years ago
a year ago

People

(Reporter: lucasr, Assigned: mfinkle)

Tracking

({perf})

Trunk
Firefox 16
All
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox16+ fixed)

Details

(Whiteboard: [sps])

Attachments

(4 attachments)

(Reporter)

Description

5 years ago
The readability check introduced in bug 750683 involves cloning the DOM document to find the top candidate node with the article content inside the page. We need to optimize this readability check as much as possible and avoid the document node cloning altogether.

Ideas:
- Avoid removing any node from document to process the page
- Look for more steps in the parsing process that are unnecessary for the readability check (we already skip prepDocument and removeScripts, there might be more)
Does this include optimizing current false positive cases? For example I'm seeing the reader icon on Bugzilla bug lists, which doesn't really make sense. In-scope or new bug?
(Reporter)

Comment 2

5 years ago
(In reply to pretzer from comment #1)
> Does this include optimizing current false positive cases? For example I'm
> seeing the reader icon on Bugzilla bug lists, which doesn't really make
> sense. In-scope or new bug?

Please file a separate bug for false positives. The readability detection code definitely has to be improved. This bug is about the efficiency of the readability checking code.
Filed bug 767956
Created attachment 638754 [details]
Log while readability heuristics are running

+1 on optimizing the readability check. If you load a page like https://crash-stats.mozilla.com/report/index/bp-2b7c0808-37df-40f0-a4d2-a3ed42120703 and then immediately zoom in, the page remains blurry for ~30 seconds before it repaints (on a Galaxy Tab 10.1). Logcat (attached) indicates that it is running the readability detection heuristics during this time. In the log I did the pinch at around 11:47:16 and the page didn't render sharply until around 11:47:48 which is a ridiculously long time.
This script is showing up as a major source of main thread contention on mobile. Readability.js blocks for 1.7 seconds on cnn.com page load.

Profile:
http://people.mozilla.com/~bgirard/cleopatra/?report=AMIfv94gvUJfBU9O1bx1V8cewSf-Qz01_PLNizR0f4SKW39Cp0AXMKsIMNB3fCW4zwYS7ZnGoxLPBBfyfYWi3AuZaua2WXG5bVBTB6_ZM5GP4ZFEv5BxH04d7oHGjKDgtceS1GPFS3x0KJfq4I6xxjL0afF0O9SkOQ

I think we need to escalate this.
Blocks: 771374
tracking-firefox16: --- → ?

Updated

5 years ago
status-firefox16: --- → affected
Keywords: perf

Updated

5 years ago
Whiteboard: [sps]
Created attachment 641447 [details] [diff] [review]
part 1: skip DOM manipulations on check

This patch tries to skip any DOM manipulations when performing a check. It also fixes a typo "doc" -> "node".

This part seems to reduce the time to check on nytimes by half
Assignee: nobody → mark.finkle
Attachment #641447 - Flags: review?(bnicholson)
Created attachment 641448 [details] [diff] [review]
part 2: wait until after page load to perform check

We are currently doing the reader check on "DOMContentLoaded" which is too early. The page scripts have not run and the network isn't even really finished yet. This makes it seem to take longer for the page to load.

This patch moves to "pageshow", which is also called when going back/forward in session history too.
Attachment #641448 - Flags: review?(bnicholson)
Part 2 introduces a bug. It seems that the Java BrowserToolbar code is designed in a way that assumes the reader check is done _before_ the network progress is complete. Therefore, the reader icon is not displayed unless you change the orientation and the toolbar is refreshed. Patch coming up...
Created attachment 641620 [details] [diff] [review]
part 3: make browsertoolbar smarter about page actions

BrowserToolbar assumes that the security mode and reader mode will be set beofre the network progress finishes. That is not a good idea.

This patch:
* Renames setReaderVisiblilty to setReaderMode, since it does not actually force the icon to be visible.
* Renames setStopVisibility to setPageActionVisibility since it's about more than "stop" now.
* setReaderMode and setSecurityMode both now call setPageActionVisibility in case the icon needs to be shown now.

The only real hack I had was using mStop.getVisibility() as a param to setPageActionVisibility, but it works well enough for now.
Attachment #641620 - Flags: review?(sriram)
Comment on attachment 641620 [details] [diff] [review]
part 3: make browsertoolbar smarter about page actions

Review of attachment 641620 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me.
Attachment #641620 - Flags: review?(sriram) → review+
Comment on attachment 641447 [details] [diff] [review]
part 1: skip DOM manipulations on check

Review of attachment 641447 [details] [diff] [review]:
-----------------------------------------------------------------

These changes look pretty harmless, and they didn't affect my readability test cases.

::: mobile/android/chrome/content/Readability.js
@@ -1300,5 @@
>  
>    check: function() {
> -    // Set proper flags for parsing document in readability check mode
> -    this._flags = this.FLAG_READABILITY_CHECK |
> -                  this.FLAG_STRIP_UNLIKELYS;

I wonder if this change might actually hurt performance. It avoids removing them, which should help, but that means we leave more nodes to be processed when parsing - especially if these nodes have lots of children.
Attachment #641447 - Flags: review?(bnicholson) → review+
Comment on attachment 641448 [details] [diff] [review]
part 2: wait until after page load to perform check

Review of attachment 641448 [details] [diff] [review]:
-----------------------------------------------------------------

Hopefully this will fix bug 767959.
Attachment #641448 - Flags: review?(bnicholson) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/f1c05fde89c7
https://hg.mozilla.org/integration/mozilla-inbound/rev/d0bd171a6f4e
https://hg.mozilla.org/integration/mozilla-inbound/rev/7601c7ace552

Updated

5 years ago
tracking-firefox16: ? → +
https://hg.mozilla.org/mozilla-central/rev/f1c05fde89c7
https://hg.mozilla.org/mozilla-central/rev/d0bd171a6f4e
https://hg.mozilla.org/mozilla-central/rev/7601c7ace552
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 16

Updated

5 years ago
status-firefox16: affected → fixed
You need to log in before you can comment on or make changes to this bug.