Closed Bug 787597 Opened 12 years ago Closed 12 years ago

Don't try to parse pages with too many elements

Categories

(Firefox for Android Graveyard :: Reader View, defect, P1)

ARM
Android
defect

Tracking

(firefox16 verified, firefox17 verified, firefox18 verified)

VERIFIED FIXED
Firefox 18
Tracking Status
firefox16 --- verified
firefox17 --- verified
firefox18 --- verified

People

(Reporter: bnicholson, Assigned: bnicholson)

Details

Attachments

(1 file)

Some large pages (e.g., http://en.wikipedia.org/wiki/List_of_j%C5%8Dy%C5%8D_kanji) can cause the reader worker to exceed 75MB when parsing. These pages have considerably more elements than most readable pages; the page I linked to has >27,000, whereas many don't even break 1,000. A simple check we can do before creating the worker is to count the number of elements in document. If they exceed X, bail immediately and don't try to parse.
This patch limits the number of elements to 3000. There's nothing particularly scientific about this number, but it's a higher than most pages, and considerably lower than those that cause excessive memory use. On my Droid RAZR, this check takes between 0-5ms on most pages. On the page linked to in comment 0, it takes 34ms. I don't think this is too terrible, especially considering the huge overhead we're preventing. Regarding the reader check, the only test page that now fails that passed before is: http://code.google.com/p/arc90labs-readability/source/browse/branches/haiti/js/readability.js. But I'm not sure this should even be considered a readable page anyway.
Attachment #657494 - Flags: review?(lucasr.at.mozilla)
Attachment #657494 - Flags: feedback?(mark.finkle)
Comment on attachment 657494 [details] [diff] [review] Don't do a reader parse for pages with > 3000 elements Remove the timing code before landing. We can always tweak the max count if needed.
Attachment #657494 - Flags: feedback?(mark.finkle) → feedback+
Comment on attachment 657494 [details] [diff] [review] Don't do a reader parse for pages with > 3000 elements Review of attachment 657494 [details] [diff] [review]: ----------------------------------------------------------------- Yep. Remove the timing code before landing.
Attachment #657494 - Flags: review?(lucasr.at.mozilla) → review+
Priority: -- → P1
Comment on attachment 657494 [details] [diff] [review] Don't do a reader parse for pages with > 3000 elements [Approval Request Comment] Bug caused by (feature/regressing bug #): User impact if declined: large pages can consume huge amounts of memory for the reader mode parsing Testing completed (on m-c, etc.): just landed m-i Risk to taking this patch (and alternatives if risky): low risk (only prevents large pages from doing a parse) String or UUID changes made by this patch: none
Attachment #657494 - Flags: approval-mozilla-beta?
Attachment #657494 - Flags: approval-mozilla-aurora?
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 18
Comment on attachment 657494 [details] [diff] [review] Don't do a reader parse for pages with > 3000 elements [Triage Comment] Approving for all branches given this is a new feature. The worst that could happen here is presumably that Reader Mode isn't offered enough. Good optimization.
Attachment #657494 - Flags: approval-mozilla-beta?
Attachment #657494 - Flags: approval-mozilla-beta+
Attachment #657494 - Flags: approval-mozilla-aurora?
Attachment #657494 - Flags: approval-mozilla-aurora+
This issue seems to be fixed on all branches (Nightly, Aurora and Beta). Closing bug as verified fixed. -- Device: Galaxy Note OS: Android 4.0.4
Status: RESOLVED → VERIFIED
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: