Closed Bug 785145 Opened 12 years ago Closed 12 years ago

Correctly mark Ps created from text-only DIVs for scoring

Categories

(Firefox for Android Graveyard :: Reader View, defect, P1)

All
Android
defect

Tracking

(firefox16 unaffected, firefox17 verified, firefox18 verified)

VERIFIED FIXED
Firefox 17
Tracking Status
firefox16 --- unaffected
firefox17 --- verified
firefox18 --- verified

People

(Reporter: lucasr, Assigned: lucasr)

Details

Attachments

(1 file)

Right now it seems we're doing it wrong add queuing a replaced node for scoring:

  http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/Readability.js#465
Blocks: 774914
Comment on attachment 654740 [details] [diff] [review]
Correctly mark Ps created from text-only DIVs for scoring

This change causes this page to fail:
http://articles.latimes.com/2012/mar/27/health/la-he-green-coffee-weight-loss-20120328

Even though the node is no longer in the document, maybe adding it to nodesToScore was intentional? Maybe the former node's parent is still useful for the algorithm.

Also, nodeIndex is decremented here. This means the index is checked again, which would cause newNode to be looked at twice. Perhaps this is contributing to the failure for the article I linked to? Maybe you could try removing the "nodeIndex -= 1;" line, but we might just want to leave this alone unless we have good reason to change it.
Attachment #654740 - Flags: review?(bnicholson) → review-
(In reply to Brian Nicholson (:bnicholson) from comment #2)
> Comment on attachment 654740 [details] [diff] [review]
> Correctly mark Ps created from text-only DIVs for scoring
> 
> This change causes this page to fail:
> http://articles.latimes.com/2012/mar/27/health/la-he-green-coffee-weight-
> loss-20120328

Interesting, it's not failing for me. Are you testing this on your reader test app or on device?

> Even though the node is no longer in the document, maybe adding it to
> nodesToScore was intentional? Maybe the former node's parent is still useful
> for the algorithm.

It doesn't make any sense to add the replaced to node to nodesToScore array. Once we call node.parentNode.replaceChild(newNode, node), the DIV that got replaced by the P will have a null parentNode pointer. This means this node will be simply discarded once through all the nodesToScore items. See:

http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/Readability.js#497

Note that the point of converting these DIVs to Ps is because the whole algorithm is driven by how much content each node has. This why we convert double BRs to Ps and why we convert certain DIVs to Ps. Paragraph nodes increment score for their parent and grandparent nodes:

http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/Readability.js#530

Hence, considering a parentless node for scoring is pointless for the algorithm.

> Also, nodeIndex is decremented here. This means the index is checked again,
> which would cause newNode to be looked at twice. Perhaps this is
> contributing to the failure for the article I linked to? Maybe you could try
> removing the "nodeIndex -= 1;" line, but we might just want to leave this
> alone unless we have good reason to change it.

Good catch. This is intentional. Once the loop visits the same index again, it will now be a P and the newNode will be added to nodesToScore array (mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/Readability.js#530). With this change, we don't need to do this anymore as we're adding the newNode straight away.

I think I'll combine the patch in this bug with the patch in bug 774914 as they're dealing with the same part of the algorithm. Better get these changes all at once.
Priority: -- → P1
No longer blocks: 774914
https://hg.mozilla.org/mozilla-central/rev/5e2b2c9c4f67
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 17
http://articles.latimes.com/2012/mar/27/health/la-he-green-coffee-weight-loss-20120328 doesn't fail anymore on the latest Nightly, Aurora and Beta builds. It seems that the issue was fixed. Closing bug as verified fixed.

--
Device: Galaxy Note
OS: Android 4.0.4
Status: RESOLVED → VERIFIED
I guess it was a regression in 17.0.
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: