Closed Bug 1640490 Opened 4 years ago Closed 4 years ago

Reader mode shows wrong content on some pages because it always re-fetches the article from the original URL

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla79

Tracking Flags:

Tracking

Status

firefox-esr68

---

unaffected

firefox76

---

unaffected

firefox77

---

unaffected

firefox78

fixed

firefox79

---

fixed

People

(Reporter: evilpie, Assigned: enndeakin)

References

(Regressed 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(2 files)

Bug 1640490 - Backed out changeset 217394da44ac (bug 1591469) for breaking about:reader in a lot of cases, r?neildeakin 4 years ago :Gijs (he/him) 47 bytes, text/x-phabricator-request	jcristau : approval-mozilla-beta+	Details \| Review
Bug 1640490, cache the article data in the parent process then retrieve it in the reader mode process when entering reader mode, r=gijs 4 years ago Neil Deakin 47 bytes, text/x-phabricator-request		Details \| Review

Tom S [:evilpie]

Reporter

Description

•

4 years ago

•

Edited

Bug 1591469 broke something for login protected sites. Since this change the reader mode will just contain a "readable" version of the login form, instead of the blog post after login. I can reproduce this on [removed], but I obviously can't share that password.

12:03.66 INFO: Last good revision: c060bcf8ea7a089333bfd0d99f096ef797ceb390
12:03.66 INFO: First bad revision: 217394da44ac9b55df2f91bf639944ea47949b3d
12:03.66 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c060bcf8ea7a089333bfd0d99f096ef797ceb390&tochange=217394da44ac9b55df2f91bf639944ea47949b3d

Tom S [:evilpie]

Reporter

Updated

•

4 years ago

Flags: needinfo?(enndeakin)

Tom S [:evilpie]

Reporter

Updated

•

4 years ago

Summary: Read mode doesn't work on login-protected pages → Reader mode doesn't work on login-protected pages

Tom S [:evilpie]

Reporter

Comment 1

•

4 years ago

[Tracking Requested - why for this release]:

Ah. I can also reproduce this on: https://www.zeit.de/mobilitaet/2020-05/hyundai-nexo-wasserstoff-brennstoffzelle-elektroauto-automarkt-deutschland. Click on "Einverstanden" under "zeit.de mit Werbung".

tracking-firefox78: --- → ?

:Gijs (he/him)

Comment 2

•

4 years ago

It looks like the article promise that's supposed to be coming from the content that we parse when clicking the button is not available, so we fall back to loading the content from the URL, which is why things break. I don't know why that promise is now null in these cases (or maybe always, but we just don't notice on other content?).

Summary: Reader mode doesn't work on login-protected pages → Reader mode shows wrong content on some pages

:Gijs (he/him)

Comment 3

•

4 years ago

(In reply to :Gijs (he/him) from comment #2)

I don't know why that promise is now null in these cases (or maybe always, but we just don't notice on other content?).

Seems to be always - you can reproduce with e.g. a BBC article, by modifying the article text to contain some bogus text ("haha this should show up in reader mode") and then pressing the reader mode button. In working builds, the text shows up. In broken builds, it does not.

status-firefox76: --- → unaffected

status-firefox77: --- → unaffected

status-firefox78: --- → affected

status-firefox-esr68: --- → unaffected

Keywords: regression

Summary: Reader mode shows wrong content on some pages → Reader mode shows wrong content on some pages because it always re-fetches the article from the original URL

:Gijs (he/him)

Comment 4

•

4 years ago

OK, so this is broken because actor A (say, for zeit.de ) sets this._articlePromise, but then actor B (for about:reader?url=...zeit.de...) gets used for the reader mode document, because it's a different document and origin, and when it checks it won't have the article promise. So the passing of that promise got broken.

I'm not really sure what the best solution here is. For non-fission, we won't process switch the child, so we could store the pending promises in a weakmap keyed on browsingcontexts that were in-process. But that won't work for fission. To be fission-compatible, we could potentially pass the information through the parent, but (a) we'd be passing content from one (potentially compromised) child process to another, which is a bit unfortunate, and (b) we'd be passing quite large strings around, which feels unfortunate. Hopefully Neil has some ideas.

Julien Cristau [:jcristau] (back April 22)

Comment 5

•

4 years ago

Should we back out bug 1591469 for 78?

tracking-firefox78: ? → +

Neil Deakin

Assignee

Comment 6

•

4 years ago

So if I understand this, the existing loaded document is processed and a new document tree for reader mode is generated from that. Then that is inserted into the about reader document, and presumably doesn't contain any scripts or anything, since it is not the same domain as the source page? So in fission mode, the reader mode version would not be in the same process as the original.

So this worked before because the legacy actors were reused for the reader mode version? But would never have worked in fission?

Flags: needinfo?(enndeakin)

:Gijs (he/him)

Comment 7

•

4 years ago

(In reply to Neil Deakin from comment #6)

So if I understand this, the existing loaded document is processed and a new document tree for reader mode is generated from that. Then that is inserted into the about reader document,

Yes.

and presumably doesn't contain any scripts or anything, since it is not the same domain as the source page?

Readability strips some of it itself, but we also run everything through the sanitizer. So no scripts (nor much else, e.g. unfortunately youtube embeds also don't work, and nobody has bothered teaching our sanitizer implementation about srcset, either, etc...).

So in fission mode, the reader mode version would not be in the same process as the original.

Yes.

So this worked before because the legacy actors were reused for the reader mode version? But would never have worked in fission?

I guess so, yes. It wasn't always an actor, it used to just be a frame script...

Anyway, what should we do about it, going forward? :-)

(In reply to Julien Cristau [:jcristau] from comment #5)

Should we back out bug 1591469 for 78?

This is definitely the simplest, but isn't a permanent solution.

Flags: needinfo?(enndeakin)

Julien Cristau [:jcristau] (back April 22)

Comment 8

•

4 years ago

78 is now on beta. A straight backout conflicts with further changes to AboutReader.jsm though.

:Gijs (he/him)

Comment 9

•

4 years ago

Attached file Bug 1640490 - Backed out changeset 217394da44ac (bug 1591469) for breaking about:reader in a lot of cases, r?neildeakin — Details

Jared Wein [:jaws] (please needinfo? me)

Updated

•

4 years ago

Assignee: nobody → gijskruitbosch+bugs

Severity: -- → S3

Status: NEW → ASSIGNED

Priority: -- → P1

:Gijs (he/him)

Comment 10

•

4 years ago

I agree this should be P1 but I would really prefer not to own this. I've just come up with a backout patch that applies to beta...

Assignee: gijskruitbosch+bugs → nobody

Status: ASSIGNED → NEW

Phabricator Automation

Updated

•

4 years ago

Assignee: nobody → gijskruitbosch+bugs

Status: NEW → ASSIGNED

Neil Deakin

Assignee

Comment 11

•

4 years ago

Would it be feasible to have some mechanism that always reuses the same process for the reader mode version as the original page?

Flags: needinfo?(enndeakin) → needinfo?(gijskruitbosch+bugs)

:Gijs (he/him)

Comment 12

•

4 years ago

(In reply to Neil Deakin from comment #11)

Would it be feasible to have some mechanism that always reuses the same process for the reader mode version as the original page?

It's the same process right now, it just won't be under Fission, I think - different origin, so different process. And we definitely do not want the about:reader privileged code running on the same origin as the web content. We could conceivably put the about:reader content in some kind of iframe inside the privileged frame, and then make the iframe same-origin with the original page, but that's a lot of work, has complications (like how to make the iframe appear + behave seamless) and would upset the current actor set-up anyway - plus we ideally, long-term, want to move the content into a sandboxed iframe (cf. bug 1204818) which due to the sandboxing would likely make it not same-origin again.

I see phabricator assigned me anyway, despite comment #10, so that's nice...

Flags: needinfo?(gijskruitbosch+bugs)

Mathew Hodson

Comment 13

•

4 years ago

This issue seems to be breaking reader mode on getpocket.com

Last good revision: c060bcf8ea7a089333bfd0d99f096ef797ceb390
First bad revision: 217394da44ac9b55df2f91bf639944ea47949b3d
Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c060bcf8ea7a089333bfd0d99f096ef797ceb390&tochange=217394da44ac9b55df2f91bf639944ea47949b3d

Go to https://getpocket.com/explore/item/a-jury-of-peers?utm_source=pocket-newtab
Press F9 to toggle reader view

Results:

Failed to load article from page

Mathew Hodson

Comment 14

•

4 years ago

There may have been a change to the getpocket.com site that is compounding this issue. I noticed when testing with mozregression, the first navigation to an article on getpocket.com on each profile would give me an 500 error page. A refesh would then show the article.

Also in my reproduction steps, if you exit reader mode, you also get a 500 error page.

Seems similar to bug 1633431.

Neil Deakin

Assignee

Comment 15

•

4 years ago

So it sounds like long term we can't really avoid passing the data from the child process to the parent and then to the new reader mode process.

:Gijs (he/him)

Comment 16

•

4 years ago

Comment on attachment 9153146 [details]
Bug 1640490 - Backed out changeset 217394da44ac (bug 1591469) for breaking about:reader in a lot of cases, r?neildeakin

Beta/Release Uplift Approval Request

User impact if declined: Reader mode doesn't work with paywalls or when article content is fetched dynamically
Is this code covered by automated tests?: Yes
Has the fix been verified in Nightly?: No
Needs manual test from QE?: No
If yes, steps to reproduce:
List of other uplifts needed: n/a
Risk to taking this patch: Low
Why is the change risky/not risky? (and alternatives if risky): This is a backout of a frontend-only patch
String changes made/needed: none

Attachment #9153146 - Flags: approval-mozilla-beta?

Neil Deakin

Assignee

Comment 17

•

4 years ago

I also have a mostly done patch that sends the parsed article data up to the parent and on to the reader mode process.

Julien Cristau [:jcristau] (back April 22)

Updated

•

4 years ago

status-firefox79: --- → affected

Julien Cristau [:jcristau] (back April 22)

Comment 18

•

4 years ago

Comment on attachment 9153146 [details]
Bug 1640490 - Backed out changeset 217394da44ac (bug 1591469) for breaking about:reader in a lot of cases, r?neildeakin

backout to fix a regression with reader mode, approved for 78.0b3

Attachment #9153146 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Julien Cristau [:jcristau] (back April 22)

Comment 19

•

4 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-beta/rev/52bf9bd04b97

status-firefox78: affected → fixed

:Gijs (he/him)

Comment 20

•

4 years ago

(In reply to Neil Deakin from comment #17)

I also have a mostly done patch that sends the parsed article data up to the parent and on to the reader mode process.

Alright, going to pass this on to you then. :-)

Assignee: gijskruitbosch+bugs → enndeakin

Neil Deakin

Assignee

Comment 21

•

4 years ago

Attached file Bug 1640490, cache the article data in the parent process then retrieve it in the reader mode process when entering reader mode, r=gijs — Details

This allows the correct article to be displayed even when the reader mode ends up using a different process than the original page,

Pulsebot

Comment 22

•

4 years ago

Pushed by neil@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6323c2dfd00f
cache the article data in the parent process then retrieve it in the reader mode process when entering reader mode, r=Gijs

Natalia Csoregi [:nataliaCs]

Comment 23

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/6323c2dfd00f

Status: ASSIGNED → RESOLVED

Closed: 4 years ago

status-firefox79: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla79

Jared Wein [:jaws] (please needinfo? me)

Updated

•

3 years ago

Regressions: 1720903

BMO Automation

Updated

•

3 years ago

Has Regression Range: --- → yes

You need to log in before you can comment on or make changes to this bug.