Reader mode shows wrong content on some pages because it always re-fetches the article from the original URL
Categories
(Toolkit :: Reader Mode, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox76 | --- | unaffected |
firefox77 | --- | unaffected |
firefox78 | + | fixed |
firefox79 | --- | fixed |
People
(Reporter: evilpie, Assigned: enndeakin)
References
(Regressed 1 open bug, Regression)
Details
(Keywords: regression)
Attachments
(2 files)
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
Details | Review |
Bug 1591469 broke something for login protected sites. Since this change the reader mode will just contain a "readable" version of the login form, instead of the blog post after login. I can reproduce this on [removed], but I obviously can't share that password.
12:03.66 INFO: Last good revision: c060bcf8ea7a089333bfd0d99f096ef797ceb390
12:03.66 INFO: First bad revision: 217394da44ac9b55df2f91bf639944ea47949b3d
12:03.66 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c060bcf8ea7a089333bfd0d99f096ef797ceb390&tochange=217394da44ac9b55df2f91bf639944ea47949b3d
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Comment 1•4 years ago
|
||
[Tracking Requested - why for this release]:
Ah. I can also reproduce this on: https://www.zeit.de/mobilitaet/2020-05/hyundai-nexo-wasserstoff-brennstoffzelle-elektroauto-automarkt-deutschland. Click on "Einverstanden" under "zeit.de mit Werbung".
Comment 2•4 years ago
|
||
It looks like the article promise that's supposed to be coming from the content that we parse when clicking the button is not available, so we fall back to loading the content from the URL, which is why things break. I don't know why that promise is now null in these cases (or maybe always, but we just don't notice on other content?).
Comment 3•4 years ago
|
||
(In reply to :Gijs (he/him) from comment #2)
I don't know why that promise is now null in these cases (or maybe always, but we just don't notice on other content?).
Seems to be always - you can reproduce with e.g. a BBC article, by modifying the article text to contain some bogus text ("haha this should show up in reader mode") and then pressing the reader mode button. In working builds, the text shows up. In broken builds, it does not.
Comment 4•4 years ago
|
||
OK, so this is broken because actor A (say, for zeit.de
) sets this._articlePromise
, but then actor B (for about:reader?url=...zeit.de...
) gets used for the reader mode document, because it's a different document and origin, and when it checks it won't have the article promise. So the passing of that promise got broken.
I'm not really sure what the best solution here is. For non-fission, we won't process switch the child, so we could store the pending promises in a weakmap keyed on browsingcontexts that were in-process. But that won't work for fission. To be fission-compatible, we could potentially pass the information through the parent, but (a) we'd be passing content from one (potentially compromised) child process to another, which is a bit unfortunate, and (b) we'd be passing quite large strings around, which feels unfortunate. Hopefully Neil has some ideas.
Assignee | ||
Comment 6•4 years ago
|
||
So if I understand this, the existing loaded document is processed and a new document tree for reader mode is generated from that. Then that is inserted into the about reader document, and presumably doesn't contain any scripts or anything, since it is not the same domain as the source page? So in fission mode, the reader mode version would not be in the same process as the original.
So this worked before because the legacy actors were reused for the reader mode version? But would never have worked in fission?
Comment 7•4 years ago
|
||
(In reply to Neil Deakin from comment #6)
So if I understand this, the existing loaded document is processed and a new document tree for reader mode is generated from that. Then that is inserted into the about reader document,
Yes.
and presumably doesn't contain any scripts or anything, since it is not the same domain as the source page?
Readability strips some of it itself, but we also run everything through the sanitizer. So no scripts (nor much else, e.g. unfortunately youtube embeds also don't work, and nobody has bothered teaching our sanitizer implementation about srcset
, either, etc...).
So in fission mode, the reader mode version would not be in the same process as the original.
Yes.
So this worked before because the legacy actors were reused for the reader mode version? But would never have worked in fission?
I guess so, yes. It wasn't always an actor, it used to just be a frame script...
Anyway, what should we do about it, going forward? :-)
(In reply to Julien Cristau [:jcristau] from comment #5)
Should we back out bug 1591469 for 78?
This is definitely the simplest, but isn't a permanent solution.
Comment 8•4 years ago
|
||
78 is now on beta. A straight backout conflicts with further changes to AboutReader.jsm though.
Comment 9•4 years ago
|
||
Updated•4 years ago
|
Comment 10•4 years ago
|
||
I agree this should be P1 but I would really prefer not to own this. I've just come up with a backout patch that applies to beta...
Updated•4 years ago
|
Assignee | ||
Comment 11•4 years ago
|
||
Would it be feasible to have some mechanism that always reuses the same process for the reader mode version as the original page?
Comment 12•4 years ago
|
||
(In reply to Neil Deakin from comment #11)
Would it be feasible to have some mechanism that always reuses the same process for the reader mode version as the original page?
It's the same process right now, it just won't be under Fission, I think - different origin, so different process. And we definitely do not want the about:reader privileged code running on the same origin as the web content. We could conceivably put the about:reader content in some kind of iframe inside the privileged frame, and then make the iframe same-origin with the original page, but that's a lot of work, has complications (like how to make the iframe appear + behave seamless) and would upset the current actor set-up anyway - plus we ideally, long-term, want to move the content into a sandboxed iframe (cf. bug 1204818) which due to the sandboxing would likely make it not same-origin again.
I see phabricator assigned me anyway, despite comment #10, so that's nice...
Comment 13•4 years ago
|
||
This issue seems to be breaking reader mode on getpocket.com
Last good revision: c060bcf8ea7a089333bfd0d99f096ef797ceb390
First bad revision: 217394da44ac9b55df2f91bf639944ea47949b3d
Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c060bcf8ea7a089333bfd0d99f096ef797ceb390&tochange=217394da44ac9b55df2f91bf639944ea47949b3d
- Go to https://getpocket.com/explore/item/a-jury-of-peers?utm_source=pocket-newtab
- Press F9 to toggle reader view
Results:
Failed to load article from page
Comment 14•4 years ago
|
||
There may have been a change to the getpocket.com site that is compounding this issue. I noticed when testing with mozregression, the first navigation to an article on getpocket.com on each profile would give me an 500 error page. A refesh would then show the article.
Also in my reproduction steps, if you exit reader mode, you also get a 500 error page.
Seems similar to bug 1633431.
Assignee | ||
Comment 15•4 years ago
|
||
So it sounds like long term we can't really avoid passing the data from the child process to the parent and then to the new reader mode process.
Comment 16•4 years ago
|
||
Comment on attachment 9153146 [details]
Bug 1640490 - Backed out changeset 217394da44ac (bug 1591469) for breaking about:reader in a lot of cases, r?neildeakin
Beta/Release Uplift Approval Request
- User impact if declined: Reader mode doesn't work with paywalls or when article content is fetched dynamically
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: n/a
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This is a backout of a frontend-only patch
- String changes made/needed: none
Assignee | ||
Comment 17•4 years ago
|
||
I also have a mostly done patch that sends the parsed article data up to the parent and on to the reader mode process.
Updated•4 years ago
|
Comment 18•4 years ago
|
||
Comment on attachment 9153146 [details]
Bug 1640490 - Backed out changeset 217394da44ac (bug 1591469) for breaking about:reader in a lot of cases, r?neildeakin
backout to fix a regression with reader mode, approved for 78.0b3
Comment 19•4 years ago
|
||
bugherder uplift |
Comment 20•4 years ago
|
||
(In reply to Neil Deakin from comment #17)
I also have a mostly done patch that sends the parsed article data up to the parent and on to the reader mode process.
Alright, going to pass this on to you then. :-)
Assignee | ||
Comment 21•4 years ago
|
||
This allows the correct article to be displayed even when the reader mode ends up using a different process than the original page,
Comment 22•4 years ago
|
||
Pushed by neil@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6323c2dfd00f cache the article data in the parent process then retrieve it in the reader mode process when entering reader mode, r=Gijs
Comment 23•4 years ago
|
||
bugherder |
Updated•3 years ago
|
Description
•