Open Bug 1422519 Opened 7 years ago Updated 2 years ago

Images etc. requested in reader mode should include the original page host/uri in their Referer or Origin header

Categories

(Toolkit :: Reader Mode, enhancement, P3)

Tracking Status
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- affected

People

(Reporter: Gijs, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Unfortunately, because of how reader mode works, we run content on a page with an about: URI. This means that for any included images or other subresources, we do not include the right host/uri in the respective referer/origin headers when requesting such resources.

It's possible we would be able to fix this using bug 1204818. However, we will need to ensure that it won't be possible for some other frame from the same origin to gain access to the reader mode frame, and it's not clear how we would do that if we ensure that the principal / uri for the framed document is same-origin with the original site. In principle, I would assume that same-origin frames would become accessible if reader mode was opened in a window that had a parent/opener/target relationship with some other window on that site (that *was* able to run script, which of course won't be possible on the sandboxed frame).

If there's some simple way to fix bug 1204818 *and* fix this bug, without causing security issues, I would be very interested. Christoph, do you have any ideas on what would be the best way to do this?
Flags: needinfo?(ckerschb)
Priority: -- → P3
Francois has been working on Bug 446344 to fix origin headers and Thomas is working in general to fix all referrer related issues (within Bug 1409600 and dependents). Passing the ni? on to those two folks.
Blocks: 1409600
Depends on: 446344
Flags: needinfo?(tnguyen)
Flags: needinfo?(francois)
Flags: needinfo?(ckerschb)
(In reply to :Gijs from comment #0)
> It's possible we would be able to fix this using bug 1204818. However, we
> will need to ensure that it won't be possible for some other frame from the
> same origin to gain access to the reader mode frame, and it's not clear how
> we would do that if we ensure that the principal / uri for the framed
> document is same-origin with the original site. In principle, I would assume
> that same-origin frames would become accessible if reader mode was opened in
> a window that had a parent/opener/target relationship with some other window
> on that site (that *was* able to run script, which of course won't be
> possible on the sandboxed frame).
> 
I agree it's worth fixing the referrer in reader mode (it seems we are missing that), but I don't see the relation between fixing the referrer and the client-side access policy of the reader mode frame. That is mostly related to the principal, or am I missing something?
Flags: needinfo?(tnguyen)
(In reply to Thomas Nguyen[:tnguyen] ni? plz from comment #2)
> I do agree it's worth to fix the referrer in reader mode (seems we are
> missing that), but I don't see the relation between fixing referrer and the
> access policy of reader mode frame in client side. It's mostly related to
> principal, or am I missing anything?

I'm not really sure what your question is.

As far as I can tell, referer/origin headers are based on the principal of the page with links / subresources.

I'm 99% sure that "can frame X access frame Y" (substitute "DOM window" for "frame" if that helps) is based on principals.

Thus, I am concerned about our ability to fix the headers while ensuring "same origin" web pages can *not* access the reader mode frame (because they shouldn't be able to run script on it or do anything else).
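
To illustrate the concern, here is a minimal sketch that models a principal as a plain origin string. The names `canAccess` and `headerOrigin` and the origin-string model are assumptions made for illustration only; they are not Gecko APIs, and real principals carry more state (sandbox flags, origin attributes).

```javascript
// Hypothetical model: a "principal" is just an origin string here.
// Real Gecko principals carry more state; this only shows the tension.

// Same-origin check standing in for "can frame X script frame Y".
function canAccess(accessorPrincipal, targetPrincipal) {
  return accessorPrincipal === targetPrincipal;
}

// Origin that would be reported in headers if they follow the principal.
// Non-web principals yield null (nothing useful to send).
function headerOrigin(principal) {
  return /^https?:\/\//.test(principal) ? principal : null;
}

const site = "https://example.com";

// Today: reader content runs with an about:reader principal.
const readerToday = "about:reader";
// Idea under discussion: give the reader frame the original site's principal.
const readerProposed = site;

// No useful header, but the site cannot reach the frame.
console.log(headerOrigin(readerToday), canAccess(site, readerToday));
// Header is fixed, but the frame becomes reachable from the site.
console.log(headerOrigin(readerProposed), canAccess(site, readerProposed));
```

In this simplified model, the same value drives both results, so fixing the header by changing the principal also changes who can access the frame.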

Does that help clarify?

Right now reader mode uses about:reader as a codebase origin. If we created a subframe for the content, that would make it easier/safer to give that a separate origin, but I'm still not sure that if we gave it a plain codebase principal for the original website, that that would be safe in terms of cross-frame scripting access.

I'm even more uncomfortable giving the toplevel about:reader page a web (http/https) codebase principal, for similar reasons.

Am I missing something obvious?
Flags: needinfo?(tnguyen)
(In reply to :Gijs from comment #3)
> As far as I can tell, referer/origin headers are based on the principal of
> the page with links / subresources.

The Origin header uses the triggering principal.

Both the Referer and the Origin headers enforce a scheme whitelist: http, https, ftp. If the referring page uses the about: scheme, it won't show up in these headers.
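
A rough sketch of such a scheme check follows. The list of schemes is taken from this comment; the function name `schemeAllowedInHeader` and its shape are hypothetical and do not mirror the actual Gecko implementation.

```javascript
// Illustrative only: the allowlist contents come from the comment above;
// the function name and shape are hypothetical, not Gecko code.
const ALLOWED_SCHEMES = new Set(["http", "https", "ftp"]);

// Returns true if a URI with this scheme may appear in Referer/Origin.
function schemeAllowedInHeader(uri) {
  // Extract the scheme: a letter followed by letters, digits, "+", "." or "-".
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(uri);
  return match !== null && ALLOWED_SCHEMES.has(match[1].toLowerCase());
}

console.log(schemeAllowedInHeader("https://example.com/article")); // true
console.log(schemeAllowedInHeader("about:reader?url=https%3A%2F%2Fexample.com")); // false
```

Under this check an about:reader document never contributes a value to either header, which matches the behaviour reported in this bug.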
Flags: needinfo?(francois)
(In reply to :Gijs from comment #3)
> (In reply to Thomas Nguyen[:tnguyen] ni? plz from comment #2)
> > I do agree it's worth to fix the referrer in reader mode (seems we are
> > missing that), but I don't see the relation between fixing referrer and the
> > access policy of reader mode frame in client side. It's mostly related to
> > principal, or am I missing anything?
> 
> I'm not really sure what your question is.
> 
> As far as I can tell, referer/origin headers are based on the principal of
> the page with links / subresources.
> 
> I'm 99% sure that "can frame X access frame Y" (substitute "DOM window" for
> "frame" if that helps) is based on principals.
> 
> Thus, I am concerned about our ability to fix the headers while ensuring
> "same origin" web pages can *not* access the reader mode frame (because they
> shouldn't be able to run script on it or do anything else).
> 
> Does that help clarify?
> 
> Right now reader mode uses about:reader as a codebase origin. If we created
> a subframe for the content, that would make it easier/safer to give that a
> separate origin, but I'm still not sure that if we gave it a plain codebase
> principal for the original website, that that would be safe in terms of
> cross-frame scripting access.
> 
> I'm even more uncomfortable giving the toplevel about:reader page a web
> (http/https) codebase principal, for similar reasons.
> 
> Am I missing something obvious?

Thanks, Gijs, for the explanation. I guess there's a bit more we have to think about regarding the referrer header.
- Basically, we use the triggering principal when setting the referrer header, but only to check whether the request is cross-origin, and then to decide whether to trim parts of the URL we are about to send. In reader mode, we may find another way to perform that check precisely (a URI is probably enough to compute it), and then it's okay.
- In most cases, we compute the referrer header from the document's URI (top level or frame). So if we want to send the correct referrer header, we may have to store or retrieve the original document's URI. I see that in reader mode the document URI looks something like `about:reader?url=${encodeURIComponent(url)}`. Is the URL that remains after stripping "about:reader?url=" good to use?
That seems to be OK in terms of cross-frame scripting access.
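
The two points above could be sketched roughly as follows, using only URIs and no principal. The helper name `computeReferrer` is hypothetical, and the trimming policy shown (full URL for same-origin requests, origin only for cross-origin ones) is an assumption for illustration, not necessarily what Firefox does by default.

```javascript
// Hypothetical helper: computes a Referer value for a subresource request
// from the ORIGINAL article URL instead of the about:reader document URI.
// The trimming policy (full URL same-origin, origin only cross-origin) is
// an assumption for illustration, not Firefox's actual default.
function computeReferrer(originalUrl, requestUrl) {
  const source = new URL(originalUrl);
  const target = new URL(requestUrl);

  // Never send credentials or the fragment.
  source.username = "";
  source.password = "";
  source.hash = "";

  // The cross-origin check needs only the two URIs.
  const crossOrigin = source.origin !== target.origin;
  return crossOrigin ? source.origin + "/" : source.href;
}

// Same-origin image: full URL minus the fragment.
console.log(computeReferrer("https://example.com/news/story?id=1#top",
                            "https://example.com/img/a.png"));
// Cross-origin image: trimmed to the origin.
console.log(computeReferrer("https://example.com/news/story?id=1#top",
                            "https://cdn.example.net/a.png"));
```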
Flags: needinfo?(tnguyen)
(In reply to Thomas Nguyen[:tnguyen] ni? plz from comment #5)
> - In most of the cases, we compute the referrer header from document's uri
> (top level or frame). So if we want to send correct referrer header, we may
> have to store or get original document's uri. I see in reader mode, the
> document uri is something looks like
> about:reader?url=${encodeURIComponent(url)}`. Is that url subtracted from
> "about:reader?url=" good to use?

Yes. There's a jsm that you could use to get the URL (rather than doing your own parsing, which is a bit annoying because about: URIs' support for query strings and hashes is not the same as that of http URIs). But if this is implemented in C++, which I expect, then I guess that doesn't help much...
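
For reference, a minimal sketch of what such an extraction could look like in plain JavaScript. This is not the jsm mentioned above; `originalUrlFromReaderUri` is a hypothetical helper that sidesteps about: URI parsing by handling the prefix as a string, and it assumes the original URL was encoded with `encodeURIComponent` as in the comment quoted above.

```javascript
// Hypothetical helper, not the jsm mentioned above: extracts the original
// page URL from an about:reader document URI using plain string handling,
// so it does not rely on how a URL parser treats about: query strings.
const READER_PREFIX = "about:reader?";

function originalUrlFromReaderUri(readerUri) {
  if (!readerUri.startsWith(READER_PREFIX)) {
    return null;
  }
  // Drop any fragment on the about: URI, then parse the query string.
  const query = readerUri.slice(READER_PREFIX.length).split("#")[0];
  const url = new URLSearchParams(query).get("url");
  if (!url) {
    return null;
  }
  // Only accept web URLs as a referrer source.
  return /^https?:\/\//i.test(url) ? url : null;
}

const readerUri =
  "about:reader?url=" + encodeURIComponent("https://example.com/a?b=1#c");
console.log(originalUrlFromReaderUri(readerUri)); // https://example.com/a?b=1#c
console.log(originalUrlFromReaderUri("about:blank")); // null
```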
Depends on: 1424076
Severity: normal → S3