Closed Bug 1410920 Opened 7 years ago Closed 6 years ago

Highlights could potentially leak data to trackers

Categories

(Firefox :: New Tab Page, defect, P4)


Tracking

()

RESOLVED DUPLICATE of bug 1449294
Tracking Status
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- wontfix
firefox60 --- wontfix

People

(Reporter: groovecoder, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: uiwanted, Whiteboard: [thumbnails])

While building & testing a fresh Nightly 58.0a1 (2017-10-17) (64-bit), I noticed many requests sent to 3rd-party trackers as soon as I opened Firefox. It looks like the source of the requests is the "Highlights" section of about:newtab? It seems to make full content requests for the articles it's going to include in the section?
After a 2nd spot-check, it may not be leaking the data I thought it was. It contacts the media/CDN sites for thumbnails, but does not include any cookies.
In 57.0b10, I only see thumbnail requests sent. E.g.,

GET /img/media/bf942733be9b4ac493c7749aad22a9b98ef7e115/0_131_3500_2100/master/3500.jpg?w=1200&h=630&q=55&auto=format&usm=12&fit=crop&crop=faces%2Centropy&bm=normal&ba=bottom%2Cleft&blend64=aHR0cHM6Ly91cGxvYWRzLmd1aW0uY28udWsvMjAxNi8wNS8yNS9vdmVybGF5LWxvZ28tMTIwMC05MF9vcHQucG5n&s=36fddb035d0c3bdd35d78b34585cd316 HTTP/1.1
Host: i.guim.co.uk
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:57.0) Gecko/20100101 Firefox/57.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Pragma: no-cache
Cache-Control: no-cache

No cookies included.
And here's a thumbnail request sent in 58.0a1:

GET /images/2017/10/12/business/12STATE1/12STATE1-facebookJumbo.jpg HTTP/1.1
Host: static01.nyt.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:58.0) Gecko/20100101 Firefox/58.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Pragma: no-cache
Cache-Control: no-cache

No cookies again, and neither of these hosts (static01.nyt.com or i.guim.co.uk) is on the Disconnect block-list. [1] But I haven't done a thorough audit to see how many thumbnail requests may include cookies and/or may be sent to domains on the Disconnect block-list.

[1] https://github.com/disconnectme/disconnect-tracking-protection/blob/master/services.json
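The audit described above (which thumbnail requests carry cookies, and which go to block-listed domains) could be sketched roughly as follows. This is only an illustration: the `BLOCKED_DOMAINS` set is a hypothetical, flattened stand-in for the Disconnect list (the real services.json is nested and would need to be flattened first), and the request dictionaries are an assumed capture format, not anything Firefox produces.

```python
from urllib.parse import urlsplit

# Hypothetical, simplified stand-in for the Disconnect list: a flat set of
# blocked domains. The real services.json is nested by category and service.
BLOCKED_DOMAINS = {"tracker.example", "ads.example"}


def is_blocked(host, blocked):
    """Return True if the host or any of its parent domains is in `blocked`."""
    labels = host.lower().rstrip(".").split(".")
    # For "a.b.c" this checks "a.b.c", then "b.c", then "c".
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


def audit(requests, blocked):
    """Flag captured requests that send a Cookie header or hit a blocked host.

    `requests` is an assumed format: a list of dicts with "url" and "headers".
    """
    findings = []
    for req in requests:
        host = urlsplit(req["url"]).hostname or ""
        has_cookie = any(name.lower() == "cookie" for name in req["headers"])
        blocked_host = is_blocked(host, blocked)
        if has_cookie or blocked_host:
            findings.append(
                {"url": req["url"], "cookie": has_cookie, "blocked": blocked_host}
            )
    return findings
```

Feeding it the two requests above (no Cookie header, hosts not in the set) would yield no findings; a request to a subdomain of a listed domain would be flagged.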
Summary: Highlights leaks data to trackers → Highlights could potentially leak data to trackers
Where "could potentially leak" data is if the page has set the thumbnail to come from the tracker?
Yes, pretty much. At one point I thought I saw the about:newtab page sending full page-load network activity (including all trackers), but I can't reliably reproduce that behavior. So, this does not seem to be as serious as I first feared. A mitigation could be to cache the thumbnail for the page when the user first visits the page. That way they're not exposed to any MORE tracking by about:newtab.
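A minimal sketch of that mitigation, assuming a simple in-memory store: thumbnails are recorded only as a side effect of a real visit, and a lookup from the new tab page never triggers a network request. The class and method names are hypothetical and are not Activity Stream APIs.

```python
class VisitTimeThumbnailCache:
    """Stores thumbnails captured during real page visits (hypothetical sketch)."""

    def __init__(self):
        self._thumbs = {}

    def on_page_visit(self, page_url, thumbnail_bytes):
        # Called while the user is on the page, so no extra request is needed.
        self._thumbs[page_url] = thumbnail_bytes

    def thumbnail_for(self, page_url):
        # Returns None on a miss instead of fetching in the background.
        return self._thumbs.get(page_url)
```

On a miss the caller would show a placeholder until the user next visits the page.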
* Why don't we grab this metadata at page load time? I'm guessing to minimize excess storage?
* Can we sandbox connections here? (i.e. no cookies, no correlation with sessions, etc) Anything that needs auth won't work, but maybe that's okay?
* Should we filter these requests through TP by default? (basically "never do out of band requests to known trackers")
Just to add to Mike's comments:

(In reply to Mike Connor [:mconnor] from comment #6)
> * Why don't we grab this metadata at page load time? I'm guessing to
> minimize excess storage?

We do grab metadata at page-load time. The metadata is only refetched out-of-band when the cache expires, or when you've synced or imported history that we don't have metadata for.

> * Can we sandbox connections here? (i.e. no cookies, no correlation with
> sessions, etc) Anything that needs auth won't work, but maybe that's okay?

The out-of-band metadata collection does not follow redirects, does not set cookies, and executes no scripts.

> * Should we filter these requests through TP by default? (basically "never
> do out of band requests to known trackers")

I am unsure of the benefit of TP with this. Trackers typically require script execution or at least cookies. This mechanism will only send requests to the root domain of a site the user has legitimately (and recently) visited, and will not invoke any tracker scripts on that site.
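For illustration, the properties described above (no redirects followed, no cookies, no script execution) can be approximated with Python's standard library. This is a sketch of the idea only, not the Activity Stream implementation; `NoRedirect`, `build_anonymous_opener`, and `fetch_metadata_html` are names made up for this example.

```python
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means no follow-up request is made; urllib then
        # surfaces the 3xx response as an HTTPError.
        return None


def build_anonymous_opener():
    # build_opener() installs no HTTPCookieProcessor unless one is passed in,
    # so cookies are neither stored nor sent. Passing NoRedirect replaces the
    # default redirect handler.
    return urllib.request.build_opener(NoRedirect)


def fetch_metadata_html(url, timeout=10.0):
    """Return the raw response bytes; nothing here parses or runs scripts."""
    opener = build_anonymous_opener()
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()
```

Note that such a request still reveals the client's IP address to the server, which is the residual exposure discussed in this bug.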
Just to make sure, can you point me at the AS code that invokes these requests? I didn't see any with cookies, nor any that invoked any scripts, but they were definitely going to CDN/media domains for the sites.
Flags: needinfo?(tspurway)
Flags: needinfo?(tspurway)
Ah, okay that already uses LOAD_ANONYMOUS so it should strip cookies. That makes me much happier. So, this only "leaks" the user's IP address back to a 3rd-party server that they presumably already visited. That privacy risk seems much smaller than I feared. Still, I'm going to keep this bug open and mark it blocked by bug #1380448 - we should really run these thumbnail requests thru Tracking Protection in case some thumbnails are served by known trackers.
Depends on: 1380448
Sprucing up the new tab page isn't of a particularly urgent nature; there's no need to initiate requests in the background no matter how 'anonymised' they might be. A screenshot can be taken when the user next navigates to a top site.
[I seem to have confused highlights with top sites but the same logic should apply.]
(In reply to Luke Crouch [:groovecoder] from comment #10)
> Still, I'm going to keep this bug open and mark it blocked by bug #1380448 -
> we should really run these thumbnail requests thru Tracking Protection in
> case some thumbnails are served by known trackers.

I don't believe that that bug is technically related - the thumbnail requests aren't made from about: pages. The requests are made elsewhere and screenshotted, and the about: page uses the screenshots, but to fix this issue we need to apply tracking protection to the thumbnail requests (or just not make them in the background). Of course, TP might cause the screenshots to look significantly different...

More generally though, this was previously filed as bug 1382703 and marked FIXED. What changed?
Flags: needinfo?(edilee)
Bug 1382703 provided packaged icons for the default top sites, e.g., twitter, and showing the icon is preferred over a thumbnail. This bug is for highlights, which are dynamic based on the browsing history, and we don't package any default Highlights thumbnails.
Flags: needinfo?(edilee)
(In reply to Luke Crouch [:groovecoder] from comment #5)
> Yes, pretty much. At one point I thought I saw the about:newtab page sending
> full page-load network activity (including all trackers), but I can't
> reliably reproduce that behavior.

I can reproduce this 100% reliably with one of the sites in my top sites collection. There are a couple of issues here:

- The only way to disable this is to set browser.newtabpage.activity-stream.enabled to false. Just disabling top sites on about:newtab doesn't change anything.
- The whole page is not only loaded but also executed.

Apart from the fact that, for this particular site, that's between 3MB and 6MB over 370-600 requests that get loaded (at least) every single time I start the browser, there is absolutely no way their tracking is bad enough not to identify me based on this.

In case it makes any difference, the site is www.kicker.de. I have a profile that reproduces this which I can't share, but I'm happy to run additional investigation steps if it helps. Ping me on IRC.
I still couldn't reproduce this reliably in 59.0a1 with kicker.de (even after I spent a few minutes reading an article). I closed all tabs, opened a new tab, and saw kicker.de thumbnails in my highlights section, but I didn't see the full traffic sent to kicker.de. :till - did you have all other tabs and windows closed? how long had it been since you visited kicker.de?
Flags: needinfo?(till)
(In reply to Luke Crouch [:groovecoder] from comment #16)
> :till - did you have all other tabs and windows closed?

Yes, I did. And I verified that no network traffic happened at all when restoring my session with browser.newtabpage.activity-stream.enabled set to false. As soon as I set that to true, the requests start.

> how long had it been since you visited kicker.de?

Pretty much exactly a week.
Flags: needinfo?(till) → needinfo?(lcrouch)
I still couldn't reproduce this on kicker.de a week later. The thumbnails showed up in my top highlights, but I didn't see the full page loads trigger.
Flags: needinfo?(lcrouch)
Keywords: uiwanted
Priority: -- → P4
Whiteboard: [thumbnails]
In today's meeting we agreed that we could and should filter ALL (i.e., Private Browsing AND regular browsing) thumbnail requests thru Tracking Protection. This will provide the same level of protection as other Private Browsing requests, and it should speed up the thumbnailing service too!
Turning on tracking protection in bug 1449294. Thumbnails can still leak through a more subtle form of tracking, though, even for a single page request with no external dependencies. For example, only I would visit a given URL / subdomain on my website, so if the server sees requests for that URL when I didn't go to it myself, it's quite possible that the background thumbnail service made the request.
Depends on: 1449294
We've been incorrectly assuming thumbnails are captured without cookies as cookie-less is the default behavior *only* in Nightly. Bug 1457929 will turn that on for all channels.
See Also: → 1457929
I'll just mark this as a duplicate of bug 1449294, where we turned on tracking protection for thumbnails. There are other improvements to thumbnails tracked in bug 1445085, such as turning on Safe Browsing (bug 1453448).
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Component: Activity Streams: Newtab → New Tab Page