Closed Bug 666059 Opened 13 years ago Closed 12 years ago

Use disk device for reading in private browsing mode

Categories

(Core :: Networking: Cache, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: michal, Assigned: michal)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(1 file)

Currently we use disk cache for private browsing too and we delete the whole cache when leaving private browsing mode. Instead of doing this we should use only memory cache during private browsing.
Instead of only using the memory cache, we should use a hybrid of the disk and memory caches in private browsing mode:

1. Read from the memory cache first, and then read from the disk cache only when no info is found.
2. When invalidating resources in private browsing mode, we need to store a "ignore the disk cache for this entry" marker.
3. Write only to the memory cache, never to the disk cache.

If it is much easier to implement "Use memory cache only" first, then we can do that, but then let's file a follow-up for the hybrid read-write-memory/read-only-disk cache.
Blocks: http_cache
Attached patch patch v1Splinter Review
Things that need to be solved yet:

1) Why is necessary to clear the cache in tests test_sts_privatebrowsing.html and browser_webconsole_bug_618311_private_browsing.js? Without clearing the cache they do succeed when they run individually, but they fail after test_stricttransportsecurity.html and browser_webconsole_bug_618311_close_panels.js. I don't understand it especially in the latter case.

2) Store "ignore the disk cache for this entry" marker after invalidating resource in PB mode.
Brian, could you be more verbose on this? There is nothing like invalidating entries from cache's point of view. Do you mean that if some entry is doomed in memory cache, it should be also doomed in the disk cache?
Attachment #548404 - Flags: review?(bjarne)
Summary: Use memory cache only for private browsing → Use disk device for reading in private browsing mode
(In reply to Michal Novotny (:michal) from comment #3)
> Things that need to be solved yet:
> 
> 2) Store "ignore the disk cache for this entry" marker after invalidating
> resource in PB mode.
> Brian, could you be more verbose on this? There is nothing like invalidating
> entries from cache's point of view. Do you mean that if some entry is doomed
> in memory cache, it should be also doomed in the disk cache?

In private browsing mode, if an entry on disk is doomed, then we should treat *act* like it as doomed while in private browsing mode, without writing any change to disk for that entry.
(In reply to Brian Smith (:bsmith) from comment #4)
> In private browsing mode, if an entry on disk is doomed, then we should
> treat *act* like it as doomed while in private browsing mode, without
> writing any change to disk for that entry.

This would be quite tricky to implement. And we don't know a reason of dooming in nsCacheService::DoomEntry(). E.g. user can lower the cache size during private browsing mode which could lead to dooming of some active entries in the disk cache and we really do want to evict them in this case. I understand that we shouldn't modify data and metadata on the disk while we are in the private browsing mode, but dooming doesn't in fact modify it.
I'm slowly getting back in working-mode and have started to review this one...  do either Michal or Brian want to add something to the subject from comment #5? How important for functional correctness is it to get this right? If a disk-entry is doomed in PB mode, should it stay doomed when the user leaves PB?
(In reply to Bjarne (:bjarne) from comment #6)
> How important for functional correctness is it to get this right?

If I understand it correctly, one of the goal of the private browsing mode is to not change any information on the computer. On the other hand, the disk cache isn't a reliable persistent storage. I.e. disk entries can be removed for whatever reason at any time. So in my opinion, it is acceptable to remove the entry from disk device if it was doomed during the private browsing mode.
Brian: we need your input here...

(In reply to Michal Novotny (:michal) from comment #7)
> it is acceptable
> to remove the entry from disk device if it was doomed during the private
> browsing mode.

Being the devil's advocate: If you (somehow) know a URL to be in the disk cache prior to entering PB, and after leaving PB it's not there, wouldn't that mean that the URL was visited during PB mode? And consequently that browsing-history leaked.

Resizing cache-sizes in PB mode is another matter though. Looks like we should either add info about why an entry is doomed, or disallow resizing (disk) cache when in PB-mode.
I agree with Bjarne. We can resize the cache because that doesn't leak info about what resources have been visited during the session, but we can't doom entries (on disk).

It is most urgent to get the switch between private browsing mode and regular mode to stop deleting the cache. I think that is the Q3 goal we set out. What I proposed--being able to *read* entries from the disk cache in private browsing mode--is something we should do, but it goes beyond the initial goal.

Also, reading entries from the disk cache will leak information unless we can ensure that we read files modifying the files' atime attributes. (This wouldn't apply to the block or metadata files.)
Comment on attachment 548404 [details] [diff] [review]
patch v1

Review of attachment 548404 [details] [diff] [review]:
-----------------------------------------------------------------

Canceling review-request. Michal: plz align your patch according to latest comments and re-request review when you're happy with it.

IMO we cannot read from disk-cache in PB mode without major changes which may be out of scope for this bug. The timestamp-issues mentioned by Brian is one reason, and there is the dooming-issue: Since we cannot touch the disk, we may find a disk-entry after we have doomed it. I'm pretty sure the rest of the logic in the cache won't deal with finding an entry which is supposed to be doomed.

In order to use disk-cache in PB mode we need to use in-memory datastructures for information about entries on disk. If we're careful (and probably with some tweaking) we can probably use the current datastructure and just avoid flushing it to disk, but this will take some work and verification. Alternative is to create a new, parallel datastructure which takes precedence when in PB mode and holds the info we need to use the disk-cache safely. With any of these approaches resizing the cache can doom entries without leaking info. I suggest a new bug for this work and that we change this bug back to the original scope (avoid deleting disk-cache when entering PB mode, and use mem-disk only while in PB).
Attachment #548404 - Flags: review?(bjarne)
(In reply to Brian Smith (:bsmith) from comment #9)
> It is most urgent to get the switch between private browsing mode and
> regular mode to stop deleting the cache. I think that is the Q3 goal we set
> out. What I proposed--being able to *read* entries from the disk cache in
> private browsing mode--is something we should do, but it goes beyond the
> initial goal.

OK, let's keep this bug for the read-only disk cache and reopen bug #663580 for the Q3 goal.
Hmm, I would have expected somebody to CC me on this bug, so that I can help you guys avoid all of the work already done here!

One of the implicit goals of the PB mode is to make sure that it's not easily possible that the user's non-private session would be associated with their private ones.  This is the reason we disallow using non-private cookies and the cache inside PB mode.

There's of course only so much that we can do about it, but anywhere where we have control over the code we've made sure to keep this invariant.  The fact that we don't make promises about this to the user is that there are some things which we just can't protect against.

So, we definitely don't want to read from the disk cache in PB mode.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
I am re-opening this just because I've received directly conflicting information about it, and I want to make sure the conflict is resolved before we resolve this bug one way or the other.

My understanding of Private Browsing mode comes primarily from talking about it with Sid, and from understanding how the Strict Transport Security service works, and how the cert override service works. (They work exactly as I've proposed the disk cache works--the persistent state is read-only, and new state is written only in memory.)

What Ehsan describes above seems reasonable but it goes well beyond what we've previously said private browsing mode does. See these links:

[1] http://support.mozilla.org/en-US/kb/private-browsing-browse-web-without-saving-info
[2] https://wiki.mozilla.org/PrivateBrowsing

"The purpose of private browsing is to put Firefox into a temporary state where no information about the user's browsing session is stored locally.... It is very important that the user understands that this feature enables local privacy on their machine, but that their ISP, corporation, or government will still be able to monitor their activities online. We don't want to have whistle blowers fired or dissidents jailed on account of bad UI."

"Modify every part of the code where we write to disk." Note "write," not "read or write".

AFAICT, it is totally unrealistic to implement unlinkability as far as information that can be retrieved from websites goes. Example: Let's say I switch to private browsing mode and log into GMail. Then, I log out and switch back into normal mode. Now, I log into GMail again. At the bottom of the GMail page, it will show me a record of the times I logged into GMail, including the times I logged in when in private browsing mode. Also, the ISP and/or websites can always correlate my IP address and other network-level information across non-private and private browsing sessions.

Anyway, I am not arguing here against having an "anonymous browsing" mode that works the way Ehsan described. But, AFAICT what Ehsan described doesn't seem to match the documented intent of the current private browsing feature and it definitely doesn't match the implementation of private browsing in at least a few Gecko components.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
(In reply to Brian Smith (:bsmith) from comment #13)

My understanding also from my work on PB mode, communication with people and an overall original and meaningful function of PB mode I think it always was and will be a "sandbox" or "anonymous" mode.

> it definitely doesn't match the implementation of private browsing in at
> least a few Gecko components.

Then we have to fix them :)
(In reply to Brian Smith (:bsmith) from comment #13)
> I am re-opening this just because I've received directly conflicting
> information about it, and I want to make sure the conflict is resolved
> before we resolve this bug one way or the other.
> 
> My understanding of Private Browsing mode comes primarily from talking about
> it with Sid, and from understanding how the Strict Transport Security
> service works, and how the cert override service works. (They work exactly
> as I've proposed the disk cache works--the persistent state is read-only,
> and new state is written only in memory.)

Just as a data point, I've been involved in the original design of PB, and have implemented it, and I'm currently serving as the humble module owner of PB.

> What Ehsan describes above seems reasonable but it goes well beyond what
> we've previously said private browsing mode does. See these links:
> 
> [1]
> http://support.mozilla.org/en-US/kb/private-browsing-browse-web-without-
> saving-info
> [2] https://wiki.mozilla.org/PrivateBrowsing
> 
> "The purpose of private browsing is to put Firefox into a temporary state
> where no information about the user's browsing session is stored locally....
> It is very important that the user understands that this feature enables
> local privacy on their machine, but that their ISP, corporation, or
> government will still be able to monitor their activities online. We don't
> want to have whistle blowers fired or dissidents jailed on account of bad
> UI."
> 
> "Modify every part of the code where we write to disk." Note "write," not
> "read or write".
> 
> AFAICT, it is totally unrealistic to implement unlinkability as far as
> information that can be retrieved from websites goes.

This is absolutely true, and it's precisely why we don't mention this as a goal when we talk about PB.  We've gone to great lengths to make it extremely hard for websites to associate PB and non-PB sessions, but this will never be possible because there are components that are outside of our control (such as plugins, web servers, web caches, etc.)  And also the user behavior might defeat this purpose (e.g., logging in into the same service from both modes, as you indicate below.)

> Example: Let's say I
> switch to private browsing mode and log into GMail. Then, I log out and
> switch back into normal mode. Now, I log into GMail again. At the bottom of
> the GMail page, it will show me a record of the times I logged into GMail,
> including the times I logged in when in private browsing mode. Also, the ISP
> and/or websites can always correlate my IP address and other network-level
> information across non-private and private browsing sessions.

> Anyway, I am not arguing here against having an "anonymous browsing" mode
> that works the way Ehsan described. But, AFAICT what Ehsan described doesn't
> seem to match the documented intent of the current private browsing feature
> and it definitely doesn't match the implementation of private browsing in at
> least a few Gecko components.

Note that I'm not really talking about anonymous browsing, I'm only talking about making it hard (not impossible) for websites to be able to make this link.  Any attempt to explicitly open up a way for websites to make this link easily is against the original design goals of the feature, hence WONTFIXing this again.  :-)

(Also, note that the cache modules was explicitly one of the modules where we took great care to make this impossible, see bug 248970 for the history.)
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: