Update HTTP cache index format to work with OriginAttributes' suffix

RESOLVED FIXED in Firefox 51

Status

()

defect
P1
major
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: mayhemer, Assigned: michal)

Tracking

(Blocks 1 bug)

Trunk
mozilla52
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox49 affected, firefox50 affected, firefox51 fixed, firefox52 fixed)

Details

(Whiteboard: [necko-active])

Attachments

(2 attachments, 3 obsolete attachments)

Reporter

Description

4 years ago
The appid/inbrowser are gone.  OriginAttributes replaces them and packs a generic list of attributes, more than just those two already, and is designed to be easily extended/changed.

The class is serializable and also provides CreateSuffix method that spits out a string that can later be parsed back.


Michal, can you take this bug?  

The goal is to change CacheFileMetadata to expose OriginAttributes instead of appid/inbrowser and let CacheIndex::InitEntryFromDiskData (and any other place) work with it.  I'm exposing OriginAttributes on nsILoadContextInfo in a binary form already, see [1] how to use it and carry it to CacheFileMetadata.

[1] https://bugzilla.mozilla.org/attachment.cgi?id=8655555&action=diff#a/netwerk/cache2/CacheFileMetadata.cpp_sec2
Flags: needinfo?(michal.novotny)
Assignee

Comment 1

4 years ago
Posted patch patch v1 (obsolete) — Splinter Review
The patch doesn't change index format, i.e. we still get appId and isInBrowser flags from the OriginAttributes instead of storing the serialized value. But it happens now only in CacheIndexEntry so it's easy to change or add a new flags if needed.

The problem with serialized OriginAttributes is that it's quite large and the length is variable. The code would need a large change to support cache index entries with variable length and I see no benefit right now.
Assignee: nobody → michal.novotny
Flags: needinfo?(michal.novotny)
Assignee

Updated

4 years ago
Attachment #8665449 - Flags: review?(valentin.gosu)
Reporter

Comment 2

4 years ago
Comment on attachment 8665449 [details] [diff] [review]
patch v1

Problem is that there are not just appid and inbrowser in OriginAttributes.  There are currently two more and one more is about to come soon.

How does this patch deal with them?
Comment on attachment 8665449 [details] [diff] [review]
patch v1

Review of attachment 8665449 [details] [diff] [review]:
-----------------------------------------------------------------

I agree with Honza that we should probably account for all members of the OriginAttributes, but in the mean time I think we can take this patch, as it is a good intermediary step keeps existing functionality.
Attachment #8665449 - Flags: review?(valentin.gosu) → review+
Reporter

Updated

4 years ago
Blocks: 1214360
Reporter

Comment 4

4 years ago
Michal, what is the status?  We may need a full (proper) handling of OriginAttribs in the cache index to implement bug 1214360 correctly.
Flags: needinfo?(michal.novotny)
Assignee

Comment 5

4 years ago
The variable length of serialized OriginAttributes is no longer an argument since the other attributes that we don't store yet have variable length too. So there is no difference between storing the whole serialized OA or extracting separate attributes and storing them. But the main concern remains, the attributes are large and we IMO shouldn't store it in memory, especially not in the verbose format produced by CreateSuffix().

AFAICS, OA in CacheIndex is used only in :

1) CacheIndex::IsCollision()
We can get rid of this because telemetry probes from bug 1151812 show that we don't have collisions.

2) CacheIndexContextIterator::AddRecord() via CacheIndex::RecordMatchesLoadContextInfo()
We could change the context iterator so that it won't match the OA when the hashes are added when the iterator is created. Instead, we would parse the OA from the file for every entry in GetNextHash().

3) CacheIndex::GetCacheStats() via CacheIndex::RecordMatchesLoadContextInfo()
As above, we would parse OA from the file, but we would also need to make this method asynchronous.


So we need to choose between high memory usage and going to disk whenever we need OriginAttributes. I prefer the latter but I'd like to know you opinion.
Flags: needinfo?(michal.novotny) → needinfo?(valentin.gosu)
(In reply to Michal Novotny (:michal) from comment #5)
> So we need to choose between high memory usage and going to disk whenever we
> need OriginAttributes. I prefer the latter but I'd like to know you opinion.

If there's a way to compress OA to a small/fixed number of bytes, that would be better, but I don't know if that an easy thing to do.
Mobile would be one most affected by this choice, where memory is more likely to be an issue than disk access. So I agree with you that we should not store it in memory if possible.
Flags: needinfo?(valentin.gosu)
Reporter

Comment 7

4 years ago
I don't recall that well how the index memory structures are implemented, but from what I know those are hashtables with separate, self-implemented entries.  I think keeping attributes that are integers as numbers is OK (uint32).  And having nsCString members for any string based attributes may also be OK, since empty nsCString is probably very small.  But that should be confirmed first.  I'm a bit worried about disk/flash mem performance.  OTOH, Michal, do we have a telemetry from mobile and/or b2g how fast we build the index?  Since that would give us a good pointer on how long it takes to crunch all the files.

Problem with the read-file approach is that we mostly reduce any selective deletion to a full disk walk.  That actually makes the index itself useless and we could just remove it at all...  Also, any IO caused by the delete walk will slow the whole device down as there is no central IO priority queuing logic in the whole platform.
Whiteboard: [necko-backlog]
Reporter

Comment 8

3 years ago
This is causing bug 1311271 and bug 1309695 where definitely the second one is kinda serious.  I'd like to get this fixed for 52 (ESR).  We should anyway stop walking around this bug...

Michal, I think the best way here is to add OA to the index.  I think we can just add a hash (or part of a hash, or something) of the suffix.  

According the |AppendKeyPrefix| function, we today add only: O (the suffix), a (anon), and p (private) as parameters to the cache entry context key.  Hence, index should contain the same info (except private of course).

I don't want to solve the pattern matching now.  We need to keep people's cache when they use containers (bug 1309695).  Matching can be solved later with another version of the index, if necessary.

Please let me know ASAP when you think you could have a patch ready.
Severity: normal → major
Status: NEW → ASSIGNED
Flags: needinfo?(michal.novotny)
Priority: P2 → P1
Whiteboard: [necko-backlog] → [necko-active]
Reporter

Updated

3 years ago
Duplicate of this bug: 1311271
Assignee

Comment 10

3 years ago
As discussed on IRC, this shouldn't be too complicated and I hope I can have it at the beginning of the next week.
Flags: needinfo?(michal.novotny)
Assignee

Comment 11

3 years ago
Posted patch patch v2 (obsolete) — Splinter Review
Attachment #8665449 - Attachment is obsolete: true
Attachment #8806500 - Flags: review?(honzab.moz)
Reporter

Comment 12

3 years ago
Comment on attachment 8806500 [details] [diff] [review]
patch v2

Review of attachment 8806500 [details] [diff] [review]:
-----------------------------------------------------------------

thanks!

::: netwerk/cache2/CacheHashUtils.h
@@ +56,5 @@
>  };
>  
> +typedef uint64_t OriginAttrsHash;
> +
> +OriginAttrsHash GetOriginAttrsHash(const mozilla::OriginAttributes &aOA);

if it's just a reference, could we rather just predeclar mozilla::OriginAttributes and include baseprincipal.h in cpp?  or will consumers of this api have to include baseprincipal.h anyway?

::: netwerk/cache2/CacheIndex.h
@@ +311,2 @@
>  
> +  static const uint32_t kReservedMask    = 0x03000000;

this is in sync with the struct CacheIndexRecord bits comment?
Attachment #8806500 - Flags: review?(honzab.moz) → review+
Assignee

Comment 13

3 years ago
Posted patch patch v3 (obsolete) — Splinter Review
(In reply to Honza Bambas (:mayhemer) from comment #12)
> ::: netwerk/cache2/CacheHashUtils.h
> @@ +56,5 @@
> >  };
> >  
> > +typedef uint64_t OriginAttrsHash;
> > +
> > +OriginAttrsHash GetOriginAttrsHash(const mozilla::OriginAttributes &aOA);
> 
> if it's just a reference, could we rather just predeclar
> mozilla::OriginAttributes and include baseprincipal.h in cpp?  or will
> consumers of this api have to include baseprincipal.h anyway?

fixed

> ::: netwerk/cache2/CacheIndex.h
> @@ +311,2 @@
> >  
> > +  static const uint32_t kReservedMask    = 0x03000000;
> 
> this is in sync with the struct CacheIndexRecord bits comment?

yes
Attachment #8806500 - Attachment is obsolete: true
Attachment #8807100 - Flags: review+

Comment 15

3 years ago
Pushed by mnovotny@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bd16c2fde4bf
Update HTTP cache index format to work with OriginAttributes' suffix, r=honzab

Comment 16

3 years ago
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dbc272fdb238
Backed out changeset bd16c2fde4bf for bc test failures
backed this out for failurs like https://treeherder.mozilla.org/logviewer.html#?job_id=38765164&repo=mozilla-inbound
Flags: needinfo?(michal.novotny)
Assignee

Comment 18

3 years ago
The test was added after I pushed the patch to try. Anyway, the test passes correctly on my computer. I guess the problem is somewhere in the test, but I don't understand it. Honza, you did the review of the test, do you see why it doesn't work when we correctly evict entries by origin attributes?
Flags: needinfo?(michal.novotny) → needinfo?(honzab.moz)
This test fails because it relies on the old behavior of the nsICacheStorage.asyncVisitStorage() which will ignore originAttributes and visit all of them. After you fix the cache index, it will only visit the default one, which causes this test fails. Following is the diff to resolve this by modifying this test to visit all originAttributes that it uses during its test.

--- a/browser/components/originattributes/test/browser/browser_cache.js
+++ b/browser/components/originattributes/test/browser/browser_cache.js
@@ -216,16 +216,24 @@ function* doTest(aBrowser) {

 // The check function, which checks the number of cache entries.
 function* doCheck(aShouldIsolate, aInputA, aInputB) {
   let expectedEntryCount = 1;
   let data = [];
   data = data.concat(yield cacheDataForContext(LoadContextInfo.default));
   data = data.concat(yield cacheDataForContext(LoadContextInfo.private));
   data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(true, {})));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(false, { userContextId: 1 })));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(true, { userContextId: 1 })));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(false, { userContextId: 2 })));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(true, { userContextId: 2 })));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(false, { firstPartyDomain: "example.com" })));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(true, { firstPartyDomain: "example.com" })));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(false, { firstPartyDomain: "example.org" })));
+  data = data.concat(yield cacheDataForContext(LoadContextInfo.custom(true, { firstPartyDomain: "example.org" })));

   if (aShouldIsolate) {
     expectedEntryCount = 2;
   }

   for (let suffix of suffixes) {
     let foundEntryCount = countMatchingCacheEntries(data, "example.net", suffix);
     let result = (expectedEntryCount === foundEntryCount);
Assignee

Comment 20

3 years ago
Tim, thanks for the patch. I was a bit confused that the test passes correctly on my computer without this change, but now I realized I didn't have the cache index patch applied :-/
Flags: needinfo?(honzab.moz)

Comment 21

3 years ago
Pushed by mnovotny@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e6992502f673
Update HTTP cache index format to work with OriginAttributes' suffix, r=honzab
Assignee

Comment 22

3 years ago
Attachment #8807100 - Attachment is obsolete: true
Attachment #8808779 - Flags: review+

Comment 23

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/e6992502f673
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Assignee

Comment 24

3 years ago
Comment on attachment 8808779 [details] [diff] [review]
patch v4 - test fixed

Approval Request Comment
[Feature/regressing bug #]: origin attributes
[User impact if declined]: The whole cache is cleared when only entries with specific loadContextInfo should be doomed. I'm not sure what exactly initiates this selective dooming, but it happens very often on my computer with current aurora version.
[Describe test coverage new/current, TreeHerder]: passes try server
[Risks and why]: low, simple change how we store and compare loadContextInfo
[String/UUID change made/needed]: none
Attachment #8808779 - Flags: approval-mozilla-aurora?
Comment on attachment 8808779 [details] [diff] [review]
patch v4 - test fixed

This patch can prevent cache from being cleared completely in some scenarios like private browsing. Take it in 51 aurora.
Attachment #8808779 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
needs rebasing for aurora

grafting 373887:e6992502f673 "Bug 1201042 - Update HTTP cache index format to work with OriginAttributes' suffix, r=honzab"
other [graft] changed browser/components/originattributes/test/browser/browser_cache.js which local [local] deleted
use (c)hanged version, leave (d)eleted, or leave (u)nresolved? c
merging netwerk/cache2/CacheFile.cpp                                                                                                                                                                   
merging netwerk/cache2/CacheFileIOManager.cpp                                                                                                                                                          
merging netwerk/cache2/CacheFileIOManager.h                                                                                                                                                            
merging netwerk/cache2/CacheHashUtils.cpp                                                                                                                                                              
merging netwerk/cache2/CacheHashUtils.h                                                                                                                                                                
merging netwerk/cache2/CacheIndex.cpp                                                                                                                                                                  
merging netwerk/cache2/CacheIndex.h                                                                                                                                                                    
warning: conflicts while merging netwerk/cache2/CacheFileIOManager.cpp! (edit, then use 'hg resolve --mark')                                                                                           
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
Flags: needinfo?(michal.novotny)
Assignee

Comment 27

3 years ago
Flags: needinfo?(michal.novotny)
You need to log in before you can comment on or make changes to this bug.