Closed Bug 1166038 Opened 9 years ago Closed 9 years ago

Cache API should de-duplicate response security info

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

The Cache API currently stores the https security info as a blob with each request/response pair in the sqlite database.  These blobs are each ~2k or more in size.  The data is also very repetitive with the same security info (cert) for each https origin.

This is bad for a couple reasons:

1) It wastes a lot of space repeating the same info.
2) Storing the blobs in the same table with the rest of the entry encourages db fragmentation.  If they were stored in a separate table then they would be isolated to a different set of pages.

This should be fixed before doing other optimization work.
Depends on: 1166577
I'd appreciate if you could wait on bug 1164397 before changing anything here, since I may end up touching this code...
Ehsan, can you tell me what changes you are expecting?  It seems no matter what the security info blob needs to be stored, no?
Flags: needinfo?(ehsan)
(In reply to Ben Kelly [:bkelly] from comment #2)
> Ehsan, can you tell me what changes you are expecting?  It seems no matter
> what the security info blob needs to be stored, no?

Oh, yes, of course!  I'm just changing the way the security into is stored in InternalResponse, which will affect how cache interacts with it, which may cause me to have to move code around.  But once that's done, we can definitely look into doing this.  Just trying to avoid some merge conflicts.  :-)

(But that being said, if you do need to fix this before then, I guess I can deal with the merges myself, not that big of a deal...)
Flags: needinfo?(ehsan)
Ehsan, as we discussed in IRC, I'd like to keep moving forward on these schema changes since they are my main thing I want to accomplish before shipping.  If you get close on your changes, though, I can wait to land.

This patch moves the security info blob into a new table.  A hash column is used to perform an indexed search for existing duplicate security info.  The hash is calculated as the first 8 bytes of the SHA1.

I also include a test that exercises storing some https entries in Cache and incrementally removing them until the security info is removed.  (Our existing tests added security info, but never fully removed it because they were in the sw script cache.)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bf7b92876eb
Attachment #8608338 - Flags: review?(ehsan)
Comment on attachment 8608338 [details] [diff] [review]
De-duplicate security info stored in the Cache API. r=ehsan

Review of attachment 8608338 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/cache/DBSchema.cpp
@@ +154,5 @@
>  
> +struct IdCount
> +{
> +  IdCount() : mId(-1), mCount(0) { }
> +  IdCount(int32_t aId) : mId(aId), mCount(1) { }

Nit: explicit

@@ +179,5 @@
> +                               const nsACString& aData, int32_t *aIdOut);
> +static nsresult DeleteSecurity(mozIStorageConnection* aConn, int32_t aId,
> +                               int32_t aCount);
> +static nsresult DeleteSecurityList(mozIStorageConnection* aConn,
> +                                   const nsTArray<IdCount>& aDeletedStorageIdList);

Nit: please call these ...SecurityInfo...

@@ +259,5 @@
>  
> +    // Security blobs are quite large and duplicated for every Response from
> +    // the same https origin.  This table is used to de-duplicate this data.
> +    rv = aConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> +      "CREATE TABLE security ("

Nit: please rename to security_info, since "security info" means something specific in the Necko parlance.

@@ +262,5 @@
> +    rv = aConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> +      "CREATE TABLE security ("
> +        "id INTEGER NOT NULL PRIMARY KEY, "
> +        "hash BLOB NOT NULL, "  // first 8-bytes of the sha1 hash of data column
> +        "data BLOB NOT NULL, "  // full security info data, 1k to 4k in size

Nit: it may be more accurate to say in the comment "typically a few KB in size", since the exact size depends on the cert chain used by the web site which can be longer in some sites.  But doesn't matter much, 1-4K is probably a good estimate for many cases.

@@ +270,5 @@
> +    if (NS_WARN_IF(NS_FAILED(rv))) { return rv; }
> +
> +    // Index the smaller hash value instead of the large security data blob.
> +    rv = aConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> +      "CREATE INDEX security_hash_index ON security (hash);"

Nit: security_info_hash_index

@@ +294,5 @@
>          "response_status INTEGER NOT NULL, "
>          "response_status_text TEXT NOT NULL, "
>          "response_headers_guard INTEGER NOT NULL, "
>          "response_body_id TEXT NULL, "
> +        "response_security_id INTEGER NULL REFERENCES security(id), "

Nit: response_security_info_id
Attachment #8608338 - Flags: review?(ehsan) → review+
https://hg.mozilla.org/mozilla-central/rev/c0d8a4b053e2
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: