Closed
Bug 1173439
Opened 9 years ago
Closed 9 years ago
Remove Cache API text column indices from sqlite schema
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla41
Tracking | Status | |
---|---|---|
firefox41 | --- | fixed |
People
(Reporter: bkelly, Assigned: bkelly)
References
(Blocks 1 open bug)
Details
Attachments
(5 files, 7 obsolete files)
14.84 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
11.42 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
3.22 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
9.84 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
13.71 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
One of the database optimizations we should do sooner rather than later is removing the text column indices for url and url_without_query.
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Assignee | ||
Comment 1•9 years ago
|
||
Switch to storing URLs as UTF8 c-strings. Making the URL strings more compact should help regardless of how we store them. In addition, fetch already does this conversion, so we avoid reconverting them back to UTF16 this way.
Assignee | ||
Comment 2•9 years ago
|
||
I broke this bug up into a number of smaller refactors.
The end goal is to store the URL as separate URL-without-query and query columns. Matching willing compare both column unless ignoreSearch is set, then just the URL-without-query will be examined. I can then hash these two columns and index the hashes.
The first part is to switch to use UTF8 strings for the URLs as I discussed above.
Attachment #8622537 -
Attachment is obsolete: true
Attachment #8622786 -
Flags: review?(ehsan)
Assignee | ||
Comment 3•9 years ago
|
||
Next, parse the URL query out into a separate field and store it in the database.
Attachment #8622787 -
Flags: review?(ehsan)
Assignee | ||
Comment 4•9 years ago
|
||
Next, use the new request_url_query column along with the existing request_url_no_query column to perform matching. Deserialize Request objects using these fields as well.
Attachment #8622790 -
Flags: review?(ehsan)
Assignee | ||
Comment 5•9 years ago
|
||
The CacheRequest.url() member and request_url column are no longer used. Remove them.
Attachment #8622791 -
Flags: review?(ehsan)
Assignee | ||
Comment 6•9 years ago
|
||
Now remove the old request_url_no_query index and replace it with a single index containing the following columns:
cache_id
request_url_no_query_hash
request_url_query_hash
The hash columns are created using the first 8 bytes of the SHA-1 of the URL.
The resulting SQL query is:
SELECT id, COUNT(response_headers.name) AS vary_count
FROM entries
LEFT OUTER JOIN response_headers ON entries.id=response_headers.entry_id
AND response_headers.name='vary'
WHERE entries.cache_id=:cache_id
AND entries.request_url_no_query_hash=:url_no_query_hash
AND entries.request_url_query_hash=:url_query_hash
AND entries.request_url_no_query=:url_no_query
AND entries.request_url_query=:url_query
GROUP BY entries.id ORDER BY entries.id;
I used EXPLAIN QUERY PLAN to verify this will correctly use the new entries index and also the request_headers name index.
With all of these changes applied peak disk usage in the test_cache_shrink.html mochitest drops from 672k to 288k.
I will examine execution times in automation to see if there are any improvements or regressions. I don't see any changes in the few tests I ran locally.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=eefb2c5059bf
Attachment #8622794 -
Flags: review?(ehsan)
Assignee | ||
Comment 7•9 years ago
|
||
The ASAN tests on try showed that the previous patch had a problem with passing nsACString types to ThrowTypeError(). The var args there specifically want things inheriting from nsAString. Update the patch to fix that.
Attachment #8622786 -
Attachment is obsolete: true
Attachment #8622786 -
Flags: review?(ehsan)
Attachment #8623041 -
Flags: review?(ehsan)
Assignee | ||
Comment 8•9 years ago
|
||
Assignee | ||
Comment 9•9 years ago
|
||
The previous try run suggests that cache-match.https.html in the web-platform-tests slows from an average of 14.3 seconds to 17.8 seconds. That's a ~25% slowdown on those slow windows vms.
Although the second try I posted has better numbers at 13.1 seconds. I guess this test has a lot of variance in automation.
In any case, I'm going to see if using md5sum instead of sha-1 makes any noticeable difference:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=39ac6d818b16
Updated•9 years ago
|
Attachment #8623041 -
Flags: review?(ehsan) → review+
Comment 10•9 years ago
|
||
Comment on attachment 8622787 [details] [diff] [review]
P2 Parse Response URL query as a separate field. r=ehsan
Review of attachment 8622787 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/cache/TypeUtils.h
@@ +100,5 @@
> ToInternalHeaders(const nsTArray<HeadersEntry>& aHeadersEntryList,
> HeadersGuardEnum aGuard = HeadersGuardEnum::None);
>
> static void
> ProcessURL(nsACString& aUrl, bool* aSchemeValidOut,
Nit: please document what this function should return, given an input of "http://example.com/path?foo=bar".
Attachment #8622787 -
Flags: review?(ehsan) → review+
Updated•9 years ago
|
Attachment #8622790 -
Flags: review?(ehsan) → review+
Updated•9 years ago
|
Attachment #8622791 -
Flags: review?(ehsan) → review+
Comment 11•9 years ago
|
||
Comment on attachment 8622794 [details] [diff] [review]
P5 Cache should index on a hash instead of the url itself. r=ehsan
Review of attachment 8622794 [details] [diff] [review]:
-----------------------------------------------------------------
FWIW, there is also mozilla/SHA1.h which has a better C++ interface in case you want to use it in the future.
Attachment #8622794 -
Flags: review?(ehsan) → review+
Assignee | ||
Comment 12•9 years ago
|
||
I took some data from the cache-match.https.html test running on windows in automation:
https://docs.google.com/spreadsheets/d/19A3Cl9CliLlE3d3qFv7rmtrX3gdXv27ybKxksWfRHv0/edit?usp=sharing
It actually shows that md5 is slower than sha1 by 3% to 4% for some reason.
There does not appear to be any regression from the currently reviewed patches. So I will go with them without further changes.
Assignee | ||
Comment 13•9 years ago
|
||
Attachment #8623041 -
Attachment is obsolete: true
Attachment #8623403 -
Flags: review+
Assignee | ||
Comment 14•9 years ago
|
||
Attachment #8622787 -
Attachment is obsolete: true
Attachment #8623404 -
Flags: review+
Assignee | ||
Comment 15•9 years ago
|
||
Attachment #8622790 -
Attachment is obsolete: true
Attachment #8623405 -
Flags: review+
Assignee | ||
Comment 16•9 years ago
|
||
Attachment #8622791 -
Attachment is obsolete: true
Attachment #8623406 -
Flags: review+
Assignee | ||
Comment 17•9 years ago
|
||
Attachment #8622794 -
Attachment is obsolete: true
Attachment #8623407 -
Flags: review+
Comment 18•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/3f1616172cd6
https://hg.mozilla.org/integration/mozilla-inbound/rev/818edd204d7b
https://hg.mozilla.org/integration/mozilla-inbound/rev/619c2a4e7ce2
https://hg.mozilla.org/integration/mozilla-inbound/rev/12ccbd45dcb5
https://hg.mozilla.org/integration/mozilla-inbound/rev/eefb95916e40
Comment 19•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/3f1616172cd6
https://hg.mozilla.org/mozilla-central/rev/818edd204d7b
https://hg.mozilla.org/mozilla-central/rev/619c2a4e7ce2
https://hg.mozilla.org/mozilla-central/rev/12ccbd45dcb5
https://hg.mozilla.org/mozilla-central/rev/eefb95916e40
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox41:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•