Bug 230933 (Closed): move cookies to mozstorage
Opened 21 years ago; closed 17 years ago.
Categories: Core :: Networking: Cookies, defect
Status: RESOLVED FIXED
Target Milestone: mozilla1.9alpha6
People: Reporter: mvl; Assignee: dwitte
References: Blocks 1 open bug
Attachments (2 files, 2 obsolete files):
- 52.60 KB, patch (sdwilsh: review-)
- 53.68 KB, patch (sdwilsh: review+; mconnor: superreview+)
spin off from bug 178993. To store http-only, we need to change the format of cookies.txt. But we can't, for compatibility reasons. So a new file needs to be created. And to do it right, it should allow more attributes to be saved, like last use time.

very short summary of ideas so far:
- just add httponly and last use time
- add a few more unassigned fields (bools and ints), to be used later
- make the format flexible: store what mozilla knows about, and merge when saving so as not to lose data
- the format should have versioning; cookies2.txt gets a versioned filename
- maintain cookies.txt, to be used by older mozillae

read bug 178993 for more details.
Comment 1 • 20 years ago (Assignee)
i've added a bunch of bugs as dependent on this, so we can keep track of what reasons we have for implementing a new file format (and what other things we might want to do while we're at it). a short summary:
- bug 208985 - support Set-Cookie2
- bug 248590 - persist cookie creation time, to enhance behavioral consistency with older mozillae
- bug 188850 - persist last-recently-used time, to allow a more performant implementation of LRU cookie deletion
- bug 201936 - persist the above two fields for purposes of user information (cookiemgr)
Comment 2 • 19 years ago
I know similar problems have been solved before for other components and config files; can someone link to some fixed bugs?
Comment 3 • 19 years ago (Assignee)
vlad has said that storage is (alpha-) ready for consumers, so once 1.1 ships i think we can look at moving cookies over to it. we'll probably continue to write out the legacy cookies.txt to keep compatibility for 3rd party apps. volunteers for testing storage would be appreciated.

one aspect darin mentioned is perf: is storage sufficiently powerful and performant to entrust nsCookieService's memory store to it? or should cookies retain its memory list and lazy-flushing behavior?
Comment 4 • 19 years ago

(In reply to comment #3)
> vlad has said that storage is (alpha-) ready for consumers, so once 1.1 ships i
> think we can look at moving cookies over to it. we'll probably continue to write
> out the legacy cookies.txt to keep compatibility for 3rd party apps.

Probably more than alpha ready even; it's gotten a pretty good beating on calendar, though there's probably still some API niceness/cleanup that could happen.

> volunteers for testing storage would be appreciated. one aspect darin mentioned
> is perf - is storage sufficiently powerful and performant to entrust
> nsCookieService's memory store to it? or should cookies retain its memory list
> and lazy-flushing behavior?

Good question; I don't know the kinds of queries cookies would do, so I don't know how well they could be optimized using indexes. So I think the best answer I can give is "try it and see"... For bookmarks I'm going to try to work with sqlite directly, but that's far less performance intensive...
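vlad's suggested access pattern (one bulk SELECT to read everything out, and all writes flushed in a single transaction) can be sketched stand-alone. Python's `sqlite3` stands in for mozStorage here, and the table and column names are made up for illustration, not the schema that eventually landed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moz_cookies (host TEXT, name TEXT, value TEXT)")

# flush all pending INSERT/UPDATEs in one transaction, rather than one by one
cookies = [("foo.com", "sid", "abc"), ("bar.com", "uid", "42")]
with conn:  # opens a transaction, commits on success
    conn.executemany("INSERT INTO moz_cookies VALUES (?, ?, ?)", cookies)

# read the whole store out with a single SELECT and filter in memory,
# instead of running a per-host "WHERE host LIKE ..." query on every lookup
all_rows = conn.execute("SELECT host, name, value FROM moz_cookies").fetchall()
for_host = [r for r in all_rows if r[0] == "foo.com"]
print(for_host)  # [('foo.com', 'sid', 'abc')]
```

The batching matters because each autocommitted statement is its own transaction; grouping them amortizes the commit cost across all rows.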
Comment 5 • 19 years ago (Assignee)
more info on storage at http://wiki.mozilla.org/Mozilla2:Unified_Storage
Comment 6 • 19 years ago
Keep in mind that cookies needs an in-memory datastore for session cookies anyway. I don't think that using sqlite for session cookies makes sense, so we probably want to preserve the existing in-memory data structure. Whether that means we load persistent cookies from sqlite into the same in-memory data structure, or provide some way to aggregate the two data structures, I'm not sure.

I know that storage is being used by calendar, but it is still sort of a work in progress. The "review storage" bug (see bug 273050) hasn't seen any traction since I posted review comments last year. That said, I suspect that mostly just cosmetic changes are in order. The interfaces probably will change some.
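The hybrid design described here (one in-memory structure holding both kinds of cookie, with sqlite backing only the persistent ones) might look roughly like the following. This is a sketch under stated assumptions, not the eventual nsCookieService design, and the class and column names are invented:

```python
import sqlite3

class CookieStore:
    """Sketch of a hybrid store: every cookie lives in an in-memory dict,
    but only persistent (non-session) cookies are mirrored to sqlite."""
    def __init__(self, db_path=":memory:"):
        self.mem = {}  # (host, name) -> (value, is_session)
        self.db = sqlite3.connect(db_path)
        self.db.execute("CREATE TABLE IF NOT EXISTS moz_cookies "
                        "(host TEXT, name TEXT, value TEXT)")
        # load persistent cookies back into the shared in-memory structure
        for host, name, value in self.db.execute("SELECT * FROM moz_cookies"):
            self.mem[(host, name)] = (value, False)

    def set(self, host, name, value, session):
        self.mem[(host, name)] = (value, session)
        if not session:  # session cookies never touch the database
            with self.db:
                self.db.execute("INSERT INTO moz_cookies VALUES (?, ?, ?)",
                                (host, name, value))

    def get(self, host, name):
        # lookups are served purely from memory
        entry = self.mem.get((host, name))
        return entry[0] if entry else None

store = CookieStore()
store.set("foo.com", "sid", "abc", session=True)
store.set("foo.com", "uid", "42", session=False)
print(store.get("foo.com", "sid"))  # abc
```

The point of the split is that queries use one code path regardless of cookie kind, while the database only ever sees data that must survive a restart.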
Comment 7 • 19 years ago (Assignee)
good point about session cookies. given that we don't want two different querying methods for persistent and session cookies, it's probably best to stick with our hash store. how efficient is sqlite for that use case - where we use it purely for storage, and discount its memory-caching/querying abilities? do we pay for that functionality even if we don't use it?
Comment 8 • 19 years ago

(In reply to comment #7)
> good point about session cookies. given that we don't want two different
> querying methods for persistent and session cookies, it's probably best to stick
> with our hash store.
>
> how efficient is sqlite for that use case - where we use it purely for storage,
> and discount its memory-caching/querying abilities? do we pay for that
> functionality even if we don't use it?

Not sure what you mean by pay; you're going to be doing SELECT queries to read the data out and INSERT/UPDATE statements to flush the data to the store, so they're basically the same. The only difference is the frequency of the calls, and perhaps how much data you'd read (e.g. you'll probably SELECT * to read data and then read one statement, which will be faster than doing a select with some kind of WHERE hostname LIKE '%foo.com'; and you'll probably be flushing all the INSERT/UPDATEs in one transaction, as opposed to doing them one by one). That should give you a net performance increase as far as time spent in sqlite goes.
Comment 9 • 19 years ago (Assignee)
sorry for being vague - by pay, i was thinking memory footprint. for example, the maximum size of the cookie file atm is around 4mb (although real-world ones wouldn't come near that). if sqlite maintains its own in-memory cache of the file data, we'd be doubling up on it... not sure if that's a big deal or not, though. (if it does, it'd be useful to know if there's a way of disabling the cache.)
Comment 10 • 19 years ago

Ah, I see what you mean... sqlite will maintain its own memory cache per database, up to a runtime-specified size. However, I assume cookies would just live as a set of tables in an overall profile storage database, so any caching would be shared across that database. The cookies bits would be evicted as necessary, so I don't think it would be that big of a deal.
Comment 11 • 19 years ago (Assignee)
mvl and i have talked about how we'd implement this whole storage thing for cookies. one issue is versioning - storage is nice in that it allows an older mozilla to use the cookie store without stomping attributes it doesn't know about (e.g. ones that were introduced in a newer mozilla). we should take advantage of that. this also means that we have to keep the db synced (update cookies on the fly, and flush to disk periodically); we can't just nuke-and-rebuild when it comes time to flush.

one potential problem here is that preserving the unknown attributes might not make sense - say a cookie is set for httponly, and then is re-set in an older mozilla (which does not understand httponly) to be available to all sites. preserving the httponly attribute in this case would be wrong. but i don't think that'll be a big deal in practice.

so, on startup we'll have to import the db, which means adding any attributes that don't exist - see, for example, the lastaccessed counter generation code we have at the moment. whenever a cookie is added/removed/changed, we sync that to the db immediately (as an "update", to preserve existing attributes), and lazily flush like we do now. i'm guessing that flushing to disk is handled internally to storage, and is suitably lazy (c.f. the 5sec delay that cookies uses now)?
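The merge-on-update idea above (sync each change as an UPDATE naming only the columns this build knows about, so attributes written by a newer version survive) can be demonstrated directly. The column names here are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# pretend a newer mozilla created this schema, with an isHttpOnly column
# that an older build knows nothing about
conn.execute("CREATE TABLE moz_cookies "
             "(id INTEGER PRIMARY KEY, host TEXT, value TEXT, isHttpOnly INTEGER)")
conn.execute("INSERT INTO moz_cookies VALUES (1, 'foo.com', 'old', 1)")

# the older build syncs a change as an UPDATE naming only the columns it
# understands -- the isHttpOnly attribute it doesn't know about is preserved
with conn:
    conn.execute("UPDATE moz_cookies SET value = ? WHERE id = ?", ("new", 1))

row = conn.execute("SELECT value, isHttpOnly FROM moz_cookies WHERE id = 1").fetchone()
print(row)  # ('new', 1)
```

A DELETE-then-INSERT (or nuke-and-rebuild) of the row would instead drop the unknown column's value, which is exactly the stomping the comment wants to avoid.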
Comment 13 • 17 years ago (Assignee)
taking. i have a patch in hand to move cookies over to mozstorage; still have some testing and perf work to do before it's ready.
Status: NEW → ASSIGNED
Updated • 17 years ago (Assignee)
Assignee: nobody → dwitte
Status: ASSIGNED → NEW
Target Milestone: --- → mozilla1.9alpha6
Comment 14 • 17 years ago (Assignee)
first cut at storage for cookies. some notes:

the design is pretty simple; i kept the in-memory hash we use atm, and just replaced the file reading/writing code. this means we have to keep two data structures synced, but the hashes really are a lot faster for lookups. (i have lots of perf data i'll post later.)

during pageload, there are two perf cases: getting cookies, and setting cookies. getting cookies will be unaffected, since we still use hashes. for the typical case of setting a few cookies, perf should be much faster, since we update only the cookies that changed, rather than waiting 5 seconds and dumping the whole cookie file out to disk. (async writing is critical here.)

i took advantage of the unique "creation id" we store for each cookie, and used it as the row id for the db. this makes db lookup of a specific cookie very fast. the only flaw here is overflow, but you'd have to set 1,000,000 cookies per second for 300,000 years to overflow that 64 bit int.

we currently have a "last accessed" timestamp per cookie, which is updated every time a cookie is modified or sent back to a server, and is used to evict old cookies once we hit the per-host or total limit. this is a pain to do with storage, because we'd have to update this value in the db every time a cookie is sent to a server (slow). i've changed the algo to just evict based on creation id instead, which should work fine since eviction is arbitrary and up to us anyway. this way, we don't need to update anything, and the behavior seems sane.

i decided to delete the user's old cookies.txt file after import; there's not much point for it to stick around, and it contains private data that shouldn't get orphaned and rot on the disk. also, dataloss for cookies isn't that critical, since they can easily be set again. similarly, if the sqlite db has a schema version higher than the build understands (the downgrading case), we just delete the table and start over.
if anyone has arguments to the contrary, i'd like to hear them. ready for testing and review!
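The creation-id scheme in comment 14 (a unique 64-bit id doubling as the sqlite rowid, with eviction by lowest id instead of a constantly-updated last-accessed time) can be sketched as follows; Python's `sqlite3` stands in for mozStorage and the schema is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# using the cookie's unique creation id as the rowid (INTEGER PRIMARY KEY)
# makes lookup and deletion of a specific cookie a cheap rowid seek
conn.execute("CREATE TABLE moz_cookies "
             "(creation_id INTEGER PRIMARY KEY, host TEXT, name TEXT)")
rows = [(100, "a.com", "x"), (101, "b.com", "y"), (102, "c.com", "z")]
with conn:
    conn.executemany("INSERT INTO moz_cookies VALUES (?, ?, ?)", rows)

# eviction by creation id: when over the limit, drop the oldest ids --
# nothing needs updating when a cookie is merely sent back to a server
limit = 2
with conn:
    conn.execute("DELETE FROM moz_cookies WHERE creation_id NOT IN "
                 "(SELECT creation_id FROM moz_cookies "
                 " ORDER BY creation_id DESC LIMIT ?)", (limit,))

remaining = conn.execute(
    "SELECT creation_id FROM moz_cookies ORDER BY creation_id").fetchall()
print(remaining)  # [(101,), (102,)]
```

In SQLite, an INTEGER PRIMARY KEY column is an alias for the table's rowid, which is why this index comes essentially for free.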
Updated • 17 years ago (Assignee)
Summary: Allow for storing more cookie attributes by creating a new file → move cookies to mozstorage
Comment 15 • 17 years ago (Assignee)
ugh, ignore the ridiculous changes to Makefile.in, they're not supposed to be there.
Comment 16 • 17 years ago
> we currently have a "last accessed" timestamp per cookie, which is updated
> every time a cookie is modified or sent back to a server, and is used to evict
> old cookies once we hit the per-host or total limit. this is a pain to do with
> storage, because we'd have to update this value in the db every time a cookie
> is sent to a server (slow). i've changed the algo to just evict based on
> creation id instead, which should work fine since eviction is arbitrary and up
> to us anyway. this way, we don't need to update anything, and the behavior
> seems sane.
I'm not 100% sure I understand this change, but it doesn't sound sane. I think it is going to break some web applications. I know of a few web apps that create lots of cookies, knowing in advance that some of the browsers are going to get fed up and start throwing away cookies. In order to make sure that the most important cookies are not discarded, they reset only the critically important cookies with every response.
For example, a login cookie might be reset with every response. Less important cookies holding information about individual short lived transactions are allowed to expire.
In your patch, does resetting a cookie with the same name and value, or even the same name and a different value, update the creation ID? If so, your approach sounds reasonably safe. If the creation ID stays constant even when the server resets the cookie, you may want to look for an alternative.
Comment 17 • 17 years ago (Assignee)
that's correct; any change to a cookie (including expiry time!) will result in a new nsCookie object to replace the old one, since they're immutable to consumers once created. when critical cookies are evicted and the consumer resets them, they will receive fresh creation ids.
Comment 18 • 17 years ago (Assignee)
a few changes. this uses creation time (in microseconds) as the unique id, with a counter to enforce uniqueness (since the system clock isn't necessarily monotonic). this way, we store the creation time of cookies for free, which would be nice to display to the user.

in the db downgrade case, we check whether all the expected columns are there, and just carry on if they are. this should make switching between mozilla versions transparent. if columns have been deleted, we drop the table and start over.
Attachment #267699 - Attachment is obsolete: true
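The uniqueness trick in comment 18 (creation time in microseconds plus a counter, because the clock can step backwards) amounts to never handing out an id less than or equal to the previous one. A minimal sketch, with an invented class name:

```python
import time

class CreationIdGenerator:
    """Sketch: use creation time in microseconds as a unique, increasing id,
    bumping a counter whenever the clock hasn't advanced (or went backwards)."""
    def __init__(self):
        self.last_id = 0

    def next_id(self):
        now_usec = int(time.time() * 1_000_000)
        # the system clock isn't necessarily monotonic, so never return
        # an id that is <= the one handed out previously
        self.last_id = max(now_usec, self.last_id + 1)
        return self.last_id

gen = CreationIdGenerator()
ids = [gen.next_id() for _ in range(1000)]
print(len(set(ids)) == len(ids))  # True -- all unique and increasing
```

Because the ids only deviate from wall-clock time when collisions or clock steps occur, they still approximate the cookie's actual creation time for display.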
Comment 19 • 17 years ago (Assignee)
minor revisions. ready for review!
Attachment #267942 - Attachment is obsolete: true
Attachment #268007 - Flags: superreview?(mconnor)
Attachment #268007 - Flags: review?(sdwilsh)
Comment 20 • 17 years ago
Comment on attachment 268007 [details] [diff] [review]
storage patch v3

General nit: s/nsInt64/PR[I|Ui]nt64/g, but you can do a followup bug if you want since all these files seem to use it still. See http://groups.google.com/group/mozilla.dev.tech.nspr/msg/d48490f4c0781b92

>+nsresult
>+nsCookieService::InitDB()
>+{
>+  // cache a connection to the cookie database

This comment should really go above where you make the call to OpenDatabase.

>+  PRInt32 dbSchemaVersion;
>+  {
>+    // scope the statement, so the write lock is released when finished
>+    nsCOMPtr<mozIStorageStatement> stmt;
>+    rv = mDBConn->CreateStatement(NS_LITERAL_CSTRING("PRAGMA user_version"),
>+                                  getter_AddRefs(stmt));
>+    NS_ENSURE_SUCCESS(rv, rv);
>+
>+    PRBool hasResult;
>+    rv = stmt->ExecuteStep(&hasResult);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+    NS_ENSURE_TRUE(hasResult, NS_ERROR_FAILURE);

No result here means that user_version is not set. If it isn't, we can still figure out whether the DB is usable and act accordingly. As the code is now, the user would have to delete the database themselves for this to ever work again.

>+  // downgrading.
>+  // if columns have been added to the table, we can still use the ones we
>+  // understand safely. if columns have been deleted or altered, just
>+  // blow away the table and start from scratch!

Please explicitly state that if someone modifies how a column is used, they need to rename it so downgrading works properly.

r=me with these fixed
Attachment #268007 - Flags: review?(sdwilsh) → review-
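The review point about `PRAGMA user_version` (an unset version shouldn't be a fatal error; the code can still decide whether the table is usable) can be illustrated with a stand-alone sketch. Python's `sqlite3` stands in for mozStorage, and the schema, version constant, and function name are made up for illustration:

```python
import sqlite3

SCHEMA_VERSION = 1
COLUMNS = {"id", "host", "name", "value"}

def init_db(conn):
    """Treat an unset user_version (reported as 0) as 'inspect the table
    and act accordingly', not as an unrecoverable failure."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version == 0:
        # never stamped: create the table if needed, then stamp the version
        conn.execute("CREATE TABLE IF NOT EXISTS moz_cookies "
                     "(id INTEGER PRIMARY KEY, host TEXT, name TEXT, value TEXT)")
        conn.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
    elif version > SCHEMA_VERSION:
        # downgrade case: usable only if every expected column is present
        cols = {row[1] for row in conn.execute("PRAGMA table_info(moz_cookies)")}
        if not COLUMNS <= cols:
            conn.execute("DROP TABLE moz_cookies")
            conn.execute("CREATE TABLE moz_cookies "
                         "(id INTEGER PRIMARY KEY, host TEXT, name TEXT, value TEXT)")
    return conn

conn = init_db(sqlite3.connect(":memory:"))
print(conn.execute("PRAGMA user_version").fetchone()[0])  # 1
```

Note that `PRAGMA user_version` always yields a row in SQLite (0 when unset), so the "no result" case in the reviewed C++ is really the unset-version case.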
Updated • 17 years ago (Assignee)
Attachment #268007 - Flags: superreview?(mconnor)
Comment 21 • 17 years ago (Assignee)
nits fixed. thanks for the quick review!
Attachment #268014 - Flags: superreview?(mconnor)
Attachment #268014 - Flags: review?(sdwilsh)
Updated • 17 years ago
Attachment #268014 - Flags: review?(sdwilsh) → review+
Comment 22 • 17 years ago
Comment on attachment 268014 [details] [diff] [review]
storage patch v3.1

>+  if (NS_FAILED(rv)) {
>+    // our columns aren't there - drop the table!
>+    rv = mDBConn->ExecuteSimpleSQL(NS_LITERAL_CSTRING("DROP TABLE moz_cookies"));
>+    NS_ENSURE_SUCCESS(rv, rv);

do we want to copy the sqlite file so that someone who blows up their file on a downgrade this way can recover?

>+  // check whether to import or just read in the db
>+  if (!tableExists)
>+    rv = ImportCookies();
>+  else
>+    rv = Read();
>+
>+  return rv;
>+}

  if (!tableExists)
    return ImportCookies();

  return Read();

Generally seems more readable.

>+  // if it's a non-session cookie and hasn't just been read from the db, write it out.
>+  if (aWriteToDB && !aCookie->IsSession() && mStmtInsert) {
>+    // use our cached sqlite "insert" statement
>+    mozStorageStatementScoper scoper(mStmtInsert);
>+
>+    nsresult rv = bindCookieParameters(mStmtInsert, aCookie);
>+    if (NS_FAILED(rv)) {
>+      NS_WARNING("binding parameters failed!");

style whining: return PR_TRUE here, and ditch the else? too-much-nested-if ;)

>+    } else {
>+      PRBool hasResult;
>+      rv = mStmtInsert->ExecuteStep(&hasResult);
>+      if (NS_FAILED(rv))
>+        NS_WARNING("db insert failed!");
>+    }
>+  }
>
>  return PR_TRUE;

overall, sr=me, looks like sdwilsh did the hard stuff here.
Attachment #268014 - Flags: superreview?(mconnor) → superreview+
Comment 23 • 17 years ago (Assignee)
checked in for alpha, with nits fixed. i didn't do the file backup yet; probably want fresh reviews for that. might add that in at a later point. fixed!

notes to anyone considering modifying the cookie sqlite db schema in future: see comments in nsCookieService::InitDB() re upgrading and downgrading!
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 24 • 17 years ago
(In reply to comment #23)
> i didn't do the file backup yet; probably want fresh reviews for that. might add
> that in at a later point.

Probably would be a good idea to file a bug for this, lest it be forgotten.
Comment 25 • 17 years ago
This totally broke cookies in Camino, as far as I can tell.
Comment 26 • 17 years ago
Please file a separate bug for Camino, as Camino-specific bugs are not a blocker.
Comment 27 • 17 years ago
(In reply to comment #26)
> Please file a separate bug for Camino, as camino-specific bugs are not a
> blocker.

Yeah, I was on my way to do this too. Bug 385074 has been filed.
Comment 28 • 17 years ago
Not user-visible, regressions can be tested separately, so in-testsuite-.
Flags: in-testsuite-
Comment 29 • 17 years ago (Assignee)
as promised in comment 14, some comments on perf:

transactions make a big difference. with synchronous writing on (sqlite default), adding 1000 cookies one-by-one (via INSERT), in a loop, took (all times in microseconds):
- no transaction: 1100000 (yes, 1.1 seconds!)
- no transaction, but synchronous=OFF: 106000
- transactions (synchronous makes no difference): 6900

which justifies why we use async writing, to optimize the relatively common case of a pageload setting 5-10 cookies over a few seconds. note that, while we use transactions inside the SetCookie functions themselves, it's not possible to set all cookies for a pageload under one transaction - so we have to optimize that case as well, which async helps with a lot.

also, these tests were for the case of no indexing on the table, with default caching and paging. starting from the last number (6900) and playing around with what we index the table on:
- transactions, with index on host string: 13000
- transactions, with index on 64 bit int (expiry): 12500
- transactions, with index on unique rowid: 5600

i.e. the cost of indexing on either a string or a non-unique integer is significant, but indexing on the row id itself is very cheap (c.f. creation time).

very similar numbers result for removal of 1000 entries:
- removing, with match on name, path and host strings: 290000
- removing, as above, but indexing on host: 15700
- removing, index on host, and only matching host: 12500
- removing, index on 64 bit int (expiry): 15700
- removing, index on unique rowid: 5600

thus when removing a specific entry, any kind of indexing gives a 20x speedup. whether the index is by string or int doesn't matter much, but indexing by rowid is far cheaper.

looking at caching settings (PRAGMA cache_size), i didn't see much difference in insertion time for cache sizes between 2 and 2000 pages. paging (PRAGMA page_size) did have an effect on insertion time; large pages (>2k) doubled the insertion time. pages of 512 - 1024 bytes seemed to work best (tested with sync on), which probably corresponds to the OS page size. based on these findings, i left page_size and cache_size at the sqlite defaults.

looking at mem usage vs cache_size, the largest cookie file is around 4MB (1000 cookies * 4k each). that's significant, so we want to limit the cache somewhat. the default cache size is 2000 pages, which added around 400k to mem usage over the no-cache value. increasing the cache to 10000 pages made mem usage jump 4MB (the whole db). the default value seemed reasonable (the in-mem cookie list is going to chew up 4MB anyway, so a 400kb database cache is small potatoes).

finally, comparing lookup times: 1000 lookups (matching on host, name, and path) take 356us using the in-memory hashtable (fast!). SELECTing those same entries (again, matching by host, name, and path; also indexing on host, to be fair) takes 13000us. this explains why i stuck with our hash structure; the db is much slower for query operations.
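The headline result above (wrapping all the INSERTs in one transaction beats autocommitting each one by orders of magnitude) is easy to reproduce. This sketch uses Python's `sqlite3` against a file-backed database, since the per-commit sync cost that dominates the no-transaction case doesn't exist for an in-memory db; absolute numbers will differ by machine, and the function name and schema are invented:

```python
import os
import sqlite3
import tempfile
import time

def time_inserts(path, use_transaction, n=200):
    """Time n one-row INSERTs, either autocommitted one by one or
    wrapped in a single transaction."""
    conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
    conn.execute("CREATE TABLE moz_cookies (id INTEGER PRIMARY KEY, host TEXT)")
    start = time.perf_counter()
    if use_transaction:
        conn.execute("BEGIN")
    for i in range(n):
        conn.execute("INSERT INTO moz_cookies VALUES (?, ?)", (i, f"host{i}"))
    if use_transaction:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM moz_cookies").fetchone()[0]
    conn.close()
    return elapsed, count

with tempfile.TemporaryDirectory() as d:
    slow, n1 = time_inserts(os.path.join(d, "a.sqlite"), use_transaction=False)
    fast, n2 = time_inserts(os.path.join(d, "b.sqlite"), use_transaction=True)
    print(f"no transaction: {slow*1e6:.0f}us, one transaction: {fast*1e6:.0f}us")
```

Without an explicit transaction, every INSERT is its own commit (and, with synchronous writing on, its own disk sync), which is exactly the 1.1-second case measured above.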
Comment 30 • 17 years ago (Assignee)
(all these tests were done on a firefox opt build, otherwise default build options, on a core 2 duo 6400 @ 2.13GHz).