Closed
Bug 71071
Opened 24 years ago
Closed 24 years ago
rewriting cookies.txt on each cookie
Categories
(Core :: Networking: Cookies, defect)
Core
Networking: Cookies
Tracking
()
VERIFIED
FIXED
mozilla0.9
People
(Reporter: cathleennscp, Assigned: morse)
Details
(Keywords: topperf)
Attachments
(1 file)
1.16 KB,
text/plain
|
Details |
Currently, when we need to add a new cookie, we blow away the entire cookie file
in order to update it with the new cookie info. This is costly on performance,
especially with slow file acess platform.
We need to keep the old cookie file and add the new line for the new cookie.
Let me rephrase that by saying that we need to reduce the frequency of writing
this out by not doing it on each setcookie notification.
Comment 3•24 years ago
|
||
My position on this is that it's a feature :-). We crash often enough that
people get upset when we don't write to disk after each cookie change. We got
enough pressure in 4.0 to add this functionality then, and we're getting
pressure now from embeddors for the same reason (data loss potential).
Do we have performance data on this being a problem? It's obviously a function
of urls that use cookies.
Assignee | ||
Comment 4•24 years ago
|
||
Perhaps there's a middle ground. We could write out once for each page load,
perhaps when we got the notification that the page has finished loading. As it
is now, we write out on each cookie set, so if a page sets seven cookies (as
does netscape.com), we do seven writes.
Comment 5•24 years ago
|
||
If we can tell that the cookie is not in the file, we should append. Unix write
is atomic to disk filesystems, and append mode guarantees that the write goes at
the end of the file. Batching up on end load sounds good too.
/be
Comment 6•24 years ago
|
||
yea, someone else mentioned this as well. If we can batch up a
per-top-level-document entity and write when that's complete that's the best way
to go. Many sites' cookie intent would be invalid if only 3 out of their 7
cookies were written anyway. Each top-level-doc can be considered a
"transaction." So, yea, let's tie cookie writing to an OnEndDocumentLoad() or
when a docuemnts loadgroup goes away (I'm partial to tha latter). The
internal/private cookie interface should have a "flush()" or "persist()" method
on it. We don't want to expose this on the public iface because we already have
a profile wide "flush()" mechanism that can be used if an embeddor wants to snap
things to disk.
Comment 7•24 years ago
|
||
> pressure now from embeddors for the same reason (data loss potential).
When the file is rewritten is a backup done and a crash in the middle handled on
restart? If not, writing more frequently would increase the chance of data loss
of possibly the whole file.
Comment 8•24 years ago
|
||
IMO, that argument doesn't hold water. the likelyhood of crashing due to power
failure, disk i/o failure, system failure, or some other random act of G__, is,
relative to Mozilla crashes and stability problems, negligible at best.
I'd still argue that we're chasing our tail here. no-one has yet produced any
perf numbers regarding the penalty we're paying for doing this. Obviously
hitting the disk is expensive, I just don't think it's happening as often as
people think.
We don't even have to do Quantify level profiling, at least on windows and
linux, the cookie library is separate and can be removed from the components dir
by hand. Some stopwatch sanity checks on page loads w/ heavy cookie usage, w/
and w/ out the cookie lib might provide some insight.
If this is a noticable issue, we should take a couple approaches:
1. Consider each top level URL load a "transaction" and only write the file once
per top-level load.
2. We may want to do this anyway, but IIRC, we set a "cookies have changed" flag
at the outermost level of the SetCookie() routine which indeed makes us write
for *every* set-cookie header that comes in. We could set this flag at a more
granular level, perhaps *only* when a cookie has indeed changed (this would
require some additional logic to determine if a pre-existing cookie was there,
and whether or not it was identical to the one trying to be set).
Comment 9•24 years ago
|
||
sorry, didn't answer your question. backups and recovering from a broken write
are not handled.
Comment 10•24 years ago
|
||
It depends on a lot of things, for example the size of the cookie file.
I have had this sort of thing bite me before in applications, but it may well
not be an issue in Mozilla. Personally I wouldn't be too upset if I lost my
cookie file anyway.
Hopefully everyone agrees changing this to append is a good idea whether
immediate or not.
Comment 11•24 years ago
|
||
blindly appending would lead to cookie file growth beyond anyone's interest.
random access on the cookie file would involve some in memory representation of
the file's state, and at this point I'd argue we're putting more time into this
problem than is justified. AFAIK, this still hasn't shown up on any perf metric
radar.
Assignee | ||
Updated•24 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9
Comment 12•24 years ago
|
||
Sounds like OS/All to me. Change back to WinNT if I'm wrong. <nit>btw, is this
really topperf or 'merely' perf in keyword terms?</nit>
OS: Windows NT → All
Comment 13•24 years ago
|
||
I ran some performance numbers on a p3 450 (maybe 350?) running windows 2k with
a scsi hard drive. I used jrgm@netscape.com's performance server. In one run, I
remove the file saving code from cookies (I added an early return so that the
code would not ever be hit.) On the next run, I remove this early return so
that there were no changes in my tree. In an opt build, there was no negative
performance for writing the cookies out. In fact opposite as what you would
expect, the build that wrote to disk every time a cookie was set ran a few
seconds faster!! I attributed this to possible concurrent usage of the server,
but jrgm discredited this claim.
I am guessing that the disk cache is caching this file, or the scsi performance
makes this kind of write nearly invisible. We will have to try this test on a
lower end machine, maybe a mac.
Comment 14•24 years ago
|
||
Some operating systems (OpenVMS being one of them) will not return from a
close() until all the data is on the disk. Its probably these systems which are
really getting hit by the cookies file getting rewritten several times per page
load.
Comment 15•24 years ago
|
||
I wrote a small test to get some idea of the cost of writing a cookie file.
Using my own cookie file to estimate how much data to write, I decided that the
program should write 250 lines each of 75 characters. I put the
open/write-loop/close in a loop and let it run long enough to get some
reasonable numbers. My two test systems were a dual-cpu Alpha running OpenVMS,
and a 500MHz pentium running RedHat 6.1. Results are:
OpenVMS - 70 "cookie files" in 9 seconds - 0.9s for 7 cookies
Linux - 7000 "cookie files" in 4 seconds - 0.004s for 7 cookies
[The "7 cookies" is because earlier in this bug someone states that visiting
netscape.com results in 7 cookies. All times are elapsed times, not cpu times.]
So, somewhat as expected. On an O/S with a disk write cache the cost of writing
multiple cookie files in negligable. But on an O/S where it cares that bits
actually make it out to the disk the cost is pretty high.
Note that this test does not take into account the reading of the cookie file.
I'll attach the little test program in case anyone wants to try on a different
operating system.
Comment 16•24 years ago
|
||
Comment 17•24 years ago
|
||
anyone want to run colin's program on the mac and post the findings here?
Comment 18•24 years ago
|
||
Jud: no one proposed blindly appending, that's silly. Append-mode write makes
sense if the client can know the cookie is not yet in the file (and only the
client writes the file).
We have seen performance results from jrgm showing where this hurts: NFS. NFS
does synchronous writes (without server-side NVRAM accelerator hardware) per its
spec, so clients can be sure that data is on stable storage by the time they get
the RPC reply message for a given write.
Then there is OpenVMS.
I think this bug is worth fixing for those who care. Maybe Colin could come up
with a patch?
/be
Comment 19•24 years ago
|
||
I looked at the cooked code a little. First off, its NOT possible to simply
append a new cookie if its new because the contents of the cookie file are
sorted (by path name length). Also I'm not sure how much gain that would really
have given use anyway since a lot of the time we are updating existing cookies.
At the moment whenever we create or update a cookie we set a changed flag and
then call cookie_Save. So the fix is to just set the changed flag, and then call
cookie_Save from an OnEndDocumentLoad routine (if the flag is set)? Is that's
all we need to do here? I could probably manage that!
Comment 20•24 years ago
|
||
Just a small note that synchronous behavior can occur on a Linux ext2
filesystem. Just mount the fs with the sync option.
You can see the difference with this test. Assume the test program is
a.out, then do
rm cookie-dummy.txt
chattr +S `pwd`
./a.out 200
rm cookie-dummy.txt
chattr -S `pwd`
./a.out 200
Assignee | ||
Comment 21•24 years ago
|
||
Colink, I had already thought about the sorting issue and and concluded it is
not a problem. The cookies need to be sorted by pathlength when in memory but
the order on the hard disk is insignificant -- whenever the cookies are read in
they get re-sorted anyway.
But I also agree with you that the appending is probably not worth doing.
As far as the write on load done, you are correct that the flag that is already
being set makes this very simple. I am presently testing out a change that does
this. Furthermore I'm in the midst of a major code reorganization for the
cookie module (see bug 46783) so it's best that nobody else modify any of the
cookie files at the moment. I'll probably be checking in the fix to save-once
as part of the checkin for bug 46783 and will close out this bug report at that
time.
Comment 22•24 years ago
|
||
Steve: thanks, that matches my experience with cookies files -- they need not be
sorted on disk.
Everyone: how about some measurements on the synchronous systems before we give
up on append-mode? If the cookies file is small, it may not matter. But in
general I think it's better to avoid rewriting a file in place. The other
benefit is crashproofing, which OS kernel and utility code always ensures by
rewriting a tmp file on the same filesystem, then using an atomic rename(2)
system call to replace the existing file. At no point will a crash leave you
with no data, partial data, or mangled data.
Of course, cookies.txt may not be worth the bother. Opinions? The code change
to do a safe tmpfile write and rename is small. I can help with the details.
/be
Assignee | ||
Comment 23•24 years ago
|
||
The change for writing the cookies when the document finishes loading has now
been included in the patch for bug 46783. So I'll close out this bug report
when that patch gets checked in.
Comment 24•24 years ago
|
||
Brendan: what about symlinks? Should mozilla replace a symlink with a
regular file just because it didn't expect it? It would annoy the user
who had her reasons. Does anyone consider a symlink a security issue? I
don't; others may.
It occurs to me that a power user could make the cookie file a named
pipe and do all sorts of interesting things. Maybe mozilla should pay
more attention to special files.
Personally, I don't think this is worth all the attention it's getting.
Assignee | ||
Comment 25•24 years ago
|
||
Fixed as part of the checkin for bug 46783 that just occured.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•