Last Comment Bug 477934 - nsSafeFileOutputStream (prefs.js) not safe from system crashes
: nsSafeFileOutputStream (prefs.js) not safe from system crashes
Status: VERIFIED FIXED
: dataloss, fixed1.8.1.22, verified1.9.1
Product: Core
Classification: Components
Component: Networking: File (show other bugs)
: Trunk
: All All
: -- critical (vote)
: mozilla1.9.2a1
Assigned To: timeless
:
: Patrick McManus [:mcmanus]
Mentors:
: 318156 406378 (view as bug list)
Depends on: 496825
Blocks: profile-corrupt startup 406378
  Show dependency treegraph
 
Reported: 2009-02-10 18:18 PST by Sean Hunt
Modified: 2015-09-17 04:43 PDT (History)
26 users (show)
mbeltzner: blocking1.9.1+
dveditz: wanted1.8.1.x+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
flush first (673 bytes, patch)
2009-03-12 02:56 PDT, timeless
cbiesinger: review+
dveditz: approval1.8.1.next+
Details | Diff | Splinter Review
(2.84 KB, text/plain)
2009-05-07 22:39 PDT, WangJie
no flags Details

Description Sean Hunt 2009-02-10 18:18:34 PST
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121622 Ubuntu/8.10 (intrepid) Firefox/3.0.5
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121622 Ubuntu/8.10 (intrepid) Firefox/3.0.5

If the OS crashes or processing otherwise terminates, prefs.js has a tendency to get corrupted, and Firefox just resets the whole thing. Unfortunately, the first thing I do upon startup is load firefox, so I keep forgetting to check to see whether any corruption has occurred, so I can't produce the corrupted version.

Reproducible: Sometimes

Steps to Reproduce:
1. Crash the OS (or hold down the power button)
2. Load Firefox.
Actual Results:  
Many preferences have disappeared, including all my NoScript & Chatzilla options. These are all stored in prefs.js

Expected Results:  
Firefox should not lose my preferences.

I'm on ext4, so a filesystem bug could be the culprit, but the journal should prevent any serious damage to the files. So Firefox must not be writing to the file all at once and have no backups.
Comment 1 Sean Hunt 2009-02-12 12:51:57 PST
Update: I managed to look at prefs.js this time, and the file was completely empty, rather than corrupted.

This would imply the failing is somewhere between the fopen call that resets the file, and the actual writing to the file.

The only solution I can think of is to write to a backup file after writing to the main prefs.js, so that if prefs.js is lost, then backup can be used to restore prefs.
Comment 2 WADA 2009-02-13 00:04:46 PST
See Bug 246722 Comment #2 for history of backups.number_of_prefs_copies.
Comment 3 Sean Hunt 2009-03-11 15:30:39 PDT
It appears to be a conflict between the way ext4 handles delayed allocation, and the way Firefox writes files. More information at https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/88
Comment 4 timeless 2009-03-12 02:56:58 PDT
Created attachment 367003 [details] [diff] [review]
flush first

in theory this should fix it....

otoh, i don't intend to setup linux+ext4 and then intentionally cause my system to crash just to test it.

note that i'm not testing the return value from Flush(). It could be tested, but i'm not sure what's gained from doing that.

Either the user wants a chance to get their new data, or they don't.
Comment 6 Simon Bünzli 2009-04-26 08:53:00 PDT
*** Bug 318156 has been marked as a duplicate of this bug. ***
Comment 7 Simon Bünzli 2009-04-26 08:56:04 PDT
Comment on attachment 367003 [details] [diff] [review]
flush first

This would also make Session Restore more reliable (cf. bug 406378).
Comment 8 timeless 2009-04-28 00:15:06 PDT
*** Bug 406378 has been marked as a duplicate of this bug. ***
Comment 9 Mike Beltzner [:beltzner, not reading bugmail] 2009-05-06 10:53:25 PDT
Comment on attachment 367003 [details] [diff] [review]
flush first

a191=beltzner
Comment 11 Mike Beltzner [:beltzner, not reading bugmail] 2009-05-06 21:27:56 PDT
This checkin was in a range identified with a Ts Shutdown regression on Windows:

Regression: Ts Shutdown increase 27.64% on WINNT 5.1 Firefox3.5
   Previous results:
       357.263 from build 20090506145316 of revision 486b76052a94 at 2009-05-06 14:53:00
   New results:
       456.0 from build 20090506155401 of revision 7aa4483585bd at 2009-05-06 15:54:00
http://graphs-new.mozilla.org/graph.html#tests=[{"machine":32,"test":36,"branch":3},{"machine":33,"test":36,"branch":3},{"machine":34,"test":36,"branch":3},{"machine":35,"test":36,"branch":3},{"machine":48,"test":36,"branch":3}]&sel=1241564040,1241736840
Comment 12 Dão Gottwald [:dao] 2009-05-07 02:56:44 PDT
Backed out from trunk and branch. If there's no fix for the perf regression and it's considered less severe than the dataloss issue, you might want to reland this as is on trunk and re-request branch approval.
Comment 13 Shawn Wilsher :sdwilsh 2009-05-07 07:41:53 PDT
So, if it's supposed to be a safe output stream, and we are using it as such, and it's not actually safe, it seems to be a serious issue.  This might just be a performance regression we need to eat.
Comment 14 WangJie 2009-05-07 22:39:32 PDT
Created attachment 376378 [details]
 

test comment
Comment 15 timeless 2009-05-10 15:15:29 PDT
dao: so um. this is a mandatory flush. basically anytime we're seeing a perf penalty here, it's because the system was planning to lose our data if it died.

we're paying for an essential data flush.

if someone wants to rework the flush so that it happens on a thread while we shut down, well, in theory, that might be possible, but given the totally broken state of unix platforms, you probably wouldn't see a win for a while anyway. (oh, and the complexity of such a thing would be amazingly painful, you'd have to create a queue and make sure that renames cooperated sequentially and that you shut down the flusher before you quit.)

now, i don't care, but if you're going to back it out, then i hope that you'll also reland it.

i'm traveling for business and am not going to campaign to save your [firefox] users' data.

to be perfectly frank, i'm quite happy for everyone else to lose their data. I've done my part by providing the necessary fix.
Comment 16 Andrew Hagen 2009-05-10 17:30:55 PDT
Keyword: relnote. Assuming this is not fixed for 3.5, here is the proposed release note:

"Users on the Linux platform may randomly experience loss of preferences when Firefox shuts down. Plan accordingly."
Comment 17 Dave Garrett 2009-05-10 19:53:53 PDT
(In reply to comment #16)
> "Users on the Linux platform may randomly experience loss of preferences when
> Firefox shuts down. Plan accordingly."

It's certainly not random. It's a compounding effect from a crash.
Comment 18 Simon Bünzli 2009-05-11 00:44:44 PDT
(In reply to comment #16)
This actually affects all platforms, not only preferences (e.g. sessionstore.js as well) and only the case of a system crash.
Comment 19 Mike Beltzner [:beltzner, not reading bugmail] 2009-05-12 11:42:24 PDT
Dataloss vs. TsShutdown ... hard call, but I think the arguments in here have swayed me to believe that this is more important.

Please reland on trunk and branch.
Comment 20 Johnny Stenback (:jst, jst@mozilla.com) 2009-05-12 19:26:46 PDT
Re-landed on trunk and 1.9.1.

http://hg.mozilla.org/mozilla-central/rev/6bb086bb8b90
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/5bb3fc3ab02f
Comment 21 Ludovic Hirlimann [:Usul] 2009-05-28 10:19:18 PDT
Would this be portable to 1.8 for the next Tb 2.x update ?
Comment 22 Dan Mosedale (:dmose) 2009-05-28 10:25:00 PDT
I suspect this wants to actually block 1.8.1.next to help Thunderbird profile lossage issues; nominating.
Comment 23 Ludovic Hirlimann [:Usul] 2009-05-28 10:25:46 PDT
Oups
Comment 24 Daniel Veditz [:dveditz] 2009-05-29 11:07:43 PDT
This is coming in too late to block for 1.8.1.22 -- we don't have time to hassle this into the tree. If you want it fixed now rather than two-three months from now we'll take the patch if you request approval and can promise to land it on the 1.8 branch _TODAY_, Friday May 29.
Comment 25 Daniel Veditz [:dveditz] 2009-05-29 12:28:39 PDT
Comment on attachment 367003 [details] [diff] [review]
flush first

Approved for 1.8.1.22, a=dveditz
Comment 26 Robert Kaiser 2009-05-29 12:31:08 PDT
Checking in netwerk/base/src/nsFileStreams.cpp;
/cvsroot/mozilla/netwerk/base/src/nsFileStreams.cpp,v  <--  nsFileStreams.cpp
new revision: 1.72.8.2; previous revision: 1.72.8.1
done
Comment 27 Henrik Skupin (:whimboo) 2009-06-07 16:40:49 PDT
Verified fixed on trunk and 1.9.1 based on code check-in.

I tried it also with the steps given by the reporter and didn't loose settings. Reporter, can you verify the fix too?
Comment 28 Michael Kraft [:morac] 2009-06-09 08:13:42 PDT
This change has resulted in a fairly good chance of triggering a NS_ERROR_FILE_ACCESS_DENIED exceptions when nsSafeFileOutputStream::Finish() is called on my home PC.  The problem is that the Flush() is triggering Norton Internet Security 2009 (and presumably Norton Antivirus 2009) to scan the file which can result in the temporary file being locked when the move occurs, resulting in the changes being lost.

This is resulting in frequent loss of data when saving preferences, installing add-ons, etc.

See bug 496825
Comment 29 Dan Mosedale (:dmose) 2009-06-09 18:10:14 PDT
Bug 496825 Comment 15 makes me think that we need to get in touch with Symantic.  Perhaps Sam has some insight here?
Comment 30 Samuel Sidler (old account; do not CC) 2009-06-09 18:23:12 PDT
I contacted them, as I said in that bug... Michael's comment was split off to a new bug, which is where we're tracking that (potential) regression.

Note You need to log in before you can comment on or make changes to this bug.