Losing history on every Win98 crash due to missing directory flush

RESOLVED FIXED

Status

()

Core
History: Global
--
major
RESOLVED FIXED
15 years ago
10 years ago

People

(Reporter: Christian Franke, Assigned: Bienvenu)

Tracking

Trunk
x86
Windows 98
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.4.2 +
blocking1.6b -
blocking1.6 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

15 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.5) Gecko/20030925
Build Identifier: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.5) Gecko/20030925

When Win9x crashes with Mozilla open, the history file gets corrupted by
scandisk's file size "repair".

Reproducible: Always

Steps to Reproduce:
1. Start Mozilla and visit URLs.
2. Crash Windows (e.g. Hardware-Reset)
3. Reboot Windows, let scandisk repair the allocation error of history.dat.
4. Start Mozilla
5. Open History

Actual Results:  
History Lost.

Expected Results:  
History preserved, only some recent entries may be lost.

Win98 does never update the file size in the directory entry when the file is
written in sequential manner without intermediate seeks or reads.

After reboot, scandisk repairs the difference between allocation chain length
and file size by setting the file size to end of the last cluster.

This adds garbage to the end of history.dat. Mozilla clear the corrupted file.

A call to FlushFileBuffers() to update the file size would solve this problem.
For an example see function update_EOF() in dbm/src/hash.c
(http://lxr.mozilla.org/seamonkey/source/dbm/src/hash.c#617)


For a method to recover the history see Bug 113378, comment 10.
(Reporter)

Comment 1

15 years ago
This may work or not:

morkConfig.h:

  #if defined(MORK_WIN)
  void mork_fileflush(FILE * file);
  #define MORK_FILEFLUSH(file) mork_fileflush(file) 
  #else
  #define MORK_FILEFLUSH(file) fflush(file) 
  #endif /*MORK_WIN*/

morkConfig.cpp:

  #if defined(MORK_WIN)

  #include <whatever/needed.h>

  void mork_fileflush(FILE * file)
  {
    fflush(file);
    int fd = fileno(file);
    HANDLE fh = (HANDLE)_get_osfhandle(fd);
    FlushFileBuffers(fh);
  }

  #endif /*MORK_WIN*/

(Sorry, but I don't have Moz.Development installed.
Perhaps someone can test this ;-)

*** This bug has been marked as a duplicate of 63292 ***
Status: UNCONFIRMED → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → DUPLICATE
Summary: Loosing history on every Win98 crash due to missing directory flush → Losing history on every Win98 crash due to missing directory flush
(Reporter)

Comment 3

15 years ago
Re comment 2: Sorry for possible spam, but this is IMO not a dup, because Bug
63292 does not address this Win9x-only issue (no problem on W2K/XP or other OS).
See also Bug 63292, comment 75.
(Reporter)

Comment 4

15 years ago
Created attachment 133681 [details]
Testprogram + Outputs to demonstrate directory flush issue

The program writes 10 Blocks of 1000000 Bytes and displays the file size
reported by the directory functions (stat()/FindFirstFile()) after each Block.
After Block 3 and 7, fsync()/FlushFileBuffers() is called. For the remaining
blocks, the programs waits up to 20 seconds for size match. Results:

Win98SE: Size in dir is only updated by FlushFileBuffers() and CloseHandle().

WinXP-FAT: Slow (about 130Kb/sec) and unrelieable update of size in dir,
FlushFileBuffers() recommended.

WinXP-NTFS: Fast update of size in dir.

Linux: No problem ;-)

Comment 5

15 years ago
*** Bug 223781 has been marked as a duplicate of this bug. ***

Comment 6

15 years ago
reopen b/c duplicate
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---

Comment 7

15 years ago
and confirming b/c duplicate
Status: UNCONFIRMED → NEW
Ever confirmed: true
(Reporter)

Comment 8

15 years ago
Re comment 7: Thanks for confirming.
Same problem and same fix applies to panacea.dat and Mail/*.msf files
blocking1.4.2? and 1.6b?
Flags: blocking1.6b?
Flags: blocking1.4.2?
(Reporter)

Updated

15 years ago
Blocks: 19454

Updated

15 years ago
No longer blocks: 19454

Comment 9

15 years ago
Who knows mork and can help us here? bienvenu, jst, can you look at this patch? 
(Assignee)

Comment 10

15 years ago
sure, I can try this - the patch looks OK.
(Assignee)

Comment 11

15 years ago
I worry a little that it might slow us down on windows, but I'll try to see.
(Assignee)

Comment 12

15 years ago
Created attachment 136314 [details] [diff] [review]
cvs diff

this fix will work - however, it will cause excessive disk flushing on windows,
aka grinding, because we commit quite frequently. Also, do we even support
win98 anymore?
(Reporter)

Comment 13

15 years ago
The following change prevents excessive disk flushing on WinNT/2K/XP:

 void mork_fileflush(FILE * file)
 {
   fflush(file);
+  if (!(GetVersion() & 0x80000000))
+    return; // WinNT/2K/XP
+  // Win9x/ME
   int fd = fileno(file);
   HANDLE fh = (HANDLE)_get_osfhandle(fd);
   FlushFileBuffers(fh);
 }
(Assignee)

Comment 14

15 years ago
thx very much, I'll try that. Personally, I don't think we need the flushing on
win NT because NT is much better about cleaning itself up when apps crash - blue
screen of death is very rare, at least compared to win98...
msdn about GetVersion:
"This function has been superseded by GetVersionEx, which is the preferred
method for obtaining system version number information. New applications should
use GetVersionEx. "
(Reporter)

Comment 16

15 years ago
That's correct, here a version using GetVersionEx:

+  OSVERSIONINFOA vi = { sizeof(OSVERSIONINFOA) };
+  if (!(GetVersionExA(&vi) && vi.dwPlatformId == VER_PLATFORM_WIN32_WINDOWS))
+    return; // WinNT/2K/XP/unknown
+  // Win9x/ME

(BTW: MSVC 6.0 CRT itself uses GetVersion(), so it is very unlikely to disappear
from KERNEL32.DLL ;-)

Re comment 14: Yes, the problem with missing updates of file size on open files
only occurs with Win9x (aka DOS) file system. I only found 2 reliable methods to
force size update: call FlushFileBuffers() or close/reopen the file. The latter
is much faster but impractical here.
(Assignee)

Comment 17

15 years ago
why is closing/reopening much faster? Shouldn't that also flush all the file
buffers, i.e., do more work?
(Reporter)

Comment 18

15 years ago
FlushFileBuffers() would not return until file/dir/FAT buffers are flushed to
disk but close/reopen() leaves physical flushing to standard write behind routine.

Comment 19

15 years ago
So what is the state of the patch we have here. Is there going to be observable
performance impact? Will this be confined to win98? Is this patch low-risk
enough to taked this late in 1.6? Should we hold off until we open for 1.7.
(Assignee)

Comment 20

15 years ago
there will be a noticeable performance impact on win98 (and win ME?). I think it
won't be too noticeable with normal operations (downloading mail, reading mail,
etc) but some operations, like downloading imap or news for offline use might
cause quite a bit of disk rattling, due to over-committing of the db.

I can go either way w.r.t. checking into 1.6 vs 1.7. My feeling is that what's
going to happen is that once this gets checked in, some hot spots are going to
be found on win98 that have to be fixed by reducing commits. Is that risk worth
avoiding the history.dat corruption when win98 crashes? Probably...I'll attach
the latest version of the patch.
(Assignee)

Comment 21

15 years ago
Created attachment 136736 [details] [diff] [review]
latest patch
Attachment #136314 - Attachment is obsolete: true

Comment 22

15 years ago
Who can review this patch? If it doesn't make beta (unlikely) it sounds like we
probably wouldn't want to take the risk into final since we wouldn't have the
chance to respond to those hotspots. 
(Assignee)

Comment 23

15 years ago
Comment on attachment 136736 [details] [diff] [review]
latest patch

I'll review it, and request sr from mscott
Attachment #136736 - Flags: superreview?(mscott)
Attachment #136736 - Flags: review+

Updated

15 years ago
Attachment #136736 - Flags: superreview?(mscott) → superreview+

Updated

15 years ago
Flags: blocking1.6b?
Flags: blocking1.6b-
Flags: blocking1.6+

Comment 24

15 years ago
Comment on attachment 136736 [details] [diff] [review]
latest patch

a=asa (on behalf of drivers) for checkin to 1.6.
Attachment #136736 - Flags: approval1.6+
(Assignee)

Comment 25

15 years ago
I'll check this in as soon as I can. maybe tonight.
(Assignee)

Comment 26

15 years ago
taking.
Assignee: history → bienvenu
(Assignee)

Comment 27

15 years ago
fix checked in last night (12/10/2003)
Status: NEW → RESOLVED
Last Resolved: 15 years ago15 years ago
Resolution: --- → FIXED

Comment 28

15 years ago
To reduce HD usage peaks on win9x, I suggest we should use some timer, i.e. call
FlushFileBuffers() every n seconds (10?) if a flush is necessary. In the case of
a windows crash, some history will be lost, but that'll only be that of the last
10 seconds. That, inmo, is a good tradeoff between a total loss of last
session's history, and a (yet to be confirmed) dramatic increase of hd usage on
win9x.

Updated

15 years ago
Blocks: 231606

Updated

15 years ago
Blocks: 231919

Comment 29

15 years ago
Exactly.

And why didn't you use the close/reopen apporoach when it is faster? Somebody
said it is impractical. But I think it is the same: flush <=> close/reopen in
the same place in the code. It is easier than close an reopen only once needed,
in other place of code. The OS will flush the closed/reopened file a bit later
than the flushfile command, but in case of win98 this looks like only 2secs. I
wouldn't worry about that.

Comment 30

15 years ago
This would be nice for 1.4.2 I think.
Flags: blocking1.4.2? → blocking1.4.2+

Comment 31

15 years ago
Great fix. The filesizes reported by a file manager are now the same as the ones
indicated in Mozilla process file handles. No disk thrashing so far.

Comment 32

14 years ago
*** Bug 227257 has been marked as a duplicate of this bug. ***

Comment 33

14 years ago
This looks like the fix went into the mork db backend and should therefore work
on all files that use this format. For some reason, it doesn't seem to apply to
the addressbook *.mab files. E.g. the filesize is reported as 0 bytes, but
opening the file shows there already is some data (of course while Moz is running).

Comment 34

14 years ago
Forget it, I tested it over a network share and it looked wrong. Checking it on
a local machine yielded the correct results - the filesize updated soon after
any changes. Sorry.

Comment 35

10 years ago
I have now found i disk thrashing situation with this: when the history is set to larger than 90 days, the history.dat file has grown to 50MB and it is beeing rewritten after every change (like truncating and filled again), or when closing a navigator window. This thrashes the disk and waits until the file is written, only then can seamonkey be used again. The machine has 384 MB RAM, but this is a hard flush to disk, so any (abundant) cache is not utilized. Can we do something about it?
You need to log in before you can comment on or make changes to this bug.