Closed Bug 124770 Opened 23 years ago Closed 8 years ago

"Delayed Write Failed" error message

Categories

(Core :: Networking: File, defect, P4)

x86
Windows 2000
defect

RESOLVED DUPLICATE of bug 253060

People

(Reporter: michael.wardle, Unassigned)

Details

(Whiteboard: [necko-backlog])

Attachments

(1 file)

I frequently get error messages

-----
Windows - Delayed Write Failed
--
Windows was unable to save all the data for the file
\Device\LanmanRedirector\<profiledir>\Cache\....
The data has been lost. This error may be caused by a failure of your computer
hardware or network connection. Please try to save this file elsewhere.
-----

I also occasionally get similar messages about trying to write to other areas of
my profile (mail directory, and so on).

As you might guess, my home directory is a CIFS/SMB/LANManager remote share
(actually provided by Samba under Linux, for what it's worth), shared as
\\host\username, mapped to H:.

It seems that these messages only occur (or occur more frequently) after I
download a file and save it onto H:; from then on, writing any file to my
profile seems to cause problems.  This also means that any changes made once
these errors start occurring are lost.

I do not see messages like this under any other Windows application.

I suspect there's a problem with the network code and/or the cache code.

OS: Windows 2000 Professional
hardware: Intel Pentium III
Mozilla: 2002020406 (0.9.8)
Is there some kind of quota in place for how much data you can save to the remote
disk?
No, I already checked this.  I have no quota, and the remote disk has sufficient
space left.
reporter: can you try setting your cache directory to some local drive?  see
Preferences->Advanced->Cache in a recent nightly build.
This doesn't sound cache specific because the reporter experiences problems with
other areas of his profile.  The initial file downloading to the remote drive
seems a bit suspicious though.
I'm not seeing this since installing build 2002022603, but don't take that to
mean it's not there -- it just means I haven't seen it yet! :-)
Still observing this in 20020304 -- reaffirming that it seems to happen only
after downloading a file.

I set my disk cache directory to C:\WINNT\Temp, and it still happens.
Confirmed XP/2000, profile stored on network drive samba (Linux | Solaris).

This is not cache specific; it happens when moz writes any profile data back to the
profile.

Problem happens when samba has closed the connection (idle time - logic error etc.)

To reproduce (brutally):
1) set your profile to a network drive (samba) (H:)
2) Do a lot of surfing that would involve using the cache
3) find the PID of your samba connection (smbstatus on Unix box)
4) kill your smb daemon (pid from 3)
5) reconnect the drive H: by using Explorer to view files on H:
6) do more surfing
   -> Eventually you will get delayed write errors


Also happens if you have mozilla running and you hibernate the PC

Other programs do not seem to exhibit the same behaviour
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P4
Target Milestone: --- → mozilla1.3alpha
I am also getting the same error for mail. I have my mail downloaded to a mapped
directory (also via Samba). After a period of time, presumably after Samba
has closed the connection for being idle, I get the Delayed Write Failed error.

Closing down Moz and relaunching it solves the problem, so presumably the code
to re-establish a connection (or whatever) is there?

I'm using Moz 1.2 20021126 on Win2000 Server.
gordon: do mail and cache share the same usage of a file interface?

This sounds like a smb version of a TCP connection reset. I don't know much
about the internals of these windows file sharing protocols, maybe we need to
file a more generalized bug in File + see if mail has any similar reports?
Ben: I'm not sure what you mean.  Mail doesn't use the disk cache, but both the
disk cache and mail use files.  I don't know what 'interface usage' you're
referring to.

This is not a cache bug.  If a user puts files on a remote disk, and that disk
goes away, there's not much we can do.  Our cache code at least will recover on
next launch by detecting the cache was not flushed and create a clean one.

I'm not sure what other Windows applications the reporter is using, so I have no
idea what behavior to expect from them, when his connection breaks to his remote
disk.  Perhaps the difference is that Mozilla opens files and leaves them open
(for performance).  Maybe losing the SMB connection breaks existing file
references, but opening new file references reestablishes the SMB connection.

I'm inclined to mark this either INVALID, WONTFIX, or pass it on to profile or
mail in case there's anything they can do (I'm not sure if mail needs to leave
files open).
I thought mozilla has at least two sets of file I/O libraries. One doesn't even
work w/ UNC paths, from what I recall. At any rate, very little has been done w/
making sure we run robustly in SMB environments.


Additional details are in MS knowledge base articles Q311563 and Q321733:
http://support.microsoft.com/default.aspx?scid=kb%3Ben-us%3B311563
http://support.microsoft.com/default.aspx?scid=kb;en-us;q321733

The event log entry I see is Event ID 50, code STATUS_CONNECTION_DISCONNECTED,
not the STATUS_ACCESS_DENIED that is fixed by their hotfix.
-> File,

per #6, this is probably not a cache problem, but is still related to I/O when
accessing files on the remote disk.

Assignee: gordon → darin
Component: Networking: Cache → Networking: File
QA Contact: tever → benc
marking INVALID.  i don't see any evidence that this is mozilla's fault.  sounds
more like a bad network drive configuration.  mozilla just uses straight file
i/o, like every other application.

benc: mozilla may have two file abstractions, but that doesn't matter.  in the
end it just calls the same WIN32 methods to open/read/write/close files.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → INVALID
VERIFIED/invalid:
This is the only example of this problem I can recall.
Status: RESOLVED → VERIFIED
Same problem experienced with Mozilla 1.7.1 on Win2k, with shares from a Samba
3.0.5 PDC.  I am using Roaming Profiles, with the Application Data being accessed
from a share rather than the cached roaming profile.  This is to minimize login
and logout times for those people who have a very large stash of emails and
other crud that normally gets stored in the profile.  I suspect after some
reading that this happens because files are left open rather than handled as
open-write-close.  If Samba closes an idle connection then the file descriptor
to that file becomes invalid.  I have turned off the cache and found I have the
problem with history.dat only.  I wonder if it's possible to change this
approach.
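
A minimal sketch of the open-write-close approach suggested in the comment above, written in Python rather than Mozilla's actual file code. The helper name append_record and the demo file path are hypothetical. The only point illustrated is that no handle outlives a single write, so a dropped SMB session has no long-lived handle to invalidate.

```python
import os
import tempfile


def append_record(path: str, data: bytes) -> None:
    """Open the file, append one record, flush it, and close again.

    Hypothetical helper: the handle exists only for the duration of
    one write, so nothing is left open between operations.
    """
    with open(path, "ab") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out before closing


if __name__ == "__main__":
    # Stand-in for a profile file on the share (hypothetical location).
    demo_path = os.path.join(tempfile.mkdtemp(), "history.dat")
    append_record(demo_path, b"first visit\n")
    append_record(demo_path, b"second visit\n")
    with open(demo_path, "rb") as f:
        print(f.read())
```

The trade-off is one open and one close per write, which is the performance cost other comments in this bug worry about.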
This appears to be a consistent problem in any environment where profile 
redirection is taking place to a network server. Our domain redirects 
Application Data (and hence the firefox/thunderbird caches) to the main Windows 
file server. If Firefox or Thunderbird are left open long enough, 
consistent "delayed write failed" error messages will occur until Mozilla is 
restarted.

Firefox and Thunderbird are the _only_ applications in our domain that have 
this problem.

See: http://www.doc.ic.ac.uk/csg/faqs/2000.html#q17
reopening... i'm willing to accept that we should perhaps close file descriptors
after some timeout.  no promise that i'll have time to fix this.  patches are
definitely welcome.
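
A rough sketch of what "close file descriptors after some timeout" could look like, in Python and not based on any actual Mozilla patch. The class name IdleClosingFile, the 30-second default, and the injectable clock are all assumptions made for illustration. The timeout would presumably need to be shorter than whatever idle period the server uses before dropping the connection.

```python
import time


class IdleClosingFile:
    """Append-only file wrapper that drops its handle after an idle period.

    Hypothetical sketch: names and the default timeout are illustrative.
    """

    def __init__(self, path, idle_timeout=30.0, clock=time.monotonic):
        self._path = path
        self._idle_timeout = idle_timeout
        self._clock = clock  # injectable so the timeout can be tested
        self._file = None
        self._last_used = 0.0

    @property
    def is_open(self) -> bool:
        return self._file is not None

    def write(self, data: bytes) -> None:
        self.close_if_idle()
        if self._file is None:
            # Reopen lazily; a fresh open is what re-establishes the session.
            self._file = open(self._path, "ab")
        self._file.write(data)
        self._file.flush()
        self._last_used = self._clock()

    def close_if_idle(self) -> bool:
        """Close the handle if it has been unused for the timeout or longer.

        Meant to be called from a periodic timer as well as before writes.
        Returns True if the handle was closed by this call.
        """
        idle = self._clock() - self._last_used
        if self._file is not None and idle >= self._idle_timeout:
            self._file.close()
            self._file = None
            return True
        return False

    def close(self) -> None:
        if self._file is not None:
            self._file.close()
            self._file = None
```

This keeps the handle open across bursts of writes and only pays the reopen cost after a quiet period.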
Status: VERIFIED → REOPENED
Priority: P4 → --
Resolution: INVALID → ---
Target Milestone: mozilla1.3alpha → mozilla1.8beta
My hope is that we'll move to storing the cache locally instead of on the
network drive, and that should help with these problems.
Priority: -- → P4
Target Milestone: mozilla1.8beta1 → Future
For anyone looking at this thread, the work-around we've adopted is to disable
the cache. We do this by modifying defaults\profile\prefs.js to include:
     user_pref("browser.cache.disk.enable", false);

This also has the convenient side effect of not running users over their
filesystem quota with worthless temporary cache files (a regular problem with
Netscape/Mozilla on the UNIX side of our operations).

Note this won't help your users if they have already created Firefox profiles.
(In that case, Firefox uses whatever value is defined in their individual prefs
files.) If you're in this unfortunate situation, your best bet is probably to
force the issue by editing user.js.
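
For existing profiles, the user.js edit could be scripted along these lines. This is only a sketch: the function name ensure_cache_disabled is hypothetical, and the profile directory is whatever path your deployment uses. The pref line itself is the one quoted above.

```python
from pathlib import Path

# The pref line quoted in the comment above.
PREF_LINE = 'user_pref("browser.cache.disk.enable", false);'


def ensure_cache_disabled(profile_dir: Path) -> bool:
    """Append the pref to user.js in profile_dir unless it is already there.

    Hypothetical helper. Returns True if the file was changed.
    """
    user_js = profile_dir / "user.js"
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if PREF_LINE in existing:
        return False
    # Start on a fresh line if the existing file lacks a trailing newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with user_js.open("a", encoding="utf-8") as f:
        f.write(separator + PREF_LINE + "\n")
    return True
```

Calling it once per existing profile directory is safe to repeat, since the line is only added when missing.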
>Note this won't help your users if they have already created Firefox profiles.
>(In that case, Firefox uses whatever value is defined in their individual prefs
>files.)

pref values identical to the default value are not stored in prefs.js.
It's not just cache that's the problem. By policy, we store the TB mail profile on the server, and we run *smack* into this problem every time the server (win2003 server) is restarted. This happens more often than you'd think due to nice hotfixes.

Restarting TB does not make this problem go away. Restarting windows (w2k/xp) does. Mail profiles are saved to the server because your mail archive is in fact quite a valuable piece of company IP and deserves robust backups.
Assignee: darin → nobody
Status: REOPENED → NEW
QA Contact: benc → networking.file
Target Milestone: Future → ---
Ugh. Bouncing happy 4-year old and nobody's even assigned to it!
Well I might be wasting my time adding my comments as the last one was back in January, but this issue won't be going away.
I'm slowly moving from the evil empire to Linux and the open source community and have this problem on TB 1.5.0.5.  It also occurs for me with Outlook 98 in the same way.
Like more and more home users I have multiple PCs and to make life easier I place as much data as I can on a standalone network drive.  I get the problem if I leave TB (and previously Outlook) open, go into standby and then take the machine out of standby.  Now I know programs aren't designed for this sort of behaviour, and some would say this "functions as designed" since files are left open and hanging when the connection drops.  But it's still a bug if users get the problem.
I can't believe it's not fixable reasonably easily, although I admit I've never worked on a program as large as TB myself.
Could an option be added for 'immediate file closure'?  Then every time a routine opens a file, it checks to see if the option is ticked, and if so it would immediately close the file after the write.
This would make program response slower for those who chose it, but in a home environment and ones where people have dodgy networks, that wouldn't matter.
I voted for this bug a few years ago, and it's still happening in the latest version of TB. I have my Inbox on a samba share, and if the network connection drops (e.g. by putting the laptop to sleep), I get "delayed write" errors for Inbox.msf even after re-establishing the shared drive. The only solution is to quit TB. Then it takes about 15 mins while TB rebuilds the Inbox.msf file!

The problem seems to be that TB is keeping the filehandle open, which is an odd way of doing things. Why not just close the file handle after the user operation is complete? Closing and re-opening the handle isn't likely to cause any noticeable performance loss.
3 years later (TB 10.0.2) ...

I don't have my profile on a Samba share but I do have the mail content on a Samba share. I get this error after I resume my PC from hibernation.

The only workaround seems to be to restart TB (which is painful because for security reasons I don't store passwords and, yes, I realise that that conflicts with hibernating at all).

Possible fixes:

* don't keep the file open (as proposed in the previous comment but I would be worried about the performance impact)

* detect the necessary system events so that all files are closed on hibernation and reopened on resumption (this seems good to me)

* provide the user with a means of manually hibernating the application before hibernating the PC - resume would be manual too - window title should reflect hibernated state (seems like a hack to me though)

However the second and third fixes only address hibernation (i.e. a controlled loss of network connectivity), not an uncontrolled loss. The latter has not happened and would be very rare for me (sharing is only across LAN, server is up 24x7). I can live with the "delayed write" errors in the uncontrolled loss scenario.
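
The second fix (close everything on hibernation, reopen on resumption) could be sketched as below. This is Python for illustration only; SuspendableFile and HandleRegistry are hypothetical names, and hooking on_suspend up to the real notification is platform-specific and left out (on Windows it would typically arrive as a WM_POWERBROADCAST message).

```python
class SuspendableFile:
    """Append-only file whose handle can be released and lazily reopened."""

    def __init__(self, path):
        self._path = path
        self._file = None

    @property
    def is_open(self) -> bool:
        return self._file is not None

    def write(self, data: bytes) -> None:
        if self._file is None:
            # First write, or first write after a suspend: open a fresh handle.
            self._file = open(self._path, "ab")
        self._file.write(data)
        self._file.flush()

    def release(self) -> None:
        if self._file is not None:
            self._file.close()
            self._file = None


class HandleRegistry:
    """Tracks every SuspendableFile so they can all be released together."""

    def __init__(self):
        self._files = []

    def open(self, path) -> SuspendableFile:
        f = SuspendableFile(path)
        self._files.append(f)
        return f

    def on_suspend(self) -> None:
        """Call this from the (platform-specific) suspend notification."""
        for f in self._files:
            f.release()
```

No explicit resume step is needed in this sketch, because each file reopens itself on its next write.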

At the very least can TB detect the fact that it has lost connectivity to its files and recover gracefully?
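
One possible shape for "detect and recover", sketched in Python: treat an I/O error on a write as a sign the handle is dead, throw the handle away, reopen, and retry once. ReopeningFile and its opener parameter are hypothetical. Two caveats: this only helps when the failure actually surfaces as an error on the write or flush call, and a retry after a partially completed write could duplicate data.

```python
class ReopeningFile:
    """Append-only file that reopens its handle when a write fails."""

    def __init__(self, path, opener=open):
        self._path = path
        self._opener = opener  # injectable so the failure path can be exercised
        self._file = None

    def _ensure_open(self):
        if self._file is None:
            self._file = self._opener(self._path, "ab")
        return self._file

    def close(self) -> None:
        f, self._file = self._file, None
        if f is not None:
            try:
                f.close()
            except OSError:
                pass  # the handle is already dead; nothing more to do

    def write(self, data: bytes, retries: int = 1) -> None:
        for attempt in range(retries + 1):
            try:
                f = self._ensure_open()
                f.write(data)
                f.flush()
                return
            except OSError:
                # Assume the handle went stale: drop it and try a fresh one.
                self.close()
                if attempt == retries:
                    raise
```

If the share is still unreachable after the retries, the error is re-raised so the caller can report it instead of silently losing data.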

Is there some way of disabling the "delayed write", either on a file-open or on a file or for a whole PC? (Is "delayed write" crucial to performance for these .MSF files?)
8 years later... (TB 15.0.1, Win7x64), it must be about time to fix this?

Happens whenever I hibernate my laptop, profile is on a network share.  

Given the trend (finally) towards power efficient devices and mobile devices, it would be nice to provide graceful error handling/recovery for connection lost scenarios.  My laptop is frequently suspended - restarting TB after each hibernate defeats the purpose of hibernate.
Whiteboard: [necko-backlog]
Status: NEW → RESOLVED
Closed: 21 years ago → 8 years ago
Resolution: --- → DUPLICATE