Closed
Bug 32443
Opened 23 years ago
Closed 21 years ago
GetMsg currently does NOT detect out of disk space conditions
Categories
(MailNews Core :: Backend, defect, P3)
Tracking
(Not tracked)
VERIFIED
WORKSFORME
People
(Reporter: jce2, Assigned: jce2)
References
Details
(Keywords: dataloss, Whiteboard: Cannot reproduce since checkin. relnote-user)
Attachments
(2 files)
4.20 KB,
patch
|
Details | Diff | Splinter Review | |
4.66 KB,
patch
|
Details | Diff | Splinter Review |
I ran out of space on the partition where my mail gets downloaded, and instead of giving me an error, it acted like it downloaded everything perfectly. Then when I went to read the last 10 messages, I kept getting "Unexpected error" dialogs popping up. Of course, the "unexpected error" was that I had run out of disk space, and the body to those messages had NOT been downloaded. You need to check for out of disk space conditions while downloading and saving every message.
Comment 1•23 years ago
|
||
Reassign to jefft
Assignee: phil → jefft
Summary: Mail/News currently does NOT detect out of disk space conditions. → GetMsg currently does NOT detect out of disk space conditions
Assignee | ||
Comment 2•23 years ago
|
||
M17!??? This needs to be fixed NOW..and it should be an easy fix too!! (at least if you isolate disk reads in one procedure, like you should be doing). Assign it to ME if you're that swamped.
If you want to fix this and have the fix be x-platform, please work with jefft to assign this bug report to you.
jce2, thanks for offering help. Let me know when you have a fix. Appreciated.
Assignee: jefft → jce2
Status: ASSIGNED → NEW
Assignee | ||
Comment 6•23 years ago
|
||
Well, there WAS an out of disk check in nsPop3Protocol.cpp, but it seems to be ifdefed out. #if 0 // not yet { const char* dir = MSG_GetFolderDirectory(MSG_GetPrefs(m_pop3ConData->pane)); /* When checking for disk space available, take in consideration * possible database * changes, therefore ask for a little more than what the message * size is. Also, due to disk sector sizes, allocation blocks, * etc. The space "available" may be greater than the actual space * usable. */ if((m_totalDownloadSize > 0) && ((PRUint32)m_totalDownloadSize + (PRUint32) 3096) > FE_DiskSpaceAvailable(ce->window_id, dir)) { return(Error(MK_POP3_OUT_OF_DISK_SPACE)); } } #endif So, supposedly, we need to re-write this check using nsIFile instead of whatever this FE thing was.
Assignee | ||
Comment 7•23 years ago
|
||
BTW: When I re-write that, I'm going to make it a bit clearer what the hell a "ce" is. :P
Assignee | ||
Comment 8•23 years ago
|
||
Well, I've just found that nsLocalFile::GetAvailableDiskSpace has NOT been implimented, and we pretty much need that implimented before we can fix this bug. :P I've filed that as 32942.
Depends on: 32942
Assignee | ||
Comment 9•23 years ago
|
||
I've resolved bug #32942, now I'm getting ready to fix this bug. Therefore, assigning it to myself.
Status: NEW → ASSIGNED
Assignee | ||
Comment 10•23 years ago
|
||
This bug just bit me again. I've been too busy because of my relocation and looking for a job to fix it, but I'll try to do it this weekend. I'm trying to get a nsbeta2+ to check the fix in.
Keywords: nsbeta2
Assignee | ||
Comment 12•23 years ago
|
||
*sigh* after getting GetDiskSpaceAvailable implimented in nsIFile, I suddenly realized that nsPop3Protocol.cpp still uses nsFileSpec! Anyways, I still should be able to impliment this.
Comment 13•23 years ago
|
||
I'm wondering if we should hold nsbeta2+ for this. I'd be more than happy to take in beta3. I don't think I'd pull beta2 off the wire if we have this problem. anyone mind if I nominate for beta3 and take if off the beta2 radar?
Assignee | ||
Comment 14•23 years ago
|
||
Ok, ok, I'll get off my ass and fix this one today. What error code should I return if there isn't enough disk space? NS_NODISKSPACE ?
Comment 15•23 years ago
|
||
Hey jce2, we're going to take this off the beta2 radar. We don't think we'd pull the beta off the wire to snag this fix. But don't worry. We'll be branching the beta tomorrow. So you can check this into the tip (which won't be beta2) as soon as tomorrow if you have a fix. Not trying to rush you, if anything this just gives you more time.
Keywords: nsbeta3
Whiteboard: [nsbeta2+] → [nsbeta2-]
Assignee | ||
Comment 16•23 years ago
|
||
Almost ready to submit my patch. But I have two questions first: 1) I've written my procedure based on nsIFileSpec, because mail/news is still using nsIFileSpec. Should I also write a procedure based on nsIFile, and make an ifdef switch out of it? 2) More important: Are we ever going to have more then one pop process running at the same time? My code does not contain the semaphores or any of the other code needed to accurately handle two pop processes running within the same partition. I'm also returning MK_POP3_OUT_OF_DISK_SPACE, instead of a XP error code.
Assignee | ||
Comment 17•23 years ago
|
||
Ok, here's my first patch to fix this bug. I've stepped through this code about 20 times, so I'm very confident with it. Now we need a procedure that will handle the out of disk space error appropriately. Right now, the mail download will simply abort if there isn't enough disk space, and a debug message will be printed to the console saying that there isn't enough disk space. Of course, we should pop up a dialog box explaining why the mail download has been aborted.
Assignee | ||
Comment 18•23 years ago
|
||
Assignee | ||
Comment 19•23 years ago
|
||
Adding patch keyboard and removing [nsbeta2-] from whiteboard for re-evaluation. (I'm hoping that you might be able to check this into the branch). We should check this in as soon as possible, even though we don't have a dialog box handler for this yet, since this bug causes loss of data.
Keywords: patch
Whiteboard: [nsbeta2-]
Comment 20•23 years ago
|
||
Putting on [nsbeta2-] radar. Not critical to beta2. Adding "nsbeta3" keyword for consideration of a fix for that milestone.
Whiteboard: [nsbeta2-]
Comment 21•23 years ago
|
||
jce2 - I think you need to email waterson@mozilla.org or brendan@mozilla.org for permission to checkin your patch. thanks. Do you think you'll be able to get to a dialog to display to the user why Get Message fails for them?
Assignee | ||
Comment 22•23 years ago
|
||
I need the mail-news people to review this patch first, right? I'll add the dialog box code as soon as I figure out where that should reside. Hey, jefft, would you give me a r= on this one? And since this was a "relnote2", this should definately be a nsbeta3+
Keywords: review
Target Milestone: M17 → M18
Comment 23•23 years ago
|
||
mscott, bienvenu: Could you review the attached patch please? It has been waiting for about 10 days now. Thanks!
Whiteboard: [nsbeta2-] → [nsbeta2-] [FIX IN HAND]
Comment 24•23 years ago
|
||
It doesn't matter if nsbeta3+ or not since this is not written by a Netscape employee (ie. patch). What dialog box code are you adding? What will the dialog box say, look like? Thanks BTW, jefft is on sabbatical, hence the lack of response from him for a review.
Comment 25•23 years ago
|
||
If you have a fix, please get a review and approval by going through the Mozilla review/approval process. But, not holding Netscape PR3 for, so - per mail triage
Keywords: mailtrack
Whiteboard: [nsbeta2-] [FIX IN HAND] → [nsbeta3-][nsbeta2-] [FIX IN HAND]
Comment 26•23 years ago
|
||
OK, r=bienvenu. Could you find someone to try this on the mac for you? You've tried it on Linux, and I can try it on windows, but we should try it on the mac.
Assignee | ||
Comment 27•23 years ago
|
||
Hey! The patch is right there, I've been waiting for ages for the module owners to review it (They DO get this bug report e-mailed to them, right?). All I need is some guy to test it on the mac, and then I can ask brendan to check it in. I have complete confidence in this code, I stepped through it literally 25 times. Second of all, this bug causes DATA LOSS (I should know, I lost 75 of my e-mail messages when it bit me in march). This NEEDS to be in nsbeta3. It needed to be in nsbeta1. We're not Microsoft (are we?) There's no excuse for our software to lose potentially important e-mail messages. Third of all, it would help if module owners would make a daily search for bugs with the keyword "patch" on them in their module. The netscape PDT is being so restrictive about marking bugs nsbeta3 that you're only going to get some (VERY important) things fixed through patches from the outside world. If these patches don't make it in, netscape 6 is going to (bluntly) suck. Adding "dataloss" keyword to bug and clearing nsbeta3- for reconsideration (after all, the fix is RIGHT THERE, just review it and check it in!)
Keywords: dataloss
Whiteboard: [nsbeta3-][nsbeta2-] [FIX IN HAND] → [nsbeta2-] [FIX IN HAND]
Comment 28•23 years ago
|
||
I agree this should be nsbeta3+, and again, Jeff is on sabbatical, along with a couple other people, and those of us not lucky enough to take a vacation have been scrambling to cover for them; hence the lack of response. The main reason I wanted this tested on the mac is that GetDiskSpaceAvailable is notoriously platform dependent. I'm sure brendan or waterson would be happy to approve this.
Comment 29•23 years ago
|
||
Sorry for the delay, i was sick. Now, I am working on it and so far I found two compile errors with the way you manage PRInt64 variables which on Mac are in fact a struct of two int. I will attach the new patch when I am done testing it...
Comment 30•23 years ago
|
||
Here is the two modifications I add to to in order to compile on Mac: 1) replace: PRInt64 mailboxSpaceLeft = 0; by: PRInt64 mailboxSpaceLeft = LL_Zero(); 2) replace: // The big if statement if ((m_totalDownloadSize + (PRInt64) EXTRA_SAFETY_SPACE) > mailboxSpaceLeft ) by: // The big if statement PRInt64 llResult; PRInt64 llExtraSafetySpace; PRInt64 llTotalDownloadSize; LL_I2L(llExtraSafetySpace, EXTRA_SAFETY_SPACE); LL_I2L(llTotalDownloadSize, m_totalDownloadSize); LL_ADD(llResult, llTotalDownloadSize, llExtraSafetySpace); if (LL_CMP(llResult, >, mailboxSpaceLeft))
Comment 31•23 years ago
|
||
Comment 32•23 years ago
|
||
The patch works fine on Mac except than when my disk is full (yes, I tried this case), the function GetMsg doesn't download the messages and return an error but the User is not notified of the problem! No alert, no status, just looks like GetMsg doesn't work or you don't have new messages.
Assignee | ||
Comment 33•23 years ago
|
||
First of all, I'm GLAD that we tested this on mac first. It would have been very embarrasing to be the one who broke the build on mac. ducarroz: Could you e-mail me and give me a quick explanation about how you use LL_I2L on mac, and what I need to do differently when I do mathmatical operations? I was completely clueless of the compile error over in Linux land. (Either that, or give me a good url where all of this is explained) Basically, you're saying that PRint64 = struct { int a; int b } on Mac? (Yes, I know that isn't the correct typedef syntax, It's too early in the morning for that). Anyways, this patch WON'T pop up a dialog box when the Out of disk space condition is detected. That needs to be done by the procedure that calls nsIPop3Sink (and I don't quite know what object that is yet). As I said in an earlier comment: "Now we need a procedure that will handle the out of disk space error appropriately. Right now, the mail download will simply abort if there isn't enough disk space, and a debug message will be printed to the console saying that there isn't enough disk space. Of course, we should pop up a dialog box explaining why the mail download has been aborted." I'm sure this is VERY easily done, as soon as I can find: a) The procedure that is supposed to handle MK_POP3_OUT_OF_DISK_SPACE. b) The xul file that contains all the dialog definitions. I'll look at finding these things when I get home this evening (I'm in New Zealand). In the meantime, we should apply this patch to the main tree, even without the dialog box handler present, because this is a HORRIBLE dataloss bug. Which of the two outcomes would you prefer if a user runs out of disk space? 1) The message download aborts, the user thinks that he has downloaded all of his messages even though he really hasn't, and the remaining messages sit on the pop3 server until he realizes that mozilla suddenly isn't downloading any more messages and either thinks to check his debug console (where it says "Out of disk space! Raising error!"), or he comes on #mozilla and asks "Why isn't it downloading any new messages?". Anyways, on Windows (since this will probably be located on his "C:" drive, he's going to realize that he's run out of disk space soon enough). Or 2) (What CURRENTLY happens): Mozilla appears to download ALL of the messages located on the server (and DELETES them from the server). The user has no idea that something is wrong until he clicks on a message header from a message that was lost due to no disk space and gets an "Unknown error" dialog. THOSE MESSAGES ARE LOST FOREVER! Especially now that we have "biff" capability, and the ability to automatically check messages, this bug is DEADLY! Mozilla could run on a computer all day, happily automatically losing ALL of the e-mail messages that were sent to that user that day (and maybe the next day). (3) Mozilla detects the out of disk space condition and pops up a dialog box telling the user. We'll get that as soon as I find those objects, or ben helps me out a bit. :P) So PLEASE check this in, even without the dialog box handler (It will follow in the next few days). We NEED to get rid of this HORRIBLE data-losing bug!
Comment 34•23 years ago
|
||
See http://lxr.mozilla.org/nspr/source/nsprpub/pr/include/prlong.h for info on PRInt64.
Assignee | ||
Comment 35•23 years ago
|
||
I should add that my earlier patch obviously compiled for me on Linux and Windows. And THANK YOU, ducarroz, for fixing my patch so it does compile on Mac :-) We might want to write up a quick page on what we need to do to make sure that mac builds don't break...
Comment 36•23 years ago
|
||
removed keyword review as this patch has been reviewed. R=bienvenu,ducarroz. jce2, do you want me checkin the fix for you?
Keywords: review
Comment 37•23 years ago
|
||
a=brendan@mozilla.org -- who is gonna make a dialog or status message alert the user to the problem? Is that this bug, with another patch forthcoming, or is it another bug? Please use diff -u format for patches. The prlong.h macros, for which I'm originally to blame, are necessary in Mozilla for any PR*Int64 arithmetic or bitwise-logical operations. This is "known" but probably underdocumented. There are other platforms than Mac that lack native compiler support for 64-bit integers. /be
Comment 38•23 years ago
|
||
I have filled bug 49868 about the missing error message when the disk is full.
Assignee | ||
Comment 39•23 years ago
|
||
Yes, please check the fix in for me (I don't have cvs write access). (Maybe I should start the process of getting write access). I think we should create another bug for popping up the dialog, since this bug already has enough descriptions added to it, and the other patch should go under another component anyways. And yes, those macros are WELL underdocumented. I had no idea about them.
Comment 40•23 years ago
|
||
I will check it in a soon the tree open... and I have already open a new bug for the error message (bug 49868).
Comment 41•23 years ago
|
||
jce2: out of curiosity, and maybe it'll do some good: what are the doc(s) that you in fact have perused, and in which some statement about using prlong.h's macros as well as its types would have done some good? I'd genuinely like to know, but I'm also skeptical -- and I feel that if you #include "prlong.h" and use its typedef names, you should investigate their operational semantics a bit more -- say by reading that header file. /be
Comment 42•23 years ago
|
||
Fixed and checked in
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 43•23 years ago
|
||
OK, I'm very confused by all the conversation in this bug and am not certain what everyone expected the final fix to be... Using the oct10 branch commercial build with linux rh6.0: POP user without "leave on server" enabled: 1. Does Get Msg when out of disk space, the header is received and displayed in the thread pane. 2. When user selects the message they get the "Unknown Error" alert, which tells them nothing (I know there's a futured bug about the text, but 49868). The body is not downloaded. 3. User exits and re-launches, logs in to server, gets messages. The header(s) previously downloaded which hit out of space condition are never seen again, even when disk space is freed and other new messages are received. POP user with "leave on server" enabled: All of the above happens, but once disk space is freed, the message is gotten/recieved again and body is downloadable. Using NT 4.0 with today's commercial branch: User never sees any problem, never gets the header downloaded, It just appears that Get Msg isn't working or there aren't any messages. Once disk space is freed, they will get the message(s) whether or not "leave on server" was enabled or disabled. So, I haven't tried mac yet (BIGer disk to fill), but I assume from ducarroz' testing when he tested the fix that mac should work like NT. I am reopening this based on the Linux case -- data loss in the event that you didn't leave the messages on the server. I still think this is crappy that the user (who would never look at a console even if there was one appearing in the regular user world) doesn't know what is up until they discover in some other way that the disk is full... that argument is covered in the dialog bug 49868. I am also going to add keyword rtm so that the PDT team puts their two cents in (again). Cleared the [FIX IN HAND] text from whiteboard, too.
Keywords: relnoteRTM
Comment 44•23 years ago
|
||
mail team, this is a dataloss bug on Linux. Can anyone give a hand and see why Linux didn't get fixed along with the other platforms?
Whiteboard: [nsbeta2-] → [nsbeta2-][rtm need info]
Assignee | ||
Comment 45•23 years ago
|
||
I'm sorry. I've tried and I've tried and I've tried, and I can't reproduce the data-loss described in this bug since I checked in my fix many months ago.
Assignee | ||
Comment 46•23 years ago
|
||
Is it possible that "GetDiskSpaceAvailable" is broken in some cases and not broken in other cases?
Assignee | ||
Comment 47•23 years ago
|
||
I repeat, I can not reproduce what laurel has commented about. I have tested this in two ways: 1) Run mozilla in a debugger, set a break point to right after GetDiskSpaceAvailable() is called and changed the value to less then what it needs to download, exactly what it needs to download, slightly over the amount it needed to download, and zero, and a negative number. Results (as expected): Doesn't download, downloads, downloads, doesn't download, doesn't download. 2) I filled up a partion with junk, then mapped my mail file to that test partition. It wouldn't download, and the message said "Not enough disk space". Therefore, I can't reproduce this bug at all. When it DOES happen, it MUST be a failure of the GetDiskSpaceAvailable procedure (as shown by my testing of #1). Therefore, this bug should drop off of the RTM radar, unless it can be reproduced, and even then, the problem is probably (most definately) with GetDiskSpaceAvailable and not GetMsg.
Whiteboard: [nsbeta2-][rtm need info] → [nsbeta2-][rtm need info] Cannot reproduce.
Assignee | ||
Comment 48•23 years ago
|
||
I'm taking this to rtm- since I own the bug.
Whiteboard: [nsbeta2-][rtm need info] Cannot reproduce. → [nsbeta2-][rtm-] Cannot reproduce since checkin.
Comment 49•23 years ago
|
||
I read somewhere (I guess it was in another bug) that on some Unix flavor, if the user has a quota, GetDiskSpaceAvailable returns the amount of physical disk space available, not the amount of space available in your quota. So, the only way to really guarantee that you have enough disk space available is to actually try to allocate it on disk. I'll go try to find that bug. I'm not sure if that's what's going on here. I don't know how Laurel recreated the out of disk space situation.
Comment 50•23 years ago
|
||
see bug 55814
Comment 51•23 years ago
|
||
"Low disk space" relnote... Gerv
Whiteboard: [nsbeta2-][rtm-] Cannot reproduce since checkin. → [nsbeta2-][rtm-] Cannot reproduce since checkin. relnote-user
Assignee | ||
Comment 52•23 years ago
|
||
LAST CALL to everybody: Can anyone come up with any situation other than running out of disk quota that causes loss of data like this? If not, then this bug should be closed, and another one filed, because the bug is then in GetDiskSpaceAvailable, and not GetMsg.
I saw the described behavior on a late December build when running out of disk space on an x86 linux 2.2 system. No quotas or anything like that. I wound up with an InBox that acted super-stupid when I was trying to file or delete messages. The Inbox mbox file wound up with a segment that looks like this: 0000000 F r o m - S a t D e c 3 0000020 0 1 3 : 3 8 : 0 3 2 0 0 0 \n 0000040 X - U I D L : 5 e 0 0 d 8 c a 0000060 6 9 7 b 6 f a 9 0 9 0 b 1 5 2 8 0000100 f 8 c c 9 4 7 2 \n X - M o z i l 0000120 l a - S t a t u s : 0 0 0 1 \n 0000140 X - M o z i l l a - S t a t u s 0000160 2 : 0 0 0 0 0 0 \0 \0 \0 \0 \0 \0 \0 0000200 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 * 0075000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 F r o 0075020 m - T u e J a n 2 1 0075040 9 : 4 6 : 3 3 2 0 0 1 \n X - U 0075060 I D L : 1 c 5 6 4 2 3 8 e d 7 0075100 f 9 8 1 7 d 2 e 9 b c e 8 b 3 e 0075120 a f 8 8 4 0075125 Notice the line numbers in the od output - a whole lot of zeros. It looks like something got smacked trying to write the X-Mozilla-Status2: line. Hope that helps.
Assignee | ||
Comment 54•23 years ago
|
||
Hmm...maybe another procedure other then GetMsg decided to write out status information, but couldn't.
Comment 55•23 years ago
|
||
Only GetMsg grows the size of the file (except for copying/moving a msg into a folder).
Assignee | ||
Comment 56•23 years ago
|
||
Blowing away nsbeta2- and rtm- since those are ancient. Bill: Do you remember if you did any copying or moving messages in or out of that folder before this happened?
Whiteboard: [nsbeta2-][rtm-] Cannot reproduce since checkin. relnote-user → Cannot reproduce since checkin. relnote-user
Updated•22 years ago
|
Target Milestone: M18 → ---
Keywords: nsenterprise
Comment 57•22 years ago
|
||
I'm sorry if this'll sound stupid, but if no mail is deleted on the server _before_ we know for sure it's been written out correctly, how can data-loss occur then? It would make a failing disk-space checking procedure (not saying it is, just _if_, for whatever reason) a lot less painful... In 0.9.2 Release notes: [... If that option ("Leave messages on Server") was not enabled, those messages are lost. ...] This couldn't be possible anymore then, since a message would have to be written fully before the 'dele' or whatever command. (the possibility of loosing mail - anyhow, even by randomly pulling the modem's plug- sends a shiver down my spine, and I'm sure I'm not the only one ... :) ).
Comment 59•21 years ago
|
||
I'm not sure why this is still at an open state. I've verified with apr8-9 commercial trunk build for bug 49868. Marking verified worksforme
Status: REOPENED → RESOLVED
Closed: 23 years ago → 21 years ago
Resolution: --- → WORKSFORME
Updated•19 years ago
|
Product: MailNews → Core
Updated•15 years ago
|
Product: Core → MailNews Core
You need to log in
before you can comment on or make changes to this bug.
Description
•