Closed Bug 142196 Opened 22 years ago Closed 21 years ago

Cannot get pop email sometimes (if profile is on network drive)

Categories

(MailNews Core :: Database, defect)

x86
All
defect
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: clement_tse, Assigned: Bienvenu)

References

Details

(Whiteboard: [adt3])

Attachments

(9 files)

From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; T312461)
BuildID:    Mozilla/5.0 (Windows; U; WinNT4.0;en-US; rv:1.0rc1) Gecko/20020417

I have setup two email accounts, one pop3 account (get mail from 
pop3.newtthk.com), another imap email account (get mail from local email server)

I set both accounts to get email every 5 minutes.  Sometimes, the mouse 
pointer turns into an hourglass for a very long time.  During this period, 
mail will not appear in my pop account inbox.  Mail does come into my imap 
account inbox during that period.  The other operations still work, 
including reading and sending emails in the imap account.

The hour glass will be showing when I put the mouse over the folders on the 
left of the screen, and the list of emails in the top right.  But the hour 
glass will not be shown when I put it over the message content in the lower 
right.

This state can go on for hours.  If I quit and then restart Mozilla immediately, 
it will get those pop3 emails right away.

This happens on about 2 out of 5 working days.  My wild 
guess is that one of its attempts to get email over the 
Internet from the pop3 account fails, and it cannot get out of that state and 
retry afterwards.

It would be helpful if I could see what it is trying to do when this happens.  
Some sort of log file would be useful.  How do I enable logging?

Reproducible: Sometimes
Steps to Reproduce:
Setup of two email accounts, one pop3 email, another imap email.  Each get new 
emails every 5 minutes.
Start Mozilla, wait for about 2 hours.
WFM with Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.0.0+) Gecko/20020503
(tried today app. 20x)
I have upgraded to RC3; the problem is still there.  One more piece of information: 
the mail directory of my pop account is on a network drive.

One more observation: during the time it cannot get pop email, if I 
click 'Get mail', nothing happens.  Even the status bar does not change 
to 'Getting email...'.  But if I do something to the Inbox, e.g. move one of 
the Inbox messages to another folder, and then click 'Get mail' again, 
it gets the email immediately.
Reporter,
Here are the instructions on creating a pop3 log and attach the log to this bug. 
http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html 

Also can you download a recent build and try to see if you experience the same
problem.
Do you have Automatically download new messages checked in the accounts and
settings?   
I have created the log as attached (nsmail1.log).  The first few lines show 
RECV, which is the email with attachment I received successfully.  After this 
email was received, Mozilla did not get pop3 emails.  Even when I clicked 'Get 
Message', nothing happened.  Then I moved an email from another mailbox to the 
Inbox and clicked 'Get Message' again; this time a new message was retrieved 
immediately.  

Yes, I have Automatically download new messages checked in the accounts and
settings.

I have not tried to download the latest build yet, will try to download tonight 
and have a try.
I tried the latest build 20020528.  The problem is still the same.  I can only 
download new email after I moved an email into the Inbox.
I tried the latest build 20020528.  The problem is worse.  After some time, I 
tried to get email.  Mozilla did not get the pop3 email.  This time, even after I 
modified the Inbox, Mozilla still did not start to get the pop3 email.  I 
captured the pop log at that moment (nsmail3.log).
I noticed one workaround.  If I move the pop3 email inbox from the network drive 
back to my local drive, the problem is gone.
Some colleagues and I have exactly the same problem (Mozilla 1.0). The problem
seems to be
that the mail directory is on a network drive. Moving it to a local drive
fixes everything. I tried to log the mail activity, but Mozilla does not
log anything. I download an email and Mozilla logs it. Then I click the Get
button and nothing happens and the log file does not change. I have to delete a
message first before I can download the next email and before Mozilla logs
anything.
Attached file pop3 log file
The same happens with mozilla 1.1 alpha
Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.1b) Gecko/20020721

behaves exactly as described before!
confirming the bug based on pop3log file. 
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Ok, I am also able to reproduce this problem if I have a profile on network 
drive. The problem is that summary file validation fails because the time 
stamps on the berkeley mailbox and the summary file do not match. I think this is
because nsFileSpec::GetModDate doesn't seem to be returning accurate results. 
For now please use local drives to store pop3 profile. I don't think there is
an easy way to fix this bug.

One way to fix this would involve an API change from nsFileSpec to nsLocalFile, 
checking whether ftLastWriteTime is more accurate. 

Updating summary to when the bug actually happens. 
Summary: Cannot get pop email sometimes → Cannot get pop email sometimes (if profile is on network drive)
Actually this is kind of a blocker for all pop/local operations where profiles
are stored on a network drive, because the modDate inaccuracy fails the summaryValid
check, which is the key to keeping the db and berkeley mailbox in sync. I see us
building summary files on start-up every time. 
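To illustrate the kind of check being described, here is a minimal sketch in Python (not Mozilla's actual code). It assumes the summary file records the mailbox's modification time and size when it is marked valid, and later compares them with what the file system reports. `SummaryHeader`, `record_summary`, and `summary_is_valid` are hypothetical names used only for this illustration.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class SummaryHeader:
    """Hypothetical stand-in for what a summary file remembers about its mailbox."""
    mailbox_mtime: int  # whole seconds, recorded when the summary was marked valid
    mailbox_size: int   # bytes


def record_summary(mailbox_path):
    """Remember the mailbox's current time stamp and size."""
    st = os.stat(mailbox_path)
    return SummaryHeader(st.st_mtime_ns // 1_000_000_000, st.st_size)


def summary_is_valid(mailbox_path, header):
    """Strict check: both the time stamp and the size must match exactly."""
    st = os.stat(mailbox_path)
    return (st.st_mtime_ns // 1_000_000_000 == header.mailbox_mtime
            and st.st_size == header.mailbox_size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inbox = os.path.join(tmp, "Inbox")
        with open(inbox, "wb") as f:
            f.write(b"From - Thu Nov 28 10:00:00 2002\nSubject: test\n\nbody\n")
        header = record_summary(inbox)
        print(summary_is_valid(inbox, header))  # True: nothing has changed
        # Simulate a file server reporting a time stamp two seconds later.
        st = os.stat(inbox)
        os.utime(inbox, ns=(st.st_atime_ns, st.st_mtime_ns + 2_000_000_000))
        print(summary_is_valid(inbox, header))  # False: summary looks stale
```

With an exact comparison like this, a reported modification time that is off by even a second or two is enough to make the summary look out of date, even though the mailbox contents are unchanged.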
The bug is a little strange: 5 people in our lab use 3 PCs with Mozilla as their
mail client, with their Inbox stored on a network drive, but this bug appears on only
two of our accounts. It worked fine for me for several weeks and then appeared
one day. One colleague told me he first had this problem on only one of the
three PCs and now he has it on all three. Another colleague does
NOT have this problem (account also on network drive).

Maybe this helps you!
*** Bug 159156 has been marked as a duplicate of this bug. ***
When you add a filter that copies every mail to a new folder instead of the Inbox
folder, the mail client works fine (see the log file with comment submitted after this)
build 2002072104
Yesterday I made a new profile and it did the same thing: impossible to get mail after
startup.
I tried karafilidis's method (making a filter);
everything is the same.
There is only one workaround and that is to create a new profile on local hard
drive and move the mail folders from the profile on network drive to this new
profile. 

You will see this hourglass problem for profiles on a network drive. It is only
a matter of time. 
OK.
Thanks, Navin Gupta, I understand now: from online to local,
it works.
I used another workaround.  The objective is to have the inbox and all folders 
on the network so that backups can be done.  I created two accounts in a 
single profile: one for sending emails, another for receiving 
emails.  The account for sending emails is put on the network.  The 
account for receiving emails is put on the local drive.  I set a message 
filter so that all messages received by the 'receiving account' are moved to 
the inbox of the 'sending account'.

With this workaround, I use the 'sending account' as a normal email account.  
Hope this helps.
I was thinking of relnote (ing) this bug for mach V. Scott, what do you think
about it? 
Keywords: relnote
I have no objection to getting more people involved if they can help to solve the 
problem.
I see this problem on Linux (latest-trunk). Is this the same bug? If so we
should set OS=ALL. Sometimes POP3 fails after some time, both ways:
automatically (every other minute) and manually. My home directory is mounted
using NFS.

pi
Blocks: 165832
OS: Windows NT → All
*** Bug 162812 has been marked as a duplicate of this bug. ***
*** Bug 167700 has been marked as a duplicate of this bug. ***
nominating this for next release 
Keywords: nsbeta1
*** Bug 169469 has been marked as a duplicate of this bug. ***
*** Bug 164225 has been marked as a duplicate of this bug. ***
QA Contact: sheelar → esther
I have been adding comments to 132538, but now realise that this is the place
for them.  Not sure if 132538 is a dupe of this though.  Please see my comments
in this bug.

Several of my colleagues who use POP and Mozilla/Netscape7.0 have this problem.
 Only one doesn't.  The only differences so far are that he stores his profile
on a local drive, and his mail directories on a network drive.  Everyone else
stores both on a network drive.  The working Mozilla has its profiles on a
network share from a SPARC Solaris box, incidentally.  The others have
profile/mail on Windows shares.

Shouldn't this bug have a much higher severity?  For us, it's a complete blocker
to upgrading from Netscape 4.
Interestingly, Eudora seems to have exactly the same problem with network
drives.  They fix it by allowing the user to specify the number of seconds that
the date on the summary file can be behind that of the mbx file.  See the
section on TocDateLeeway in http://www.eudora.com/techsupport/ini.html.  Would
anything similar work for Mozilla?
yes, that should work for us.  We could add a global pref for this, or a
per-server pref. I'm not sure if we'd add a UI for it...
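A rough sketch of what such a leeway comparison could look like, in Python. The function name and the symmetric comparison are assumptions made for illustration; Eudora's TocDateLeeway and any eventual Mozilla pref may behave differently.

```python
def timestamps_match(stored_mtime, actual_mtime, leeway_seconds=0):
    """Treat two time stamps as equal if they differ by at most leeway_seconds.

    stored_mtime is the mailbox time stamp remembered in the summary,
    actual_mtime is what the file system reports now (both in seconds).
    """
    return abs(actual_mtime - stored_mtime) <= leeway_seconds


if __name__ == "__main__":
    # A two-second disagreement fails the strict check...
    print(timestamps_match(1000, 1002))                       # False
    # ...but passes with a small leeway.
    print(timestamps_match(1000, 1002, leeway_seconds=5))     # True
    # A very large leeway also accepts a change made ten minutes later,
    # which may well be a real modification by another program.
    print(timestamps_match(1000, 1600, leeway_seconds=3600))  # True
```

The last call shows the trade-off: the larger the leeway, the more genuine out-of-band changes to the mailbox would go unnoticed.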
*** Bug 140812 has been marked as a duplicate of this bug. ***
If this bug is "fixed" in this way, it would need to be done in a highly visible
manner.  A hidden preference doesn't seem appropriate.  A business would expect
Mozilla/Netscape to work straight away in their environment, and this would be
one that in all likelihood stores mail on network drives.  To have an install
that fails by default would be mad.  The time leeway pref would have to default
to something that would generally fix the problem.  There should be clear
documentation about the feature and, I believe, an easy way of adjusting it i.e.
GUI.  Would it be possible to detect when this bug is the likely cause of a
suspect summary file?  Could the mail program pop up an alert saying something
along the lines of: "Your summary file appears to be out of sync, this may be as
a result of you storing your mail on a network drive.  To fix this, try
adjusting the preference x to ..."?
Setting severity to major in a vague attempt to get some more attention.  This
bug causes a major loss of function in Mozilla Mail and really needs sorting out
soon!  Apologies if work is happening; it just seems that this bug is
disproportionately quiet compared to its impact.
Severity: normal → major
I'd agree with Max (comment 40) that this bug is at least a major one. Being an
administrator, I told my colleagues to put everything important (and mail is
definitely important) on the file server to benefit from regular backups and
higher security etc. So being unable to use a network drive for profiles will
probably be a blocker for many people and actually mean going back to Netscape 4.7
:-(

BTW: I synchronized the clock of the clients, the mail server and the file
server. That seems to decrease the number of hangs but doesn't prevent them
completely. Furthermore I noticed that a hang mostly occurred during/after user
actions (reading mail, moving mail to a different folder etc.). I left mozilla
alone overnight with a 3 minute auto fetch interval and had e-mails sent
randomly approx. every 2 minutes to the inbox - fetching these without any
further user interactions worked. So maybe concurrency of several threads in
mozilla is an issue here, too.

Mozilla 1.1
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.1) Gecko/20020826

Happens for me on Win XP Pro too, but I have nothing on network drives.
Get Msgs just stops working after several hours.  When that happens,
restarting Mozilla solves the problem.  But then, the bug happens again
after several hours.

Re: #42
Could this be a different bug?
---
I suggest that the summary for this bug isn't really appropriate.  It greatly
masks the severity of this bug.  How about the following instead:
POP email is generally unusable when profile is on a network drive

Any comments/status updates from people working on this bug?
I am preparing to recommend Netscape 7 as our e-mail program, and since our
admins love to put all files on our network, this bug is preventing us from even
considering using Netscape (Mozilla). 

The suggested fix in comment #36 would be an adequate stop-gap.
Sounds reasonable, but do we *know* that this is causing the problem?

pi
possibly related to bug 101953 or 100395?
Even if we don't *know* what caused the problem, it may be of interest that, for
me, with build 2002101612 (1.2beta) the behaviour seems to be at least greatly
improved. As far back as I can remember (somewhere between 0.95 and 0.97), the
problem occurred a. directly after getting messages or b. at least within half an
hour(!!). 1.2beta has now been running a whole day with only one lock-up, and
for now I'm not sure whether that concerned this bug (the 'hang' was not directly
related to getting messages or keeping mail open in the background; it came up when
clicking through different folders in the folder pane).
Comment 46: I think these bugs are not related.

I use the above-mentioned build on Win2K with profiles stored on a server
running NT4 (client time is synchronized, which had no noticeable impact for me).
I'd give it some time.  I've seen this problem sometimes occur continuously, and
sometimes hide for several hours before appearing again.  If it occurs at all,
then it renders the mail program useless.  One of my colleagues who tried using
it with POP and network profile has just commented to me, "Netscape 7 sucks". 
He only really thinks this because of this bug but it's obviously put him off
the whole product.  This is a real shame, but I won't be able to convince him
otherwise until I can show him that the mail problems have gone away.
For sure it will take more than one day to prove it, but for me (and my 4
colleagues) the problem ALWAYS occurred within a very short time. 
Maybe I was so happy not to have to restart my mail client all the time that I
was a bit rash ;-)
I received several emails asking me for the workaround.  So, I will describe the
workaround (described in #26) in more detail.  I hope this helps to improve the
situation before the bug is fixed.

You need to create two accounts.  If you don't know how to create two email
accounts, you can go to 'Edit->Mail & Newsgroups Account Settings...'  

The first account is the 'sending account'.  You need to set up your actual email
address and smtp server correctly.  Then use a dummy pop3 server.  It is important
to note that you should not use the same pop3 server as your usual pop3 server.
 Also type in a dummy pop3 user.  The effect on the 'sending account' is that,
if you click 'Get Msgs', it will pop up an error message.  However, you can send
emails as normal using this 'sending account'.  All the folders of this 'sending
account' can be set up on the network.  The sent folder is on the network now.

The second account is the 'receiving account'.  You need to set up the correct
pop3 server and pop3 user account.  The folders of this account should be set up
on a local drive.  Then you need to set up a filter.  You can choose the
menu 'Tools - Message Filters'.  You need to define a message filter such that
all incoming messages will be moved to the inbox of the 'sending account'.  A
good enough Message Filter for 'all messages' is Subject doesn't contain %%%%.
 Yes, this does not guarantee 'all messages'.  But messages whose Subject contains '%%%%'
are highly likely to be junk email anyway :)  Don't forget to set the 'receiving
account' to download emails from the server regularly.

With this, all incoming and outgoing messages will be moved to the network drive.

Hope this helps.
We are increasingly being pressured to move off of Netscape 4.7x because of its
growing lack of compatibility with many, if not most, of the sites out there that are
using MS extensions or even standards that leap-frog 4.7x.

For small sites, relocating the Mozilla profile off the server to a local hard
drive may be acceptable but for a larger site, that is creating a real nightmare
regarding backup and recovery of user data not to mention just the headache in
the manual efforts per box!  

We have had problems with network based profiles on both NT and 2K servers, and
in the next three weeks, I must deal with the migration of a large community of
folks off of 4.7x and I would prefer that target to be Mozilla.  There have been
months and months of bantering this without any apparent solution by the
development community on the horizon.

Can we get a commitment to either a reasonable (i.e. automated) workaround or,
better yet, a fix for this integrated into the next build cycle?!?!?
Keywords: 4xp, mozilla1.3
Keywords: nsenterprise
Not sure if this will help any bit, but it seems things work better on a slow
connection. I use the same profile (on a network drive) from my desktop (100Mb
LAN) and my laptop (11Mb Wifi, wow, can this one be slow!). On my desktop I'll
have to exit Mozilla and restart after changing one message from read to unread
or after deleting something, before I can check new mail. Every single time. On
my laptop I also get the hourglass sometimes, but certainly not as often.
I'm using the same build of Mozilla on both. Clocks on all computers are
synchronised. The only other difference I can think of is that my desktop has
Windows 2000 and my Laptop Windows XP.
taking - the fix for this is going to be in the mail db. I'm going to add a pref
"mail.db_timestamp_leeway" - if setting this turns out to fix the problem, we
can add a UI for setting it. We can decide if we want it to be per-server as
well, or just global (which would affect where the UI would go).
Assignee: naving → bienvenu
Status: ASSIGNED → NEW
Component: Networking: POP → Mail Database
Does this problem occur on 4.x? Can anyone who is experiencing this
problem tell us? Why are we adding another pref and making it even more complicated than it
already is?

This bug definitely does not occur in 4.x or 3.x.  We use 4.x extensively, with
exactly the same mail servers, and network drives at work.  The bug has never
been seen in all the time we've been using Netscape 4.  However, it is extremely
prevalent when anyone tries using Mozilla/Netscape 7 with the same setup. 
Running Eudora with the same setup has been known to be vulnerable to this
problem, but is fixed by setting a higher leeway.

I agree that a better solution would probably be to work out what has changed
since 4.x to make Mozilla suffer from this bug, and then fix it this way. 
However, I imagine this could take a while, and a solution (any solution!) is
needed soon!

I have a few worries about the proposed solution and associated pref though. 
Unless it's possible to provide default values that cure the bug on the majority
of platforms, there could be visibility issues.  People really want a system
that works out of the box.  If they try it, and it doesn't, they won't
necessarily go trawling through advanced settings to work out why!  It's not
possible to have Mozilla "auto-detect" the problem, and increase leeways
appropriately is it?  At the least, could it alert the user of the potential
problem (when it occurs), and point them to the solution?
This problem is difficult to fix. We could have a concept of a logical timestamp
after we have opened the db for the very first time, instead of matching actual
timestamps. A leeway would also fix it, but the leeway threshold will be the key. If we
make it too large, it might compromise the whole validation process. 
I never noticed that in a long time of using Netscape 4. And it used to work for
Mozilla, it started at some point. So it is a regression.

pi
Keywords: regression
Yes, pretty much all the mozilla code is different from the 4.x code. But in
particular, we're using a completely different file io library in Mozilla than
4.x. There's no getting around using the timestamp of the file on disk for the
simple reason that some other app or the user can change the file and invalidate
the db. Yes, usually this will cause the file size to change, but it might not.
The db has to be known to be consistent with the mailbox file on disk, so we
have to use some attribute of the file on disk that is updated as the file
changes - I don't know what else we can use other than the timestamp. Doing
something like a CRC of the entire contents of the mailbox is not a good idea
for obvious reasons. 

Anyway, looking at the code, I think it's fairly obvious what the difference
between 4.x and Mozilla is, and I believe it could explain this problem. 4.x
always closed the mailbox file before getting the time stamp. Mozilla does not.
Mozilla does call flush on the stream, but that doesn't really flush the
underlying file, but just the file stream.

There are three ways to fix this that I can think of - one, obviously, is to
close the file before getting the file status, like 4.x did. The second is to
make sure that we open the mailbox file in synchronous mode, so that writes are
synchronous, which if you look at the NSPR doc, makes sure that the file data
and file status are physically updated. The third is to make sure that flushing
the stream also flushes the PRFileDesc. My inclination is to try the third one,
making flushing the stream also flush the PRFileDesc. I'm not sure if that will
work if the file is on the network and we haven't opened the file in sync mode,
but I think it's worth a try.
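The difference between flushing a stream and flushing the underlying file can be sketched in Python, where `file.flush()` only empties the user-space buffer and `os.fsync()` asks the operating system to commit the file. This is only an analogy to the nsIOFileStream / PRFileDesc situation described above, not the actual patch; the helper names are hypothetical, and whether a sync is enough on a given network file system is not something this sketch can show.

```python
import os
import tempfile


def append_and_stamp(path, data):
    """Third option: flush the stream AND the descriptor, then read the status."""
    with open(path, "ab") as stream:
        stream.write(data)
        stream.flush()                  # empties Python's user-space buffer only
        os.fsync(stream.fileno())       # asks the OS to commit the file to storage
        st = os.fstat(stream.fileno())  # status read while the file is still open
    return st.st_size, st.st_mtime_ns


def append_close_and_stamp(path, data):
    """First option (what 4.x did): close the file, then read the status."""
    with open(path, "ab") as stream:
        stream.write(data)
    st = os.stat(path)
    return st.st_size, st.st_mtime_ns


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inbox = os.path.join(tmp, "Inbox")
        print(append_and_stamp(inbox, b"From - test\n\nbody\n"))
        print(append_close_and_stamp(inbox, b"more\n"))
```

The second option, opening the file in synchronous mode, would correspond to passing a flag such as `os.O_SYNC` to `os.open()` on platforms that provide it.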
Keywords: regression
David, what is it that was changed in Mozilla? It worked until earlier this
year. Something must have introduced this problem.

pi
there were some changes to the way we shared file streams between different
operations this year, but I doubt they were responsible.  One change was that we
got more consistent about checking the time stamp, so that if it was wrong,
we'll detect it earlier and more often.

I'm going to attach a patch that makes us flush the file before getting the time
stamp, at least on windows. I'd like to check that in to see if it helps with
the problem on windows. If it doesn't help, we'd need to explore the other
possibilities. I don't know enough about sync mode to know if it's going to
block the UI on network writes (which would be unfortunate). If syncing does
help, then I need to figure out why the file stream code thinks sync won't work
on Linux - I think it should.
Attached patch proposed fix
make nsIOFileStream::flush flush the underlying prfile on windows (it already
was on the mac).
Comment on attachment 105721
proposed fix

sr=sspitzer

does this mean the bug still exists for unix?

(if my inbox is on a remote machine, over nfs?)

navin suggested in http://bugzilla.mozilla.org/show_bug.cgi?id=142196#c15 that
we might be able to fix this if we re-wrote to nsLocalFile.

do you think we'll do that one day?
Attachment #105721 - Flags: superreview+
yes, I believe this bug will still exist for unix, but if this fixes it for
windows, I think we can use the same fix for linux (there's a comment in the
nsIOFileStream code that says sync always returns an error on linux, but I'd
want to investigate that comment - it's probably several years old).

I hope we would rewrite this to use nsIFile, but that's a large task and I don't
know if it would fix the problem or not.
How can I test this patch?  Will a build be available that incorporates it, or
do I need to roll my own?  I'm a bit reluctant to try the latter, having never
done it with Mozilla before!
Cavin, can I get a review on the last patch? I plan to check this in when the
tree is open next, sometime today, so it would be in tomorrow's build.
Comment on attachment 105721
proposed fix

r=cavin.
I've downloaded the moz source and applied this patch.  Initial tests with our
setup are very positive.  I haven't been able to get the bug to show its head at
all.  Normally, all I have to do is receive an email, go to another folder and
come back to the inbox.  Not exactly rigorous testing, but it's certainly
looking like the fix works!
Max, thx very much for trying this. I'm very glad it seems to be working. What
windows OS are you running on? win98, NT 4.0, 2K, or XP? I'm pretty sure it
should be OK on all of them, but I'm a tiny bit worried about win98.
Re: #68
I did that test on XP Professional SP1.  However, I can easily test it on all
the windows platforms you've mentioned.  Can you post here when a downloadable
build containing this fix is up?  My build is a bit too bogged down with
debugging stuff!
fix checked in. I'll leave it open for the linux problem, which Dan is going to
help me verify on Linux. I hope to remove the #ifdef completely.
Status: NEW → ASSIGNED
fix checked in for unix as well. mkapply verified it works for OS/2 as well, and
dmose checked it on linux.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Great! Any chance this can get into the 1.0 and 1.2 branches as well?
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021113

I am still experiencing this problem:-(

pi
I was a bit dubious about this being marked fixed after so little testing.  :-(

I can verify that it does appear to be fixed on Win2K and XP platforms though
(20021112).  Haven't had a chance to try 98 yet.  Is this test still needed?
How do i apply this patch?
I'm using windows 2000 with Mozilla 1.1

Thanks

Graeme.
It would be good to test win98. As far as Linux is concerned, if Sync doesn't
really sync on Linux, or the Linux version of NSPR, that's pretty unfortunate. I
suspect that means we can't open the file in Sync mode, since that should be
similar to calling Sync. I'd hate to have to close the file before setting the
summary valid, and re-open it, but I guess we could investigate that.

Boris, since people have described lots of variations of problems in this bug,
perhaps you could explain exactly what's happening to you?
I'm not sure which details I could provide. My ~/.mozilla is mounted with NFS.
After a while (happened two times today) I just did not receive new mail,
neither automatically (set to every two minutes) nor manually.

I'm glad to answer your questions for further details, I just don't have an idea
what would be useful.

pi
Boris, I'm trying to find out if you were seeing the stopwatch cursor after,
say, marking a message read/unread, as described in
http://bugzilla.mozilla.org/show_bug.cgi?id=142196#c52, or just seeing get new
mail fail. So it sounds like you're just seeing get new mail fail. Do you have
mail filters? I can certainly make it so we're more tolerant of the .msf file
appearing out of date when we do get new mail, so that we try to reparse the db.
The other possibility would be to close the inbox after get new mail so that
setsummaryvalid works better.
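The "more tolerant" behaviour suggested here, reparsing instead of refusing to proceed, might look roughly like the following Python sketch. It uses Python's standard `mailbox` module to reparse an mbox file; the dictionary layout of the summary and the function names are assumptions for illustration and have nothing to do with the real .msf format.

```python
import mailbox
import os
import tempfile


def build_summary(mbox_path):
    """Reparse the mbox file and return a fresh summary (hypothetical layout)."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        subjects = [msg.get("Subject", "") for msg in box]
    finally:
        box.close()
    st = os.stat(mbox_path)
    return {"mtime_ns": st.st_mtime_ns, "size": st.st_size, "subjects": subjects}


def open_folder(mbox_path, summary):
    """Return (summary, rebuilt). Rebuild when the summary is missing or stale."""
    st = os.stat(mbox_path)
    if (summary is not None
            and summary["mtime_ns"] == st.st_mtime_ns
            and summary["size"] == st.st_size):
        return summary, False
    return build_summary(mbox_path), True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inbox = os.path.join(tmp, "Inbox")
        with open(inbox, "wb") as f:
            f.write(b"From a@example.com Thu Nov 28 10:00:00 2002\n"
                    b"Subject: hello\n\nbody\n")
        summary, rebuilt = open_folder(inbox, None)
        print(rebuilt, summary["subjects"])
```

The point of the sketch is only the control flow: a summary that appears out of date leads to a reparse rather than to a failed "get new mail".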
>Boris, I'm trying to find out if you were seeing the stopwatch cursor after,
>say, marking a message read/unread, as described in
>http://bugzilla.mozilla.org/show_bug.cgi?id=142196#c52, 

I did not pay attention, but will the next time it happens.

>Do you have mail filters? 

Yes, I do.

pi
Comment on attachment 105721
proposed fix

a=blizzard on behalf of drivers for 1.2
Attachment #105721 - Flags: approval+
fix checked into 1.2
Boris, have you checked that all the NFS clients and the NFS server have their
system clocks set to the same time? You can use NTP to synchronize the clocks.

If the clocks are not synchronized it can affect many programs that depend on
timestamps. This might help or not, just a thought.
Asko, times are set regularly, but that does not prevent them from deviating
after a while.

pi
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021113

OK, it happened again, therefore reopening.

I cannot reproduce it by means of comment 52.

I see the hourglass if I highlight the Inbox or the title of that account.
If I select another folder or account, it goes away.

The stop button is active but has no effect, wherever I go.

Compacting folders does not seem to do anything.

pi
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021122

Same symptoms as in previous comment with this new version.

pi
Boris, I'm still trying to find out what operations might be causing this. Here
are a few possibilities:

1. Getting new mail into your inbox. (though I think this actually closes the
file before getting the time stamp, which should be OK)
2. Move/copying mail into your inbox from another folder.
3. Deleting a single message from your inbox.
4. Delete multiple messages at one time (i.e., select 3 messages and press delete)
5. Reading an unread message.
6. marking a message as unread.
7. Marking several messages as unread (i.e., select 3 messages and mark as
read/unread).

To figure out if any of these operations is the one causing your problem, what
you would do is:

0. Send yourself a message.
1. Perform one of the above operations in the inbox.
2. Select another folder.
3. Press get new mail.




David, I tried all of your steps. None triggered the problem. It really just
seems to happen randomly.

Is there anything I can do to investigate once the problem shows up?

pi
Boris, when it happens, the thing to do is remember what you just did, or what
the last thing you did with the inbox was.  If you haven't done anything with
the inbox since the last time get new mail/opening the inbox worked, then it
would have to be the process of getting new mail, and/or filtering incoming mail
that caused the problem. But like I said before, I think get new mail actually
closes the stream so I'm pretty sure that operation is OK.

I'm grasping at straws here, but do you do things like read new mail while it's
still getting downloaded to your inbox? So if you get 10 new messages, do you
start reading the first new message while the next 9 get downloaded? And if so,
is there any correlation between that happening and this problem?
I'll try to remember what happened next time. The problem is that I don't notice
the problem right away. Mail is fetched every other minute and sometimes (in
addition) manually.

It might also be that I read mail while new mail is coming in (since that happens
automatically). Getting mail is extremely fast, though, since the connection to
the server is one hop on 100 MBit.

pi
It happened again and there is not much to report. I had been reading my mail
and ended up in my Inbox, where I deleted all messages (one by one, after reading
each). I could get e-mail at this point. I left MailNews alone and used the
browser. After a while I came back and tried to check mail manually, but it
failed. So an empty Inbox was highlighted.

I don't know if this means anything, but:
-rw-------    1 3.14     3.14          17k Nov 26 15:21 Inbox
-rw-------    1 3.14     3.14         1.7k Nov 26 15:39 Inbox.msf
So there is a huge gap in the time stamps.

pi
More fancy details after restart. No mail came in while the problem occurred
(most of the time this is different), so this does not play a role.

When I restarted the Inbox forgot that I want my mail threaded, but that also
happens to other folders under totally unclear circumstances.

pi
Boris, do you often empty (delete all the messages) in your inbox? Do you think
that could be involved?

The difference in timestamps is probably because we committed the db at some
point - what's interesting is the time stamp stored in the db, not so much the
timestamp of the db on disk. It would be a definite problem if the timestamp of
the .msf file was earlier than the timestamp of the mail folder. Losing the info
that you want the db threaded is a bug, but it's because we think the db is invalid.

If I understand your comments, either reading the messages or deleting them
caused the db to become invalid.
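
As a rough illustration of the check described above (a sketch only, not Mozilla's actual code): assume the summary db records the mail folder's modification time when it is committed, and is treated as invalid once that recorded value no longer matches what the file system reports for the folder. The names `summary_is_valid` and `stored_folder_mtime` are hypothetical.

```python
import os
import tempfile


def summary_is_valid(stored_folder_mtime, folder_path):
    """Hypothetical check: the summary counts as valid only if the timestamp
    recorded inside it equals the folder's mtime as the file system reports it."""
    return int(stored_folder_mtime) == int(os.stat(folder_path).st_mtime)


def demo():
    """Show the summary going from valid to invalid when the folder's
    reported mtime shifts (for example, due to clock skew on a network drive)."""
    with tempfile.TemporaryDirectory() as d:
        inbox = os.path.join(d, "Inbox")
        with open(inbox, "w") as f:
            f.write("From - \n")
        os.utime(inbox, (1000, 1000))              # pin a known mtime
        stored = int(os.stat(inbox).st_mtime)      # value "stored in the db"
        before = summary_is_valid(stored, inbox)
        os.utime(inbox, (1019, 1019))              # reported time shifts by 19 s
        after = summary_is_valid(stored, inbox)
        return before, after


if __name__ == "__main__":
    print(demo())  # (True, False)
```

Note that the timestamp of the .msf file itself plays no part in this sketch; only the value recorded inside it is compared.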
David, I keep my Inbox empty most of the time. But sometimes there is something
left I keep over several days. It does not seem to have an effect for our problem.

> The difference in timestamps is probably because we commited the db at some
> point - what's interesting is the time stamp stored in the db, not so much the
> timestamp of the db on disk.

I'll see if I can give you information on that next time.

> If I understand your comments, either reading the messages or deleting
> them caused the db to become invalid.

Probably not. It might happen sometimes. But most of the time it does not show
problems.

pi
Do you have auto-compact turned on, so that the inbox and other local folders
are automatically compacted when a certain amount of space will be reclaimed?
Edit | Preferences | Offline & Diskspace?
Yes, I do have auto-compact turned on, but that does not apply. I usually
compact manually before it kicks in. And since there is a dialog, I would have
noticed if that had happened before our problem.

pi
It happened again. This time the mail directory looks strange, watch for the
time stamps:

-rw-------    1 3.14     3.14            0 Nov 27 11:30 Bugzilla
-rw-------    1 3.14     3.14         1.3k Nov 27 13:17 Bugzilla.msf
-rw-------    1 3.14     3.14         3.8k Nov 27 12:17 Drafts
-rw-------    1 3.14     3.14         2.4k Nov 27  2002 Drafts.msf
-rw-------    1 3.14     3.14         313k Nov 27 14:20 Inbox
-rw-------    1 3.14     3.14         1.5k Nov 27  2002 Inbox.msf
-rw-------    1 3.14     3.14          42k Nov 27 14:21 ML
-rw-------    1 3.14     3.14         3.8k Nov 27  2002 ML.msf
-rw-------    1 3.14     3.14            0 Oct 21 14:38 Orakel
-rw-------    1 3.14     3.14            0 Sep  5 17:20 Sent
-rw-------    1 3.14     3.14            0 Apr 22  2002 Templates
-rw-------    1 3.14     3.14         455k Nov 27 14:21 Trash
-rw-------    1 3.14     3.14          30k Nov 27  2002 Trash.msf
-rw-------    1 3.14     3.14            0 Oct 17 18:40 Unsent Messages
-rw-------    1 3.14     3.14         1.0k Nov  8 17:28 msgFilterRules.dat
-rw-------    1 3.14     3.14         2.5k Nov 27 14:27 popstate.dat
-rw-------    1 3.14     3.14         1.0k Nov  8 15:48 rules.dat

The last I did was reading mail in ML and deleting it (folder left empty).
There it still worked. Coming back I found it did not work. I'll attach ML.msf
next.

pi
To me it seems that the msf files are in the future. Are you sure that time
is always going forward in your computer?

I would recommend using xntpd if it is somehow possible.
As I said before, times are adjusted regularly, but since this directory is
mounted, it will happen, that the two clocks do not always agree.

I assume (don't know, though) that ls decided to display the time (which is the
time from the server) of the file this way, because it was in the future
compared to the time of the local machine. Right now (allow a second for ssh to
look up the other machine) I have Wed Nov 27 14:54:29 CET 2002 for the server
and Wed Nov 27 14:54:10 CET 2002 locally.

Even if I were able to have that fixed (again, I think it cannot work), many
users have no way to get this done.

Anyway, my understanding of the previous discussion is that Mozilla does not
look at the time of the local machine. Is this correct?

pi
Mozilla just uses the time stamps as reported by the file system (and if the
file system is networked, then it might be the time on the network machine) - if
those are inconsistent or change, you'll have this problem.
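
To put a number on the disagreement between the two clocks quoted above (server at 14:54:29, local machine at 14:54:10), here is a small worked calculation. It only restates the figures from the comment and ignores the second or so of ssh latency mentioned there.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Times quoted in the comment above (both CET, same day).
server_time = datetime.strptime("14:54:29", FMT)
local_time = datetime.strptime("14:54:10", FMT)

# Positive skew means the file server's clock runs ahead of the local one,
# so freshly written files can appear to lie slightly in the future.
skew_seconds = int((server_time - local_time).total_seconds())

if __name__ == "__main__":
    print(skew_seconds)  # 19
```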
*** Bug 182180 has been marked as a duplicate of this bug. ***
I keep getting this problem several times a day. I just tried to remove all the
msf files. And while doing so I had a look at popstate.dat. Could that be
involved here? I found several long ago deleted accounts there. So I manually
deleted those. I don't really think this makes a difference, but just a thought.

pi
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030107

Still happening:-(

pi
Mail triage team: nsbeta1+/adt3
Keywords: nsbeta1 → nsbeta1+
Whiteboard: [adt3]
Hi,

I've been suffering from this bug for quite a long time (Netscape 7.0, WinNT
SP6, profiles on a Samba 2.2.0 server).
Strangely enough, after
- upgrading to Samba 2.2.6
- using the Samba server as a PDC for a domain
- and joining the domain with a server-based user profile
the problem disappeared. Before, I had to restart Netscape several times a day
because fetching mail hung - now it works all day without problems.

Hope this helps,
Rainer
This bug has nsbeta1+, and many people have pointed out how badly it affects use
in companies. Asking for blocking1.3b.

pi
Flags: blocking1.3b?
Flags: blocking1.3b? → blocking1.3b-
re-assigning nsbeta1+ bugs
Assignee: bienvenu → sspitzer
Status: REOPENED → NEW
Wow, this bug is nsbeta1+, nsenterprise etc. I'll mark it blocking1.3?

Today I had to restart Mozilla four or five times in only three hours! It is
really annoying. I switched off automatic mail retrieval, but that did not make
a difference. Does anybody have a suggestion how to investigate this bug?

pi
Flags: blocking1.3?
Forget my last comment about automatic mail retrieval, that setting did not have
any effect at all, hence I was still getting mail automatically.

pi
It seems that the option for automatic mail retrieval only takes effect after a
restart. But even then the problem shows up. I did produce copies of my mail
directory and observed the following:

I had a failure to retrieve mail. So I closed Mozilla. I again made a copy and
found that there were significant changes to several files. Since there was no
change I don't see why this should happen. After restarting again there were
changes, but only to irrelevant (folders not used to retrieve mail) msf files.

If you want to look at the details, please e-mail me and ask for "case 1" data.

pi
Flags: blocking1.3? → blocking1.3-
Next incident (please refer to case2 if you want the files):

Last actions:
diff -r 113849/Inbox.msf 113949/Inbox.msf
48a49,57
>
> @$${12{@
> @$$}12}@
>
> @$${13{@
> @$$}13}@
>
> @$${14{@
> @$$}14}@

One minute later:
6c6
< -rw-------    1 3.14     3.14            0 Feb 19 11:37 Inbox
---
> -rw-------    1 3.14     3.14            0 Feb 19 11:38 Inbox
8,9c8,9
< -rw-------    1 3.14     3.14          14k Feb 19 11:37 ML
< -rw-------    1 3.14     3.14         2.3k Feb 19 11:37 ML.msf
---
> -rw-------    1 3.14     3.14          14k Feb 19 11:38 ML
> -rw-------    1 3.14     3.14         2.3k Feb 19 11:38 ML.msf

Then nothing.

pi
Blocks: 101953
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030331

It is still happening.

Comment 100 explains the reason. Not clear to me, how this could happen, if only
Mozilla works on those files, though. What exactly are the times Mozilla compares?

Anyhow, Mozilla should at least alert the user instead of doing just nothing.

Is there any workaround, like touching some file?

pi
Flags: blocking1.4b?
Keywords: mozilla1.3
Flags: blocking1.4b? → blocking1.4b-
I posted a patch somewhere (I'll try to find it) that put in a pref like eudora
has for a little fudge factor in the time stamp checking - did you see that
patch and try it, Pi?
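
A minimal sketch of what such a fudge factor could look like, assuming the tolerance is expressed in seconds. The function name, the parameter `leeway_seconds`, and its default of 60 are hypothetical and are not taken from the actual patch or pref.

```python
def timestamps_match(stored_mtime, folder_mtime, leeway_seconds=60):
    """Hypothetical tolerant comparison: treat the summary as still valid when
    the recorded and reported folder times differ by at most the leeway."""
    return abs(folder_mtime - stored_mtime) <= leeway_seconds


if __name__ == "__main__":
    # A 19 second disagreement is absorbed by the leeway ...
    print(timestamps_match(1000, 1019))                    # True
    # ... but not when the comparison is strict.
    print(timestamps_match(1000, 1019, leeway_seconds=0))  # False
```

The trade-off is that a larger leeway hides small clock disagreements between machines but also makes it easier to miss a genuine change to the folder made within that window.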
Just to confirm that since the patch appeared in 1.2, all of our problems as
described in this bug disappeared completely.  This has since been tested quite
extensively on XP, 2000 and Linux using POP and network profiles.  I was
surprised to notice that this bug has now gone back to New status.  Wouldn't a
new bug be more appropriate?
Attached patch: proposed fix
this patch reparses the inbox, if it's not locked, when biff fires and the db
is invalid.
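
The decision described in this comment can be sketched as follows. This is an illustrative model only, not the patch itself: the `Inbox` class, its fields, and the returned action strings are hypothetical, and what happens after a reparse (presumably the normal mail check) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Inbox:
    """Hypothetical stand-in for the inbox folder state."""
    db_valid: bool
    locked: bool
    reparsed: bool = False

    def reparse(self):
        # Rebuilding the summary from the mail folder makes the db valid again.
        self.reparsed = True
        self.db_valid = True


def on_biff(inbox):
    """Return the action taken when the periodic mail check (biff) fires."""
    if inbox.db_valid:
        return "download"   # nothing special to do
    if inbox.locked:
        return "skip"       # someone else holds the folder; try again later
    inbox.reparse()         # db invalid and folder free: rebuild the summary
    return "reparse"


if __name__ == "__main__":
    print(on_biff(Inbox(db_valid=False, locked=False)))  # reparse
```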
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030507

Latest patch WFM.

pi
taking
Assignee: sspitzer → bienvenu
marking fixed, patch r/sr=sspitzer, a=sspitzer.
Status: NEW → RESOLVED
Closed: 22 years ago → 21 years ago
Resolution: --- → FIXED
Waiting to verify until those who have seen this bug and have commented here
report how this fix is working for them.
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030514
(built from source of 05/13/03 15:26:00 with patch 101274)

Still WFM now that this patch is in the source.

pi
verifying, thanks Boris
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core
Keywords: relnote