Closed Bug 400526 Opened 17 years ago Closed 11 years ago

newsgroup subscriptions forgotten on restart

Categories

(MailNews Core :: Backend, defect)

x86
Windows XP
defect
Not set
major

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: nelson, Unassigned)

Attachments

(2 files)

I have 4 newsgroup accounts (in SM trunk profile).
One of them recently appeared to have forgotten all its subscriptions,
so I went in and resubscribed to numerous groups.  But when I restarted,
I found that it only remembered 3 groups, and the rest were forgotten.

So, I resubscribed AGAIN, and when I restarted, there were only 3 AGAIN.

When I resubscribe to the groups, it remembers everything about those 
groups that it previously knew, including
- what articles I had previously read
- filters

It just won't remember that it was subscribed to them.  

The dates on the .rc and the hostinfo.dat files haven't changed in months, 
since about the time that I switched from SM 1.5a to SM 2.0a1 ("suite runner"), 
so I gather that mailnews now records the information about the subscribed 
groups somewhere else than in those files.  I would try to investigate this
if I knew where that information is stored.
While we're at it, can someone explain the difference between the two tabs 
in the subscription dialog, the "Current group list" and "New groups" tabs?  
They seem to do the very same things.  

I've tried using both of them to (re)subscribe to the groups, but neither
one seems to make any difference.
correction: 
When I resubscribe to a group I've previously read (that it has forgotten),
it forgets what articles I have previously read.  It remembers old filters,
but it does NOT apply filters to the articles in those groups when I 
resubscribe to them.
Subscription information has always been stored in the .rc files, along with, obviously what articles have been read. That hasn't changed. Unsubscribing and resubscribing has *always* retained the read set, since Netscape 3.0, because unsubscribing doesn't clear out the read set from the rc file. So that hasn't changed.

My news rc file for the newsgroup I read has a time stamp from a minute ago...

But I suspect the issue is simply that for some reason changes to your newsrc file are not getting flushed out to disk. That would explain what you're seeing. We used to write it out on a timer (10 minutes, iirc), if it was dirty, and I thought we also committed it when big things had changed, like subscription.

The newgroups list is a new feature (actually, revived from Netscape 4) that tries to show you groups that have been added to the server since the last time we asked the server for all groups on the server.
this is not a database (.msf) issue, so changing component to backend. 

When I subscribe to a new newsgroup, and close the subscribe UI, the timestamp on the .rc file changes immediately. I take it this isn't the case for you. Is that true for all your news servers, or just the one you're having issues with? Have you looked at the newsrc file for that server to see if anything looks odd with it?
Assignee: bienvenu → nobody
Component: MailNews: Database → MailNews: Backend
QA Contact: database → backend
Component: MailNews: Backend → MailNews: Database
Oops.
Component: MailNews: Database → MailNews: Backend
Here are the timestamps on the .rc files for all 4 of my news servers:

   975 Jun 23 18:24 ./News/news.grc.com.rc
  3278 Jun 23 18:24 ./News/news.microsoft.com.rc
  1779 Jun 23 18:25 ./News/news.mozilla.org.rc
  3680 Oct 20 00:14 ./News/newsgroups.comcast.net.rc

The .rc file thinks I am subscribed to 10 newsgroups, but only 3 show up
in the folder pane, until I resubscribe to the rest.  
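
To check what the .rc file itself claims, independent of the folder pane, something like the following could be used. It is a minimal sketch that assumes the conventional newsrc layout, where each line is a group name followed by ':' (subscribed) or '!' (unsubscribed) and then the read-article ranges. The function name parse_newsrc is hypothetical, and real mailnews files may contain lines this does not handle.

```python
def parse_newsrc(text):
    """Return (subscribed, unsubscribed) group-name lists from newsrc text.

    Assumes the conventional layout: "group.name: ranges" for a subscribed
    group and "group.name! ranges" for an unsubscribed one.
    """
    subscribed, unsubscribed = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first ':' or '!' ends the group name.
        for i, ch in enumerate(line):
            if ch in ":!":
                name = line[:i]
                (subscribed if ch == ":" else unsubscribed).append(name)
                break
    return subscribed, unsubscribed


if __name__ == "__main__":
    sample = "sci.crypt: 1-100\nalt.test! 1-5,7\n"
    subs, unsubs = parse_newsrc(sample)
    print(len(subs), "subscribed;", len(unsubs), "unsubscribed")
```

Comparing the subscribed count from the file with what the folder pane shows would confirm whether the mismatch is in the file or in how it is read.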

It seems that each time I restart, it forgets a different set of folders
than before.  That is, it's not the same 3 folders that get remembered 
each time.  I've seen this behavior before, months ago, where the group(s)
at the top of the list would be forgotten on restart, and would get added
at the end of the list when I resubscribed, so that the entire list seemed
to do a slow circular shift with each restart, groups moving from top to 
bottom.  The cause of that was (IIRC) lines in the .rc file that had 
exceeded some length, and was fixed by shortening those lines.  There's a
separate bug about the fact that "mark newsgroup read" doesn't shorten
the .rc file to the form 1-lastNumber and that bug is probably the cause
of the excessive .rc file line lengths.  But that doesn't seem to be the
explanation in this case, because the .rc files simply aren't being updated.
There's something about this that still doesn't make sense to me. 
Are the rc files used only to remember which groups are subscribed, 
but not also used to remember which articles are read? 

Given that the date on news.mozilla.org.rc hasn't updated in months, 
one might expect that

a) the set of subscribed newsgroups on that server has not changed in 
months, (which might be true), and

b) the set of articles marked as read would be the same on that server
every time I restart SeaMonkey.  (this is not true).

So, one might expect that I would keep seeing the same unread articles
over and over in the newsgroups on news.mozilla.org, every time I restart
the mailnews client ... but I don't.  On other news servers, even though
the .rc file is not apparently being updated, the set of read articles
seems to be recorded and maintained accurately ... somewhere.  

That somewhere seems to be the .msf file, not the .rc file.  As I read
news articles, I see the .msf file date being updated, not the .rc file.

Does the mailnews client have some mode in which it ceases to use the .rc
file and uses the .msf file instead?
I conducted another experiment, with another NNTP server,
one of the other servers in my set of 4, news.microsoft.com.

I went to that server and subscribed to a new newsgroup, one to which 
I had not previously subscribed, microsoft.public.vc.debugger.  I read 
about half of the articles.  Then I restarted seamonkey and went again 
to the same newsgroup.  SM remembered that I had subscribed to the new
newsgroup, and remembered the articles that I had previously read just
minutes earlier, before restarting SM.   

Yet, the .rc file remains unchanged, bearing a date from last June.
The .rc file still bears no line with the word debugger.
Only the .msf file has changed.

    3278 Jun 23 18:24 news.microsoft.com.rc
    1083 Oct 20 18:45 news.microsoft.com.msf

From these facts, I conclude that the .rc file is no longer relevant,
and that all the information about subscriptions and read articles is kept
in the .msf file, not in the .rc file.

The remaining question is: why does one of my 4 news accounts not remember
all the newsgroups and articles from the previous run, but only remembers 
3 of the groups?  

I imagine the problem is .msf file corruption of some sort, but the file 
contents are indecipherable by mere mortals.  

Are you sure this problem doesn't belong in the MailNews: Database component?
I tried my own little experiment.

I subscribed to a new newsgroup on news.mozilla.org. I shutdown and looked in the .rc file for the news server. It contained a line for the new newsgroup that wasn't in there before. I then deleted all the .msf files for that news server, and restarted. All the newsgroups appeared subscribed, and the read state was remembered for each of them. From these facts, I conclude that subscription is indeed stored in the .rc file, and not the .msf files.

At this point, I have no idea what's going on on your system. Maybe it's something specific to SeaMonkey that I have no idea about.

Yes, I'm sure this doesn't belong in the MailNews: Database component.
I have a testable hypothesis about what's going on with the .rc files on 
Windows for SM clients.  It is that when mailnews attempts to open the .rc 
file for writing, the open fails due to permissions (more on that below).  
Mailnews silently ignores the failure, but does go on to update the .msf 
file (or perhaps the .msf file was updated first, before the failure to 
open the .rc file).  The .msf file contains a (normally redundant) copy
of the information found in the .rc file.  When mailnews restarts, it reads 
both the .rc file and the .msf file.  Although the content of the .rc file
has not changed, since prior to the previous run, the updated information
is retrieved from the .msf file, so that it appears to work. 

This hypothesis does not explain the problem originally reported in this
bug, and so perhaps I should file a separate bug about the .rc files not
updating.  This hypothesis, if true, only explains how mailnews can 
apparently remember subscribed newsgroups and articles read, even when the
.rc file is not being updated.

Windows' system of file access permissions is FAR more complex (sadly) than
Unix's.  Windows' file management UI (e.g. Windows Explorer) does not reveal
all the relevant information about file permissions.  Two files may appear 
(in windows explorer) to be owned by the same user and have the same
permissions, yet a process may be able to open one of them and not the other.
Or, a process may change its own "security descriptor" in subtle ways, and
be able to access a file at some times, but not at others.

MailNews has a history of using different inconsistent security attributes 
in different code paths for writing files.  One manifestation of this was
(is?) that attempting to save a single attachment from an email would fail, 
but attempting to save ALL attachments from the same email would succeed, 
because the path for saving all attachments used a different set of 
security attributes than the path for saving a single attachment.  
I suspect something similar is going on here, enabling mailnews to write
the .msf file but not the .rc file.  

This hypothesis can be tested, through simulation, on Linux/Unix, by 
making the .rc file be read-only, removing write access permissions from it.  
If that doesn't work (if the code unlinks the old file and creates a new one), then it may be necessary to change the directory permissions and/or ownership 
as well.  
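
The first step of that simulation, removing write permission from the .rc file, could be scripted roughly as follows. This is a sketch for Linux/Unix only. The helper name make_read_only is hypothetical, and a process running as root can still write to the file regardless of these bits, so the test should be run as an ordinary user.

```python
import os
import stat


def make_read_only(path):
    """Clear all write bits on `path` and return the resulting permission bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Remove write permission for user, group, and others.
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return stat.S_IMODE(os.stat(path).st_mode)
```

After calling it on the .rc file (with the mail client closed), subscribing to a group and restarting would show whether the subscription survives without the .rc file being rewritten.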

Now, you may well ask, why would this suddenly affect SeaMonkey users in 
the past few months?  The answer may be that SeaMonkey recently changed
the path to its profile directory, AGAIN, forcing users to move or copy
their profile directories and files.  When a user copies his entire 
profile from one location to another, all the new copies of files are 
created with the same exact security descriptor (on windows), even if 
they originally had different security descriptors.  So, the code that
created them with inconsistent security descriptors in one location may
not be able to access them with new consistent security descriptors in 
the new location.  

If this hypothesis is correct, then this shows yet another reason to 
solve the long standing problem of inconsistent security descriptor
use in mozilla products.  But it also raises some interesting issues
about the redundant information between .msf and .rc files.  
great, that's much more believable than the theory that we've changed the code to not use the .rc files...what are the privileges on your .rc files now? And why didn't SeaMonkey migrate the profile for you?

I don't know how interesting the redundant information is - maybe it's interesting to talk about not using the .rc files, but I imagine that would raise some hackles.
> what are the privileges on your .rc files now?

Windows has no tools that provide an adequate display of them.
All it shows is that they are owned by my user, and are not read-only,
not "hidden", and not "system" files, but we know that is a gross 
oversimplification of the real story.

> And why didn't SeaMonkey migrate the profile for you?

Several months ago, SM moved its profiles on Windows from their original 
location
   <prefix>/Mozilla/Profiles/
(where <prefix> is C:/Documents and Settings/<username>/Application Data) to 
   <prefix>/mozilla.org/SeaMonkey/Profiles/
and then a few weeks ago, moved them again to a third location, namely
   <prefix>/Mozilla/SeaMonkey/Profiles/

At some point in that series of moves, they also split the profile 
directory into two directories, with some of the old profile files 
moving to a second profile directory with the same name but with a new 
different <prefix>:
   C:/Documents and Settings/<username>/Local Settings/Application Data/
(note the addition of "Local Settings/" to that path).  

Each time (I am told), it provided code to migrate from the original 
location to the new location, but it did not provide code to migrate from
the mozilla.org location to the new Mozilla location (that is, from the
second location to the third).  Instead, they provided instructions to 
manually move the files.  

As for the elimination of redundant information, I'd vote for keeping the
.rc files, which are editable and whose format is well known, and 
eliminating the redundant info from the other files (wherever it lies), 
most of which are NOT editable (I don't consider .msf files editable).
Summary: new newsgroup subscriptions forgotten on restart → newsgroup subscriptions forgotten on restart
heh, well, for local, rss, and imap messages, we store the read state, counts, etc, in the .msf files. We're not going to change them all to use .rc files, and since we share as much code as possible between all the different kinds of messages, it would be *more* work and code to get rid of the "redundant" information in the .msf files for newsgroups. We trust the .rc file over the .msf file for newsgroups, and that's usually sufficient, I believe.

If the files are writable by you, that should be all that's required. Since it's not saving subscriptions and read state in the .rc files, maybe they're not writable, but I kinda wonder if we're not looking in another directory for the .rc files. It might be interesting to go into account settings and look at the local directory under server settings for one or more of your news servers.
I see that the pathname of the .rc file is separately configurable from
the path for the news directory.  Under what circumstances does it make
sense for the .rc file to NOT be in the news directory?
that's up to the user, I suppose.  In theory, you might want your newsrc file to be outside the profile directory so that another news reader can also use it, i.e., to share it with another program. I don't know if anyone has ever done that.
Ok, well, your suggestion in comment 14 revealed a lot of things.
For years (ever since Netscape 4.x) I had my news folders separate 
from my profile folders, so that I could share my news folders in 
multiple profiles.  My .rc files were in my news folders.  

When SM migrated my profile from 
   <prefix>/Mozilla/Profiles/
to
   <prefix>/mozilla.org/SeaMonkey/Profiles/
it copied my .rc file from my out-of-profile news directory into the News 
directory in the new SM profile, and set the news directory for the new
profile to point to the new profile's news directory.  

But, I now realize, it left the .rc file path where it previously was, in 
the separate out-of-profile directory.  So, it was using a .rc file, just 
not the one it had copied into my profile's News directory.

I thought it had completely switched my news stuff to the profile directory, 
but apparently not. Sigh.  

So, today I went and looked at the .rc files in the old news directory, 
which I had not looked at for months.  The .rc files there were up to date.

The .rc file for the one account that is having the troubles remembering
the subscribed newsgroups (the subject of this bug) was over 200KB long. 
wc, the word count command, reports it has only 4 lines in it.  
It's seriously hosed.  Do you want to see it?  
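
A quick way to spot this kind of damage, a large file with very few lines, is to report the line count and the longest line. The sketch below does roughly what wc plus a length check would do. The name line_stats is hypothetical, and it counts lines by splitting on line endings, so its count can differ by one from wc -l when the last line has no terminator.

```python
def line_stats(data: bytes):
    """Return (number_of_lines, length_of_longest_line) for raw file bytes."""
    lines = data.splitlines()
    longest = max((len(line) for line in lines), default=0)
    return len(lines), longest


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        count, longest = line_stats(f.read())
    print(f"{count} lines, longest line {longest} bytes")
```

A healthy .rc file should show one line per group; a 200KB file with 4 lines would report a longest line of tens of kilobytes.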

I've seen .rc files get into this state before.  The line for one or more
newsgroups gets way way too long.  Shortening them, by removing all but
the last few article numbers in each line, solves the problem.  

So, ultimately, once we got past the distraction of the appearance that 
.rc files were not being used, the problem is a badly-formed .rc file. 

Isn't there already a bug on file about rc lines getting too long because 
gaps for unread expired article numbers aren't removed ? Maybe that's 
relevant.
Here's the .rc file as mailnews created/left it.  
This is the file that seemed to forget all but 3 group subscriptions 
from run to run.
Doing a Mark all read should also shorten the line in the newsrc file.
that rc file is horked - a lot of lines got concatenated together, and lines got duplicated. I really don't know how it got that way.
If I may make a guess, I'll bet that what appears to be concatenation is 
actually truncation.  Some code probably attempted to write out a single
line that was longer than some buffer size limit, so it was truncated. 
The end-of-line characters that would have been written if it had not been
truncated were lost, so when the next line was then written to the file, 
it appeared to be concatenated to the previous line, due to the lost 
end-of-line characters.  
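
The guess above can be illustrated with a small simulation. This is not mailnews code. The function write_lines_with_limit is a hypothetical writer that silently cuts every record to a fixed buffer size, which is the behavior being guessed at.

```python
def write_lines_with_limit(lines, limit):
    """Simulate a writer that silently truncates each record to `limit` chars.

    Each record is the line plus its newline. When the record is longer than
    the limit, the newline falls off the end, so the next record runs on.
    """
    out = []
    for line in lines:
        record = line + "\n"
        out.append(record[:limit])
    return "".join(out)


if __name__ == "__main__":
    text = write_lines_with_limit(["a: 1-100,200-300", "b: 1-5"], limit=10)
    print(repr(text))
```

With a limit of 10, the first record loses both its tail and its newline, so the two groups end up on one physical line, which matches the look of apparent concatenation.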

I'd guess that some software isn't checking the return value from some 
write call, and hence is failing to notice a short write.  It might be a
good idea to try to limit line lengths to below some limit.  If the line
has so many article numbers that it unavoidably must exceed that length
limit, then I suggest discarding old article numbers to shorten the line
to make it writable.  I might suggest 8KB as the line length limit.
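
The suggested length cap could look roughly like this. It is a sketch of the proposal only, not existing mailnews behavior: shorten_newsrc_line is a hypothetical name, the 8192 default reflects the 8KB figure suggested above, and dropping a range means those article numbers would be treated as unread again if the server still carries them.

```python
def shorten_newsrc_line(line, limit=8192):
    """Drop the oldest ranges from a newsrc line until it fits within `limit`."""
    if len(line) <= limit:
        return line
    positions = [i for i in (line.find(":"), line.find("!")) if i != -1]
    if not positions:
        return line  # not a recognizable group line; leave it untouched
    sep_index = min(positions)
    prefix = line[: sep_index + 1] + " "
    ranges = [r for r in line[sep_index + 1:].strip().split(",") if r]
    while ranges and len(prefix + ",".join(ranges)) > limit:
        ranges.pop(0)  # lowest-numbered (oldest) range goes first
    return prefix + ",".join(ranges)
```

An alternative would be to merge the oldest ranges into a single 1-N range instead of dropping them, which would mark old articles read rather than unread.
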
> Doing a Mark all read should also shorten the line in the newsrc file.

Maybe what I'm about to describe is a non-issue, but it seems like it is 
part of the problem of very long lines in .rc files.  Please advise.

As an experiment, I read articles in a very high traffic volume newsgroup,
sci.crypt, and marked all but the last 20 or so read.  Then I read the 
last 20 articles, so that they all were read.  Then I marked one of the 
previously read articles unread, and exited SM, and looked in the freshly
updated .rc file at the line for this newsgroup.  

I expected the line to have just one comma, at the place where the one
unread article was in the article number sequence.  I expected it to look 
like this:
    sci.crypt: 1-PPPPPP,QQQQQQ-RRRRRR
where 
- PPPPPP is a number, one less than the article number of the article
   I marked as unread,
- QQQQQQ is a number, one greater than the article number of the article I 
   marked as unread, and two greater than PPPPPP, and 
- RRRRRR is the highest article number for that group on that server.
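
The expected form can be made concrete with a small sketch that turns a set of read article numbers into the comma-separated range notation. The name format_ranges is hypothetical, and this only illustrates the notation; it is not the code mailnews uses.

```python
def format_ranges(read_numbers):
    """Collapse article numbers into newsrc-style ranges, e.g. '1-49,51-100'."""
    nums = sorted(set(read_numbers))
    parts = []
    i = 0
    while i < len(nums):
        start = end = nums[i]
        # Extend the run while the next number is consecutive.
        while i + 1 < len(nums) and nums[i + 1] == end + 1:
            i += 1
            end = nums[i]
        parts.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(parts)


if __name__ == "__main__":
    read = set(range(1, 101)) - {50}  # everything read except article 50
    print("sci.crypt: " + format_ranges(read))
```

With articles 1 through 100 read except number 50, the result has exactly one comma, at the position of the single unread article.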

Instead I saw a line that had 187 commas in it.  There were 187 gaps in 
the numbers, and many of the gaps were greater than one article, suggesting
that there were multiple unread articles in that portion of the number 
sequence.  

I expected that the span of numbers from the lowest article number on that
line to the highest number on that line, would be no more than 1000, because
I have this group set to read no more than 1000 articles, and to mark the 
rest as read.  In other words, given that the line was of the form
   sci.crypt: 1-AAAAAA,BBBBBB-CCCCCC,...,YYYYYY-ZZZZZZ
I expected that the difference between AAAAAA and ZZZZZZ would be no more 
than 1000.  

But in the file, the difference between AAAAAA and ZZZZZZ was over 600,000!
There was one gap in the sets of adjacent numbers that exceeded 400,000.
Surely there must be some algorithm that causes us to drop numbers 
representing articles (read or not) that are very old.  That is not evident
here.
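
Figures like the number of gaps, the largest gap, and the AAAAAA-to-ZZZZZZ difference can be pulled out of a line mechanically. The sketch below assumes the range part of the line has already been separated from the group name. The names parse_ranges and gap_report are hypothetical.

```python
def parse_ranges(spec):
    """Parse '1-100,105,110-120' into a list of (low, high) tuples."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        low, _, high = part.partition("-")
        ranges.append((int(low), int(high) if high else int(low)))
    return ranges


def gap_report(spec):
    """Summarize the gaps (unlisted article numbers) between adjacent ranges."""
    ranges = parse_ranges(spec)
    gaps = [nxt[0] - cur[1] - 1 for cur, nxt in zip(ranges, ranges[1:])]
    return {
        "gaps": len(gaps),
        "largest_gap": max(gaps, default=0),
        # End of the last range minus end of the first range.
        "span": ranges[-1][1] - ranges[0][1] if ranges else 0,
    }
```

For the line described here, the report would show 187 gaps, a largest gap above 400,000, and a span above 600,000.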

Then I restarted SM, and used "Mark Newsgroup Read" from the group's context
menu.  I expected this would mark all articles from 1-RRRRRR read.  Then 
I again marked one article unread (the same article as before), and exited
SM again.  This time, I really expected to see the line in the single comma 
format I described above.  Was that expectation entirely wrong?

But the file was unchanged.  The line for sci.crypt still had 187 gaps in it.
That line is in this attachment.
premature to request wanted TB3?, so qawanted

might this be resolved by demorkfication?
not sure who to cc:
Keywords: qawanted
Demorkfication?  
In reply to comment #17: Nelson, I have seen something else about newsrc lines getting too long: see bug 79130 and its workaround at bug 294754 comment #3.
(In reply to comment #19)
> Doing a Mark all read should also shorten the line in the newsrc file.

I agree.  It *SHOULD*.  But it doesn't.

Product: Core → MailNews Core
Keywords: qawanted
dup of Bug 387339 ?
Bug 540686 looks like a dup of 387339.

*This* bug shares in common with bug 387339 the fact that all are apparently
due to corruption of the .newsrc information, in memory or in the file. 
But other than that, I wouldn't call this bug a duplicate of bug 387339.

I suppose that if the .newsrc file showed up on the screen, it would get attention from the TB people.  But since it's invisible to the user, it's 
invisible to MoMo, too.
Is this reproducible under conditions where the news files have never been outside the profile, and only been accessed from one OS?
Flags: needinfo?(nelson)
See Also: → 387339
bug 387339 reports WFM.
No response from Nelson, so => incomplete
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(nelson)
Resolution: --- → INCOMPLETE