Closed Bug 424607 Opened 16 years ago Closed 16 years ago

Numerous regressions in news groups (caused by fix for bug 16913 ?)

Categories

(MailNews Core :: Filters, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nelson, Assigned: jcranmer)

References

Details

(Keywords: regression)

Attachments

(1 file)

At about the time that the fix for Bug 16913 was committed, 
(Filter news based on any headers) I began to notice numerous 
regressions in mailnews in the trunk nightly builds.  Together these
regressions have really reduced the desirability of using trunk 
mail/news for news reading AT ALL.  Here are some of the issues:

1. news groups behave as if no new news is arriving, for weeks. 
Day after day, visiting news groups, the client reports "no new messages
on server".  And the newest message displayed in the message header 
pane (displaying ALL messages in date order) is days or weeks old.  
A check on groups.google.com shows that these groups are alive and
well, but my mail/news client surely thinks there's no traffic in them.

2. summary files get corrupted
On a few occasions, the problem of not seeing any new mail/news traffic
is remedied by deleting the newsgroup summary file (.msf), from which 
I conclude that those files are being corrupted.  But it's not many days 
before the problem is right back.  

Groups that experience issues 1 and 2 a LOT include:
	rec.video.cable-tv
	comp.dcom.modem.cable
	alt.online-service.comcast

3. Counts of unread messages in a thread sometimes exceed the total count 
of messages in the thread.  Rebuilding the summary file does NOT fix this.
I will attach a screen shot that shows a single on-going example of this.  
The shot will show a thread with a total of 2 messages, and an unread 
message count of 25.  I tried marking the entire group read, tried 
rebuilding the summary file repeatedly. Nothing makes that 25 go away.

4. Time spent filtering messages when I first go to visit a group is 
VERY long.  The filtering now is done in two passes, the second of which
only goes at about 3 messages per second.  The first pass is very fast,
hundreds of messages per second are filtered on From and Subject.
The second pass re-filters ALL the messages, including the ones that were 
marked read in the first pass.  The second pass should only request 
headers for messages that were not eliminated in the first pass.  
This is painfully apparent when the first pass eliminates 90% of the 
messages (as is often the case).

Consequently, I find I must limit the number of messages that I attempt
to download from each group each day to no more than 100, or else I have
to wait WAY WAY too long for the two passes to finish.  When I consider 
that well over half of the messages are eliminated in the first pass, 
it frustrates me to know that I am waiting all that time for no useful
result for those messages.  I know that in many cases, after downloading
100 headers, I will end up with less than 10 messages that passed the 
filter.  I know that the vast majority of them were eliminated in the 
first pass, but I had to wait 30 seconds or more for their headers to 
be downloaded a second time.  That time spent would be MUCH more worthwhile
if all the messages checked in the second pass were ones that had NOT 
already been eliminated, and so had a prayer of being not marked read at the end.

One mitigating workaround for that last problem: It is possible to start 
multiple newsgroups going through the second slow pass in parallel, taking 
about the same time together as a single group takes by itself.

5. I have the groups set to show "threads with unread".  Before there were
two passes of filtering, any thread whose contents were entirely eliminated
during the filtering did not show in the message header pane.  Now, threads
that have no unread messages nonetheless appear in the message header pane
when the filtering is done.  Selecting "threads with unread" again causes 
those entirely-read threads to disappear, thankfully, but it shouldn't be
necessary.  The determination of which threads are unread should be made
AFTER the second filtering pass, not before.
Flags: blocking-thunderbird3.0a1?
Correction:  There are several places in comment zero where I wrote 
"message header pane".  In ALL cases, I actually meant the message LIST pane
that shows the messages in a folder/newsgroup.  Sorry for the confusion.  
Another symptom:
Disabling a server-wide filter doesn't disable it.
The filter log continues to show matches being made to the server-wide 
filters, even when they have been disabled.  This applies to filters 
that get additional headers from the server.  (Is there a name for such
filters?)
This seems like a pretty painful set of regressions.  Marking as blocking-3.0a1+.  Nelson, can you try running with the build just before that checkin for a few days to see if that makes these problems go away?  Joshua, are you able to reproduce this?
Flags: blocking-thunderbird3.0a1? → blocking-thunderbird3.0a1+
This problem is not limited to a few newsgroups on comcast's news server.
I'm now experiencing it on mozilla's server.

My news client acted as if NO new messages were seen in the m.d.t.crypto 
newsgroup since 3-23, 4 days ago.  For 4 days, I saw NO new messages in 
that group.  I knew that messages were being received in that group, 
because I could see them looking at groups.google.com, and because I was 
occasionally being CC'ed on messages to that group.

I tried rebuilding the MSF file using the button in the newsgroup properties
dialog.  That had no effect.  

Then I shut down SM, and RENAMED the .msf file for that group, effectively removing it.  When I restarted SM and read the group, SM began to behave 
as if it was seeing new messages.  It downloaded headers for the group, 
and when it was finished, it was seeing all the messages, up to date.  

Now, IMO, this is pretty clear evidence that the inscrutable .msf files are 
corrupted, containing info that makes SM behave as if no new messages were 
being received.  I don't know why the "rebuild message summaries" button 
doesn't fix it.  Evidently that button doesn't rebuild the portion of the
file that contains the corruption.  

I have the renamed msf file, and will attach it here if anyone wants to 
analyze it.
(In reply to comment #0)
> 1. news groups behave as if no new news is arriving, for weeks. 
> Day after day, visiting news groups, the client reports "no new messages
> on server".  And the newest message displayed in the message header 
> pane (displaying ALL messages in date order) is days or weeks old.  
> A check on groups.google.com shows that these groups are alive and
> well, but my mail/news client surely thinks there's no traffic in them.
>
> 2. summary files get corrupted
> On a few occasions, the problem of not seeing any new mail/news traffic
> is remedied by deleting the newsgroup summary file (.msf), from which 
> I conclude that those files are being corrupted.  But it's not many days 
> before the problem is right back.  

I haven't looked at these yet.

> 3. Counts of unread messages in a thread sometimes exceed the total count 
> of messages in the thread.  Rebuilding the summary file does NOT fix this.
> I will attach a screen shot that shows a single on-going example of this.  
> The shot will show a thread with a total of 2 messages, and an unread 
> message count of 25.  I tried marking the entire group read, tried 
> rebuilding the summary file repeatedly. Nothing makes that 25 go away.

Newsgroup count problems are known problems that have existed for long periods of time (I have annoying ones in TB 2). The thread problem I don't think I've seen before, but I suspect that it would have existed for a while before 16913.

> 4. Time spent filtering messages when I first go to visit a group is 
> VERY long.  The filtering now is done in two passes, the second of which
> only goes at about 3 messages per second.  The first pass is very fast,
> hundreds of messages per second are filtered on From and Subject.
> The second pass re-filters ALL the messages, including the ones that were 
> marked read in the first pass.  The second pass should only request 
> headers for messages that were not eliminated in the first pass.  
> This is painfully apparent when the first pass eliminates 90% of the 
> messages (as is often the case).

This is a feature, not a bug. Your choices are "slow" or "not at all": blame GigaNews for only supporting an extremely limited set of headers when querying with XHDR (i.e., only the ones returned by XOVER).

> One mitigating workaround for that last problem: It is possible to start 
> multiple newsgroups going through the second slow pass in parallel, taking 
> about the same time together as a single group takes by itself.

Async news is on my list of long-term goals for news.

> 5. I have the groups set to show "threads with unread".  Before there were
> two passes of filtering, any thread whose contents were entirely eliminated
> during the filtering did not show in the message header pane.  Now, threads
> that have no unread messages nonetheless appear in the message header pane
> when the filtering is done.  Selecting "threads with unread" again causes 
> those entirely-read threads to disappear, thankfully, but it shouldn't be
> necessary.  The determination of which threads are unread should be made
> AFTER the second filtering pass, not before.

I suspect this is a problem related to filters in general and not caused by my patch for bug 16913.

In total:
I am guessing that you did a lot of filter modification after bug 16913 was checked in, and you then started noticing many of these problems at the same time when scrutinizing the results of your modifications. I have not gotten around to looking at these problems yet, though. Have you run a pre-16913 build and verified that these problems do not exist in those builds?
> This is a feature, not a bug. Your choices are "slow" or "not at all"

I consider the fact that the second pass inquires about messages that have
already been marked read during the first pass to be a bug.  If it were 
fixed, that would GREATLY mitigate the slowness of the second pass.
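
The saving being asked for here can be illustrated with a minimal sketch. This is not the mailnews implementation; `Message`, `first_pass`, `second_pass` and the `fetch_headers` callback are hypothetical names made up for the illustration, and the sketch assumes that a message marked read in the first pass can safely be left out of the second, which depends on how the filter actions interact.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    number: int
    subject: str
    sender: str
    read: bool = False
    extra_headers: dict = field(default_factory=dict)


def first_pass(messages, blocked_senders):
    """Cheap pass: uses only From/Subject data that is already on hand."""
    for msg in messages:
        if msg.sender in blocked_senders:
            msg.read = True


def second_pass(messages, fetch_headers, skip_read=True):
    """Expensive pass: one server round trip per message examined.

    fetch_headers(number) is a hypothetical callback standing in for a
    per-message header request. Returns how many requests were made.
    """
    fetched = 0
    for msg in messages:
        if skip_read and msg.read:
            continue  # already eliminated, so do not pay for its headers
        msg.extra_headers = fetch_headers(msg.number)
        fetched += 1
    return fetched
```

With 100 downloaded headers of which 90 are eliminated in the first pass, `second_pass` would make 10 requests with `skip_read=True` and 100 with `skip_read=False`.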
(In reply to comment #6)
> I consider the fact that the second pass inquires about messages that have
> already been marked read during the first pass to be a bug.  If it were 
> fixed, that would GREATLY mitigate the slowness of the second pass.

The filters should only be applied in one pass, after XOVER/XHDR or HEAD has been used. To not do it like this would break filters like Subject contains Foo and NNTP-Posting-Host contains 127.0.0.1. In addition, the only filters which could save from further inquiry would be the actions "Delete Header" and "Stop Execution" since the others might have other overriding features.

However, one possible improvement in this area would be to save the fact that XOVER/XHDR is broken in server information. IIRC, news would have to rediscover this fact every time it applied the filters.
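
A rough sketch of that improvement, under stated assumptions: `XhdrSupportCache` and its `probe` callback are hypothetical names, not part of the mailnews code, and the cache here lives only in memory, whereas the suggestion above is to store the result with the server information.

```python
class XhdrSupportCache:
    """Remembers, per server, whether XHDR answered for a given header."""

    def __init__(self, probe):
        # probe(host, header) -> bool is a hypothetical callable that would
        # issue one real XHDR query and report whether the server honored it.
        self._probe = probe
        self._known = {}

    def supports(self, host, header):
        # Header names are case-insensitive, so normalize the cache key.
        key = (host, header.lower())
        if key not in self._known:
            # Only the first question about this server/header pair
            # reaches the server; later ones are answered from memory.
            self._known[key] = self._probe(host, header)
        return self._known[key]
```

The point of the design is that the discovery cost is paid once per server and header rather than on every filter run.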
The problem I described above, where the news client begins to act as if
no new messages have arrived in various newsgroups for DAYS is now occurring
on news.mozilla.org as well as on comcast's news server.  

Viewing ALL messages in ALL threads, sorted by Date, ascending, unthreaded, 
shows NO new messages for days and days.  I now find it necessary to delete 
msf files every few days for the newsgroups for news.mozilla.org in order to 
get the news reader to discover new messages.  

After doing so, I see the most amazing strange set of behaviors.  Upon first
read of a newsgroup after having deleted its .msf file, a dialog tells me
that there are too many messages to download, and asks me to tell it how many
to download (as expected).  Then after that, the message counts are obviously
wrong, in both the folder pane and the message list pane.  Then I use the 
"rebuild message summaries" button, after which the message counts in the 
folder pane are correct, but the message counts in the message list pane 
are obviously wrong.  With the view set to view threads with unread messages,
sorted in order received, ascending, threaded, many threads appear in the 
list pane showing large unread counts, even though all the messages in those
threads are read.  Clicking on any message in one of those threads causes 
the counts for that thread to be corrected.  So, it is necessary to go 
through many threads one-by-one to get their counts corrected.  

All of this nonsense is necessary JUST to get the reader to start seeing new
messages from the server again, and I have to repeat it once or twice a week.
(In reply to comment #3)
> Nelson, can you try running with the build just before that
> checkin for a few days to see if that makes these problems go away?  

Please tell me the date for which I should pull the old installer.

Nelson, will you have a chance to try that older build soon so that we can see if there is truly a regression here?  We're trying to get our 3.0a1 blockers sorted out...
Assignee: nobody → Pidgeot18
[ Tracking information for me ]

> 1. news groups behave as if no new news is arriving, for weeks. 

Probably WFM on news.mozilla.org: my casual testing didn't show any problems.

> 2. summary files get corrupted

Ditto

> 3. Counts of unread messages in a thread sometimes exceed the total count 

Saw this just now, confirmed that it's a result of cache information on thread
meta-rows in the msf file. I'll have to look into the Rebuild Index code to be
sure, but I think this predates 16913. I'll split off a new bug if this is the
case. May be an expired message problem.

> 4. Time spent filtering messages when I first go to visit a group is 

Invalid, per earlier comment.

> 5. I have the groups set to show "threads with unread".  Before there were

Neither tested explicitly nor seen explicitly.
Joshua, I don't accept that my complaint about the enormous slow-down is 
"invalid".  It's a huge usability issue.  There is a way to implement the
added filtering that would have MUCH less of a problem with this than 
the present implementation.  The added filtering doesn't need to be so slow.

The problems with newsgroups acting as though no new news had arrived for
days or weeks at a time persists.  I've grown tired of deleting and 
rebuilding the msf files every 2-3 days to work around it.  

The pain of dealing with trunk mailnews is now sufficiently great that I 
find that I am simply not using trunk mailnews to read news much any more.
I have an old laptop on which I have a trunk build from last summer, and I
now much prefer to read news on it.  It just works.  I get all the news, and
it doesn't crash.  I don't get some of the new wizzy filtering, but that new
stuff is worthless to me if the most basic functionality of the reader is 
broken.  

The present behavior is seriously a regression compared to the build I have
from ~8 months ago.  It may or may not be entirely due to bug 16913, but it 
is surely a major set of regressions, a huge reduction in usability.  

This may sink SeaMonkey for me.  I want the newer trunk browser features, but 
I don't want the newer trunk mailnews regressions.  I may have to switch to 
FF(trunk) plus TB (old).  It's that or find a new mailnews reader altogether.

Dan,  I will try a build from 20080204 within 2 days.
I'm now running Gecko/2008020302 and I will try it for a few days.
Prior to reverting to this build, my news reader appeared to have not
seen ANY new messages in the newsgroup alt.online-service.comcast since 3/27.
The message filter log showed that the last message filtered was dated 3/27.
There were NO messages in the message list pane newer than 3/27 (viewing all
messages, all threads, sorted by date - of course).

So, I reverted to the older build, and deleted the .msf file, and changed the 
.rc file to reduce the value of the highest message number by a few hundred.
Upon starting the old mailnews client and visiting that newsgroup, I now see 
over 260 new news messages in that group, all since 4/1.  

I probably would have gotten similar results by merely removing the .msf file
and editing the .rc file, even if I had not reverted to the older build.  
The question now is: how long will it last before the news client begins to 
act as though all new message traffic in that newsgroup has ceased, again? 
With the very recent builds I had been using, it would only be 2-3 days.
Now, with this much older build, we'll see.  If I'm still getting new incoming
news messages on Friday, I will consider that conclusive evidence of a 
regression.
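
The .rc edit described above can be sketched as follows, assuming the file uses the conventional newsrc layout of one line per group, such as `group.name: 1-5000,5003`. `lower_high_mark` is a hypothetical helper written for this illustration and is not part of any Mozilla tool; back up the file and close the client before trying anything like it.

```python
def lower_high_mark(line, amount):
    """Return a newsrc line whose highest read article number is
    reduced by `amount`, clipping or dropping ranges above the new mark."""
    # Find the separator: ':' marks a subscribed group, '!' an unsubscribed one.
    for i, ch in enumerate(line):
        if ch in ":!":
            break
    else:
        raise ValueError("not a newsrc line: %r" % line)
    name, sep, rest = line[:i], line[i], line[i + 1:].strip()

    # Parse "1-100,105,110-120" into (low, high) pairs.
    ranges = []
    for part in rest.split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    if not ranges:
        return line

    new_high = max(max(hi for _, hi in ranges) - amount, 0)
    clipped = [(lo, min(hi, new_high)) for lo, hi in ranges if lo <= new_high]
    text = ",".join(
        "%d-%d" % (lo, hi) if hi > lo else str(lo) for lo, hi in clipped
    )
    return ("%s%s %s" % (name, sep, text)).rstrip()


print(lower_high_mark("alt.online-service.comcast: 1-5000", 300))
```

The article numbers in the example are made up; the real values depend on the server.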
Currently I'm undergoing a test that may cure problem #3 (it's bug 115202 proper, but the problem may be the same or at least have similar causes).

WRT problems #1 and #2, it is possible that I am not seeing them due to pref mismatches.

WRT problem #5, I think I know what the problem is: it's probably due to the fact that the message is added to the database before being marked as read.

Nelson, is there any possibility you could get on IRC (irc://irc.mozilla.org/maildev) in the afternoon EDT (roughly 4:30 PM EDT-10:30 PM EDT) sometime this week?
In reply to comment 11, on Monday I reverted to Gecko/2008020302 and I have
not seen ANY of the problems described in comment 0 since then.
Beginning Monday April 14, I have two profiles, 
one that I use exclusively with the older nightly build from February 3, and 
one that I use exclusively with the newer nightly builds (e.g. April 21).

Now, a week later, problems 1, 2 and 4 are all very manifest in the new 
profile, and not at all manifest in the older profile.  

Several newsgroups have not seen any new messages since April 15.  
With the group set to show all messages in all threads, sorted by date,
ascending, non-threaded, the newest message that appears is April 15 
in the affected groups.

Every day, when I try to read news in those groups, it tells me that some
large number of new messages have appeared, and offers to get 300 headers
(300 is the number I have configured).  When I let it go, it proceeds to 
slowly count the downloaded headers to 300 by 3's (the sequence is always
1, 4, 7, 10, 13 ...).  At the end, it says either "no new messages on server"
or "no messages matched filter".  And then it, again, shows the newest 
message in the group as being from April 15.  

I have filters and filter logs for each of the news groups, and also a 
server-wide set of filters and a filter log for them.  The logs show no
messages since April 15 for the affected groups.  The server-wide log 
shows messages for the unaffected groups, but none for the affected groups.

It sure is maddening and disappointing to spend all that time, waiting 
while it downloads 300 headers, to be left with the result that it found
NOTHING in those 300 headers; no messages filtered out, no messages accepted
and displayed.  

I've been seeing this pattern with nightly builds since February, on a 
giganews server, one of the nation's most widely subscribed servers.  
Nelson, could you post an NNTP log of when it is failing to get new messages?
After discussion in IRC, I've requested that we backout the original patch for 3.0a1, since symptoms 1 and 2 are particularly painful.  That said, I'll wait to do this backout until we've got the rest of our blockers taken care of, on the off chance that some logging either at the NSPR or NNTP level leads us to a not-overly-risky patch.
Whiteboard: [needs backout or patch]
The patch from bug 16913 has been backed out.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Whiteboard: [needs backout or patch]
Flags: blocking-thunderbird3.0a1+
The last post on this was 29 April. It's now 13 June. I just downloaded the 12 June build (Windows, .exe file) from http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-trunk/

I find
a) In a News account I can't set up a filter on anything except Subject, From or Date. In particular I can't set up a filter on any custom header like "Delivered-To". The drop down list only contains Subject, From and Date; there is no Customize option.

b) I can't run *any* message rules on any news account; the option to run message rules is grayed out.
(In reply to comment #21)
> The last post on this was 29 April. It's now 13 June. I just downloaded the 12
> June build (Windows, .exe file) from
> http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-trunk/
> 
> I find
> a) In a News account I can't set up a filter on anything except Subject, From
> or Date. In particular I can't set up a filter on any custom header like
> "Delivered-To". The drop down list only contains Subject, From and Date; there
> is no Customize option.

Bug 16913 was backed out because of the regressions here; its reinsertion is currently pending on some regression information. Please read bugs fully, especially comment #20 on this one before responding.
First, the status of *this* bug is "RESOLVED FIXED". Post #20 was nearly two months ago. Sorry but I assumed that the fact the regressions problem was fixed meant that filtering on headers was available again but was achieved using something other than the 16913 patch. I didn't realise that post #20 meant we were back to the 1999 situation.

Second, the fact that the option to run rules is grayed out seems to me to mean that in fixing the regressions something else got broken and I wanted to record that in what I considered to be a not unreasonable place.
(In reply to comment #23)
> Second, the fact that the option to run rules is grayed out seems to me to mean
> that in fixing the regressions something else got broken and I wanted to record
> that in what I considered to be a not unreasonable place.

That's just gonna get lost here.  Please file a separate bug, if you would.

Blocks: 16913
Product: Core → MailNews Core