Closed Bug 217271 Opened 21 years ago Closed 13 years ago

IMAP and excessive bandwidth via getting all message flags

Categories

(MailNews Core :: Networking: IMAP, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jwilliams, Assigned: Bienvenu)

Details

(Keywords: perf)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030529

Several months ago I switched to Mozilla 1.4 on my win 98 pc at home.  I connect
to a secure IMAP server to grab my mail.

Last month I received an enormous excess usage charge from my ISP.  While
examining the source of the issue with my ISP, I ran an IP traffic meter while
doing various networking tasks.  Imagine my horror to find that every time I
start mozilla mail/news, or press "Get new messages", mozilla sucks about 5 megs
down my ADSL connection...

I have some folders with several thousand messages, but the IMAP protocol is
smart enough to only download new headers, and to only load messages when I
select them and so on.  I looked in my mozilla prefs to see if there was
anything obvious - didn't see anything there.


I checked with (gasp) Outlook express, and upon start up it only downloads a
tiny tiny fraction, pulling only the new headers, and only messages when
explicitly requested.

So, what's going on here?  Is Mozilla failing to recognise that it's already
cached these thousands of message headers, and is downloading the full headers
every time it checks for new messages?

Whatever the cause, I've had to drop Mozilla at home because I can't afford to
pay an extra $50/month just for the privilege of using a better email client!

I hope this isn't a duplicate, I looked and couldn't find any bandwidth related
bug reports for MailNews.




Reproducible: Always

Steps to Reproduce:
1. Start an IP logger, such as TrafMeter
2. Open Mozilla mail with an IMAP connection
3. Watch the ip traffic soar as mozilla sucks every header, rather than just new
headers
4. Cry as you pay $50 to ISP for excess usage charges
Actual Results:  
Bandwidth usage suggests that mozilla is re-loading every single message header.


Expected Results:  
Recognise that 99.999% of headers are already there, and only download the new
ones.  Outlook Express (wash my mouth out) does this quite happily.
we *do* only download the new headers. However, we download the flags for all
the messages in case the folder has been accessed by another e-mail client, or
from another machine, the first time you connect to a folder. There is a bug out
there to reduce the bandwidth for things like that. We shouldn't be downloading
5MB every time you press get new mail, but the first time, we will download the
flags for all the messages. If you have the junk mail controls turned on, we
will also download all the message bodies for new messages.
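
To get a feel for how much a full flag sync can cost, here is a rough back-of-the-envelope sketch. The response line format follows the usual IMAP untagged FETCH shape, but the folder count, the per-folder message count and the single \Seen flag are assumptions picked for illustration, not measurements from this report.

```python
# Rough estimate of the bytes needed to re-fetch flags for every message.
# All figures are illustrative assumptions, not measurements from this bug.

def flag_fetch_line(seq: int, uid: int, flags: str = r"\Seen") -> bytes:
    """Build one untagged FETCH response line as a server might send it."""
    return f"* {seq} FETCH (UID {uid} FLAGS ({flags}))\r\n".encode("ascii")


def estimate_flag_sync_bytes(messages_per_folder: int, folders: int) -> int:
    """Sum the response sizes for a flag fetch of every message in each folder."""
    per_folder = sum(
        len(flag_fetch_line(n, n)) for n in range(1, messages_per_folder + 1)
    )
    return per_folder * folders


if __name__ == "__main__":
    # Hypothetical mailbox: 30 folders of 3000 messages each.
    total = estimate_flag_sync_bytes(messages_per_folder=3000, folders=30)
    print(f"~{total / 1_000_000:.1f} MB per full flag sync")
```

Under those assumed figures a single pass over all folders already comes to a few megabytes, before any headers or bodies are fetched.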

If you generate a protocol log, you can see exactly what we're doing:

http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap
Status: UNCONFIRMED → NEW
Ever confirmed: true
OK, but this doesn't explain the behaviour I'm seeing.

I'll try a protocol log later tonight and see what I find.
Another thing, I have the pref 

mail.check_all_imap_folders_for_new

set to true.  However, that should just save me all the mouse clicks on each
folder to check for messages - the actual download behaviour for each one should
be unchanged.

Also, how do I interpret the protocol log?  It's a huge file. I just tried it
here at work (moz 1.4 on linux) and restarted the mail client, which produced a
whole bunch of output.
not sure what you mean by "save the mouse clicks" but
mail.check_all_imap_folders_for_new is definitely your problem. Try turning that
off. The way that's implemented is going to eat a tremendous amount of bandwidth.
Thanks I'll try it.

My understanding was that the check_all_folders option simply iterates through
the folder list, and checks for new mail in each folder.

How is this different from me simply clicking on each folder name in sequence?

It still doesn't explain [my impression of] the root problem, of headers being
downloaded every time...????
I don't believe headers are downloaded each time - do you see status messages
for the folders with a thousand messages saying "downloading hdr 1 of 1000, 100
of 1000, etc"?

Checking each folder for new messages is implemented a bit like you describe -
it's like you clicked on each folder. That's not the best way to implement that
(there's a bug on that) but it was easiest for the guy that did it. But what
happens is that you end up fetching the flags for every message in every folder
every time you do get new mail. We only cache five connections to the server. If
you have a lot of folders, that will quickly exhaust our connection cache, and
force us to re-use a connection, and sync up the flag state again.
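
The effect of a small connection cache can be sketched with a toy simulation. It assumes the cache behaves like a plain LRU keyed by folder name, which is a simplification for illustration and not necessarily how the real connection cache picks a connection to re-use; `count_reselects` is a hypothetical helper.

```python
from collections import OrderedDict


def count_reselects(folders, cache_size, rounds):
    """Count folder visits that need a fresh SELECT plus flag sync.

    Models the connection cache as a simple LRU keyed by folder name.
    This is an assumed policy for illustration, not Mozilla's actual code.
    """
    cache = OrderedDict()
    reselects = 0
    for _ in range(rounds):
        for folder in folders:
            if folder in cache:
                cache.move_to_end(folder)  # connection still has it selected
                continue
            reselects += 1  # must SELECT the folder and resync its flags
            cache[folder] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used
    return reselects


if __name__ == "__main__":
    many = [f"folder{i}" for i in range(30)]
    few = [f"folder{i}" for i in range(5)]
    print(count_reselects(many, cache_size=5, rounds=3))
    print(count_reselects(few, cache_size=5, rounds=3))
```

With more folders than cached connections, cycling through the folder list in order means every visit misses the cache, so every "get new mail" pass pays the full flag sync for every folder.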

We should use the IMAP status command on each folder to see if there are new
msgs in the folder from the last time we opened it, which would take a lot less
bandwidth. Do you have server side filters such that you're really getting new
messages in folders other than the inbox? If so, does it happen to all your
folders, or just some of them? If the latter, you can configure some of the
folders to be checked for new mail, by bringing up the properties dialog for the
folder and checking "check this folder for new messages" and turn off the
mail.check_all_imap_folders_for_new pref.
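
A minimal sketch of the STATUS idea: ask the server for a few counters per folder and only open the folder when they differ from what was cached. The parser below handles just the simple response shape shown in the example, and `parse_status` and `folder_needs_sync` are hypothetical helper names, not anything in the Mozilla code.

```python
import re

# Matches a simple untagged STATUS response, e.g.
#   * STATUS "lists/moz" (MESSAGES 231 UIDNEXT 44292 UIDVALIDITY 7)
STATUS_RE = re.compile(
    rb'\* STATUS (?P<mailbox>"[^"]*"|\S+) \((?P<items>[^)]*)\)'
)


def parse_status(line: bytes) -> dict:
    """Turn one STATUS response line into a {name: int} dictionary."""
    match = STATUS_RE.match(line)
    if match is None:
        raise ValueError(f"not a STATUS response: {line!r}")
    tokens = match.group("items").split()
    return {
        tokens[i].decode("ascii"): int(tokens[i + 1])
        for i in range(0, len(tokens) - 1, 2)
    }


def folder_needs_sync(cached: dict, current: dict) -> bool:
    """True if any counter we track differs from the cached value."""
    keys = ("UIDVALIDITY", "UIDNEXT", "MESSAGES")
    return any(cached.get(k) != current.get(k) for k in keys)
```

Usage would be along the lines of `folder_needs_sync(saved, parse_status(line))` once per folder. Note that these counters only reveal added or removed messages; a flag changed by another client would go unnoticed, which is the separate synchronization question discussed in this bug.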
>> I don't believe headers are downloaded each time - do you see status messages
>> for the folders with a thousand messages saying "downloading hdr 1 of 1000, 100
>> of 1000, etc"?

Yes, that's what I'm seeing.  It takes several minutes on startup before all the
downloading stops, things settle down and I can start actually reading my mail.

So, given those messages, the time it takes, and the bandwidth usage, it's
definitely sucking some serious data every time.  I'm not so popular that I get
that much email, it's all old stuff that has already been pulled previously.

And yes, I do have a large number of folders (~30 or so, one for each mailing
list I'm on and so on), so that would explain the flags stuff you talk about.
> We should use the IMAP status command on each folder to see if there are new
> msgs in the folder from the last time we opened it, which would take a lot 
> less bandwidth. 

Sounds like a great idea, and probably how IMAP intends clients to operate.

> Do you have server side filters such that you're really getting new
> messages in folders other than the inbox? If so, does it happen to all your
> folders, or just some of them? 

Yes, lots of them.  I use procmail server-side to dispatch my mail where it
belongs, otherwise I have to duplicate my  mozilla filters at work and at home,
and on the laptop, and ....

> If the latter, you can configure some of the
> folders to be checked for new mail, by bringing up the properties dialog for 
> the folder and checking "check this folder for new messages" and turn off the
> mail.check_all_imap_folders_for_new pref.

Yes I'll do that - I've also archived some of my mailing list folders to reduce
the amount of traffic in the short term.  However, a longer term solution like
that you mention (query imap server) would make more sense.

Problem: do you send a client ID or something for the server to track? 
Otherwise, if I pull the latest from my work computer, then go home and do the
same from my home PC, will the server realise that it's two different clients,
and respond accordingly?


is this for every folder, or just the inbox? And every time, it redownloads all
the headers, even the old ones? If that's the case, can you e-mail me two
protocol logs, that show opening the inbox and downloading all the headers, then
shutdown, save the log to a different file, and generate a second log showing
the same thing? The only reasons I can think we would download all the headers
again is if we thought UID validity had changed (which is the server's way of
telling us to throw away all our cached information), or if our database got
corrupted every time, somehow...

Re the client ID question, no, it's completely up to the client to sync with the
changes on the server. That's why we download all the flags for every message,
in case another client changed them (e.g., deleted a message).
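
The decision described above can be sketched roughly as follows. `FolderCache` and `plan_sync` are hypothetical names, and the command strings are only illustrative of the general approach, not a transcript of what Mozilla sends.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FolderCache:
    """What the client remembers about a folder between sessions."""
    uid_validity: Optional[int]  # None if the folder was never synced
    highest_uid: int             # highest UID whose header is cached


def plan_sync(cache: FolderCache, server_uid_validity: int) -> List[str]:
    """Return the fetches needed to bring the local cache up to date."""
    if cache.uid_validity != server_uid_validity or cache.highest_uid == 0:
        # Cache is unusable (or empty): everything must be fetched again.
        return ["UID FETCH 1:* (FLAGS BODY.PEEK[HEADER])"]
    return [
        # Flags for known messages, in case another client changed them.
        f"UID FETCH 1:{cache.highest_uid} (FLAGS)",
        # Headers only for messages that arrived since the last sync.
        f"UID FETCH {cache.highest_uid + 1}:* (FLAGS BODY.PEEK[HEADER])",
    ]
```

If the logs show the equivalent of the first branch on every startup, that would point at a changing UID validity value or a cache that is being thrown away each time.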
yes I think it's for every folder.

When I get home tonight I'll generate the logs as you request, and email the
output directly to you.

taking from Seth, changing component to IMAP.
Assignee: sspitzer → bienvenu
Component: Networking: MailNews General → Networking: IMAP
How about not fetching the flags for old messages at all if a certain amount of
time hasn't passed and Mozilla hasn't been restarted since the last fetch? Now
that it's bound to the connection state, I'm fetching flags all the time as my
uw-imapd decides to kick me out.

John, you can select the specific folders to be checked for new mail.
Right-click on a folder, select properties and check "Check this folder for new
messages".
Ere, we'd get into trouble if we didn't fetch flags in that case, because it's
quite possible that we were kicked off because some other program/client made a
change to the folder.
I'd like to see mozilla mail change to using imap status as the default for
checking for new messages combined with a synchronization option (per folder)
which would enable the old flag check method if people wanted it... (this is how
outlook express/2000 appear to deal with this issue)

It really does cause a lot of traffic. I work in IT support for 40+ thousand
users. Having all of them hit our mail server infrastructure, using cached
connections to make things fast rather than making the client efficient, seems
a little backwards to me.  The users in the thunderbird pilot group cause
significantly more server load than the eudora imap pilot group (and I want
thunderbird/mozilla to win).
Travis, are you talking about just checking the inbox for new messages, or all
folders for new messages? I think for just checking the inbox, issuing a NOOP
and downloading new headers is going to cause roughly the same amount of server
load as issuing a status and then downloading all the headers when user selects
the inbox. But if you're talking about the option to check all folders for new
messages, I agree completely.

Or are you really concerned about the fact that we cache open connections to the
INBOX? I guess for some servers issuing a STATUS command is more lightweight
than a SELECT...

Or, is it the full synchronization we do of flags on a per-folder basis? That's
somewhat orthogonal to the STATUS/SELECT issue, at least in terms of network
traffic. I agree that we would like to have an option to avoid that, for users
that only access their imap folders from a single machine...I had thought about
this more as a startup time issue, and that we would go fetch the flags in the
background to sync the folder, but you're suggesting that we do away with the
default synchronization completely...
Product: MailNews → Core
David, is this headed anywhere other than incomplete? Not sure what aspects of this may be covered in other bugs, but reporter appears to be gone, and no response from Travis.
QA Contact: grylchan → networking.imap
Product: Core → MailNews Core
bienvenu, I think you have a bug for the startup issue of comment 15. sound right?

Is there anything in comment 6 you want to keep this bug for?
Or any other reason to keep the bug?
Summary: Mail cacheing, IMAP and excessive bandwidth → IMAP and excessive bandwidth via getting all message flags
we use status and condstore now, which gets rid of a lot of bandwidth. So I don't see a need to keep this particular bug open.
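
For readers unfamiliar with CONDSTORE, here is a rough sketch of why it saves bandwidth: the client remembers the folder's highest mod-sequence and asks only for flags that changed since then. The helper names are hypothetical and the command strings follow the CHANGEDSINCE form from the CONDSTORE extension; this is an illustration of the idea, not the actual implementation.

```python
from typing import Optional


def needs_flag_sync(cached_modseq: Optional[int], server_modseq: int) -> bool:
    """Flags need refreshing only if the server's mod-sequence moved on."""
    return cached_modseq is None or server_modseq > cached_modseq


def flag_sync_command(highest_uid: int, cached_modseq: Optional[int]) -> str:
    """Build the flag fetch, restricted to changes when a modseq is cached."""
    if cached_modseq is None:
        # No saved state: fall back to fetching flags for every message.
        return f"UID FETCH 1:{highest_uid} (FLAGS)"
    return f"UID FETCH 1:{highest_uid} (FLAGS) (CHANGEDSINCE {cached_modseq})"
```

When nothing has changed, the restricted fetch returns no per-message lines at all, instead of one line per message in the folder.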
flip of the coin yields => WFM
Status: NEW → RESOLVED
Closed: 13 years ago
Keywords: perf
Resolution: --- → WORKSFORME