Open Bug 543956 Opened 14 years ago Updated 11 months ago

Always download All Headers, even for folders *not* set to offline mode

Categories

(MailNews Core :: Networking: IMAP, enhancement)

x86
All
enhancement

Tracking

(Not tracked)

People

(Reporter: tanstaafl, Unassigned)

References

(Blocks 1 open bug)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
Build Identifier: 3.0.1 final

Another simple one.

When a folder is clicked on for the first time, if it is *not* set to offline mode, *All* headers should be downloaded, not just 'Normal' headers.

Reproducible: Always

Steps to Reproduce:
1. Add a 'new' account that has lots of folders/messages
2. Disable offline mode for all folders
3. Click on a folder - only 'Normal' headers are downloaded, so filters that act on custom headers are ineffective
Charles, could you elaborate on how to tell whether all headers have been downloaded? I'm not entirely clear whether you're referring to messages or to the headers for each message, and it's not completely obvious how to verify the latter. (It might also be helpful to expand on how you're disabling offline mode; at any rate, it saves time.)
(In reply to comment #1)
> Charles, could you elaborate on how to tell whether all headers have been
> downloaded?

Actually, I'm going by memory of what was stated in another bug dealing with filters not acting on 'custom headers' because it doesn't download 'all headers', only the 'normal' ones...

So, maybe I'm wrong...

> I'm not entirely clear whether you're referring to messages or to
> the headers for each message,

Just the headers, yes.

> and it's not completely obvious how to verify the latter. (It might also be
> helpful to expand on how you're disabling offline mode; at any rate, it
> saves time.)

Unchecking the folder in offline settings. I also have autosync disabled, but I would still like the full headers for messages to be downloaded so that filters will always work on anything in the headers, whether custom or 'normal' (whatever the difference is).
Oh - here's the first page where I read about this:

http://turbulentsky.com/thunderbird-run-now-filters-fail-on.html
(In reply to comment #2)
> (In reply to comment #1)
> > Charles, could you elaborate on how to tell whether all headers have been
> > downloaded?
> 
> Actually, I'm going by memory of what was stated in another bug dealing with
> filters not acting on 'custom headers' because it doesn't download 'all
> headers', only the 'normal' ones...

found it - bug 184490...
Currently, the first fetch of mail information is done by the command below.
The headers obtained by it are:
  Standard headers (From, ..., Message-ID), and headers used by message filters
> UID fetch 74 (UID RFC822.SIZE FLAGS BODY.PEEK[HEADER.FIELDS (
>   From To Cc Bcc Subject Date Message-ID
>   Priority X-Priority References Newsgroups In-Reply-To Content-Type received
>  )])
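
For comparison, here is a minimal Python sketch of how the item list of that fetch could be built, and what an all-headers variant might look like. `fetch_items` and `DEFAULT_FIELDS` are hypothetical names used only for illustration, not Thunderbird code; `BODY.PEEK[HEADER]` is the standard IMAP section for the complete header block (RFC 3501).

```python
# Header fields named in the fetch command quoted above.
DEFAULT_FIELDS = [
    "From", "To", "Cc", "Bcc", "Subject", "Date", "Message-ID",
    "Priority", "X-Priority", "References", "Newsgroups",
    "In-Reply-To", "Content-Type", "received",
]


def fetch_items(all_headers, fields=DEFAULT_FIELDS):
    """Return the parenthesised item list for an IMAP UID FETCH (hypothetical helper)."""
    if all_headers:
        # BODY.PEEK[HEADER] returns the whole header block and, like any
        # PEEK section, does not mark the message as seen.
        section = "BODY.PEEK[HEADER]"
    else:
        # Only the listed fields are returned; custom headers are left out.
        section = "BODY.PEEK[HEADER.FIELDS (%s)]" % " ".join(fields)
    return "(UID RFC822.SIZE FLAGS %s)" % section


print(fetch_items(all_headers=False))
print(fetch_items(all_headers=True))
```

With Python's `imaplib`, the result could be passed as `conn.uid("FETCH", "74", fetch_items(True))`, assuming `conn` is an authenticated connection with a folder already selected.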

There are known issues around this. 	
(a) Non standard headers can be accessed only when;
(a-1) Whole mail data is held in offline-store
(a-2) Whole mail headers are held in offline-store or Disk Cache
=> After-the-fact filter is impossible, if offline use=off. Even if offline
   use=on, it's impossible for some mails, if offline-store size is limited.
=> Data of "Received" column is always same as "Date",
   unless message filter uses Received: header.
(b) Quirks for malformed mail don't work well.
(b-1) Tb's quirks for raw binary in header data don't work.
      (e.g. for non-encoded non-ascii Subject)
      If server returns converted data instead of the wrong raw binary to
      HEADER.FIELDS(Subject), Tb stores it in MsgDb as subject of mail.
> response to HEADER.FIELDS(Subject)
>   Subject: ???????? ????
> response to UID fetch 74 (UID RFC822.SIZE BODY.PEEK[])
>   Subject: Короткий текст
=> "???...???" (generated by server) is displayed at thread pane.
   Readable "Кор...кст" (by Tb's quirks) at message header pane.
   (data in msg header lines are used, instead of subject data in MsgDB)
    
I think the following new mode, in addition to the current auto-sync=on and auto-sync=off, is a solution to the above issues, as Charles Marcus says.
(0) Issue the following command first.
> UID fetch 74 (UID RFC822.SIZE FLAGS BODY.PEEK[HEADER.FIELDS(content-type)])
(1) When auto-sync=on and offline-use=on,
    fetch whole mail data (UID fetch 74 (UID RFC822.SIZE BODY.PEEK[])),
    and save in offline-store as currently done.
(2) When offline-store file use is not permitted by user,
    download to Disk Cache as currently done.
(3) When non-multipart and offline-store file use is permitted by user,
    issue UID fetch 74 (UID RFC822.SIZE BODY.PEEK[]), and save whole mail
    data in offline-store.
(4) When multipart and offline-store file use is permitted by user,
    fetch header portion only and save in offline-store.
    Any part data is saved as "this body part is downloaded on demand...".
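
The cases above can be read as a small decision function. The following Python sketch is only an illustration of that reading, not Thunderbird code: `Plan`, `choose_plan` and the `store_permitted` flag are hypothetical names, and it assumes the Content-Type from the preliminary fetch is already known.

```python
from enum import Enum


class Plan(Enum):
    FULL_TO_OFFLINE_STORE = "fetch BODY.PEEK[] and save whole mail in offline-store"
    DISK_CACHE = "download to Disk Cache as currently done"
    HEADERS_TO_OFFLINE_STORE = "save header portion only; parts on demand"


def choose_plan(auto_sync, offline_use, store_permitted, content_type):
    """Pick a download plan following cases (1) to (4) of the proposal."""
    # (1) Classic auto-sync with offline use: whole message, as today.
    if auto_sync and offline_use:
        return Plan.FULL_TO_OFFLINE_STORE
    # (2) The user does not allow the offline-store file at all.
    if not store_permitted:
        return Plan.DISK_CACHE
    # (4) Multipart mail: keep only the header portion locally.
    if content_type.strip().lower().startswith("multipart/"):
        return Plan.HEADERS_TO_OFFLINE_STORE
    # (3) Non-multipart mail: small enough to keep whole.
    return Plan.FULL_TO_OFFLINE_STORE


print(choose_plan(False, False, True, "multipart/mixed; boundary=abc"))
print(choose_plan(False, False, True, "text/plain"))
```

The order of the checks follows the numbering of the cases, so auto-sync with offline use wins over the other conditions.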

As very big mail data is usually produced by parts in multipart mail, offline-store file size won't become huge by (3) and (4). I believe very big text/plain mail and very big simple text/html mail is exceptional.
I think gain by "all headers is always held locally" is far greater than loss by "not-so-small additional disk space is required".

If above is extended for part of "multipart mail"/"a part in multipart mail",
  - If text/xxx part, save whole part data in offline-store.
  - If not text part, save only header portion of the part in offline-store.
it can also be a solution for request of "don't always download big attachment data".
It can be said "enhancement of download-on-demand" + "enhancement of auto-sync".
Current auto-sync is for local body search based on Global Indexer. So, "image data in image/xxx part is not downloaded locally" will not produce a big problem for body search.
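
A rough Python sketch of the per-part rule (text parts kept whole, other parts header-only), using the standard `email` package on a locally built sample message. `plan_parts` is a hypothetical helper for illustration; a real client would decide from the server's structure information rather than from a message it already holds.

```python
from email.message import EmailMessage


def plan_parts(msg):
    """List (content type, what to store) for each leaf part of a message."""
    plan = []
    for part in msg.walk():
        if part.is_multipart():
            # Containers such as multipart/mixed carry no data of their own.
            continue
        if part.get_content_maintype() == "text":
            plan.append((part.get_content_type(), "whole part"))
        else:
            plan.append((part.get_content_type(), "headers only"))
    return plan


# Sample message: a short text body plus one image attachment.
msg = EmailMessage()
msg["Subject"] = "report"
msg.set_content("See the attached image.")
msg.add_attachment(b"placeholder bytes", maintype="image",
                   subtype="png", filename="chart.png")

for content_type, action in plan_parts(msg):
    print(content_type, "->", action)
```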

Issues in above approach:
Per folder option like "offline use=on/off" of auto-sync is required.
If offline-store file size is limited by user, it's impossible to locally hold all header portion of all part of all mails.
Big text attachments is probably rare. But size limit of text part may be required to keep offline-store file not-so-huge. If so, enhancement of "Display Attachments Inline" will be required. (display in inline only when already downloaded.) It's required for image/jpeg,png,.. part too.
(In reply to comment #5)
> There are known issues around this.     
> (a) Non standard headers can be accessed only when;
> (a-1) Whole mail data is held in offline-store

Right - this is exactly what I'm talking about. Why limit the headers being downloaded? Just download them all, for every folder, every time, regardless of offline settings. Then if/when a folder is set to offline mode, only the message bodies/mime-parts need to be downloaded.

> (a-2) Whole mail headers are held in offline-store or Disk Cache
> => After-the-fact filter is impossible, if offline use=off. Even if offline
>    use=on, it's impossible for some mails, if offline-store size is limited.

Ok, I don't limit the size of my offline store, so had forgotten about that use case.

But, really, just how much space can just the headers consume? See below...

> => Data of "Received" column is always same as "Date",

Yes, good point - another annoying bug remedied by simply downloading all headers all the time.

> As very big mail data is usually produced by parts in multipart mail,
> offline-store file size won't become huge by (3) and (4). I believe very big
> text/plain mail and very big simple text/html mail is exceptional.
> I think gain by "all headers is always held locally" is far greater than loss
> by "not-so-small additional disk space is required".

Yes, yes - exactly my thinking...

> If above is extended for part of "multipart mail"/"a part in multipart mail",
>   - If text/xxx part, save whole part data in offline-store.
>   - If not text part, save only header portion of the part in offline-store.
> it can also be a solution for request of "don't always download big attachment
> data".

:)

> Issues in above approach:
> Per folder option like "offline use=on/off" of auto-sync is required.

Well, actually, my thoughts were to change the behavior - or at least provide an option that a mail admin can set via user.js - so that TB *always* downloads the full headers the first time a folder is clicked on (and for all new messages for that folder thereafter), regardless of offline settings/sync state. I mean, it has to download *some* of the headers when a folder is clicked on the first time it is accessed, so why not just download *all* of the headers?

Since, as you acknowledged, there isn't that much difference storage wise between 'Normal headers' and 'Full/All headers', it just makes sense to me to make this hard-coded default. But, as I said, a user pref would be enough to make me and a lot of heavy IMAP users really happy. :)

> If offline-store file size is limited by user,

Where is that setting? I can't find it in either Tools > Account Settings > Sync & Storage or Tools > Options > Advanced > Network and Disk Space.

> it's impossible to locally hold all header portion of all part of all mails.

Well... I would actually question this... has anyone ever benchmarked:

 a) how much disk space is consumed by the Partial headers (that TB downloads
    now) for X number of messages, and

 b) the difference in disk space consumed by these Partial headers vs All? 

I could be wrong, but I just did a quick test on one of our users' accounts with 7.6GB of mail - 3.6 GB in her Sent folder, and the rest spread out over about a hundred folders.

I added her account, disabled all offline & sync settings, defined a new location for her offline store, and then clicked on every single folder and let TB download its Partial Headers - or whatever it is that it downloads currently. Anyway...

After doing this, the entire offline folder for that account consumes a whopping 14MB.

So, the question is - how much more disk space would be consumed if *All* of the headers were downloaded? Even if it was *triple*, it would still be well under 100MB for an account with 10GB of mail.
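
As a back-of-the-envelope check of that estimate, here is a short Python calculation. It assumes the 14MB figure scales linearly with mailbox size and that full headers take three times the space; both are assumptions for illustration, not measurements.

```python
measured_mb = 14.0   # offline folder size observed for the test account
mailbox_gb = 7.6     # amount of mail in that account
growth = 3.0         # assumed factor for "all headers" vs. current headers
target_gb = 10.0     # mailbox size to extrapolate to

# Header storage per GB of mail, scaled to the target mailbox and growth factor.
per_gb_mb = measured_mb / mailbox_gb
estimate_mb = per_gb_mb * target_gb * growth

print(f"{estimate_mb:.1f} MB")
```

Under those assumptions the result is roughly 55 MB, which is consistent with the "well under 100MB" claim.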

Regardless, and assuming that the above difference is significant even in this day and age (of cheap, large hard drives), in my opinion this kind of 'corner case' should not prevent the rest of us from reaping the benefits that having access to the full headers would provide. Allow for it, sure, but don't bind everyone else down by its limitations.

Heck, just have the current 'Normal Headers Only' behavior be a hard-coded part of what happens when someone limits offline-store file size.

> Big text attachments is probably rare. But size limit of text part may be
> required to keep offline-store file not-so-huge. If so, enhancement of
> "Display Attachments Inline" will be required. (display in inline only
> when already downloaded.) It's required for image/jpeg,png,.. part too.

I just don't think the focus should be quite so much on the least-common-denominator, corner cases, like someone who is worried about consuming an extra 50MB or 100MB of space on a 1TB hard drive.
Tanstaafl has a reasonable point, which he is also making in bug 402594.
Status: UNCONFIRMED → NEW
Component: General → Networking: IMAP
Ever confirmed: true
Product: Thunderbird → MailNews Core
Had reason to review my list of feature requests... this one would sure go a long way to making Thunderbird a whole lot more IMAP friendly for those not using the default full offline mode for everything.
See Also: → 402594
Blocks: 184490
Severity: normal → S3