Closed Bug 572974 Opened 14 years ago Closed 14 years ago

Partially downloaded mail data is mis-interpreted as complete mail data, if connection error occurs while downloading(IMAP, offline-use=off, mail.server.default.mime_parts_on_demand=false)

Categories

(MailNews Core :: Networking: IMAP, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(thunderbird3.1 .3-fixed)

RESOLVED FIXED
Thunderbird 3.3a1
Tracking Status
thunderbird3.1 --- .3-fixed

People

(Reporter: World, Assigned: Bienvenu)

References

Details

Attachments

(2 files)

[Build ID]
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a6pre) Gecko/20100617 Shredder/3.2a1pre

Partially downloded mail is mis-interpreted as good mail, if connection error occurs while downloading.
(IMAP, offline-use=off, mail.server.default.mime_parts_on_demand=false)

This is a test result of intentional test for bug 565852 and for some other problems like "attachment of previously viewed mail is shown".
Mail data: multipart/mixed 20MB mail. 3 PDF attachments. 
> Content-Type: multipart/mixed, boundary="bdy"
> --bdy
> Content-Type: multipart/text
> mail-body-text
> --bdy
> Content-Type: application/pdf
> pdf-1 data lines in base64 from 1 to N
                                     <== cut off at here.
                                         checked by save as .eml, view source 
> pdf-1 data lines in base64 from (N+1) to end
> --bdy
> Content-Type: application/pdf
> pdf-2 data lines in base64
> --bdy
> Content-Type: application/pdf
> pdf-3 data lines in base64
> --bdy--

[Steps to reproduce problem]
(0) offline-use=off, rebuld-index,
    mail.server.default.mime_parts_on_demand=false
    in order to force download of whole mail data.
(1) View a small text/plain mail
(2) click the 20 MB mail, after a while, subject of this mail is shown
    at header box, and mail body is shown at message pane,
    and progress bar of Downloading appears.
(3) Pull off LAN cable. Tb detects error
(4) Plugin LAN cable again.
    If other mail which is not downloaded yet is clcked, Tb tries to download
    it. So, Tb normally detected Online status.
(4) Click a small text/plain mail, then click the 20MB mail again.
    Only one attachment icon is shown in attachment pane.
(6) View Source, Save As shows partial mail data.
    => Loss of data in a part.
       Nothing is save by "Save As" of the attchment. (no file is created)  
    => Loss of not-downloaded-yet parts.
Work Offline->Work Online and Restart of Tb didn't help. Partial data in Disk Cache looks mis-used.

Note: 
"attachment of previously viewed mail is shown" can be observed by next.
- View the 20MB multipart/mixed mail-1 (three PDF attachments).
  => Downloading of whole mail data starts.
- After subject and mail body body of mail-1 is shown
  click small text/plain mail-2
  => Attachment pane is not cleared, even though text/plain mail.
     First pdf attachment icon of mail-1 is shown in attachment pane.
"Save As of attachment does nothing" part looks bug 572065.
Following is mail data in Disk Cache for 20MB mail(3 PDF attachments), with offline-use=on, with mail.server.default.mime_parts_on_demand=true.

(1) Clear Disk Cache, (2) Rebuild-index, (3) set offline-use=on, (4) While auto-sync is downloading big 2 mails(12MB and 20MB), click the the big two mail emails alternatively. (5) auto-sync is interfered and halted. (6) Clear Disk Cache, (7) Click th 20MB mail => quickly shown, as "Download on demmand".
After that, following data was held in Disk Cache.
> Subject: big-mail-001
> Content-Type: multipart/mixed; boundary="------------070106050700040503000905"
> 
> --------------070106050700040503000905
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> Content-Transfer-Encoding: 7bit
> 
> big-mail-001
> --------------070106050700040503000905
> X-Mozilla-IMAP-Part: 2
> Content-Type: application/pdf; name="Pharo-PBE1-2009-10-28.pdf"
> Content-Transfer-Encoding: base64
> Content-Disposition: attachment; filename="Pharo-PBE1-2009-10-28.pdf"
> 
> This body part will be downloaded on demand.
> --------------070106050700040503000905
> X-Mozilla-IMAP-Part: 3
> Content-Type: application/pdf;
>  name="sunbelt_kerio_personal_firewall_user_guide.pdf"
> Content-Transfer-Encoding: base64
> Content-Disposition: attachment;
>  filename="sunbelt_kerio_personal_firewall_user_guide.pdf"
> 
> This body part will be downloaded on demand.
> --------------070106050700040503000905
> X-Mozilla-IMAP-Part: 4
> Content-Type: application/pdf; name="na_p2psec_guide0901r2.pdf"
> Content-Transfer-Encoding: base64
> Content-Disposition: attachment; filename="na_p2psec_guide0901r2.pdf"
> 
> This body part will be downloaded on demand.
> --------------070106050700040503000905--

If conection error happens around writing of this data, and if similar problem to bug 572065 happens(Tb considers partially downloaded mail dada as "completed mail data"), "Save As" may generate 27 bytes file, and "This body part will be downloaded on demand" can be shown by "View Source".
Summary: Partially downloded mail is mis-interpreted as completed download mail, if connection error occurs while downloading(IMAP, offline-use=off, mail.server.default.mime_parts_on_demand=false) → Partially downloded mail data is mis-interpreted as complete mail data, if connection error occurs while downloading(IMAP, offline-use=off, mail.server.default.mime_parts_on_demand=false)
Summary: Partially downloded mail data is mis-interpreted as complete mail data, if connection error occurs while downloading(IMAP, offline-use=off, mail.server.default.mime_parts_on_demand=false) → Partially downloaded mail data is mis-interpreted as complete mail data, if connection error occurs while downloading(IMAP, offline-use=off, mail.server.default.mime_parts_on_demand=false)
Blocks: 570914
Blocks: 570804
No longer blocks: 570804
Some of our users are still experiencing this on 3.1.1, but it is vastly better than 3.0, so difficult if not impossible to reproduce (sadly it *never* happens on my PC)...

If there is anything specific I can do (install a debug version?) to help troubleshoot this, please let me know...
Depends on: 565852
I was able to reproduce a partial message getting into the disk/memory cache by disconnecting the network connection and then shutting down the app while downloading an extremely large message (a scenario not previously described, afaict).

The way the disk/memory cache works in general is that we start downloading a message, and get a cache entry for that message. If we get an error fetching the message, we remove the entry from the cache, by calling DoomCacheEntry. This happens in two places - nsImapMockChannel::Cancel, and nsImapProtocol::LoadCurrentUrl, if there was an error running the url. Generally, if you click on a message, and it spins because you pulled out the network cable, and you click on a different message, nsImapMockChannel::Cancel will get called and the cache entry will get removed.  For some reason, shutting down doesn't cancel the mock channel, or get to the error handling code in nsImapProtocol::LoadCurrentUrl. I'll try to figure out why that is.

The other approach would be to only mark the cache entry as valid when we've successfully finished the URL, essentially assuming failure until we get success.

Re the "this body part will be downloaded on demand" aspect of this issue, that's handled by calling nsIMAPProtocol::SetContentModified(IMAP_CONTENT_NOT_MODIFIED) only in the case that we're fetching the whole message, which ultimately sets a metadata element on the cacheEntry. Otherwise, the metadata says the content is modified, and we won't use it later on. So I don't know what's going on there.
(In reply to comment #4)
> Generally, if you click on a message, and it spins because you pulled out the
> network cable, and you click on a different message, nsImapMockChannel::Cancel
> will get called and the cache entry will get removed.  For some reason,
> shutting down doesn't cancel the mock channel, or get to the error handling
> code in nsImapProtocol::LoadCurrentUrl. I'll try to figure out why that is.

It may be caused by next sequence in my test.
> (3) Pull off LAN cable. Tb detects error
> (4) Plugin LAN cable again.
> (5) Click a small text/plain mail, then click the 20MB mail again.
As my test was emulation of temporary error in other bugs(due to slow line, sleep/wakeup, ...), I clicked other mail after step (4) and Tb returned to Online again, instead of before step (4).
When I pulled the cable, I never got an error (though I may not have waited for a full timeout). Clicking on an other message usually invokes nsImapMockChannel::Cancel, which correctly discards the previous cache entry.
As bug 565852 has been fixed by you, I checked this bug(offline-use=off, mail.server.default.mime_parts_on_demand=false to force full mail download) and Bug 570914(offline-use=on) again, and I could observe Bug 570914 without pulling off of LAN cable, only with clicking of mails while Tb is trying to auto-sync of mail data.
(Build ID)
> Mozilla/5.0 (Windows NT 5.1; rv:2.0b4pre) Gecko/20100806 Shredder/3.2a1pre

(0) Same mail folder, mail data as Bug 570914.
      mail-1 : text/plain
      mail-2 : multipart/mixed, text/plain + one   application/pdf, 12MB
      mail-3 : multipart/mixed, text/plain + three application/pdf, 20MB
    offline-use=on, Clear chace
(1) Folder Properties, Repair Folder => auto-sync starts
(2) mail-1 was downloaded to offline-store.
    while auto-sync is downloading mail-2,
    (Activity Manager shows "synching 2 of 3")
    interfere auto-sync by clicking of mail-1, mail-2, mail-3 frequently.
(3) During interfering by mail clicking, Activity Manager showed
    "synching 3 of 3". offline-store file size = around 30 MB.
(4) At this stage, I noticed that size of mail-2 at thread pane is 0.1KB.
(5) Compact folder => offline-store file size changed to 20MB.
(6) Save PDf attachment of mail-2 => 27 bytes file
    (Bug 570914 was reproduced)
(7) Restart Tb, size of mail-2 is still 0.1KB.
(8) View message source. Following is shown for PDF attachment.
> --------------060901010509010104000400
> X-Mozilla-IMAP-Part: 2
> Content-Type: application/pdf; name="Pharo-PBE1-2009-10-28.pdf"
> Content-Transfer-Encoding: base64
> Content-Disposition: attachment; filename="Pharo-PBE1-2009-10-28.pdf"
>
> This body part will be downloaded on demand.
> --------------060901010509010104000400--

As "X-Mozilla-IMAP-Part: 2" is seen, partial data in Disk Cache looks to be used for storing mail data to offline-store file.
- auto-sync of mail-2 was interfered by mail click of mail-2
- mail-2 is downloaded to Disk Cache(or Memory Cache) with multipart-on-demand
- By mail clicks, download of mail-2 with multipart-on-demand was inteferred,
  and this bug happend
- After that, auto-sync of mail-2 used partial data in Cache.
You have to be very careful about looking at the disk cache and drawing conclusions based on its contents - unless the meta data says that the message has full content, we won't use the contents of the disk cache.
(In reply to comment #9)
> unless the meta data says that the message  has full content, we won't use the contents of the disk cache.

It's merely my guess. But this bug is for prolem of status of "full content" even though data itself is partial, isn't it? Does auto-sync use multi-part-on demand(FETCH BODY[n])? If not, and if auto-sync always uses FETCH BODY[], why can "X-Mozilla-IMAP-Part: 2" and "This body part will be downloaded on demand" appear in offline-store file, even after Compact, even after restart?
(In reply to comment #10)
autosync always fetches the whole message, usually several messages at a time, unless the messages are quite large. If the message is large, it will fetch a single message, and potentially go through the code that gets the message from the memory/disk cache.  Maybe it somehow ignores the fact that the message is only partial in the disk cache...I realize that somehow these partial messages are getting into the offline store, but I rather suspect they're getting written there directly. I'll try to investigate.
I was able to recreate the issue described in comment 7, but what I saw was that the partial messages were only in the offline store, not the disk cache (the disk cache was empty). This is consistent with the fact that we look in the offline store first, and we don't download from the disk/memory caches into the offline store via autosync - there's a special check that says if we're downloading for offline, don't look in the disk/memory cache.

Because we don't run two imap connections to a folder at the same time, loading a message while autosync is downloading a large message in theory has to wait for the download to finish. But it's possible that we pseudo-interrupt it, or interrupt the load group. Or, if we queue the url, the playing back of the queued url right after the autosync download finished might cause some confusion. I'll keep investigating.
I think the issue is the latter of the two I mentioned earlier - because we're queuing the url that loads the partly downloaded message, it seems we don't have a channel listener when we eventually run the url, and we fall into the ParseAdoptedMessage part of nsImapProtocol::PostLineDownLoadEvent, which ends up writing the line to the offline store. But I'm not quite sure how to fix it. I would have thought we would use the same channel listener as we would if we weren't running the queued url, and I'll see if I can do that, and if it fixes it.
Attached patch proposed fixSplinter Review
This fixes the issue I was able to reproduce with WADA's steps of waiting for autosync to kick in on a large message and then loading different messages. It basically trusts the url's StoreResultsOffline setting all the time, and makes sure it's set when doing a FolderMsgDump.

This passes all the xpcshell tests locally (FWIW). I need to figure out how to write an xpcshell test for this, since mozmill can't do IMAP.
Assignee: nobody → bienvenu
I'm having a real hard time writing an xpcshell test that fails w/o this patch. I'll give it one more day...
Comment on attachment 465811 [details] [diff] [review]
proposed fix

ran into a brick wall with the xpcshell test, which is that I need a docShell, and docShell's relevant load methods are [noscript], so that was pretty much a waste of a couple days.
Attachment #465811 - Flags: review?(bugzilla)
David, can phenomenon like bug 587528 occur by this bug?
If so, ignoring of RFC822.SIZE is better per account option...
(In reply to comment #17)
> David, can phenomenon like bug 587528 occur by this bug?
I don't think so. That looks like a corrupted offline store, but corrupted in the sense that we don't know where the beginning of the message is, not that we wrote a partial message and thought it was the whole message.
Original of this bug(comment #0, offline-store=off) is real misinterpretation of PARTIAL data(status of not-fully-downloaded-yet is lost?).
But in problem of comment #8(same as Bug 570914), data is not really partial, because successfully downloaded data by "fetch body.structure" and "fetch body[1]"(multipart on demand) is written to ofline-store by auto-sync.
This case looks for me next:
  (auto-synch is interefered by mail click)
  While auto-sync is executing "uid xxx fetch body[]", mail click invokes
  "uid xxx fetch body[1]", and auto-sync uses data by "uid xxx fetch body[1]",
  where xxx is same. As RFC822.SIZE value is not used to avoid problems
  due to wrong RC822.SIZE, mail is not re-downloaded forever.
Is your patch for problem of comment #8 too? Or for problem of comment #0 only? 

I guess bug 587528's phenomenon is like next.
  (fetch by mail click is interfered by auto-sync)
  New mail alert is shown by STATUS, and user clicks a new mail, and
  "uid xxx fetch body[]"(simple text/plain mail) is issued. While downloading,
  auto-sync invokes "uid xxx fetch body[]". Auto-sync uses last chunk
  by previous download and fetch response to "uid xxx fetch body[]" by myself.
  As RFC822.SIZE value is not used, mail is not re-downloaded.
Can problem like above guess occur?
autosync won't use the data retrieved by uid xxx fetch body[1]. In fact, autosync won't use the disk or memory cache at all.

WADA, I'm hoping to be able to land this patch on the trunk in the next few days, and then you could see what issues you can still reproduce. I'd generate you a try server build with the patch, but I can't do SeaMonkey try server builds.
Comment on attachment 465811 [details] [diff] [review]
proposed fix

I think we want this for 3.1.3
Attachment #465811 - Flags: approval-thunderbird3.1.3?
Comment on attachment 465811 [details] [diff] [review]
proposed fix

Ok, lets try this, we'll keep a careful eye out during the baking period just in case.
Attachment #465811 - Flags: review?(bugzilla)
Attachment #465811 - Flags: review+
Attachment #465811 - Flags: approval-thunderbird3.1.3?
Attachment #465811 - Flags: approval-thunderbird3.1.3+
fixed on trunk and 1.9.2 branch. 

WADA-san, if you want to try a nightly SeaMonkey build from tomorrow or later, and see what issues you can still reproduce, that would be very helpful. The one issue I believe is still open is that pulling the network cable while downloading a message to the disk cache can result in a partial message ending up in the disk cache. I haven't had a chance to debug that any further, but I felt that this issue was a bit more important because it could be reproduced without any network flakiness. I don't think messages from the disk cache can end up in the offline store, but I could be wrong, of course.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 3.2a1
wada, 
can you confirm this is fixed? 
and see comment 23.
(In reply to comment #24)
> wada, can you confirm this is fixed? 

No, I can't say "fixed".

I can say "problem exists", if problem occurs.
I can't say "problem is resolved" when I don't see problem, because I don't know solid peoblem re-creation procedure of this bug.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: