Closed Bug 93208 Opened 23 years ago Closed 23 years ago

Message mimeparts downloaded more than once (actually 28 times)

Categories

(MailNews Core :: Networking: IMAP, defect)

defect
Not set
critical

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.8

People

(Reporter: bugzilla, Assigned: Bienvenu)

References

Details

(Keywords: perf)

Attachments

(3 files, 5 obsolete files)

Perhaps I'm just stupid (I must be), but the mimeparts of a messages seems to be downloaded more than once. Actually there're downloaded 28 times. If it's correct it's a major performance problem. I seem to be able to reproduce this every time. Weird Weird. I'm weird. I'm attaching a FETCH log where I just select the one mail that I have in my INBOX. I'm also attaching the mail that generates the HUGE amounts of FETCH. You can check it out at my IMAP server. David: you have the login/password details. Otherwise mail me and I can send you the login details. There must be something that I just dont understand...! This cant be true...:)
Keywords: perf
Looks like Bugzilla Bug 22034 ... DUPs ?
still no comment on this one... major performace bug! in build 20010913 I'm getting 28 "FETCH (BODY[1]" lines in my IMAP log just loading the attached message!
I have no idea how to read such logs. naving, any idea if this is "real"?
hello... even my imap server says it getting this: UID fetch 1168 (BODY[1]) 28 times! loading the mail in Outlook Express takes around * 60 * second and these are the IMAP commands it sends: UID FETCH 1\3a* (BODY.PEEK[HEADER.FIELDS (References X-Ref X-Priority X-MSMail- Priority X-MSOESRec Newsgroups)] ENVELOPE RFC822.SIZE UID FLAGS INTERNALDATE) UID FETCH 1168 (BODY.PEEK[] UID) Mozilla (20011018) takes around * 510 * seconds!!!!!!!!!!!!!! It starts with sending UID fetch 1168 (BODY[HEADER] bla bla bla around 61 times then it starts fetching each part. Some part are fetched in 10240 chunks. Why does Mozilla fetch the message in soooo many fetches when OE can do only one? If you want you can test it on my imap server. Just mail me and I'll send you the username/password. But it's not server depending.
Keywords: mail1
bug 101480 seems to be about reply. This bug is about just reading the mail..
Here is what I found: 1. We already fetch all displayable inline body parts (i.e. all the .gif data) when we try to figure out what the msg is made up of (i.e. parse the body structure). 2. Then we generate the following URL's for each displayable inline body part: "imap://qatest20@nsmail-2:143/fetch%3EUID%3E/INBOX%3E1901?part=1.2" "imap://qatest20@nsmail-2:143/fetch%3EUID%3E/INBOX%3E1901?part=1.3" . . . and when each of the URL is executed it fetches ALL the displayable inline body parts again. David, in nsImapProtocol::ProcessSelectedStateURL() instead of calling: foundShell->Generate(NULL); which fetches all body parts, can we pass in the part number associated with the URL so that we only fetch the desired body part, something like the following? foundShell->Generate(partNumber); I have not tried it yes because I don't know the part number format to generate, but if it works then it will cut down the number of fetches from x*x to x*1 (not counting the fetches we did in step #1).
Cavin, in theory you should be able to do that, but I don't know for sure.
I'm sure I'm missing something but isn't the message with all of the inline parts in the memory cache? The idea is supposed to be that we fetch the message from the memory cache 28 times, lettimg mime extract the part it wants as we feed it into mime each time. Why are we hitting the network again to download inline parts we've already put in the memory cache?
Scott, that's what I thought at the beginning but then I figured out the body data was never cached. So if you reply to the msg with a .gif attachment we'll go and download the gif data from the server again. David, can you explain it again (I know only high level even though you've told me once)? In any case, we should do this fix so that we only process the part the URL is intended for.
Passing the part number associated with the URL to foundShell->Generate() seems to work well. On my machine the load time shrinks to 7.031 (from 52.104). So with the fix, we now download each of the inline mime part twice. But ideally we should be only downloading the parts only once. From what I can see it seems that we never cache the part/body data (let me know if I'm wrong), so we need another bug to keep track of this issue. If we can cache body data not only can we cut down the msg read time even more but also the msg reply speed will be improved. Right now when we reply to a msg with inline parts we download these parts again at reply time. I'll attach the patch later. Basically, I created a new method in nsImapUrl to return part number from an URL like the following: "imap://qatest20@nsmail-2:143/fetch%3EUID%3E/INBOX%3E1901?part=1.2" For the above URL the method will return "2" as the part number. Then I pass this par number to foundShell->Generate().
Scott and David, can you review the patch? Thanks.
you should use PRInt32 instead of naked ints. The code seems alright, other than that. It does worry me a little, so I'd need to spend more time with it. Scott, what do you think?
Moving to 0.9.8.
Target Milestone: --- → mozilla0.9.8
Reassigning to myself.
Assignee: mscott → cavin
Comment on attachment 60005 [details] [diff] [review] Patch to reduce number of part fetches. let's make imappart a nsXPIDLCString so we can remove the PR_FREE. numCharsToCopy should be a PRInt32. I've been running with this for a little while and haven't noticed anything yet. However it's certainly high risk code. Checking in at the beginning of 098 may be the best time to try it out.
I agree and 0.9.8 should give us enough time to test it out. I'll make the two changes before checking in the fix.
The fix I added was to set 'end' to the end of the url string when there's no more parameter after "part=" (ie, the url ends with "&part=1.2" or "?part=1.2"). Also make sure 'numCharsToCopy' is greater than 0 before allocating space for the part number. Dave and Scott, can you r and sr once again? Thanks.
OK, I've dug into this a bit and have found the core problem: libmime throws away the image parts we fetch inline when generating the message. In 4.x, it passed them onto libimg, but in 6.x, it just drops them. So the fix is to make sure these images are cached somehow. I initially thought we should put them in the image lib cache directly, but this won't work, I think, because it turns out imglib does not use the async features of the memory cache, and it needs to in this scenario. What will happen is that when we've generated the image part, imglib might turn around and request the image before we've finished generating the data for the image. So, the alternative is to stick the image in the imap mem cache, and then when imap gets the url to fetch the image part, get it from the cache. Because the imagelib request will be blocked until the first message url fetch is completed by the multiple connection protection code, we're guaranteed that we will have finished with the image by the time the url gets run (not that that's really required, because IMAP does use the async memory cache api's). So, after discussion with mscott, I'm going to add an attribute on mailnews urls that says whether they should cache image data. Only imap wil set this to true right now, in the case that we're not putting the message in the memory cache. libmime will get the uri/url and look for that flag, when handling image data. If set, it will write the decoded image data to the memory cache. The imap code url running code will be changed to check for parts in the memory cache.
I'm still working on this - it's complicated :-( It turns out that I can't just put the single mime part in the mem cache because libmime can't parse just a single part, and wants the whole message, from which it will extract the part. I could put a shell around the real part to satify libmime (the way we do when building up some of the parts), but that seems complicated. The alternative is to put things in the imglib cache, but that will block img lib while waiting for imap to build up the image. Maybe that's not the worst thing in the world, I'm not sure. If it blocks the whole UI, obviously we can't do it. I'm not sure what thread imglib is running on.
ok, thanks to an idea from Scott, I've got this working. Scott's idea was to set the content type on the imap mock channel if we had cached the part in the mem cache (using the meta data for the cache entry, which we set in the mime code). Then, we don't even need to go through libmime to parse the img, we can go directly to imglib. This makes it really fast, so it's pretty cool. I had to avoid calling SetupPartExtractorListener in the case where we're fetching an image from the mem cache, which meant I had to move that call to several places to avoid doing it when we're fetching from the mem cache. I'll attach two patches, one for libmime (JF, we will need your review for this at some point), and one for imap.
Attached patch diffs for mime (obsolete) — Splinter Review
need to add a dependency on the necko cache directory for windows and linux (I assume that we don't need to do this for the mac, right JF?). Make the mime img code put the image in the memory cache if the url has a cache session to use, and set the content type on the meta data for the cache entry.
with the two above patches, this should make this work with just one fetch from the server.
Cavin has found one scenario that's not working - click on the problem message, click on another message with inline images, and click back on the problem message - the second time the message gets loaded 28 times or so. The cache lookup of the images in the second time hit this code: mTryingToReadPart && access & nsICache::ACCESS_WRITE && !(access & nsICache::ACCESS_READ)) and doom the cache entry. If the second message you click on is a normal message w/o images, the second time you click on the problem message works fine. I'll debug it some more.
The problem Cavin saw looks like an unfortunate interaction with the memory cache and entries getting evicted out from under us. If I increase the memory cache size, my test case of clicking on two large messages with images works fine. But with the default cache size, what happens is that our entry gets evicted between the time we start filling it in and the time libimg asks for it (when the message is loaded for the second time). The problem is exacerbated by the fact that libimg is also caching the image in the memory cache, and the fact that we don't seem to always clean up after the first message load all the time.
the next level cause of the problem is that the urls that libimg runs to fetch each of the inline images are kept around for quite a while, which keeps the cache entry descriptors around, which means the images can't be forced out of the image cache, until, unfortunately, the next time we click on the message. I'm trying to figure out the lifetime of those urls, but it seems like that lifetime is much longer than it should be. The imgRequest object doesn't get destroyed until much later...I could try to clear out the cache entry when the url finishes running, for these part urls.
I think this mozilla imap's cache bug like 46233,when i open a mail,mozilla will cache this mail but not cache mail's attachment or image with another key, so when i save attachment or image,mozilla will reload it from imap server,then cache it,so now when i save the attachment or image again ,mozilla will load it from cache.
yes, this bug is closely related to bug 46233, in the sense that if we were to cache all the parts in the memory cache, then both bugs would be fixed. But that's a very difficult thing to do, and I'm not sure it's really the right thing to do in the general case. It's a complicated issue. Anyway, back to this more specific problem. I can't use the image lib cache at all, since it really caches img requests, which hold onto the data, instead of caching the data directly. I've fixed it so that the imap urls that are run to fetch individual image urls release their mem cache entries when they're finished running, but that's not sufficient for this one test case, because imglib is still holding onto all the images, which blows the memory cache limit by itself to such a great extent that when we go back to the problem message, we're still not below the cache threshhold. So I can either figure out some way to make imglib let go of its cache entries when the second message is loaded (which it should be doing anyway), or I can make the imap/mime code hold onto all the cache entries until the message is finished loading. Or, I can just decide that this is such a degenerate case that it's not worth bothering with.
Attached patch complete patchSplinter Review
This is the complete patch - we now hold onto the individual cache elements in the mailnews url, which guarantees they won't get freed until we've finished displaying the message. Cavin and Scott, if you could review, I'd appreciate it. There's still some more stuff we could do - in particular, saving the attachments from the attachment menu doesn't use the mem cache because the url there doesn't have the type and filename set, so we don't get a hit in the mem cache. I'll open a new bug for that.
Attachment #60005 - Attachment is obsolete: true
Attachment #62026 - Attachment is obsolete: true
Attachment #63585 - Attachment is obsolete: true
Attachment #63586 - Attachment is obsolete: true
Attachment #63587 - Attachment is obsolete: true
Comment on attachment 64186 [details] [diff] [review] complete patch sr=mscott Hey David, this looked pretty good to me. One question though, do you think the image cache entries should be stored via nsIMsgMailNewsUrl or possible pushed down into nsIImapUrl? Do you think any of the other protocols could one day leverage this information?
Attachment #64186 - Flags: superreview+
yes, other protocols could leverage this info, in particular, news, if for some reason we don't put the whole news message in the mem cache.
r=sspitzer four minor nits: 1) interCaps on the idl changes: CacheCacheEntry() -> cacheCacheEntry() RemoveCacheEntry() -> removeCacheEntry() SetImageCacheSessionForUrl() -> setImageCacheSessionForUrl() 2) + *imageCacheSession = m_imageCacheSession; + NS_IF_ADDREF(*imageCacheSession); dmose pointed out that you can do: NS_IF_ADDREF(*imageCacheSession = m_imageCacheSession); 3) kCImapService can you use the contract id, instead? 4) + nkcache \ don't forget about the mac.
from an aim conversation with bienvenu: >3) >kCImapService >can you use the contract id, instead? we don't have a #define for a good imap service contract id. I'll log a bug on me to clean that up, and remove all the static cids. >4) >+ nkcache \ >don't forget about the mac. I think you should be safe on the mac. famous last words. re-assign to david, as it is his fix.
Assignee: cavin → bienvenu
fix checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
*** Bug 22034 has been marked as a duplicate of this bug. ***
QA Contact: huang → trix
verified in trunk 2002042410
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: