Open Bug 1443750 Opened 6 years ago Updated 2 years ago

MOZ_ASSERT(!aNew, "Logic error: Trying to read part from entire message which doesn't exist") sometimes triggered

Categories

(MailNews Core :: Networking: IMAP, defect)

defect

Tracking

(Not tracked)

People

(Reporter: jorgk-bmo, Unassigned)

References

Details

Attachments

(2 files)

+++ This bug was initially created as a clone of Bug #1430480 +++

While working on bug 1430480 we discovered that
      // We are here with the URI of the entire message which we know exists.
      MOZ_ASSERT(!aNew,
                 "Logic error: Trying to read part from entire message which doesn't exist");
at
https://dxr.mozilla.org/comm-central/rev/2edea7714121b2f25cc155b0592520ce1ebfa6fa/mailnews/imap/src/nsImapProtocol.cpp#9194
triggers at times. From bug 1430480 comment #10:

I was doing a copy of problem message with big attachment from server A to server B with tb copy menus. Seemed to work OK but when tried to open the attachment on server B, tb crashed.

The situation is this: To retrieve a message part ("open the attachment"), TB checks whether the entire message is in the cache and whether the part is in the cache. If only the message is in the cache, it executes:
https://dxr.mozilla.org/comm-central/rev/2edea7714121b2f25cc155b0592520ce1ebfa6fa/mailnews/imap/src/nsImapProtocol.cpp#9390

I think what happens is this:
We copy a large message to a new location. When it's not yet completely in the cache, we try to access a part. Querying the cache for the message first gives the reply "yes", already in the cache (but not yet complete), when accessing it, it we receive a different answer.

Honza, can this situation arise?
We write to the cache, while the write is in progress we call Exists(). This returns true, so we come to open the entry, but OnCacheEntryAvailable() tells us aNew==true, which appears to be a contradiction.

Note that we have:
https://dxr.mozilla.org/comm-central/rev/2edea7714121b2f25cc155b0592520ce1ebfa6fa/mailnews/imap/src/nsImapProtocol.cpp#9257
which should avoid such situations, so trying to open an entry which is still being written should be delayed.
Flags: needinfo?(honzab.moz)
Attached patch IMAP-cache-debug.patch — — Splinter Review
Here's my refreshed debug patch which applies to latest trunk. Gene, could you please take a look and see what the debug says in the case where the MOZ_ASSERT() triggers.
Hmm, I tried with a 363KB attachment and it didn't trigger the MOZ_ASSERT(). Without the patch from bug 1430480 the attachment was empty, with that patch it opened OK. What am I missing? Source and target folders are not set for offline use which should trigger caching.

Honza, please don't invest too much time in an answer here, let us clarify the issue first ;-)
Severity: critical → normal
(In reply to Jorg K (GMT+1) from comment #0)
> +++ This bug was initially created as a clone of Bug #1430480 +++
> 
> While working on bug 1430480 we discovered that
>       // We are here with the URI of the entire message which we know exists.
>       MOZ_ASSERT(!aNew,
>                  "Logic error: Trying to read part from entire message which
> doesn't exist");
> at
> https://dxr.mozilla.org/comm-central/rev/
> 2edea7714121b2f25cc155b0592520ce1ebfa6fa/mailnews/imap/src/nsImapProtocol.
> cpp#9194
> triggers at times. From bug 1430480 comment #10:
> 
> I was doing a copy of problem message with big attachment from server A to
> server B with tb copy menus. Seemed to work OK but when tried to open the
> attachment on server B, tb crashed.
> 
> The situation is this: To retrieve a message part ("open the attachment"),
> TB checks whether the entire message is in the cache and whether the part is
> in the cache. If only the message is in the cache, it executes:
> https://dxr.mozilla.org/comm-central/rev/
> 2edea7714121b2f25cc155b0592520ce1ebfa6fa/mailnews/imap/src/nsImapProtocol.
> cpp#9390
> 
> I think what happens is this:
> We copy a large message to a new location. When it's not yet completely in
> the cache, we try to access a part. Querying the cache for the message first
> gives the reply "yes", already in the cache (but not yet complete), when
> accessing it, it we receive a different answer.
> 
> Honza, can this situation arise?
> We write to the cache, while the write is in progress we call Exists().

Exists() on cacheStorage does two things:
- checks the memory hash tables, if there really is an actively open and strongly held entry or, a pooled entry (aka an entry being in a hashtable, in memory, recently used), it returns true
- if not, it goes to the cache index for disk backed entries

The cache index can tho be out of date and is therefor mostly considered only as a hint.  I don't remember why exactly we have ever added the Exists() API, but I'm usually discouraging people from using it or using only as the last measure.  A better design using only AsyncOpen is better and suggested.

> This
> returns true, so we come to open the entry, but OnCacheEntryAvailable()
> tells us aNew==true, which appears to be a contradiction.

If you don't hold a strong reference to the entry (nsICacheEntry object) being written up until the seconds consumer's OnCacheEntryAvailable call then it's possible, yes.

holding a strong ref and ensuring the entry being written is not AsyncDoom()'ed ensures the second consumers gets that same entry.

> 
> Note that we have:
> https://dxr.mozilla.org/comm-central/rev/
> 2edea7714121b2f25cc155b0592520ce1ebfa6fa/mailnews/imap/src/nsImapProtocol.
> cpp#9257
> which should avoid such situations, so trying to open an entry which is
> still being written should be delayed.

ah!  OK, but it can happen the write fails, the entry being written is AsyncDoom()'ed, etc.

it's NOT ensured you get _an_ entry at all after the OnCacheEntryCheck call!  

(also note that the cache API you are using is tailored for HTTP, with frecency and expiration tracking inside the cache backend and also automatic eviction of stuff over the cache maximum (smart) quota limit.)

catching a log with cache2:5 and some upper level module to track callers of the cache API would say what happens, but this is likely hard to repro?
Flags: needinfo?(honzab.moz)
Honza, thanks for the information.

(In reply to Honza Bambas (:mayhemer) from comment #3)
> I don't remember why exactly we have ever added the
> Exists() API, but I'm usually discouraging people from using it or using
> only as the last measure.  A better design using only AsyncOpen is better
> and suggested.
I believe you and Kent James had a chat on IRC four years ago and you added it for us, see bug 1021843 comment #17.
Gene, could you do a bit more research here. From Honza's answer I take that we can get "exists" feedback for an entry which is still in the process of being written and where the write eventually fails, most likely due to excessive size.

This matches your description of moving a large message and accessing it straight away in the new location.

So if this doesn't constitute a logic error, we should just add a comment and change the MOZ_ASSERT() to a warning and make sure it still works even if the message has to be retrieved again from the server.
You can also enlarge the limit per entry.

- disk entry: browser.cache.disk.max_entry_size in kilobytes.  default is 50MB.  
- memory cache has a limit too: browser.cache.memory.max_entry_size and default is 5MB.
Thanks, we're already running with entries increased to 25MB. That should cover "regular" messages.
(In reply to Jorg K (GMT+1) from comment #5)
> Gene, could you do a bit more research here. From Honza's answer I take that
> we can get "exists" feedback for an entry which is still in the process of
> being written and where the write eventually fails, most likely due to
> excessive size.
> 
> This matches your description of moving a large message and accessing it
> straight away in the new location.
> 
> So if this doesn't constitute a logic error, we should just add a comment
> and change the MOZ_ASSERT() to a warning and make sure it still works even
> if the message has to be retrieved again from the server.

I have been working quite a lot with the caching stuff. Been trying to minimize the amount of redundant network accesses when the data should be in cache and maybe made some improvement. The main issue that I see is that the url doesn't distinguish between accessing the top level (always displayed) content and the full email ("not modified"). Both are requested with the "entire message" style url.

Anyhow, tried testing for the crash again and haven't seen the "assert(!aNew)" failure. However, I am seeing another failure when I copy a huge message from server A to server B and quickly access it at server B. It is in the new chunking code. Sometimes when the "orphan" \n should be present it is detecting something else and triggering the assert. Comparing the point in the imap.log where the "split" is detected with the wireshark trace, I see a full and completely incorrect line in the imap.log instead of the \n. In the wireshark trace I only see the correct orphan \n. So it seems that tb is somehow returning a random line at times that is not really produced by the server.

I think this may be due to the complex structure of the message. The top level message contains another message (eml) as a part and that part has a large png attachement comprising the bulk of the 17M email. When I display with attachments not inline I click on the eml link that loads the png from imap. While the eml/png is loading I can also click on the now visible png link and that seems to be what causes the assert, usually. If I wait for the eml/png to finish loading and then click on png I haven't seen a problem, yet.

Ignoring the assert crash, this also illustrates what seems like redundant network accesses. When the eml link is clicked, the eml body text and the png attachment are downloaded from imap. Then when I click on the png link now visible, the png is downloaded again even though it should already be in cache. I haven't looked at what is happening at this level in detail but I think it involves several variation on the "part style" URL and not the "entire message" URL form.
(In reply to gene smith from comment #8)
> Anyhow, tried testing for the crash again and haven't seen the
> "assert(!aNew)" failure. However, I am seeing another failure when I copy a
> huge message from server A to server B and quickly access it at server B. It
> is in the new chunking code. Sometimes when the "orphan" \n should be
> present it is detecting something else and triggering the assert.
Hmm, so the new code now detects that condition? Or did that bug break something?
(In reply to Jorg K (GMT+1) from comment #9)
> (In reply to gene smith from comment #8)
> > Anyhow, tried testing for the crash again and haven't seen the
> > "assert(!aNew)" failure. However, I am seeing another failure when I copy a
> > huge message from server A to server B and quickly access it at server B. It
> > is in the new chunking code. Sometimes when the "orphan" \n should be
> > present it is detecting something else and triggering the assert.
> Hmm, so the new code now detects that condition? Or did that bug break
> something?

It looks like this assert is detecting something that was always there. But not 100% sure yet and am continuing to test.

I just now saw something else. Looking at what happens, I concluded that the copy operation really has nothing to do with this. So I can do the same test with messages already on server B and just clear the cache and re-access the message/attachments in the same way as described above. When I did this the first time and click on the png attachment inside the inner eml, I see only this with the printfs:

eds: nsImapMockChannel::OpenCacheEntry |imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32?part=1.1.3&type=application/x-message-display&filename=postacert.eml/;section=1.3.2?part=1.1.3.1.2&filename=encre_0434.png|
eds: Calling AsyncOpenURI() on existing part, will read

An nothing more happen and no assert or crash. The png attachment is not openable. But more typically when it works OK I see this:

eds: nsImapMockChannel::OpenCacheEntry |imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32?part=1.1.3&type=application/x-message-display&filename=postacert.eml/;section=1.3.2?part=1.1.3.1.2&filename=encre_0434.png|
eds: Calling AsyncOpenURI() on entire message to get part
eds: nsImapMockChannel::OnCacheEntryAvailable, new=0
eds: nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=1
eds: |imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32|
eds: ===Entry is not complete and cannot be used
:

Apparently in the first case it thinks the png part exists but it doesn't so nothing more happens. In the second case (where I wait to click on the png until after the internal eml has completely downloaded) it checks to see if the png is in the "full message" and since it isn't, it writes a new cache entry for the png part and all is well.

Actually, all is well except the png is downloaded from the server 3 times. The first time when the internal eml attachment is accessed. Then it is downloaded again when I click on the png attachment to the eml. However, during this download and cache write this occurs:

eds: nsImapProtocol::PseudoInterruptMsgLoad - dooming cache entry

which causes the 3rd download to occur, finally successfully. I tried not doing this "dooming" operation and then only 2 downloads occur but the png is corrupted and displays only with a message "Meta data bad". I have no idea what this "pseudo interrupt" is about. 

I also see this dooming occur and cause double downloads for simple top-level attachments. However, for small png attachments the file is not corrupted when the dooming is commented out.

Just to be clear, the above observations are done with attachments as links and not displayed inline and no offline store.
I can't remember the details now, but under some conditions, IMAP decides to "pseudo interrupt" the transfer, so we need to remove an incomplete cache entry. I added that here: https://hg.mozilla.org/comm-central/rev/f1a70bab1511
Failure to do so will result in a corrupted attachment, as you saw.

"Calling AsyncOpenURI() on entire message to get part" is correct, but when the message is "not complete", that doesn't help. I don't think the PNG is actually downloaded on that occasion, only the body structure, that's why it decides that it's incomplete since you don't display attachments inline.

Now what has gone wrong here is the first part: "Calling AsyncOpenURI() on existing part, will read". So it determined that the part was in the cache, tried to read it, and in the end it wasn't. So somehow that part must have been fetched before.

Looking at the debug, did you use the patch I provided? The cache keys looks all wrong,
|imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32?part=1.1.3&type=application/x-message-display&filename=postacert.eml/;section=1.3.2?part=1.1.3.1.2&filename=encre_0434.png|
they should only contain part= and filename=, see:
https://dxr.mozilla.org/comm-central/rev/a8eecfe6de793af00e98cb1488515199c5fb73fb/mailnews/imap/src/nsImapProtocol.cpp#9313
How can the URL have two part queries?
The patch attached above only adds prints, right? Does it do something to "fix" url's? If so, I don't see it. I just manually merged it in and used my own prefix (eds:) since I had other small debug and printf changes in my file I wanted. Anyhow, even when it worked OK it is the same url with two ?part='s. I see this later:

eds: nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=0
eds: |imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32?part=1.1.3.1.2&filename=encre_0434.png|

so maybe it it "normalized" to that.
Ignore this part of comment 10:

> When I did this the
> first time and click on the png attachment inside the inner eml, I see only
> this with the printfs:
> 
> eds: nsImapMockChannel::OpenCacheEntry
> |imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32?part=1.1.
> 3&type=application/x-message-display&filename=postacert.eml/;section=1.3.
> 2?part=1.1.3.1.2&filename=encre_0434.png|
> eds: Calling AsyncOpenURI() on existing part, will read
> 
> An nothing more happen and no assert or crash. The png attachment is not
> openable.

When this happened it had already hit the assert here and was in the 300s attach debugger countdown:
https://dxr.mozilla.org/comm-central/source/mailnews/imap/src/nsImapServerResponseParser.cpp#3212
(In reply to gene smith from comment #13)
> Ignore this part of comment 10:
Good :-)

(In reply to gene smith from comment #12)
> |imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32?part=1.1.3.1.
> 2&filename=encre_0434.png|
> so maybe it it "normalized" to that.
Yes. The patch only adds debug.

However, I've never seen a URL like this:
|imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E32?part=1.1.3&type=application/x-message-display&filename=postacert.eml/;section=1.3.2?part=1.1.3.1.2&filename=encre_0434.png|
That's coming from this debug print, no?
+  /*-DEBUG-*/nsAutoCString urlSpec0;
+  /*-DEBUG-*/m_url->GetAsciiSpec(urlSpec0);
+  /*-DEBUG-*/printf("=== nsImapMockChannel::OpenCacheEntry |%s|\n", urlSpec0.get());
The path is the stuff after the last / so
  /;section=1.3.2?part=1.1.3.1.2&filename=encre_0434.png|
and that gets normalised to what we saw above.

I can't get over the URL though, that's from the test message in the chunking bug 1216951, right?
Oh, looking at the message, I can see it now. It has an eml attachment and in there is the image. I never tested with a MIME structure like that.
(In reply to Jorg K (GMT+1) from comment #14)
> 
> I can't get over the URL though, that's from the test message in the
> chunking bug 1216951, right?

Correct. In case you need it, I just copied it to your server: tbtest@jorgk.com/rubish 
It is subject POSTA CERTIFICA: test large attachment.

This email also has problems on the charter/openwave server when you don't have offline storage. I think charter is either returning a bad bodystructure or tb is misinterpreting the b/s charter returns, don't know which. Its OK when you have an offline storage file. But this is another problem...
This is just an observation but it seems like maybe there should be some way to take advantage of this:

With the POSTA CERTIFICATA email with no offline store and attachments not inline and cache cleared, when you click on the eml link (email inside the email) several long parts style URLs with postacert.eml at the end "run" but don't seems to cause any useful network activity. Then this URL varient occurs and causes the whole message to be fetched:

imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E39?part=1.1.3&type=application/x-message-display&filename=postacert.eml

and before the fetch actually occurs, the url/cache entry is marked "Not Modified". Then OpenCacheEntry() chain is triggered to process this url and it just sees it as a part and does not see that it actually contains the whole message. (The code assumes that only "entire message" non-part style URLs can contain the whole message.)

Now since the actual entire message has been downloaded, there should be no reason for additional network fetches to occur when the png and other parts are clicked. But since the downloaded data is never stored in the "entire message" style url/cache entry, it is never used so accesses to parts and the entire message do cause more downloads.

Then again, downloading the full message when eml link is clicked doesn't really make sense either. Don't know how it concludes that this should occur. But maybe since the eml link contains 99.9% of the whole message, maybe it does make sense.
Well, the caching logic is ignorant of message attachments. It only sees the "main" message and its parts. A part of the message attachment message is also a part of the main message, no?

I'm not sure what you mean by "eml link". Maybe you're referring to opening the .eml attachment with a double-click. The logic that determines which body parts to fetch is in nsIMAPBodyShell.cpp.

I believe that if you click on a message part, so the message attachment of the main message, that part is fetched entirely.

So given the current constraints, the following should happen:
When you double-click the message attachment message it is fetched fully.
When you double-click it again, it should be already cached.
When you double-click an (image) attachment of the attached message, it will be fetched repeatedly, since the
message attachment is cached (if at all) as a part of the main message, not a message in its own right.
With luck, the (image) attachment could be cached as part of the main message.
If however you access the attachment of the message attachment via the main message, it should be cached as a part.

Do I make sense?

Main message
  Part: Message Attachment <=== Only this will/should be cached.
        Part: Image

Different view:
  Part: Message Attachment
  Part: Image  <=== image interpreted as part of main message, could/should be cached.
I have to take part of this back. When you don't show attachments inline, then the (image) attachment of the message attachment message doesn't show as part of the main message, so you can't click it.

I briefly experimented with that message on an IMAP server and I could never open the image completely, it was always shown in half.
Preface: I wrote this before I saw your comments 18 and comment 19 above.

(In reply to gene smith from comment #8)
> However, I am seeing another failure when I copy a
> huge message from server A to server B and quickly access it at server B. It
> is in the new chunking code. Sometimes when the "orphan" \n should be
> present it is detecting something else and triggering the assert. Comparing
> the point in the imap.log where the "split" is detected with the wireshark
> trace, I see a full and completely incorrect line in the imap.log instead of
> the \n. In the wireshark trace I only see the correct orphan \n. So it seems
> that tb is somehow returning a random line at times that is not really
> produced by the server.
> 

Re: https://hg.mozilla.org/comm-central/rev/4868c63422f5#l1.151

Instead of the asseert, I am printf'ing the line data that is detected when there should only be a line consisting of \n. This occurs when I click on the eml link and then immediately click on the png link that appears. The eml link causes the whole message be downloaded in chunks as described in comment 17 above. When the png is clicked, just the complete png is downloaded by addressing the body part index (not in chunks). So both fetches are occurring interleaved at the same time. 

The printf'd line (that should just be a single \n) is a recent line from the png download. This line does not cause a problem since, as noted in comment 17, the eml download in chunks is never used and goes into the bit-bucket. Also, the line that is supposed to be just \n is discarded anyhow, so it should have no ill effect on the eml download even if it the download wasn't effectively ignored. Also, since the printf'd line is actually a line from the png download, it seems to still be used to construct the png file since there is no corruption when the png displays in the application that opens it.

So, I would recommend changing the MOZ_ASSERT to something milder or just removing it entirely.
(In reply to Jorg K (GMT+1) from comment #18)
> Well, the caching logic is ignorant of message attachments. It only sees the
> "main" message and its parts. A part of the message attachment message is
> also a part of the main message, no?

If I understand it right, you cache either the "entire message" or each single part. The "entire message" cache may contain only the "top level" message content or it maybe contains everthing: top level content and all the parts and/or attachments and only then it is considered "not modified".

> 
> I'm not sure what you mean by "eml link". Maybe you're referring to opening
> the .eml attachment with a double-click.

Yes, in the example email POSTA CERTIFICATA it is called postacert.eml. In the code, when not set to inline attachement, it refers to opening attachment with "links".

> The logic that determines which
> body parts to fetch is in nsIMAPBodyShell.cpp.
> 
> I believe that if you click on a message part, so the message attachment of
> the main message, that part is fetched entirely.

Yes, but when I click on the postacert.eml link it downloads/fetches the *entire* message, not just the postacert.eml content. And it never caches it. (WRONG, see below.)

> 
> So given the current constraints, the following should happen:
> When you double-click the message attachment message it is fetched fully.

Do you mean: When you double-click the message attachment, the message is fetched fully? That is what I see when I click the eml link. If I click the png link at the bottom of  the postacert.eml screen, the png file downloads and displays fine in the application and the png part is cached so a re-click fetches it from cache. If I close the postacert.eml window and click the eml link again, it fetches 3987 bytes from the postacert.eml cache and does not go to the network and does not download the entire message again like it did the first time I click on the eml link.

> When you double-click it again, it should be already cached.

Correct.

> When you double-click an (image) attachment of the attached message, it will
> be fetched repeatedly, since the
> message attachment is cached (if at all) as a part of the main message, not
> a message in its own right.
> With luck, the (image) attachment could be cached as part of the main
> message.
> If however you access the attachment of the message attachment via the main
> message, it should be cached as a part.
> 
> Do I make sense?

Not sure...

> 
> Main message
>   Part: Message Attachment <=== Only this will/should be cached.
>         Part: Image
> 
> Different view:
>   Part: Message Attachment
>   Part: Image  <=== image interpreted as part of main message, could/should
> be cached.

(In reply to Jorg K (GMT+1) from comment #19)
> I have to take part of this back. When you don't show attachments inline,
> then the (image) attachment of the message attachment message doesn't show
> as part of the main message, so you can't click it.

Yes, the png attachment appears as only a part of the inner postacert.eml and not on the top-level email. But it does appear as a normal attachment when attachments are displayed inline.

> 
> I briefly experimented with that message on an IMAP server and I could never
> open the image completely, it was always shown in half.

I just tried where I copied it to your tbtest@jorgk.com/Inbox/rubbish with no offline store and attachments not inline and the png opens fine for me. Just a bit slower than with my localhost connection on my dovecot server but I see the same network activity.

Ok, I see a little better what is happening. When I initially click the postacert.eml link, like I said, the whole message is downloaded and I think it actually gets stored in this url/cache entry that is marked "Not Modified":

imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E39?part=1.1.3&filename=postacert.eml

Then if I close the postacert.eml page and re-click the eml link, these 3 url's are passed to OpenCacheEntry() sequentially:

|imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E39?part=1.1.3&filename=postacert.eml|
|imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E39?part=1.1.3&type=message/rfc822&filename=postacert.eml|
|imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E39?part=1.1.3&type=application/x-message-display&filename=postacert.eml|

In each case, they are "normalised" to the 1st url and read from cache (3987 bytes, apparently the length of the postacert.eml header and top-level content) with 3 calls to the OpenCacheEntry() chain. So I am wrong about this fetch of entire message going to bit-bucket. (I think just one read from cache should be enough here.)

It seems like cache entry |imap://gene@localhost:130/fetch%3EUID%3E.INBOX.555%3E39?part=1.1.3&filename=postacert.eml| contains the whole message and is "not modified" so it could be used to read any part of the message from cache without going to the network. But not sure how, or if, you would detect this in the code.
Well, I'm probably wrong again. I don't think the cache actually contains the full message but really only 3987 bytes even though the whole email is fetched. I'm don't know how to tell how much is actually being written to cache, just know how much is "available" when it is read. Don't see where the 3987 is coming from and don't see it in wireshark trace or in the imap.log.

This attachment is the printf trace that occurs when I clear cache, select the POSTA email, click on the postcert.eml link and bring up the mostly blank page. Then without clicking on the png link, close the blank page. Then click the eml link again.

I have added some annotations showing what happens and when.

I have added some additional printfs that show when "modified" or "not modified" is set and also when CR/LR is split (7 times while downloading the whole message in chunks).
Where do I start.

I've opened the message attachment from the "tbtest" account and see: "This body part will be downloaded on demand." as the mail body. Opening the image attachment gave me a broken image (about 10%) on the first attempt, and a complete image on subsequent attempts, where the 3rd, 4th and further attempts are very fast, so some cashing must be going on there. I haven't run it with the debug patch. One idea would be to introduce a new logging flag, say, IMAPCache and convert the debug into logging. That would help tremendously.

You can also enable cache2:5 logging, I'm sure you know: https://wiki.mozilla.org/MailNews:Logging

As for:
Re: https://hg.mozilla.org/comm-central/rev/4868c63422f5#l1.151
What you describe there, the interleaved download of two things at the time with data from one affecting the other, seems terrible. Do you really think removing the assert and papering over the issue is the right approach?
(In reply to Jorg K (GMT+1) from comment #23)
> 
> I've opened the message attachment from the "tbtest" account and see: "This
> body part will be downloaded on demand." as the mail body. Opening the image
> attachment gave me a broken image (about 10%) on the first attempt, and a
> complete image on subsequent attempts, where the 3rd, 4th and further
> attempts are very fast, so some cashing must be going on there.

I see the same thing too when I download from tbtest. I see it when I click the png link while the full message download is still in progress (which also triggers the interleaving assert due to lack of \n line). I don't see it with my localhost based server (need to click the png faster?). When I tried tbtest before I waited longer, maybe until the full message was downloaded before clicking the png link.

> One idea would be to introduce a new logging
> flag, say, IMAPCache and convert the debug into logging. That would help
> tremendously.

That sounds like a good idea. But I have no problem with just the printfs right now.

> 
> You can also enable cache2:5 logging, I'm sure you know:
> https://wiki.mozilla.org/MailNews:Logging

Will give it a try.

> 
> As for:
> Re: https://hg.mozilla.org/comm-central/rev/4868c63422f5#l1.151
> What you describe there, the interleaved download of two things at the time
> with data from one affecting the other, seems terrible. Do you really think
> removing the assert and papering over the issue is the right approach?

Especially in light of this partial download of the png, probably not a good idea to ignore this error.
OK, I looked at the Italian message with the debug patch. We should construct a slightly smaller message since it's painful to wait for the loads over the internet.

I saw different behaviours, most likely depending on when the message attachment is clicked:

Cached case:
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.ru
bbish%3E3?part=1.1.3&filename=postacert.eml|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=3987
=== reading part, should use cache entry
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted

But I also saw:
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.ru
bbish%3E3?part=1.1.3&filename=postacert.eml|
=== nsImapMockChannel::OnCacheEntryAvailable: WRITE END
=== nsImapUrl::SetContentModified Modified View As Link
=== nsImapProtocol::PostLineDownLoadEvent: Written 3987 bytes

I also saw cases where the image was cached:
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.ru
bbish%3E3?part=1.1.3.1.2&filename=encre_0434.png|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=16712801
=== reading part, should use cache entry
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted

However, I also saw repeated downloads:
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.ru
bbish%3E3?part=1.1.3&filename=postacert.eml/;section=1.3.2?part=1.1.3.1.2|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=0
=== reading part, should use cache entry
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 0
=== Cache entry rejected
=== nsImapMockChannel::ReadFromMemCache: Returning 80004005
=== nsImapMockChannel::OnCacheEntryAvailable: READ not successful

So something isn't quite reproducible.

I had noticed this error here:
=== nsImapService::DecomposeImapURI |imap://tbtest%40jorgk%2Ecom@mail.your-serve
r.de:143/fetch%3EUID%3E.INBOX.rubbish%3E3?part=1.1.3&type=application/x-message-
display&filename=postacert.eml|
[952, Main Thread] ###!!! ASSERTION: URI is empty: '!aURI.IsEmpty()', file c:/mo
zilla-source/comm-central/mozilla/rdf/base/nsRDFService.cpp, line 808
[952, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070
057: file c:/mozilla-source/comm-central/mailnews/imap/src/nsImapService.cpp, li
ne 889
[952, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070
057: file c:/mozilla-source/comm-central/mailnews/imap/src/nsImapService.cpp, li
ne 858
[952, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070
057: file c:/mozilla-source/comm-central/mailnews/imap/src/nsImapService.cpp, li
ne 1148

which happens in nsImapService::DecomposeImapURI() when called from nsImapService::StreamMessage()

The URL has a %3E instead of a # so nsParseImapMessageURI() return an empty folder:
https://dxr.mozilla.org/comm-central/rev/a8eecfe6de793af00e98cb1488515199c5fb73fb/mailnews/imap/src/nsImapUtils.cpp#65

So I used
   int32_t keySeparator = MsgRFindChar(uriStr, '#', folderEnd);
+  if (keySeparator == -1)
+    keySeparator = uriStr.RFind("%3E");
   if(keySeparator != -1)
   {
Maybe that made a difference.
OK, I've placed a small test message into the folder AAA, in fact I imported it twice.

After the import which appears to cache the entire main message, opening the message attachment gives:

=== Calling AsyncOpenURI() on entire message to get part
=== nsImapMockChannel::OnCacheEntryAvailable, new=0
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=1
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|  <== INBOX.AAA%3E2
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=82121
=== reading full message, should use cache entry: 1
=== annotation: |Not Modified|
=== messageSize 82121
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted

Later, I can't reproduce the "get part from entire message=1" any more, I get the EML part cached separately, here shown on the first message:

=== Calling AsyncOpenURI() on existing part, will read
=== nsImapMockChannel::OnCacheEntryAvailable, new=0
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=0
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E1?part=1.2&filename=This%20is%20a%20message%20with%20an%20attached%20image.eml| <== INBOX.AAA%3E1
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=2635
=== reading part, should use cache entry
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted

Opening the image I saw:
=== Calling AsyncOpenURI() on entire message to get part
=== nsImapMockChannel::OnCacheEntryAvailable, new=0
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=1
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E1|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=82121
=== reading full message, should use cache entry: 1
=== annotation: |Not Modified|
=== messageSize 82121
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted

So caching on small things seems to work quite well.

Anyway, maybe an easier message to use to explore the basics before getting into the difficult message which has chunking issues.
Maybe we should understand how it should work first, to then understand why it doesn't work on larger messages.
Copied a shorter version with the same structure to your server. Has a much smaller png.
With your new short message I have to make sure there was no offline mbox file in use for folder AAA, then with cache cleared 

Open the message. See the top level message. Bodystructure, header and top level message get fetched and just the top level gets stored in the "entire message" cache entry.

Then I click the eml link and it tries to find this part in the "entire message" cache and doesn't. So I then see an imap fetch of the whole message occur. It is getting stored to the cache (aNew=1) but not sure which cache. I think the cache for the internal eml part actually receives this whole message download.

When I click on the jpg link, the jpg downloads again and gets stored in its cache part. (It was already downloaded as a part of the whole message fetch.)

Now when I exit this email and come back I see:

Top level is fetched again since it is not stored in "entire message" because it is "modified" (but actually it is stored there).
Click on eml link, gets it from eml part cache with no imap fetch.
Click on jpg link, gets it from jpg part cache with no imap fetch.

This is the same as I observer with the large Italian message if you ignore the chunking error.

I need to enable cache2 logging maybe to see some more details.
I think cach2 logging won't tell you any more. There is only one cache, but we used two different keys. If the key has a "?", it's for a part.

There seem to be different scenarios and I haven't worked out what's triggering them. One debug run I saw was this, but I tried to reproduce it and couldn't :-(

First access:

=== nsImapMockChannel::OpenCacheEntry |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== Calling AsyncOpenURI() on entire message
=== nsImapMockChannel::OnCacheEntryAvailable, new=1
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=0
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== nsImapMockChannel::OnCacheEntryAvailable: WRITE END
=== nsImapUrl::SetContentModified Modified View As Link <====
=== nsImapProtocol::PostLineDownLoadEvent: Written 2635 bytes <==== Incomplete

Second access:

=== nsImapMockChannel::OpenCacheEntry |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== Calling AsyncOpenURI() on entire message
=== nsImapMockChannel::OnCacheEntryAvailable, new=0
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=0
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=2635
=== reading full message, should use cache entry: 0
=== annotation: |Modified View As Link|
=== End separate processing: should use cache entry: 0
=== After header check: should use cache entry: 0
=== Cache entry rejected
=== nsImapMockChannel::ReadFromMemCache: Returning 80004005
=== nsImapMockChannel::OnCacheEntryAvailable: READ not successful <==== Rejected, will be doomed.

Third access:

=== nsImapMockChannel::OpenCacheEntry |imap://tbtest%40jorgk%2Ecom@mail.your-ser
ver.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== Calling AsyncOpenURI() on entire message
=== nsImapMockChannel::OnCacheEntryAvailable, new=1
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=0
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== nsImapMockChannel::OnCacheEntryAvailable: WRITE END
=== nsImapMockChannel::OpenCacheEntry |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== Calling AsyncOpenURI() on entire message
=== nsImapProtocol::PostLineDownLoadEvent: Written 15969 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 2216 bytes
=== nsImapUrl::SetContentModified Not Modified
=== nsImapProtocol::PostLineDownLoadEvent: Written 15969 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 15984 bytes
=== nsImapProtocol::PostLineDownLoadEvent: Written 2216 bytes

This time we read a whole lot more from the connection and wrote it to the cache.

Next access:

=== nsImapMockChannel::OnCacheEntryAvailable, new=0
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=0
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=82121
=== reading full message, should use cache entry: 1
=== annotation: |Not Modified|
=== messageSize 82121
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted <=== All good.

So the first fetch is incomplete. The next fetch is complete and from then on, we can also also get parts:

=== nsImapMockChannel::OpenCacheEntry |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2?part=1.2&type=application/x-message-display&filename=This%20is%20a%20message%20with%20an%20attached%20image.eml|
=== Calling AsyncOpenURI() on entire message to get part <=== Part !!
=== nsImapMockChannel::OnCacheEntryAvailable, new=0
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=1
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=82121
=== reading full message, should use cache entry: 1
=== annotation: |Not Modified|
=== messageSize 82121
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted

This all works. I can even retrieve the image of the message attachment from the full message in the cache:

=== nsImapMockChannel::OpenCacheEntry |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2?part=1.2.1.2&type=image/jpeg&filename=A01.JPG|
=== Calling AsyncOpenURI() on entire message to get part
=== nsImapMockChannel::OnCacheEntryAvailable, new=0
=== nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=1
=== |imap://tbtest%40jorgk%2Ecom@mail.your-server.de:143/fetch%3EUID%3E.INBOX.AAA%3E2|
=== nsImapMockChannel::OnCacheEntryAvailable: READ, size=82121
=== reading full message, should use cache entry: 1
=== annotation: |Not Modified|
=== messageSize 82121
=== End separate processing: should use cache entry: 1
=== After header check: should use cache entry: 1
=== Cache entry accepted

But there appear to be other variations where more fetches happen and where the part is actually cached. I think they come from opening the message.

(In reply to gene smith from comment #29)
> Open the message. See the top level message. Bodystructure, header and top
> level message get fetched and just the top level gets stored in the "entire
> message" cache entry.
Yes, but it's tagged "incomplete" for some reason.

> Then I click the eml link and it tries to find this part in the "entire
> message" cache and doesn't. So I then see an imap fetch of the whole message
> occur. It is getting stored to the cache (aNew=1) but not sure which cache.
Yes, it does find it but it rejects it and loads the message again.
General comment: Before I fixed bug 629738, there was no caching for parts/attachments. So it's regrettable if we fetch messages too often, but not the end of the world. The cases we should focus on are those where attachments are truncated or otherwise broken.

This bug was about a certain assert, see bug summary, and has since focused on another assert related to the chunking. Now we have drifted into the deficiencies of the caching algorithm especially for messages with message attachments which in turn have attachments.

I'm not against addressing all those issues, but it needs to be done in a very systematic way otherwise we're spending hours going around in circles. And timing issues and slightly different steps carried out make the thing even more confusing. Let's have some clearly defined reproducible steps and then let's see what we can fix or improve. And let's move separate issues into separate bugs.

So what are the steps to trigger the assert mentioned in the summary and what are the effects for the user since an assert is mute in an opt build. That would take us back to comment #5.
(In reply to Jorg K (GMT+1) from comment #30)
> I think cach2 logging won't tell you any more. There is only one cache, but
> we used two different keys. If the key has a "?", it's for a part.

Yes, the cache2 log didn't help any. Also, I understand about the two key forms, "entire message" and "part" style.

> 
> There seem to be different scenarios and I haven't worked out what's
> triggering them. One debug run I saw was this, but I tried to reproduce it
> and couldn't :-(

I assume here you are not going into the parts/attachements and are just parked at the top level email. I tried to duplicate your accesses by clearing cache and clicking on the email so just the top level is visible. Then for another access I would select Junk folder and come back to AAA folder and the test emails was accessed again.

I duplicated for the most part your first and second access trace but I never see the 3rd where it gets literally the entire message and marks it "not modified". The only way I ever see that is to maybe right click the message and do a save-as or possibly view-source. Of course, after that all parts and pieces are readable from cache.

> 
> (In reply to gene smith from comment #29)
> > Open the message. See the top level message. Bodystructure, header and top
> > level message get fetched and just the top level gets stored in the "entire
> > message" cache entry.
> Yes, but it's tagged "incomplete" for some reason.

Yes and the reason is because the "entire message" cache entry contains only a small part of the entire message.

> 
> > Then I click the eml link and it tries to find this part in the "entire
> > message" cache and doesn't. So I then see an imap fetch of the whole message
> > occur. It is getting stored to the cache (aNew=1) but not sure which cache.
> Yes, it does find it but it rejects it and loads the message again.

Yes it does reject it but why does it fetch the whole message here (and then essentially discard it) when it really should just fetch the internal eml part that is visibly displayed? (But as you point out, "not the end of the world").

(In reply to Jorg K (GMT+1) from comment #31)
> General comment: Before I fixed bug 629738, there was no caching for
> parts/attachments. So it's regrettable if we fetch messages too often, but
> not the end of the world. The cases we should focus on are those where
> attachments are truncated or otherwise broken.

Like you observed with the partial display of the large png image. I didn't know this caching stuff was relatively new!

> then let's see what we can fix or improve. And let's move separate issues
> into separate bugs.

I have found one thing that I have been able to fix regarding caching that I will open a separate bug on later and not go into it here.

> 
> So what are the steps to trigger the assert mentioned in the summary and
> what are the effects for the user since an assert is mute in an opt build.
> That would take us back to comment #5.

I don't know the step to duplicate that assert now. When I do what I thought I did before to cause it all I see now is the chunking assert (that maybe is causing the partial graphic display).

I do see one possible cause of the comment 5 assert. If mTryingToReadPart gets set true near the bottom of OpenCacheEntry() and when it is checked in OnCacheEntryAvailable() and OpenCacheEntry() is *not* immediately called again to set it false, the next time OnCacheEntryAvailable() is called, mTryingToReadPart will still be true and this time OnCacheEntryAvailable() might be called for a cache write (aNew=1) which will trigger the assert.  So an additional reset of mTryingToReadPart like this might help in OnCacheEntryAvailable():

    if (aNew)
    {
      // If we are writing, then insert a "stream listener Tee" into the flow
      // to force data into the cache and to our current channel listener.
      nsCOMPtr<nsIStreamListenerTee> tee = do_CreateInstance(NS_STREAMLISTENERTEE_CONTRACTID, &rv);
      if (NS_SUCCEEDED(rv))
      {
        nsCOMPtr<nsIOutputStream> out;
        // This will fail with the cache turned off, so we need to fall through
        // to ReadFromImapConnection instead of aborting with NS_ENSURE_SUCCESS(rv,rv).
        rv = entry->OpenOutputStream(0, getter_AddRefs(out));
        if (NS_SUCCEEDED(rv))
        {
          rv = tee->Init(m_channelListener, out, nullptr);
          m_channelListener = do_QueryInterface(tee);
        }
        else
          NS_WARNING("IMAP Protocol failed to open output stream to Necko cache");
      }
    }
    else
    {
+     mTryingToReadPart = false;
      rv = ReadFromMemCache(entry);
      if (NS_SUCCEEDED(rv))
      {
        NotifyStartEndReadFromCache(true);
        entry->MarkValid();
        return NS_OK; // Kick out if reading from the cache succeeded.
      }
      entry->AsyncDoom(nullptr); // Doom entry if we failed to read from cache.
      mailnewsUrl->SetMemCacheEntry(nullptr); // We aren't going to be reading from the cache.
    }
  } while (false);

I will try more to duplicate the assert without this change.
(In reply to gene smith from comment #32)
> I duplicated for the most part your first and second access trace but I
> never see the 3rd where it gets literally the entire message and marks it
> "not modified". The only way I ever see that is to maybe right click the
> message and do a save-as or possibly view-source. Of course, after that all
> parts and pieces are readable from cache.
As I said, I had trouble reproducing it. One case I think it happens is when dragging the message from the desktop into the folder.

> Yes it does reject it but why does it fetch the whole message here (and then
> essentially discard it) when it really should just fetch the internal eml
> part that is visibly displayed? (But as you point out, "not the end of the
> world").
Yes, there is room for improvement here. I've never fully understood the marking as |Modified View As Link|. Further reading: Bug 1300402.

> I do see one possible cause of the comment 5 assert. If mTryingToReadPart
> gets set true near the bottom of OpenCacheEntry() and when it is checked in
> OnCacheEntryAvailable() and OpenCacheEntry() is *not* immediately called
> again to set it false, the next time OnCacheEntryAvailable() is called,
> mTryingToReadPart will still be true and this time OnCacheEntryAvailable()
> might be called for a cache write (aNew=1) which will trigger the assert. 
> So an additional reset of mTryingToReadPart like this might help in
> OnCacheEntryAvailable():
What you're discussing here sounds like and interleaved access.

I don't quite follow the argument. mTryingToReadPart is set at the end of OpenCacheEntry() before calling AsyncOpenURI() which should take us to OnCacheEntryAvailable(). In there, the only code path is to call OpenCacheEntry() right away. Oh, hold on, only if the message is "not modified". Otherwise, we fall out of the fake do/while (non-)loop and right into ReadFromImapConnection().

>     else
>     {
> +     mTryingToReadPart = false;
Can you get here if mTryingToReadPart was true to start with? I can't see that. But maybe setting this to false before the final ReadFromImapConnection() is missing here. Do you agree?
(In reply to Jorg K (GMT+1) from comment #33)
> (In reply to gene smith from comment #32)
> One case I think it happens is when
> dragging the message from the desktop into the folder.

I haven't tried that with offline store disabled and only cache in use. Will try it.

> Yes, there is room for improvement here. I've never fully understood the
> marking as |Modified View As Link|. Further reading: Bug 1300402.

I've seen that before and sort of understand the issue. If you use the partial/modified "entire message" cache, when you do a save-as or view-source you get missing content replaced with "attachment will be loaded on demand" in the result.

> Oh, hold on, only if the message is "not
> modified". Otherwise, we fall out of the fake do/while (non-)loop and right
> into ReadFromImapConnection().

We only fall out of the do/while with aNew false and mTryingToReadPart true if entry is modified and OpenCacheEntry() call fails for some reason and it does ReadFromImapConnection(). If mTryingToReadPart is true and the entry is Not Modified, then it does to the else (aNew false) down below and calls ReadFromMemCache(). 

> 
> >     else
> >     {
> > +     mTryingToReadPart = false;
> Can you get here if mTryingToReadPart was true to start with? I can't see
> that.

I think it could if aNew is false, mTryingToReadPart is true and the entry is "not modified". However, not the best place to do the reset; see following.

> But maybe setting this to false before the final
> ReadFromImapConnection() is missing here. Do you agree?

I agree. resetting mTryingToReadPart to false right before ReadFromImapConnection() and the main return is needed. However, I think it is also needed at another point in the do/while where a return occurs and would cause the main return at the bottom to be skipped.

    else  <<<<<<< aNew is false
    {
      rv = ReadFromMemCache(entry);
      if (NS_SUCCEEDED(rv))
      {
        NotifyStartEndReadFromCache(true);
        entry->MarkValid();
+       mTryingToReadPart = false;    
        return NS_OK; // Kick out if reading from the cache succeeded.
      }
      entry->AsyncDoom(nullptr); // Doom entry if we failed to read from cache.
      mailnewsUrl->SetMemCacheEntry(nullptr); // We aren't going to be reading from the cache.
    }
  } while (false);

+ mTryingToReadPart = false;
  // If reading from the cache failed or if we are writing into the cache, default to ReadFromImapConnection.
  return ReadFromImapConnection();
}
(In reply to gene smith from comment #34)
> I think it could if aNew is false, mTryingToReadPart is true and the entry
> is "not modified".
Thanks, I see it now.

> However, not the best place to do the reset; see
> following.
Yes, it should be reset in various places.

How about eliminating any possible damage by doing this:
    if (mTryingToReadPart)
    {
       ... The following call needs mTryingToReadPart to be set.
          rv = OpenCacheEntry();
          if (NS_SUCCEEDED(rv))
       ...
    }
    mTryingToReadPart = false;

Am I missing something?
Re: bug 1430480 comment 68

I have now determined when and (sort of) why the MOZ_ASSERT(!aNew) occurs. As previously noted, it occurs when you copy a message from server A to server B and server B destination folder is setup to use offline store and not just cache. You have to open the message on Server B before the "auto sync" occurs that saves the message to the mbox file. If you wait a while (maybe no more than a minute) there is no assert since the message gets saved to and read back from the mbox file. But if clicked before the auto-sync this occurs:

I just copied the large POSTA message from charter imap to my localhost dovecot account. Then I click on the POSTA message in dovecot.

gds: calling OpenCacheEntry()
eds: nsImapMockChannel::OpenCacheEntry |imap://gene@localhost:130/fetch%3EUID%3E.INBOX%3E464|
gds: part = 
gds: filename = 
eds: Calling AsyncOpenURI() on entire message

OpenCacheEntry() is called with message UID 464 which is an "entire message" url with no part or filename and then calls AsyncOpen() with the "entire message" url.

eds: nsImapMockChannel::OnCacheEntryAvailable, new=0
eds: nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=0
eds: ||
gds: NO cache READ or WRITE: OnCacheEntryAvailable entry status is bad

Then OnCacheEntryAvailable() occurs and seems to indicate there is a readable url (new=0) ready. However, the url is empty ("||") and at the beginning of the do/while loop there is a check of the function's passed in status parameter and it is not NS_OK, so the do/while falls through, no cache write or read ever occurs. 

eds: nsImapMockChannel::OnCacheEntryAvailable: READ not successful

The fall through of do/while prints the "READ not successful" but still calls ReadFromImapConnection().

gds: calling OpenCacheEntry()
eds: nsImapMockChannel::OpenCacheEntry |imap://gene@localhost:130/fetch%3EUID%3E.INBOX%3E464?part=1.3.1.2&type=image/png&filename=encre_0434.png|
gds: part = ?part=1.3.1.2
gds: filename = &filename=encre_0434.png
eds: Calling AsyncOpenURI() on entire message to get part

Now OpenCacheEntry() is called again with a "part" style url. The part is determined not to "exist" in cache so AsyncOpenURI() is called with the "entire message" url with the mTryingToReadPart variable true, even though the "entire message" was never stored to cache.

eds: nsImapMockChannel::OnCacheEntryAvailable, new=1
eds: nsImapMockChannel::OnCacheEntryAvailable, get part from entire message=1
eds: |imap://gene@localhost:130/fetch%3EUID%3E.INBOX%3E464|
eds: ********ASSERT CRASH!!!!**********

This time OnCacheEntryAvailable() is passed good status and new==1 meaning the url (entire message style) should be written. But since mTryingToReadPart is now true and new is 1, the assert occurs.

The cause of this seems to be the "status" parameter above (*) that is bad on the first "entire message" attempt. I don't know why the status is bad at that point. However, I don't see anything bad happening if the resulting assert is ignored. This fairly complex test message still comes in and displays OK. Also, storing to cache at this point is kind of useless since the mbox (or possibly maildir) will be used for offline store in short order.

(*) Just to be clear, I am referring to the status parameter of this function:
NS_IMETHODIMP
nsImapMockChannel::OnCacheEntryAvailable(nsICacheEntry *entry, bool aNew, nsIApplicationCache* aAppCache, nsresult status)
(In reply to gene smith from comment #36)
> 
> The cause of this seems to be the "status" parameter above (*) that is bad
> on the first "entire message" attempt. I don't know why the status is bad at
> that point. However, I don't see anything bad happening if the resulting
> assert is ignored. This fairly complex test message still comes in and
> displays OK. Also, storing to cache at this point is kind of useless since
> the mbox (or possibly maildir) will be used for offline store in short order.
> 
> (*) Just to be clear, I am referring to the status parameter of this
> function:
> NS_IMETHODIMP
> nsImapMockChannel::OnCacheEntryAvailable(nsICacheEntry *entry, bool aNew,
> nsIApplicationCache* aAppCache, nsresult status)

Looked a little closer and see that the status parameter is actually NS_ERROR_CACHE_KEY_NOT_FOUND because AsyncOpenURI() is called with parameter cacheAccess equal to OPEN_READONLY because the offline store is enabled. However, later calls use OPEN_NORMALLY and don't cause the status parameter to be NS_ERROR_CACHE_KEY_NOT_FOUND and work with no more asserts before the auto sync happens and makes all this moot. TBD: what happens if I disable autosync in cfg editor?
It seems that if I disable autosync and the folder of interest uses offline store, a new message copied in is stored and accessed from cache and not automatically put in mbox. Only when you go to the folder properties and do "download now" does it (and any other newly copied in messages) get stored to mbox and the cache is then not used, until another message is copied in. With autosync enabled, this download is automatic. Is that how it is supposed to work?
Jorg?

(assuming last question is for jorg)
Flags: needinfo?(jorgk)
(In reply to gene smith from comment #38)
> Is that how it is supposed to work?
I have no idea, but I assume the entire area will be reworked in bug 1454542. We really need the IMAPCache logging introduced there to investigate/answer questions like this. Right now, I can't dig into that area.
Flags: needinfo?(jorgk)
Depends on: 1454542
Depends on: 1579341
No longer depends on: 1579341
See Also: → 1678461
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: