Bug 1742975 Comment 53 Edit History

OK, there's lots of different things all going on at once, so it's time for a bit of a recap.

1. The patch in Bug 1728924 landed. This made some changes to the code that copies messages from elsewhere to a local mail store (i.e. to a local folder). There's always been an assumption that all local messages are in mbox files, and all the code assumes it has direct access to the raw file (which is problematic for all kinds of reasons). This patch removed some file seeks to loosen this assumption.
Unfortunately, it turns out that the message parser the copy code uses is reused without being reinitialised if multiple messages are being copied. The end result is that the .msf database records the wrong message offsets/sizes for subsequent messages.
So, if multiple messages come in at once (from IMAP, say) and are moved to a local folder by a filter rule, the first one will be fine, but the others will appear screwed up (because of the borked offset/size in the .msf). The backing local mbox file should be OK though. If those screwed-up messages are then copied to another local folder, then the borked offset/size is used and the resultant mbox will contain screwed-up messages :-(

2. The regression was tracked down and fixed in Bug 1742975. However, before this happened, a bunch of changes were made that rely on loosening the "everything is an mbox file" assumption that we're working toward. Changes like moving the protocol-independent message quarantining out of POP3 code (quarantining means just single messages get embargoed by anti-virus, rather than the entire folder).
These changes are what caused the attempted backout of the Bug 1728924 patch to fail.

3. Because the originally-bad patch of Bug 1728924 was out in the tree for a while without the Bug 1742975 fix... a bunch of people ended up with scrambled messages. This _should_ be fixable by "repair folder", but it looks like there are some issues there too (Bug 1740486). I'm not sure that is related - it doesn't happen to everyone, so it might just be that there was already a folder-repair bug for some messages, but the sudden rash of people doing folder-repair has brought more cases to light...
Worth noting that folder-repair is a completely different operation for local folders than for IMAP. For local folders it just rebuilds the .msf file from the mbox file. For IMAP it re-downloads the messages.

Phew.

Next steps:
I'm pretty confident that Bug 1742975 fixes the Bug 1728924 regression. My suspicion is that most of the problems people are having now are due to a combination of data being borked before the fix went in and folder repair not working as it should (very hypothetical example: maybe a badly-formatted message on an IMAP server throwing the folder repair into an endless loop).
So for now, unless we can nail down a reproducible case of new corruption in non-borked folders, I'm going to focus on Bug 1740486, and make sure folder repair is working properly.
OK, there's lots of different things all going on at once, so it's time for a bit of a recap.

1. The patch in Bug 1728924 landed. This made some changes to the code that copies messages from elsewhere to a local mail store (i.e. to a local folder). There's always been an assumption that all local messages are in mbox files, and all the code assumes it has direct access to the raw file (which is problematic for all kinds of reasons). This patch removed some file seeks to loosen this assumption.
Unfortunately, it turns out that the message parser the copy code uses is reused without being reinitialised if multiple messages are being copied. The end result is that the .msf database records the wrong message offsets/sizes for subsequent messages.
So, if multiple messages come in at once (from IMAP, say) and are moved to a local folder by a filter rule, the first one will be fine, but the others will appear screwed up (because of the borked offset/size in the .msf). The backing local mbox file should be OK though. If those screwed-up messages are then copied to another local folder, then the borked offset/size is used and the resultant mbox will contain screwed-up messages :-(

2. The regression was tracked down and fixed in Bug 1734847. However, before this happened, a bunch of changes were made that rely on loosening the "everything is an mbox file" assumption that we're working toward. Changes like moving the protocol-independent message quarantining out of POP3 code (quarantining means just single messages get embargoed by anti-virus, rather than the entire folder).
These changes are what caused the attempted backout of the Bug 1728924 patch to fail.

3. Because the originally-bad patch of Bug 1728924 was out in the tree for a while without the Bug 1734847 fix... a bunch of people ended up with scrambled messages. This _should_ be fixable by "repair folder", but it looks like there are some issues there too (Bug 1740486). I'm not sure that is related - it doesn't happen to everyone, so it might just be that there was already a folder-repair bug for some messages, but the sudden rash of people doing folder-repair has brought more cases to light...
Worth noting that folder-repair is a completely different operation for local folders than for IMAP. For local folders it just rebuilds the .msf file from the mbox file. For IMAP it re-downloads the messages.
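
To make the failure mode in point 1 concrete, here's a toy sketch of what happens when a parser is reused across messages without being reset. This is not the real code: ToyParser, copy_messages and the (offset, size) index are all hypothetical stand-ins for the message parser, the copy code and the .msf entries.

```python
class ToyParser:
    """Hypothetical stand-in for the message parser: tracks a message's start offset and size."""

    def __init__(self):
        self.reset(0)

    def reset(self, start_offset):
        # Reinitialise per-message state.
        self.start = start_offset
        self.size = 0

    def feed(self, data):
        # Accumulate the size of the message currently being parsed.
        self.size += len(data)


def copy_messages(messages, reinit):
    """Append messages to a toy mbox; return (mbox bytes, list of (offset, size) index entries)."""
    mbox = bytearray()
    index = []
    parser = ToyParser()
    for raw in messages:
        if reinit:
            # The step that was effectively missing: reset before each message.
            parser.reset(len(mbox))
        parser.feed(raw)
        mbox += raw
        index.append((parser.start, parser.size))
    return bytes(mbox), index


msgs = [b"From - first\n\nbody one\n", b"From - second\n\nbody two!\n"]
good_mbox, good_index = copy_messages(msgs, reinit=True)
bad_mbox, bad_index = copy_messages(msgs, reinit=False)
```

In this toy version the mbox bytes come out identical either way, and only the index entries after the first are wrong. Slicing the mbox with a bad entry returns the wrong bytes, which is the analogue of copying a borked message to another folder.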

Phew.

Next steps:
I'm pretty confident that Bug 1734847 fixes the Bug 1728924 regression. My suspicion is that most of the problems people are having now are due to a combination of data being borked before the fix went in and folder repair not working as it should (very hypothetical example: maybe a badly-formatted message on an IMAP server throwing the folder repair into an endless loop).
So for now, unless we can nail down a reproducible case of new corruption in non-borked folders, I'm going to focus on Bug 1740486, and make sure folder repair is working properly.
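
For anyone unfamiliar with what local folder repair amounts to, here's a rough sketch of rebuilding an offset/size index from the mbox alone. It assumes every line starting with "From " is a message separator (body lines are normally escaped as ">From "), and rebuild_index is a made-up name, not the actual repair code.

```python
def rebuild_index(mbox):
    """Return an (offset, size) entry per message by scanning for 'From ' separator lines."""
    starts = []
    pos = 0
    for line in mbox.splitlines(keepends=True):
        if line.startswith(b"From "):
            starts.append(pos)
        pos += len(line)
    # Each message runs up to the next separator, or to the end of the file.
    ends = starts[1:] + [len(mbox)]
    return [(start, end - start) for start, end in zip(starts, ends)]


mbox = b"From - a\n\nhello\nFrom - b\n\nworld\n"
index = rebuild_index(mbox)
messages = [mbox[offset:offset + size] for offset, size in index]
```

The point is that the index is derived entirely from the mbox contents, so stale or wrong entries in the old index don't matter as long as the mbox itself is intact.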

[UPDATE: updated links to Bug 1734847, with the regression fix. They originally linked to _this_ bug by mistake]