Closed Bug 484605 Opened 15 years ago Closed 15 years ago

TB doesn't import messages with many recipients or large msg header from Outlook Express

Categories

(MailNews Core :: Import, defect)

x86
Windows Vista
defect
Not set
critical

Tracking

(thunderbird3.0 .2-fixed)

RESOLVED FIXED
Thunderbird 3.1a1
Tracking Status
thunderbird3.0 --- .2-fixed

People

(Reporter: pepalogik, Assigned: philbaseless-firefox)

Details

(Keywords: dataloss, fixed-seamonkey2.0.3)

Attachments

(1 file, 7 obsolete files)

User-Agent:       Opera/9.64 (Windows NT 6.0; U; cs) Presto/2.1.1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; cs; rv:1.9.1b3pre) Gecko/20090223 Thunderbird/3.0b2

In the discussion of bug 415457, I was asked to test the mail import from Outlook Express in new TB 3 beta. I did so and the result is that not all messages have been successfully imported. I have analyzed them and found out that it obviously depends only on the number of recipients (or maybe the length of header). E.g. a message with 296 recipients has been imported but a message with 533 recipients has not.

Reproducible: Always

Steps to Reproduce:
1. Import mail from Outlook Express.
Actual Results:  
Messages with too many recipients are missing.

Expected Results:  
All messages should be imported.
Thanks Jan
Severity: normal → critical
Component: Migration → Import
Keywords: dataloss
Product: Thunderbird → MailNews Core
QA Contact: migration → import
Version: unspecified → Trunk
confirming...assuming Jan is correct
Status: UNCONFIRMED → NEW
Ever confirmed: true
we have a max OE record of 16000.  Probably because we didn't know what size to make it. I'll post a patch and Neil can see if it's ok.

http://mxr.mozilla.org/comm-central/source/mailnews/import/oexpress/nsOE5File.cpp#268
Assignee: nobody → philbaseless-firefox
Status: NEW → ASSIGNED
%S for wide char in our logging doesn't work. I changed where needed here.
see log in other attachment to see if you can catch any problems with this change.
Import now works with large headers.
Attachment #419220 - Flags: review?(neil)
Attachment #419220 - Flags: approval-thunderbird3.0.1?
Attached file log of patched import (obsolete) —
Attachment #419221 - Flags: review?(neil)
This is the problem table we fetch that has a lot of strings one of them is recipients but it could be any number of things to make it a rather larger record that we don't need.

We only need index 1,2,and 4 so I'll change the patch accordingly.

Reason for the extraneous info (spam) is in case there are other bugs filed wanting other things. I note we are using the create/sent date and maybe we ought to use the download date available in this header.
Comment on attachment 419220 [details] [diff] [review]
record size increased for OE import-also printf %S not valid

>-#define  kMailboxBufferSize  0x4000
Looks like this was the value in the original Outlook Express import code.
I've got no particular objection to changing this constant from one apparently arbitrary value to another arbitrary value.

>+// records know to go to 0x7400 bug 484605
How about
// Bug 484605 has an example of a record of size 0x7400

>+#define  kMailboxBufferSize  0xffff
Nit: might as well use single spaces.

Also the logging changes look reasonable to me.
Attachment #419220 - Flags: review?(neil) → review+
Comment on attachment 419220 [details] [diff] [review]
record size increased for OE import-also printf %S not valid

hang on please.
More appropriate patch coming. This one still has a max value that could be exceeded and there will be more unnecessary data loss.
Attachment #419220 - Flags: review+
I have comments on the problem. This resolves unknown buffer size required instead of guessing.
tested.
Attachment #419220 - Attachment is obsolete: true
Attachment #419254 - Flags: review?(neil)
Attachment #419220 - Flags: approval-thunderbird3.0.1?
Attached image view of import with patch (obsolete) —
Attachment #419255 - Flags: review?(neil)
Comment on attachment 419254 [details] [diff] [review]
reads record data from file due to undefined record node size

>-#define  kMailboxBufferSize  0x4000
>+#define kMailboxBufferSize 0x4000
No point just doing the whitespace change.

>+        if (ReadBytes( pFile, pBuffer, kDontSeek, PR_MIN(recordSize, kMailboxBufferSize))) {
So, do you actually use all of the record? It looks like you're only reading the attributes, and then you read the data separately. Or would it help to read an amount of data, and try to use that to avoid a separate read?
the structure is
header structure
  number of indexes
index structure
  index
  data of this index or datamass[x]
<repeat for up to x23 indexes>
datamass
  length varies.

The indexes we need are originally in the first part of the datamass but after changes I'm only 99% sure so you're 99% right.....

 I don't know how changes to drafts and flags are made by oe, but I can't imagine they wouldn't write out a new record or over the old one and preserve the expected order of things. If some of the strings change then they definitely would write out a new record but that data is past our data in the datamass.

As far as read performance maybe we ought cut down the max size from 0x4000 to 0x300 or we can take a pretty good chance of getting all our data in the buffer the first time
(In reply to comment #12)
> the structure is
> header structure
>   number of indexes
> index structure
>   index
>   data of this index or datamass[x]
> <repeat for up to x23 indexes>
> datamass
>   length varies.
And what part(s) of the structure are represented by the recordSize?
all, including the datamass. That's why it can grow. There's an index for recipients. A string. And there are other strings for header data.
So the old code hoped that the entire datamass fitted in 0x4000, while the new code only really cares about the indexes? In that case I'd either only bother reading the indexes, unless you want to read the first 0x4000 of the datamass to try to avoid separate reads for the offset, flags and time.
I like that idea,
I just imported the buggy email and I also did an import of my main oe mail which has 16117. All email showed up. Spot checking, large attachments are complete, star, read, reply and forward flags check out.
Import took about 45 seconds.
Attachment #419254 - Attachment is obsolete: true
Attachment #419269 - Flags: review?(neil)
Attachment #419254 - Flags: review?(neil)
BTW, I don't think I know for sure that indexes are sorted so I'm reading all 30h. Other sources on web have the max at 23h and MSDN shows 28h.
Comment on attachment 419269 [details] [diff] [review]
ver with reading indexes into buffer and index data from file

> #define  kMailboxBufferSize  0x4000
>+#define  kMaxAttrQnt         0x0030
I think we can afford to spell it "Count", not "Qnt"

>+      if (marker == pIndex[i]  && numAttrs <= kMaxAttrQnt) {
Nit: extra space crept in before "&&"
Attachment #419269 - Flags: review?(neil) → review+
(In reply to comment #18)
> >+#define  kMaxAttrQnt         0x0030
> I think we can afford to spell it "Count", not "Qnt"

Ha! That's what I was trying to think of, duh
Attached patch for checkin r+= neil@httl.net (obsolete) — Splinter Review
Attachment #419221 - Attachment is obsolete: true
Attachment #419227 - Attachment is obsolete: true
Attachment #419255 - Attachment is obsolete: true
Attachment #419269 - Attachment is obsolete: true
Attachment #419386 - Flags: review+
Attachment #419221 - Flags: review?(neil)
Attachment #419255 - Flags: review?(neil)
Summary: TB skips messages with too many recipients when importing mail from Outlook Express → TB doesn't import messages with many recipients or large msg header from Outlook Express
Attachment #419386 - Flags: approval-thunderbird3.0.1?
Keywords: checkin-needed
Attachment #419386 - Flags: superreview?(bienvenu)
Comment on attachment 419386 [details] [diff] [review]
for checkin r+= neil@httl.net

As its in mailnews/ this patch currently requires sr. I'm hoping David can give it a quick look.
Comment on attachment 419386 [details] [diff] [review]
for checkin r+= neil@httl.net

looks reasonable, thx very much for the fix, Phil.
Attachment #419386 - Flags: superreview?(bienvenu) → superreview+
Comment on attachment 419386 [details] [diff] [review]
for checkin r+= neil@httl.net

>   PRUint32  flags;
>   PRUint64  time;
>+  PRUint32  dataStart;
>+
> 
>   for (PRUint32 i = 0; i < size; i++) {

don't know how Neil miss this nit!
I'll remove extra line and mark for checkin tonight
The extra line got by Neil because I slipped it in after the review.
Attachment #419386 - Attachment is obsolete: true
Attachment #420450 - Flags: superreview+
Attachment #420450 - Flags: review+
Attachment #419386 - Flags: approval-thunderbird3.0.1?
Keywords: checkin-needed
Comment on attachment 420450 [details] [diff] [review]
checkin
[Checkin: Comment 25 & 27]


http://hg.mozilla.org/comm-central/rev/a3ebce1a6cb5
Attachment #420450 - Attachment description: checkin r+ neil sr+ bienvenu → checkin [Checkin: Comment 25]
Attachment #420450 - Flags: approval-thunderbird3.0.1?
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 3.1a1
Comment on attachment 420450 [details] [diff] [review]
checkin
[Checkin: Comment 25 & 27]

Not going to take this for 3.0.1, will let it bake on trunk a bit longer and will consider for 3.0.2.
Attachment #420450 - Flags: approval-thunderbird3.0.2?
Attachment #420450 - Flags: approval-thunderbird3.0.1?
Attachment #420450 - Flags: approval-thunderbird3.0.1-
Attachment #420450 - Flags: approval-thunderbird3.0.2? → approval-thunderbird3.0.2+
Attachment #420450 - Attachment description: checkin [Checkin: Comment 25] → checkin [Checkin: Comment 25 & 27]
I lack a testcase to verify.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: