Closed Bug 1037505 Opened 5 years ago Closed 5 years ago

test_pop3MoveFilter.js failure matching messages

Categories

(MailNews Core :: Networking: POP, defect)

All
Windows XP
defect
Not set

Tracking

(thunderbird34 fixed)

RESOLVED FIXED
Thunderbird 34.0
Tracking Status
thunderbird34 --- fixed

People

(Reporter: rkent, Assigned: sshagarwal)

Details

(Keywords: intermittent-failure)

Attachments

(2 files, 3 obsolete files)

TEST-UNEXPECTED-FAIL | C:/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_pop3MoveFilter.js | "Bugzilla has received a request to create a user account using your email address (example@example.org). To confirm that you wa == "Do not reply to this email. You can add comments to this bug at https://bugzilla.mozilla.org/show_bug.cgi?id=436880 -- Configur - See following stack:
C:/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_pop3MoveFilter.js:verifyMessages:88
resource://gre/modules/Task.jsm:createAsyncFunction/asyncFunction:239
resource://gre/modules/Task.jsm:Task_spawn:164
C:\\slave\\test\\build\\xpcshell\\head.js:_run_next_test:1308
C:\\slave\\test\\build\\xpcshell\\head.js:do_execute_soon/<.run:570
C:\\slave\\test\\build\\xpcshell\\head.js:_do_main:191
C:\\slave\\test\\build\\xpcshell\\head.js:_execute_test:405
-e:null:1
null:null:0
Attached patch Patch v1 (obsolete) — Splinter Review
Attachment #8454969 - Flags: review?(kent)
What I am seeing with this (building under Windows server 2008 with VS2012 and Windows 8 SDK) is a hard failure in the comparison of the maildir preview. The cur directory only has a single file in it. The msgdb has two different items in it, with the two different subjects, but the storeToken in both is the same, and the previews are the same.

Perhaps I am seeing the actual failure that showed up in the random orange? If so, that nature of this is not as we expected (that the messages are correct just out of order), but in fact this is a loss of data.

This will take more investigation on my part, which is hard because I have barely anything building at the moment. I'm going to postpone looking at this so that I can look at your other patches.
Here is my fix. My theory of the bug is that when two files with identical filenames are being created in a very short time span, then CreateUnique is somehow not seeing the first filename, and failing to generate a unique filename. When I was seeing the problem, a printf of the filename showed that it did not change at the point of creation. To solve this, I added a counter to help insure the filename is unique.

It's hard to test this because it is timing sensitive. What do you think? If we do this we'll probably need to get a review from someone else.
Attachment #8455955 - Flags: feedback?(syshagarwal)
I've done a little googling of the problem of uniqueness of maildir names, and what I conclude is that it is recognized as problematic to rely on the file system to ensure uniqueness. It is common to append uniqueness characters to the filename so that you do not have to rely on the file system.

I'm going to treat my patch as a proof of concept, and ask Suyash to incorporate the principles in his patch. But here is what I think we should do. Rather than add the counter to the time number, create a unique filename that looks like this:

nnnn_mmm

were "nnnn" is the existing timestamp value, and "mmm" is am incremented counter. Use leading zeroes in the mmm so that a filename looks like:

14323443940300_002

OK?
Comment on attachment 8454969 [details] [diff] [review]
Patch v1

Review of attachment 8454969 [details] [diff] [review]:
-----------------------------------------------------------------

The changes that you did look fine, but per my comments we need to take additional steps to insure a unique filename.
Attachment #8454969 - Flags: review?(kent) → review-
Comment on attachment 8455955 [details] [diff] [review]
Add an incremented integer to file name to insure unique

Review of attachment 8455955 [details] [diff] [review]:
-----------------------------------------------------------------

Hi,

Sorry I am unable to test it because it works for me as is.
Though, the idea looks great to me.

::: mailnews/local/src/nsMsgMaildirStore.cpp
@@ +645,5 @@
> +
> +  // CreateUnique did not seem to be working reliably when files
> +  //  are created with little time delay, so add an offset to the file
> +  //  name to help ensure it is unique.
> +  newName.AppendInt(static_cast<int64_t>(PR_Now()) + mFileNameOffset++);

Just one thing though, shouldn't we first get the unique filename based on createUnique() and if another file with the same name exists, then append mFileNameOffset?

Though this wouldn't make much difference but we can still avoid mFileNameOffset in many cases this way.
Rest is what you decide.

Thanks.
Attachment #8455955 - Flags: feedback?(syshagarwal) → feedback+
"shouldn't we first get the unique filename based on createUnique() and if another file with the same name exists"

I don't think so. The CreateUnique becomes a secondary way of assuring a unique name in this case, but we leave it just to be safe. It would be very rare that it would come into play, if ever, and that leaves the filenames with a fairly consistent definition. So I prefer to add the counter first, and then use CreateUnique
Okay then, that's good :)
So, shall we go for checking in both the patches?
(In reply to Suyash Agarwal (:sshagarwal) from comment #8)
> Okay then, that's good :)
> So, shall we go for checking in both the patches?

Per commment 4, I asked you to revise the patch some. Could you do that? You can't test the failure, but you can certainly test that the counter generates names that are incremented.

Can you do that? Afterwards we might want to get someone else to review it.
Oh okay, sure.
Attached patch Patch v1.9 (obsolete) — Splinter Review
Okay made the suggested changes.
The thing that still worries me is, if we have too many messages, will
mFileNameOffset not overflow? 

Thanks.
Attachment #8454969 - Attachment is obsolete: true
Attachment #8459139 - Flags: review?(neil)
Attachment #8459139 - Flags: review?(kent)
Comment on attachment 8459139 [details] [diff] [review]
Patch v1.9

I'm sorry but you'll need to do better than that to convince me that the call to CreateUnique is failing. And what about the other call to CreateUnique?
Attachment #8459139 - Flags: review?(neil)
(In reply to neil@parkwaycc.co.uk from comment #12)
> Comment on attachment 8459139 [details] [diff] [review]
> Patch v1.9

PR_NOW() considers microsecs. So, I think that's quite rare scenario that two files will be created in
time less than that. But if they happen, as in the case of this test, we have our own unique counter to cover it up.

Though, this failure doesn't happen for me on Linux so I don't think I can convince you apart from what
rkent says.
Umm.. for the other call to CreateUnique.. maybe we rely on it after our counter?
But my concern from comment 11 remains.

Thanks.
"PR_NOW() considers microsecs"

The units are in microseconds, but on Windows computers the resolution is only 20 milliseconds. So this is not as rare as you might think.
Neil:

Adding path printout to show the post-CreateUnique path like this:

  // CreateUnique, in case we get more than one message per millisecond :-)
  newFile->CreateUnique(nsIFile::NORMAL_FILE_TYPE, 0600);
  newFile->GetNativeLeafName(newName);
  nsAutoCString path;
  newFile->GetNativePath(path);
  printf("newPath %s\n", path.get());
  // save the file name in the message header - otherwise no way to retrieve it
  (*aNewMsgHdr)->SetStringProperty("storeToken", newName.get());

I get this result:

TEST-INFO | c:/tb/central/tb-debug/mozilla/_tests/xpcshell/mailnews/local/test/u
nit/test_pop3MoveFilter.js | Starting getLocalMessages1

TEST-INFO | (xpcshell/head.js) | test getLocalMessages1 pending (2)

TEST-INFO | (xpcshell/head.js) | test pending (3)
newPath c:\\users\\builder\\appdata\\local\\temp\\1\\xpcshell\\xpcshellprofile\\
Mail\\Local Folders-1\\Inbox\\tmp\\1406002635260000
newPath c:\\users\\builder\\appdata\\local\\temp\\1\\xpcshell\\xpcshellprofile\\
Mail\\Local Folders-1\\Inbox\\tmp\\1406002635260000

TEST-PASS | ../../../resources/POP3pump.js | [OnStopRunningUrl : 70] 0 == 0

That is, on two subsequent calls to CreateUnique I am returning the same filename/path after each. I don't like to believe that CreateUnique is failing, but what other explanation is there?

One possible explanation is that this is an artifact of the testing environment where somewhere there are event queues that are being emptied to convert async calls to sync, so that the CreateUnique which ideally should be handled async is being interrupted by a an async file call, and the second CreateUnique gets called in the middle of the first CreateUnique call when an event loop is being spun. I don't see any evidence of that in debugger stacks, but then when I capture this in a debugger the problem disappears.

What would you suggest?
Flags: needinfo?(neil)
The message is originally downloaded to c:\users\builder\appdata\local\temp\1\xpcshell\xpcshellprofile\Mail\Local Folders-1\Inbox\tmp\ and then moved to c:\users\builder\appdata\local\temp\1\xpcshell\xpcshellprofile\Mail\Local Folders-1\Inbox\cur\. This means that the CreateUnique call serves no purpose as the tmp folder is always empty.

It looks as the problem also applies when filtering messages (but not when copying, as that does an extra call to CreateUnique on the destination folder).
Flags: needinfo?(neil)
(In reply to neil@parkwaycc.co.uk from comment #16)
> The message is originally downloaded to
> c:\users\builder\appdata\local\temp\1\xpcshell\xpcshellprofile\Mail\Local
> Folders-1\Inbox\tmp\ and then moved to
> c:\users\builder\appdata\local\temp\1\xpcshell\xpcshellprofile\Mail\Local
> Folders-1\Inbox\cur\. This means that the CreateUnique call serves no
> purpose as the tmp folder is always empty.

OK that makes sense, thanks for thinking about this!

I do wonder though about "the tmp folder is always empty". Suyash, could you look at when the copy from tmp to cur occurs? Could multiple message accumulate in the tmp folder? It seems to me in the general case they could, because there is no inherent reason why the copy of the message from tmp to cur has to occur before the next message starts arriving in tmp.

How do we handle possible file name collision copying from tmp to cur? Is there a check for duplicates and rename at the point?

In any case, it seems to me to make sense to include a counter on the folder as proposed to minimize/eliminate name collisions. Does that make sense to you, Neil?
Comment on attachment 8459139 [details] [diff] [review]
Patch v1.9

I'm going to remove the review request from this, as Neil pointed out a likely alternate theory of the bug.

Suyash, I need for you to look into the issues that I noted in my previous comment.
Flags: needinfo?(syshagarwal)
Attachment #8459139 - Flags: review?(kent)
So, the msg move from tmp/ to cur/ takes place here:
http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsMsgMaildirStore.cpp#684

in FinishNewMessage().
And, I don't see anything that would mean only one message remains in tmp/ at any time.

So, even if CreateUnique() succeeds in creating unique filenames in tmp/ we still need
our manual counter in the cases observed (when another filename with the same timestamp
is created, the previous one with the same filename(timestamp) is already moved).

Or, should we somehow introduce a "copy/moveUnique()" method that resolves duplicate filenames
while copying/moving ?

Thanks.
Flags: needinfo?(syshagarwal)
OK, the move to /cur will overwrite an existing file as written.

My sense of the correct thing to do is to abandon the counter, and in FinishNewMessage(), before the copy, do a CreateUnique, then check the filename and update the msgdb token if it has changed. Then do the copy.

Can you do a patch with that change?
Attached patch Patch v2.5 (obsolete) — Splinter Review
Please let me know if this is correct.

Thanks.
Attachment #8459139 - Attachment is obsolete: true
Attachment #8464903 - Flags: review?(kent)
Comment on attachment 8464903 [details] [diff] [review]
Patch v2.5

Review of attachment 8464903 [details] [diff] [review]:
-----------------------------------------------------------------

Since I just fixed the issues and tested, I'll upload the fixed path for feedback.

::: mailnews/local/src/nsMsgMaildirStore.cpp
@@ +746,5 @@
> +
> +  if (exists) {
> +    toPath->CreateUnique(nsIFile::NORMAL_FILE_TYPE, 0600);
> +    toPath->GetNativeLeafName(fileName);
> +    aNewHdr->SetStringProperty("storeToken", fileName.get());

you need to CreateUnique on existingPath, not toPath

Also, in testing this, this is not actually where the failure occurs, but in the other MoveToNative call
Attachment #8464903 - Flags: review?(kent) → review-
Here's my updated version of your patch. I've tested this and it fixes the issues. OK to land after you've given your positive feedback.
Attachment #8464903 - Attachment is obsolete: true
Attachment #8468625 - Flags: review+
Attachment #8468625 - Flags: feedback?(syshagarwal)
Comment on attachment 8468625 [details] [diff] [review]
v 2.6, fix of 2.5 patch per previous comments

Review of attachment 8468625 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for taking so long.
Ya, this is great and works fine for me.
Thanks for rectifying it.
Attachment #8468625 - Flags: feedback?(syshagarwal) → feedback+
Assignee: nobody → syshagarwal
Keywords: checkin-needed
Status: NEW → ASSIGNED
https://hg.mozilla.org/comm-central/rev/9dc2458b7a79
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 34.0
You need to log in before you can comment on or make changes to this bug.