Closed Bug 534143 Opened 15 years ago Closed 15 years ago

offline mode: moving messages to folders, not all actions executed when going online

Categories

(Thunderbird :: General, defect)

x86
Windows Vista
defect
Not set
normal

Tracking

(blocking-thunderbird3.1 beta2+, thunderbird3.1 beta2-fixed, blocking-thunderbird3.0 .4+, thunderbird3.0 .4-fixed)

VERIFIED FIXED
Thunderbird 3.1b2
Tracking Status
blocking-thunderbird3.1 --- beta2+
thunderbird3.1 --- beta2-fixed
blocking-thunderbird3.0 --- .4+
thunderbird3.0 --- .4-fixed

People

(Reporter: kris.boulez, Assigned: Bienvenu)

References

Details

Attachments

(6 files, 3 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0 When going offline, mail is synchronised from two imap accounts. In offline mode mails are answered/moved to folders . When going on-line all new-compiled mails are sent and deletes (seem to) happen. The moving of mail to imap folders does not happen for both accounts. It looks as it mostly fails for one of the two accounts. Happens mostly if a larger number (30+) of mails is moved to different folders. I've used TB2 with these two servers for a long time (also offline reading) and never had any of these problems. Reproducible: Sometimes Steps to Reproduce: 1. Go offline 2. Move mails to different folders on different IMAP accounts 3. Go online Actual Results: When going online, some mails are moved to their folders, others are not Offline reading as a whole seems to be a step back when going from TB2 -> TB3. The -RC2 hang indefinitely in one case, needing a full reinstall.
Can you reproduce and log imap and imapoffline and attach the log file when you reproduce the issue (https://wiki.mozilla.org/MailNews:Logging) ?
Tried it out a few times and was able to reproduce it once. Unfortunately at that moment I only had imap logging on (and not imapoffline). There were no errors in the imap logging (so apparently the request didn't got queued). The problem seems to happen mostly when a lot of mails are moved/deleted/sent. I will try on monday during my commute and put on all debugging. I'll keep you posted
This is a screenshot of TB3 after having worked in off-line mode. Inboxes are nearly empty, everything moved to folders or deleted. There is a one hour difference with timestamps in debug file (is UTC).
Version: unspecified → 3.0
This screenshot is after TB3 has gone on-line. Remark all the entries in the inboxes which were not moved/deleted (although done during offline).
Parsing of the debug file for all setDestinationFolderUri lines (i.e. all actions performed when working off-line). There are 71 lines (which looks about right). Compare to the online part in next attach.
Parsing of the TB debug file for all lines when going on-line, that contain "queuing". When looking more specificly for lines containing "onlinemove" , the number drops to 4. Which is in line with what is observed in the inbox (see second screenshot).
I was able to reproduce the bug. First I went off-line and downloaded all mails (not doing debugging). Then I started debugging version. Read/moved/deleted mails during my commute. At the end of this I took the first screenshot (off-line mode) showing a nearly empty inbox. When arriving at work, I connected to internet and made TB go online, after a few minutes the screen refreshes and a lot of the mails are not moved/deleted (see second screenshot, showing again a full inbox). I looked at the log file and did some analysis. During the offline part (see attachment id 417446) 71 lines are found with a setDestinationFolderUri. This roughly corresponds to the number of actions performed. When going online (see attach id 417449), all actions are to be queued. Looking for lines containing "queuing", shows that only a small subset of the offline actions is eventually queued. Unfortunately I can't find a unique identifier linking the off-line actions to the online ones. I also cannot find a list somewhere with everything that is/must be queued. Something strange I noticed is that the URI for one of the servers is incorrect. It reads url:imap://kboulez@192.168.10.12:143/onlinemove , whereas the IP address of the server is 192.168.10.2 (and not .12). I have no idea where this comes from as the account settings are correct everywhere and mail sending receiving works fine. There is no server with the .12 address. If you need more info, please let me know. - 10.12
David do you need anything else ?
Whiteboard: [has protocol logs]
What are these logs? zip? archives?
I did not attach the complete log files to the ticket. This bugzilla is publically available and I saw the log contains personal info. I did some parsing of the logs (attached to the ticket and described in Comment 7). Do you want the log files ?
Kris, I actually just wanted to know what kind of files you attached to the bug report. I could not open them with a text editor... without being able to see the logs, I don't think queuing means what you think it means - it just means we tried to run a url against a folder while an other url was running against the folder. Not all urls get queued. Some run immediately. I wonder if the reason the URI is off is because you changed the hostname at some point, for that host. If you go into the config editor, do you see a hostName and a realHostName for that server?
Tne first two are screenshot (jpeg). The latter two are plain ASCII text files. I just tried downloading one and opening with an editor without any problem. Shall I re-upload and give a .txt extension? I inferred that everything was put on a queue and then executed. When looking in the config editor I see indeed that the .hostname is .12 and that the .realhostname is .2 . Should this be changed, or is it best to leave alone ?
Attachment #417446 - Attachment mime type: application/octet-stream → text/plain
Attachment #417449 - Attachment mime type: application/octet-stream → text/plain
ah, ok, I don't know why my text editor wouldn't load the files. Best not to change the hostname prefs - the hostname pref is what we use for persistent uri's, and the realhostname is what we connect to. So you can change the realhostname and the persistent uri's will continue to work. ProcessCurrentURL:imap should be a much more interesting string to grep for in the log to show you what urls are actually run.
Grep for ProcessCurrentURL in debug file
Attached output of grep for ProcessCurrentURL:imap. Did quick analysis on number and type of actions. imap.google.com addmsgflags 1 appendmsgfromfile 6 customKeywords 5 fetch 126 folderstatus 25 liteselect 1 onlinemove 2 search 1 select 22 setmsgflags 2 192.168.10.12 addmsgflags 6 appendmsgfromfile 4 customKeywords 4 fetch 28 folderstatus 34 liteselect 1 onlinemove 4 search 1 select 33 setmsgflags 1
yeah, the full logs are really the only useful way forward, and even those aren't guaranteed to show the problem. Perhaps you could get a reduced test case and logs from that? In either case, you could send the log to bienvenu@mozillamessaging.com.
Could someone change the status to confirmed (I'm not able to do so). Together with David the full logs were analysed and the bug confirmed. Unfortunately the logs didn't show why TB "forgets" to move certain messages when going online again.
Could someone PLEASE do something about this bug (and tell me what to do to debug this one). This morning on my commute I processed 50 mails (move/delete/reply) and when coming online I have to redo 10+ messages. This is unacceptable and I will have to go back to TB2 or look for another MUA. P.S. Bug confirmed with 3.0.1
blocking-thunderbird3.1: --- → ?
blizzard, did you happen to have already filed another bug on this? If not, does this appear to be the same thing you saw recently?
Note that we won't generally hold a release for bugs that we can't reproduce well enough to fix (i.e. bugs still in the UNCONFIRMED state). Talking to bienvenu, it sounds like the information in the partial logs he has isn't sufficient to reproduce the problem. If we can get enough info to reproduce, please renominate to block. Adding qawanted keyword in case someone in QA has the bandwidth to spend time figuring out more details.
blocking-thunderbird3.1: ? → -
Keywords: qawanted
I didn't file another bug, but the symptoms appear about the same as what I saw. And I can't reproduce now, only seems to happen when I move a lot of stuff around, start and stop the mail client a lot and then go online possibly with bad connectivity. Just leaves a mess in my inbox - some stuff left unmoved, some moved, some marked as read.
Chris, can you tweak your environment to include the NSPR_LOG_MODULES and NSPR_LOG_FILE so that imap:5 and imapoffline:5 are always aimed at a file somewhere? That way, next time it happens, you'll have a log lying around that may have extra clues to help us figure out how to reproduce....
Lets make this very clear. I can reproduce this bug without any problem. It happened, in the mean time I've switched back to TB2, to me every morning/evening during my commute. With the help of David Bienvenu I ran the debugging as requested. The problem is that the debugging log doesn't contain enough info to see where TB3 "forgets" to move the messages. So, this bug is absolutely confirmed (ask David), but the debugging options fail. If someone wants any help, I'm more then willing to resolve this issue.
I'm happy to confirm this. I'll try to figure out what more logging could be added...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached patch proposed fix (obsolete) — Splinter Review
I found a reproducible case of this, and it fixes it for me. I'll request a 3.02 try server build for Kris to try, and see if it's possible to write a unit test for this...
Assignee: nobody → bienvenu
Status: NEW → ASSIGNED
3.02 build w/ this fix will appear here in a couple hours - http://tinderbox.mozilla.org/showbuilds.cgi?tree=ThunderbirdTry Kris (or Chris), if you could try this build and let me know if it fixes this for you, that would be very helpful. Before I bang my head against the xpcshell+offline wall, I'll give some simple steps for reproducing this. Sort an imap folder by order received (IMAP UID) Mark a couple messages unread Go offline mark the two messages read (or select them to read them) move the message with smaller uid to one imap folder move the second message to a second imap folder Go online What you should see is both flag changes played back, but only the first move operation is played back. The bug was that when we were clearing the current ops from the imap url we just ran, we were not removing offline ops from currentOpsToClear if there were still any operations left for that message. This is the correct thing to do when removing operations from the db, but not from the array of current ops to clear. When we have a flag change and a move for a message, both those are stored in the same operation, since operations are keyed by imap uid. Because of the bug, we were playing back the flag changes first, but leaving the operation in the m_currentOpsToClear to list. Then, we playback the first move operation (moves are coalesced by target folder). When we finish the first move, we clear the m_currentOpsToClear of move operations. Unfortunately, this clears all the moves that were still left because they also had flag changes. The fix is to always remove items from m_currentOpsToClear when the url is done, because we only put items in there that are part of the current url playback.
I'll try this one out later this weekend and keep you posted.
Attached patch wip on test (obsolete) — Splinter Review
this doesn't work; fake server and xpcshell don't seem particularly happy about going offline and then going back offline, in terms of recreating the connections. I'll keep banging my head for a while and see if I can figure it out.
Good news: did a quick test and everything worked as expected. I'll do a more extensive test on Monday and let you know. Had only one problem when switching from TB2 to TB3. The first few times TB3 is started, the program becomes "not responding", the only indication is "determining which messages to index". I have a few large mailboxes. I think the above is a bug I already saw somewhere in bugzilla. Kris,
I spent a good part of yesterday trying to get fakeserver to come back to life after going offline and back online, w/o any luck. It just rejects connections; I'll keep banging on it.
This to confirm that the Shredder version solves the problem I reported initially. During my Monday morning commute I handled 60+ messages and when I connected in the office everything went fine. Thanks for the effort in bug fixing. Kris,
Comment on attachment 427825 [details] [diff] [review] proposed fix nsImapOfflineSync::ClearCurrentOps() is where the actual fix is; the rest is whitespace cleanup. I'm making progress on getting the test working, but I'd rather not hold 3.1 b1 hostage for this, and the reporter says this does fix it for him. I'm still hopeful I can get the test working, but I've been working on it for several days now
Attachment #427825 - Flags: superreview?(bugzilla)
Attachment #427825 - Flags: review?(bugzilla)
OK, I've finally got imap fakeserver able to make a second connection after closing the first connection. Now I need to finish my test, but I should have that finished tonight.
Attached patch wip v2 (obsolete) — Splinter Review
This seems to generally work. Now I have to clean it up a bit, and see which of the changes to fake server I really need. And make sure the test fails w/o the core fix, of course.
Attachment #427893 - Attachment is obsolete: true
Whiteboard: [has protocol logs] → [has patch for review, working on test]
this adds a unit test that fails w/o the imap changes, and succeeds with them. The test leaks, which I've spent a few hours looking into, and will continue to look into.
Attachment #428538 - Attachment is obsolete: true
Attachment #428612 - Flags: superreview?(bugzilla)
Attachment #428612 - Flags: review?(bugzilla)
Whiteboard: [has patch for review, working on test] → [has patch for review, working on test leaks]
the leaks have to do with doing offline move/copies of messages. I believe it's more of a bloat than leak issue but I'll try to figure it out. It's a separate issue from this bug, however, so the fix might end up in a separate bug.
Bug 548458 has the fix for the leak in the unit test.
Comment on attachment 427825 [details] [diff] [review] proposed fix fix is in cumulative patch
Attachment #427825 - Flags: superreview?(bugzilla)
Attachment #427825 - Flags: review?(bugzilla)
Do I see it correctly that this patch is not included in 3.0.3 ? Did a quick download, but problem reappeared. Kris,
It's not even fixed on the trunk yet, since it's awaiting code review. But I'm going to mark it blocking 3.04 and 3.1 beta 2.
blocking-thunderbird3.0: --- → .4+
blocking-thunderbird3.1: - → beta2+
David, So, we have a problem :(, it's happening again ... I has worked perfectly, but the last days it started happening again. This evening I checked very carefully and 5-10 messages weren't moved at all. I must say that I have multiple versions of TB installed at this moment (chasing another bug). I'm pretty sure I'm still using the correct Shredder version (About only displays 3.0.3pre, no more specific version number). I propose to remove all TB versions and only use your version with the fix. Could you provide a pointer to where I can download the fixed version and retest ? Kris,
Whiteboard: [has patch for review, working on test leaks] → [has patch for review standard8, test leaks fixed]
Attachment #427825 - Attachment is obsolete: true
Comment on attachment 428612 [details] [diff] [review] proposed fix with unit test >+ function prepareToGoOffline() { >+ let rootFolder = gIMAPIncomingServer.rootFolder; >+ gSecondFolder = rootFolder.getChildNamed("secondFolder") >+ .QueryInterface(Ci.nsIMsgImapMailFolder); >+ gThirdFolder = rootFolder.getChildNamed("thirdFolder") >+ .QueryInterface(Ci.nsIMsgImapMailFolder); nit: one space after = and probably 2-space indent the QI from the "rootFolder".
Attachment #428612 - Flags: superreview?(bugzilla)
Attachment #428612 - Flags: superreview+
Attachment #428612 - Flags: review?(bugzilla)
Attachment #428612 - Flags: review+
fixed for b2 - Kris, tomorrow's nightly 3.1 b2 pre build will have the fix - http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-1.9.2/ It won't be available until tomorrow. We will probably have 3.0.0x builds in a week or two with the fix as well.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 3.1b2
Attachment #428612 - Flags: approval-thunderbird3.0.4?
Attachment #428612 - Flags: approval-thunderbird3.0.4? → approval-thunderbird3.0.4+
the unit test isn't working on the 3.0.x branch. I need to figure out what's different. I suspect it's something in the testing infrastructure that's changed.
fixed for 3.0.4
this will show leaks for the test on 3.0.4, unless the fix for Bug 548458 lands for 3.0.4
Whiteboard: [has patch for review standard8, test leaks fixed]
Status: RESOLVED → VERIFIED
Just a quick note that I have been using Lanikai (3.1b2pre) for a few days and everything seems to work fine. Thanks for all the effort.
thx for the feedback, Kris, and the patience to help track this down.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: