Closed Bug 527834 Opened 16 years ago Closed 16 years ago

gloda indexes junk in IMAP accounts even though it shouldn't

Categories

(MailNews Core :: Database, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3.0rc1

People

(Reporter: asuth, Assigned: asuth)

Details

Attachments

(7 files, 1 obsolete file)

I thought I could approximate what happens when the bayesian classifier is actually in use in my unit test but I was wrong. The implementation problem was that OnItemPropertyFlagChanged was firing for the Offline bit before the message was reported to us via msgsClassified. It went offline before getting marked as junk, of course. So assuming the indexer got to the message before the junk classifier got around to marking it as junk, it would index it. Then the message would get deleted and what not when the other notifications came through, wastefully. The solution is to build on bug 527687's introduction of a flag that lets us know the message has not yet been reported to us by msgs. Previously the code actually would check for the junk processing flag, but only in the sweep case. (I think that code was a leftover from before the introduction of msgsClassified.) We now check in both sweep and event-driven cases for the not-reported case. While I was in there I also layed a minor change on top of the fix for bug 527687; that bug's fix could generate msgsClassified events with no messages reporting because they all got filtered out. Now we no longer generate a notification if that is the case. I also had to enhance glodaTestHelper and friends slightly so that wait_for_gloda_indexer can know it has to wait until it has seen a msgsClassified event to determine whether the indexer is active. (We previously were relying on the notifications to reliably occur within the asynchronous process of injection, but the asynchronous process of classification can outlive that.)
Flags: blocking-thunderbird3?
Attachment #411586 - Flags: superreview?(bienvenu)
Attachment #411586 - Flags: review?(bienvenu)
Attached file test output on windows
Comment on attachment 411586 [details] [diff] [review] use bayesian filter in test, don't index messages not reported by msgsClassified yet v1 the code looks fine, but we need to figure out the test failures I'm seeing. I'll try it on my mac as well.
Attachment #411586 - Flags: superreview?(bienvenu)
Attachment #411586 - Flags: superreview+
Attachment #411586 - Flags: review?(bienvenu)
Attachment #411586 - Flags: review+
Please add this patch on top of the others. There is a lot of extremely useful data that gets emitted via the logsplodery channel that does not get dumped via the normal route. This is part of my long-term plan for making it easier to figure out what goes wrong with unit tests/what actually happens in unit tests, so I'm gonna ask for review for this too. (logsploder is a xulrunner app but it should be trivial to give it a firefox-supported web interface for browsing the logs.) Then, in your temp dir, touch the file "logsploder.filelog". (Ideally, keep the file contents empty unless you want a custom prefix.) Then, when you run the tests they will create a log file named like "logsploder-test_index_junk_imap_offline.json" in your temp directory. If you could run a check-one for the failed tests and then attach the json blobs it would be appreciated: test_index_messages_imap_offline.js test_index_messages_local.js Once you are done, remove the "logsploder.filelog" file from your temp directory to stop the awesomeness from happening. Thanks!
Attachment #411605 - Flags: review?(bienvenu)
er, there's a harmless dump statement in there but I have removed it locally.
Attached file another one...
That you so much for doing all the legwork in running the tests and getting me the output! The problems I observed looked like they could come from two different vintages: - A haste-based screw-up in the logic of my code that makes the gloda test helper logic wait for expected msgsCompleted notification before actually checking if the indexer is done. This could result in premature assertion checks as well as potentially corrupting the asynchronous execution by getting two events happening simultaneously. I have simplified and improved this logic. - Database commits happening when we don't explicitly request them. The code that tries to stop the adaptive indexer from being adaptive was not doing anything about the commit timers. Worse, I was actually registering us with the real idle service; even though we hard-code the idleTime to be something that we always perceive as idle, the "idle" observer notification didn't check that (since it doesn't need to) so it would happily commit things if the user running the test went idle. The "idle" event problem may have been windows only; I have the vague recollection that xpcshell tests on linux don't have the idle service available to them. I changed this so we effectively will only perform a commit when the unit test explicitly requests it. This makes the tests deterministic which is really the goal of the tests in question. While the one PendingCommitTracker-related explosion I saw is concerning, it could have been the previous bullet point showing up, or it could just be the commit landed during a time period where the unit test is doing something sketchy but intentional that just does not expect that to happen. (I have been writing unit tests assuming commits only happen when I tell them to.) While I was in there I made a few other minor testing consistency changes: - The adaptive indexer is more thoroughly de-fanged. I think the timer based logic might have started changing things around in some cases. Again, the justification is determinism and consistency. We can write a worst-case scenario indexing test if the need arises. - I added some parens to the doCommit boolean logic because code that relies on &&/|| precedence is either too clever or simply wrong. - I removed a bunch of dump statements in favor of logging events. The dumps were not actually synchronized with the log output which resulted in them being effectively useless. I suspect that this should resolve the unit test failures you were observing. Even if the failures were caused by real bugs they probably won't get a chance to be triggered anymore.
Attachment #411586 - Attachment is obsolete: true
Attachment #411626 - Flags: superreview+
Attachment #411626 - Flags: review?(bienvenu)
No longer depends on: 527687
Comment on attachment 411626 [details] [diff] [review] use bayesian filter in test, don't index messages not reported by msgsClassified yet v2 (From IRC) <bienvenu> I've been getting a couple failures for a while <bienvenu> but tinderbox has remained happy? ... <bienvenu> asuth, oddly, running the test singly, it passes <bienvenu> both of them, actually So, actually, this sounds very much like it was the "idle" event that was the source of bienvenu's long-standing errors, since even a windows tinderbox would generally never have an idle transition occur. I expect that one or two of the other errors was the first case as I reported. As such, I'm going to actually carry the review approvals forward after all and land this. Although I would be very greatful if bienvenu could sanity check that the tests are happy again on his machine. I'll make sure the tinderbox gets at least one good cycle before heading to bed (otherwise I'll backout).
Attachment #411626 - Flags: review?(bienvenu) → review+
pushed to trunk: http://hg.mozilla.org/comm-central/rev/8d6e1f1fd0a5 pushed to comm-1.9.1: http://hg.mozilla.org/releases/comm-1.9.1/rev/49717804bcc6 Marking fixed. bienvenu, please reopen if the tests are unhappy or you have any issues with the changes made for the second patch.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
I'm getting a different set of failures now :-(
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I'll try to attach a json file of one or more of the failures - the exception getting the uint32 property is troubling to the uneducated eye.
In it's original unfixed state, I don't think this would have been worth blocking on, but it's a little unclear what the consequences of the current state are. Awaiting thoughts from asuth...
Whiteboard: [needs input asuth]
Right, so my late-night fix for bienvenu's idle problem did not actually fix bienvenu's idle problem. I neglected to remember that our lobotomy of the adaptive indexer happens _after_ the indexer has already initialized itself. So it is already fully registered with the idle service by the time we replace its idle service with a fake one. The rogue commit is not only invoking PendingCommitTrigger at an unacceptable time in many cases, but it is also triggering a call to GlodaIndexer._notifyListeners which can mislead glodaTestHelper's code into thinking indexing is done when it is not. Indexing is active but it has no job, so we report as idle. It is possible to not have a job for brief moments of time. I suspect the proper course of action is to make glodaTestHelper also check the "indexing" flag on GlodaIndexer rather than taking the notification at its word. I'll provide a fix a little later today. In the meantime, the workaround is for bienvenu to move his mouse frantically while running any gloda tests.
Comment on attachment 411783 [details] [diff] [review] unregister idle observer, make sure indexer is truly done indexing tests all pass for me now.
Attachment #411783 - Flags: review?(bienvenu) → review+
Status: REOPENED → RESOLVED
Closed: 16 years ago16 years ago
Resolution: --- → FIXED
Comment on attachment 411605 [details] [diff] [review] add logsploder-style file logging to help asuth debug more v1 + if (data.length) + outFile.append(data + suffix); + else + outFile.append("logsploder-" + suffix); can be just outFile.append((data.length ? data : "logsploder-") + suffix);
Attachment #411605 - Flags: review?(bienvenu) → review+
this made rc1, so setting tfv.
Flags: blocking-thunderbird3?
Whiteboard: [needs input asuth]
Target Milestone: --- → Thunderbird 3.0rc1
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: