Closed Bug 511860 Opened 11 years ago Closed 10 years ago

test_results-as-visits.js randomly (but constantly) fails, and then reports a failure in head_queries.js

Categories

(Toolkit :: Places, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0b9

People

(Reporter: mak, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [test disabled on 1.9.2])

Attachments

(1 obsolete file)

this is most likely another instance of bug 507790, but since it's happening too often i'm going to disable the test.
disabled test since it was causing perma orange.
http://hg.mozilla.org/mozilla-central/rev/e932e93fc953
notice this test executed in loops locally was never failing, and failures on Linux and Windows were slightly different, reporting wrong results with duplicates from the temp tables (16 results on Lin, 23 on Win, correct number was 15)
the main problem appear related to batches, and i can guess a reason, when we fire onEndUpdateBatch we call Refresh on query nodes, that will cause the node to requery the db sync.
But just before that, DBFlush will receive the notification and start syncing.

I think that if we enqueue the sync we would solve most of the random oranges we have due to underlying db changes. Indeed i can locally reproduce bug 509970 at every run, the same way tinderboxes can reproduce this bug at every run, enqueuing the sync makes the test pass.

So, i'd like to try this approach, won't solve the underlying UNION ALL issue, but delaying the sync will allow our results to update most of the times with fixed underlying data.

Will post a patch, i'm just ensuring tests are passing
I don't see anything bad in doing this, but could be i'm missing something related to storage.
I'm delaying the sync only when we receive direct notifications, that always happen synchronously and sometimes in the middle of our methods (see removeXXXByTimeframe for example), results listen to those, so it would be better to avoid messing up with the db while results are refreshing.
The purpose of this is not to solve the issue that actually has hard solutions, but to reduce the number of times it could happen to really rare events.

If approved, after letting this bake for a run, i'll re-enable the test to see if it is now working as expected.
Attachment #395821 - Flags: review?(sdwilsh)
Comment on attachment 395821 [details] [diff] [review]
delay syncs after results updates

r=sdwilsh

We need to start coming up with a plan on how to get rid of temporary tables.
Attachment #395821 - Flags: review?(sdwilsh) → review+
Assignee: nobody → mak77
Status: NEW → ASSIGNED
i'm splitting out the patch to a new bug because this way is hard to track, will forward r+ there.
Depends on: 511965
Blocks: 438871
Whiteboard: [orange]
moved patch to bug 511965, will push there, then i'll re-enable the test here if everything works correctly.
Summary: test_result-as-visits.js is randomly (but constantly) failing. → test_results-as-visits.js is randomly (but constantly) failing.
Duplicate of this bug: 516505
Attachment #395821 - Attachment is obsolete: true
Whiteboard: [orange] → [orange][test disabled on trunk and 1.9.2]
Depends on: 521225
nothing i can do till bug 507790 is solved, but i'll stop orange reports in bug 521225
Assignee: mak77 → nobody
Status: ASSIGNED → NEW
WINNT 5.2 mozilla-central test everythingelse on 2009/10/14 15:59:05

TEST-UNEXPECTED-FAIL | e:/builds/moz2_slave/mozilla-central-win32-unittest-everythingelse/build/xpcshell/tests/test_places/queries/test_results-as-visit.js | amo == ab moz - See following stack:
this is interesting...
Whiteboard: [orange][test disabled on trunk and 1.9.2] → [orange][test disabled on 1.9.2]
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1255647144.1255648815.12624.gz
WINNT 5.2 mozilla-central test opt everythingelse on 2009/10/15 15:52:24  TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-everythingelse\build\xpcshell\tests\test_places\queries\test_results-as-visit.js | test failed (with xpcshell return code: 0), see following log:
Duplicate of this bug: 522602
i fixed a wrong name in the workaround code and that has stopped reporting this orange. The bug is still there though. But that explains the failure in comment 12.
Adding "head_queries.js" to bug summary, per the duplicate bug marked in comment 13.

Just hit an orange on that test:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257281227.1257283792.13458.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/03 12:47:07
Summary: test_results-as-visits.js is randomly (but constantly) failing. → test_results-as-visits.js and head_queries.js is randomly (but constantly) failing.
Summary: test_results-as-visits.js and head_queries.js is randomly (but constantly) failing. → test_results-as-visits.js and head_queries.js are randomly (but constantly) failing.
(In reply to comment #15)
> Adding "head_queries.js" to bug summary, per the duplicate bug marked in
> comment 13.

does not make much sense, any test failing in queries folder will fail in head_queries.js...
indeed the failing test in comment 15 is test_redirectsMode.js and not test_results-as-visits.js
Summary: test_results-as-visits.js and head_queries.js are randomly (but constantly) failing. → test_results-as-visits.js is randomly (but constantly) failing.
just to be clear HEAD_XXX are header files included by xpcshell tests, any failure reported in HEAD_XXX files must be reported to the failing xpcshell test.

an header cannot fail by itself.
Oh, sorry - I misunderstood your comment in Bug 522602.

Still, I think it's worth mentioning "head_queries.js" in the bug's summary, for the benefit of log-starrers, because the tinderbox log *does* highlight the failures as two distinct failures (in two distinct tests).

e.g. "random failure in test_results-as-visits.js, followed by head_queries.js"
(and the same for Bug 523578 -- the test_redirectsMode.js failure.)

This would make orange-diagnosis much easier...  Marco, is this ok by you?
(In reply to comment #19)
> Still, I think it's worth mentioning "head_queries.js" in the bug's summary,
> for the benefit of log-starrers, because the tinderbox log *does* highlight the
> failures as two distinct failures (in two distinct tests).
> 
> e.g. "random failure in test_results-as-visits.js, followed by head_queries.js"
> (and the same for Bug 523578 -- the test_redirectsMode.js failure.)

i have a fear that in such way people will start reporting failures to the first head_queries bug they find with "head_queries.js" in the title, without bothering to look at the fact it's the same failure.

that said, if "followed by" is specified should be clear enough, so that could be fine.
I'm not sure how we can make clear where to report head_xxx failures, since this problem is not only in Places, can be anywhere in our codebase where a test is using some common util in its headers.

finally, this test should not fail anywhere atm, it's disabled on 1.9.2 and workarounded (But running) on trunk.
Summary: test_results-as-visits.js is randomly (but constantly) failing. → test_results-as-visits.js randomly (but constantly) fails, and then reports a failure in head_queries.js
Whiteboard: [orange][test disabled on 1.9.2] → [orange][test disabled on 1.9.2][workarounded by bug521225 on trunk]
Blocks: 559354
No longer blocks: 509868
cannot happen anymore on trunk after Places branch merge.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [orange][test disabled on 1.9.2][workarounded by bug521225 on trunk] → [orange][test disabled on 1.9.2]
Target Milestone: --- → mozilla2.0b9
Whiteboard: [orange][test disabled on 1.9.2] → [test disabled on 1.9.2]
You need to log in before you can comment on or make changes to this bug.