Closed
Bug 427142
Opened 17 years ago
Closed 17 years ago
sporadic test_393498.js failure on bm-win2k3-pgo01 because of VM timer issues
Categories
(Firefox :: Bookmarks & History, defect)
Firefox
Bookmarks & History
Tracking
()
RESOLVED
FIXED
People
(Reporter: dietrich, Assigned: mak)
References
()
Details
(Keywords: fixed1.9.0.2, intermittent-failure)
Attachments
(3 files, 3 obsolete files)
3.36 KB,
patch
|
dietrich
:
review+
|
Details | Diff | Splinter Review |
2.03 KB,
patch
|
Details | Diff | Splinter Review | |
2.75 KB,
patch
|
Details | Diff | Splinter Review |
well, one test failure at this point. it went green next run. there were no recent checkins that looked anything related.
possibly related to bug 381240.
NEXT ERROR ../../../../_tests/xpcshell-simple/test_places/bookmarks/test_393498.js: FAIL
../../../../_tests/xpcshell-simple/test_places/bookmarks/test_393498.js.log:
>>>>>>>
*** test pending
*** exiting
*** CHECK FAILED: 1207338863116625 != 1207338863116625
JS frame :: d:/slave/trunk_2k3_pgo/mozilla/tools/test-harness/xpcshell-simple/head.js :: do_throw :: line 99
JS frame :: d:/slave/trunk_2k3_pgo/mozilla/tools/test-harness/xpcshell-simple/head.js :: do_check_neq :: line 109
JS frame :: ../../../../_tests/xpcshell-simple/test_places/bookmarks/test_393498.js :: run_test :: line 129
JS frame :: d:/slave/trunk_2k3_pgo/mozilla/tools/test-harness/xpcshell-simple/tail.js :: _execute_test :: line 41
JS frame :: d:/slave/trunk_2k3_pgo/mozilla/tools/test-harness/xpcshell-simple/execute_test.js :: <TOP_LEVEL> :: line 38
2147500036
*** FAIL ***
Comment 1•17 years ago
|
||
I would like to know if this is actually related to bug 381240, because 381240 is a non-blocking p4 bug and I'm wondering if we should continue to block putting the PGO box back in to production for this issue.
Comment 2•17 years ago
|
||
Over about 60 runs I've seen this issue ~12 times on two separate PGO boxes.
This issue is PGO specific as we haven't seen it on the win2k3 unittest machines once.
Assignee | ||
Comment 3•17 years ago
|
||
I'm seeing the same thing on my virtual build machine, sometimes the test pass, sometimes not. same exact test / line.
XP on Virtual PC 2007, with latest Windows Server 2008 SDK and mozillabuild.
Comment 4•17 years ago
|
||
mak77, are you using PGO enabled builds?
Comment 5•17 years ago
|
||
So, the unittest box on the Firefox tinderbox is not a VM, iirc. (robcee can verify that) This might just be a VM timing issue?
Assignee | ||
Comment 6•17 years ago
|
||
(In reply to comment #4)
> mak77, are you using PGO enabled builds?
no, i'm using the standard build environment, with these options
ac_add_options --enable-debugger-info-modules=yes
ac_add_options --disable-installer
ac_add_options --disable-accessibility
ac_add_options --disable-activex
ac_add_options --disable-activex-scripting
ac_add_options --disable-libxul
Comment 7•17 years ago
|
||
Ok, so we now know that it's not a PGO issue but is a VM issue.
Can we get a developer owner for this bug to fix the timing in the test so that it passes more reliably on virtual machines?
Reporter | ||
Updated•17 years ago
|
Assignee: nobody → dietrich
Reporter | ||
Comment 8•17 years ago
|
||
try delaying verification.
Mikeal and Marco, can you please try this locally?
Assignee | ||
Comment 9•17 years ago
|
||
(In reply to comment #8)
> Created an attachment (id=314727) [details]
> Mikeal and Marco, can you please try this locally?
4 runs: 3 PASS, 1 FAIL
Assignee | ||
Comment 10•17 years ago
|
||
well, this is NOT a fix, but i'm not able to reproduce the fail with this in... the *hack* is duplicating the call to setItemTitle in the test. without this i fail at least 1 time on 10, with the duplication i've not been able to fail (50 consecutive runs without failing). At least will reduce a lot the failing statistics... Maybe could be useful to find the real cause...
As a bonus i did a refactoring of SetItemTitle to be faster (one query is enough and transaction is implicit).
We could check-in this as a temporary hack to reduce test failing and then look for the real fix
Comment 11•17 years ago
|
||
A failure hasn't been observed in 12 hours on both PGO boxes.
Doesn't mean this issue is gone, just an observation.
Comment 12•17 years ago
|
||
We are, optimistically, thinking this will be fixed by EOD. We are scheduling production downtime for tomorrow at 10am PST to put the PGO untittest box back on .
Are we going to take this *hack*, or are we still waiting on a different fix?
Reporter | ||
Comment 13•17 years ago
|
||
Comment on attachment 314845 [details] [diff] [review]
duplication
r=me on the test change. please file a different bug on the perf change.
We should take this change, and keep this bug open to followup and resolve the root cause of the failures.
Attachment #314845 -
Flags: review+
Reporter | ||
Comment 14•17 years ago
|
||
a=mconnor over IM.
Checking in toolkit/components/places/tests/bookmarks/test_393498.js;
/cvsroot/mozilla/toolkit/components/places/tests/bookmarks/test_393498.js,v <-- test_393498.js
new revision: 1.2; previous revision: 1.1
done
Assignee: dietrich → mak77
Comment 15•17 years ago
|
||
Observed failure in test_393498.js from a checkout at ~9am PST on Friday April 11th, 2008.
http://tinderbox.mozilla.org/showlog.cgi?log=UnitTest/1207931947.1207940927.15739.gz#err0
Assignee | ||
Comment 16•17 years ago
|
||
still, is the fail ratio better now? we should have reduced the possibility of a fail... not 0 though
Comment 17•17 years ago
|
||
This was hit again tonight:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1209951934.1209960243.19098.gz&fulltext=1
Assignee | ||
Comment 18•17 years ago
|
||
I've built up a test with a loop to get some statistics about this fault.
on 1000 runs i got about:
- 100 faults with 1 setItemTitle call, so failing about 1/10 of times
- 5 faults with 2 setItemTitle calls (fault ~ 1/100)
- 0/1 fault with 3 setItemTitle calls (fault ~ 1/1000)
- with n setItemTitle calls fault ~ 1/10^n
PR_Now() in these cases is returning the same value returned by getItemTitle (so the value we previously saved to the db, that was another PR_Now()).
So definately appear a PR_Now() bug on VM timers.
Can we make the test, after its setTimeout, check that the Date.now() is different from what it was before the setTimeout, and if it isn't, do the setTimeout again? Would that fix this?
Summary: sporadic test failure on bm-win2k3-pgo01 → sporadic test_393498.js failure on bm-win2k3-pgo01 because of VM timer issues
Depends on: 445580
So we did contact VMware technical support (see bug 445580), and they also brought up the point that time adjustments caused by NTP can cause things like this to happen.
So I think we should just fix the test to do what comment 19 suggests (and perhaps even check > rather than !=).
Comment 22•17 years ago
|
||
At one point in time it was known to not run ntp but I don't know how the Windows VMs are setup. All the ESX hosts are configured to run ntp in the service console.
Can someone from RelEng check the settings in the VM?
Comment 23•17 years ago
|
||
bm-win2k3-pgo01 / qm-win2k3-pgo01 has out-of-date vmware tools.
The vmx config is set correctly:
tools.syncTime = "TRUE"
Don't know if Windows is still trying to do it's own time sync (and I don't know how to check that). Either way, those tools should be updated.
Comment 24•17 years ago
|
||
(In reply to comment #19)
> Can we make the test, after its setTimeout, check that the Date.now() is
> different from what it was before the setTimeout, and if it isn't, do the
> setTimeout again? Would that fix this?
I think we should provide a helper function that does this in the test harness, so we can fix any other tests that make assumptions like this.
Assignee | ||
Comment 25•17 years ago
|
||
(In reply to comment #19)
> Can we make the test, after its setTimeout, check that the Date.now() is
> different from what it was before the setTimeout, and if it isn't, do the
> setTimeout again? Would that fix this?
we are saving now() in the places db in the backend, and then reading that, so i doubt that it would be fixed that way, instead i was thinking we could maybe manually change the lastModified value for that bookmark in the past before doing the lastModified update, so we will check that it has changed and skip the problem.
Assignee | ||
Comment 26•17 years ago
|
||
this should reduce fail to zero (and from my testing does) for this specific test without nullifying its usefulness (we still check that the lastModified get changed).
I suggest taking this specific fix for this testcase and find a more general way for handling the timing problem in VMs in bug 445580.
Attachment #330538 -
Flags: review?(dietrich)
Reporter | ||
Comment 27•17 years ago
|
||
Comment on attachment 330538 [details] [diff] [review]
set lm in the past patch
>? toolkit/components/places/tests/unit/test_433317_query_title_update.js
>? toolkit/components/places/tests/unit/test_now.js
>Index: toolkit/components/places/tests/bookmarks/test_393498.js
>===================================================================
>RCS file: /cvsroot/mozilla/toolkit/components/places/tests/bookmarks/test_393498.js,v
>retrieving revision 1.2
>diff -u -8 -p -r1.2 test_393498.js
>--- toolkit/components/places/tests/bookmarks/test_393498.js 10 Apr 2008 20:06:11 -0000 1.2
>+++ toolkit/components/places/tests/bookmarks/test_393498.js 21 Jul 2008 09:13:32 -0000
>@@ -117,23 +117,24 @@ function run_test() {
> do_check_eq(rootNode.childCount, 1);
> var childNode = rootNode.getChild(0);
>
> // confirm current dates match node properties
> do_check_eq(bmsvc.getItemDateAdded(bookmarkId), childNode.dateAdded);
> do_check_eq(bmsvc.getItemLastModified(bookmarkId), childNode.lastModified);
>
> // test live update of lastModified caused by other changes:
>+ // We set lastModified 1ms in the past to workaround a timing bug on
>+ // virtual machines, see bug 427142 for details.
>+ var pastDate = Date.now() * 1000 - 1;
>+ bmsvc.setItemLastModified(bookmarkId, pastDate);
> // set title (causing update of last modified)
> var oldLastModified = bmsvc.getItemLastModified(bookmarkId);
>- // This double call to setItemTitle is a temporary hack to workaround a
>- // timing bug on virtual machines. See bug 427142 for details.
>- bmsvc.setItemTitle(bookmarkId, "Google");
> bmsvc.setItemTitle(bookmarkId, "Google");
>- // test that lm is updated
>+ // test that lastModified is updated
> do_check_neq(oldLastModified, childNode.lastModified);
should change this to be more specific, testing that oldLastModified is < childNode.lastModified.
r=me, thanks.
Attachment #330538 -
Flags: review?(dietrich) → review+
Assignee | ||
Comment 28•17 years ago
|
||
for sure, doing stricter check
Attachment #330538 -
Attachment is obsolete: true
Assignee | ||
Updated•17 years ago
|
Keywords: checkin-needed
Comment 29•17 years ago
|
||
Pushed to mozilla-central:
http://hg.mozilla.org/index.cgi/mozilla-central/rev/d8fbfb07f444
I'll check into 1.9.0 CVS in a minute.
Thanks for the fix!
Comment 30•17 years ago
|
||
Checking in toolkit/components/places/tests/bookmarks/test_393498.js;
/cvsroot/mozilla/toolkit/components/places/tests/bookmarks/test_393498.js,v <--
test_393498.js
new revision: 1.3; previous revision: 1.2
done
Keywords: fixed1.9.0.2
Comment 31•17 years ago
|
||
I think we need to patch this test again:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1217174744.1217178926.18269.gz#err1
It failed the test here:
http://hg.mozilla.org/mozilla-central/index.cgi/file/d8fbfb07f444/toolkit/components/places/tests/bookmarks/test_393498.js#l109
which is a similar lastModified check.
Updated•17 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 32•17 years ago
|
||
This particular FAIL does not happen on my VM, while previous was frequent. Hwv this is the same trick as above, set lastModified in the past and then check.
Attachment #332865 -
Flags: review?(dietrich)
Reporter | ||
Comment 33•17 years ago
|
||
Comment on attachment 332865 [details] [diff] [review]
same trick as above
>diff --git a/toolkit/components/places/tests/bookmarks/test_393498.js b/toolkit/components/places/tests/bookmarks/test_393498.js
>--- a/toolkit/components/places/tests/bookmarks/test_393498.js
>+++ b/toolkit/components/places/tests/bookmarks/test_393498.js
>@@ -75,17 +75,19 @@ bmsvc.addObserver(observer, false);
> // main
> function run_test() {
> var testFolder = bmsvc.createFolder(bmsvc.placesRoot, "test Folder",
> bmsvc.DEFAULT_INDEX);
> var bookmarkId = bmsvc.insertBookmark(testFolder, uri("http://google.com/"),
> bmsvc.DEFAULT_INDEX, "");
> do_check_true(observer.itemChangedProperty == null);
>
>- var newDate = Date.now() * 1000;
>+ // We set lastModified 1ms in the past to workaround a timing bug on
>+ // virtual machines, see bug 427142 for details.
>+ var newDate = Date.now() * 1000 - 1;
wouldn't this actually set it to 1 microsecond in the past?
r=me with that fixed.
Attachment #332865 -
Flags: review?(dietrich) → review+
Comment 34•17 years ago
|
||
Can you also check this fix into CVS? It still occurs there:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.0/1218962400.1218969900.23218.gz#err0
Assignee | ||
Comment 35•17 years ago
|
||
Assignee | ||
Updated•17 years ago
|
Attachment #332865 -
Attachment is obsolete: true
Assignee | ||
Updated•17 years ago
|
Attachment #314727 -
Attachment is obsolete: true
Assignee | ||
Updated•17 years ago
|
Keywords: checkin-needed
Comment 36•17 years ago
|
||
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/index.cgi/rev/c756a4a9e2d3
Keywords: checkin-needed
Comment 37•17 years ago
|
||
and CVS:
Checking in toolkit/components/places/tests/bookmarks/test_393498.js;
/cvsroot/mozilla/toolkit/components/places/tests/bookmarks/test_393498.js,v <--
test_393498.js
new revision: 1.4; previous revision: 1.3
done
Status: REOPENED → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
Comment 38•17 years ago
|
||
Thanks again for the fixes!
Updated•16 years ago
|
Whiteboard: [orange]
Comment 39•16 years ago
|
||
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".
In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body contains places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.
Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.
Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Updated•13 years ago
|
Keywords: intermittent-failure
Updated•13 years ago
|
Whiteboard: [orange]
You need to log in
before you can comment on or make changes to this bug.
Description
•