Closed Bug 442819 Opened 16 years ago Closed 16 years ago

qm-win2k3-pgo01 is unreliable

Categories

(Release Engineering :: General, defect, P3)

x86
Windows Server 2003
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dietrich, Assigned: lsblakk)

Details

(Keywords: intermittent-failure)

This box has caused odd test failures several times, and caused tree-closure on the eve of a freeze. It has a history of misbehavior. See Rob's comment here: https://bugzilla.mozilla.org/show_bug.cgi?id=442778#c11.

(did i file this against the right component, etc?)
Once bug 442778 was fixed, this box stayed orange for different test failures, in an entirely unrelated suite of tests.
Re-opened bug 199692 which seems to be causing a lot of errors on this box in the mochitest suite.
qm-win2k3-pgo01 is still consistently orange. 

bug#440531 about intermittent red seems to have stopped, although nothing was
done to fix that!
(In reply to comment #0)

> (did i file this against the right component, etc?)

You done good, dietrich!

(In reply to comment #2)
> Re-opened bug 199692 which seems to be causing a lot of errors on this box in
> the mochitest suite.

lukas: as roc pointed out, we need to file new bugs for new failures in most cases. This is most likely a separate issue from the originating bug, due to whatever quirks this machine is suffering from.

Can you do some splunk mining for previous failures on this box?

lastly, 
(In reply to comment #3)
> qm-win2k3-pgo01 is still consistently orange. 
> 
> bug#440531 about intermittent red seems to have stopped, although nothing was
> done to fix that!

Like I said: this box has been consistently deranged. Worse, its results have been largely ignored since inception.
(In reply to comment #4)
> Like I said: this box has been consistently deranged. Worse, its results have
> been largely ignored since inception.
> 

Adjusting code to please the deranged box seems far worse. This box should be run through the chipper and replaced.

fwiw, a new and different test is failing today.
I think this is just symptomatic of all our other bugs about "our tests are not reliable on VMs". I don't believe it has anything to do with the box per se, but rather with the fact that our unit tests behave differently in a VM, which causes failures that otherwise wouldn't be seen.
Just had three failed compiles in a row:

Two of the three have "make[6]: stat: nsILineInputStream.idl: Bad file number".

first red: make[8]: *** [npbasic.dll] Error 80
second red: make[8]: *** [npscriptable.dll] Error 80
third red: make[8]: *** [npwinless.dll] Error 80

When I went in to look there was a hang up regarding the copying of mimeTypes.rdf from 'bin' to 'bin' - "Cannot copy mimeTypes: Cannot find the specified file"

Closed the error message, cancelled the copy and the next compile went green.
While the specific error message is different, I wonder if this is related to bug#440531.
Some failing tests were fixed recently. Leaving this bug open for now, but please update if you see future problems.
Suggestion in triage was to change filesystem from NTFS to FAT to see if that speedup will help.

Open question: do we need this on the Firefox 3.0 tree anymore, or can we just kill it? We do not have PGO on FF2 or FF3.1 at this time. Setting up a new (more stable?) PGO unittest machine on FF3.1 should be done as a separate bug, imho.
Assignee: nobody → lukasblakk
OS: Mac OS X → Windows Server 2003
So this VM only had the two drives that the ref platforms come with, not the additional 30GB fcal drive that we give to all win32 VMs nowadays. This machine (and its pgo builds & tests) will be back up on the new unittest 1.9 master as "WINNT 5.2 fx-win32-1.9-slave09 (pgo) dep unit test".

We can continue to keep an eye on it, but I hope that the new build drive will help.
So far, since the move, this machine has been consistently failing one reftest - bug 450637 has been filed on that.

Resolving this bug as fixed; with the new drive and the upgrades to buildbot and Python, I think it's safe to do so.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Whiteboard: [orange]
Product: mozilla.org → Release Engineering