Closed Bug 467791 Opened 16 years ago Closed 16 years ago

Windows Vista Txul talos machines are not stable

Categories

(Release Engineering :: General, defect)

x86
Windows Vista
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 463020

People

(Reporter: dbaron, Unassigned)

Details

The Txul numbers on Windows Vista are not stable -- instead, they're on a roughly-constant upward slope (although the slope has started varying a little more lately).  This seems to indicate some sort of problem with the machines.

See http://graphs.mozilla.org/graph.html#show=787087,787109,787113,1431842
These machines probably just need a reboot.  I'll reboot these manually.

See #463020.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Each drop down to 150 corresponds to a reboot.  Why the numbers get worse over time between reboots is a problem that should be investigated, which is why this bug was filed.

It may be true that rebooting the machines makes the numbers go down, but it doesn't change the fact that they are clearly not stable.
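
One way to put a number on "not stable" is to split the Txul series at each reboot and fit a slope to every run in between. The following is only a minimal sketch: the readings are made up for illustration (they are not taken from graphs.mozilla.org), the function names are hypothetical, and a reboot is merely assumed to show up as a drop of more than 10% from the previous reading.

```python
from statistics import mean

def split_at_reboots(samples, drop=0.1):
    """Split readings into runs, starting a new run whenever a reading
    falls by more than `drop` (a fraction) below the previous one.
    Treating such a drop as a reboot is an assumption of this sketch."""
    runs = [[]]
    for value in samples:
        if runs[-1] and value < runs[-1][-1] * (1 - drop):
            runs.append([])
        runs[-1].append(value)
    return runs

def slope(run):
    """Least-squares slope of the readings against their index,
    i.e. the average change in Txul per test run."""
    n = len(run)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(run)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(run))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

# Made-up readings: two upward drifts separated by one reboot.
readings = [150, 155, 160, 165, 170, 150, 156, 162, 168]
for run in split_at_reboots(readings):
    print(f"{len(run)} readings, slope {slope(run):.2f} per run")
```

A consistently positive slope within each run, reset by every reboot, would describe the pattern reported here; by itself it would not say whether the cause lies in the OS or in our code.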
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Please see https://bugzilla.mozilla.org/show_bug.cgi?id=463020#c27

Significant effort has already been put into investigating why results get worse over time, and the conclusion is that it's O/S related, not related to product or talos code.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
(In reply to bug 463020 comment 27)
> Results being impacted by machine uptime shouldn't be surprising.  We're very
> dependent on all sorts of O/S issues like file system caches and memory
> fragmentation that we have little control over.
Where is the data showing that this randomness is caused by those OS issues?  Until we have that data, we should assume it's something in our code and try to fix it.  In the specific case of this bug, we had a conversation yesterday with a developer (who has been cc'd on this bug) who had an idea about what could be causing the issue.  Until we have data showing that this is caused by OS issues and not our code, we shouldn't be marking it as a duplicate of that bug.

Bugs are cheap.  Consolidating issues with no evidence that they are related comes across as a cheap way to either overlook the issue or keep open bug numbers down.  Neither is constructive to getting this issue fixed.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
In previous investigations of gradual increases in Vista Talos results, we had corroboration from this developer:

https://bugzilla.mozilla.org/show_bug.cgi?id=419620#c6

He's seeing a similar issue, with Vista holding onto resources long after tests have completed.  At that point we determined that the reasonable course of action was to put effort into making the Vista Talos boxes rebootable and then have them reboot automatically.

We aren't trying to bury an issue.  We aren't trying to keep bug numbers down.  We are saying that we aren't in the business of debugging the OS when a simple reboot resolves any weirdness in the numbers.
Please read https://bugzilla.mozilla.org/show_bug.cgi?id=463020#c33

Relying on a system state that is dirty as a side effect of other tests isn't a good way to test how Firefox behaves over the long term.

Please direct further discussion to 463020.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Component: Release Engineering: Talos → Release Engineering
Product: mozilla.org → Release Engineering