Closed
Bug 893970
Opened 12 years ago
Closed 12 years ago
compare foopy system performance across datacenters
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: arich, Assigned: arich)
Details
This bug is to test putting foopies in a different datacenter from the attached tegras.
We'll start by setting up a test foopy in scl3 and then tracking metrics on existing and cross-datacenter foopies to see if there are differences.
Adding in :ctalbert -- ateam owns these tests, and would be the one to say "no impact" on tests.
RelEng will be the ones to say "no impact" on tegra device operations.
I don't understand the testing plan -- how are you proposing to load the test foopy?
Comment 2 • 12 years ago (Assignee)
Let me clarify in the summary. This bug is for setting up and tracking the IT work, e.g. measuring system performance, not acceptability for tests or tegra device operations (I'm not sure what the latter means in reference to foopies, but you said it was the purview of releng, so I presume it will be defined in a bug that releng opens). I presume releng and ateam will file and track other bugs to test the portions they can verify as pass/fail, since those are both application-layer things that IT doesn't have the expertise to test.
Summary: compare foopy performance across datacenters → compare foopy system performance across datacenters
Updated • 12 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations
For the tests, the most reliable measurement will be network latency, bandwidth, and packet loss across the data center connection versus those same numbers from the network that foopies and tegras currently share.
Any changes in these three things will cause the tests to behave in erratic, unpredictable ways. The clearest tests to see this on will be the mochitests. The easiest tests to measure the impact on will be the talos tests because there will be a jump in the numbers corresponding to this effect.
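The cross-path comparison described above can be sketched in Python. This is a minimal, hedged illustration of comparing latency samples from the current shared network against a cross-datacenter path; the sample values, the 2 ms tolerance, and the percentile choice are all illustrative assumptions, not measured data from either datacenter:

```python
import statistics

def p95(samples):
    """95th percentile via a simple sorted-index approximation."""
    s = sorted(samples)
    return s[min(len(s) - 1, int(round(0.95 * (len(s) - 1))))]

def compare_latency(baseline_ms, candidate_ms, tolerance_ms=2.0):
    """Summarize two latency sample sets and flag whether the candidate
    path regressed beyond the tolerance on mean or tail (p95) latency."""
    summary = {
        "baseline_mean": statistics.mean(baseline_ms),
        "candidate_mean": statistics.mean(candidate_ms),
        "baseline_p95": p95(baseline_ms),
        "candidate_p95": p95(candidate_ms),
    }
    regressed = (
        summary["candidate_mean"] - summary["baseline_mean"] > tolerance_ms
        or summary["candidate_p95"] - summary["baseline_p95"] > tolerance_ms
    )
    return summary, regressed

# Hypothetical ping round-trip samples (ms): same-network path vs.
# cross-datacenter path with one tail spike from a congested hop.
same_dc = [0.4, 0.5, 0.4, 0.6, 0.5, 0.4]
cross_dc = [3.1, 3.4, 3.0, 3.6, 3.2, 9.8]
summary, regressed = compare_latency(same_dc, cross_dc)
```

Comparing the tail as well as the mean matters here because an occasional latency spike, not the average, is what tends to show up as intermittent test failures.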
Currently the tegras have a very low intermittent failure rate, something we all worked very hard to create and maintain. It's exceptionally hard to measure changes to that rate from a staging instance - and experience shows that we don't usually do it well. I have my doubts that we'll be able to truly see intermittent destabilization on main tests with any kind of staging setup.
Our best bet test-wise would be to run the talos tests and compare the numbers we get with those of the current setup. We can also try running a constant build through many test cycles on the existing setup as well as the new setup, which might give us some indication of what effect the change in foopy location has on the robustness of the tests. But, as I said above, we have tried this in the past and have not been successful at predicting stability/instability figures at the low volumes of jobs produced in staging environments. (The android panda rollout was a great example of this.)
I'm not sure what the timeline is for doing this, but I'm going to be very hesitant to commit folks to this testing and analysis until our existing 2013Q3 priorities are complete.
Comment 4 • 12 years ago (Assignee)
Email conversation indicates that we aren't going to attempt to put foopies in scl3 while we have tegras in evelyn, so this is no longer needed.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX