Closed Bug 1227626 Opened 9 years ago Closed 9 years ago

log system information current usage (cpu, swap, ram) in talos

Categories

(Testing :: Talos, defect)

Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: parkouss, Assigned: parkouss)

Details

Attachments

(1 file)

The idea is to try to find the reason for the intermittent bug 1220327.

We could log system information to see whether this kind of failure is associated with overloaded systems.
Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher
Attachment #8691546 - Flags: review?(jmaher)
Comment on attachment 8691546 [details]
MozReview Request: Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher

https://reviewboard.mozilla.org/r/26127/#review23541

::: testing/talos/talos/run_tests.py:195
(Diff revision 1)
> +            system_collector.collect_and_log()

this will show at the beginning and the end of each overall test, but not necessarily when the browser hangs- we might get something out of this, but I suspect we will want to do this from javascript or on a timer like we do the counters.

::: testing/talos/talos/system_data.py:62
(Diff revision 1)
> +        LOG.debug("SystemData: %s" % self._format_data(data))

will we actually log.debug by default?
Attachment #8691546 - Flags: review?(jmaher)
https://reviewboard.mozilla.org/r/26127/#review23541

> this will show at the beginning and the end of each overall test, but not necessarily when the browser hangs- we might get something out of this, but I suspect we will want to do this from javascript or on a timer like we do the counters.

Yes, but that can give us an idea of the overall system state at a low cost. From javascript, it won't be easy to get system state information; you would get browser-related information instead, I guess. Since the idea here is to test the system load, I am not sure that taking values while the browser is running is really valuable. But if we want to make this a counter instead (one that can be tracked on perfherder/graphserver), that could be interesting.

> will we actually log.debug by default?

yes, https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/testing/talos.py#375.
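
To illustrate the question about debug output, here is a minimal sketch of how the logger level decides whether a debug message is emitted. It uses Python's standard `logging` module as a stand-in; Talos's own logger setup is not shown in this bug, and the `emits_debug` helper and the logger names are hypothetical.

```python
import io
import logging


def emits_debug(level):
    """Return True if a debug record is written when the logger is set to `level`."""
    stream = io.StringIO()
    # Hypothetical logger name, one per level so repeated calls do not interfere.
    logger = logging.getLogger("talos_sketch_%s" % level)
    logger.setLevel(level)
    logger.propagate = False  # keep the record away from the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        logger.debug("SystemData: %s", {"cpu": 12.5})
    finally:
        logger.removeHandler(handler)
    return "SystemData" in stream.getvalue()
```

With the level set to `logging.DEBUG` the message is written; with `logging.INFO` it is dropped, which is why the default level configured by the harness matters here.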
what do you think about printing this out every 1000ms ?
(In reply to Joel Maher (:jmaher) from comment #4)
> what do you think about printing this out every 1000ms ?

Sure, that will involve a thread ofc. Pushing a review for you soon. :)
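
For readers following along, a rough sketch of what logging from a thread every 1000ms could look like. This is not the patch attached to this bug: the class and function names are hypothetical, and the use of the third-party `psutil` package to read cpu, ram and swap usage is an assumption.

```python
import logging
import threading

LOG = logging.getLogger("system_data_sketch")  # hypothetical logger name


def collect_system_data():
    """Return current usage as a dict; empty if psutil is not installed."""
    try:
        import psutil  # third-party; assumed to be available on the test machines
    except ImportError:
        return {}
    return {
        "cpu_percent": psutil.cpu_percent(interval=None),
        "ram_percent": psutil.virtual_memory().percent,
        "swap_percent": psutil.swap_memory().percent,
    }


class PeriodicSystemLogger(threading.Thread):
    """Log one sample per `interval` seconds until stop() is called."""

    def __init__(self, interval=1.0, collect=collect_system_data, log=LOG.debug):
        super().__init__(daemon=True)
        self.interval = interval
        self._collect = collect
        self._log = log
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait returns False on timeout and True once stop() sets the event.
        while not self._stop_event.wait(self.interval):
            self._log("SystemData: %s" % self._collect())

    def stop(self):
        self._stop_event.set()
        self.join()
```

The thread would be started before the test run and stopped afterwards. Waiting on an event rather than sleeping lets `stop()` return promptly instead of blocking for a full interval.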
Comment on attachment 8691546 [details]
MozReview Request: Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/26127/diff/1-2/
Attachment #8691546 - Flags: review?(jmaher)
Attachment #8691546 - Flags: review?(jmaher) → review+
Comment on attachment 8691546 [details]
MozReview Request: Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher

https://reviewboard.mozilla.org/r/26127/#review23581

thanks for making this in a thread.  I assume this passes flake8 :)
(In reply to Joel Maher (:jmaher) from comment #7)
> thanks for making this in a thread.  I assume this passes flake8 :)

Eh, it has to now! I will push to try to be sure everything works on all platforms.
this is great on try, but in seeing failed jobs I do not see any difference in resource usage.  Personally I do not think we should land this- unless we want to land this and not collect by default.
(In reply to Joel Maher (:jmaher) from comment #10)
> this is great on try, but in seeing failed jobs I do not see any difference
> in resource usage.  Personally I do not think we should land this- unless we
> want to land this and not collect by default.

Yeah, I agree, we should not land this, as it does not look like it adds any value.

Maybe we can rename the bug for further exploration with the pageloader though.
(In reply to Julien Pagès (:parkouss) from comment #11)
> Maybe we can rename the bug for further exploration with the pageloader
> though.

Oh, now that I read this again, that does not sound good. We can use the intermittent bug for that, which would make more sense. So, closing this bug as WONTFIX since we now know that this is not helpful. Feel free to reopen, Joel, if you think otherwise.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX