log system information current usage (cpu, swap, ram) in talos

RESOLVED WONTFIX

Status

Testing
Talos
RESOLVED WONTFIX
3 years ago
3 years ago

People

(Reporter: parkouss, Assigned: parkouss)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

3 years ago
The idea is to try to find the reason for the intermittent bug 1220327.

We could log system information to see whether this kind of failure is associated with overloaded systems.
(Assignee)

Comment 1

3 years ago
Created attachment 8691546 [details]
MozReview Request: Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher

Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher
Attachment #8691546 - Flags: review?(jmaher)
Comment on attachment 8691546 [details]
MozReview Request: Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher

https://reviewboard.mozilla.org/r/26127/#review23541

::: testing/talos/talos/run_tests.py:195
(Diff revision 1)
> +            system_collector.collect_and_log()

this will show at the beginning and the end of each overall test, but not necessarily when the browser hangs- we might get something out of this, but I suspect we will want to do this from javascript or on a timer like we do the counters.

::: testing/talos/talos/system_data.py:62
(Diff revision 1)
> +        LOG.debug("SystemData: %s" % self._format_data(data))

will we actually log.debug by default?
Attachment #8691546 - Flags: review?(jmaher)
(Assignee)

Comment 3

3 years ago
https://reviewboard.mozilla.org/r/26127/#review23541

> this will show at the beginning and the end of each overall test, but not necessarily when the browser hangs- we might get something out of this, but I suspect we will want to do this from javascript or on a timer like we do the counters.

Yes, but that can give us an idea of the overall system state at a low cost. From JavaScript, it won't be easy to get system state information; you would get browser-related information instead, I guess. Since the idea here is to test the system load, I am not sure that taking values while the browser is running is really valuable. But if we want to make this a counter instead (one that can be tracked on perfherder/graphserver), that could be interesting.

> will we actually log.debug by default?

yes, https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/testing/talos.py#375.
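
For readers following the review, here is a minimal sketch of what a collect-and-log step of this kind could look like. It is not the code from the attachment: the class name `SystemCollector`, the logger name, the `psutil_sample` helper and the output format are all assumptions made for illustration. The default sampler assumes the third-party psutil package is installed; any callable returning a dict of percentages can be passed instead.

```python
import logging

# Hypothetical logger name; the real patch uses talos' own LOG object.
LOG = logging.getLogger("talos.system_data")


def psutil_sample():
    """Read CPU, RAM and swap usage (in percent) using psutil."""
    import psutil  # third-party; imported lazily so this module loads without it

    return {
        "cpu": psutil.cpu_percent(interval=None),
        "ram": psutil.virtual_memory().percent,
        "swap": psutil.swap_memory().percent,
    }


class SystemCollector(object):
    """Hypothetical stand-in for the collector discussed in the review."""

    def __init__(self, sample=psutil_sample):
        # `sample` is any callable returning {name: percent}.
        self._sample = sample

    @staticmethod
    def _format_data(data):
        # Sort the keys so that log lines are stable and easy to grep.
        return ", ".join("%s=%.1f%%" % (key, data[key]) for key in sorted(data))

    def collect_and_log(self):
        data = self._sample()
        LOG.debug("SystemData: %s", self._format_data(data))
        return data
```

Note that the line is emitted at debug level, so it only shows up when the logger is configured for DEBUG, which is the point raised in the review question above.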
what do you think about printing this out every 1000ms ?
(Assignee)

Comment 5

3 years ago
(In reply to Joel Maher (:jmaher) from comment #4)
> what do you think about printing this out every 1000ms ?

Sure, that will involve a thread, of course. I will push a review for you soon. :)
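
As an illustration of the "log every 1000ms from a thread" idea, a minimal sketch could look like the following. This is not the code from the updated review request: the class name `PeriodicLogger` and its parameters are hypothetical, and the callback is assumed to be whatever function collects and logs the system data.

```python
import threading


class PeriodicLogger(threading.Thread):
    """Call `callback` every `interval` seconds until stop() is called."""

    def __init__(self, callback, interval=1.0):
        super().__init__()
        self.daemon = True  # never keep the harness alive at exit
        self._callback = callback
        self._interval = interval
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() returns False on timeout and True once the event is
        # set, so the loop ticks every `interval` seconds and exits promptly
        # when stop() is called.
        while not self._stop_event.wait(self._interval):
            self._callback()

    def stop(self):
        self._stop_event.set()
        self.join()
```

A test run would start the thread before launching the browser and call `stop()` once the test finishes. Waiting on an event rather than sleeping means the thread does not delay shutdown by up to a full interval.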
(Assignee)

Comment 6

3 years ago
Comment on attachment 8691546 [details]
MozReview Request: Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/26127/diff/1-2/
Attachment #8691546 - Flags: review?(jmaher)
Attachment #8691546 - Flags: review?(jmaher) → review+
Comment on attachment 8691546 [details]
MozReview Request: Bug 1227626 - log system information current usage (cpu, swap, ram) in talos. r=jmaher

https://reviewboard.mozilla.org/r/26127/#review23581

thanks for making this in a thread.  I assume this passes flake8 :)
(Assignee)

Comment 8

3 years ago
(In reply to Joel Maher (:jmaher) from comment #7)
> thanks for making this in a thread.  I assume this passes flake8 :)

Eh, it has to now! I will push to try to be sure everything works on all platforms.
this is great on try, but in seeing failed jobs I do not see any difference in resource usage.  Personally I do not think we should land this- unless we want to land this and not collect by default.
(Assignee)

Comment 11

3 years ago
(In reply to Joel Maher (:jmaher) from comment #10)
> this is great on try, but in seeing failed jobs I do not see any difference
> in resource usage.  Personally I do not think we should land this- unless we
> want to land this and not collect by default.

Yeah, I agree, we should not land this, as it does not look like it adds any value.

Maybe we can rename the bug for further exploration with the pageloader though.
(Assignee)

Comment 12

3 years ago
(In reply to Julien Pagès (:parkouss) from comment #11)
> Maybe we can rename the bug for further exploration with the pageloader
> though.

Oh, now that I read this again, that does not sound good. We can use the intermittent bug for that, which would make more sense. So, closing this bug as WONTFIX since we now know that this is not helpful. Feel free to reopen, Joel, if you think otherwise.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → WONTFIX