Enhance the xpcshell harness to provide per test resource usage data

Status: ASSIGNED (enhancement)

(Reporter: mihneadb, Assigned: mihneadb)

(Depends on 1 bug, Blocks 1 bug)

Attachments: 1 attachment, 14 obsolete; final patch 14.97 KB, review+ from gps.

All the xpcshell processes that we open are wrapped with psutil[1] (if available), so we can use that to get this data.

[1] http://code.google.com/p/psutil/
Blocks: 895225
I think this should be built on the structured logging work.

When head.js is about to run tests, send a structured message that instructs Python to take a resource usage "snapshot." Then, when tests have finished but before process exit, send another message for a snapshot. Then, the Python process will have visibility into the xpcshell process init overhead *and* the test-specific resource usage.
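For illustration, the Python-side "snapshot" could be little more than a psutil read taken when such a structured message arrives. This is a sketch only; the field names below are hypothetical, not the harness's actual schema:

```python
import psutil

def snapshot(proc):
    """Take a resource-usage snapshot of one xpcshell process.

    `proc` is a psutil.Process wrapping the xpcshell PID. The keys are
    illustrative; the real harness can record whatever psutil exposes.
    """
    cpu = proc.cpu_times()
    mem = proc.memory_info()
    return {
        "cpu_user": cpu.user,
        "cpu_system": cpu.system,
        "rss": mem.rss,
        "vms": mem.vms,
    }
```

Diffing the init-time snapshot against the post-test one would separate startup overhead from per-test cost.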
Yes, I think this is the proper way to do it. I'll try to come up with a prototype before structured logging lands; it will be useful anyway.
To answer your question in bug 895225, if we can get this data with no or little performance penalty, we should have it always on.
(In reply to Gregory Szorc [:gps] from comment #1)
> I think this should be built on the structured logging work.
> 
> When head.js is about to run tests, send a structured message that instructs
> Python to take a resource usage "snapshot." Then, when tests have finished
> but before process exit, send another message for a snapshot. Then, the
> Python process will have visibility into the xpcshell process init overhead
> *and* the test-specific resource usage.

I talked to Chris about this and with the current state of things this approach would not be helpful.

We would need some sort of a socket over which the harness could talk to head.js and establish a "handshake". The current implementation just outputs data and so depending on the OS scheduler & such we might end up finding out (on the python side) that xpcshell has started when the process is actually done.

I think I can come up with an approach that will give us relevant data on top of what we have now. Will be back with a patch.


(In reply to Gregory Szorc [:gps] from comment #3)
> To answer your question in bug 895225, if we can get this data with no or
> little performance penalty, we should have it always on.

So if the perf is ok, we don't mind filling the logs with more stuff? Wasn't sure if it would be ok or not.
(In reply to Mihnea Dobrescu-Balaur (:mihneadb) from comment #4)
> (In reply to Gregory Szorc [:gps] from comment #1)
> > I think this should be built on the structured logging work.
> > 
> > When head.js is about to run tests, send a structured message that instructs
> > Python to take a resource usage "snapshot." Then, when tests have finished
> > but before process exit, send another message for a snapshot. Then, the
> > Python process will have visibility into the xpcshell process init overhead
> > *and* the test-specific resource usage.
> 
> I talked to Chris about this and with the current state of things this
> approach would not be helpful.
> 
> We would need some sort of a socket over which the harness could talk to
> head.js and establish a "handshake". The current implementation just outputs
> data and so depending on the OS scheduler & such we might end up finding out
> (on the python side) that xpcshell has started when the process is actually
> done.

Right. xpcshell would need to wait for an ack before continuing. It's possible to open a socket from JS. nsISocketServer or something like that.
 
> I think I can come up with an approach that will give us relevant data on
> top of what we have now. Will be back with a patch.

Yes, having data - even if it's not ideally or robustly obtained - is better than no data.

> (In reply to Gregory Szorc [:gps] from comment #3)
> > To answer your question in bug 895225, if we can get this data with no or
> > little performance penalty, we should have it always on.
> 
> So if the perf is ok, we don't mind filling the logs with more stuff? Wasn't
> sure if it would be ok or not.

Long term we'll be writing machine readable files and relying on logs less. In that future, writing a little extra metadata for each test shouldn't be a big deal.
Here we go.

CPU_percentage is sketchy because it is called with interval=0 (see docs). I kept
it in there just in case. The io_counters feature is unavailable on OS X, sadly.
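The interval=0 caveat is easy to see with psutil directly: the first cpu_percent() call only primes the internal counters, so its return value is meaningless on its own. A sketch against the current process (not the harness code):

```python
import time
import psutil

proc = psutil.Process()  # measuring this process, purely for illustration
# With interval=0/None the first call just primes psutil's CPU counters
# and returns a meaningless 0.0 -- which is why a single reading is sketchy.
first = proc.cpu_percent(interval=None)
_ = sum(i * i for i in range(200000))  # burn a little CPU
time.sleep(0.1)
# A second call reports utilization since the priming call.
second = proc.cpu_percent(interval=None)
```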

Try push incoming.
Assignee: nobody → mihneadb
Status: NEW → ASSIGNED
More details in the output, also more "machine readable" (all json).

https://tbpl.mozilla.org/?tree=Try&rev=37480022278f
Attachment #793235 - Attachment is obsolete: true
Will probably have to add some condition to not perform this stuff while on mobile (since we don't have processes but file objects). Oh well.
Proper path handling for test_file.
Attachment #793263 - Attachment is obsolete: true
Using this to get a json file with all the data.
Posted file resources.json (obsolete) —
json file from OS X
Improving the structure a bit. Also I found that we have to run this with
--sequential to get better data.
Attachment #793282 - Attachment is obsolete: true
Better structure, added duration.
Attachment #793565 - Attachment is obsolete: true
Rebased
Attachment #793287 - Attachment is obsolete: true
> Improving the structure a bit. Also I found that we have to run this with
> --sequential to get better data.

Why? I thought psutil would grab counters from individual processes?
It does. But the process still has to exist in the PID table, and for short running tests, because of OS scheduling & such, you end up missing reading the process data.
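A minimal reproduction of that race, assuming psutil is available: once the harness has waited on a short-lived child, the PID has already been reaped and psutil has nothing left to read:

```python
import subprocess
import psutil

child = subprocess.Popen(["true"])
child.wait()  # child exits and is reaped; its PID leaves the table
try:
    cpu = psutil.Process(child.pid).cpu_times()
except psutil.NoSuchProcess:
    cpu = None  # the data is simply gone -- the race short tests hit
```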
Ok, I stitched together a graph for the resource usage data [1]. You have to point it to the resources.json file that you can generate with the 2nd patch in this bug. If you only want to try it out, there's a link to a default file on the page (and one attached to the bug).


[1] http://www.mihneadb.net/xpcshelltest-resource-usage
Posted file resources.json (linux) (obsolete) —
Folded into one patch. I think we should merge this into m-c. Other people could
use it this way.
Attachment #793815 - Flags: review?(gps)
Attachment #793576 - Attachment is obsolete: true
Attachment #793578 - Attachment is obsolete: true
(In reply to Mihnea Dobrescu-Balaur (:mihneadb) from comment #17)
> It does. But the process still has to exist in the PID table, and for short
> running tests, because of OS scheduling & such, you end up missing reading
> the process data.

If we're not actually collecting good data, then what's the point?

I see a few solutions:

1) Have the xpcshell process send a signal to Python to collect resource usage and wait for an ack before proceeding. You could potentially do this over stdin and stdout. You could also use this to measure resource usage of individual phases - e.g. process startup, actual test execution, just before process exit, etc.

2) Grab resource usage from JS and send it back to Python. There is some code at https://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryPing.js#104 for grabbing I/O counters on Windows. You could potentially extract that code into a standalone JSM and then add support for Linux and OS X (although Windows is the most important - so shipping with just it would be a win).

3) Something else???
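Option 1 can be sketched end-to-end over stdin/stdout. Here a small Python child stands in for head.js, which would print its marker and then do a blocking read exactly as described:

```python
import subprocess
import sys

# Stand-in for head.js: announce a snapshot point, then block until acked.
child_code = (
    "import sys\n"
    "print('RESOURCE_SNAPSHOT'); sys.stdout.flush()\n"
    "sys.stdin.readline()\n"  # blocking read: wait for the harness ack
    "print('TESTS_DONE')\n"
)
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        universal_newlines=True)

line = proc.stdout.readline().strip()
if line == "RESOURCE_SNAPSHOT":
    # The child is provably alive here, so a psutil snapshot is safe.
    proc.stdin.write("ACK\n")
    proc.stdin.flush()
out, _ = proc.communicate()
```

The ack guarantees the process still exists when the harness reads its counters, at the cost of one round trip per phase.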
(In reply to Gregory Szorc [:gps] from comment #21)
> (In reply to Mihnea Dobrescu-Balaur (:mihneadb) from comment #17)
> > It does. But the process still has to exist in the PID table, and for short
> > running tests, because of OS scheduling & such, you end up missing reading
> > the process data.
> 
> If we're not actually collecting good data, then what's the point?
>

Well, it's still relevant data, it's just not complete.
 
> I see a few solutions:
> 
> 1) Have the xpcshell process send a signal to Python to collect resource
> usage and wait for an ack before proceeding. You could potentially do this
> over stdin and stdout. You could also use this to measure resource usage of
> individual phases - e.g. process startup, actual test execution, just before
> process exit, etc.

We could do this, but this will definitely have to be behind a flag, since it will affect perf.

> 
> 2) Grab resource usage from JS and send it back to Python. There is some
> code at
> https://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/
> TelemetryPing.js#104 for grabbing I/O counters on Windows. You could
> potentially extract that code into a standalone JSM and then add support for
> Linux and OS X (although Windows is the most important - so shipping with
> just it would be a win).
>

I'll look into this as well.
 
> 3) Something else???

If I can find a way to make Popen keep the process alive (valid PID) until I want it to, we can solve it just from the python side.
(In reply to Mihnea Dobrescu-Balaur (:mihneadb) from comment #22)
> If I can find a way to make Popen keep the process alive (valid PID) until I
> want it to, we can solve it just from the python side.

I /think/ there is a low-level system-y way to do this, but my low-level POSIX knowledge is rusty. Furthermore, a solution on Windows would likely be very different.

I think it would be acceptable for head.js to send a message to Python just before shutdown then wait for Python to "ack" that signal. You could employ blocking I/O for this. Once the message is sent from head.js, do a blocking read on stdin.
(In reply to Mihnea Dobrescu-Balaur (:mihneadb) from comment #22)
> If I can find a way to make Popen keep the process alive (valid PID) until I
> want it to, we can solve it just from the python side.

If you're using .communicate() then it's always going to wait on the process and reap it. On Linux you can use waitid to wait for the process to finish without reaping it; I've done that before to grab data out of /proc/$pid:
http://hg.mozilla.org/users/tmielczarek_mozilla.com/procio/file/3edb270a07d5/procio.c#l84

I don't know how to do that on Windows or Mac though.
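Python exposes the same trick through os.waitid with WNOWAIT: the child is waited for but left unreaped, so /proc/<pid> stays readable. A Linux-only sketch:

```python
import os
import subprocess

child = subprocess.Popen(["true"])
# Wait for the child to exit, but do NOT reap it -- WNOWAIT keeps the
# zombie entry alive. Linux-specific; Windows/Mac need another approach.
os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOWAIT)
# The PID still exists in the table, so per-process data is readable.
with open("/proc/%d/stat" % child.pid) as f:
    stat_line = f.read()
child.wait()  # now actually reap it
```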
Comment on attachment 793815 [details] [diff] [review]
Enhance the xpcshell harness to provide per test resource usage data

Review of attachment 793815 [details] [diff] [review]:
-----------------------------------------------------------------

Cancelling review until this is more robust.
Attachment #793815 - Flags: review?(gps)
Depends on: 909577
Updated for the latest ProcessInfo.jsm
Attachment #793815 - Attachment is obsolete: true
Attachment #799205 - Attachment is obsolete: true
Attachment #793291 - Attachment is obsolete: true
Attachment #793729 - Attachment is obsolete: true
I'm not sure what else I could/should do to make this better.
Attachment #801957 - Flags: review?(gps)
Attachment #799206 - Attachment is obsolete: true
Comment on attachment 801957 [details] [diff] [review]
Enhance the xpcshell harness to provide per test resource usage data

Review of attachment 801957 [details] [diff] [review]:
-----------------------------------------------------------------

I love it! Just have a bunch of nits.

::: testing/xpcshell/head.js
@@ +389,5 @@
> +    Components.utils.import("resource://gre/modules/ProcessInfo.jsm");
> +    let resources = {
> +      IO: ProcessInfo.getIOCounters(),
> +      CPU: ProcessInfo.getCPUTimes(),
> +      MEM: ProcessInfo.getMemoryUsage()

Nit: use lower case keys.

::: testing/xpcshell/mach_commands.py
@@ +227,5 @@
>      @CommandArgument('--rerun-failures', action='store_true',
>          help='Reruns failures from last time.')
> +    @CommandArgument('--resource-usage-log', default=None,
> +        dest='resource_usage_log',
> +        help='Path for dumping JSON data of used resources.')

I wonder if enabled-by-default is sensible. Presumably we'll have this enabled by default on automation eventually. Easy to do in a followup though.

::: testing/xpcshell/runxpcshelltests.py
@@ +620,5 @@
> +                if self.resource_usage_list is not None:
> +                    self.resource_data["duration"] = timeTaken
> +                    # get test file path relative to $SRC_DIR
> +                    full_path = self.test_object["path"]
> +                    if "xpcshell" in full_path:

This seems fragile. What if xpcshell exists in the middle of the path or at the end? I think you should look for a "xpcshell" prefix somehow.

@@ +1263,5 @@
>          self.try_again_list = []
>  
> +        resource_usage_list = None
> +        if resource_usage_log:
> +            resource_usage_list = []

I'd just call it "resource_usage."

@@ +1283,5 @@
>              'logfiles': self.logfiles,
>              'xpcshell': self.xpcshell,
>              'xpcsRunArgs': self.xpcsRunArgs,
> +            'failureManifest': failureManifest,
> +            'resource_usage_list': resource_usage_list,

This is why adding trailing commas is good style :)

@@ +1462,5 @@
>  
> +        if resource_usage_log:
> +            with open(resource_usage_log, 'w') as f:
> +                json.dump(resource_usage_list, f, sort_keys=True, indent=2)
> +                print "Resource usage log stored at %s." % resource_usage_log

An issue with JSON files where a list is the top-level element is that it's bad for security.

http://stackoverflow.com/questions/3503102/what-are-top-level-json-arrays-and-why-are-they-a-security-risk

Write out an object with a single list element. Or, change the data structure to be something like:

{
    "test_foo.js": [{<run 1>}, {<run 2>}],
    "test_bar.js": ...
}
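A sketch of writing that shape (test names and fields here are made up; the point is just the object wrapper at the top level):

```python
import json

# Hypothetical per-test results, keyed by test path as suggested above.
resource_usage = {
    "test_foo.js": [{"duration": 0.12}, {"duration": 0.11}],
    "test_bar.js": [{"duration": 0.40}],
}
# Top-level object, not a bare list, to avoid the JSON-array pitfall.
payload = json.dumps({"resource_usage": resource_usage},
                     sort_keys=True, indent=2)
```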

@@ +1539,5 @@
>                          action="store",
>                          help="path to file where failure manifest will be written.")
> +        self.add_option("--resource-usage-log", default=None,
> +                        dest='resource_usage_log',
> +                        help='Path for dumping JSON data of used resources.')

If you define a log as an append-only file or stream, then this really isn't a log since you are just dumping a JSON data structure. Let's call it "--resource-usage-file."
Attachment #801957 - Flags: review?(gps) → feedback+
I dropped the prettifying of the test_path for now.

Updated according to the comments, also updated the resource viewer[1].


[1] https://github.com/mihneadb/xpcshelltest-resource-usage/commit/a78874b837bfc198d99ac74306f5eee4b2ab7f40
Attachment #804738 - Flags: review?(gps)
Attachment #801957 - Attachment is obsolete: true
Comment on attachment 804738 [details] [diff] [review]
Enhance the xpcshell harness to provide per test resource usage data

Review of attachment 804738 [details] [diff] [review]:
-----------------------------------------------------------------

\o/

::: testing/xpcshell/runxpcshelltests.py
@@ +618,5 @@
>                  self.xunit_result["time"] = now - startTime
>  
> +                if self.resource_usage_results is not None:
> +                    self.resource_data["duration"] = timeTaken
> +                    self.resource_usage_results[self.test_object["path"]] = self.resource_data

Can't we have multiple measurements per test? Shouldn't we store those separately instead of last write wins? I think it would be interesting to e.g. run tests N times and verify the resource usage is consistent!
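Storing a list per test instead of last-write-wins is a one-liner with setdefault (the helper name is hypothetical):

```python
results = {}

def record(results, test_path, data):
    # Append each run's measurements so repeated runs all survive,
    # instead of the last write overwriting earlier ones.
    results.setdefault(test_path, []).append(data)

record(results, "test_foo.js", {"duration": 0.12})
record(results, "test_foo.js", {"duration": 0.13})
```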

@@ +1454,5 @@
>              return False
>  
> +        if resource_usage_file:
> +            resource_usage_data = {
> +                'version': 1.0,

Integer version 1, please. You aren't releasing software here, so we don't need minor version numbers.
Attachment #804738 - Flags: review?(gps) → review+
Let's do the multiple measures per test in a follow up.
Attachment #808226 - Flags: review?(gps)
Attachment #804738 - Attachment is obsolete: true
Comment on attachment 808226 [details] [diff] [review]
Enhance the xpcshell harness to provide per test resource usage data

Review of attachment 808226 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/xpcshell/mach_commands.py
@@ +97,5 @@
>              'debugger': debugger,
>              'debuggerArgs': debuggerArgs,
>              'debuggerInteractive': debuggerInteractive,
> +            'rerun_failures': rerun_failures,
> +            'resource_usage_file': resource_usage_file

Nit: Add the trailing comma so you don't have to change 2 lines the next time you add something.
Attachment #808226 - Flags: review?(gps) → review+