Closed Bug 1034290 Opened 5 years ago Closed 4 years ago

Integrate Structured Logging with Reftests

Categories

(Testing :: Mozbase, defect)

x86
macOS
defect
Not set

Tracking

(firefox46 fixed, firefox47 fixed)

RESOLVED FIXED
mozilla47
Tracking Status
firefox46 --- fixed
firefox47 --- fixed

People

(Reporter: chmanchester, Assigned: ahal)

References

(Blocks 3 open bugs)

Details

Attachments

(2 files, 3 obsolete files)

No description provided.
This is a work in progress for (sequential) desktop reftests. It's working pretty well on try:

With raw logs in blobber:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=dab213f46a30

And an earlier version with a seeded failure and reftest-analyzer compatible output:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d186ec479070
Assignee: nobody → cmanchester
Status: NEW → ASSIGNED
This is a work in progress. I still need to determine whether we can use the standard tbplformatter to produce reftest-analyzer compatible output.
Attachment #8516154 - Attachment is obsolete: true
I took this a little further to log test identifiers with reftypes.
Attachment #8522397 - Attachment is obsolete: true
I hacked up a version of the reftest analyzer that works with structured logging for the web-platform-tests [1]; it's the same but reading the input from a structured log file rather than using regexps. I suggest we target this rather than going for compatibility with the existing analyzer.

[1] http://hoppipolla.co.uk/410/reftest-analyser-structured.xhtml
I'm going to look into finishing this up. Chris, do you remember what there's still left to do? Is it just reftest analyzer + testing? Or was there something you got stuck on?
Flags: needinfo?(cmanchester)
Great to hear that. Hopefully what I have here will be helpful. These were my open questions at the time:

I didn't integrate with the parallel reftest stuff. My impression is that that effort has died, so that might not be a problem.

The "ReftestFormatter" in my patch goes to lengths to re-construct output in the harness in a way that will work with the reftest analyzer. This was tested and effective at the time, but as jgraham notes in comment 4 we may be better off exposing this further downstream. I haven't looked in to what it would take to coordinate this integration with treeherder.

I didn't get this working with Android or B2G -- those platforms still use automation.py, so will probably be some raw log parsing to kill off.

jsreftests are handled by their own code path in reftest.js, I only did cursory testing those were handled in a way that made sense.

Feel free to ping me about anything else related to this, I'm happy to lend a hand.
Flags: needinfo?(cmanchester)
Bug 1034290 - Use structured log output for test results in desktop reftests.
Comment on attachment 8533866 [details] [diff] [review]
Use structured log output for test results in desktop reftests

The above is simply this patch rebased onto latest mozilla-central. There was a lot of bitrot. Haven't tested or dug into the changes yet, but this should at least give me a place to start from.
Attachment #8533866 - Attachment is obsolete: true
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Bug 1034290 - Use structured log output for test results in desktop reftests.
Depends on: 1201616
Depends on: 1204015
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Bug 1034290 - Use structured log output for test results in desktop reftests.
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Bug 1034290 - Use structured log output for test results in desktop reftests.
Quick update:

This is still on my radar, but has been taking a back seat to bug 1171033. Since comment 8, I've gotten both desktop and android working with structured logging. I'm a little stuck on B2G though. Basically, the reftest-content.js script just never seems to load for some reason. But my biggest problem is that I can't get my B2G builds to work anymore, so can't test it locally.
Depends on: 1219870
Depends on: 1228595
Did a bit more pushing on this, here's the latest state:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5b5fc6e8ab4b&selectedJob=14176596

B2G Emulator: this is a shutdown segfault in indexedDB (filed bug 1228595). Did some digging and it actually happens right now, but prior to this patch the jobs weren't turning red. If there is no progress on bug 1228595 anytime soon, I'll see if I can hack around the issue in order to get this landed.

Mulet: Still need to hook this up to the OutputHandler class. Tests are running, but the raw logs are just being dumped to stdout unprocessed.
No longer depends on: 1228595
Mulet fixed:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=973bad7006ef

B2G Emulator: did some more digging, and it's actually red because the magic string 'REFTEST TEST-START | Shutdown' was getting formatted after the place that checks for it and therefore the output loop never exited. The fact that there was a segfault immediately prior in the log was pure coincidence.

I've also tested the reftest analyzer, and it still works despite the slightly different log format. So looks like all issues fixed \o/. Will have the patch up for review shortly.
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/4-5/
Attachment #8655032 - Attachment description: MozReview Request: Bug 1034290 - Use structured log output for test results in desktop reftests. → MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=chmanchester
Attachment #8655032 - Flags: review?(cmanchester)
Sorry for the large hard to review patch. I don't think there's any good way of breaking it down. Here is the try run for the latest patch:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=761b306f46f6

Jobs still turn orange, reftest-anaylzer still works (except android.. it never worked for android). Will do a lot of re-triggers before landing though.
How does this change the output that developers see when running reftest (locally, or looking at logs on treeherder)?

(I'm a little concerned about the "Result summary:" bit going away; I do look at that pretty regularly, because it lets me see a useful summary when I forgot to redirect the "./mach reftest" output to a file.)
My goal was to get this landed without breaking anything or changing the status quo. The logs should be nearly identical (there are some very minor changes, like the "loading a blank page" message is now logged at the debug level).

The result summary is still there, it's just getting formatted by mozlog on the python side (see ReftestFormatter.suite_end in output.py).
I'm going through this and will give feedback, but we need another reviewer as well (I recognize a lot of this from the original patch, and we should have someone more familiar with reftests look at it).
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

https://reviewboard.mozilla.org/r/17797/#review25591

::: build/automation.py.in:8
(Diff revision 5)
> +import json

I don't see this import used.

::: build/mobile/b2gautomation.py:450
(Diff revision 5)
> -            return '\n'.join(lines)
> +            return lines

I don't think I understand this part -- in the case we get more lines, where are they actually logged?

::: build/mobile/remoteautomation.py:98
(Diff revision 5)
> -    def waitForFinish(self, proc, utilityPath, timeout, maxTime, startTime, debuggerInfo, symbolsPath):
> +    def waitForFinish(self, proc, utilityPath, timeout, maxTime, startTime, debuggerInfo, symbolsPath, outputHandler=None):

I don't see this parameter used.

::: build/mobile/remoteautomation.py:333
(Diff revision 5)
> -                return [newLogContent]
> +                return True

Just so I understand, this change to return a boolean from this function is cleanup, and doesn't really change behavior?

::: layout/tools/reftest/output.py:30
(Diff revision 5)
> +            # Output from the "automationutils" component requires that no prefix be added
> +            # so that mozharness will pick up these failures.
> +            return "%s\n" % data['message']

It looks like automationutils has been deleted and the leak log stuff has been overhauled since I last looked at it.

::: layout/tools/reftest/output.py:57
(Diff revision 5)
> +        if extra:

This "if" isn't necessary.

::: layout/tools/reftest/reftest.js:186
(Diff revision 5)
> -    if (!gVerbose) {
> -        // In verbose mode, we've dumped all these messages already.
> +    // In debug mode, we've dumped all these messages already.
> +    if (gLogLevel !== 'debug') {

We don't seem to be using the idea of a log level a lot, I think this could just as well still be a verbose flag.

::: layout/tools/reftest/reftest.js:510
(Diff revision 5)
>          // tURLs is a temporary array containing all active tests
>          var tURLs = new Array();
> +        var tIDs = new Array();

Maybe add a note about tIDs to this comment.

::: layout/tools/reftest/reftest.js:1806
(Diff revision 5)
> -            gDumpLog("REFTEST TEST-UNEXPECTED-PASS | " + gURLs[0].prettyPath +
> -                 " | assertion count " + numAsserts + " is less than " +
> -                 expectedAssertions + "\n");
> +            logger.testStatus(gURLs[0].identifier, "assertion check", "PASS", "FAIL",
> +                              "assertion count " + numAsserts + " is less than " +
> +                              expectedAssertions);

Test status seems wrong here, this is always going to happen between tests. Log a testStart/testEnd pair here?

::: layout/tools/reftest/reftest.js:1934
(Diff revision 5)
> -    gDumpLog("REFTEST TEST-UNEXPECTED-FAIL | " + gCurrentURL + " | " + what + "\n");
> +    logger.error([gCurrentURL, what]);

This isn't supposed to be an array.

::: layout/tools/reftest/remotereftest.py:23
(Diff revision 5)
> +from runreftest import RefTest, ReftestResolver

This is duped from line 14.

::: layout/tools/reftest/runreftest.py:66
(Diff revision 5)
> +log = logging.getLogger()

I think this and the import logging above can go away now...

::: layout/tools/reftest/runreftest.py:559
(Diff revision 5)
> -                                           symbolsPath=symbolsPath,
> +            if message['action'] == 'test-start':

test_start

::: layout/tools/reftest/runreftest.py:603
(Diff revision 5)
> +        if self.slog is not None:
> +            crashed = mozcrash.log_crashes(self.slog, os.path.join(profile.profile, 'minidumps'),
> +                                           symbolsPath, test=self.lastTestSeen)

Populating the structured logger isn't treated as optional anymore, I would re-name this to self.log and remove checks like this.

::: testing/modules/StructuredLog.jsm:90
(Diff revision 5)
> -  suiteStart: function (tests, runinfo=null) {
> +  suiteStart: function (tests, runinfo=null, versioninfo=null, deviceinfo=null, extra=null) {

Are these new parameters used?

::: testing/modules/StructuredLog.jsm:163
(Diff revision 5)
> +  processOutput: function(thread, message) {

I don't see this being used.
Attachment #8655032 - Flags: review?(cmanchester)
https://reviewboard.mozilla.org/r/17797/#review25591

> I don't think I understand this part -- in the case we get more lines, where are they actually logged?

The '\n'.join() was moved to line 221 of b2gautomation.py for the old case. In the new case (an outputHandler gets passed in), they get logged in output.py in the reftest harness via StructuredLogger.log_raw

> I don't see this parameter used.

You're correct, but it's needed because of the convoluted inheritance that's happening in mochitest. Without it, there is a TypeError.

> Just so I understand, this change to return a boolean from this function is cleanup, and doesn't really change behavior?

Correct. I noticed that the return value wasn't being used anywhere so simplified it a bit.

> It looks like automationutils has been deleted and the leak log stuff has been overhauled since I last looked at it.

I think this was copied from your original patch, so I don't really understand it myself. I'll remove it and see if it makes any difference on try.

> We don't seem to be using the idea of a log level a lot, I think this could just as well still be a verbose flag.

You're right again, but it's actually easier to use a level as it fits well with the StructuredLog paradigm. So given that a level is more flexible and easier to implement than trying to keep 'gVerbose', I'd like to just keep it. It's also hooked up to the command line argument, which is a nice benefit.

> Test status seems wrong here, this is always going to happen between tests. Log a testStart/testEnd pair here?

Ah, I didn't realize this.. Hmm, this is a little tricky. I guess ideally the testEnd call would be after this happens, doesn't look like doing that will be super straightforward though.

> Are these new parameters used?

Just extra is, but versioninfo and deviceinfo are defined in the StructuredLogging spec, so figured might as well make this implementation compatible while I was changing the signature anyway.

> I don't see this being used.

Correct, it was used in an old variation of this patch. But as 'process_output' is part of the spec, I decided to leave it in rather than remove it.
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/5-6/
Attachment #8655032 - Attachment description: MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=chmanchester → MozReview Request: Bug 1034290 - Use structured log output for test results in reftest
Assignee: cmanchester → ahalberstadt
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/6-7/
https://reviewboard.mozilla.org/r/17797/#review25591

> Ah, I didn't realize this.. Hmm, this is a little tricky. I guess ideally the testEnd call would be after this happens, doesn't look like doing that will be super straightforward though.

Yeah, looking into this some more I don't really have a good answer. I think a brand new test_start/test_end would be both too confusing and too verbose, and putting this inside the existing test is also a little weird. I see two better options:

1. Turn the first two into logger.error, and the last one into logger.info. This doesn't maintain backwards compat.
2. Log these outside of structured logging with gDumpFn and file a follow-up to revisit this.

I'm leaning towards option 2 because I don't have a ton of time for this, but want to get it landed ASAP while it still works.
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/7-8/
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/8-9/
Attachment #8655032 - Flags: review?(jmaher)
Joel, I'm flagging you for review because :chmanchester rightly pointed out that he wrote part of the patch, so shouldn't review it. Sorry it's massive. Please take your time, no rush. It can wait until the new year.

Here's a working try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9600efd0bae3
https://reviewboard.mozilla.org/r/17797/#review25591

> I think this was copied from your original patch, so I don't really understand it myself. I'll remove it and see if it makes any difference on try.

From your try run:

10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | default process: leak threshold set at 0 bytes
10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | plugin process: leak threshold set at 0 bytes
10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | tab process: leak threshold set at 5000 bytes
10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | geckomediaplugin process: leak threshold set at 0 bytes
10:35:31     INFO -  REFTEST INFO |
10:35:31     INFO -  REFTEST INFO | == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 1886
10:35:31     INFO -  REFTEST INFO |
10:35:31     INFO -  REFTEST INFO |      |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
10:35:31     INFO -  REFTEST INFO |      |                                      | Per-Inst   Leaked|   Total      Rem|
10:35:31     INFO -  REFTEST INFO |    0 |TOTAL                                 |       20        0|224187112        0|
10:35:31     INFO -  REFTEST INFO |
10:35:31     INFO -  REFTEST INFO | nsTraceRefcnt::DumpStatistics: 1410 entries
10:35:31     INFO -  REFTEST INFO |
10:35:31     INFO -  REFTEST INFO | TEST-PASS | leakcheck | default process: no leaks detected!

The REFTEST INFO prefix would keep mozharness from recognizing this as a failure if TEST-PASS were instead TEST-UNEXPECTED-FAIL.

> Yeah, looking into this some more I don't really have a good answer. I think a brand new test_start/test_end would be both too confusing and too verbose, and putting this inside the existing test is also a little weird. I see two better options:
> 
> 1. Turn the first two into logger.error, and the last one into logger.info. This doesn't maintain backwards compat.
> 2. Log these outside of structured logging with gDumpFn and file a follow-up to revisit this.
> 
> I'm leaning towards option 2 because I don't have a ton of time for this, but want to get it landed ASAP while it still works.

If we're ok with missing assertion counts in auto-classifications I guess, but leave a comment about it.
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

https://reviewboard.mozilla.org/r/17797/#review25913

a few things here- I would like to understand this more by getting answers to the questions and any appropriate changes made before r=me.

::: layout/tools/reftest/output.py:98
(Diff revisions 4 - 5)
> +        return "INFO | Result summary:\n{}\n".format('\n'.join(lines))

is there a need for SUITE-END?  this seems odd to add it in here.

::: layout/tools/reftest/output.py:99
(Diff revisions 4 - 5)
>  

for structured logging, we are adding hardcoded structure with the |, that doesn't seem logical.

::: layout/tools/reftest/output.py:110
(Diff revisions 4 - 6)
> +        self.suite_finished = False

I assume suite_finished is only used by b2g?  if so, we should add a comment to that.

::: layout/tools/reftest/reftest.js:1326
(Diff revisions 4 - 6)
> -                    logger.testEnd(test.identifier, "FAIL", "OK",
> +                    logger.testEnd(test.identifier, "FAIL", "PASS",

is it ok to s/OK/PASS/ ?

::: layout/tools/reftest/reftest.js:1934
(Diff revisions 4 - 5)
>      logger.error([gCurrentURL, what]);

more hard coding the | character :(

::: layout/tools/reftest/reftest.js:1934
(Diff revisions 5 - 6)
> -    logger.error([gCurrentURL, what]);
> +    logger.error(gCurrentURL + " | " + what);

ack, this seems more hacky rather than structured.

::: build/mobile/b2gautomation.py:236
(Diff revision 8)
> +                        hasattr(self, 'logFinish') and self.logFinish in currentlog):

this is really hacky looking.  If we always have current log, then why do we need to query the outputHandler?  are there cases which outputHandler will have defined suite_finished and logFinish will not be in currentlog?

::: build/mobile/remoteautomation.py:343
(Diff revision 8)
> +            if not lines:

why not elif not lines?

::: build/mobile/remoteautomation.py:351
(Diff revision 8)
> -                    if message['action'] == 'test_start':
> +                    if isinstance(message, dict) and message['action'] == 'test_start':

if we are going to check for the proper type we should check ot see if has_key('action') as well.

::: layout/tools/reftest/reftest.js:174
(Diff revision 8)
> +  gDumpFn("\n" + line + "\n");

why a \n before and after?  can we ensure one before and ignore the after?

::: layout/tools/reftest/reftest.js
(Diff revision 8)
> -    gVerbose = !!env.get("MOZ_REFTEST_VERBOSE");

we should remove references to this in automation and elsewhere.  In fact this patch sets debug=true when this env is set, I would prefer we use MOZ_REFTEST_DEBUG instead.

::: layout/tools/reftest/reftest.js:789
(Diff revision 8)
> +    if (aTest.url2 !== null)

can you add a comment here to help explain this?  I am not sure what aTest.url2 references and why it could possibly be null.

::: layout/tools/reftest/reftest.js:1811
(Diff revision 8)
> -            gDumpLog("REFTEST TEST-UNEXPECTED-FAIL | " + gURLs[0].prettyPath +
> +            gDumpFn("REFTEST TEST-UNEXPECTED-FAIL | " + gURLs[0].prettyPath +

are there bugs on file to convert these hardcoded messages into structured logs?

::: layout/tools/reftest/runreftest.py:214
(Diff revision 8)
> +        if not options.log_tbpl_level and os.environ.get('MOZ_REFTEST_VERBOSE'):

here we should use MOZ_REFTEST_DEBUG instead of verbose == debug.

::: layout/tools/reftest/runreftest.py
(Diff revision 8)
> -            didTimeout = proc.didTimeout

why are we not caring about didTimeout?

::: testing/mozbase/mozlog/mozlog/formatters/tbplformatter.py:135
(Diff revision 8)
> +            sections.append(time_msg)

isn't the time usually at the front of the message?  I am not sure what time would be at the end of the log line
Attachment #8655032 - Flags: review?(jmaher)
Thanks for looking at this so thoroughly and taking the time to try and understand it/ask questions! Explaining it helps me double check things and make sure I know what I'm talking about :). Replies to follow.
https://reviewboard.mozilla.org/r/17797/#review25913

> is there a need for SUITE-END?  this seems odd to add it in here.

Yeah, this is in the formatter so it's where it belongs. It's the way it is so the output can be preserved from how it was before.

> for structured logging, we are adding hardcoded structure with the |, that doesn't seem logical.

This is in the formatter, so it is supposed to be hardcoded here. The goal was to output the exact same format as before so we don't break log parsers (mozharness, reftest-analyzer etc). It is a bit confusing because we aren't *using* the structured logs for anything other than formatting them back into the old format. Actually using the structured logs (i.e uploading them to blobber, replacing log parsing etc) will be follow up work.

> is it ok to s/OK/PASS/ ?

Yep. PASS is the mozlog protocol. This is actually an interdiff you commented on, so I'm not even sure it was being changed from 'OK' to begin with.

> ack, this seems more hacky rather than structured.

This is logging an error message with takes a message string as a parameter. This is also an interdiff and how it was before was an error, so adding the | back in is a bug fix. The full diff would make more sense, you should probably just ignore all these interdiffs.

> this is really hacky looking.  If we always have current log, then why do we need to query the outputHandler?  are there cases which outputHandler will have defined suite_finished and logFinish will not be in currentlog?

I agree, this is hacky.. but it's automation.py, so what else can we expect :).

We need to query outputHandler because this file *might* also be used by mochitest somewhere. At least there is a code path for it, not sure if it's dead code or not though. I thought it safer to just make sure I keep backwards compatibility everywhere and not worry about maintainability, as it's already about as awful as it can get and we want to get rid of these files anyway

> why not elif not lines?

Because we delete the last line in the first if block, so it's possible that lines becomes empty by the time it gets here.

> why a \n before and after?  can we ensure one before and ignore the after?

No, there's no other way to ensure it otherwise. For example, a test could have `dump("foobar");` without a trailing newline. Then our structured message would be appended to the previous message and the line would no longer be valid json. Similarly, if we don't print one after, the next thing to get logged would get appended to our message.

> we should remove references to this in automation and elsewhere.  In fact this patch sets debug=true when this env is set, I would prefer we use MOZ_REFTEST_DEBUG instead.

My aim here was to not break people's environment. I can imagine a "MOZ_REFTEST_VERBOSE" no longer works bug being filed. With that in mind, should I still change it?

> can you add a comment here to help explain this?  I am not sure what aTest.url2 references and why it could possibly be null.

aTest.url2 is the reference url of the test. It can be null for LOAD ONLY tests (e.g crashtests). The mozlog protocol requires a unique identifier for each test. In reftest, the unique identifier is a 3 tuple, the two images to compare, and the comparison operator (== or !=). Or in the case of LOAD ONLY, it's simply the test url.

> are there bugs on file to convert these hardcoded messages into structured logs?

Not yet, but I'll file them. This is a bit tricky because assertions don't really fit into the mozlog protocol all that well. That's why they're just hardcoded for now. Easiest way to get this landed without blocking on a potentially major mozlog change.

> here we should use MOZ_REFTEST_DEBUG instead of verbose == debug.

This ties in to the previous comment, so I'll close this, but leave the other issue open.

> why are we not caring about didTimeout?

It was an unused variable, just a little drive-by cleanup.

> isn't the time usually at the front of the message?  I am not sure what time would be at the end of the log line

Ah, this is a poorly named variable. It's actually the duration message, e.g the little "took 10ms" at the end. I changed the variable name to duration_msg.
https://reviewboard.mozilla.org/r/17797/#review25591

> From your try run:
> 
> 10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | default process: leak threshold set at 0 bytes
> 10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | plugin process: leak threshold set at 0 bytes
> 10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | tab process: leak threshold set at 5000 bytes
> 10:35:31     INFO -  REFTEST INFO | TEST-INFO | leakcheck | geckomediaplugin process: leak threshold set at 0 bytes
> 10:35:31     INFO -  REFTEST INFO |
> 10:35:31     INFO -  REFTEST INFO | == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 1886
> 10:35:31     INFO -  REFTEST INFO |
> 10:35:31     INFO -  REFTEST INFO |      |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
> 10:35:31     INFO -  REFTEST INFO |      |                                      | Per-Inst   Leaked|   Total      Rem|
> 10:35:31     INFO -  REFTEST INFO |    0 |TOTAL                                 |       20        0|224187112        0|
> 10:35:31     INFO -  REFTEST INFO |
> 10:35:31     INFO -  REFTEST INFO | nsTraceRefcnt::DumpStatistics: 1410 entries
> 10:35:31     INFO -  REFTEST INFO |
> 10:35:31     INFO -  REFTEST INFO | TEST-PASS | leakcheck | default process: no leaks detected!
> 
> The REFTEST INFO prefix would keep mozharness from recognizing this as a failure if TEST-PASS were instead TEST-UNEXPECTED-FAIL.

Ah thanks for the pointer. I'll make sure I see a job turn orange due to assertions before landing. Man, we really need selftests.
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/8-9/
Attachment #8655032 - Flags: review?(jmaher)
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/9-10/
Attachment #8655032 - Flags: review?(jmaher)
Blocks: 1244739
https://reviewboard.mozilla.org/r/17797/#review25913

> Not yet, but I'll file them. This is a bit tricky because assertions don't really fit into the mozlog protocol all that well. That's why they're just hardcoded for now. Easiest way to get this landed without blocking on a potentially major mozlog change.

Filed bug 1244739
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/9-10/
Attachment #8655032 - Attachment description: MozReview Request: Bug 1034290 - Use structured log output for test results in reftest → MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r?jmaher
Attachment #8655032 - Flags: review?(jmaher)
Hey Joel, sorry it took me awhile to get back to this, and you've probably forgotten everything from the first time :(.

Here's the interdiff since the last time you looked at this though:
https://reviewboard.mozilla.org/r/17797/diff/7-10/

For some reason, there are some bit-rot fixes that show up in this interdiff that have nothing to do with this patch, oh well :/.

Also, here is a green looking try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=389fde475dfd

Here is a try run with intentional failures to make sure reftest analyzer works:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d6ddcb5bc63

And I've also verified that the leaklogs are in the right format.
https://reviewboard.mozilla.org/r/17795/#review30293

I would like to see a bug on file for removing the hardcoded messages with | in them to actually be structured.  I mentioned a concern with that in my previous review and the reply was it is needed, but a bug would be put on file to document what the problem is and how it could be fixed.

::: build/mobile/b2gautomation.py:445
(Diff revision 10)
> +            if not lines:

I don't understand this condition, we only enter if lines isn't defined, is empty, etc.

then in try, we lines.append(...); can we not just lines = ...; can we assume lines is an empty array if not lines?

::: build/mobile/remoteautomation.py:379
(Diff revision 10)
> -                    messages = self.read_stdout()
> +                    gotOutput = self.read_stdout()

I really don't like this variable name, could we use: currentOutput?

::: layout/tools/reftest/output.py:19
(Diff revision 10)
> +    better.

is this only needed for reftest analyzer?  Does treeherder require this?

::: layout/tools/reftest/reftest.js:521
(Diff revision 10)
> -        gDumpLog("REFTEST TEST-UNEXPECTED-FAIL | | Unable to find reftest.manifests pref.  Please ensure your profile is setup properly\n");
> +        logger.error("Unable to find reftest.manifests pref.  Please ensure your profile is setup properly\n");

do we need the \n at the end of this message?

::: layout/tools/reftest/reftest.js:533
(Diff revision 10)
> -            gDumpLog("Reading manifest " + manifestURL + "\n");
> +            logger.info("Reading manifest " + manifestURL + "\n");

same here, we have an info with a \n in the message, I don't think we need it.

::: layout/tools/reftest/runreftest.py:663
(Diff revision 10)
> -                log.warning(
> +                self.log.warning("runreftest.py | Failed to copy %s to profile" % abspath)

this was line split before and now it is a single LONG line, is there a reason to merge this?  I believe it was split before to support flake8
Attachment #8655032 - Flags: review?(jmaher) → review-
https://reviewboard.mozilla.org/r/17795/#review30293

> I don't understand this condition, we only enter if lines isn't defined, is empty, etc.
> 
> then in try, we lines.append(...); can we not just lines = ...; can we assume lines is an empty array if not lines?

It's a bit tricky, but the way it was before was actually a bug. We want this function to return lines as quickly as possible so that they get processed by the log handlers efficiently. I.e this function is a bottleneck. Before, we were waiting `timeout` seconds, even if we already had lines ready to be processed. We should only block when there are no lines queued up already. Does that make sense?

And yes, `not lines` means that it is an empty list.

> I really don't like this variable name, could we use: currentOutput?

This a boolean now though, so `currentOutput` seems like it is still a list of messages. Is `hasOutput` better?

> is this only needed for reftest analyzer?  Does treeherder require this?

Yes, partly for the reftest-analyzer, but also for mozharness which still parses the log to find test failures and errors. A follow-up to this work would be to get mozharness to read the structured log instead, but that's something for another bug. Eventually it would be nice to modify the format to something better, but for now my goal is to simply not break anything :).

> this was line split before and now it is a single LONG line, is there a reason to merge this?  I believe it was split before to support flake8

Personally I really dislike the 80 character limit :).. but I'll grudgingly change it back.
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/10-11/
Attachment #8655032 - Flags: review- → review?(jmaher)
https://reviewboard.mozilla.org/r/17795/#review30293

> This a boolean now though, so `currentOutput` seems like it is still a list of messages. Is `hasOutput` better?

yes, hasOutput is better

> Yes, partly for the reftest-analyzer, but also for mozharness which still parses the log to find test failures and errors. A follow-up to this work would be to get mozharness to read the structured log instead, but that's something for another bug. Eventually it would be nice to modify the format to something better, but for now my goal is to simply not break anything :).

please adjust the comment to indicate that it is more than reftest analyzer.
Attachment #8655032 - Flags: review?(jmaher) → review+
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

https://reviewboard.mozilla.org/r/17797/#review30319
Comment on attachment 8655032 [details]
MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/17797/diff/11-12/
Attachment #8655032 - Attachment description: MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r?jmaher → MozReview Request: Bug 1034290 - Use structured log output for test results in reftest, r=jmaher
Just removing the line encoding seems to fix the problem:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4dff02c624d4

I'm not really sure why that got added in the first place.. seems like prior to this patch, we didn't do any encoding before passing things into stack_fixer_function anyway.
Here's a try run with 10.10/android-api-15:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4dff02c624d4

Here's another full try run because I'm now paranoid:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cf3d4eb68169

I wonder how it will find a way to get backed out this time..
There's a minor backwards incompatible change here. When the application returns non-zero,

"REFTEST ERROR | TEST-UNEXPECTED-FAIL | <test> | <msg>"

gets printed, instead of:

"TEST-UNEXPECTED-FAIL | <test> | <msg>"

This difference messes up the treeherder starring capability. It's an easy fix, I'll just make it log the old message out verbatim instead. I'll get a patch up shortly after some testing. This will have to get added to the list of follow up bugs (i.e make treeherder use structured logs rather than parsing output for starring suggestions).
I tested this locally by faking a bad return code. It produces the same output as it used to. I'll file a follow-up bug to get this to log 'error' again without breaking the treeherder parsers.
Attachment #8717102 - Flags: review?(jmaher)
Attachment #8717102 - Flags: review?(jmaher) → review?(cmanchester)
Attachment #8717102 - Flags: review?(cmanchester) → review+
https://hg.mozilla.org/mozilla-central/rev/bdc44931833f
https://hg.mozilla.org/mozilla-central/rev/e2aa43808183
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Blocks: 1246946
I'm going to need to uplift this to beta, as not doing so would cause too many conflicts with the addon signing change in bug 12450925.
Blocks: 1245092
Question: was it intended that this bug should change the workflow for developers who are running tests locally? (And is reftest.log intended to be useful for developers who are running tests locally?)

Before this bug landed, I was able to upload my "reftest.log" files (generated by a local "./mach reftest" run) directly to reftest-analyzer (by visiting https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml and clicking the "Browse" button).

But, now that this bug has changed the format of the generated reftest.log file, reftest-analyzer rejects it, and I end up having to do e.g. "./mach reftest > the-real-reftest.log", and ignore reftest.log.)

Any suggested workarounds? (and is the new JSON-based reftest.log format intended to be useful for any local-developer tasks and/or some version of reftest-analyzer?)
Flags: needinfo?(ahalberstadt)
(And if reftest.log is not intended to be uploadable to reftest-analyzer anymore, should we perhaps stop generating it for local runs?)
I believe ahal gets back next week, he would be the best person to answer these questions.
I have a structured version of the reftest analyzer which we could use if the reftest harness learned to save screenshots in the structured log in the same way as the wpt harness. Then for bonus points we could integrate this with wptview and it would be directly accessible from treeherder.
Ah, I assume this is the log used for android? It wasn't explicitly intentional, but there's not much we can do about that particular log, it'll need to contain the structured messages.

But mozlog does provide options to log arbitrary formats to arbitrary files. So you can do something like:
./mach reftest --log-tbpl my-reftest.log

You can combine this with mach's new command aliases function to make it happen all the time by default.

1. Create a machrc or .machrc file in either topsrcdir, ~/.mozbuild or $MACHRC
2. Add:
[aliases]
reftest = reftest --log-tbpl my-reftest.log

It may be worth changing the logfile used to a tmp file so people don't try and use it. Though it may still come in handy when we switch to the structured reftest-analyzer that jgraham made.
Flags: needinfo?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #63)
> Ah, I assume this is the log used for android?

Maybe? I wasn't doing anything Android-specific. I was just running "./mach reftest" in a normal desktop debug build.  (And until recently, doing that would produce a log of everything that appeared on my terminal, called "reftest.log".  But as of this bug, that logfile seems to be JSON formatted.)

> So you can do something like:
> ./mach reftest --log-tbpl my-reftest.log

Nice! This does seem to create a reftest-analyzer-friendly file. Thanks!

> 1. Create a machrc or .machrc file in either topsrcdir, ~/.mozbuild or
> $MACHRC
> 2. Add:
> [aliases]
> reftest = reftest --log-tbpl my-reftest.log

This sounds awesome, though I haven't figured out how to get this .machrc file to be respected yet. We can sort that out elsewhere though. (I'll find you in IRC.)

> It may be worth changing the logfile used to a tmp file so people don't try
> and use it.

I fully support this -- right now the filename ("reftest.log") makes it extremely tempting to upload it to reftest-analyzer & expect results (particularly for developers who've had success with that in the past, but for new developers as well).
Yeah, you're right.. it is being set for non-android as well:
https://dxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftestcommandline.py#504

Also the section name is [alias] not [aliases] (my bad), probably why it wasn't working for you.
FYI, the commit message in https://hg.mozilla.org/try/rev/6b064f9f6e10 mentions the following:

It's worth noting that ~half of the remaining I/O system calls are related to reftest.log, which now accounts for the largest percentage of write I/O (only ~53 MB, however). It's worth noting that reftest.log appears to be using unbuffered writes and is requiring an excessive amount of system calls for writing.
I *believe* reftest.log is only actually used by android, but I noticed it was being saved for all platforms during my discussion with dholbert above. I'll try disabling it for non-android platforms. People who still need/want it can use --log-tbpl + mach [alias] to turn it back on if they need to.
Depends on: 1271448
Depends on: 1292338
No longer depends on: 1321127
You need to log in before you can comment on or make changes to this bug.