Closed Bug 1236655 Opened 6 years ago Closed 6 years ago

Exceptions whilst parsing TaskCluster logs since they contain dates like "2016-00-04"

Categories

(Taskcluster :: Services, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED
mozilla46

People

(Reporter: froydnj, Assigned: garndt)

References

Details

Attachments

(2 files)

For:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4600c9ff8a35

treeherder was unable to provide straightforward summaries of the B2G build oranges.  The erroneous lines from the full log look like:

20:35:48     INFO -  ../../../../workspace/gecko/b2g/app/B2GLoader.cpp: In function 'bool GetXPCOMPath(const char*, char*, int)':
20:35:48    ERROR -  ../../../../workspace/gecko/b2g/app/B2GLoader.cpp:69:18: error: 'MakeUnique' was not declared in this scope
20:35:48     INFO -  ../../../../workspace/gecko/b2g/app/B2GLoader.cpp:69:18: note: suggested alternative:

It's worth noting that the B2G ICS Emulator opt build was able to parse the error out of the log.
Almost positive this is a Taskcluster issue.
20:57:34    ERROR -  /home/worker/workspace/gecko/b2g/app/B2GLoader.cpp:69:18: error: 'MakeUnique' was not declared in this scope
12:37:24    ERROR -  ../../../gecko/b2g/app/B2GLoader.cpp:69:18: error: 'MakeUnique' was not declared in this scope

Unless th insists that build errors have a path starting with ../../../, or that the times be in Pacific time, I'd think it's a treeherder issue with ingestion rather than a taskcluster issue with logging.

The fun thing is, it's been this way for every taskcluster build failure since last week sometime.
Link to an example job:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4600c9ff8a35&selectedJob=15029771&filter-searchStr=B2G%20ICS%20Emulator%20opt%20Submitted%20by%20taskcluster%20[TC]%20B2G%20Emulator%20ICS%20%28Opt%29%20%28B%29

The error summary says:
"""
Log parsing failed. Unable to generate failure summary
"""

Note this is different from "No summary found.", and actually means there was a problem during parsing.

Checking New Relic, I see this exception:
exceptions:ValueError: time data '2016-00-04 21:53:14.788000' does not match format '%Y-%m-%d %H:%M:%S.%f' 
(https://rpm.newrelic.com/accounts/677903/applications/4180461/traced_errors/4f5109-4b15b674-b330-11e5-bb1e-b82a72d22a14)

If you look at the raw log (I suggest downloading rather than opening in the browser):
https://queue.taskcluster.net/v1/task/Ssrl3bt7Tw-YCwt3_EwX7w/runs/0/artifacts/public/logs/live_backing.log

There's:

========= Started Clone gecko tc-vcs checkout workspace/gecko https://hg.mozilla.org/mozilla-central https://hg.mozilla.org/try/ 4600c9ff8a3517fc176ecd6c12a916bb505937c4 4600c9ff8a3517fc176ecd6c12a916bb505937c4 (results: 0, elapsed: 0 secs) (at 2016-00-04 19:44:35.838000) =========
...
========= Finished Clone gecko tc-vcs checkout workspace/gecko https://hg.mozilla.org/mozilla-central https://hg.mozilla.org/try/ 4600c9ff8a3517fc176ecd6c12a916bb505937c4 4600c9ff8a3517fc176ecd6c12a916bb505937c4 (results: 0, elapsed: 158 secs) (at 2016-00-04 19:47:12.880000) =========
========= Started Build ./build-emulator.sh /home/worker/workspace (results: 0, elapsed: 0 secs) (at 2016-00-04 19:47:12.936000) =========
...
Severity: normal → blocker
Component: Treeherder → Integration
Flags: needinfo?(garndt)
Priority: -- → P1
Product: Tree Management → Taskcluster
Summary: log parsing unable to identify compiler errors on b2g builds → Exceptions whilst parsing TaskCluster logs since they contain dates like "2016-00-04"
Version: --- → unspecified
It appears, for reasons unknown to me, that the 'builder' image we use for these jobs is using a JS version of the buildbot step script rather than the bash version that's used in other images, such as 'tester'.

Switching the image over to using the bash version should fix up the problem and also standardize things.

I'll make see how that looks.
Assignee: nobody → garndt
Flags: needinfo?(garndt)
So it appears that the JS version was an off by one error when handling the month.

However, when looking at the python and bash versions that were created of the buildbot step log line, I do notice that there is a difference in what gets logged.  I'm not sure if someone knows what exactly it should be...

In JS it's something like this, where it's "Started <step> <command> (results: ...)"
$ ./buildbot_step.js 'clone gecko' ls -lah something
========= Started clone gecko ls -lah something (results: 0, elapsed: 0 secs) (at 2016-01-04 17:41:36.232000) =========

In bash it's this (without the <command>):
$ ./buildbot_step 'clone gecko' ls -lah something
========= Started clone gecko ls -lah something (results: 0, elapsed: 0 secs) (at 2016-01-04 17:45:24.N) =========
Flags: needinfo?(emorley)
Sorry, the bash version was my "patched" version so it looks the same...

Here it is in it's original form:
$ ./buildbot_step 'clone gecko' ls -lah something
========= Started clone gecko (results: 0, elapsed: 0 secs) (at 2016-01-04 17:59:30.N) =========
Just the command on it's own should be fine, presuming there aren't multiple similarly named commands that we can't tell apart otherwise :-)
Flags: needinfo?(emorley)
Thank you for looking at this!
Once reviewed, need to deploy new builder image to docker hub to be used by phone-builder later
Attachment #8704274 - Flags: review?(wcosta) → review+
Comment on attachment 8704274 [details]
MozReview Request: Bug 1236655 - Use builder 0.5.11 image for decision tasks r=wcosta

https://reviewboard.mozilla.org/r/29601/#review26521
See Also: → 1237237
Duplicate of this bug: 1237250
:wcosta builder 0.5.11 is being pushed now and should be available soon.  Would you be able to get the phone-builder updated with that new base image?
Flags: needinfo?(wcosta)
(In reply to Greg Arndt [:garndt] from comment #16)
> :wcosta builder 0.5.11 is being pushed now and should be available soon. 
> Would you be able to get the phone-builder updated with that new base image?

Yep, I am leaving the bug open so I push the patch here.
Flags: needinfo?(wcosta)
Keywords: leave-open
Comment on attachment 8705127 [details] [diff] [review]
Update buildbot step script in phone-builder images r=garndt

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d9de5ebaa53e
Attachment #8705127 - Flags: review?(garndt)
Attachment #8705127 - Flags: review?(garndt) → review+
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/4c43a03ab0d6
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
It appears that the gecko decision task has a hard coded version number for the image and doesn't pick up newer changes like our other tasks do.  Need to investigate what's different about this image and if it can use our newer builder image that has the update.
Ah thank you :-)
Comment on attachment 8704274 [details]
MozReview Request: Bug 1236655 - Use builder 0.5.11 image for decision tasks r=wcosta

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/29601/diff/1-2/
Attachment #8704274 - Attachment description: MozReview Request: Bug 1236655 - Use bash version of buildbot_step script in builder images r=wcosta → MozReview Request: Bug 1236655 - Use builder 0.5.11 image for decision tasks r=wcosta
Comment on attachment 8704274 [details]
MozReview Request: Bug 1236655 - Use builder 0.5.11 image for decision tasks r=wcosta

Not sure what I did wrong to have this review marked as r+ but reflagging for r?
Attachment #8704274 - Flags: review+ → review?
Comment on attachment 8704274 [details]
MozReview Request: Bug 1236655 - Use builder 0.5.11 image for decision tasks r=wcosta

https://reviewboard.mozilla.org/r/29601/#review28463
Attachment #8704274 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/036360e2c9cd
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Attachment #8704274 - Flags: review?
Component: Integration → Services
You need to log in before you can comment on or make changes to this bug.