Closed Bug 1507254 Opened 5 years ago Closed 5 years ago

Missed GitHub.com pull request events

Categories

(Taskcluster :: Services, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jugglinmike, Assigned: bstack)

Details

web-platform-tests is a GitHub.com-hosted project which integrates with
Taskcluster and TravisCI to validate pull requests from all contributors.
Between 2018-11-06 and 2018-11-14, TravisCI validated 112 commits, but
Taskcluster validated only 109 of those:

- pull request: http://github.com/web-platform-tests/wpt/pull/14004
  commit: f4bd28a8415ff506bf4adcf4c749e1682a1d61d4
- pull request: http://github.com/web-platform-tests/wpt/pull/14001
  commit: 842c1b07631cbf2a186172702208d1b7cb24b3be
- pull request: http://github.com/web-platform-tests/wpt/pull/14000
  commit: e56f8cbfae02183ffa49eb24cc7bd06a3978b3c1

I previously reported similar behavior via bug 1499576, but the commits
referenced there described a number of different situations (some of which were
expected). To the best of my knowledge, the three commits referenced above
describe a bug in Taskcluster. I'm hopeful that the enhanced logging which
:owlish implemented [1] will help diagnose the behavior further.

Thanks!

[1] https://github.com/taskcluster/taskcluster-github/pull/298
Assignee: nobody → bugzeeeeee
Here are some additional data points. In the time since I opened this issue,
TravisCI has validated 111 commits in WPT, and Taskcluster has missed 7 of
those:

- pull request: http://github.com/web-platform-tests/wpt/pull/14138
  commit: d43d3b5d8e7225387c7a08c25beea5153b388d75
- pull request: http://github.com/web-platform-tests/wpt/pull/14132
  commit: 2cf4f5cc9924be5e0ef0ffbbc74c9e7503c45cc4
- pull request: http://github.com/web-platform-tests/wpt/pull/14088
  commit: b37b12ddcb54f285cdf512e7fbab3d500067f6c0
- pull request: http://github.com/web-platform-tests/wpt/pull/14088
  commit: 53ef62c9ae76063ec08c07c384ba09097efbd6ad
- pull request: http://github.com/web-platform-tests/wpt/pull/14081
  commit: 465df5ee67f0dce04dc567539f9e7d4d41d75f21
- pull request: http://github.com/web-platform-tests/wpt/pull/14081
  commit: 7a62d2667191382e0ffb3cc1e296d6fcd129463e
- pull request: http://github.com/web-platform-tests/wpt/pull/14074
  commit: 067dec7c9a6c284c19e2aef3875954f9eaa949f9
Any progress on this? We've had to make Taskcluster non-blocking on https://github.com/web-platform-tests/wpt because of this, and sooner or later a PR will be merged even though Taskcluster correctly identified a problem with the tests.
Flags: needinfo?(bugzeeeeee)
I'm pretty busy at the moment, but I will try to squeeze this bug in.

Thank you for the new data, Mike!
Flags: needinfo?(bugzeeeeee)
I can try to look at this today.
Assignee: bugzeeeeee → bstack
Status: NEW → ASSIGNED
I can see in the logs that the webhook for PR 14138 does indeed reach tc-github correctly, but we log that there is no .taskcluster.yml available for the ref.

At this time, the URL it checked [0] definitely does return a .taskcluster.yml that appears to be valid. I wonder if this is some sort of consistency issue with the GitHub API that we need to handle better with retries. I'll look further.


[0] https://api.github.com/repos/web-platform-tests/wpt/contents/.taskcluster.yml?ref=d43d3b5d8e7225387c7a08c25beea5153b388d75
The only case where we log that error is when the call to GitHub returns a 404. We don't make that call ourselves; instead we rely on the @octokit/rest library for it. There have been a few releases since the version we're using that seem to change or fix HTTP handling. I think our best bet is to just upgrade and hope that they've fixed it. I'll also put this call inside an exponential-backoff retry loop.
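For readers unfamiliar with the pattern, here is a rough sketch of what an exponential-backoff retry around a 404-prone fetch could look like. This is not the actual tc-github change (that service is written in Node and calls GitHub through @octokit/rest). The names `NotFound`, `fetch_with_backoff`, and `flaky_fetch`, along with the attempt count and delays, are all hypothetical and chosen only for illustration.

```python
import time


class NotFound(Exception):
    """Hypothetical stand-in for an HTTP 404 from the contents API."""


def fetch_with_backoff(fetch, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the delay after each 404."""
    for attempt in range(attempts):
        try:
            return fetch()
        except NotFound:
            if attempt == attempts - 1:
                raise  # still missing after the final attempt
            sleep(base_delay * 2 ** attempt)


# Simulate a file that only becomes visible on the third request.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise NotFound(".taskcluster.yml not visible yet")
    return "version: 1"


# Record the requested delays instead of actually sleeping.
delays = []
result = fetch_with_backoff(flaky_fetch, sleep=delays.append)
```

In this simulation the first two requests fail, the third succeeds, and the recorded delays double between attempts. If the file is still missing after the last attempt, the 404 is re-raised so the caller can log it as before.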
This should be resolved now! Please let us know if you see this issue occurring again.
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
For discoverability, https://github.com/web-platform-tests/wpt/issues/14165 is the wpt issue that was blocked on this.
In the 2 weeks since this was marked "resolved", Taskcluster has validated 218 commits for the WPT project; we haven't experienced a single miss yet. Thanks!
Excellent, thanks Mike!
Component: Github → Services