Closed Bug 1156810 Opened 9 years ago Closed 9 years ago

make private buildapi instance POST/PUT/DELETE requests work

Categories

(Release Engineering :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: bhearsum, Assigned: bhearsum)

The private buildapi interface appears to be completely broken. http://buildapi.pvt.build.mozilla.org/buildapi/self-serve seems to return 403s for absolutely everything (GET/POST/DELETE), even when I make requests from a machine such as a buildbot-master.

The Buildbot <-> Taskcluster bridge is going to need to be able to kill Buildbot builds, and it needs a functioning buildapi interface to do so.
Dustin, I'm not trying to volunteer you for this, but any guidance would be appreciated. I really don't have an understanding of how buildapi is deployed.

Needinfo'ing myself for now, because Bugzilla won't let me set it on you...
Flags: needinfo?(bhearsum)
Interesting:

[root@web2.releng.webapp.scl3 ~]# tail -n100 /var/log/httpd/buildapi.pvt.build.mozilla.org/error_log
[Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] mod_wsgi (pid=29201): Exception occurred processing WSGI script '/data/ww/buildapi/buildapi.wsgi'.
[Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] IOError: failed to write data
Looks like part of this is probably making sure that auth_override is set in the private deployment's config, per: https://github.com/mozilla/build-buildapi/blob/master/buildapi/controllers/selfserve.py#L115
(In reply to Justin Wood (:Callek) from comment #2)
> Interesting:
> 
> [root@web2.releng.webapp.scl3 ~]# tail -n100
> /var/log/httpd/buildapi.pvt.build.mozilla.org/error_log
> [Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] mod_wsgi
> (pid=29201): Exception occurred processing WSGI script
> '/data/ww/buildapi/buildapi.wsgi'.
> [Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] IOError: failed to
> write data

Huh, good find! This path looks funky to me: /data/ww/buildapi/buildapi.wsgi
Flags: needinfo?(bhearsum)
(In reply to Ben Hearsum [:bhearsum] from comment #4)
> (In reply to Justin Wood (:Callek) from comment #2)
> > Interesting:
> > 
> > [root@web2.releng.webapp.scl3 ~]# tail -n100
> > /var/log/httpd/buildapi.pvt.build.mozilla.org/error_log
> > [Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] mod_wsgi
> > (pid=29201): Exception occurred processing WSGI script
> > '/data/ww/buildapi/buildapi.wsgi'.
> > [Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] IOError: failed to
> > write data
> 
> Huh, good find! This path looks funky to me: /data/ww/buildapi/buildapi.wsgi

Looks like this was some sort of mispaste...I see www when I look at it:
[Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] mod_wsgi (pid=29201): Exception occurred processing WSGI script '/data/www/buildapi/buildapi.wsgi'.
[Tue Apr 21 08:07:53 2015] [error] [client 10.22.81.211] IOError: failed to write data
Interestingly, there's some 200s in the access logs that I'm unable to reproduce. Eg:
10.22.81.211 - - [21/Apr/2015:08:38:02 -0700] "GET /buildapi/self-serve/jobs HTTP/1.1" 200 57207 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
10.22.81.211 - - [21/Apr/2015:08:42:28 -0700] "GET /buildapi/self-serve/jobs HTTP/1.1" 403 226 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0"
I think my original tests from a buildbot master were borked in some way; requests such as this get through fine now:
curl "http://buildapi.pvt.build.mozilla.org/buildapi/recent/panda-0363?format=json&numbuilds=1"
[{"buildername": "Android 4.0 armv7 API 11+ cedar opt test mochitest-1", "buildnumber": 9, "slavename": "panda-0363", "master": "bm102-tests1-panda", "result": 5, "starttime": 1429560010, "buildname": "cedar_panda_android_test-mochitest-1", "endtime": 1429560857, "id": 64235762}]

So, that's good. It means that the only problem to solve is the auth one AFAICT.
This bug has become a bit confused. To summarize the current state:
* GET requests from buildbot-masters always worked. My original tests before filing must've been done wrong.
* POST/PUT/DELETE requests don't work, most likely because the buildapi app requires "auth" (as defined in https://github.com/mozilla/build-buildapi/blob/master/buildapi/controllers/selfserve.py#L115)

The action item here is to fix the latter. This might be done by adjusting the buildapi code, or perhaps by tweaking the deployment. I'm going to investigate further.
Assignee: nobody → bhearsum
Summary: fix buildapi private interface → make private buildapi instance POST/PUT/DELETE requests work
(In reply to Ben Hearsum [:bhearsum] from comment #8)
> This bug has become a bit confused. To summarize the current state:
> * GET requests from buildbot-masters always worked. My original tests before
> filing must've been done wrong.
> * POST/PUT/DELETE requests don't work, most likely because the buildapi app
> requires "auth" (as defined in
> https://github.com/mozilla/build-buildapi/blob/master/buildapi/controllers/
> selfserve.py#L115)
> 
> The action item here is to fix the latter. This might be done by adjusting
> the buildapi code, or perhaps by tweaking the deployment. I'm going to
> investigate further.

I see two ways of fixing this which are basically equivalent:
1) Set auth_override in production.ini, but only for buildapi.pvt. I think buildapi.pvt shares a config with the public version, so this may not be easy.
2) Set X-Remote-User (yep, not REMOTE_USER) as a header in the Apache config for buildapi.pvt. buildapi.pvt already has its own Apache config, so this _seems_ like it should be easy to my naive brain.

Either of these will cause _require_auth() to have "who" set, so it won't bail on POST/PUT/DELETE requests. Because there's no HTTP Auth present on buildapi.pvt, we'd have to choose a string to set this to. I suggest something like "releng-internal".
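
For anyone following along, here is a rough sketch of the check as I understand it from the description above. This is not the actual buildapi code: resolve_who() and is_allowed() are made-up names, and the lookup order (config override first, then the header) is an assumption on my part.

```python
# Sketch only: NOT the real buildapi code. resolve_who() and is_allowed() are
# hypothetical names; the lookup order (config first, then header) is assumed.
WRITE_METHODS = {"POST", "PUT", "DELETE"}


def resolve_who(config, headers):
    """Return the identity a request is attributed to, or None if there is none."""
    # Option 1: a deployment-wide override taken from the .ini config.
    who = config.get("auth_override")
    if who:
        return who
    # Option 2: the X-Remote-User request header (set by Apache or by the client).
    # A plain dict lookup is case-sensitive; a real framework would normalize names.
    return headers.get("X-Remote-User") or None


def is_allowed(method, config, headers):
    """Reads pass unconditionally; writes need an identity (otherwise: 403)."""
    if method.upper() not in WRITE_METHODS:
        return True
    return resolve_who(config, headers) is not None
```

With neither option in place, is_allowed("DELETE", {}, {}) is False, which would match the 403s on write requests. Setting either the config key or the header flips it to True.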

Dustin, does this sound sane to you? I'm happy to make the changes myself if I have access to the right repository.
It seems like the best bet would be to add

  Remote-User: buildbot-taskcluster-bridge

to the request, rather than adjusting the configuration of the internal endpoint.  At least then we have some indication of what's going on in the logs.

I see this as a temporary fix until buildapi moves to relengapi, at which point clients will need to pass an authentication token (which is just a different header and value).
(In reply to Dustin J. Mitchell [:dustin] from comment #10)
> It seems like the best bet would be to add
> 
>   Remote-User: buildbot-taskcluster-bridge
> 
> to the request, rather than adjusting the configuration of the internal
> endpoint.  At least then we have some indication of what's going on in the
> logs.

Good idea. My brain totally glossed over the fact that I could just set it on the client side. And indeed, this works:
[bhearsum@buildbot-master81.bb.releng.scl3.mozilla.com ~]$ curl -H "X-Remote-User: buildbot-bridge" -X DELETE "http://buildapi.pvt.build.mozilla.org/buildapi/sel/alder/request/67845355?format=json"
{"status": "OK", "request_id": 1280471}
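
A client written in Python could set the same header roughly like this. It's a minimal sketch using only the standard library: build_delete_request() is a hypothetical helper, the URL in the usage part is a placeholder rather than a real endpoint, and nothing is sent unless you call urlopen() yourself.

```python
import urllib.request


def build_delete_request(url, remote_user):
    """Build (but do not send) a DELETE request carrying an X-Remote-User header."""
    req = urllib.request.Request(url, method="DELETE")
    # HTTP header names are case-insensitive; urllib stores this one as "X-remote-user".
    req.add_header("X-Remote-User", remote_user)
    return req


if __name__ == "__main__":
    # Placeholder URL: substitute the real self-serve request URL.
    req = build_delete_request(
        "http://buildapi.example.invalid/request/1?format=json", "buildbot-bridge"
    )
    print(req.get_method(), req.full_url, req.header_items())
    # To actually send it: urllib.request.urlopen(req)
```

Passing the identity in from the caller keeps each client's name visible in the server logs, which was the point of setting it on the client side.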

> I see this as a temporary fix until buildapi moves to relengapi, at which
> point clients will need to pass an authentication token (which is just a
> different header and value).

Good to know. Thanks for the heads up.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INVALID
Component: Tools → General