Closed
Bug 1126879
Opened 10 years ago
Closed 8 years ago
slaveapi fails at filing tracking bugs when it wants to file an unreachable bug for a slave without a problem tracking bug
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: philor, Assigned: aobreja)
Details
Attachments
(3 files)
See, until the next time we toss history, https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?class=test&type=t-snow-r4&name=t-snow-r4-0011 which shows two reboots with the result "400 Client Error: Bad Request" before I figured out that it wanted to call the slave unreachable but was failing to file the tracking bug that the unreachable bug would block. After I filed the tracker for it, there is the successful "Failed. Filed IT bug for reboot".
Comment 1•10 years ago
This is a regression in bmo, I'm sure. A SeaMonkey IRC bot (which I don't control) had issues yesterday as well.
> 400 Client Error: Bad Request
in order to look at this, we'll need to know what request the bot is making against bmo -- what webservice endpoint is it hitting, what method, and what parameters?
Flags: needinfo?(bugspam.Callek)
Comment 3•10 years ago
for example:
2015-01-28 09:35:01,262 - INFO - panda-0524 - Sending request: POST https://bugzilla.mozilla.org/rest/bug
2015-01-28 09:35:01,879 - ERROR - panda-0524 - Something went wrong while processing!
2015-01-28 09:35:01,879 - ERROR - panda-0524 - Traceback (most recent call last):
2015-01-28 09:35:01,879 - ERROR - panda-0524 -
2015-01-28 09:35:01,880 - ERROR - panda-0524 - File "/builds/slaveapi/prod/lib/python2.7/site-packages/slaveapi/processor.py", line 64, in _worker
2015-01-28 09:35:01,880 - ERROR - panda-0524 - res, msg = action(slave, *args, **kwargs)
2015-01-28 09:35:01,880 - ERROR - panda-0524 -
2015-01-28 09:35:01,880 - ERROR - panda-0524 - File "/builds/slaveapi/prod/lib/python2.7/site-packages/slaveapi/actions/reboot.py", line 116, in reboot
2015-01-28 09:35:01,880 - ERROR - panda-0524 - slave.reboot_bug = file_reboot_bug(slave)
2015-01-28 09:35:01,880 - ERROR - panda-0524 -
2015-01-28 09:35:01,880 - ERROR - panda-0524 - File "/builds/slaveapi/prod/lib/python2.7/site-packages/slaveapi/clients/bugzilla.py", line 76, in file_reboot_bug
2015-01-28 09:35:01,880 - ERROR - panda-0524 - resp = bugzilla_client.create_bug(data)
2015-01-28 09:35:01,880 - ERROR - panda-0524 -
2015-01-28 09:35:01,881 - ERROR - panda-0524 - File "/builds/slaveapi/prod/lib/python2.7/site-packages/bzrest/client.py", line 55, in create_bug
2015-01-28 09:35:01,881 - ERROR - panda-0524 - return self.request("POST", "bug", data)
2015-01-28 09:35:01,881 - ERROR - panda-0524 -
2015-01-28 09:35:01,881 - ERROR - panda-0524 - File "/builds/slaveapi/prod/lib/python2.7/site-packages/bzrest/client.py", line 40, in request
2015-01-28 09:35:01,881 - ERROR - panda-0524 - r.raise_for_status()
2015-01-28 09:35:01,881 - ERROR - panda-0524 -
2015-01-28 09:35:01,881 - ERROR - panda-0524 - File "/builds/slaveapi/prod/lib/python2.7/site-packages/requests/models.py", line 683, in raise_for_status
2015-01-28 09:35:01,881 - ERROR - panda-0524 - raise HTTPError(http_error_msg, response=self)
2015-01-28 09:35:01,881 - ERROR - panda-0524 -
2015-01-28 09:35:01,881 - ERROR - panda-0524 - HTTPError: 400 Client Error: Bad Request
Which is:
http://mxr.mozilla.org/build/source/slaveapi/slaveapi/clients/bugzilla.py#66
Which is calling into https://github.com/bhearsum/bzrest/blob/master/bzrest/client.py
Specifically, it's just calling a POST with that data: https://github.com/bhearsum/bzrest/blob/master/bzrest/client.py#L54
Flags: needinfo?(bugspam.Callek) → needinfo?(glob)
Comment 4•10 years ago
Specifically I suspect this is a regression from: bug 1124437 Backport upstream bug 1090275 to bmo/4.2 to whitelist webservice api methods
this doesn't appear to be related to bug 1124437 - i'm able to create bugs via rest without issue.
bugzilla will be returning the reason for the failure in its json response:
{"documentation":"http://www.bugzilla.org/docs/tip/en/html/api/","code":32000,"error":true,"message":"The version value 'other' is not active."}
however the library used here is catching and dealing with the http/400 result first, which results in it dropping the error message in favour of a generic "bad request" one.
my guess is the bot is setting the "blocks" field to a bug which doesn't exist.
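A minimal sketch of how a client could keep Bugzilla's explanation instead of the generic "Bad Request". This is not the bzrest implementation: `BugzillaError` and `check_response` are hypothetical names, and the only thing assumed about the server is the JSON shape quoted above (`error`, `code`, `message`).

```python
import json


class BugzillaError(Exception):
    """Hypothetical exception that carries Bugzilla's own error details."""

    def __init__(self, status, code, message):
        super().__init__("HTTP %s, Bugzilla error %s: %s" % (status, code, message))
        self.status = status
        self.code = code
        self.message = message


def check_response(status, body):
    """Return the decoded JSON body, or raise BugzillaError on failure.

    `status` is the HTTP status code and `body` is the response text.
    The body is inspected before giving up on a 4xx/5xx status, so the
    server's "message" field is not lost.
    """
    try:
        data = json.loads(body)
    except ValueError:
        data = None  # not JSON; fall back to a generic message below

    if isinstance(data, dict) and data.get("error"):
        raise BugzillaError(status, data.get("code"), data.get("message"))
    if status >= 400:
        raise BugzillaError(status, None, "no error details in response body")
    return data
```

With a check like this in place of a bare `raise_for_status()`, the log line would name the rejected field rather than only the status code.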
Flags: needinfo?(glob)
Comment 6•10 years ago
Should be fixed with bzrest 0.9, which I just updated on prod slaveapi.
Updated•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 7•10 years ago
Not quite done here:
http://mxr.mozilla.org/build/source/puppet/modules/slaveapi/manifests/instance.pp#54
http://git.mozilla.org/?p=build/slaveapi.git;a=blob;f=setup.py;h=c0e5eecf06d642fac8e71bc0866c7f23a6a737e2;hb=HEAD#l20
Assignee: nobody → bugspam.Callek
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter
Comment 8•9 years ago
It would be sweet to finally get this fixed, since I've generally rebooted a slave two or three times, and thus lost 24-48 hours of its life, before I finally notice it's not actually getting anywhere.
Reporter
Comment 9•8 years ago
It would be sweet to finally get this fixed, since we now have employees doing buildduty, including doing my non-job when I'm on non-PTO, and since I didn't train them they don't know this bug exists.
Comment 10•8 years ago
Alin or Andrei should be able to tackle this.
Assignee: bugspam.Callek → nobody
Component: Tools → Buildduty
QA Contact: hwine → bugspam.Callek
Updated•8 years ago
Assignee: nobody → aobreja
Assignee
Comment 11•8 years ago
This patch upgrades bzrest to version 0.9, which could solve the problem with the POST call "return self.request("POST", "bug", data)".
A recent example of this problem can be found on t-yosemite-r7-0387.
The Puppet repository can be found here: https://github.com/mozilla/build-puppet
Callek, what are the risks of doing this bzrest upgrade in puppet?
Comment 12•8 years ago
Comment on attachment 8784864 [details] [diff] [review]
bug1126879_puppet.patch
Review of attachment 8784864 [details] [diff] [review]:
-----------------------------------------------------------------
As I said in c#6 -- I updated bzrest on prod and thought that fixed it. I didn't realize we had bzrest==.7 pinned here. so do eet.
Attachment #8784864 -
Flags: review+
Comment 13•8 years ago
(In reply to Justin Wood (:Callek) from comment #12)
> Comment on attachment 8784864 [details] [diff] [review]
> bug1126879_puppet.patch
>
> Review of attachment 8784864 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> As I said in c#6 -- I updated bzrest on prod and thought that fixed it. I
> didn't realize we had bzrest==.7 pinned here. so do eet.
Ahh and based on that comment too, we need to update: https://github.com/mozilla/build-slaveapi/blob/master/setup.py#L20
Otherwise we'll fail to install things right.
So steps:
* Update github's slaveapi repo (version bump for bzrest and slaveapi itself).
* Package it up and deploy to relengweb's pypi and puppet's pypi mirrors
* Deploy this puppet patch + a version bump for slaveapi.
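Why both places need the bump can be sketched as a consistency check between two pins. This assumes setup.py also pins an exact version; the pin strings in the usage note are illustrative (only the 0.7 pin in puppet and the move to 0.9 come from this bug), and `parse_pin` and `pins_agree` are hypothetical helpers, not part of slaveapi or puppet.

```python
def parse_pin(requirement):
    """Split an exact pin such as "bzrest==0.9" into (name, version)."""
    name, sep, version = requirement.partition("==")
    if not sep or not name.strip() or not version.strip():
        raise ValueError("not an exact pin: %r" % requirement)
    return name.strip().lower(), version.strip()


def pins_agree(setup_py_pin, puppet_pin):
    """True when both pins name the same package at the same version."""
    return parse_pin(setup_py_pin) == parse_pin(puppet_pin)
```

For example, `pins_agree("bzrest==0.7", "bzrest==0.9")` is false, which is the mismatch that would make the install fail if only one side were bumped.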
Assignee
Comment 14•8 years ago
Attachment #8786738 -
Flags: review?(bugspam.Callek)
Updated•8 years ago
Attachment #8786738 -
Flags: review?(bugspam.Callek) → review+
Assignee
Comment 15•8 years ago
Callek, I don't have merge rights for this patch; I get "Only those with write access to this repository can merge pull requests."
Can you merge this patch for me?
thanks
Flags: needinfo?(bugspam.Callek)
Assignee
Comment 17•8 years ago
> * Package it up and deploy to relengweb's pypi and puppet's pypi mirrors
> * Deploy this puppet patch + a version bump for slaveapi.
Done this part.
Reporter
Comment 18•8 years ago
If we expect this to work now, it doesn't.
Assignee
Comment 19•8 years ago
It seems that upgrading bzrest from 0.7 to 0.9 did not solve the issue; it is the same failure as in Comment 3, with:
Sending request: POST https://bugzilla.mozilla.org/rest/bug
The problem can be seen on t-w864-ix-230 and t-w864-ix-199.
Callek, do you have any suggestions here?
Flags: needinfo?(bugspam.Callek)
Comment 20•8 years ago
Two thoughts:
* I thought there was a puppet issue with the version bumps. Did that get sorted out? If not, we're not actually running the new code.
* Slaveapi needs to be manually restarted after the ver bumps, since there is no soft-reset and it retains state in memory (i.e., the History is not flushed to disk anywhere).
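The first point can be checked from inside the virtualenv. A small sketch, assuming a Python 3.8+ interpreter where `importlib.metadata` is available (the traceback in Comment 3 shows Python 2.7, where a different mechanism would be needed); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_version("bzrest")` reports what is on disk in that environment. It says nothing about what an already-running process imported at startup, which is why the restart in the second point still matters.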
Flags: needinfo?(bugspam.Callek)
Assignee
Comment 21•8 years ago
Puppet was changed; bzrest is now at version 0.9 on GitHub:
https://github.com/mozilla/build-puppet/blob/master/modules/slaveapi/manifests/instance.pp
slaveapi was also manually restarted, following Comment 13 and https://wiki.mozilla.org/ReleaseEngineering/Applications/SlaveAPI step by step.
Comment 22•8 years ago
Callek, do you have any other suggestions on this bug? Andrei mentioned this morning that he was still stuck on this bug.
Flags: needinfo?(bugspam.Callek)
Comment 23•8 years ago
Nothing offhand; tracebacks would be useful if any exist in the logs.
Also, running similar commands from the slaveapi+bzrest venv would validate that it can indeed reach bmo with the creds it has and is able to submit a bug in a similar fashion.
If we feel this is important enough and our buildduty team can't decipher the app, I can look into it, but it's a big context switch, so I'd like :coop to confirm with me that he does want me to look in for debugging's sake, if I am to do so.
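One way to approach the venv check is to construct the same kind of request by hand and inspect it before sending anything. The sketch below uses only the standard library and builds, but does not send, a POST to the endpoint seen in the Comment 3 log. Passing credentials through an `X-BUGZILLA-API-KEY` header is an assumption about the Bugzilla version in use, not something this bug shows slaveapi doing; `build_create_bug_request` is a hypothetical name.

```python
import json
import urllib.request

BMO_REST = "https://bugzilla.mozilla.org/rest"  # endpoint seen in the Comment 3 log


def build_create_bug_request(api_key, fields):
    """Build (but do not send) a POST to /rest/bug with a JSON body.

    `fields` is the dict of bug fields to submit; which fields slaveapi
    actually sends is not shown in this bug, so callers supply their own.
    """
    return urllib.request.Request(
        BMO_REST + "/bug",
        data=json.dumps(fields).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-BUGZILLA-API-KEY": api_key,  # assumed auth mechanism
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to actually submit it. Doing that against production files a real bug, so a throwaway summary or a test instance is the safer target.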
Flags: needinfo?(bugspam.Callek)
Reporter
Comment 24•8 years ago
Apparently it just needed dhouse to restart slaveapi (a couple of times) after he did a kernel upgrade on it, since it just filed some tracking bugs for the first time in just over two years.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 8 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard