Closed
Bug 1461379
(bmo-dbi-connector)
Opened 6 years ago
Closed 6 years ago
API DB Availability Exceptions on recurring BMO scripts
Categories
(bugzilla.mozilla.org :: API, defect, P1)
Tracking
()
RESOLVED
FIXED
People
(Reporter: claudijd, Assigned: dylan)
References
Details
Attachments
(2 files)
We have a scheduled job set up that periodically polls the BMO API and looks for unassigned bugs in a component where we have a round-robin assignment script running. I happened to notice that periodically (about once every few days) it throws an exception on the API. Although we are taking steps to work around that, I suspect the BMO team very much cares about how robust and reliable the API is, so I'm sharing some stack traces for transparency and comment (maybe there is something that can be done to make this more reliable)...

$ /bin/bash -e /tmp/jenkins2189971386254713860.sh
[00:00:01] INFO [__main__.autoassign:213] No unassigned bugs for component
[00:00:02] INFO [__main__.autoassign:213] No unassigned bugs for component
[00:00:02] DEBUG [__main__.autocasa:89] Analyzing 10 closed bugs...
[00:00:04] WARNING [__main__.autocasa:113] Project 10829 already has a security status set (done), skipping!
[00:00:06] WARNING [__main__.autocasa:119] Project 11513 is already in status 'none' and will not be modified
[00:00:08] WARNING [__main__.autocasa:113] Project 11518 already has a security status set (done), skipping!
[00:00:09] WARNING [__main__.autocasa:113] Project 11524 already has a security status set (done), skipping!
[00:00:11] WARNING [__main__.autocasa:113] Project 11535 already has a security status set (done), skipping!
[00:00:13] WARNING [__main__.autocasa:119] Project 11558 is already in status 'none' and will not be modified
[00:00:14] WARNING [__main__.autocasa:113] Project 11618 already has a security status set (done), skipping!
[00:00:16] WARNING [__main__.autocasa:119] Project 11619 is already in status 'none' and will not be modified
Traceback (most recent call last):
  File "./assigner.py", line 220, in <module>
    main()
  File "./assigner.py", line 66, in main
    autocasa(bapi, capi, bcfg, ccfg, args.dry_run)
  File "./assigner.py", line 93, in autocasa
    comments = bapi.get_comments(bug.get('id'))['bugs'][str(bug.get('id'))]['comments']
  File "/usr/lib/python3.6/site-packages/bugzilla.py", line 55, in get_comments
    return self._get('bug/{bugid}/comment'.format(bugid=bugid))
  File "/usr/lib/python3.6/site-packages/bugzilla.py", line 123, in _get
    raise Exception(r.url, r.reason, r.status_code, r.json())
Exception: ('https://bugzilla.mozilla.org/rest/bug/1456277/comment?api_key=****', 'OK', 200, {'documentation': 'https://bmo.readthedocs.org/en/latest/api/', 'error': True, 'code': 100500, 'message': "\nCan't connect to the database.\nError: Lost connection to MySQL server at 'reading initial communication packet', system error: 104\n Is your database installed and up and running?\n Do you have the correct username and password selected in localconfig?\n\n"})
Build step 'Execute shell' marked build as failure

The error seems to suggest that BMO is having communication reliability issues between the web front end and the back-end MySQL DB.
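For anyone hitting the same failure from a polling script, here is a minimal sketch of a client-side workaround. It is not the fix that landed on BMO, and it is not the code in assigner.py. It assumes the caller can obtain the decoded JSON body as a dict, and it treats code 100500 as retryable only because that is the code seen in the trace above. The names `call_with_retry`, `BugzillaAPIError`, and `TRANSIENT_CODES` are made up for illustration.

```python
import time

# Code seen in the trace above. Treating it as retryable is an assumption,
# not documented Bugzilla behavior.
TRANSIENT_CODES = {100500}


class BugzillaAPIError(Exception):
    """Raised when the response body reports an error that is not retried."""


def call_with_retry(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a payload without a transient error.

    fetch is any zero-argument callable returning the decoded JSON body.
    The HTTP status is not consulted, because the failure above arrived
    with status 200 and 'error': True in the body.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        payload = fetch()
        if not payload.get("error"):
            return payload
        retryable = payload.get("code") in TRANSIENT_CODES
        last_try = attempt == attempts - 1
        if not retryable or last_try:
            raise BugzillaAPIError(payload.get("code"), payload.get("message"))
        # Exponential backoff: base_delay, then 2x, then 4x, ...
        sleep(base_delay * 2 ** attempt)
```

Note that the client in the traceback raises a bare Exception from `_get` instead of returning the payload, so it would need a small change (or the wrapper would need to catch that exception) before this could be dropped in.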
Reporter
Comment 1 • 6 years ago
Here's the issue for our project that is triggering these exceptions every so often (twice in a few days' time, running once every 10 minutes or so):

https://github.com/mozilla/infosec-risk-management-bugzilla/issues/11

Code that triggers this:

https://github.com/mozilla/infosec-risk-management-bugzilla/blob/master/assigner.py
Reporter
Comment 2 • 6 years ago
Dylan: is this expected behavior on BMO API endpoints?
Flags: needinfo?(dylan)
(In reply to Jonathan Claudius [:claudijd] (use NEEDINFO) from comment #1)
> Here's the issue for our project that is triggering these exceptions every
> so often (twice in a few days time, running once every 10min or so):
>
> https://github.com/mozilla/infosec-risk-management-bugzilla/issues/11
>
> Code that triggers this:
>
> https://github.com/mozilla/infosec-risk-management-bugzilla/blob/master/assigner.py

How many bugs are you making edits on at a time? I'm wondering if this is related to the bulk editing issue I've run into.
Flags: needinfo?(jclaudius)
Priority: -- → P1
Reporter
Comment 4 • 6 years ago
:emceeaich - The tool does a couple of things:

1.) READ/WRITE - Looks for unassigned bugs in two security components (Enterprise Information Security::Vulnerability Assessment and Enterprise Information Security::Rapid Risk Analysis) and, if any are unassigned, assigns them in round-robin order.

2.) READ - Looks at all bugs in the same two security components (Enterprise Information Security::Vulnerability Assessment and Enterprise Information Security::Rapid Risk Analysis) and checks whether they are in sync with the Mozilla CASA system. If the statuses are out of sync, it corrects them in the CASA system. I don't believe it goes in the other direction, so no WRITE here.

Let us know if you would prefer a real-time troubleshooting session where we could trigger multiple runs in succession and see if that triggers your issue. The runs are scheduled once every 10 minutes, 24/7, to triage any new security issues ASAP.
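To make the round-robin step concrete, here is a rough sketch of how such a rotation could work. It is not the code in assigner.py. The function name, the assignee names, and the `start` offset are hypothetical, and the sketch assumes the caller stores the returned offset between runs so the rotation carries over from one scheduled run to the next.

```python
def assign_round_robin(bug_ids, assignees, start=0):
    """Pair each unassigned bug with the next assignee in rotation.

    Returns (plan, next_start): plan maps bug id to assignee, and
    next_start is the offset to pass in on the following run.
    """
    if not assignees:
        raise ValueError("need at least one assignee")
    bug_ids = list(bug_ids)
    plan = {}
    for offset, bug_id in enumerate(bug_ids):
        # Wrap around the assignee list once the end is reached.
        plan[bug_id] = assignees[(start + offset) % len(assignees)]
    next_start = (start + len(bug_ids)) % len(assignees)
    return plan, next_start


# Hypothetical bug ids and assignees, for illustration only.
plan, next_start = assign_round_robin([101, 102, 103], ["alice", "bob"], start=1)
print(plan, next_start)
```

With no unassigned bugs the plan is empty and the offset is returned unchanged, which matches the "No unassigned bugs for component" case in the log.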
Flags: needinfo?(jclaudius)
Reporter
Comment 5 • 6 years ago
Please note that we are still experiencing these issues. More context and times of occurrence are here:

https://github.com/mozilla/infosec-risk-management-bugzilla/issues/13

Note that we run this every 10 minutes around the clock.
Reporter
Comment 6 • 6 years ago
:digi - making you aware of this, as our latest error (documented in issue #13 above) seems to suggest proxy tunnel issues
Comment 7 • 6 years ago
(In reply to Jonathan Claudius [:claudijd] (use NEEDINFO) from comment #6)
> :digi - making you aware of this, as our latest error (documented in issue
> #13 above) seems to suggest proxy tunnel issues

Thanks! What host(s) run this job?
Reporter
Comment 8 • 6 years ago
pentest-master.private.mdc1.mozilla.com || pentest-slave1.private.mdc1.mozilla.com (I can't log in to check at the moment, but I think it's the slave)
Assignee
Updated • 6 years ago
Assignee: nobody → dylan
Flags: needinfo?(dylan)
Assignee
Comment 9 • 6 years ago
Assignee
Comment 10 • 6 years ago
Assignee
Comment 11 • 6 years ago
I'm not entirely sure the landed changes will fix the bug, but I'm going to resolve this bug and hope that next week we see a difference.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Reporter
Comment 12 • 6 years ago
:dylan - thank you, I'll report back / reopen as needed.
Assignee
Updated • 6 years ago
Alias: bmo-dbi-connector
Assignee
Updated • 6 years ago
Blocks: bmo-db-connector-fix