Closed Bug 1500411 (bmo-errors-2018-10-19) Opened 6 years ago Closed 6 years ago

Intermittent 500 server errors on BMO all around

Categories

(bugzilla.mozilla.org :: General, defect, P1)

Production
defect


RESOLVED FIXED

People

(Reporter: Usul, Assigned: dylan)

References

(Depends on 1 open bug)

Details

I got reports from both Pike and Aryx that they encountered issues submitting bug changes to Bugzilla.
13:29:11 CET <Pike> bugzilla just puked a rainbow on me. "something went very wrong"
This was on bug 1317336 around 13:28 CET. Aryx had the issue 5/6 minutes before on either bug 1500031 or 1500037 (he wasn't sure).
At 14:05 Aryx reproduced:
<Aryx> Usul: just got it again for https://bugzilla.mozilla.org/show_bug.cgi?id=1500312
<Aryx> edited bug summary and tried to submit
I had this a few minutes ago on bug 1487703, and now tried 3 times to 'follow' this bug using the 1-click button at the top of the bug, and all of them came back "Unexpected Error".
<Gijs> yeah, multiple people reporting this now.
<Gijs> see #fx-team in the last 5 minutes
Had the issue when trying to add this comment.
I just got this 5 times in a row when trying to check the "block reviews" checkbox. The 6th time worked. And got it twice when trying to change a reviewer on a bug; third time worked.
For me, BMO is unusable right now. Can't update or file bugs.
Comment #4 got posted on the 3rd attempt :-(
It took me 5 attempts to submit bug 1498188 comment 1, and then 9 attempts to set the "wontfix" flag in the same bug.
I tried 5 times to upload a patch to bug 1498740. Gave up.
I cannot review, give feedback, edit attachments, needinfo others, or edit my profile.
Folks, I don't think the "me too" comments are bringing new information ;)
BTW, "Preview" doesn't work either, or is this another bug?
(In reply to Jorg K (GMT+2) from comment #13)
> BTW, "Preview" doesn't work either, or is this another bug?
Looks like the same; the server returns 500 Internal Server Error.
Summary: Sometimes users need to resubmit when editing bugs → Intermittent 500 server errors on BMO all around
Hey! For the moment the error should be resolved. The preliminary theory is that the changes that made the Phabricator/Bugzilla sync daemon more resilient to MySQL errors left the web heads less resilient to them. This manifested today because (until we upgrade MySQL) the MySQL servers will crash every Friday.
Assignee: nobody → dylan
Just a note, you didn't receive emails from my last update because my account was set to silent mode, from handling admin tasks yesterday. Sorry for any worry that may have caused.
Component: Bug Creation/Editing → General
Priority: -- → P1
Summary: Intermittent 500 server errors on BMO all around → BMO webheads do not handle database disconnection events gracefully
Depends on: bmo-infra-tmp
Depends on: 1500550
Okay, so the preliminary diagnosis was not entirely correct. The root cause was MySQL crashing, but the error that was triggering the 500 errors was that the /tmp directory filled up. We're fixing that in bug 1500547. There is some room for improvement in how we handle and report database errors, so that change will go in bug 1500547. This bug will be resolved when both of those (and anything else we find and link to this bug) are fixed.
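For illustration only, here is a minimal sketch of the kind of check that would tell these two failure modes apart in an error report. It is not BMO code: the function names, the 64 MiB threshold, and the failure labels are all hypothetical, and it is written in Python rather than in BMO's own codebase language.

```python
import shutil
import tempfile


def tmp_has_room(path=None, min_free_bytes=64 * 1024 * 1024):
    """Return True if the filesystem holding `path` has enough free space.

    `path` defaults to the system temp directory. The 64 MiB default is an
    arbitrary, hypothetical threshold.
    """
    if path is None:
        path = tempfile.gettempdir()
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes


def classify_failure(db_reachable, tmp_ok):
    """Pick a label for the error report (labels are hypothetical).

    A full temp directory is reported first, because it can break request
    handling even when the database is reachable again.
    """
    if not tmp_ok:
        return "tmp_full"
    if not db_reachable:
        return "db_unavailable"
    return "ok"
```

Something like `classify_failure(db_reachable=True, tmp_ok=tmp_has_room())` could then feed the error page or the logs, so that a full /tmp is not mistaken for a database outage.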
Depends on: bmo-kills-mysql
Summary: BMO webheads do not handle database disconnection events gracefully → A combination of /tmp filling up in the container caused by a MySQL crash lead to ISE 500 errors
Summary: A combination of /tmp filling up in the container caused by a MySQL crash lead to ISE 500 errors → BMO producing HTTP 500 errors after tmp filling up in the container caused by a MySQL crash
Alias: bmo-errors-2018-10-19
Summary: BMO producing HTTP 500 errors after tmp filling up in the container caused by a MySQL crash → Intermittent 500 server errors on BMO all around
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED