Closed Bug 594476 Opened 14 years ago Closed 14 years ago

Input production is down

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
blocker

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: stephend, Assigned: fox2mike)

Details

Input.mozilla.com production is down: oddly enough, it returns 200 OK, but the page is blank.
Looking into this.
Assignee: server-ops → shyam
Okay, so here is what I think happened:

1) Due to an unrelated incident, we had pm-app-generic03 offline for a rebuild.
2) Beta5 reaches users; everyone gets happy/sad and hits Input.
3) Input's DB cluster can't take the load, so we see a general performance impact on the site and the DB.
4) I kill off a query on the DB that had been running for over 800 seconds, but this comes a little too late.
5) The generic cluster, already short one machine, is slowly overwhelmed. All 5 machines page for a number of issues.
6) MySQL on the DB slave for input was kicked. Webheads start coming back to sanity.
7) Dave identifies a possible problem query:

12:51:05 <@justdave> ok, here's input's problem:
12:51:06 <@justdave> SELECT id, positive, url, description, product, CRC32(version) as version, CRC32(os) as os, CRC32(locale) AS locale, UNIX_TIMESTAMP(created) AS created, url IS NOT NULL AND url != '' AS has_url FROM feedback_opinion
12:51:23 <@justdave> no WHERE clause
12:51:30 <@justdave> 411k rows in that table
12:51:36 <@justdave> it's retrieving the entire thing
12:51:37 <@justdave> frequently

8) Stuff recovers across the cluster. Lack of a webhead probably contributed to the issue.
9) The pm-app-generic03 rebuild is complete and the machine is back online.
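
As a rough sketch of one way the query in step 7 could be bounded: the Python below reads the table in fixed-size batches keyed on `id` (keyset pagination) instead of issuing a single SELECT with no WHERE clause. It uses an in-memory SQLite table as a stand-in for the MySQL `feedback_opinion` table, so the MySQL-only `CRC32()` and `UNIX_TIMESTAMP()` columns are left out. `iter_opinions`, `make_demo_db`, and `BATCH_SIZE` are hypothetical names, and whether Input could batch like this depends on why it reads the whole table, which the log does not say.

```python
import sqlite3

BATCH_SIZE = 1000  # hypothetical value; tune against the real workload


def iter_opinions(conn, batch_size=BATCH_SIZE):
    """Yield feedback_opinion rows in id order, one bounded query per batch."""
    last_id = 0  # assumes ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, positive, url, description, product "
            "FROM feedback_opinion "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield from rows
        # Resume after the highest id seen so far.
        last_id = rows[-1][0]


def make_demo_db(n_rows):
    """Build an in-memory stand-in table (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feedback_opinion ("
        "id INTEGER PRIMARY KEY, positive INTEGER, url TEXT, "
        "description TEXT, product INTEGER)"
    )
    conn.executemany(
        "INSERT INTO feedback_opinion "
        "(id, positive, url, description, product) VALUES (?, ?, ?, ?, ?)",
        [(i, i % 2, "", f"opinion {i}", 1) for i in range(1, n_rows + 1)],
    )
    return conn
```

Each batch is a short primary-key range scan, so no single statement has to hold all 411k rows at once; the total work is the same, but it is spread over many small queries that are easier for the DB to interleave with other traffic.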

So I think a combination of these issues plus release day caused this. As of now, everything should have stabilized.

Reopen if you see further issues, ping me on IRC if you have further questions.

Thanks to justdave, jabba and jlaz for their help.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Verified FIXED, thanks, all!
Status: RESOLVED → VERIFIED
Product: mozilla.org → mozilla.org Graveyard