Closed Bug 620572 Opened 14 years ago Closed 14 years ago

Graph server failing to connect to its database

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: fox2mike)

Details

From http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1292890769.1292891497.14098.gz
FAIL: Failed to send data 5 times... quitting
RETURN:send failed, graph server says:
RETURN:(2003, "Can't connect to MySQL server on '10.2.70.130' (110)")
So far it has only taken out two Talos runs, so it could have just been a couple-minute hiccup, but if it keeps on, I'll have to close the tree, thus the severity.
10.2.70.130 is a stage database. Need to understand why production is trying to access a stage database.
10.2.70.130 is pamo, this is the first step in getting amo performance test results inserted into a given amo db. I'm working with justdave currently on getting this working.
Another four, and we're closed.
Done as work on bug 620015.
Can someone explain the relationship of the tree to graphs and why this outage requires the tree to close?
Assignee: server-ops → shyam
(In reply to comment #2)
> 10.2.70.130 is pamo, this is the first step in getting amo performance test
> results inserted into a given amo db. I'm working with justdave currently on
> getting this working.
10.2.70.130 is the stage DB. Production should *never* be connecting to the stage DB. Why isn't this being tested with graphs-stage?
It has been tested on graphs-stage. We were attempting a stepped move to production. Roll out production graph server code reporting to pamo, check that new graph server code works along with various releng updates, switch from pamo to production amo.
(In reply to comment #5)
> Can someone explain the relationship of the tree to graphs and why this outage
> requires the tree to close?
Talos tests post their results to graph server, so if it refuses to accept them, the test run turns red, rightfully, because we never really look at the individual numbers in the build log, only averages and graphs over time.
I was told removing the "amodb" config value from the config file would make it skip the portion of code that's erroring, so that's been removed until we straighten this out.
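The workaround above relies on the AMO write being optional when its config value is absent. A minimal sketch of that kind of guard follows; it is not the graph server's actual code. The function name `store_results`, the two callables, and the dict-shaped config are assumptions made for illustration; only the `amodb` key name comes from this bug.

```python
def store_results(config, results, store_main, store_amo):
    """Write results to the main store, and to the AMO DB only if configured.

    `config` is assumed to be a plain dict; `store_main` and `store_amo`
    are hypothetical callables standing in for the real database writers.
    Returns True if the AMO write was attempted, False if it was skipped.
    """
    # The primary write always happens.
    store_main(results)

    # With no "amodb" value (missing or empty), skip the AMO step entirely,
    # so an unreachable AMO database cannot fail the whole submission.
    amodb = config.get("amodb")
    if not amodb:
        return False

    store_amo(amodb, results)
    return True
```

With `config = {}` the call only touches the main store; with `config = {"amodb": "10.2.70.130"}` it also calls `store_amo` with that host.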
(In reply to comment #7)
> It has been tested on graphs-stage. We were attempting a stepped move to
> production. Roll out production graph server code reporting to pamo, check
> that new graph server code works along with various releng updates, switch from
> pamo to production amo.
This dependency was missed from the deployment docs.
I'm seeing more green runs; I believe you are clear to re-open.
(In reply to comment #8) Wouldn't it be better (if it is possible) to queue the results if it cannot connect to the DB? At face value it seems like the right thing to do to prevent something like this since otherwise all other functions are otherwise working, right?
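The queueing idea suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions, not how graph server or Talos works: the class name `QueueingSender`, the `send` callable, and the use of `ConnectionError` to signal an unreachable database are all hypothetical.

```python
from collections import deque


class QueueingSender:
    """Hold results in memory while the database is unreachable.

    `send` is a hypothetical callable that stores one result and raises
    ConnectionError when the database cannot be reached.
    """

    def __init__(self, send, max_pending=1000):
        self._send = send
        # A bounded deque: once full, appending drops the oldest entry.
        self._pending = deque(maxlen=max_pending)

    @property
    def pending(self):
        """Number of results still waiting to be stored."""
        return len(self._pending)

    def submit(self, result):
        """Queue a result, then try to drain the queue in order."""
        self._pending.append(result)
        return self.flush()

    def flush(self):
        """Send queued results oldest first; stop at the first failure.

        Returns the number of results stored during this call.
        """
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # Database still down; keep the rest queued.
            self._pending.popleft()
            sent += 1
        return sent
```

A caller would report success for the test run as soon as `submit` returns, and the queued results would be stored on a later `submit` or `flush` once the database is reachable again. An in-memory queue is lost on restart, so a real deployment would likely need durable storage.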
Reopened.
Severity: blocker → normal
Closing this out since the immediate issue is fixed.
(In reply to comment #13)
> (In reply to comment #8)
>
> Wouldn't it be better (if it is possible) to queue the results if it cannot
> connect to the DB? At face value it seems like the right thing to do to
> prevent something like this since otherwise all other functions are otherwise
> working, right?
I opened Bug 620596 for this.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard