Closed
Bug 822661
Opened 12 years ago
Closed 12 years ago
Can't connect from crashanalysis.dmz.phx1 to tp-socorro01-ro-zeus any more
Categories
(Socorro :: Database, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: kairo, Assigned: selenamarie)
To create my custom crash analysis reports, I need to connect from crashanalysis.dmz.phx1 to the secondary Socorro DB at tp-socorro01-ro-zeus.phx1.mozilla.com:6432. This worked fine until yesterday, but today I get this error:
Warning: pg_pconnect(): Unable to connect to PostgreSQL server
This blocks us from doing stability analysis for our upcoming Firefox releases, so I'm filing it with blocker severity.
Comment 1•12 years ago
Moving to webops
Assignee: server-ops → server-ops-webops
Component: Server Operations → Server Operations: Web Operations
QA Contact: shyam → nmaul
Updated•12 years ago
Assignee: server-ops-webops → dgherman
Comment 2•12 years ago
[root@crashanalysis.dmz.phx1 ~]# nc -vz tp-socorro01-ro-zeus.phx1.mozilla.com 6432
Connection to tp-socorro01-ro-zeus.phx1.mozilla.com 6432 port [tcp/pgbouncer] succeeded!
Can you give more details please?
Updated•12 years ago
Severity: blocker → normal
Comment 3•12 years ago (Reporter)
[rkaiser@crashanalysis.dmz.phx1 crash-report-tools]$ psql -h tp-socorro01-ro-zeus.phx1.mozilla.com -p 6432 -U analyst breakpad
psql: [rkaiser@crashanalysis.dmz.phx1 crash-report-tools]$
Note how it doesn't even ask me for a password. Maybe it's actually the PostgreSQL instance there that is unhappy.
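One way to separate "the port accepts TCP connections" from "something that speaks the PostgreSQL protocol is answering" is to send an SSLRequest packet, to which a PostgreSQL-compatible server replies with a single byte, `S` or `N`. The following is only a sketch under assumptions: the function name `probe_postgres` and the three result labels are made up for illustration, and the thread does not say whether this check would have distinguished the failure seen here.

```python
import socket
import struct

# SSLRequest from the PostgreSQL frontend/backend protocol:
# int32 message length (8) followed by int32 request code 80877103.
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def probe_postgres(host, port, timeout=3.0):
    """Classify an endpoint as 'no-tcp', 'no-reply', or 'postgres-like'.

    Hypothetical helper: the labels are illustrative, not a standard.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return "no-tcp"
    with sock:
        try:
            sock.sendall(SSL_REQUEST)
            reply = sock.recv(1)
        except OSError:
            # Reset or timeout after the TCP handshake succeeded.
            return "no-reply"
    # 'S' = server is willing to use SSL, 'N' = it is not.
    # Either answer shows a PostgreSQL-style listener behind the port.
    return "postgres-like" if reply in (b"S", b"N") else "no-reply"
```

Calling `probe_postgres("tp-socorro01-ro-zeus.phx1.mozilla.com", 6432)` from the analysis host would return one of the three labels. A result of `no-reply` would indicate that the connection was accepted but nothing behind it answered the request.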
Comment 4•12 years ago
The VIP points to tp-socorro01-master02.phx1.mozilla.com:6432.
But on master02, even though I see postgres processes running, this port is not open:
[root@tp-socorro01-master02.phx1 ~]# ps aux | grep postg
postgres 20183 0.0 0.2 8949436 202328 ? S Dec17 0:12 /usr/pgsql-9.2/bin/postmaster -p 5432 -D /pgdata/9.2/data
postgres 20185 0.0 0.0 177220 1552 ? Ss Dec17 0:00 postgres: logger process
postgres 20186 0.5 11.5 8953632 8558908 ? Ss Dec17 5:34 postgres: startup process recovering 0000001C00000EE50000004B
postgres 20191 0.0 9.5 8954192 7052992 ? Ss Dec17 0:46 postgres: checkpointer process
postgres 20192 0.0 6.1 8953536 4579428 ? Ss Dec17 0:14 postgres: writer process
postgres 20194 0.0 0.0 179620 1792 ? Ss Dec17 0:14 postgres: stats collector process
postgres 20593 0.3 0.0 8964492 5156 ? Ss Dec17 3:17 postgres: wal receiver process streaming EE5/4B9934A0
[root@tp-socorro01-master02.phx1 ~]# netstat -tunap | grep 6432
[root@tp-socorro01-master02.phx1 ~]#
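In the `ps` output above, the startup process is recovering segment `0000001C00000EE50000004B` while the WAL receiver is streaming at location `EE5/4B9934A0`. These refer to the same WAL segment, which can be checked with a small sketch. It assumes the default 16 MB WAL segment size (the thread does not state the configured size), and the helper name `wal_segment_name` is made up for illustration.

```python
# Assumption: default PostgreSQL WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_segment_name(timeline, location):
    """Map a timeline ID and a WAL location like 'EE5/4B9934A0'
    to the 24-character WAL segment file name."""
    log_hex, offset_hex = location.split("/")
    log_id = int(log_hex, 16)
    # Each segment covers WAL_SEGMENT_SIZE bytes of the log file.
    segment = int(offset_hex, 16) // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (timeline, log_id, segment)


# Timeline 0x1C is taken from the first 8 hex digits of the segment
# name shown for the startup process.
print(wal_segment_name(0x1C, "EE5/4B9934A0"))  # 0000001C00000EE50000004B
```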
Comment 5•12 years ago (Assignee)
The replica is currently broken.
Matt's not in yet. I'm going to kick off a base backup to try to fix this. The underlying problem is that something is deleting the WAL before it can be replayed on the replica. I'm guessing this is a PgX script. In lieu of documentation, I'm pulling Josh Berkus in to see if we can sort out why the WAL disappears prematurely.
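The failure mode described here, WAL being removed before the replica has replayed it, can be expressed as a simple comparison of segment names. This is a rough sketch: `can_catch_up` is a hypothetical helper, and the segment names in the usage lines are illustrative values rather than data from this incident. It relies on WAL segment file names being fixed-width uppercase hex, so that within one timeline string order matches numeric order.

```python
def can_catch_up(needed_segment, oldest_available_segment):
    """Return True if the segment the replica needs next is still retained.

    Both arguments are 24-character WAL segment file names. The first
    8 characters are the timeline ID; comparing names across different
    timelines is not meaningful, so that case is rejected.
    """
    for name in (needed_segment, oldest_available_segment):
        if len(name) != 24:
            raise ValueError("not a WAL segment name: %r" % name)
    if needed_segment[:8] != oldest_available_segment[:8]:
        raise ValueError("segments are on different timelines")
    # Fixed-width hex: lexicographic order equals numeric order.
    return needed_segment.upper() >= oldest_available_segment.upper()


# Illustrative values only: the replica needs ...4B next.
print(can_catch_up("0000001C00000EE50000004B", "0000001C00000EE500000040"))
print(can_catch_up("0000001C00000EE50000004B", "0000001C00000EE500000050"))
```

When the check is false, the replica cannot continue from where it stopped, which is consistent with the decision above to take a new base backup.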
Updated•12 years ago
Component: Server Operations: Web Operations → Database
Product: mozilla.org → Socorro
QA Contact: nmaul
Comment 6•12 years ago (Assignee)
Replica is now working and access to _ro is restored.
Assignee: dgherman → sdeckelmann
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED