Closed
Bug 936878
Opened 11 years ago
Closed 11 years ago
/buildjson/builds-4hr.js.gz is CRITICAL ** not updating
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cbook, Assigned: nthomas)
References
Details
via nagios:
Service: http file age - /buildjson/builds-4hr.js.gz
Host: builddata.pub.build.mozilla.org
Address: 63.245.215.57
State: CRITICAL
Date/Time: 11-10-2013 00:18:02
Additional Info:
HTTP CRITICAL: HTTP/1.1 200 OK - Last modified 0:18:51 ago - 195813 bytes in 0.036 second response time
Not closing the trees at this time because they are already closed due to bug 936827.
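For readers unfamiliar with this kind of check, here is a rough sketch of how a "file age" test on the Last-Modified header could work. It is not the actual Nagios plugin: the thresholds, the function name and the timestamps in the usage note are all hypothetical, since the real limits are not stated in this bug.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical thresholds; the real Nagios limits are not given in this bug.
WARN_AFTER = timedelta(minutes=5)
CRIT_AFTER = timedelta(minutes=10)


def file_age_state(last_modified, now=None):
    """Classify an HTTP resource by the age of its Last-Modified header."""
    modified = parsedate_to_datetime(last_modified)
    if modified.tzinfo is None:
        # Treat a header without zone information as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - modified
    if age >= CRIT_AFTER:
        return "CRITICAL"
    if age >= WARN_AFTER:
        return "WARNING"
    return "OK"
```

With these assumed thresholds, a file last modified 18 minutes 51 seconds before the check (the age reported in the alert above) would be classified as CRITICAL, e.g. `file_age_state("Sun, 10 Nov 2013 07:59:11 GMT", now=datetime(2013, 11, 10, 8, 18, 2, tzinfo=timezone.utc))`. The timestamps in that call are illustrative only.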
Comment 1•11 years ago
I fixed this up. The weekly restart of the redis server failed:
On 10/11/13 9:00 PM, Cron Daemon wrote:
> Found redis running on pid 2789
> Open files 295 in /proc, 304 via lsof
> Stopping redis-server: [FAILED]
> Starting redis-server: [ OK ]
> cat: /var/run/redis/redis.pid: No such file or directory
> /root/weekly_restart: line 20: test: : integer expression expected
> redis confusion: pid_file=, pgrep=2789
> Redis apparantly not running after restart
>
From redis01:/var/log/redis/redis.log:
[2789] 09 Nov 23:57:18 * Background saving terminated with success
[2789] 10 Nov 00:00:01 # Received SIGTERM, scheduling shutdown...
[2789] 10 Nov 00:00:01 # User requested shutdown...
[2789] 10 Nov 00:00:01 * Saving the final RDB snapshot before exiting.
[575] 10 Nov 00:00:07 # Opening port 6379: bind: Address already in use
[1128] 10 Nov 00:27:09 * Server started, Redis version 2.4.10
i.e. PID 575 failed to start because PID 2789 had not finished shutting down and still held port 6379, which left nothing running once 2789 eventually exited. I restarted redis manually.
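One way to avoid this race is to wait until the old server has released its port before starting the new one, and to treat a missing or empty pid file as "no PID" instead of passing an empty string to an integer comparison. The following is a minimal sketch of that idea, not the contents of /root/weekly_restart; the helper names, the timeout values and the Linux SO_REUSEADDR behaviour it relies on are assumptions.

```python
import socket
import time


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing, empty or malformed."""
    try:
        with open(pid_file) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP server could bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        # Mirror what most servers do; on Linux the bind still fails
        # while another socket is listening on the same port.
        probe.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def wait_for_port_free(port, timeout=60.0, interval=0.5, host="127.0.0.1"):
    """Poll until the port is free; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while True:
        if port_is_free(port, host):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A restart wrapper could call `wait_for_port_free(6379)` after sending the stop request and only start the new server once it returns True. If it returns False, the wrapper should report the failure instead of starting a second instance that cannot bind.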
Please file these as blockers rather than critical, as it's better to not change the priority after the other blocker is closed and I'd argue this is more important anyway (global tbpl reporting).
Assignee: nobody → nthomas
Severity: critical → blocker
Status: NEW → RESOLVED
Closed: 11 years ago
Priority: -- → P1
Resolution: --- → FIXED
Comment 2•11 years ago
It looks like this failed again. Why wasn't the startup script fixed?
Comment 3•11 years ago
There has been discussion about redoing the redis service (moving off of kvm, into scl3, managed by webops). Please see bug 934627 and bug 934593 for proposed future work.
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard