Closed Bug 467412 Opened 16 years ago Closed 16 years ago

push zxtm logs to im-log02

Categories

(mozilla.org Graveyard :: Server Operations, task)

All
Other
task
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mrz, Assigned: oremj)

Details

Attachments

(1 file, 1 obsolete file)

VAMO logs are going to each ZXTM server.  Need to put something in place to gzip old logs and transfer to im-log02.

Details to follow.
Assignee: server-ops → oremj
Need to be able to access im-log02:ssh from zxlb*.  It looks like the default route for zxlb01 is 10.2.80.1, so I guess in that case 10.2.80.20 would need access?  Seems wrong to use that interface since it is being used for http/https.
This is a temp solution at best so maybe the best way is:

cat > /etc/sysconfig/network-scripts/route-eth0
10.2.0.0/16 via 10.2.10.1

That network should have connectivity to im-log02.  Or you could be more restrictive, 10.2.75.0/24 via 10.2.10.1 ?
Logs are also on zxlb04-08 now.
Make sure to use the new log settings.  I'll switch that cluster.
Done: im-log02:/data/stats/logs/zxlb*
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
how often are they copied over?
Crontab:
# Puppet Name: compress
0 * * * * /data/bin/gzip_files.sh
# Puppet Name: ship
30 * * * * /data/bin/rsync_files.sh
Will there be any consolidation of those logs or will the AMO ETL have to process each host individually?

Also, can someone fix the permissions on the directories to be similar to the others such as nladm01 and im-log01?  I see that it has the r and x bits set for other, but when I try to ls the directories from etl01, I get permission denied.
Would it be a big help to consolidate them?
I don't know.  I guess it mostly depends on how long this arrangement is likely to stand.

With the addition of the vamo log directory, I refactored my ETL to be able to handle additional data sources a bit better.  The only thing I'm not sure of is whether to treat all these zxtm sources as one logical source or if I should track them separately.  If I group them then my chart looks a little cleaner, but I won't be able to point at one particular machine that might be missing logs; I'd just (hopefully) notice a percentage dip in the traffic from all ZX logs.

What do you think?
permissions on the zxtm01 directory still don't allow go+rx
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
ZXTM is likely to be a long term solution.  I'd rather not spend time trying to consolidate logs - I think it'll add a lot of overhead.  The ZXTM cluster could grow in size and I don't think it's realistic to push logs from some number of nodes and try to combine them.
I was thinking more along the lines of consolidating them into one directory, which is pretty easy, but if it doesn't really help then I'd rather just leave it as it is.
Also, looks like someone set go+x on those directories.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
If ZXTM is going to be long term then I should make some small changes to my ETL to be able to track the individual hosts so we can know which one might be missing logs when problems arise.

Regarding comment #14, it seems to me that comment #11 is still true.  Do I need to get the drive remounted or something?

[deinspanjer@cm-metricsetl01 logs]$ ls -ld zxlb*
drwx------ 2 112 112 12288 Dec  4 22:00 zxlb01
drwxr-xr-x 2 112 112 12288 Dec  4 22:00 zxlb02
drwxr-xr-x 2 112 112 12288 Dec  4 22:00 zxlb03
drwxr-xr-x 2 112 112 12288 Dec  4 22:00 zxlb04
drwxr-xr-x 2 112 112 12288 Dec  4 22:00 zxlb05
drwxr-xr-x 2 112 112 12288 Dec  4 22:00 zxlb06
drwxr-xr-x 2 112 112 12288 Dec  4 22:00 zxlb07
drwxr-xr-x 2 112 112 12288 Dec  4 22:00 zxlb08
[deinspanjer@cm-metricsetl01 logs]$ ls zxlb01
ls: zxlb01: Permission denied
[deinspanjer@cm-metricsetl01 logs]$ sudo ls -ld zxlb01
drwx------ 2 112 112 12288 Dec  4 22:00 zxlb01
[deinspanjer@cm-metricsetl01 logs]$ sudo ls -l zxlb01 | wc -l
88
My bad, I misunderstood. I didn't know the files were in subdirs (I didn't pay attention to the log changes oremj made yesterday).
Oops, now the perms on zxlb01 are fixed.
Sorry, on cm-metricsetl01, I'm still seeing zxlb01 with no go+rx.
I bumped up the priority because my ETL of versioncheck is stalled in december on this.
Severity: minor → critical
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The bad perms were rsyncing over the top of my changes.  Should be fixed for good now.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
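For anyone who wants to check for this kind of permission regression without listing each directory by hand, here is a minimal Python sketch. It is not part of any script mentioned in this bug; the function name dirs_missing_go_rx and the glob pattern passed to it are illustrative assumptions.

```python
import glob
import os
import stat

# Bits granted by "go+rx": read and execute for group and for other.
GO_RX = stat.S_IRGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IXOTH


def dirs_missing_go_rx(pattern):
    """Return the directories matching pattern that lack any go+rx bit."""
    missing = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isdir(path):
            continue
        # S_IMODE keeps only the permission bits of st_mode.
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode & GO_RX != GO_RX:
            missing.append(path)
    return missing
```

Called with a pattern such as /data/stats/logs/zxlb*, it would be expected to report a directory with mode drwx------ and skip those with drwxr-xr-x.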
Attached file Script to consolidate logs (obsolete) —
I'm going to start running this script which will hardlink the zeus logs to /data/stats/logs/im-log02/APP_NAME/access_YYYY-MM-DD-HH.z\d+.gz.  So you should be able to remove /data/stats/logs/zxlb*/ from your scripts.
Attachment #351832 - Attachment mime type: text/x-python → text/plain
Attached file Consolidates logs
I realized the last script didn't check if the file was already there.  This should work better.
Attachment #351832 - Attachment is obsolete: true
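The attached script is not reproduced in this thread. As a rough sketch of the idea only (hard-link each host's logs into one per-application directory and skip anything that is already there), something like the following could work. The function name consolidate, the assumed source layout src_root/HOST/APP_NAME/*.gz, and the way the destination name is built are all assumptions for illustration, not the attachment's actual logic. It also assumes dest_root is not located inside src_root.

```python
import os


def consolidate(src_root, dest_root):
    """Hard-link src_root/HOST/APP/*.gz into dest_root/APP/.

    Returns the list of links created on this run.
    """
    created = []
    for host in sorted(os.listdir(src_root)):
        host_dir = os.path.join(src_root, host)
        if not os.path.isdir(host_dir):
            continue
        for app in sorted(os.listdir(host_dir)):
            app_dir = os.path.join(host_dir, app)
            if not os.path.isdir(app_dir):
                continue
            os.makedirs(os.path.join(dest_root, app), exist_ok=True)
            for name in sorted(os.listdir(app_dir)):
                if not name.endswith(".gz"):
                    continue
                # Illustrative naming only: put the host before ".gz".
                dest_name = name[:-len(".gz")] + "." + host + ".gz"
                dest = os.path.join(dest_root, app, dest_name)
                if os.path.exists(dest):
                    continue  # already linked on an earlier run
                os.link(os.path.join(app_dir, name), dest)
                created.append(dest)
    return created
```

Because these are hard links, the original files under each host directory stay in place and remain readable, and running the function again creates nothing new.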
I had mentioned in comment #15 that I was going to make changes to my script to be able to track which host the logs came from so we'd be able to point to a specific server if log files were missing.  Will this script reliably allow that?
When will the script go live?
The script can go live at any time.  I can change the filename to /data/stats/logs/im-log02/APP_NAME/access_YYYY-MM-DD-HH.HOSTNAME\d+.gz if you would like to track per host statistics.
That would help.  I'm still not sure what advantage this script has over leaving the files where they are, considering that I've already implemented checking zxlb* for logfiles.

In your new filename format, you don't have any place for roll-over file numbers.  Does that mean that the cluster won't have roll-over files? e.g. 2008-12-01.1.gz
The roll-over part is the \d+ in access_YYYY-MM-DD-HH.HOSTNAME\d+.gz.  The advantage would be not having to change configs, for all the other applications that are processing logs, every time a zeus machine is added to the pool.
Ah, I thought the \d was the digit part of the hostname.  My Bad. :)

Okay.  I assumed that whatever other AMO processes that needed to parse these logs would just do a .../zxlb*/*access* type fileglob like I'm doing.

So you said this was a hardlink, that means that I'll be able to cut my ETL config over to look at the new files whenever I'm ready and I won't be missing anything by continuing to read the zxlb* directories for a little while after the script goes live?
Yeah, you can continue to use the zxlb* directories.  Unfortunately, the other scripts would all require code changes to work with the new filename/directory format.
Final file name format "/data/stats/logs/im-log02/APP_NAME/access_YYYY-MM-DD-HH.HOSTNAME_\d.gz"
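For consumers that want per-host tracking, the final name format can be split into its parts with a regular expression. This is a minimal Python sketch, not code from the attachment: the function name parse_log_name is hypothetical, and it accepts one or more roll-over digits (\d+) rather than exactly one, which is an assumption made to be tolerant.

```python
import re

# access_YYYY-MM-DD-HH.HOSTNAME_<roll-over>.gz
LOG_NAME = re.compile(
    r"^access_(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2})"
    r"\.(?P<host>.+)_(?P<roll>\d+)\.gz$"
)


def parse_log_name(filename):
    """Return (stamp, host, roll) for a consolidated log name, else None."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    return match.group("stamp"), match.group("host"), int(match.group("roll"))
```

Names that do not follow the HOSTNAME_\d form do not match and return None, so a caller can decide separately how to treat them.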
Jeremy, there are some files in the amo directory that have just z1.\d instead of the full hostname.  What are these?

Also, during the second half of the test, the hard links were being created an extra hour behind the normal logs.  Will that be a requirement going forward?
The z1 files are there, because I accidentally ran the script on the 7th before I made the hostname change.  I didn't want to rm them, because some of the stats scripts may have already looked at them.
Gotcha. good to know.
Here is the crontab entry for the consolidate script:
0 * * * * /root/bin/consolidate-zxlb.py

and here is the rsync:
30 * * * * /data/bin/rsync_files.sh

Usually the rsync takes <10 min, so the links should be created pretty shortly after.
I was just replying to comment 29.  I saw Bug 469130 after and posted a more detailed comment.
Product: mozilla.org → mozilla.org Graveyard