Closed Bug 1031159 Opened 11 years ago Closed 11 years ago

Improve slaveapi logging

Categories

(Release Engineering :: General, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Callek, Assigned: Callek)

Details

Attachments

(1 file, 1 obsolete file)

Attached patch [slaveapi] v1 (obsolete) — Splinter Review
So, slaveapi logs usually have a mix of "slavename" and "ip" as well as lots of lines without either (e.g. exceptions or lines that we forgot to stick a slavename on) This patch should do it quite well, by using a Log Filter attached to the handler, and using gevent.local.local() The concept was tested manually in a python shell, but I did not test this patch in the context of slaveapi yet. I don't anticipate testing it tomorrow, but jordan, if you can roll this into your aws testing path that would be appreciated, but not required.
Attachment #8446979 - Flags: review?(jlund)
Comment on attachment 8446979 [details] [diff] [review] [slaveapi] v1 Review of attachment 8446979 [details] [diff] [review]: ----------------------------------------------------------------- Cool! better consistent logging would be shweet. To clear my review queue, I'm gonna r- this for now until one of us tests this. I can add it to my aws stuff but IIUC this won't quite work well for me. aws_create_instance is the first time slaveapi will be working on a slave that does not yet exist and therefore won't avail of class Machine or Slave. But this will be good for negative testing like when log_data.slave == '-.-' ::: slaveapi-server.py @@ +85,5 @@ > handler = logging.StreamHandler() > + > + class SlaveLogFilter(logging.Filter): > + def filter(self, record): > + record.slave = getattr(log_data, "slave", "-=-") So if '-M-' == messenger, '-.-' == None, what is '-=-' ? @@ +89,5 @@ > + record.slave = getattr(log_data, "slave", "-=-") > + return True > + > + fmt = logging.Formatter("%(asctime)s - %(levelname)s - %(slave)s - %(message)s") > + handler.addFiltre(slaveLogFilter()) I'm guessing this is supposed to be s/addFiltre/addFilter ? ::: slaveapi/actions/buildslave_last_activity.py @@ +44,5 @@ > log.debug("slave.basedir='%s'" % slave.basedir) > # we'll disregard the return code b/c it will be non-zero if twistd.log.1 is not found > rc, output = console.run_cmd("tail -n 100 %(basedir)s/twistd.log.1 %(basedir)s/twistd.log" % { 'basedir': slave.basedir }) > except RemoteCommandError: > + return FAILURE, "failed to tail twistd.log" does this log filter stuff account for return msg and will add the slave to this string?
Attachment #8446979 - Flags: review?(jlund) → review-
Attached patch [slaveapi] v2Splinter Review
I'd love to deploy this pretty soon, since it greatly improves stuff. Added logException function which lets us, well, log an exception better (exc_info=True and log.exception both omits Filter prefix's, since it internally just prints lines of the exception info to the log) making grep of the log saner. Also fixed for bitrot :-) Other answers in next comment.
Attachment #8446979 - Attachment is obsolete: true
Attachment #8460037 - Flags: review?(jlund)
(In reply to Jordan Lund (:jlund) from comment #1) > Comment on attachment 8446979 [details] [diff] [review] > [slaveapi] v1 > > Review of attachment 8446979 [details] [diff] [review]: > ----------------------------------------------------------------- > > Cool! better consistent logging would be shweet. To clear my review queue, > I'm gonna r- this for now until one of us tests this. I can add it to my aws > stuff but IIUC this won't quite work well for me. aws_create_instance is the > first time slaveapi will be working on a slave that does not yet exist and > therefore won't avail of class Machine or Slave. But this will be good for > negative testing like when log_data.slave == '-.-' > > ::: slaveapi-server.py > @@ +85,5 @@ > > handler = logging.StreamHandler() > > + > > + class SlaveLogFilter(logging.Filter): > > + def filter(self, record): > > + record.slave = getattr(log_data, "slave", "-=-") > > So if '-M-' == messenger, '-.-' == None, what is '-=-' ? -=- is "default" meaning anyplace that doesn't have a log_data set. In practice that means web-initiated requests (and startup) -.- is actually "the processor thread, without a slave defined" (See log snippet at end of comment) > @@ +89,5 @@ > > + record.slave = getattr(log_data, "slave", "-=-") > > + return True > > + > > + fmt = logging.Formatter("%(asctime)s - %(levelname)s - %(slave)s - %(message)s") > > + handler.addFiltre(slaveLogFilter()) > > I'm guessing this is supposed to be s/addFiltre/addFilter ? yep fixed. > ::: slaveapi/actions/buildslave_last_activity.py > @@ +44,5 @@ > > log.debug("slave.basedir='%s'" % slave.basedir) > > # we'll disregard the return code b/c it will be non-zero if twistd.log.1 is not found > > rc, output = console.run_cmd("tail -n 100 %(basedir)s/twistd.log.1 %(basedir)s/twistd.log" % { 'basedir': slave.basedir }) > > except RemoteCommandError: > > + return FAILURE, "failed to tail twistd.log" > > does this log filter stuff account for return msg and will add the slave to > this string? No, it doesn't. But in those cases the caller/api knows what slave it is. (e.g. the processor will associate it) ----------- (log from my venv on slaveapi-dev1) ----------- 2014-07-21 22:39:21,745 - INFO - -=- - Running with pid 29340 2014-07-21 22:39:21,746 - INFO - -=- - Running at <WSGIServer at 0x17fd710 fileno=4 address=0.0.0.0:8091> 2014-07-21 22:39:28,321 - INFO - -.- - Processing item: (u't-xp32-ix-039', <function reboot at 0x1790758>, (), {}, <slaveapi.actions.results.ActionResult object at 0x17fdf10>) 2014-07-21 22:39:28,323 - INFO - -=- - 127.0.0.1 - - [2014-07-21 22:39:28] "POST /slaves/t-xp32-ix-039/actions/reboot HTTP/1.1" 202 281 0.004896 2014-07-21 22:39:28,328 - INFO - t-xp32-ix-039 - Getting inventory info 2014-07-21 22:39:28,679 - INFO - t-xp32-ix-039 - Getting devices.json info 2014-07-21 22:39:28,773 - INFO - t-xp32-ix-039 - Getting bug info 2014-07-21 22:39:28,774 - INFO - t-xp32-ix-039 - Sending request: GET https://bugzilla-dev.allizom.org/rest/bug/t-xp32-ix-039 2014-07-21 22:39:29,111 - INFO - t-xp32-ix-039 - Got response: 200 2014-07-21 22:39:32,924 - WARNING - t-xp32-ix-039 - First password as administrator didn't work. 2014-07-21 22:39:34,250 - WARNING - t-xp32-ix-039 - First password as root didn't work. 2014-07-21 22:39:35,454 - WARNING - t-xp32-ix-039 - First password as cltbld didn't work. 2014-07-21 22:39:36,063 - INFO - t-xp32-ix-039 - Couldn't connect with any credentials. 2014-07-21 22:39:36,064 - ERROR - t-xp32-ix-039 - Authentication failed. 2014-07-21 22:39:36,064 - ERROR - t-xp32-ix-039 - Traceback (most recent call last): 2014-07-21 22:39:36,064 - ERROR - t-xp32-ix-039 - 2014-07-21 22:39:36,064 - ERROR - t-xp32-ix-039 - File "/home/jwood/slaveapi/repo/slaveapi/slave.py", line 139, in get_console 2014-07-21 22:39:36,065 - ERROR - t-xp32-ix-039 - console.connect() # Make sure we can connect properly 2014-07-21 22:39:36,065 - ERROR - t-xp32-ix-039 - 2014-07-21 22:39:36,065 - ERROR - t-xp32-ix-039 - File "/home/jwood/slaveapi/repo/slaveapi/clients/ssh.py", line 86, in connect 2014-07-21 22:39:36,065 - ERROR - t-xp32-ix-039 - raise last_exc 2014-07-21 22:39:36,065 - ERROR - t-xp32-ix-039 - 2014-07-21 22:39:36,065 - ERROR - t-xp32-ix-039 - AuthenticationException: Authentication failed. 2014-07-21 22:39:36,066 - ERROR - t-xp32-ix-039 - 2014-07-21 22:39:36,066 - ERROR - t-xp32-ix-039 - 2014-07-21 22:39:36,066 - INFO - t-xp32-ix-039 - AFTER_TRACEBACK: INFO 2014-07-21 22:39:36,068 - INFO - t-xp32-ix-039 - Powercycling 2014-07-21 22:41:56,634 - INFO - t-xp32-ix-039 - Powercycle completed. 2014-07-21 22:41:56,635 - INFO - t-xp32-ix-039 - Waiting 60 seconds for reboot. 2014-07-21 22:41:58,654 - INFO - t-xp32-ix-039 - Checking for signs of life 2014-07-21 22:43:13,723 - INFO - t-xp32-ix-039 - Finished Processing item: (u't-xp32-ix-039', <function reboot at 0x1790758>, (), {}, <slaveapi.actions.results.ActionResult object at 0x17fdf10>)
Comment on attachment 8460037 [details] [diff] [review] [slaveapi] v2 Review of attachment 8460037 [details] [diff] [review]: ----------------------------------------------------------------- lgtm. I wonder if we will lose context in some log lines like e.g., when we use fqdn or Machine attrs instead of slave. I also wonder how to handle things link slaveapi logging for my upcoming create_aws_instance since it won't have a Slave assoctiated with it until it is created. But either way these changes look like a huge improvement while providing some standard. ::: slaveapi/util/__init__.py @@ +18,5 @@ > "truthy - %s, falsy - %s" % (value, true_values, false_values) > ) > + > + > +def logException(log_fn, message=None): Nice!
Attachment #8460037 - Flags: review?(jlund) → review+
deployed!
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: