Only buffer mochitest structured logs when needed and not for raw json logs

ASSIGNED
Assigned to

Status

ASSIGNED
4 years ago
3 years ago

People

(Reporter: akachkach, Assigned: chmanchester)

Tracking

({leave-open})

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 7 obsolete attachments)

28.60 KB, patch
Details | Diff | Splinter Review
22.49 KB, patch
Details | Diff | Splinter Review
38 bytes, text/x-review-board-request
jgraham
: review+
Details
(Reporter)

Description

4 years ago
Currently, the log buffering is done in the MessageLogger class, which causes buffered message to not be passed to all handlers, including StreamHandlers that stream to a file.

We don't want that to happen since the raw logs should contain all the information we have about a test run, as that can be useful later.
(Reporter)

Comment 1

4 years ago
Created attachment 8475741 [details] [diff] [review]
0001-Bug-1055765-Only-buffer-mochitest-structured-logs-fo.patch

WIP patch.

This currently works (except in the case of --run-by-dir and some exceptions using dump_buffered since I have to find a "smart" way to do that).

I was also thinking I could have a shared list of buffered messages instead of having one per formatter (and just check if a message isn't already added to the list before buffering it)
(Reporter)

Comment 2

4 years ago
Created attachment 8476344 [details] [diff] [review]
0001-Bug-1055765-Only-buffer-mochitest-structured-logs-fo.patch

Had to switch to wrapping handlers instead of wrapping formatters (because of the way dumping buffered logs work).

This should work just fine with run-by-dir and other things that need to explicitly dump buffered logs.
Attachment #8475741 - Attachment is obsolete: true
Attachment #8476344 - Flags: feedback?(cmanchester)
(Assignee)

Comment 3

4 years ago
Comment on attachment 8476344 [details] [diff] [review]
0001-Bug-1055765-Only-buffer-mochitest-structured-logs-fo.patch

Review of attachment 8476344 [details] [diff] [review]:
-----------------------------------------------------------------

f+ because I think this should happen, but I don't think this is ready.

::: testing/mochitest/runtests.py
@@ +139,5 @@
> +            self.buffered_messages.append(message)
> +        else:
> +            if self.buffered_messages:
> +                dumped_messages = min(len(self.buffered_messages), LogMochitestBuffering.BUFFERING_THRESHOLD)
> +                dumping_message = dict(message="dumping last {0} message(s)\nif you need more context, please use SimpleTest.requestCompleteLog() in your test".format(dumped_messages),

This is the only place we explicitly mention mochitest in this class. At least xpcshell also has a notion of buffering, and it's not totally incompatible with this one. Maybe there's an underlying buffer implementation that could live in mozlog.structured. It would need to be configurable by the buffer size and maybe give a callback to determine whether a message needs to be buffered or logged immediately.

@@ +170,5 @@
> +        # Setting up buffering
> +        if buffering:
> +            for i, handler in enumerate(self.logger.handlers):
> +                if isinstance(handler, StreamHandler) and handler.stream == sys.stdout:
> +                    self.logger.handlers[i] = LogMochitestBuffering(handler)

I'm not a fan of this because it has an effect on all the handlers for this logger. Shouldn't buffering be turned off by default (or maybe default to on for some formatters) for everyone and set per-formatter by the commandline in automation? I think the main criteria for buffering or will end up being the formatter and whether something is running in automation, not whether the log is writing to stdout.

@@ +183,5 @@
>  
> +    @classmethod
> +    def is_error_message(self, message):
> +        return 'expected' in message or (message['action'] == 'log' and message['message'].startswith('TEST-UNEXPECTED'))
> +

I think making these classmethods is buying you much.

@@ +229,5 @@
>  
> +    def dump_buffered(self):
> +        for handler in self.logger.handlers:
> +            if isinstance(handler, LogMochitestBuffering):
> +                handler.dump_buffered()

I don't think this should be based on an instance check. Could be 'hasattr', but maybe we could have a log action to make this happen (like we have a special action to turn the buffering on and off)?
Attachment #8476344 - Flags: feedback?(cmanchester) → feedback+
(Reporter)

Comment 4

4 years ago
(In reply to Chris Manchester [:chmanchester] from comment #3)
> Comment on attachment 8476344 [details] [diff] [review]
> 0001-Bug-1055765-Only-buffer-mochitest-structured-logs-fo.patch
> 
> Review of attachment 8476344 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> f+ because I think this should happen, but I don't think this is ready.
> 
> ::: testing/mochitest/runtests.py
> @@ +139,5 @@
> > +            self.buffered_messages.append(message)
> > +        else:
> > +            if self.buffered_messages:
> > +                dumped_messages = min(len(self.buffered_messages), LogMochitestBuffering.BUFFERING_THRESHOLD)
> > +                dumping_message = dict(message="dumping last {0} message(s)\nif you need more context, please use SimpleTest.requestCompleteLog() in your test".format(dumped_messages),
> 
> This is the only place we explicitly mention mochitest in this class. At
> least xpcshell also has a notion of buffering, and it's not totally
> incompatible with this one. Maybe there's an underlying buffer
> implementation that could live in mozlog.structured. It would need to be
> configurable by the buffer size and maybe give a callback to determine
> whether a message needs to be buffered or logged immediately.
> 

Yup, maybe we could put this in mozlog.structured; I was thinking of this as a temporary fix since others (like :mwargers) are already interested in exploiting the raw logs.

> @@ +170,5 @@
> > +        # Setting up buffering
> > +        if buffering:
> > +            for i, handler in enumerate(self.logger.handlers):
> > +                if isinstance(handler, StreamHandler) and handler.stream == sys.stdout:
> > +                    self.logger.handlers[i] = LogMochitestBuffering(handler)
> 
> I'm not a fan of this because it has an effect on all the handlers for this
> logger. Shouldn't buffering be turned off by default (or maybe default to on
> for some formatters) for everyone and set per-formatter by the commandline
> in automation? I think the main criteria for buffering or will end up being
> the formatter and whether something is running in automation, not whether
> the log is writing to stdout.
> 

Again, I only did this as a temporary fix (since users aren't usually worried about log buffering in files, especially with raw logs). Ultimately if we can choose if a formatter buffers the logs or not, that would give more control.

> @@ +183,5 @@
> >  
> > +    @classmethod
> > +    def is_error_message(self, message):
> > +        return 'expected' in message or (message['action'] == 'log' and message['message'].startswith('TEST-UNEXPECTED'))
> > +
> 
> I think making these classmethods is buying you much.
> 

I did this because it doesn't really require any state from the MessageLogger, and I didn't want to duplicate the is_error condition between MessageLogger and LogMochitestBuffering.

> @@ +229,5 @@
> >  
> > +    def dump_buffered(self):
> > +        for handler in self.logger.handlers:
> > +            if isinstance(handler, LogMochitestBuffering):
> > +                handler.dump_buffered()
> 
> I don't think this should be based on an instance check. Could be 'hasattr',
> but maybe we could have a log action to make this happen (like we have a
> special action to turn the buffering on and off)?

Well, actually suite-end should be perfect for this. I always wanted to dump the buffered messages on suite-end (which would always be emitted at the end of a run). But we currently have some exceptions that force us to first dump the buffer then send a suite-end later on.

Actually, speaking of a special action to turn buffering on/off, I didn't think about something: with this patch, these actions would be problematic if the handlers are not buffering (they will just pass them to the formatter which will crash because the action isn't recognized).
(Assignee)

Updated

4 years ago
Summary: Only buffer mochitest structured logs for stdout, not file handlers. → Only buffer mochitest structured logs when needed and not for raw json logs
(Assignee)

Comment 5

4 years ago
Created attachment 8484365 [details] [diff] [review]
Implement mochitest compatible log buffering in mozlog

The integration with mozlog's commandline is a little hacky, but this pretty much works. The idea is to be able to specify --log-tbpl-buffer=100 at the commandline and have that mean the tbpl formatter has a buffer of 100 messages but the raw json log never buffers. The logic for dumping and clearing the buffer is still in the harness, but that might be the right place for it.

Buffer size seems a bit of a weird thing to configure on, but exposing this from the commandline means we should be able to pass --log-tbpl-buffer=UNLIMITED to xpcshell and get the same benefits for that harness. I haven't taken the step of integrating this into the xpcshell harness, but it should work.
Attachment #8484365 - Flags: feedback?(ahalberstadt)
(Assignee)

Updated

4 years ago
Assignee: nobody → cmanchester
Status: NEW → ASSIGNED
(Assignee)

Comment 6

4 years ago
Just to see, here it is on try logging full raw log:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d546af35e9a8
Comment on attachment 8484365 [details] [diff] [review]
Implement mochitest compatible log buffering in mozlog

Review of attachment 8484365 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome, thanks for doing this! A few comments.. also updating the docs to mention this buffering feature and adding a test or two would be nice.

::: testing/mochitest/mochitest_options.py
@@ +396,5 @@
>             "default": False,
>             "dest": "quiet",
> +           "help": ("Do not print test log lines unless a failure occurs. [DEPRECATED -"
> +                    "Buffering for individual formatters may be set with"
> +                    "--log-<formatter>-buffer")

Let's just update the files in testing/config/mozharness as part of the same patch and remove --quiet.. we can leave it in the mach_commands.py and devs shouldn't ever notice the difference (though I'd be tempted to remove it from there as well).

::: testing/mozbase/mozlog/mozlog/structured/commandline.py
@@ +26,5 @@
>      # <option name>: (<wrapper function>, description, <applicable formatters>)
>      'level': (level_filter_wrapper,
>                "A least log level to subscribe to for the given formatter (debug, info, error, etc.)",
>                ["mach", "tbpl"]),
> +    'buffer': (None,

If the second option added already breaks the assumptions made for the first option, maybe we need to revise how this works.

::: testing/mozbase/mozlog/mozlog/structured/handlers/bufferhandler.py
@@ +13,5 @@
> +                             rather than log directly.
> +    :param error_callback: A function taking a json log message and returning
> +                           a boolean. If true, that message will cause the
> +                           buffer to be dumped.
> +    """

Does sphinx pick this up as belonging to the constructor? Or should it be under the __init__ method? (honest question, I'm not sure)

@@ +27,5 @@
> +
> +        if self.message_limit is not None:
> +            self._buffer = [None] * self.message_limit
> +            self._buffer_pos = 0
> +            self._start_pos = 0

Unused variable

::: testing/mozbase/mozlog/mozlog/structured/structuredlog.py
@@ +135,5 @@
>      @property
> +    def _buffered_handlers(self):
> +        return self._state._buffered_handlers
> +
> +    def _append_buffered_handler(self, handler):

If this is only called from outside this class, it isn't really a private member. Making this "private" also makes it harder for people to set up programmatically (i.e not via commandline.py). Please also add a sphinx style docstring.
Attachment #8484365 - Flags: feedback?(ahalberstadt) → feedback+
(Assignee)

Comment 8

4 years ago
Comment on attachment 8484365 [details] [diff] [review]
Implement mochitest compatible log buffering in mozlog

As ahal pointed out, it's unfortunate this already violates the assumptions of the "formatter option" scheme, but this can't really be implemented as a formatter because it needs to emit buffered messages. James, do have any feedback on the api or how to deal with this is a nicer way?
Attachment #8484365 - Flags: feedback+ → feedback?(james)
(Assignee)

Comment 9

4 years ago
Created attachment 8485199 [details] [diff] [review]
Implement mochitest compatible log buffering in mozlog

This is pretty much the same patch with docs and tests updated.
Attachment #8485199 - Flags: feedback?(james)
(Assignee)

Updated

4 years ago
Attachment #8484365 - Attachment is obsolete: true
Attachment #8484365 - Flags: feedback?(james)
(Assignee)

Comment 10

4 years ago
I was thinking about this again, and I guess in principle buffering isn't really a first class idea with structured logging, because the idea is generally to output everything and do any filtering after the fact. But this assumes saving the entire output is always possible, and assumes a ui integrated with the raw logs, so I think this or something similar will be useful as a way to get the entire output in the raw log while maintaining the existing (and perhaps reasonable) constraints of keeping noise down for human consumers and keeping log sizes down.

At any rate, getting test output unbuffered was an original motivation for structured logging, so I'd like to get something in place that does at least that. Maybe it's enough to never buffer raw json logs, and let harnesses manage their legacy output as they like.
So, sorry I haven't looked at this yet.

I think that in principle the idea of adding buffering either into specific formattters or, more generally, as a wrapper around specific formatters, makes sense in a structured logging context. The only difficulty is that some of the logic for deciding when to flush buffers may be harness specific. I'm not sure that's an insurmountable problem, or means that we need to push everything down into the harness code.
Comment on attachment 8485199 [details] [diff] [review]
Implement mochitest compatible log buffering in mozlog

Review of attachment 8485199 [details] [diff] [review]:
-----------------------------------------------------------------

So generally this seems OK, but I'm slightly concerned that the way it adds lots of api surface to all loggers that doesn't make sense in cases where buffering isn't being used is bad design.

::: testing/mochitest/runtests.py
@@ +129,5 @@
>                  continue
>              try:
>                  message = json.loads(fragment)
>                  if not self.valid_message(message):
> +                    message = dict(action='log', level='info', message=fragment, bypass_mozlog_buffer=True)

So what's the scenario here? Are these messages that are malformed because they're unstructured, and we don't want to buffer them because even though they are being logged at info level they actually could be anything?

@@ +144,4 @@
>              self.tests_started = True
>  
> +        if action == 'buffering_on':
> +            self.logger.enable_buffering()

Should {action:"buffering", state:"on"} become an actual logged thing, since it seems important to understanding the rest of the logs?

::: testing/mozbase/mozlog/mozlog/structured/commandline.py
@@ +28,5 @@
>                "A least log level to subscribe to for the given formatter (debug, info, error, etc.)",
>                ["mach", "tbpl"]),
> +    'buffer': (None,
> +               ("Turns on buffering for the given formatter with the given buffer size, or"
> +                "UNLIMITED to indicate no limit to the size of the buffer"),

I would probably design this so that unlimited is the default and --log-buffer-foo=X limits to X messages.

@@ +29,5 @@
>                ["mach", "tbpl"]),
> +    'buffer': (None,
> +               ("Turns on buffering for the given formatter with the given buffer size, or"
> +                "UNLIMITED to indicate no limit to the size of the buffer"),
> +               ["mach", "tbpl", "raw"]),

This probably shouldn't be allowed for raw logs unless we have a compelling use case.

::: testing/mozbase/mozlog/mozlog/structured/handlers/bufferhandler.py
@@ +20,5 @@
> +        self.inner = inner
> +        self.message_limit = message_limit
> +        if buffered_actions is None:
> +            buffered_actions = set(['log', 'test_status'])
> +        self.buffered_actions = buffered_actions

Probably want to force it to a set here?

@@ +32,5 @@
> +            self._buffer = []
> +
> +    def __call__(self, data):
> +        action = data['action']
> +        if 'bypass_mozlog_buffer' in data:

This seems unfortunate. I guess I will find the use case when I read more code.

@@ +62,5 @@
> +            self._buffer = []
> +
> +    def dump_buffered(self):
> +        """Logs the contents of the current buffer"""
> +        for msg in self._buffer[self._buffer_pos:] + self._buffer[:self._buffer_pos]:

Dunno how much you care about performance but there's no real need to allocate a new list here.

::: testing/mozbase/mozlog/mozlog/structured/structuredlog.py
@@ +132,5 @@
>                  del self._state.handlers[i]
>                  break
>  
>      @property
> +    def _buffered_handlers(self):

So I don't really like having all this API on the structured logger for this one use case. It would be nice if we could design a more generic system that could apply to all formatters. For example one can imagine having a message broadcast system which you might use like:

logger.formatters.message("buffers:flush")

and then the formatters that knew how to deal with that kind of message would respond to it and the others would ignore it. I guess there is a problem with this design because it doesn't compose too well. Specifically if you have a formatter composed like FooFilter(BarFilter(formatter)) and you want to send a message to the BarFilter you have to rely on the FooFilter doing the right thing to pass it on. Maybe that's not a big problem though.

It also has the problem that it doesn't allow any safety in the API e.g. if you write "buffer:dump" when you meant "buffer:flush" it won't complain at you. That might be solvable with some effort.
Attachment #8485199 - Flags: feedback?(james)
(Assignee)

Comment 13

4 years ago
Thanks for the feedback, I think some minimal facility for broadcasting directives to formatters/handlers will be a reasonable way forward. Might be ugly getting through to all the wrapped up handlers, but saves on pointless api surface.

> ::: testing/mochitest/runtests.py
> @@ +129,5 @@
> >                  continue
> >              try:
> >                  message = json.loads(fragment)
> >                  if not self.valid_message(message):
> > +                    message = dict(action='log', level='info', message=fragment, bypass_mozlog_buffer=True)
> 
> So what's the scenario here? Are these messages that are malformed because
> they're unstructured, and we don't want to buffer them because even though
> they are being logged at info level they actually could be anything?
> 

Right. I advocated never buffering this output because it never would have been buffered by the previous mode of buffering, and like you say it could be anything.
(Assignee)

Comment 14

4 years ago
Created attachment 8495326 [details] [diff] [review]
Implement log buffering as a handler in mozlog.

This is a work in progress, I still need to figure out a good way to log low level messages about when the buffer gets turned on/off without the messages themselves getting swallowed by the buffer. The compelling use case for buffering I've seen is the mozharness log from the harness process' stdout, which gets truncated at a certain point, so may need to institute this buffering in mozharness when we start streaming the raw log to stdout.

Here's a try run to check for anything else unexpected:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=cdbba2e91402
(Assignee)

Comment 15

4 years ago
Once we add "--log-raw=-" to mozharness, doing this per formatter doesn't sense because we need buffering there and not for the uploaded log. I guess we could do the buffering in mozharness, because that's where the restriction is imposed, but that would seem a lot like re-implementing the current harness logic there.

More generally this exposes the issue of how to configure formatter options per handler, which isn't supported at all from the commandline.
(Assignee)

Comment 16

4 years ago
Created attachment 8500580 [details] [diff] [review]
Implement log buffering as a handler in mozlog.

Here's the mozlog only part.
Attachment #8500580 - Flags: feedback?(james)
(Assignee)

Updated

4 years ago
Attachment #8495326 - Attachment is obsolete: true
(Assignee)

Updated

4 years ago
Blocks: 1066785
Comment on attachment 8500580 [details] [diff] [review]
Implement log buffering as a handler in mozlog.

Review of attachment 8500580 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mozbase/mozlog/mozlog/structured/handlers/bufferhandler.py
@@ +60,5 @@
> +
> +        error_message = ("Directives to the log buffer are of the form "
> +                         "buffer:<action>, <action> one of %s" % (actions,))
> +
> +        parts = message.split(":")

So it occurs to me that this whole thing will have to be repeated if we ever add something else that accepts messages.

What if we had a method like 

BufferedLogFilter.register_message_handler(logger)

which could return [("buffer", {"on": self.enable, "off":self.disable, [...]})].

This would be called from _add_handler in the structured logger and would mean that all the dispatch logic could be kept in one place. 

In the structured logger you would probably set it up like {"buffer": (handler_instance, {"on": handler.enable, [...]})} so that when a handler is removed the messages would also be removed.

Then maybe the send_message function would look like send_message(target, cmd, *args) and dispatch would be something like 

callback = self.message_handlers.get(target, (None, {})).get(cmd, None)
if callback is None:
    return False, None
return True, callback(*args)

So you could tell if the message had been handled.
Comment on attachment 8500580 [details] [diff] [review]
Implement log buffering as a handler in mozlog.

Review of attachment 8500580 [details] [diff] [review]:
-----------------------------------------------------------------

I think this general approach will work, but I think the suggested refactoring will make things simpler.

::: testing/mozbase/mozlog/mozlog/structured/handlers/bufferhandler.py
@@ +60,5 @@
> +
> +        error_message = ("Directives to the log buffer are of the form "
> +                         "buffer:<action>, <action> one of %s" % (actions,))
> +
> +        parts = message.split(":")

So it occurs to me that this whole thing will have to be repeated if we ever add something else that accepts messages.

What if we had a method like 

BufferedLogFilter.register_message_handler(logger)

which could return [("buffer", {"on": self.enable, "off":self.disable, [...]})].

This would be called from _add_handler in the structured logger and would mean that all the dispatch logic could be kept in one place. 

In the structured logger you would probably set it up like {"buffer": (handler_instance, {"on": handler.enable, [...]})} so that when a handler is removed the messages would also be removed.

Then maybe the send_message function would look like send_message(target, cmd, *args) and dispatch would be something like 

callback = self.message_handlers.get(target, (None, {})).get(cmd, None)
if callback is None:
    return False, None
return True, callback(*args)

So you could tell if the message had been handled.
Attachment #8500580 - Flags: feedback?(james) → feedback+
(Assignee)

Comment 19

4 years ago
I have an updated patch for this, but I caused an issue I find fairly inexplicable on the mobile plaftorms I need to investigate: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=463fb7fd7359
(Assignee)

Comment 20

4 years ago
Here's a better looking try run (I hadn't unbitrot my own bitrot): https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=7efbbe6ffad6

Still need to track down that issue in android_emulator_unittest.py and ensure failures here aren't due to this patch.
(Assignee)

Comment 21

4 years ago
Created attachment 8509044 [details] [diff] [review]
Implement log buffering as a handler in mozlog.
Attachment #8509044 - Flags: review?(james)
(Assignee)

Updated

4 years ago
Attachment #8500580 - Attachment is obsolete: true
(Assignee)

Comment 22

4 years ago
The issue with the mozharness script in comment 20 is probably pretty straightforward to track down, but this script doesn't run on Ash, so without bug 1086672 I'm pretty much at a loss.
Depends on: 1086672
(Assignee)

Comment 23

4 years ago
(In reply to Chris Manchester [:chmanchester] from comment #22)
> The issue with the mozharness script in comment 20 is probably pretty
> straightforward to track down, but this script doesn't run on Ash, so
> without bug 1086672 I'm pretty much at a loss.

I think I found the issue, but as Armen pointed out yesterday the other way to debug this is with is to request a loaner.
No longer depends on: 1086672
Comment on attachment 8509044 [details] [diff] [review]
Implement log buffering as a handler in mozlog.

Review of attachment 8509044 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry to push back on this a bit more, but I think the fact that message handling doesn't compose at all; you have to explicitly call add_callable for each part of the implementation rather than calling something (e.g. add_handler) on the top_level and having it add everything that's defined in the implementation is an API problem.

::: testing/mozbase/mozlog/mozlog/structured/handlers/bufferhandler.py
@@ +50,5 @@
> +            self._buffer_pos = (self._buffer_pos + 1) % self.message_limit
> +
> +    @property
> +    def message_handlers(self):
> +        """Buffer actions configurable by a user.

What about any handlers defined on inner?

::: testing/mozbase/mozlog/mozlog/structured/structuredlog.py
@@ +125,5 @@
>      def add_handler(self, handler):
>          """Add a handler to the current logger"""
>          self._state.handlers.append(handler)
>  
> +    def _remove_from_list(self, lst, item):

Isn't this just lst.remove(item)?

@@ +137,5 @@
> +        if hasattr(handler, "mozlog_identifier"):
> +            self._remove_from_list(self._state.callables[handler.mozlog_identifier], handler)
> +        self._remove_from_list(self.handlers, handler)
> +
> +    def add_callable(self, handler):

I'm not sure why this is needed compared to just add_handler with a check for a register_message_handler attribute on the handler object. I guess it allows formatters to handle messages too, although we could provide that by having the register_message_handler implementation in StreamHandler call down into the formatters and return a list of all the things that handle messages. That maintains the simple idea that the logger just knows about Handlers and that formatters are implementation details of handlers.

@@ +141,5 @@
> +    def add_callable(self, handler):
> +        """Register a callable (handler or formatter wrapper) with the logger to receive
> +        messages from users."""
> +        if hasattr(handler, "mozlog_identifier"):
> +            self._state.callables[handler.mozlog_identifier].append(handler)

Is there any reason that you are sometimes using self._callables and sometimes accessing the state object explicitly?

@@ +158,5 @@
> +        :param command: The command to issue.
> +        :param args: Any arguments known to the target for specialized behavior.
> +        """
> +        for handler in self._callables[target]:
> +            callback = handler.message_handlers.get(command)

So I think I preferred my suggested approach where the API surface is a single register_message_handler (or whatever) callback that provides a list of tuples (hander_name, {message_name: message_callback}) and everything gets stored in this class instead of being fetched every time. Is there a reason that you went with this approach?
Attachment #8509044 - Flags: review?(james) → review-
(Assignee)

Comment 25

4 years ago
(In reply to James Graham [:jgraham] from comment #24)
> Comment on attachment 8509044 [details] [diff] [review]
> Implement log buffering as a handler in mozlog.
>
> Review of attachment 8509044 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Sorry to push back on this a bit more, but I think the fact that message
> handling doesn't compose at all; you have to explicitly call add_callable
> for each part of the implementation rather than calling something (e.g.
> add_handler) on the top_level and having it add everything that's defined in
> the implementation is an API problem.
>
> ::: testing/mozbase/mozlog/mozlog/structured/handlers/bufferhandler.py
> @@ +50,5 @@
> > +            self._buffer_pos = (self._buffer_pos + 1) % self.message_limit
> > +
> > +    @property
> > +    def message_handlers(self):
> > +        """Buffer actions configurable by a user.
>
> What about any handlers defined on inner?
>
> ::: testing/mozbase/mozlog/mozlog/structured/structuredlog.py
> @@ +125,5 @@
> >      def add_handler(self, handler):
> >          """Add a handler to the current logger"""
> >          self._state.handlers.append(handler)
> >
> > +    def _remove_from_list(self, lst, item):
>
> Isn't this just lst.remove(item)?

Is it?

>
> @@ +137,5 @@
> > +        if hasattr(handler, "mozlog_identifier"):
> > +            self._remove_from_list(self._state.callables[handler.mozlog_identifier], handler)
> > +        self._remove_from_list(self.handlers, handler)
> > +
> > +    def add_callable(self, handler):
>
> I'm not sure why this is needed compared to just add_handler with a check
> for a register_message_handler attribute on the handler object. I guess it
> allows formatters to handle messages too, although we could provide that by
> having the register_message_handler implementation in StreamHandler call
> down into the formatters and return a list of all the things that handle
> messages. That maintains the simple idea that the logger just knows about
> Handlers and that formatters are implementation details of handlers.

I want it for formatters. Formatters aren't an implementation detail of handlers as far as our public api is concerned -- for any users of commandline.py the formatters are the most visible part of the api. We do a ton of important work in formatters, keeping them out of the implementation in structuredlog.py seems like a small victory.

>
> @@ +141,5 @@
> > +    def add_callable(self, handler):
> > +        """Register a callable (handler or formatter wrapper) with the logger to receive
> > +        messages from users."""
> > +        if hasattr(handler, "mozlog_identifier"):
> > +            self._state.callables[handler.mozlog_identifier].append(handler)
>
> Is there any reason that you are sometimes using self._callables and
> sometimes accessing the state object explicitly?

An oversight.

>
> @@ +158,5 @@
> > +        :param command: The command to issue.
> > +        :param args: Any arguments known to the target for specialized behavior.
> > +        """
> > +        for handler in self._callables[target]:
> > +            callback = handler.message_handlers.get(command)
>
> So I think I preferred my suggested approach where the API surface is a
> single register_message_handler (or whatever) callback that provides a list
> of tuples (hander_name, {message_name: message_callback}) and everything
> gets stored in this class instead of being fetched every time. Is there a
> reason that you went with this approach?

This part I actually have an opinion on and it’s informed the rest of the code down to the distinction between add_handler and add_callable. We agree the dispatch mechanism should be based on strings because it saves api surface, the takeaway from the review in comment 12. We need extra machinery to do dispatch based on strings because of this decision, but keeping a dictionary full of named callbacks manipulating state elsewhere seems like a code smell because we've invented something that looks and sounds a lot like an object reference.

It’s a pretty fine point by the token that we agree about the part of the api that most consumers will interact with. But I have given it some thought.
(Assignee)

Comment 26

4 years ago
Created attachment 8512196 [details] [diff] [review]
Implement log buffering as a handler in mozlog.

I'm all ears if there's another way around the import situation this creates.
Attachment #8512196 - Flags: review?(james)
(Assignee)

Updated

4 years ago
Attachment #8509044 - Attachment is obsolete: true
(Assignee)

Updated

4 years ago
Attachment #8512196 - Flags: review?(james)
(Assignee)

Comment 27

4 years ago
Created attachment 8520768 [details]
MozReview Request: bz://1055765/chmanchester
(Assignee)

Comment 28

4 years ago
/r/421 - Bug 1055765 - Implement log buffering as a handler in mozlog.;r=jgraham

Pull down this commit:

hg pull review -r 39623105b8e1fde195914bdafd11245376a688fe
(Assignee)

Updated

4 years ago
Attachment #8476344 - Attachment is obsolete: true
(Assignee)

Comment 29

4 years ago
/r/421 - Bug 1055765 - Implement log buffering as a handler in mozlog.;r=jgraham

Pull down this commit:

hg pull review -r 39623105b8e1fde195914bdafd11245376a688fe
(Assignee)

Updated

4 years ago
Attachment #8520768 - Flags: review?(james)
(Assignee)

Comment 30

4 years ago
/r/421 - Bug 1055765 - Implement log buffering as a handler in mozlog.;r=jgraham

Pull down this commit:

hg pull review -r 39623105b8e1fde195914bdafd11245376a688fe
(Assignee)

Comment 31

4 years ago
/r/421 - Bug 1055765 - Implement log buffering as a handler in mozlog.;r=jgraham

Pull down this commit:

hg pull review -r 39623105b8e1fde195914bdafd11245376a688fe
https://reviewboard.mozilla.org/r/419/#review191

::: testing/mozbase/mozlog/mozlog/structured/handlers/base.py
(Diff revision 1)
> +    mozlog_topic = None

Is there a strong reason to make this a class-level variable rather than doing something like

```
def register_handlers(self, topic, handlers):
    self.handlers[topic] = handlers

def handle_message(self, topic, cmd, *args):
   rv = []
   if topic in self.handlers and cmd in self.handlers[topic]:
       rv.append(self.handlers[topic][cmd](*args))
   for attrname in self.wraps:
       inner = getattr(self, attrname)
       if hasattr(inner, "handle_message"):
           rv.extend(handler.handle_message(topic, cmd, *args))
   return rv
```

That somewhat reduces the number of magic attributes that form part of the interface (basically just "register_handler", "wraps" and the "handle_message" function; we could also remove "wraps" with a little effort).

Feel free to diagree if you feel the current design is better.

::: testing/mozbase/mozlog/mozlog/structured/handlers/base.py
(Diff revision 1)
> +    def add_filter(self, filter_func):

I'm not sure this filters stuff is actually used anywhere? (I know it was already there).

::: testing/mozbase/mozlog/mozlog/structured/handlers/base.py
(Diff revision 1)
> +class LogLevelFilter(object):

Seems like this should inherit from BaseHandler.
(Assignee)

Comment 33

4 years ago
https://reviewboard.mozilla.org/r/419/#review201

> Is there a strong reason to make this a class-level variable rather than doing something like
> 
> ```
> def register_handlers(self, topic, handlers):
>     self.handlers[topic] = handlers
> 
> def handle_message(self, topic, cmd, *args):
>    rv = []
>    if topic in self.handlers and cmd in self.handlers[topic]:
>        rv.append(self.handlers[topic][cmd](*args))
>    for attrname in self.wraps:
>        inner = getattr(self, attrname)
>        if hasattr(inner, "handle_message"):
>            rv.extend(handler.handle_message(topic, cmd, *args))
>    return rv
> ```
> 
> That somewhat reduces the number of magic attributes that form part of the interface (basically just "register_handler", "wraps" and the "handle_message" function; we could also remove "wraps" with a little effort).
> 
> Feel free to diagree if you feel the current design is better.

In this scenario what has a view of each topic and its associated handlers and how does it come to that? This implementation takes the stance that there's no centralized notion of topic -> handlers, messages go to every handler and propagate through the nesting structure. I like this because it doesn't clutter the idea of handlers and keeps the implementation pretty small. The previous version had this mapping in structuredlog.py in "callables", but that exposed the nesting implementation detail to callers interested in doing things like buffering. It seems like this proposes something of a middle ground, but I don't see exactly how it would work.

> I'm not sure this filters stuff is actually used anywhere? (I know it was already there).

I doubt anything does, I think the nested handler scheme subsumes it nicely.
https://reviewboard.mozilla.org/r/419/#review207

> In this scenario what has a view of each topic and its associated handlers and how does it come to that? This implementation takes the stance that there's no centralized notion of topic -> handlers, messages go to every handler and propagate through the nesting structure. I like this because it doesn't clutter the idea of handlers and keeps the implementation pretty small. The previous version had this mapping in structuredlog.py in "callables", but that exposed the nesting implementation detail to callers interested in doing things like buffering. It seems like this proposes something of a middle ground, but I don't see exactly how it would work.

So the idea is that in this scheme each class instance has a nested dictionary of topic => command => hander. For classes that inherit the base class implementation that data structure can be purely defined in the base class so the api surface for subclasses is just a register method rather than attributes with special names. There isn't any change to the external API. I tend to think that interfaces purely based on methods are easier to understand than interfaces that require setting particular base class attributes to specific values (unless you are going fully declarative like SQLAlchemy or something), so I have a mild preference to remove the magic attributes here. As a side effect it also makes things a bit more flexible.

> I doubt anything does, I think the nested handler scheme subsumes it nicely.

Do you want to remove it here or shall we file a bug for a followup?
(Assignee)

Comment 35

4 years ago
https://reviewboard.mozilla.org/r/419/#review209

> Do you want to remove it here or shall we file a bug for a followup?

I'll take care of it here.

> So the idea is that in this scheme each class instance has a nested dictionary of topic => command => hander. For classes that inherit the base class implementation that data structure can be purely defined in the base class so the api surface for subclasses is just a register method rather than attributes with special names. There isn't any change to the external API. I tend to think that interfaces purely based on methods are easier to understand than interfaces that require setting particular base class attributes to specific values (unless you are going fully declarative like SQLAlchemy or something), so I have a mild preference to remove the magic attributes here. As a side effect it also makes things a bit more flexible.

Oh I see. That does seem better.
(Assignee)

Comment 36

4 years ago
/r/421 - Bug 1055765 - Implement log buffering as a handler in mozlog.;r=jgraham

Pull down this commit:

hg pull review -r 9966ebda37c9211d3f7a44610ba7db2cfc2871b9
https://reviewboard.mozilla.org/r/419/#review379

::: testing/mozbase/docs/mozlog_structured.rst
(Diff revisions 1 - 2)
> -.. autoclass:: LogLevelFilter
> +.. autoclass:: BufferHandler

What happened to LogLevelFilter?
(Assignee)

Comment 38

4 years ago
https://reviewboard.mozilla.org/r/419/#review381

> What happened to LogLevelFilter?

Ah, my mistake. Will restore.
(Assignee)

Comment 39

4 years ago
/r/421 - Bug 1055765 - Implement log buffering as a handler in mozlog.;r=jgraham

Pull down this commit:

hg pull review -r 5f2bcc9e9757c27bd754e4b29e48c5782cd4ccfd
Attachment #8520768 - Flags: review?(james) → review+
(Assignee)

Comment 43

3 years ago
Comment on attachment 8520768 [details]
MozReview Request: bz://1055765/chmanchester
Attachment #8520768 - Attachment is obsolete: true
Attachment #8618280 - Flags: review+
(Assignee)

Comment 44

3 years ago
Created attachment 8618280 [details]
MozReview Request: Bug 1055765 - Implement log buffering as a handler in mozlog.;r=jgraham
You need to log in before you can comment on or make changes to this bug.