Closed Bug 1349452 Opened 7 years ago Closed 7 years ago

Improve type checking code in parse_events.py

Categories

(Toolkit :: Telemetry, enhancement, P4)

enhancement

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: gfritzsche, Assigned: mayankmadan, Mentored)

Details

(Whiteboard: [measurement:client] [lang=python])

Attachments

(1 file, 5 obsolete files)

Mauro commented in a PR [1] how we could improve the `TypeChecker` code in parse_events.py [2].
This should also make it easier to share this code with parse_scalars.py and histogram_tools.py in a follow-up bug.

1: https://github.com/mozilla/probe-scraper/pull/2
2: https://dxr.mozilla.org/mozilla-central/rev/05bfa2831c0ba4a26fa72328ffe6a99aba9c356a/toolkit/components/telemetry/parse_events.py#40
For easier access, here is the proposed design:

> The implementation of this class seems to be a bit odd. An alternative could be to have a set of TypeChecker classes which know how to run their own validation. Something like
> 
> class AtomicTypeChecker:
>     """Validate a simple value against a given type"""
>     def __init__(self, instance_type):
>         self.instance_type = instance_type
>    
>     def check(self, identifier, key, value):
>         pass
> 
> 
> class ListTypeChecker:
>     """Validate a list of values against a given type"""
>     def __init__(self, instance_type):
>         self.instance_type = instance_type
>    
>     def check(self, identifier, key, value):
>         pass
> 
> 
> class DictTypeChecker:
>     """Validate the keys and the values of a dict against a given type"""
>     def __init__(self, keys_instance_type, values_instance_type):
>         self.keys_instance_type  = keys_instance_type
>         self.values_instance_type = values_instance_type
>    
>     def check(self, identifier, key, value):
>         pass
> 
> 
> class MultiTypeChecker:
>     """Validate a simple value against a list of possible types"""
>     def __init__(self, *instance_types):
>         if not instance_type:
>             raise ValueError("At least one instance type is required")
>         self.instance_types = instance_types
>         self.values_instance_type = values_instance_type
>    
>     def check(self, identifier, key, value):
>         pass
> 
> This way you could still call .check on the instances of these classes, ignoring what's the actual TypeChecker class that you are using. Also, this design lets you define even more specific checkers (or validators) e.g.
> 
> class StringTypeChecker(AtomicTypeChecker):
>     def __init__(min_length, max_length):
>         super(StringTypeChecker, self).__init__(basestring)
>         self.min_length = min_length
>         self.max_length = max_length
> 
>     def check(self, identifier, key, value):
>         assert isinstance(value, basestring) and self.min_length <= len(value) <= self.max_length
Hello, I'd like to work on this issue.
Hi Mayank,

let me know if anything is unclear from comment 0 and comment 1.

In order to make sure your code works, you should try the following commands:

> ./mach lint -l flake8 toolkit/components/telemetry
> ./mach test toolkit/components/telemetry

If they both work, you could test a bit more by changing the Events.yaml file to manually triggers the errors: http://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Events.yaml
Assignee: nobody → maddiemadan
Attached patch Patch (obsolete) — Splinter Review
So, I split up the TypeChecker class into different classes for each type of data. And changed the references to TypeChecker class to the appropriate new ones. But I dont see how the tests were utilizing the TypeChecker class or will utilize the new ones
Flags: needinfo?(alessio.placitelli)
(In reply to Mayank Madan [:mayankmadan] from comment #4)
> Created attachment 8851638 [details] [diff] [review]
> Patch
> 
> So, I split up the TypeChecker class into different classes for each type of
> data. And changed the references to TypeChecker class to the appropriate new
> ones. But I dont see how the tests were utilizing the TypeChecker class or
> will utilize the new ones

The tests are indirectly calling parse_events.py to generate C++ code that gets linked when building Firefox.
If, for accident, you break something in parse_events.py and the events definition fail to generate, you would see an error when building Firefox (not in Artifact mode!) and you would see the events related test fail [1] when running the test command.

Moreover, problems with the Python code style are discovered by running the following command:

> ./mach lint -l flake8 toolkit/components/telemetry

This checks if the code you wrote/changed still conforms to the linting rules we enabled for Python in our repo.

However, we still don't have an automated way to test if parse_events.py correctly fails with malformed Events.yaml file. And that's why I suggested manual testing for that in comment 3. You could, for example, change [2] to an integer and force the build to fail. It should fail the same way with and without your patches.

[1] - http://searchfox.org/mozilla-central/source/toolkit/components/telemetry/tests/unit/test_TelemetryEvents.js
[2] - http://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/toolkit/components/telemetry/Events.yaml#22
Flags: needinfo?(alessio.placitelli)
There were some problems in the last one that i fixed with this one and also checked if it works by intentionally generating the ValueError by changing the type of values in Events.yaml and failing the build (initiated by ./mach build)
Attachment #8851638 - Attachment is obsolete: true
Attachment #8852352 - Flags: review?(alessio.placitelli)
Comment on attachment 8852352 [details] [diff] [review]
Fixed some mistakes that were in the previous attachment

Review of attachment 8852352 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay, this looks like it's almost ready. Just a few general observations and some other feedback below.

Please, even though it's not strictly enforced right now, keep the line length of the code you're adding below 99 characters. This way we won't have to do the same work twice as this will be enforced in bug 1344834.

::: toolkit/components/telemetry/parse_events.py
@@ +36,5 @@
>      """
>      pass
>  
>  
> +class MultiTypeChecker:

nit: please add docstring to this class also mentioning that this expects up to two types.

@@ +39,5 @@
>  
> +class MultiTypeChecker:
> +    def __init__(self, *instance_types):
> +        if not instance_types:
> +            raise ValueError("At least one instance type is required")

If we're expecting at most two types, also make sure that it's what we're getting.

@@ +45,4 @@
>  
>      def check(self, identifier, key, value):
> +        if not isinstance(value, self.instance_types[0]) and not isinstance(value, self.instance_types[1]):
> +            raise ValueError("%s: failed type check for %s - expected %s or %s, got %s" % (identifier, key, self.instance_types[0], self.instance_types[1], type(value)))

Is there any reason why you dropped the use of |nice_type_name| here and below?

@@ +57,5 @@
> +            raise ValueError("%s: failed check for %s - list should not be empty" % (identifier, key))
> +
> +        for x in value:
> +            if not isinstance(x, self.instance_type):
> +                raise ValueError("%s: failed type check for %s - expected list value type %s, got %s" % (identifier, key, self.instance_type, type(x)))

Please use |nice_type_name|

@@ +70,5 @@
> +        if len(value.keys()) < 1:
> +            raise ValueError("%s: failed check for %s - dict should not be empty" % (identifier, key))
> +        for x in value.iterkeys():
> +            if not isinstance(x, self.keys_instance_type):
> +                raise ValueError("%s: failed dict type check for %s - expected key type %s, got %s" % (identifier, key, self.keys_instance_type, type(x)))

Use nice_type_name

@@ +73,5 @@
> +            if not isinstance(x, self.keys_instance_type):
> +                raise ValueError("%s: failed dict type check for %s - expected key type %s, got %s" % (identifier, key, self.keys_instance_type, type(x)))
> +        for k, v in value.iteritems():
> +            if not isinstance(v, self.values_instance_type):
> +                raise ValueError("%s: failed dict type check for %s - expected value type %s for key %s, got %s" % (identifier, key, self.values_instance_type, k, type(v)))

Please use nice_type_name

@@ +76,5 @@
> +            if not isinstance(v, self.values_instance_type):
> +                raise ValueError("%s: failed dict type check for %s - expected value type %s for key %s, got %s" % (identifier, key, self.values_instance_type, k, type(v)))
> +
> +
> +class AtomicTypeChecker:

Let's move this class before MultiTypeChecker and add a docstring for it too.

@@ +82,5 @@
> +        self.instance_type = instance_type
> +
> +    def check(self, identifier, key, value):
> +        if not isinstance(value, self.instance_type):
> +            raise ValueError("%s: failed type check for %s - expected %s, got %s" % (identifier, key, self.instance_type, type(value)))

Use nice_type_name
Attachment #8852352 - Flags: review?(alessio.placitelli) → feedback+
I did make the required changes and also added docstring for the new classes
Attachment #8852352 - Attachment is obsolete: true
Attachment #8853047 - Flags: review?(alessio.placitelli)
Comment on attachment 8853047 [details] [diff] [review]
Improve type checking code in parse_events.py

Review of attachment 8853047 [details] [diff] [review]:
-----------------------------------------------------------------

This looks better overall: please fix the style of the error messages, restore them as they were before. Just don't let the lines be longer than 99 characters, but still use all the available space.

Also, MultiTypeChecker should allow for more than 2 types, even if we don't use more than 2 right now. Sorry for not catching this earlier.

::: toolkit/components/telemetry/parse_events.py
@@ +44,5 @@
>  
>      def check(self, identifier, key, value):
> +        if not isinstance(value, self.instance_type):
> +            raise ValueError(
> +                            "%s: failed type check for %s - expected %s, got %s" %

Let's leave the text string on the previous line, there's no need to break it there. We can take up to 99 characters per line. So this would become:

>            raise ValueError("%s: failed type check for %s - expected %s, got %s" %
>                             (identifier, key,
>                              nice_type_name(self.instance_type),
>                              nice_type_name(type(value))))

Try to keep the same style that was used before.

Please do the same for all the other messages that you changed as well.

@@ +53,3 @@
>  
> +class MultiTypeChecker:
> +    """Validate a value against one of two types

Let's use the docstring from comment 1 here.

@@ +60,5 @@
> +            raise ValueError("At least one instance type is required")
> +        self.instance_types = instance_types
> +
> +    def check(self, identifier, key, value):
> +        if not isinstance(value, self.instance_types[0]) and not\

Sorry for not catching this earlier, but let's not assume we only receive 2 types here.

Let's assume the list of instance_types is arbitrarily long.
Attachment #8853047 - Flags: review?(alessio.placitelli) → feedback+
I did change the MultiTypeChecker class so that it is valid for more than just two types and also changed the messages so that there arent any lines with more than 99 characters
Attachment #8853047 - Attachment is obsolete: true
Comment on attachment 8853508 [details] [diff] [review]
Improve type checking code in parse_events.py

Please don't forget to flag me for review next time you upload a patch, otherwise it might get lost in the bugmail noise :)
Attachment #8853508 - Flags: review?(alessio.placitelli)
Status: NEW → ASSIGNED
Comment on attachment 8853508 [details] [diff] [review]
Improve type checking code in parse_events.py

Thanks Georg!
Attachment #8853508 - Flags: review?(alessio.placitelli) → review?(gfritzsche)
Comment on attachment 8853508 [details] [diff] [review]
Improve type checking code in parse_events.py

Review of attachment 8853508 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks! This looks close, but still needs a few smaller changes.

::: toolkit/components/telemetry/parse_events.py
@@ +61,5 @@
> +        for i in self.instance_types:
> +            if isinstance(value, i):
> +                assertion = True
> +
> +        if not assertion:

I think we can do this in a more pythonic way instead of the loop here:
if not any(isinstance(value, i) for i in self.instance_types):
  ...

@@ +64,5 @@
> +
> +        if not assertion:
> +            raise ValueError("%s: failed type check for %s - expected: %s, got %s" %
> +                             (identifier, key,
> +                              " or ".join(map(nice_type_name, self.instance_types)),

As this now supports >2 types, i think this would be more readable as:
"... - got %s, expected one of: %s"

@@ +81,5 @@
> +
> +        for x in value:
> +            if not isinstance(x, self.instance_type):
> +                raise ValueError("%s: failed type check for %s - expected list value type %s, got\
> +                                 %s" % (identifier, key, nice_type_name(self.instance_type),

Breaking the line this way inserts a lot of whitespace before the last "%s".
You can test the result by intentionally introducing errors to Events.yaml and building.

It's better to use something like:
"%s: ...,"
"%s" % (...

@@ +98,5 @@
> +                             (identifier, key))
> +        for x in value.iterkeys():
> +            if not isinstance(x, self.keys_instance_type):
> +                raise ValueError("%s: failed dict type check for %s - expected key type %s, got \
> +                                 %s" %

Same here for the string line breaking.

@@ +105,5 @@
> +                                  nice_type_name(type(x))))
> +        for k, v in value.iteritems():
> +            if not isinstance(v, self.values_instance_type):
> +                raise ValueError("%s: failed dict type check for %s- \
> +                                 expected value type %s for key %s, got %s" %

Same here for the string line breaking.
Also there is a missing space for "%s-".
Attachment #8853508 - Flags: review?(gfritzsche) → feedback+
Attachment #8853508 - Attachment is obsolete: true
Attachment #8855905 - Flags: review?(gfritzsche)
Attachment #8855905 - Flags: review?(gfritzsche) → review?(alessio.placitelli)
Comment on attachment 8855905 [details] [diff] [review]
Improve type checking code in parse_events.py

Review of attachment 8855905 [details] [diff] [review]:
-----------------------------------------------------------------

This looks like it's nearing completion, but there are a couple of more things to do.

Please add the bug number to the commit message, the full message should be "Bug 1349452 - Improve type checking code in parse_events.py. r?dexter"

Please run the following command before submitting the next patch, is it easily allows you to spot any trivial mistake in the code:

> ./mach lint -l flake8 toolkit/components/telemetry/

::: toolkit/components/telemetry/parse_events.py
@@ +57,5 @@
> +        self.instance_types = instance_types
> +
> +    def check(self, identifier, key, value):
> +
> +        if not any(isinstance(value,i) for i in self.instance_types):

nit: please add a whitespace in |isinstance| after "value,"

@@ +60,5 @@
> +
> +        if not any(isinstance(value,i) for i in self.instance_types):
> +            raise ValueError("%s: failed type check for %s - got %s, expected one of: %s," %
> +                             (identifier, key,
> +                              nice_type_name(type(value)),  

Please remove the trailing whitespaces.

@@ +78,5 @@
> +        for x in value:
> +            if not isinstance(x, self.instance_type):
> +                raise ValueError("%s: failed type check for %s - expected list value type %s, got"
> +                                 " %s" % (identifier, key, nice_type_name(self.instance_type),
> +                                 nice_type_name(type(x))))

Please align this line with "identifier" on the previous line, as flake8 complains about that.
Attachment #8855905 - Flags: review?(alessio.placitelli) → feedback+
Attached patch patch.patchSplinter Review
Sorry about the previous upload, i just tried to rush through it. This one should be okay though.
Attachment #8855905 - Attachment is obsolete: true
Attachment #8856954 - Flags: review?(alessio.placitelli)
Comment on attachment 8856954 [details] [diff] [review]
patch.patch

Review of attachment 8856954 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good now, thanks! I'll push it to try for you.
Attachment #8856954 - Flags: review?(alessio.placitelli) → review+
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/35a62ecf4475
Improve type checking code in parse_events.py. r=dexter
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/35a62ecf4475
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: