Closed Bug 1171105 Opened 9 years ago Closed 9 years ago

Ability to aggregate Files metadata

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(firefox41 fixed)

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: gps, Assigned: gps)

References

Details

Attachments

(1 file)

A collection of Files is nice. What's even nicer is if we can aggregate multiple instances together to obtain a composite result. e.g. given metadata on N files, recommend a Bugzilla component to file a bug under. (This exact use case is needed for MozReview to auto file bugs.)
Bug 1171105 - Ability to aggregate metadata from Files instances

We want this logic to live next to where metadata types are defined so
downstream consumers of Files-based metadata don't have to reinvent the
wheel.

The work in this commit will be used to enable auto filing bugs during
pushes to MozReview.
Attachment #8614798 - Flags: review?(mh+mozilla)
Comment on attachment 8614798 [details]
MozReview Request: Bug 1171105 - Ability to aggregate metadata from Files instances

https://reviewboard.mozilla.org/r/10059/#review8913

::: python/mozbuild/mozbuild/frontend/context.py:592
(Diff revision 1)
> +        for p, f in sorted(files.items()):

you shouldn't need to sort here. Or are you doing it to stabilize most_common if there are several with the same count? If so please add a comment.

::: python/mozbuild/mozbuild/frontend/context.py:600
(Diff revision 1)
> +                                              count))

is the nested tuple really necessary?

::: python/mozbuild/mozbuild/frontend/context.py:603
(Diff revision 1)
> +            d['recommended_bug_component'] = d['bug_component_counts'][0][0]

Don't emit a recommendation when there are multiple components with the same count at the top of the list.

::: python/mozbuild/mozbuild/test/frontend/test_reader.py:391
(Diff revision 1)
> +        })

Maybe add tests for the weird cases mentioned above.
Attachment #8614798 - Flags: review?(mh+mozilla)
https://reviewboard.mozilla.org/r/10059/#review9025

> is the nested tuple really necessary?

I'm assuming you mean as opposed to a 3-tuple? The 2-tuple does make it easier to convert to a map. And, the bug component logically is a tuple. It's kinda ugly, but I think it is proper.
Comment on attachment 8614798 [details]
MozReview Request: Bug 1171105 - Ability to aggregate metadata from Files instances

Bug 1171105 - Ability to aggregate metadata from Files instances

We want this logic to live next to where metadata types are defined so
downstream consumers of Files-based metadata don't have to reinvent the
wheel.

The work in this commit will be used to enable auto filing bugs during
pushes to MozReview.
Attachment #8614798 - Flags: review?(mh+mozilla)
Comment on attachment 8614798 [details]
MozReview Request: Bug 1171105 - Ability to aggregate metadata from Files instances

https://reviewboard.mozilla.org/r/10059/#review9259

::: python/mozbuild/mozbuild/frontend/context.py:607
(Diff revision 2)
> +            # Don't recommend a componnent if it doesn't have a clear lead.

s/componnent/component/

::: python/mozbuild/mozbuild/frontend/context.py:613
(Diff revision 2)
> +

It seems to me this would be easier to grasp if you did something like this (untested):

for c, count in bug_components.most_common():
    component = (c.product, c.component)
    d['bug_component_counts'].append((c, count))
    if 'recommended_bug_component' not in d:
        d['recommended_bug_component'] = c
        recommended_count = count
    elif recommended_count == count:
        # Don't recommend a component if it doesn't have a clear lead
        d['recommended_bug_component'] = None

::: python/mozbuild/mozbuild/frontend/context.py:595
(Diff revision 2)
> +        for p, f in files.items():

If you're not using p, does aggregate actually need to be passed a map? Why not a simple list? Maybe some future improvements for the algorithm? If so please add a comment, and use .values instead of .items. Otherwise, I'd say passing a list would be a simpler API.

::: python/mozbuild/mozbuild/test/frontend/test_context.py:628
(Diff revision 2)
> +        })

Reading the test, I realize this may not be aggregating the right things.
Specifically, imagine the following typical case:
you have changes to moz.build adding a source file, and multiple changes to sources and headers in the same directory. As far as Files are concerned, you have one matching moz.build giving Core::Build Config, and another matching the sources and heads, giving, say, Core::Gfx.
So, you have two Files instances, and there will be no clear winner, while, if you consider the number of actual files changed, Core::Gfx clearly wins.
But maybe you're planning to use this API by sending it Files instances for each and every file, in which case the concern above is addressed. But then, the intent would be slightly clearer if your test_multiple_bug_components was passing f1 twice instead of having a separate instance for the same component. This should probably be clarified in the doc string for aggregate.
Attachment #8614798 - Flags: review?(mh+mozilla)
https://reviewboard.mozilla.org/r/10059/#review9411

> If you're not using p, does aggregate actually need to be passed a map? Why not a simple list? Maybe some future improvements for the algorithm? If so please add a comment, and use .values instead of .items. Otherwise, I'd say passing a list would be a simpler API.

Yeah, I think we'll need paths in the future. I have a few code review scenarios in mind. e.g. a patch touches .cpp and a Makefile.in file. We'd ideally want MozReview to flag a build peer for just the Makefile.in bit.

> It seems to me this would be easier to grasp if you did something like this (untested):
> 
> for c, count in bug_components.most_common():
>     component = (c.product, c.component)
>     d['bug_component_counts'].append((c, count))
>     if 'recommended_bug_component' not in d:
>         d['recommended_bug_component'] = c
>         recommended_count = count
>     elif recommended_count == count:
>         # Don't recommend a component if it doesn't have a clear lead
>         d['recommended_bug_component'] = None

This is close to correct. You'll see a correct version with the next review.

> Reading the test, I realize this may not be aggregating the right things.
> Specifically, imagine the following typical case:
> you have changes to moz.build adding a source file, and multiple changes to sources and headers in the same directory. As far as Files are concerned, you have one matching moz.build giving Core::Build Config, and another matching the sources and heads, giving, say, Core::Gfx.
> So, you have two Files instances, and there will be no clear winner, while, if you consider the number of actual files changed, Core::Gfx clearly wins.
> But maybe you're planning to use this API by sending it Files instances for each and every file, in which case the concern above is addressed. But then, the intent would be slightly clearer if your test_multiple_bug_components was passing f1 twice instead of having a separate instance for the same component. This should probably be clarified in the doc string for aggregate.

The primary use case I have for this right now is to take commits and do things like auto file bugs and auto assign reviewers. Essentially:

1) Extract list of files modified by a commit
2) Call reader.files_info() -> {path: File}
3) Call aggregate({path: File}) to aggregate the results from multiple input paths

So, the File instances being passed to aggregate in my mind will be the "final" File instances produced by reader.files_info(), not the raw instances from a single moz.build file. I'll update the docs.
Comment on attachment 8614798 [details]
MozReview Request: Bug 1171105 - Ability to aggregate metadata from Files instances

Bug 1171105 - Ability to aggregate metadata from Files instances

We want this logic to live next to where metadata types are defined so
downstream consumers of Files-based metadata don't have to reinvent the
wheel.

The work in this commit will be used to enable auto filing bugs during
pushes to MozReview.
Attachment #8614798 - Flags: review?(mh+mozilla)
Comment on attachment 8614798 [details]
MozReview Request: Bug 1171105 - Ability to aggregate metadata from Files instances

https://reviewboard.mozilla.org/r/10059/#review9565

Ship It!
Attachment #8614798 - Flags: review?(mh+mozilla) → review+
https://hg.mozilla.org/mozilla-central/rev/220bc92911b8
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: