Open Bug 1319943 Opened 8 years ago Updated 2 years ago

Add a means for determining the owner of code

Categories

(Testing :: General, defect, P3)

defect

Tracking

(Not tracked)

People

(Reporter: jgraham, Unassigned)

References

Details

Attachments

(1 file)

We should have a system for identifying people who own specific pieces of code, using in-tree information. This is related to, but not the same as, the module owner system.

Use cases for such a system include:

* Identifying people who can investigate intermittent failures
* Autosuggesting/assigning reviewers based on a list of files changed
* suggesting good candidates to cc on particular bugs

We have some slightly complex requirements:

* People claiming ownership of a directory may not own everything in the subdirectory
* Some subdirectories are vendored code that cannot be directly modified and may or may not be synced with upstream.
So my initial strawman proposal is to use OWNERS.yml files. These would have three meaningful top level keys:

"include": A list of people (mozreview handles) owning files in the current directory
"exclude": A list of people owning files in parent directories but not in the current directories
"paths": A list of file/directory names with special rules.

Each entry under "paths" would be the name of a file or subdirectory of the current directory. For files the could have the same include/exclude rules as the directory, but specific to that file. subdirectories would be permitted to support the case of vendored paths. These would have the include/exclude keys, plus a key "vendor" indicating that the subdirectory is vendored and so must not be further examined for OWNERS.yml files. In addition it would support "paths", which in this case only could be any relative path to a directory or subdirectory under the vendored path. For example under testing/web-platform/ one might have something like:

 include:
   - jgraham
   - Ms2ger
 exclude:
   - jgriffin
 paths:
   tests:
     vendor: true
     paths:
       dom:
         include:
           - annevk
         exclude:
           - jgraham
       dom/events:
         include:
           - smaug
   mach_commands.py:
     exclude:
       - Ms2ger

Which would mean, for example, that Ms2ger was an owner of everything except the mach_commands.py file, jgraham was an owner of everything except tests/dom and smaug was an owner only of files under tests/dom.events
Attached a totally-not-production-ready patch just to demonstrate the idea. It adds a |mach owners <paths>| command. I have to say it seemed pretty neat to me, even though I populated it with more or less made up data. I can't say I love the yaml though.
The build peers and MozReview teams have discussed this several times in the past few years. Bug 1137042 even contains some code attempting to start the process.

I can likely also dig up mailing list posts about this.

Many people have started the process. But we always get bogged down in details. If you want this to happen, I suggest cornering glandium, glob, gps, and preferably other build peers in Hawaii.

Also, this problem is related to code-based notifications/subscriptions. People want e.g. emails when files they care about change or they want to be auto CCd on a review touching a file they care about. This of course means self-service, which makes in-tree definitions more difficult because of their higher barrier to change. And, you probably don't want things like email subscription data in version control!
Comment #2 reimplements the Files primitive in moz.build files, just using YAML.
FWIW my opinion on keeping this data in tree is "if it's good enough for blink it's probably good enough for us". If we want external tools to access it I think we should build ownership-as-a-service i.e. a http server that can return the owners for a list of paths based on a recent copy of the tree.

I strongly believe that keeping the information in-tree makes it more likely to be up-to-date than if it's in some other place.

I am also dubious that putting it in moz.build files is a good idea. Although they're technically probably fine, it's not really information about the build, and people are more likely to worry about changing them (and so breaking the build) than changing something that is more explictly single-purpose (I still don't love the yaml, but I so like it more than that alternative).
(In reply to James Graham [:jgraham] from comment #6)
> f we want external tools to access
> it I think we should build ownership-as-a-service i.e. a http server that
> can return the owners for a list of paths based on a recent copy of the tree.

We already have that: https://mozilla-version-control-tools.readthedocs.io/en/latest/hgmo/mozbuildinfo.html

> I strongly believe that keeping the information in-tree makes it more likely
> to be up-to-date than if it's in some other place.

I agree. That's why I'm not a fan of the canonical modules owners list being on the wiki.

> I am also dubious that putting it in moz.build files is a good idea.
> Although they're technically probably fine, it's not really information
> about the build, and people are more likely to worry about changing them
> (and so breaking the build) than changing something that is more explictly
> single-purpose (I still don't love the yaml, but I so like it more than that
> alternative).

There is stuff in moz.build that isn't part of the build. Like defining which bug components files are associated with. It just so happens that we invented moz.build files for the build system (hence "build" in their name). Then we shoehorned other stuff into moz.build because it was convenient.

I'm not opposed to creating a separate set of files to track files metadata. But, we already have the "Files" primitive in moz.build that was designed specifically for the kinds of scenarios outlined in this bug (https://gecko.readthedocs.io/en/latest/build/buildsystem/files-metadata.html). And we even have a mach query command (`mach file-info`) and HTTP service for querying the data. So, I am somewhat opposed to reinventing the data format since that seems like a mostly solved problem.

I'd prefer we add stuff to Files in moz.build and see how far that gets us. If we outgrow moz.build, we can invent something new.
I would prefer if this were in manifest files, moz.build files secondary, and alternative files like OWNERS as a last resort.  Being in-tree is ideal.

My biggest fear is making something too granular- such that we have each test or worse yet line of code assigned to a specific person(s).  Ideally everything in a given directory would have a clear point of contact and that point of contact will be responsible for finding peers to fix/review/address test cases.  It is important to know where to file bugs for everything, in many cases it isn't clear- having a top level component helps here- this is already in moz.build files, so extending that to have owners would be nice.

With fewer spots to edit, there are fewer spots to get out of date.
I think test manifest files would be a poor choice here. They only apply to tests, but ownership applies to all the files in the tree, both tests and code. It isn't even the case that all tests use manifests.

If everyone else agrees that the moz.build files are the way to go that's fine, but I think there is a big advantage in moving metadata away from code (which is what the moz.build files are), because it makes people more confident that they can update them without breaking the build. I also think there's a big advantage in using a well-known file format rather than an invented one that is only well understood by a handful of people.

I don't think that anyone is suggesting having an owner per line of code. Having a list of owners per directory with the option to override per file seems like the minimum useful level of granularity.
I was looking at web-platform tests and don't see a lot of moz-build files, I did file some OWNERS file in the tree, for example:
https://dxr.mozilla.org/mozilla-central/source/testing/web-platform/tests/pointerlock/OWNERS

this lists 2 people, although I couldn't find those names on IRC in #developers, in mozillians.org, or in phonebook.  I did see that these appear to be github usernames- so that is useful to some degree.  From my point of view, this current use of the file provides little to no value.

It does seem that a moz.build method would be best as this would be giving us a list of bugzilla/mozilla references without cluttering up the mapping of the existing OWNERS files.

In addition to owners, the bugzilla component would be useful.  In the above example of pointerlock, I would have no idea where to file a bug for an intermittent test as there is no component in bugzilla matching pointerlock or pointer (a few matches on lock, but they seem to be specific like blocklist, etc.)

I already see a BUG_COMPONENT field in moz.build:
https://dxr.mozilla.org/mozilla-central/search?q=path%3Amoz.build+bug_component&redirect=true

this seems relevant, maybe there are some edge cases to be worried about- yet using this information would give us a path forward.  Maybe we could add something like [0]:
with Files('**'):
    FILE_OWNERSHIP = ('jgraham', 'gps')

# Specific file has distinct owner
with Files('test_randomstuff.js'):
    FILE_OWNERSHIP = ('jmaher')

that would apply generically to all files in the tree specified by the moz.build file- if we feel there are exceptions, then we should call them out here so we can discuss the other use cases.

[0] https://dxr.mozilla.org/mozilla-central/source/browser/components/places/moz.build#18
Priority: -- → P3
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: