933783 - Write tools to monitor code changes per author

Assignee

Description

•

11 years ago

To report productivity in terms of code contributions, we'll need an automated means of determining how much code each of us has contributed during a fixed period.  Two decent metrics for that are the number of commits and the number of lines changed in those commits.

So, we'll need some scripts that can handle hg, svn, and git repositories and extract per-user numbers for each of those metrics during specific time periods.
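
For the git case, both metrics can be read off the output of `git log --numstat --format=@%ae`. The sketch below is only an illustration, not code from this bug: the function name `parse_numstat` and the `@` marker used to spot the author lines are assumptions. It takes the log text as a string so it can be tried without a repository.

```python
from collections import defaultdict


def parse_numstat(log_text):
    """Aggregate commits and changed lines per author.

    Expects the text printed by `git log --numstat --format=@%ae`:
    one "@author-email" line per commit, followed by
    "added<TAB>removed<TAB>path" lines for the files it touched.
    """
    stats = defaultdict(lambda: {"commits": 0, "added": 0, "removed": 0})
    author = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            # Start of a new commit: remember who wrote it.
            author = line[1:].strip()
            stats[author]["commits"] += 1
        elif line.strip() and author is not None:
            added, removed, _path = line.split("\t", 2)
            # Binary files are reported as "-<TAB>-<TAB>path"; skip them.
            if added != "-":
                stats[author]["added"] += int(added)
                stats[author]["removed"] += int(removed)
    return dict(stats)
```

The svn and hg cases would need their own parsers that produce the same per-author dictionary.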

Rob Tucker [:rtucker]

Comment 1

•

11 years ago

How much is the bonus multiplier for a negative code contribution?

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 2

•

11 years ago

dear god don't let this be tied to bonuses

:Atoll

Comment 3

•

11 years ago

(In reply to Rob Tucker [:rtucker] from comment #1)
> How much is the bonus multiplier for a negative code contribution?

A lot, I hope :)

Corey Shields [:cshields]

Comment 4

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] (I read my bugmail; don't needinfo me) from comment #0)
> To report productivity in terms of code contributions, we'll need an

I disagree that we need to measure productivity in terms of lines of code.  The best programmers can spend hours writing hundreds of lines and days focusing on very few.

And, as atoll pointed out, some can spend lots of time refactoring and -removing- code, which is just as valuable.

anyway, I can see value in such a tool for identifying where a person's time is spent  (ie: people like dustin and atoll spend a lot of time in puppet, versus other admins who may touch it little).  I just don't want to see this equate to a feeling of needing to hit code quotas or move a graph of contributions to prove their worth.

Rob Tucker [:rtucker]

Comment 5

•

11 years ago

(In reply to Corey Shields [:cshields] from comment #4)
> (In reply to Dustin J. Mitchell [:dustin] (I read my bugmail; don't needinfo
> me) from comment #0)
> > To report productivity in terms of code contributions, we'll need an
> 
> I disagree that we need to measure productivity in terms of lines of code. 
> The best programmers can spend hours writing hundreds of lines and days
> focusing on very few.
> 
> And, as atoll pointed out, some can spend lots of time refactoring and
> -removing- code, which is just as valuable.
> 
> anyway, I can see value in such a tool for identifying where a person's time
> is spent  (ie: people like dustin and atoll spend a lot of time in puppet,
> versus other admins who may touch it little).  I just don't want to see this
> equate to a feeling of needing to hit code quotas or move a graph of
> contributions to prove their worth.

I apologize, but it is unclear to me if this is something that I should be looking into building or not?

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 6

•

11 years ago

I'll probably do it when I get a chance, but if you're keen, then by all means do :)

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 7

•

11 years ago

Rob -- I have a pretty stupid simple thing begun at
  https://github.com/djmitche/code-change-monitor

can you have a look and let me know what you think?

It only supports Git right now, so it will need SVN and Hg classes.  And I don't really have a good idea whether the report format is any good.  I'm figuring it can end up in an HTML email eventually.

Would you be willing to hack up SVN and Hg classes while I improve the reports?  Or the other way around?
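
One way to picture the split between Git, SVN, and Hg classes is a small shared interface that each backend implements. This is a hypothetical sketch, not the layout of the linked repository: `Change`, `Connector`, `StaticConnector`, and the `changes(since, until)` method are all made-up names.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date


@dataclass
class Change:
    """One commit's contribution (hypothetical record type)."""
    author: str
    added: int
    removed: int


class Connector(ABC):
    """Hypothetical common interface for the Git, SVN, and Hg backends."""

    def __init__(self, location):
        self.location = location

    @abstractmethod
    def changes(self, since, until):
        """Yield one Change per commit dated since <= d < until."""


class StaticConnector(Connector):
    """Stand-in backend serving canned data.

    A real backend would shell out to git, svn, or hg instead.
    """

    def __init__(self, location, canned):
        super().__init__(location)
        self.canned = canned  # list of (commit_date, Change) pairs

    def changes(self, since, until):
        for when, change in self.canned:
            if since <= when < until:
                yield change
```

With an interface like this, the report code only ever sees `Change` records and does not care which version control system they came from.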

Rob Tucker [:rtucker]

Comment 8

•

11 years ago

Yep, happy to take a look and help out. Can you comment on the priority and importance of this project?

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 9

•

11 years ago

I'll leave priority to Corey.  This is important insofar as Upper Management would like metrics on who's doing what, and lots of us (yourself included) mainly "do" code, so this gives UM good visibility of our activities and how those grow or shrink over time.

Amy Rich [:arr] [:arich]

Comment 10

•

11 years ago

It would be great to have it by the end of the month so we can use it to collect stats for the next operational report cards.

Brian Hourigan [:digi]

Comment 11

•

11 years ago

Comment 9 is in direct contradiction to comment 4. Comment 4 suggested that ICs would not be tracked/scored based on commits and number of lines changed.

Amy Rich [:arr] [:arich]

Comment 12

•

11 years ago

I don't intend to track individual ICs, just use this as a tool to show how much effort my team is putting into making code changes from month to month.

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 13

•

11 years ago

I spoke loosely and I shouldn't have.  You shouldn't take my comments as to how this is or is not being used to track, score, evaluate, compensate, punish, reward ICs, resistors, capacitors, employees, volunteers, contractors, or anything like that.  I have no idea.  I just know I code a lot, and if someone is measuring something about me, I'd like those measurements to reflect what I do.  So just ignore the bits where I said anything about what anyone's doing with this, and let's not try to figure that out on this bug.  Talk to your boss.

I was led to believe we should produce this information.  If that's not the case, R/INVALID.  If that is the case, let me know in more detail what it should look like, and I'll keep hacking.

Here's a sample of what I've got: http://people.v.igoro.us/~dustin/ccm-report.html

:Atoll

Comment 14

•

11 years ago

This tool will require some ability to scan GitHub for employee work (for instance, Puppetagain or the DB team's mysql/postgres modules or a lot of rtucker's work with puppetlabs-firewall).

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 15

•

11 years ago

It can do that - the demo stuff I set up is, in fact.

I'm not sure where we'll end up running this, but that might require a flow.  I'm open to ideas for hosting.

Amy Rich [:arr] [:arich]

Comment 16

•

11 years ago

To address the issue of people worrying about being held to some sort of quotas or anything of that sort, let me state that this is absolutely NOT the intention here.  The intention is to be able to show, at a team level, where a great deal of effort is being spent (this is part of the story of "what does your team do?" not keeping track of every commit or every line of code specific people write).  My intention is to give people a sense of how much work actually goes into all of the puppet/python/vbs/etc work that we do as part of an overall picture of all of the things that are on my team's plate.

I'm specifically looking for metrics about repos, because, fundamentally, that's probably 75% of what my team does. Our niche is installation and automation. We don't manage the hardware and we don't manage the applications, and machine uptime or load is not a useful metric when the machines are designed to reboot and only run one job at a time.  

So I hope that clears up any misconceptions and I apologize if people are feeling frustrated or concerned by the exchange in the above comments.  I'll take full responsibility for any ill will since I didn't specify a clear goal or requirements out of the gate.

:Atoll

Comment 17

•

11 years ago

(In reply to Richard Soderberg [:atoll] from comment #14)
> This tool will require some ability to scan GitHub for employee work (for
> instance, Puppetagain or the DB team's mysql/postgres modules or a lot of
> rtucker's work with puppetlabs-firewall).

I filed bug 941086 to create a link between employee LDAP and GitHub accounts, which is something we've needed for several reasons anyways, but it would benefit you in this case as well.

Amy Rich [:arr] [:arich]

Comment 18

•

11 years ago

So, onto what data would be useful to me...

I took a look at http://people.v.igoro.us/~dustin/ccm-report.html and I think that's an awesome start.  I'm mostly interested in raw data per month, so that I can add the people on my team together and slap the totals into a spreadsheet.  I like the fact that it's broken down per repo so I can pull out specific stats for relops (as a whole, not individual people) and compare them from month to month and against overall contributions to the repo.  One possible example might be: "for the releng puppet repo, there was a total of 8010 lines added, 5295 removed, and relops made 85.5% of those modifications".  That would be great to graph, with a secondary graph showing the change between months once we have more than one month of data.
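
The team-level figure in that example could be derived from per-author numbers roughly as follows. This is a sketch under assumptions: `team_share` and the shape of the `per_author` dictionary are invented for illustration, and "modifications" is taken to mean lines added plus lines removed.

```python
def team_share(per_author, team):
    """Summarize one repo: total lines added and removed, plus the
    percentage of changed lines that came from members of `team`.

    per_author maps an author name to {"added": int, "removed": int}.
    """
    total_added = sum(s["added"] for s in per_author.values())
    total_removed = sum(s["removed"] for s in per_author.values())
    total_changed = total_added + total_removed
    team_changed = sum(
        s["added"] + s["removed"]
        for author, s in per_author.items()
        if author in team
    )
    # Guard against an empty month so we never divide by zero.
    share = 100.0 * team_changed / total_changed if total_changed else 0.0
    return {
        "added": total_added,
        "removed": total_removed,
        "team_share_pct": round(share, 1),
    }
```

Only the aggregate leaves the function, which matches the goal of reporting on the team as a whole and not on individual people.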

I think we want to query:

releng puppet
releng buildbot
releng mozpool
sysadmins puppet
the various projects we have on github (we'd need to enumerate these)
wherever we decide to keep info for the various windows program/config side of things

So as far as connectors, I think that's likely hg, svn, and git (and we need to figure out where we're keeping the windows stuff... maybe git?)

I'm probably interested in having this in CSV format on the first of every month for the previous month (email is fine), but if it is more useful to have a sliding window or some other delivery format/method, I'm open to suggestions.
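
A monthly CSV run mostly comes down to picking the previous calendar month and writing one row per repo and author. The sketch below assumes a column layout (`repo, author, commits, added, removed`) that was not specified anywhere in this bug, and the function names are hypothetical.

```python
import csv
import io
from datetime import date


def previous_month(today):
    """Return (start, end): the first day of the previous month and
    the first day of the current month."""
    end = today.replace(day=1)
    if end.month == 1:
        start = end.replace(year=end.year - 1, month=12)
    else:
        start = end.replace(month=end.month - 1)
    return start, end


def to_csv(rows):
    """Render (repo, author, commits, added, removed) tuples as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["repo", "author", "commits", "added", "removed"])
    writer.writerows(rows)
    return buf.getvalue()
```

Run on the first of the month, `previous_month(date.today())` gives the window to report on, and the returned string can go straight into an email body or attachment.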

:Atoll

Comment 19

•

11 years ago

(In reply to Amy Rich [:arich] [:arr] from comment #18)
> I think we want to query:
> 
> releng puppet
> releng buildbot
> releng mozpool
> sysadmins puppet
> the various projects we have on github (we'd need to enumerate these)
> wherever we decide to keep info for the various windows program/config side
> of things

Bugzilla?

Rob Tucker [:rtucker]

Comment 20

•

11 years ago

(In reply to Amy Rich [:arich] [:arr] from comment #10)
> It would be great to have it by the end of the month so we can use it to
> collect stats for the next operational report cards.

I can guarantee with 100% certainty that this will not be completed by the end of the month.

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 21

•

11 years ago

I'll take that as a challenge!

Rob Tucker [:rtucker]

Comment 22

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] (I read my bugmail; don't needinfo me) from comment #21)
> I'll take that as a challenge!

My statement is due to the fact that I'm slammed with other things and there is no way I can get the code written for 2 new connectors in 5 working days' time, not to mention the additional requirement of monthly emails as CSV. Just implementing the per-person associations to repositories is a large task.

With the workload of everyone in webops, I'd be surprised if, even with the code done today, we could get it deployed on IT infra to production in 5 days.

Also factor in that this is being used to generate some level of metrics for tracking productivity of employees.  Do we really want to make this a rush job with limited eyes on testing and metrics gathering?

If you can get this thing done in 5 days, with enough precision to potentially affect an internal employee's performance perception, you deserve a special place in the clouds.

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 23

•

11 years ago

I didn't realize this wasn't assigned to me.  I understand you don't have time, and I didn't mean to imply you should make time.

This isn't a web app, so webops isn't involved.  By "hosting" I only mean "a box to run this on", and honestly it could run on a server at my place if push came to shove.

Amy's requests are pretty straightforward.  You make a good point about testing, though.

Assignee: infra → dustin

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Updated

•

11 years ago

Component: Infrastructure: Tools → RelOps

QA Contact: rtucker → arich

Rob Tucker [:rtucker]

Comment 24

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] (I read my bugmail; don't needinfo me) from comment #23)
> I didn't realize this wasn't assigned to me.  I understand you don't have
> time, and I didn't mean to imply you should make time.
> 
> This isn't a web app, so webops isn't involved.  By "hosting" I only mean "a
> box to run this on", and honestly it could run on a server at my place if
> push came to shove.
> 
> Amy's requests are pretty straightforward.  You make a good point about
> testing, though.

Let me know if you want help with those connectors, happy to pitch in where I can. I don't have a ton of cycles but could pitch in something I'm sure.

Amy Rich [:arr] [:arich]

Comment 25

•

11 years ago

To be clear, it is not necessary that this be done by the end of the quarter.  It would be nice to have, but if we don't have cycles, that's fine.  When we get the data, we can start using it, but this shouldn't take precedence over other IT goals.

Dustin J. Mitchell [:dustin] (he/him)

Assignee

Comment 26

•

11 years ago

Well, I was unable to find a place to host this that had both python-2.7 and a new enough version of git to take dates in the --since and --until arguments.  So I tarred the thing up and sent it to Amy to run by hand.
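
For anyone running it by hand, the date-limited query boils down to a `git log` call with `--since` and `--until`. The helper below is a sketch, not the tool's actual code: the name `git_log_command` and the `--format=@%ae` marker are assumptions.

```python
from datetime import date


def git_log_command(since, until):
    """Build the argv for a per-author numstat log between two dates."""
    # Spell out midnight so the boundary does not depend on how git
    # fills in a missing time of day.
    return [
        "git", "log", "--numstat", "--format=@%ae",
        "--since=" + since.isoformat() + " 00:00:00",
        "--until=" + until.isoformat() + " 00:00:00",
    ]
```

The list can be passed to `subprocess.check_output` with `cwd` set to the repository checkout.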

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED