Closed Bug 1229113 Opened 9 years ago Closed 6 years ago

Send more fine-grained Pulse messages when review requests are published

Categories

(MozReview Graveyard :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: mconley, Unassigned)

References

Details

We currently send a Pulse message every time the parent of a push-based review request is published with changes to the diff.

This sort of message is useful for doing things like kicking off automated static analysis tools, like the one proposed in bug 1229106.

Unfortunately, with our current setup, we'd also kick off those bots even if the push was a straight-forward rebase. That adds a lot of unnecessary noise.

Having talked to smacleod in #mozreview, I think what we want are finer grain messages. Specifically, we want different messages for the following scenarios:

1. A new parent diff revision is added (what we've currently got)
2. A commit was updated / added. (a message for each added/updated commit review request, including rebases)
3. A new parent diff revision was added (but not for simple rebases)
4. A commit was updated / added (a message for each added/updated commit, but filtering out any simple rebases)

Depending on the static analysis bot, they'd probably want to listen to either 3 or 4 to trigger their analysis.

Does my description sound right, smacleod?
Flags: needinfo?(smacleod)
Blocks: 1229106
Yup, that all sounds good to me.

Sending messages 1 & 2 are very straight forward. My idea for 3 and 4 is to have a worker listen for 1 or 2 and then do the processing needed to make decisions for the filtered messages. The worker would then send out the pulse messages if it decides there have been changes. This allows us to use some more heavyweight processing rather than trying to calculate it and send the pulse message in the web process like 1 & 2.
Blocks: 1125473
Flags: needinfo?(smacleod)
(In reply to Steven MacLeod [:smacleod] from comment #1)
> Yup, that all sounds good to me.
> 
> Sending messages 1 & 2 are very straight forward. My idea for 3 and 4 is to
> have a worker listen for 1 or 2 and then do the processing needed to make
> decisions for the filtered messages. The worker would then send out the
> pulse messages if it decides there have been changes. This allows us to use
> some more heavyweight processing rather than trying to calculate it and send
> the pulse message in the web process like 1 & 2.

That sounds good, but what is this heavyweight processing? Is this using the Review Board internal opcode stuff, or is there some hg-fu we can use?
Flags: needinfo?(smacleod)
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #2)
> That sounds good, but what is this heavyweight processing? Is this using the
> Review Board internal opcode stuff, or is there some hg-fu we can use?
I see this processing mostly involving hg and the actual diff / file contents, rather than anything Review Board has generated.

I'll illustrate the problem and solution with an example:

RRSeries_0:
... <- X <- A_0 <- B_0

RRSeries_1:
... <- Y <- A_1 <- B_1

RRSeries_0 is two draft commits (A_0 & B_0) based on public commit X, and RRSeries_1 is a rebase of A_0 & B_0 from X to public commit Y, where A_0 & B_0 may also have been modified.

Taking commits, A_0 and A_1, which we've matched and used A_1 to update the diff of the review request for A_0 (because they do not have matching sha1s). We need to determine if the author of these commits has made actual changes to A_0 to create A_1, or if A_1 is simply a rebase.

I'd say the most straightforward way to check this is to perform an actual rebase of A_0 and see if it matches A_1 (We'd also need to manually update all of the rebased commits meta data, such as commit message, and hg extra data, to match A_1). If we end up with the same sha1, we know A_1 is strictly a rebase. There might be a smarter way to check this internally in hg like comparing the file tree our test rebase commit points at, rather than the sha1, meaning we wouldn't have to worry about commit data. We should ask gps about this.

For B_0 and B_1, I believe we should be able to make our test rebased commit on top of A_1 and just perform the comparison there (I might be missing some complexity though).

Another option would be to compare diff chunks or something.

The other piece of this, is with evolve enabled we may end up with a history edit which reorders the commits, but doesn't actually modify the contents. I believe we *could* be intelligent in this case, but it would likely be much more complex. I haven't thought through how to handle this, or even decided if we should bother rather than falling back to "Assume all commits have been modified!".
Flags: needinfo?(smacleod) → needinfo?(gps)
To compare 2 Mercurial changesets without looking at the commit data, you can compare the manifests, which is the list of files and their versions+metadata in that changeset. If the manifest SHA-1 is identical, it's the same exact files. You can also iterate the manifests and examine individual differences.
Flags: needinfo?(gps)
No longer blocks: 1229106
Blocks: 1239044
Product: Developer Services → MozReview
MozReview is now obsolete. Please use Phabricator instead. Closing this bug.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.