Closed Bug 1236618 Opened 8 years ago Closed 8 years ago

Move pushlog client aggregation out of mozext and in to pushlog

Categories

(Developer Services :: Mercurial: Pushlog, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nalexander, Assigned: gps)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(3 files)

Right now, version-control-tools has pushlog related functionality in two places: the pushlog extension and the mozext extension.  IIUC, the pushlog extension was initially targeted for server-side use and the mozext extension for client-side use.  Per gps, mozext is a little hacky and not well tested; in response, it's not offered during |mach mercurial-setup|.  However, |mach artifact| uses the pushlog extensively, so we'd like people to use mozext.

To improve the situation, this ticket tracks moving the pushlog client aggregation out of mozext and in to pushlog (and hopefully improving the testing while we're there).

This doesn't block Bug 1234912, but that ticket is relevant to this one.
In preparation for providing an alternate mechanism for inserting data
into the changetracker.

Review commit: https://reviewboard.mozilla.org/r/31537/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/31537/
Attachment #8709785 - Flags: review?(nalexander)
This will allow us to obtain pushlog data from not the JSON API.

Review commit: https://reviewboard.mozilla.org/r/31539/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/31539/
Attachment #8709786 - Flags: review?(nalexander)
It is faster to obtain the pushlog via the wire protocol via an already
established TCP connection than to fetch it via multiple HTTP requests
from the Mercurial JSON web API.

Before: ~20.3s
After:  ~13.3s

This does create some duplicate pushlog fetch code. And `hg pushlogsync`
doesn't use the new API. But progress is progress. We should probably
clone the tracking bug to track doing this more robustly.

Review commit: https://reviewboard.mozilla.org/r/31541/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/31541/
Attachment #8709787 - Flags: review?(nalexander)
Comment on attachment 8709785 [details]
MozReview Request: mozautomation: extract last_push_id to its own function; r?nalexander

https://reviewboard.mozilla.org/r/31537/#review28395

I have some small questions, but nothing serious.

::: pylib/mozautomation/mozautomation/changetracker.py:111
(Diff revision 1)
> +        last_push_id = self._db.execute('SELECT push_id FROM pushes WHERE '

Is it faster to use `MAX` here?

::: pylib/mozautomation/mozautomation/changetracker.py:125
(Diff revision 1)
>          last_push_id = last_push_id[0] if last_push_id else -1

Is there a reason you're not using your new helper here?
Attachment #8709785 - Flags: review?(nalexander) → review+
Comment on attachment 8709786 [details]
MozReview Request: mozautomation: add function to insert pushes from raw data; r?nalexander

https://reviewboard.mozilla.org/r/31539/#review28397

::: pylib/mozautomation/mozautomation/changetracker.py:152
(Diff revision 1)
> +                self._db.execute('INSERT INTO pushes (push_id, tree_id, time, '

It's odd to me that the input is `who, when` and then you reverse them.  Is this more natural for a consumer?

::: pylib/mozautomation/mozautomation/changetracker.py:158
(Diff revision 1)
> +                    self._db.executemany('INSERT INTO changeset_pushes VALUES '

You named the complete set of fields in the first insert, but didn't in the second.  Consider being consistent.
Attachment #8709786 - Flags: review?(nalexander) → review+
Comment on attachment 8709787 [details]
MozReview Request: mozext: obtain pushlog data via wire protocol (bug 1236618); r?nalexander

https://reviewboard.mozilla.org/r/31541/#review28399

Neat!  I learned some stuff crawling through this code.

::: hgext/mozext/__init__.py:426
(Diff revision 1)
> +            repo.ui.warn('received pushlog entry for unknown changeset; ignoring\n')

Is it the case that there will never be a known changeset following an unknown changeset?  Would any such a known changeset be received in a later pushlogsync?
Attachment #8709787 - Flags: review?(nalexander) → review+
https://reviewboard.mozilla.org/r/31537/#review28395

> Is it faster to use `MAX` here?

Meh.
https://reviewboard.mozilla.org/r/31541/#review28399

> Is it the case that there will never be a known changeset following an unknown changeset?  Would any such a known changeset be received in a later pushlogsync?

The pushlog always returns in push order, oldest to newest.

The scenario where this happens is when the remote is pushed to in the middle of the `hg pull`. The pushlog query occurs after new changesets have been introduced on the remote. It is easier to just filter them on the client than to request the upper bound of returned pushlog entries from the server (which isn't that much work, and I concede should be implemented at some point).
Assignee: nobody → gps
Status: NEW → ASSIGNED
Fixing this one. Will clone to track making this better.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Blocks: 1241726
Blocks: 1242281
Blocks: 1242653
Depends on: 1245273
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: