Pushlog should return a "lastpushid"

RESOLVED FIXED

Status

Developer Services
Mercurial: Pushlog
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: emorley, Assigned: gps)

Tracking

(Blocks: 1 bug)

Dependency tree / graph

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/727] )

MozReview Requests

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(12 attachments, 1 obsolete attachment)

38 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
39 bytes, text/x-review-board-request
emorley
: review+
Details | Review
(Reporter)

Description

3 years ago
For treeherder, we're trying to solve this problem:
* To reduce load on json-pushes when polling, we use startID to limit the range returned for each repo.
* To do this, we have to cache the most recent push ID known for each repo, and use that as the startID for the next request.
* When a repo gets reset (eg Try or a disposable project branch) the push IDs are reset.
* This then means that the startID requested doesn't exist, so we ideally need to be able to tell it doesn't exist, so we can invalidate the cached startID and fall back to /json-pushes/?full=1 which will give us the last 24 hours worth of pushes.

However currently json-pushes returns a 200 and the same (empty) json response for both the "startID doesn't exist" and "no new pushes since a valid startID" cases. If the former returned !200, we could easily handle this and invalidate the cached startID, meaning we don't need user intervention to un-jam the pushlog ingestion.

Invalid startID:
https://hg.mozilla.org/mozilla-central/json-pushes/?full=1&startID=9999999

And compare to the startID of the current tip push (see the last push in https://hg.mozilla.org/mozilla-central/json-pushes/ to find this out).
(Reporter)

Updated

3 years ago
Blocks: 1065775
(Reporter)

Comment 1

3 years ago
This will require changes around here:
https://hg.mozilla.org/hgcustom/pushlog/file/0ffe19e2e343/pushlog-feed.py#l68

I don't know whether we're best to try to perform the query, and if no results are found, then check to see if start_id exists in the table, or check first. If we checked after, it would save extra work in the common case I guess.

Some musings:
* Things like fromchange already 404, eg https://hg.mozilla.org/mozilla-central/json-pushes?fromchange=foo
* Should we also check endID too?
* What to do if either startID or endID don't exist in the DB, but there are still results? eg hg.mozilla.org/mozilla-central/json-pushes?startID=27490&endID=99999999 

Perhaps for now we just keep this simple:
* Only check startID
* If a push of that ID does not exist in the DB, return "unknown startID '1234'" along with a 404.
(Reporter)

Updated

3 years ago
No longer blocks: 1065775
(Reporter)

Comment 2

3 years ago
No longer affects treeherder, since its pushlog ingestion has switched from &startID=N to &fromchange=rev, and the pushlog extension already correctly returns a 404 for the latter.
Summary: Pushlog returns 200 even if the startID push doesn't exist → Pushlog should return HTTP 404 if the startID push doesn't exist
Whiteboard: [treeherder]
Product: Webtools → Developer Services

Updated

3 years ago
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/42]

Updated

3 years ago
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/42] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/727] [kanban:engops:https://kanbanize.com/ctrl_board/6/42]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/727] [kanban:engops:https://kanbanize.com/ctrl_board/6/42] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/727]
(Reporter)

Updated

3 years ago
Blocks: 1076826
(Reporter)

Updated

3 years ago
Blocks: 1104374
I'll work on this today.
Assignee: nobody → gps
Status: NEW → ASSIGNED
Ugh.

So it turns out that the 404 for e.g. https://hg.mozilla.org/mozilla-central/json-pushes?fromchange=foo is untested behavior that just happens to work.

In the bowels of the pushlog renderer is some code that does |repo[changeset]|. If an invalid changeset is passed, that will raise a RepoLookupError. hgweb catches this exception and converts it to a 404 (http://selenic.com/repo/hg/file/a81c76106d90/mercurial/hgweb/hgweb_mod.py#l258).

I was going to push back on using HTTP 404 here. But since I guess there is precedent...

I would like to put additional metadata in the JSON response. e.g. "lastPushID." To my dismay, the keys of the root json object returned by json-pushes are integer push IDs. And multiple clients are coded to iterate through the keys and assume they push IDs. This an example of how to not write an extensible format.

I may throw an optional query string argument on the API to change the JSON response to enable us to communicate more data to clients.
Created attachment 8528626 [details]
MozReview Request: bz://1065771/gps
Attachment #8528626 - Flags: review?(emorley)
Attachment #8528626 - Flags: review?(bkero)
/r/999 - testing: support query strings in http-request.py
/r/1001 - pushlog: testing support for comparing JSON files
/r/1003 - pushlog: use new-style class definition
/r/1005 - pushlog: port empty repo tests to .t tests
/r/1007 - pushlog: port json-pushes tests to .t tests
/r/1009 - pushlog: port test with changeset and id boundaries
/r/1011 - pushlog: port tests for querying specific changesets
/r/1013 - pushlog: test that querying a changeset that doesn't exist results in 404
/r/1015 - pushlog: sort keys during JSON serialization
/r/1017 - pushlog: add a version query string flag to control output format
/r/1019 - pushlog: implement format version 2
/r/1021 - pushlog: add last push ID to pushlog response (bug 1065771)

Pull down these commits:

hg pull review -r 3bbf4072e5a749c8bb42229ed4854cae9f622e1a
https://reviewboard.mozilla.org/r/997/#review541

I had an existing patch series for porting the pushlog tests to .t tests. I brushed off the cobwebs and tacked the new code needed to support startID-based polling onto the end. I hope you don't mind. Commits should be rummer stamp worthy.
(Reporter)

Comment 8

3 years ago
https://reviewboard.mozilla.org/r/1021/#review543

::: hgext/pushlog-legacy/pushlog-feed.py
(Diff revision 1)
> -        return {'pushes': pushes}
> +        return {'pushes': pushes, 'lastpushid': query.lastpushid}

Do we want to return lastpushid when using &changeset=SHA? Or should we only do so when using ID based or range based params?

::: hgext/pushlog-legacy/tests/test-hgweb.t
(Diff revision 1)
> -Format version 2 has pushes in a child object
> +Format version 2 has pushes in a child object and last push id

s/and last push id/and a last push id/ perhaps?
(Reporter)

Comment 9

3 years ago
https://reviewboard.mozilla.org/r/997/#review545

Ship It!
Attachment #8528626 - Flags: review?(emorley) → review+
(Reporter)

Comment 10

3 years ago
(In reply to Ed Morley (moved to Treeherder team) [:edmorley] from comment #8)
> Do we want to return lastpushid when using &changeset=SHA? Or should we only
> do so when using ID based or range based params?

Clicked too soon. I meant it just seems a little unrelated to return it when requesting a single changeset by SHA, though consistency probably wins (even if it means an additional query).
https://reviewboard.mozilla.org/r/1021/#review985

> Do we want to return lastpushid when using &changeset=SHA? Or should we only do so when using ID based or range based params?

Consistency is easiest, methinks.
https://hg.mozilla.org/hgcustom/version-control-tools/rev/a7732c976e57
Attachment #8528626 - Flags: review?(bkero)
Hopefully we'll get this deployed by EOD so downstream consumers can switch over to the new polling method.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
This has been pushed to production.

https://hg.mozilla.org/mozilla-central/json-pushes?version=2 now works.
(Reporter)

Comment 15

3 years ago
Adjusting summary for what actually landed here, since the response is still a 200.
The patches that landed here are useful, but I still think we should return a 404 if the requested startID > lastpushid.
I'll file a new bug for that.
Summary: Pushlog should return HTTP 404 if the startID push doesn't exist → Pushlog should return a "lastpushid"
(Reporter)

Updated

3 years ago
Blocks: 1114843
(Reporter)

Updated

3 years ago
Blocks: 774862
Blocks: 1116328

Updated

3 years ago
Blocks: 1116899
Comment on attachment 8528626 [details]
MozReview Request: bz://1065771/gps
Attachment #8528626 - Attachment is obsolete: true
Attachment #8618339 - Flags: review+
Attachment #8618340 - Flags: review+
Attachment #8618341 - Flags: review+
Attachment #8618342 - Flags: review+
Attachment #8618343 - Flags: review+
Attachment #8618344 - Flags: review+
Attachment #8618345 - Flags: review+
Attachment #8618346 - Flags: review+
Attachment #8618347 - Flags: review+
Attachment #8618348 - Flags: review+
Attachment #8618349 - Flags: review+
Attachment #8618350 - Flags: review+
Created attachment 8618339 [details]
MozReview Request: testing: support query strings in http-request.py
Created attachment 8618340 [details]
MozReview Request: pushlog: testing support for comparing JSON files
Created attachment 8618341 [details]
MozReview Request: pushlog: use new-style class definition
Created attachment 8618342 [details]
MozReview Request: pushlog: port empty repo tests to .t tests
Created attachment 8618343 [details]
MozReview Request: pushlog: port json-pushes tests to .t tests
Created attachment 8618344 [details]
MozReview Request: pushlog: port test with changeset and id boundaries
Created attachment 8618345 [details]
MozReview Request: pushlog: port tests for querying specific changesets
Created attachment 8618346 [details]
MozReview Request: pushlog: test that querying a changeset that doesn't exist results in 404
Created attachment 8618347 [details]
MozReview Request: pushlog: sort keys during JSON serialization
Created attachment 8618348 [details]
MozReview Request: pushlog: add a version query string flag to control output format
Created attachment 8618349 [details]
MozReview Request: pushlog: implement format version 2
Created attachment 8618350 [details]
MozReview Request: pushlog: add last push ID to pushlog response (bug 1065771)
You need to log in before you can comment on or make changes to this bug.