Closed Bug 451350 Opened 17 years ago Closed 11 years ago

pushlog?tipsonly=1 should report the right list of files

Categories

(Developer Services :: Mercurial: hg.mozilla.org, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: Pike, Unassigned)

Details

Attachments

(1 file)

Currently, tipsonly=1 only reports the files that are in the tip changeset. It should aggregate all the files from the changesets it's aggregating, as it's really returning an entry per push. That blends nicely with the fact that merge changesets actually return an empty list for files(), and are not too uncommon to be the actual tip changeset.
djc volunteered to own this on #hg.
Assignee: nobody → dirkjan
Status: NEW → ASSIGNED
I'd like to get this in, and here's how I'd do it. Posting the idea first, as it's slightly more involved, but easier now that we have refactored stuff. I'd move the entries from tuples to objects. I could use dicts, too, but I prefer to do entry.user to entry['user']. Then I'd add an optional argument to pushlogSetup and DoQuery to get the files, which would then be in entry.files. I'd like to use sets for the actual file logic, can we do that? That'd require python 2.4 for real sets, I can wallpaper with from sets import Set as set in a catch. I'd trigger that code from pushlogFeed and use the files in the entry objects instead of getting them just from the tip changectx.
Making the entries separate objects seems sensible. Maybe use __slots__ here? Are there any cases where we're instantiating a lot of them? Does this mean you're getting the files at request time, or do you want to put them in the database?
If we'd use lots of them, we'd be dead anyway, there is only going to be one objects per displayed checkin. We tried a good deal to keep the total number of those limited, successfully I'd think. And I'd be getting the files at request time. I don't intend to touch the database piece on the pushlog side for real, that'd end up in bad mojo. I already regret touching it with django gloves on my side ;-)
Okay, sounds good.
So, here's my take. Rather intrusive patch in the end. I had to add a new test repo, and a new test class, to test for the actual merging of files. I had to move the feed tester routine to a mixin class so that it can be shared, and I had to extend it to actually check for the content of the feed. Ted, should we test this nitty-gritty? Right now, I compare the content wholesale, we could just test for <li class="file"> text nodes, too. That would strip the dependency on whitespace etc that the current tests have. Anyway, I had to tweak this somewhat, as my broken feeds were passing the tests ;-) On to the actual code, I refactored a bit, now the iteration over the query results is only one piece of code, with different cursor objects going in. I redid the default sql code, it now does two queries, instead of N pushes + 1. If we decide that the pushes are always ordered in time, I could just use one. But I didn't feel like adding that assumption. I tested both code paths for IN and <= >= by breaking the check for consecutive push ids. In general, the limits check should be faster, thus I'm checking for that. The rest was rather simple and straightforward, IMHO.
Assignee: dirkjan → l10n
Attachment #350402 - Flags: review?(ted.mielczarek)
Comment on attachment 350402 [details] [diff] [review] new tests, restructure, change some sql + self._files = set() We should probably sanity check with aravind that we're using Python 2.4+ on hg.mo. + files = list(self._files) + files.sort() + return files If you're going to require Python 2.4+, might as well just return sorted(files) + def DoQuery(self, getFiles = False): Feels like the wrong place for this, can you just make it an optional parameter to the constructor instead? + push_ids = self.conn.execute("SELECT id FROM pushlog ORDER BY date DESC LIMIT ? OFFSET ?", + (self.querystart_value, + (self.page - 1) * self.querystart_value)).fetchall() I don't think you need the fetchall here, the result of execute is iterable, so your map below (slightly tweaked) should work. + push_ids.sort() Again, just push_ids = sorted(map(...)). + else: + # pushes are non-consecutive. Not sure why, but still, + # let's search with IN. More costly, but safe. I'd ditch this logic and just throw an error here. Do you know of any scenario where we'd actually hit this using the query above? + if getFiles: + self.entries[-1].addFiles(self.repo.changectx(node).files()) I think we should just make the Entry class have a ctx property, since all the default code paths are going to need changectxs anyway (except the JSON, I guess, but nothing's polling that on a regular basis). I guess that won't actually make this line better, since you don't make an Entry for everything in the tipsonly case, but I still think it makes sense. +class FeedTestMixin: + def assertEqualFeeds(self, a, b): I'm not really wild about the Mixin concept. Could this just be a global helper function that happens to take a TestCase as the first arg? + os.environ['TZ'] = "Europe/Berlin" Nice ;-)
Attachment #350402 - Flags: review?(ted.mielczarek) → review-
(In reply to comment #7) <...> > + else: > + # pushes are non-consecutive. Not sure why, but still, > + # let's search with IN. More costly, but safe. > > I'd ditch this logic and just throw an error here. Do you know of any scenario > where we'd actually hit this using the query above? I don't think so. Possibly only if you're migrating an old database for a repo, and someone pushes to the repo while you're doing that. Could there be race conditions in general use with two folks pushing at the same time? djst? > + if getFiles: > + > self.entries[-1].addFiles(self.repo.changectx(node).files()) > > I think we should just make the Entry class have a ctx property, since all the > default code paths are going to need changectxs anyway (except the JSON, I > guess, but nothing's polling that on a regular basis). I guess that won't > actually make this line better, since you don't make an Entry for everything in > the tipsonly case, but I still think it makes sense. I actually call into the json feed in my local setup, in a script across all 150 repos or so I'm watching. And I intend to automate that, too. > +class FeedTestMixin: > + def assertEqualFeeds(self, a, b): > > I'm not really wild about the Mixin concept. Could this just be a global helper > function that happens to take a TestCase as the first arg? Yeah, could do that. > + os.environ['TZ'] = "Europe/Berlin" > > Nice ;-) Yeah, I figured I'd stick to the scheme ;-)
Comment on attachment 350402 [details] [diff] [review] new tests, restructure, change some sql Ooh, can I comment on style nits? >+ def files(self): If you're going to mandate 2.4, you should just use property(). And sorted(), like ted said. >+ def DoQuery(self, getFiles = False): You should stick to the prevalent (PEP8) style here, and lose the spaces around '='. >+ query = pushlogSetup(web.repo, req, getFiles = True) Same here.
Aravind, which version of python are we running on the hg server? This one is about hgweb in particular, but I'd be curious to know the master hg server's python version, too.
# python -V Python 2.4.3 #
Coolio, then I'll take that out for a dance. djst, and possibly aravind, too: Can we get into a race condition with multiple people trying to push into an hg repo at once in our setup?
(In reply to comment #12) > djst, and possibly aravind, too: Can we get into a race condition with multiple > people trying to push into an hg repo at once in our setup? You mean djc, I hope?
oops. sorry. yeah. djc?
Not working on this anymore.
Assignee: l10n → nobody
This is a mass change. Every comment has "assigned-to-new" in it. I didn't look through the bugs, so I'm sorry if I change a bug which shouldn't be changed. But I guess these bugs are just bugs that were once assigned and people forgot to change the Status back when unassigning.
Status: ASSIGNED → NEW
Product: mozilla.org → Release Engineering
Product: Release Engineering → Developer Services
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/213]
I'm reluctant to add yet more functionality into pushlog, especially when that functionality can be facilitated by querying core Mercurial APIs. e.g. $ hg log -v -Tjson -r 66172a04ea943 [ { "rev": 246875, "node": "66172a04ea943a2a7b8961022c2cfc54643e77b3", "branch": "default", "phase": "draft", "user": "Gregory Szorc <gps@mozilla.com>", "date": [1414688969, 25200], "desc": "Bug 1091722 - Use Read The Docs theme for Sphinx\n\nRead the Docs has a lovely Sphinx theme that beats the pants off the\nbuilt-in and default theme. And since it looks like MDN's Sphinx theme\ "bookmarks": [], "tags": ["tip"], "parents": ["34bad7e5d37392af6206d0c40a2e050e3e14baee"], "files": ["tools/docs/conf.py", "tools/docs/mach_commands.py"] } ]
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/213]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: