Closed
Bug 498055
Opened 15 years ago
Closed 14 years ago
hgpoller._parse_changes too slow?
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: catlee, Assigned: catlee)
References
Details
(Whiteboard: [automation])
Attachments
(1 file)
3.11 KB,
patch
|
bhearsum
:
review+
catlee
:
checked-in+
|
Details | Diff | Splinter Review |
I spent a little time tonight profiling a local copy of production-master's configs to try and figure out why production-master is so slow. One thing that showed up pretty high in the profile was code related to _parse_changes. In my test, which was to start up buildbot, let it run for 2 minutes, and then shut it down, _parse_changes accounted for 20 seconds of the total (120 second) CPU time. Here's an alternate version that brought the time down to 3 seconds. import xml.etree.cElementTree as ElementTree def _parse_changes(query): etree = ElementTree.fromstring(query) xmlns = re.sub("({.*}).*", "\\1", etree.tag) changes = [] for entry in etree.findall(xmlns+'entry'): d = {} for n in entry.getiterator(): tag = n.tag.replace(xmlns, "") if tag == "title": d['title'] = n.text d['changeset'] = n.text.split(" ")[1] elif tag == "updated": d['updated'] = parse_date_string(n.text) elif tag == "author": d['author'] = n[0].text elif tag == "link": d['link'] = n.get('href') elif tag == "{http://www.w3.org/1999/xhtml}ul" and n.get('class') == "filelist": d['files'] = [c.text for c in n.findall('{http://www.w3.org/1999/xhtml}li')] changes.insert(0, d) return changes
Comment 1•15 years ago
|
||
ElementTree is new in python 2.5, is that good enough these days? If it is, we could probably use reversed(changes) and just append to the list. That might be a tad faster still. Not sure if findall() could use a path even (and what the path would be, great docs say nothing about that). I recall that we wanted to do something with the changeset in the title at some point, Ted?
Comment 2•15 years ago
|
||
I think people wanted the title to be more useful, so they could use this feed in a feed reader. I think bug 450006 is the closest thing to this that was filed.
Updated•15 years ago
|
Component: Release Engineering → Release Engineering: Future
Priority: -- → P3
Comment 3•14 years ago
|
||
Mass move of bugs from Release Engineering:Future -> Release Engineering. See http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Assignee | ||
Comment 5•14 years ago
|
||
(In reply to comment #4) > catlee: will this still be an issue in 0.8.0? yes...but since the masters will be split up, the load on the master running the poller won't affect the masters doing builds, so less of a concern. P5 / WONTFIX
Assignee | ||
Updated•14 years ago
|
Assignee: nobody → catlee
Assignee | ||
Comment 6•14 years ago
|
||
Any reason we can't use the json push log for this? It doesn't have as much data, but do we need anything more than the revision string and author?
Assignee | ||
Comment 7•14 years ago
|
||
Parsing mozilla-2.0 pushlog (a bit of an extreme case) on my laptop: existing code: 408s and lots of swapping (peaked around 3GB memory usage) ElementTree: 7.8s json: 0.01s
Comment 8•14 years ago
|
||
Yeah, but that's not too surprising, given that the json hook doesn't show the changeset details like files etc.
Assignee | ||
Comment 9•14 years ago
|
||
(In reply to comment #8) > Yeah, but that's not too surprising, given that the json hook doesn't show the > changeset details like files etc. I'm wondering if we actually need to be handling that information?
Comment 10•14 years ago
|
||
I don't see why you would. All you should need is the changeset and the identity of the pusher.
Comment 11•14 years ago
|
||
The l10n schedulers need the files in changes, at least.
Assignee | ||
Comment 12•14 years ago
|
||
(In reply to comment #11) > The l10n schedulers need the files in changes, at least. Ok, and you still use our buildbotcustom's HgPoller? I can create an alternate poller that works with json-pushes in that case.
Comment 13•14 years ago
|
||
We could do some optimizations based on the list of files that we can't do if we stop getting the file list. I would like to start doing this for mobile builds so that we don't build mobile if the change is completely unrelated to mobile. (In reply to comment #7) > Parsing mozilla-2.0 pushlog (a bit of an extreme case) on my laptop: > > existing code: 408s and lots of swapping (peaked around 3GB memory usage) > ElementTree: 7.8s > json: 0.01s What about using the ElementTree instead of json parsing? It is still a ton faster than the existing code and aiui, gives us the same information. Alternately, could we add the missing information to the json version of pushlog?
Comment 14•14 years ago
|
||
I'd like to keep the default version of json-pushes to be what it is right now, this is mostly a speed concern for the use case I have on the l10n dashboard. Here, I just get the pushes, and if I see a push, update a local clone. That local clone then serves the information on files etc. I guess that the ElementTree version should be a good step forward as is.
Comment 15•14 years ago
|
||
FYI, Arpad filed bug 584836 to get an option for full json-ess to the pushlog web apis to be used by tbpl. Synergies arise.
Assignee | ||
Comment 16•14 years ago
|
||
Attachment #465795 -
Flags: review?(bhearsum)
Comment 17•14 years ago
|
||
Comment on attachment 465795 [details] [diff] [review] Faster parse_changes using cElementTree Yay, faster and more readable!
Attachment #465795 -
Flags: review?(bhearsum) → review+
Assignee | ||
Comment 18•14 years ago
|
||
Comment on attachment 465795 [details] [diff] [review] Faster parse_changes using cElementTree changeset: 881:4c226a8f9804
Attachment #465795 -
Flags: checked-in+
Assignee | ||
Comment 19•14 years ago
|
||
Looks like it's working
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
You need to log in
before you can comment on or make changes to this bug.
Description
•