hg annotate output on hgweb is painfully slow

RESOLVED FIXED

Status

Developer Services
Mercurial: hg.mozilla.org
--
major
RESOLVED FIXED
9 years ago
3 years ago

People

(Reporter: bz, Assigned: Sid Kalra)

Attachments

(6 attachments)

See also <http://www.selenic.com/mercurial/bts/issue1310>

I finally decided to dig into the performance problem the hg annotate web
interface seemed to have on some large C++ files in the Mozilla source tree as
compared to the bonsai system we used with CVS.  The two URLs I was comparing
are
<http://hg.mozilla.org/mozilla-central/annotate/31f1081d9681/docshell/base/nsDocShell.cpp>
and
<http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/docshell/base/nsDocShell.cpp&rev=1.907>.

Some basic data:

                                 hg               bonsai
File size:                 3,215,720 bytes     1,818,120 bytes
Time to wget:                    15s               6.5s
Number of DOM nodes:           106265             42649
Time to load in browser:         250s               8s
RAM usage in browser:            75MB              25MB

The browser load times are for an uncached load.  Note that the bonsai page
includes in its smaller size and smaller number of DOM nodes not just the actual
annotated source but also the text of all checkin comments for all changes made
to the file on that CVS branch.  Also note that the wget times differ by a
factor of 2.5, while the file sizes do not.  This reflects the 4-5 second lag at
the start of the download I see with hg annotate.  While it would be nice to fix
that, the heart of the problem is really that 250s vs 8s time.  This is
especially a problem when doing VCS archeology, which requires loading the
annotate for multiple revisions of the file in quick succession.
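
For reference, the ratios implied by the table above can be worked out directly. This is only a small sketch: the figures are copied from the table, while the dictionary layout and the `ratios` helper are illustrative names, not part of any tool mentioned here.

```python
# Figures copied from the table above, as (hg, bonsai) pairs.
measurements = {
    "file size (bytes)": (3_215_720, 1_818_120),
    "wget time (s)": (15.0, 6.5),
    "DOM nodes": (106_265, 42_649),
    "browser load time (s)": (250.0, 8.0),
    "browser RAM (MB)": (75.0, 25.0),
}


def ratios(data):
    """Return the hg/bonsai ratio for each metric, rounded to two decimals."""
    return {name: round(hg / bonsai, 2) for name, (hg, bonsai) in data.items()}


for name, ratio in ratios(measurements).items():
    print(f"{name}: hg is {ratio}x bonsai")
```

The page is less than twice as large, yet the browser load time is more than thirty times longer, which is why the markup rather than the byte count looks like the culprit.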

Further discussion in the mercurial/bts report indicates that the 4-5 second lag is just due to hg annotate being about an order of magnitude or more slower than CVS annotate.  But the core of the problem is that the HTML being produced here is just ridiculous, at least if the browser wants to do incremental layout.  See analysis details in the mercurial bug.

Note that I'm marking this as "major" because it's not really blocking work, but it does increase the time to do revision system archeology from "minutes" to "hours", especially with all the backouts going on (going back through 10-20 revisions at multiple minutes apiece starts to really hurt).

The mercurial bug mentions a new skin being worked on which might be a good thing to try for a start...
Whoa... The hg version is just one HUGE table! No wonder it takes so much time and memory. It should really be a single top-level node, such as a pre or a div with font-family: monospace and white-space: pre.
Yeah.  I covered all that ground in the hg bug report cited above...
Looks like the template in question is this:

http://hg.mozilla.org/hg_templates/file/2279eae99af9/fileannotate.tmpl

Though I guess filediff should be fixed as well.  I'm not a Python guy, so I'm not sure what template library that was done in.
It's this one:
http://hg.mozilla.org/hg_templates/file/2279eae99af9/gitweb_mozilla/fileannotate.tmpl

and it has nothing to do with Python, it's Mercurial's custom template library. It kind of sucks.
So I guess these are the docs for it:
http://hgbook.red-bean.com/hgbookch11.html
Yeah, though those docs are prone to be at least a little out of date.
It's very simplistic, which is also its main fault. It's not hard to pick up, it just gets frustrating trying to do anything that isn't incredibly simple.

Comment 8

9 years ago
Would it improve things to do this instead?

Comment 9

9 years ago
Created attachment 343528 [details]
Different template not relying on table

I had forgotten the attachment.
That just changes the little header at the top, not the annotation output, as far as I can tell...  Am I missing something?
This bit is the bulk of the page:
<table cellspacing="0" cellpadding="0">
#annotate%annotateline#
</table>

The annotateline template is defined here:
http://hg.mozilla.org/hg_templates/file/2279eae99af9/gitweb_mozilla/map#l28
annotateline = '<tr style="font-family:monospace" class="parity#parity#"><td class="linenr" style="text-align: right;"><a href="#url#diff/#node|short#/#file|urlescape#{sessionvars%urlparameter}">#author|obfuscate#@#rev#</a></td><td><pre><a class="linenr" href="##lineid#" id="#lineid#">#linenumber#</a></pre></td><td><pre>#line|escape#</pre></td></tr>'

And that gets output for each line of the file being annotated.
Yeah.  That's exactly the part that needs to go away and be replaced with something sane...

Is it possible to do things like "scan down the annotate output, find the longest changeset author, and pad out all the other authors to the same length with spaces" with this templating system?
No, there is no simple way to pre-scan output in the hg templater.
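
For illustration only: the pre-scan asked about above is a two-pass operation, which is easy to express in ordinary code before the rows ever reach a template. The sketch below is not hgweb code; `pad_authors` and the `(author, rev, line)` tuple shape are hypothetical.

```python
def pad_authors(rows):
    """Right-align every 'author@rev' label to the width of the longest one.

    rows: list of (author, rev, line) tuples (a hypothetical shape).
    Returns a list of (padded_label, line) tuples.
    """
    labels = [f"{author}@{rev}" for author, rev, _ in rows]
    # First pass: find the widest label (0 for an empty file).
    width = max((len(label) for label in labels), default=0)
    # Second pass: pad every label on the left to that width.
    return [(label.rjust(width), line) for label, (_, _, line) in zip(labels, rows)]


# Made-up sample data.
rows = [
    ("bz", 1200, "int a;"),
    ("longername", 7, "int b;"),
]
for label, line in pad_authors(rows):
    print(f"{label} {line}")
```

The need for two passes over the data is exactly what a single-pass, per-line template cannot provide.
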
Created attachment 343850 [details]
An example

(In reply to comment #12)
> pad out all the other authors to the same length with spaces

I think that's overkill.  "display:inline-block; width:16ch;" is good enough
IMO, I don't think an occasional overflow is very disturbing (there is some
at the end of this file).  This example is about half the size of the
original.  We can cut it in half again if we avoid repeating the links in
the left column, those hrefs are awfully long!

The original:
http://hg.mozilla.org/mozilla-central/annotate/93111c5c69fd/layout/generic/nsBRFrame.cpp
Yes, not repeating those links (just outputting them the first line where the info changes and grouping based on that instead of line-by-line for purposes of background-coloring) would be a huge win.
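
As a rough sketch of the grouping described above (not something the hg templater is known to support), consecutive lines from the same changeset can be coalesced so that the link is emitted once per run. The `coalesce` helper, the tuple shape, and the sample node and author values are all made up for illustration.

```python
from itertools import groupby


def coalesce(rows):
    """Group consecutive lines that share a changeset.

    rows: iterable of (node, author, line) tuples (a hypothetical shape).
    Yields (node, author, [lines]) once per run of identical changesets.
    """
    for (node, author), run in groupby(rows, key=lambda r: (r[0], r[1])):
        yield node, author, [line for _, _, line in run]


# Made-up sample data: the first changeset appears in two separate runs.
rows = [
    ("aaaa1111", "alice", "line 1"),
    ("aaaa1111", "alice", "line 2"),
    ("bbbb2222", "bob", "line 3"),
    ("aaaa1111", "alice", "line 4"),
]
blocks = list(coalesce(rows))
for node, author, lines in blocks:
    print(f"{author}@{node}: {len(lines)} line(s)")
```

Only adjacent lines are merged, so a changeset that reappears later starts a new block. That is what background coloring by run would need.
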
If you can give us a patch against the hg_templates repo: http://hg.mozilla.org/hg_templates/ we can get it deployed to hg.mozilla.org pretty easily.

You can clone that repo and test the templates with your local mozilla-central clone easily by:
1) clone hg_templates
2) edit mozilla-central/.hg/hgrc to contain:
[web]
templates = /path/to/hg_templates
style = gitweb_mozilla

3) cd mozilla-central && hg serve
4) Load http://localhost:8000/ to view your repo with the Mozilla templates.
Created attachment 346909 [details] [diff] [review]
use floats

This isn't very good because it needs 3 iterations, so the processing time on the server side increases.
But it does decrease the generated file size by ~29% (tested using CSSFrameConstructor.cpp) and removes the table. I'm not actually sure whether using floats causes some other performance problems.
Should handle long user names and line numbers properly and there is one
IMO nice feature: selecting multiple lines of code doesn't select any
line numbers. Easier to copy-paste code.
But still, performance may not be good enough.
If you have the generated file, you should be able to just compare load time to bonsai and the existing file.  Using floats most likely will have some perf impact, since you have to recompute the intrinsic size on every append, unless you set them to fixed widths...

And of course there's no need for div-per-codeline here: just put in linebreaks and rely on the white-space:pre style on the container to handle it for you?
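
To make the difference concrete, here is a small sketch that counts element start tags in two renderings of the same lines: a simplified row-per-line table and a single pre block. The markup is a stripped-down stand-in for the real template (no attributes, a fixed placeholder author), the pre version carries only the code column, and all helper names are hypothetical.

```python
from html import escape
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts element start tags seen in a piece of markup."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1


def count_elements(markup):
    counter = TagCounter()
    counter.feed(markup)
    counter.close()
    return counter.count


def as_table(lines):
    """One row with three cells per source line (8 elements per line)."""
    rows = "".join(
        f"<tr><td><a>user@1</a></td><td><pre><a>{n}</a></pre></td>"
        f"<td><pre>{escape(text)}</pre></td></tr>"
        for n, text in enumerate(lines, 1)
    )
    return f"<table>{rows}</table>"


def as_pre(lines):
    """A single pre element; line breaks separate the source lines."""
    return "<pre>" + "\n".join(escape(text) for text in lines) + "</pre>"


lines = ["if (a < b) return;"] * 1000
print(count_elements(as_table(lines)), count_elements(as_pre(lines)))
```

The table variant grows by eight elements per source line while the pre variant stays at one. A real browser DOM would add text nodes and implied elements on top of this, so the counts are only indicative.
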
Created attachment 347055 [details] [diff] [review]
hide content while loading

When loading generated files locally, CSSFC-with-table takes ~20s, CSSFC-with-float ~10s, CSSFC-with-float-and-hide-content-while-loading ~5s, CVS-blame ~5s.
The div on each line is needed if we want to decorate lines.
Static files: http://mozilla.pettay.fi/moztests/hgtest/hgweb/
Hmm.  I guess it would be nice to do the coalescing that bonsai does on the decorations and line number stuff, but that seems hard in a template system.  :(

Comment 21

9 years ago
Comment on attachment 347055 [details] [diff] [review]
hide content while loading

for those of us who are silly, could you use this instead:

+  document.getElementById('loading').innerHTML = "Loading... <a href='javascript:document.getElementById(&quot;page_body&quot;).style.display = &quot;block&quot;; document.getElementById(&quot;loading&quot;).innerHTML = &quot;&quot;; void 0'>watch</a>";

(or something like that)
Assignee: nobody → Olli.Pettay
Attachment #347055 - Flags: review+
Comment on attachment 347055 [details] [diff] [review]
hide content while loading

This is better than what we have now. Please push it to the hg_templates repo. We'll file a server ops bug on getting it pushed live.
I'm still trying to tweak the patch a bit.
Having 3 loops in the template moves processing from the browser to the server.
I'm trying to at least reduce that to 2 loops.
I think we should move to using bonsai's annotation.
mxr-test.konigsberg.mozilla.org/bonsai/cvsblame.cgi now supports hg (thank you
Timeless!), and loading, for example, http://mxr-test.konigsberg.mozilla.org/bonsai/cvsblame.cgi?file=/layout/base/nsCSSFrameConstructor.cpp is
significantly faster than loading http://hg.mozilla.org/mozilla-central/annotate/f1af606531f5/layout/base/nsCSSFrameConstructor.cpp.
Hg-bonsai also has the features we (well, at least I) need. It still has some wrong links etc., but the feature set and performance are way better than what hgweb annotate has.
Loading CSSFC in hg-bonsai takes ~25s; using hgweb annotate it takes hundreds of seconds.
Created attachment 350115 [details] [diff] [review]
What I suggested with my example above.

FWIW, this makes annotate about twice as fast on my machine.
Doing this with valid HTML would be good...
(Assignee)

Comment 27

8 years ago
Created attachment 361072 [details] [diff] [review]
Improving annotate's loading time

On my machine this patch reduces hg annotate's loading time to ~7sec when testing:

http://hg.mozilla.org/mozilla-central/annotate/31f1081d9681/docshell/base/nsDocShell.cpp

This patch is an altered version of Mats Palmgren's patch. For more details view: 

http://blog.sidkalra.com/2009/02/v05-release-complete/
Attachment #361072 - Flags: review?(ted.mielczarek)
Comment on attachment 361072 [details] [diff] [review]
Improving annotate's loading time

Wow, this is way faster! 

Somehow in hg 1.1 I'm getting the full author name in the #author# field, we'll have to fix that before we move to 1.1 on hg.mozilla.org.
Attachment #361072 - Flags: review?(ted.mielczarek) → review+
Pushed to hg_templates:
http://hg.mozilla.org/hg_templates/rev/1c545b88e70b

We'll get hg.mozilla.org updated soon.
Assignee: Olli.Pettay → sid
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(In reply to comment #28)
> Somehow in hg 1.1 I'm getting the full author name in the #author# field, we'll
> have to fix that before we move to 1.1 on hg.mozilla.org.

Correct, that's one of the aforementioned template changes that we need to do for 1.1.
I fixed that before I pushed Sid's change, anyway.
Depends on: 487669
Comment on attachment 361072 [details] [diff] [review]
Improving annotate's loading time

>+div.codeauthor { 
>+    display:inline-block; 
>+	width:16ch; 
>+	text-align: right; 
>+	color:#999999; 
>+	text-decoration:none;
>+	margin-right: 25em;
>+} 
Weird indentation and spacing aside, this places a restriction on the browsers that can display this :-(
Product: mozilla.org → Release Engineering
https://hg.mozilla.org/hgcustom/version-control-tools/rev/60692ff76bec
Product: Release Engineering → Developer Services