investigate ways of improving the performance of bugmail composition

RESOLVED FIXED

Status

()

bugzilla.mozilla.org
General
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: Ehsan, Assigned: glob)

Tracking

Production

Details

Attachments

(3 attachments)

(Reporter)

Description

5 years ago
This is based on anecdotal evidence, but I think at least within the last month, bmo has gotten slower and slower when processing modifications to bug.  This includes making changes to bugs who have a large number of CCed people, and to day to day changes to bugs where few people are CCed.  I just noticed that I waited for 10+ seconds for me to make a bug as INVALID.  :(
(Assignee)

Comment 1

5 years ago
Created attachment 747280 [details]
response times (bug creation and updating only)

looking at the data for our response times for bug creating and updating, we've been trending pretty flat since the move to scl3.
(Assignee)

Comment 2

5 years ago
ehsan,

next time this happens, can you please make a note of the exact date and time, as well as the bug you're updating?  bonus points for providing information from firebug/web-console with regards to timings.
Flags: needinfo?(ehsan)
(Reporter)

Comment 3

5 years ago
Created attachment 748886 [details]
Network panel screenshot of the 8 second delay

Hehe, this is very easy to reproduce.  I just have to do _something_ in bugzilla.  Here's an example.
Flags: needinfo?(ehsan)
(Assignee)

Comment 4

5 years ago
i'll also need the exact date and time from that screenshot, as well as the bug you're updating.  i want to correlate it with the server-side data.
Flags: needinfo?(ehsan)
(Assignee)

Updated

5 years ago
OS: Mac OS X → All
Hardware: x86 → All
(Assignee)

Comment 5

5 years ago
fwiw, what i see via firefox locally matches what i see server-side; generally just over 4 seconds to perform a POST.
(Reporter)

Comment 6

5 years ago
(In reply to comment #4)
> i'll also need the exact date and time from that screenshot, as well as the bug
> you're updating.  i want to correlate it with the server-side data.

https://bugzilla.mozilla.org/show_activity.cgi?id=815643

ehsan@mozilla.com 	2013-05-13 12:20:48 EDT 	Blocks 		865241
Flags: needinfo?(ehsan)
(Reporter)

Comment 7

5 years ago
(In reply to comment #5)
> fwiw, what i see via firefox locally matches what i see server-side; generally
> just over 4 seconds to perform a POST.

On a good run, yes.  I'm complaining about the worse runs getting slower recently.
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #6)
> (In reply to comment #4)
> > i'll also need the exact date and time from that screenshot, as well as the bug
> > you're updating.  i want to correlate it with the server-side data.
> 
> https://bugzilla.mozilla.org/show_activity.cgi?id=815643
> 
> ehsan@mozilla.com 	2013-05-13 12:20:48 EDT 	Blocks 		865241

Sigh. Adding/removing dependencies will always inherently be slower than normal bug operations due to the fact that two or more bugs are being updated and all the possible notifications that are being sent. Especially if alot of people are Cc'ed on any of the bugs affected and also the amount of people watching the components, etc. 

Glob and I discussed moving the email processing out of the core code and letting jobqueue deal with it could help some in speeding up bug modifications.

dkl
(Assignee)

Comment 9

5 years ago
thanks eshan.

[13/May/2013:09:20:48 -0700] "POST /process_bug.cgi HTTP/1.1" 200 210383 "https://bugzilla.mozilla.org/show_bug.cgi?id=815643" 7457962

that's 7.5 seconds :|

this resulted in the generation of 58 emails (34 on bug 815643, 24 on bug 865241).

i suspect it's the email generation portion which is taking the bulk of this time.  i'll crunch some more numbers.
Assignee: nobody → glob
(Reporter)

Comment 10

5 years ago
(In reply to comment #8)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #6)
> > (In reply to comment #4)
> > > i'll also need the exact date and time from that screenshot, as well as the bug
> > > you're updating.  i want to correlate it with the server-side data.
> > 
> > https://bugzilla.mozilla.org/show_activity.cgi?id=815643
> > 
> > ehsan@mozilla.com 	2013-05-13 12:20:48 EDT 	Blocks 		865241
> 
> Sigh. Adding/removing dependencies will always inherently be slower than normal
> bug operations due to the fact that two or more bugs are being updated and all
> the possible notifications that are being sent. Especially if alot of people
> are Cc'ed on any of the bugs affected and also the amount of people watching
> the components, etc. 

That's not the case for the bug that I updated, is it?
(Reporter)

Comment 11

5 years ago
(In reply to comment #9)
> thanks eshan.
> 
> [13/May/2013:09:20:48 -0700] "POST /process_bug.cgi HTTP/1.1" 200 210383
> "https://bugzilla.mozilla.org/show_bug.cgi?id=815643" 7457962
> 
> that's 7.5 seconds :|
> 
> this resulted in the generation of 58 emails (34 on bug 815643, 24 on bug
> 865241).
> 
> i suspect it's the email generation portion which is taking the bulk of this
> time.  i'll crunch some more numbers.

If you manage to take out sending emails from the critical path of bugzilla sending a response page, I will owe you a bottle of your favorite drink.  :-)
(Assignee)

Comment 12

5 years ago
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #11)
> If you manage to take out sending emails from the critical path of bugzilla
> sending a response page, I will owe you a bottle of your favorite drink.  :-)

note it's the initial composition of the email which is happening inline; encryption and actual delivery of the email happens in a different process.
Summary: Bugzilla is getting slower and slower → investigate ways of improving the performance of bugmail composition
(Assignee)

Comment 13

5 years ago
Created attachment 749320 [details] [diff] [review]
patch v1

on my development system it takes 0.02 seconds per email to construct and push into the job-queue, so before we optimise we need to measure where the slow points are on production (which appears to take much longer than this).

this patch adds instrumentation to the bugmail process to create simple timing logs.
Attachment #749320 - Flags: review?(dkl)
Comment on attachment 749320 [details] [diff] [review]
patch v1

Review of attachment 749320 [details] [diff] [review]:
-----------------------------------------------------------------

Instead of having everywhere

$instrument && $instrument->blah();

How bout we just have an $instrument->enabled method that takes a true/false value to turn on/off logging.
Each of the other methods would return right away if needed

    return if !$self->enabled;

Buuuut....I know this will be removed after the timings are recorded :) so otherwise works (with nits)....r=dkl

::: Bugzilla/Instrument.pm
@@ +16,5 @@
> +use Time::HiRes qw(time);
> +
> +sub new {
> +    my ($class, $name) = @_;
> +    my $filename = bz_locations()->{datadir} . "/$name/";

not used

@@ +67,5 @@
> +    if (!$_path) {
> +        $_path = bz_locations()->{datadir} . "/timings";
> +        mkdir($_path) unless -d $_path;
> +        $_host = hostname();
> +        $_host =~ s/^([^\.]+).+/$1/;

I had to change to 

        $_host =~ s/^([^\.]+).*/$1/;

otherwise it cut off the last letter of the hostname.
Attachment #749320 - Flags: review?(dkl) → review+
(Assignee)

Comment 15

5 years ago
(In reply to David Lawrence [:dkl] from comment #14)
> How bout we just have an $instrument->enabled method that takes a true/false
> value to turn on/off logging.

i wrote it this way because method calls in perl are very expensive (http://stackoverflow.com/a/3259215/953), so calling a method to immediately return in most cases would add unnecessary overhead.
(Assignee)

Comment 16

5 years ago
Committing to: bzr+ssh://bjones%40mozilla.com@bzr.mozilla.org/bmo/4.2/
modified Bugzilla/BugMail.pm
added Bugzilla/Instrument.pm
Committed revision 8814.

i'll get this pushed to production tomorrow, let it run for a day, then remove it during thursday's push.
(Assignee)

Comment 17

5 years ago
removed:
Committing to: bzr+ssh://bjones%40mozilla.com@bzr.mozilla.org/bmo/4.2/
modified Bugzilla/BugMail.pm
Committed revision 8818.
(Assignee)

Comment 18

5 years ago
we captured the timings for the generation of 1990 emails.

the average time to generate each email was 0.0669 seconds, which initially seems reasonable.  however some changes triggered a lot of bugmail, with the winner being a single change that triggered 173 emails taking 12.16 seconds to process.

as expected, the bulk of the time is spent in template processing - 82% of the time to generate an email is waiting for all three templates (header, text, and html).  while most recipients are receiving text-only bugmail, generating the html part takes nearly twice as long as generating the text part.

i realised that we're generating duplicate content in many cases -- if two users are members of the exact same groups, then the body content would be identical and doesn't need to be regenerated.  this shouldn't be a significant change, and will save on average 0.0216 seconds per email (30%).  i've filed bug 877053 to implement that, and it's a change that i hope to get into upstream bugzilla (depending on the scope of changes required).

the real fix is to defer template based generation, and let that happen in the background process.  this is much more work, filed as bug 877078.  due to the focused nature of serialisation, it's doubtful that will be palatable upstream.

i'm going to mark this as fixed as the investigations are done, and we have plans to address this pain point both in the short and medium terms.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
(Reporter)

Comment 19

5 years ago
Thanks a lot for this investigation, it's really great that we now have a plan to attack this problem!
You need to log in before you can comment on or make changes to this bug.