Closed
Bug 372795
Opened 17 years ago
Closed 16 years ago
Implement Bugzilla::Product::preload() to speed up query.cgi when there are many products
Categories
(Bugzilla :: Bugzilla-General, enhancement)
Tracking
()
RESOLVED
FIXED
Bugzilla 3.2
People
(Reporter: mkanat, Assigned: mkanat)
References
(Blocks 1 open bug)
Details
(Keywords: perf)
Attachments
(4 files, 7 obsolete files)
I'm doing some testing of Bugzilla with 2000 products that have 0-25 components, versions, and target milestones each. We spend a lot of time doing a SELECT id from components, milestones, and versions, and then generating the objects for each product. Instead, I think at least Bugzilla::Product's new_from_list should have a { preload => 1 } option, that would call a certain function to preload data by calling a single SQL query for the whole list, or a small number of SQL queries for the whole list. This is probably generally useful in Bugzilla::Object for other situations like this (say, for lists of Bug or User objects).
Assignee | ||
Updated•17 years ago
|
Status: NEW → ASSIGNED
Summary: Bugzilla::Object should be able to preload certain data to speed things up → Bugzilla::Product::new_from_list should be able to preload certain data to speed things up
Assignee | ||
Comment 1•17 years ago
|
||
Okay, in my highly unscientific testing, this seems to offer a pretty good speed-up. bkor, is there any way that you could test this out on your 3.0 test installation and see if it makes any difference? (Particularly if it makes a difference when you remove your local preload hack.) Note that this actually has some of bug 372531 in it, and I didn't really separate it out. I'm not saying that this is the exact patch I want to put in, just that I want to see how it improves things on a real machine (not on landfill, which is pretty heavily loaded and not a good perf testing system).
Attachment #257489 -
Flags: review?(bugzilla-mozilla)
Assignee | ||
Updated•17 years ago
|
Summary: Bugzilla::Product::new_from_list should be able to preload certain data to speed things up → Bugzilla::Product::new_from_list should preload certain data to speed things up
Comment 2•17 years ago
|
||
Performance stats: pristine non-cached: 2162 queries, 2.754979 seconds 4.13user 0.33system 0:10.17elapsed 43%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (29major+7885minor)pagefaults 0swaps pristine cached (5th run): 2162 queries, 1.780918 seconds 4.04user 0.24system 0:05.26elapsed 81%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+7914minor)pagefaults 0swaps after patch non-cached: 144 queries, 0.996881 seconds 2.83user 0.22system 0:07.89elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (29major+7774minor)pagefaults 0swaps after patch cached (5th run): 144 queries, 0.280068 seconds 2.92user 0.13system 0:03.16elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+7802minor)pagefaults 0swaps
Comment 3•17 years ago
|
||
Comment on attachment 257489 [details] [diff] [review] v1 PS: I tested this against latest Bugzilla HEAD, with InnoDB, etc. >Index: Bugzilla/Field.pm >-=item C<match> >- >-=over >- >-=item B<Description> >- >-Returns a list of fields that match the specified criteria. >- >-You should be using L<Bugzilla/get_fields> and >-L<Bugzilla/get_custom_field_names> instead of this function. There is still some references to match in this POD. Suggest to leave above warning in. I did not see a POD for match in Bugzilla/Object.pm (r-)
Attachment #257489 -
Flags: review?(bugzilla-mozilla) → review-
Assignee | ||
Comment 4•17 years ago
|
||
Okay, here's a newer version. This is much simpler, now that Bugzilla::Object already has match(). This also adds a small optimization to get_all that I figure is harmless to keep.
Attachment #257489 -
Attachment is obsolete: true
Attachment #290547 -
Flags: review?(bugzilla-mozilla)
Assignee | ||
Comment 5•17 years ago
|
||
There was an extremely strange taint error that is solved by this seemingly unimportant code change.
Attachment #290547 -
Attachment is obsolete: true
Attachment #290556 -
Flags: review?(bugzilla-mozilla)
Attachment #290547 -
Flags: review?(bugzilla-mozilla)
Comment 6•17 years ago
|
||
Is this bug critical enough from a performance point of view to block 3.2? Think about installations with thousands of products, e.g. Mandriva and their old Bugzilla installation (32,000 products where everything was slow). To be sure: with this patch, the memory consumption won't be worse than what we currently have, right (again with thousands of products in mind)?
Flags: blocking3.2?
Keywords: perf
Assignee | ||
Comment 7•17 years ago
|
||
(In reply to comment #6) > Is this bug critical enough from a performance point of view to block 3.2? Yes, it should have even gone into 3.0, probably. > To be sure: with this patch, the memory consumption won't be worse than what we > currently have, right (again with thousands of products in mind)? Right, it will be the same.
Flags: blocking3.2? → blocking3.2+
Comment 8•17 years ago
|
||
Performance data: pristine non-cached: 2114 queries, 1.534358 sec runtime 1.24user 0.05system 0:06.46elapsed 20%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (36major+10403minor)pagefaults 0swaps pristine cached: 2114 queries, 0.319684 sec runtime 1.23user 0.03system 0:01.41elapsed 90%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+10439minor)pagefaults 0swaps patched, non-cached: 95 queries, 1.346977 sec runtime 0.97user 0.05system 0:05.96elapsed 17%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (36major+10262minor)pagefaults 0swaps patches, cached: 95 queries, 0.076401 sec runtime 0.94user 0.02system 0:01.00elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+10299minor)pagefaults 0swaps
Comment 9•17 years ago
|
||
btw, 'only' 337 products. Should probably make a script to generate a much larger installation based upon the statistical data in my installation. dbiprof also shows: 0.026800 seconds for: SELECT id,value,product_id FROM versions WHERE product_id IN (?,?, [etc]) ORDER BY id but also: 0.005405 seconds for: SELECT products.id,products.name,products.classification_id,products.description,products.milestoneurl,products.disallownew,products.votesperuser,products.maxvotesperbug,products.votestoconfirm,products.defaultmilestone FROM products WHERE id IN (?,?, [etc]) ORDER BY name this for the patched cached case. Since 2007-04-08, the cached case now only produces 95 queries (24 unique ones -- some are called multiple times). Dbiprof doesn't show the SQL for 27 queries.
Comment 10•17 years ago
|
||
Note: I always tested using the command line, without being logged in. When logged in, I see the following: non-patched cached for show_bug.cgi (I tested query.cgi above): 347 queries, 0.081518 seconds 0.46user 0.02system 0:00.54elapsed 90%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+8243minor)pagefaults 0swaps patched cached for show_bug.cgi: 332 queries, 0.141313 seconds (less queries, but they take more time) 0.52user 0.02system 0:00.62elapsed 88%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+9136minor)pagefaults 0swaps I believe the cause is get_selectable_products from Bugzilla/User.pm. This now retrieves all components (etc), while before it would only (inefficiently) retrieve when needed.
Comment 11•17 years ago
|
||
(In reply to comment #10) > I believe the cause is get_selectable_products from Bugzilla/User.pm. This now > retrieves all components (etc), while before it would only (inefficiently) > retrieve when needed. This is annoying. When do we use components, versions and milestones of a list of products versus when don't we care about them? query.cgi uses them, but show_bug.cgi doesn't. What about other pages?
Comment 12•17 years ago
|
||
I'll note: * CGI script * number of queries * query time change (pristine -> patched). * 'time' request.cgi (using default params -- do_union etc): Speedup (already used components) 751 queries -> 82 queries 0.128219 secs -> 0.077958 secs 0.56user 0.02system 0:00.66elapsed -> 0.54user 0.02system 0:00.60elapsed page.cgi?id=release-notes.html No change (same # of queries) editproducts.cgi?action=edit&product=nautilus Slowdown 143 queries -> 145 queries 0.041898 secs -> 0.097888 secs 0.30user 0.01system 0:00.34elapsed -> 0.35user 0.02system 0:00.42elapsed editproducts.cgi (shows Classification selection) Slowdown 140 -> 142 0.021653 secs - > 0.079241 secs 0.29user 0.01system 0:00.32elapsed -> 0.33user 0.03system 0:00.39elapsed buglist.cgi (My Bugs query) Similar 165 -> 163 0.452663 -> 0.457684 (same unchanged queries took 0.002 longer.. too close to call) I'll check some more.
Comment 13•17 years ago
|
||
Data from GNOME Bugzilla (showing 1k+ hits, hiding bgo specific CGI scripts) 94727 show_bug.cgi 27285 attachment.cgi 23299 buglist.cgi 5829 showdependencytree.cgi 3159 query.cgi 2214 enter_bug.cgi 1837 show_activity.cgi 1047 page.cgi attachment.cgi?id=36779 No change attachment.cgi?id=36779&action=diff No change show_activity.cgi?id=163163 Slight speedup 142 -> 141 0.014744 -> 0.013627 0.26user 0.01system 0:00.29elapsed -> 0.27user 0.01system 0:00.29elapsed editgroups.cgi Slight slowdown, not caused by preload (more the other changes) 116->114 0.010542 -> 0.012989 editclassifications.cgi?action=edit&classification=Desktop Speedup 128 -> 130 0.024016 -> 0.012237
Comment 14•17 years ago
|
||
Comment on attachment 290556 [details] [diff] [review] v3 Works well, but I'd like to have keep the current show_bug.cgi performance. Any way to avoid preload from slowing down show_bug.cgi?
Attachment #290556 -
Flags: review?(bugzilla-mozilla) → review-
Assignee | ||
Comment 15•17 years ago
|
||
(In reply to comment #13) What really matters is query.cgi, for this patch, though. (In reply to comment #14) > (From update of attachment 290556 [details] [diff] [review]) > Works well, but I'd like to have keep the current show_bug.cgi performance. Any > way to avoid preload from slowing down show_bug.cgi? We could add a parameter to Product::new_from_list and Bug::get_selectable_products, { preload => 1 }, and only do the preload if it's specified.
Comment 16•17 years ago
|
||
(In reply to comment #15) > (In reply to comment #14) > > (From update of attachment 290556 [details] [diff] [review] [details]) > > Works well, but I'd like to have keep the current show_bug.cgi performance. Any > > way to avoid preload from slowing down show_bug.cgi? > > We could add a parameter to Product::new_from_list and > Bug::get_selectable_products, { preload => 1 }, and only do the preload if it's > specified. For show_bugs.cgi, could we still preload the product list (and not the components)? Reason is that I'd like to have the "New" link in the header section use YUI to hierarchically display all the products and would not want to reload the product list for both the header and the show_bugs' "product" pulldown menu.
Assignee | ||
Comment 17•17 years ago
|
||
(In reply to comment #16) > For show_bugs.cgi, could we still preload the product list (and not the > components)? There's no preloading required, there. Perhaps you misunderstood the patch?
Comment 18•17 years ago
|
||
(In reply to comment #17) > (In reply to comment #16) > > For show_bugs.cgi, could we still preload the product list (and not the > > components)? > > There's no preloading required, there. Perhaps you misunderstood the patch? > Yup, I misunderstood the patch. Had to do a special caching for my particular need.
Assignee | ||
Comment 19•17 years ago
|
||
Okay, here's a version that only preloads for get_selectable_products. Because get_selectable_products already takes an optional parameter, it was too complex, for this patch, to add a parameter to get_selectable_products. This should generally improve the state of things with query.cgi and other things that use get_selectable_products. Also, it should generally improve things in other parts of Bugzilla, since I've changed match() from doing two queries to doing one query.
Attachment #290556 -
Attachment is obsolete: true
Attachment #298823 -
Flags: review?(bugzilla-mozilla)
Assignee | ||
Comment 20•17 years ago
|
||
Oops, that patch didn't work, but this one does. :-)
Attachment #298823 -
Attachment is obsolete: true
Attachment #298824 -
Flags: review?(bugzilla-mozilla)
Attachment #298823 -
Flags: review?(bugzilla-mozilla)
Assignee | ||
Comment 21•17 years ago
|
||
I realized I'd forgotten to use the new sql_in method.
Attachment #298824 -
Attachment is obsolete: true
Attachment #298838 -
Flags: review?(bugzilla-mozilla)
Attachment #298824 -
Flags: review?(bugzilla-mozilla)
Comment 22•16 years ago
|
||
Latest patch speeds up query.cgi. No real change for show_bug.cgi. However, editproducts.cgi is still affected. editproducts.cgi?action=edit&product=nautilus Slowdown 143 queries -> 145 queries 0.041476 secs -> 0.127691 secs time shows: 0.33 user -> 0.41 user Should we care?
Comment 23•16 years ago
|
||
What are the stats for query.cgi, (buglist.cgi) and show_bug.cgi / process_bug.cgi?
Assignee | ||
Comment 24•16 years ago
|
||
Right now I only care about query.cgi. Let's file a separate bug for editproducts.cgi, since that's also going to have the problem of "lots of HTML output".
Assignee | ||
Comment 25•16 years ago
|
||
Comment on attachment 298838 [details] [diff] [review] v5 Now that we have timing data from bkor, would you do the code review on this, LpSolit?
Attachment #298838 -
Flags: review?(bugzilla-mozilla) → review?(LpSolit)
Comment 26•16 years ago
|
||
Comment on attachment 298838 [details] [diff] [review] v5 >Index: Bugzilla/Object.pm > sub match { my $id = $class->ID_FIELD; my $table = $class->DB_TABLE; Both variables are no longer needed. >+ push(@terms, "$field IN ($qmarks)"); You have to use $dbh->sql_in(). >Index: Bugzilla/Product.pm >+sub new_from_list { >+ my $self = shift; >+ my (undef, $params) = @_; >+ $params ||= {}; >+ my $products = $self->SUPER::new_from_list(@_); This is confusing. I spent some time to understand the goal of undef here. I would rather see: my ($self, $prod_ids, $params) = @_; my $products = $self->SUPER::new_from_list($prod_ids); So that we have a chance to understand which is which. >+sub preload { >+ foreach my $obj (@$objects) { >+ my $product_id = $obj->{product_id}; Milestone, version and component objects all have a public ->product_id() method. Use it. >+ next if !exists $prods{$product_id}; This can never happen as you call match() with a defined list of product IDs. So this line should go away. >+ $prods{$product_id}->{"${field}s"} ||= []; >+ push(@{$prods{$product_id}->{"${field}s"}}, $obj); This annoys me a bit. If you can Bugzilla::Product->new_from_list($prod_ids, {preload => 1}) twice in the same script (or in a subroutine called by the script), then all fields are populated twice. We should detect that they are already populated, and preload() should return immediately, IMO.
Attachment #298838 -
Flags: review?(LpSolit) → review-
Comment 27•16 years ago
|
||
Using a script I checked the timings for the latest patch. Note: * 'auth' means the cgi script ran under a valid login (important for e.g. process_bug.cgi) * Each cgi script was ran 5 times, the timings are for the 5th run * Did not flush MySQL buffers / file/vm cache Suggest to use copy/pasting to ease comparison (e.g. compare auth-patched to auth).
Updated•16 years ago
|
Attachment #306920 -
Attachment mime type: text/plain → text/csv
Comment 28•16 years ago
|
||
Comment on attachment 298838 [details] [diff] [review] v5 >Index: Bugzilla/User.pm >+ $self->{selectable_products} = >+ Bugzilla::Product->new_from_list($prod_ids, { preload => 1 }); Note that the reason some pages are slower to load is because $user->can_see_product() calls get_selectable_products() which in turn unconditionally calls new_from_list() with preload = 1. I think the caller should decide if it wants to preload the data or not.
Assignee | ||
Comment 29•16 years ago
|
||
No, I'm against putting it too far up in the call chain, because that adds complexity. If there's a *significant* performance impact, I'd consider it, but for minor performance differences I don't think the complexity would be justified.
Comment 30•16 years ago
|
||
The timings from the .csv as a chart. I had to redo the perf test to generate these charts. However, figures should be more or less the same. Unfortunately I had to customize/hack pycha for this. I want to have some standard chart thing for Python that I don't need to hack. Colors need a bit of work too, plus the font sizes (pycha limitation). Best is to compare the 1st tick on the x axis with the 2nd, then the 3rd with the 4th. this way you compare anon with anon-patched and auth with auth-patched.
Comment 31•16 years ago
|
||
(In reply to comment #29) > complexity. If there's a *significant* performance impact, I'd consider it As shown in bkor's chart, editproducts.cgi is 4-5 times slower. That's a significant perf regression.
Assignee | ||
Comment 32•16 years ago
|
||
Hey bkor. What does the Y axis mean? Also, what's select_classification? LpSolit points out that, seemingly, editproducts.cgi is slower, but are we talking a difference of 0.1 seconds to 0.5 seconds? I'm not so concerned about the performance of admin pages with lots of products--they already perform badly just because of how much HTML is generated by them, which is a totally separate issue.
Comment 33•16 years ago
|
||
Y axis is in seconds. Note that my machine is pretty fast (Duo CPU E6750 @ 2.66GHz, 4GB memory). select_classification is editproducts.cgi without any arguments. This brings up a page to select the classification.
Comment 34•16 years ago
|
||
I think allowing calls of the form $user->get_selectable_products($class_id, {preload => 0}) is not a big deal. If the 2nd argument is undefined, then use preload => 1. This brings no complexity at all and can save us from useless SQL queries.
Assignee | ||
Comment 35•16 years ago
|
||
I bypassed the whole architecture question by forcing people to manually call preload() when they need it.
Attachment #298838 -
Attachment is obsolete: true
Attachment #311925 -
Flags: review?(LpSolit)
Comment 36•16 years ago
|
||
Comment on attachment 311925 [details] [diff] [review] v6 Seems to work fine, and the performance benefit should be the same as in the previous patch. My previous review comments still stand (except for new_from_list, which is no longer present in Product.pm). r=LpSolit with my previous review comments addressed.
Attachment #311925 -
Flags: review?(LpSolit) → review+
Updated•16 years ago
|
Summary: Bugzilla::Product::new_from_list should preload certain data to speed things up → Implement Bugzilla::Product::preload() to speed things up with many products
Assignee | ||
Comment 37•16 years ago
|
||
Okay, here's the version with your other review comments addressed, which were very good. Carrying forward r+.
Attachment #311925 -
Attachment is obsolete: true
Attachment #311967 -
Flags: review+
Assignee | ||
Comment 38•16 years ago
|
||
Checking in query.cgi; /cvsroot/mozilla/webtools/bugzilla/query.cgi,v <-- query.cgi new revision: 1.181; previous revision: 1.180 done Checking in Bugzilla/Object.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/Object.pm,v <-- Object.pm new revision: 1.23; previous revision: 1.22 done Checking in Bugzilla/Product.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/Product.pm,v <-- Product.pm new revision: 1.26; previous revision: 1.25 done
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: approval+
Resolution: --- → FIXED
Summary: Implement Bugzilla::Product::preload() to speed things up with many products → Implement Bugzilla::Product::preload() to speed up query.cgi when there are many products
Comment 39•16 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•