Closed Bug 731672 Opened 12 years ago Closed 12 years ago

Disable bugzilla wikimo extension

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

x86
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ashish, Unassigned)

References

Details

The bugzilla wiki.m.o extension has the potential to degrade BMO's performance if used incorrectly. For example, invoking it with no id queries BMO for *all* bugs, placing heavy load on the database. A few such requests can easily slow the db to a crawl, affecting BMO as a whole.

The issue came up in Bug 731623 and Bug 731630. Bug 731664 is the corresponding bug for BMO to prevent such abusive queries. Until either the extension or BMO is fixed, please disable the extension so that it doesn't cause further issues.
I have disabled the extension.

CC'ing some folks that might be able to work on the extension itself.
Sort of.... that comment was specifically concerned with the effects these queries would have on wikimo. In addition to that, poorly-formed queries also result in problems on *Bugzilla's* end. That escalates this from a "breaks your own wiki page" problem to a "makes Bugzilla slow for everyone" problem.

In addition, that comment more or less asserts that it shouldn't be a problem in the short term due to minimal initial usage, and that we can educate users about the pitfalls. It definitely did become a problem today, on multiple pages.


There is a separate bug open to change Bugzilla's behavior to deny this particular sort of global query, which should help in that respect. However, that won't help for queries that are not global, just very very broad.

It is my feeling that mediawiki-bugzilla perhaps should do some sort of sanity-checking of the queries it sends to Bugzilla... both for the health of Bugzilla, and functionality of the wikimo page itself.

Obviously the extension can't know how many bugs a given search might return from bugzilla, but I think it could make some reasonable guesses. For instance, these patterns are likely to be very broad:

* specifying no limiting constraints at all ("return all bugs")
* specifying only a product
* specifying only a status
* specifying a product and component, but not a status (perhaps allowable with some sort of override mechanism?)


These changes might not eliminate the problem entirely - we can still get overly-broad queries resulting in blank pages and high Bugzilla load - but they would eliminate the biggest offenders.
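A minimal sketch of the kind of sanity-checking proposed above. The field names and the set of "limiting" constraints are hypothetical, chosen to match the patterns listed; the real extension is PHP, and this Python is purely illustrative:

```python
# Hypothetical pre-flight check: reject query parameter sets that are
# likely to be unbounded before sending them to Bugzilla.
RISKY_LIMITING_FIELDS = {"product", "component", "status", "keywords",
                         "whiteboard", "id"}

def is_query_too_broad(params):
    """Return True if the query matches one of the known-risky patterns."""
    present = RISKY_LIMITING_FIELDS & set(params)

    if not present:                          # no constraints: "return all bugs"
        return True
    if present == {"product"}:               # only a product
        return True
    if present == {"status"}:                # only a status
        return True
    if present == {"product", "component"}:  # product + component, no status
        return True
    return False
```

This would not catch every expensive query (as noted later in the thread, a narrow-looking component can still contain thousands of bugs), but it would refuse the obvious worst cases up front.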


I think we have legitimate reason to be concerned here: this hasn't been installed for very long, and even with relatively minimal usage it's already caused significant problems, for users and IT.
Your argument makes sense, and you have a valid point.

It would seem that the correct place to resolve this issue is in the BMO application. Attempting to teach the extension which queries might harm BMO would be difficult at best. Also, since the purpose of an API is to abstract away the inner workings of an application and allow outside access, the application should have sanity checks on the data being requested.

Also, I believe that we ought to revise the way that the table requests data, perhaps making an AJAX request after page load to prevent time out issues, or spawning a separate process that can be monitored. I'll investigate these options.
(In reply to Brandon Savage [:brandon] from comment #4)
> It would seem that the correct place to resolve this issue is in the BMO
> application.

BMO shouldn't block queries because they may return a large number of results.  there's a lot of legitimate reasons for making broad requests.
is it possible for the wiki extension to avoid querying bmo for every pageload?
ie. don't update the data if it was last updated within the last X minutes?
The extension already caches the results for an extended period of time.

Why shouldn't an application protect itself against people requesting too much data from its API? I don't know of any application (Foursquare, Twitter, Facebook) that would let users make requests that could potentially take down the service.
(In reply to Brandon Savage [:brandon] from comment #7)
> The extension already caches the results for an extended period of time.

excellent :)  how long?

> Why shouldn't an application protect itself against people requesting too
> much data from it's API?

there's a few factors which make things more complicated

there are legitimate reasons for running searches that result in large resultsets from the normal bugzilla ui.

the api you're using, bzapi, isn't actually part of bugzilla itself; it's a proxy in front of it which we don't directly control, and it hits the normal buglist.cgi query.  only the latest version presents itself with an identifiable user-agent, making rate limiting difficult.

it's also difficult to know how many bugs will be returned without executing the query, which is the act that is causing the impact on the systems.


i've asked for a copy of the apache access logs, which i'll be able to use to see what requests are actually being made against bmo and at what frequency.

i suspect that the main cause of issue is the open-ended query, which you can't perform with the web ui, but was easy to accidentally craft with the wiki plugin.
(In reply to Jake Maul [:jakem] from comment #3)
> In addition, that comment more or less asserts that it shouldn't be a
> problem in the short term due to minimal initial usage, and that we can
> educate users about the pitfalls. It definitely did become a problem today,
> on multiple pages.

It seems there was a lot of pent-up demand for this integration. A number of people have been in touch about setting up pages that include several queries. A lot of people are experimenting right now, which is likely the cause of many of the malformed/broad queries being submitted to Bugzilla.

> I think we have legitimate reason to be concerned here: this hasn't been
> installed for very long, and even with relatively minimal usage it's already
> caused significant problems, for users and IT.

Yes. We need to address these issues. Given the uptake in this short amount of time and the value that this extension has already had for the release management, program management, and security teams I think we really need to do what we can to reenable this functionality as soon as we can.

(In reply to Brandon Savage [:brandon] from comment #4)
> Also, I believe that we ought to revise the way that the table requests
> data, perhaps making an AJAX request after page load to prevent time out
> issues, or spawning a separate process that can be monitored. I'll
> investigate these options.
Is there a bug for this investigation? Can we send multiple queries in parallel - without taking down Bugzilla?

(In reply to Byron Jones ‹:glob› from comment #5)
> BMO shouldn't block queries because they may return a large number of
> results.  there's a lot of legitimate reasons for making broad requests.
Are there known restrictions on the number of concurrent queries that Bugzilla can handle? Can we create a Bugzilla mirror that is specific for REST API requests? Is there specific client validation in the Web interface that the API or extension can/should implement?
> > BMO shouldn't block queries because they may return a large number of
> > results.  there's a lot of legitimate reasons for making broad requests.
>
> Are there known restrictions on the number of concurrent queries that
> Bugzilla can handle?

no, however it's a very hard question to answer because different queries place different loads on different parts of the system. 

> Can we create a Bugzilla mirror that is specific for REST API requests?

that's really a question for infrastructure.  it seems like it would be less work to add some checks to the wiki extension than to deploy another bmo cluster.

> Is there specific client validation in the Web interface that the API or extension
> can/should implement?

sorry, i don't know what you're asking.
(In reply to Byron Jones ‹:glob› from comment #10)
> 
> > Can we create a Bugzilla mirror that is specific for REST API requests?
> 
> that's really a question for infrastructure.  it seems like it would be less
> work to add some checks to the wiki extension that deploy another bmo
> cluster.

Agreed. Just looking for options in case we need them.

> 
> > Is there specific client validation in the Web interface that the API or extension
> > can/should implement?
> 
> sorry, i don't know what you're asking.

I was referring to the last part of your comment 8, that the web ui does not allow an open-ended query. Are there other restrictions that the web ui places on queries that we can perhaps duplicate in the extension?
(In reply to Lawrence Mandel [:lmandel] from comment #11)
> I was referring to the last part of your comment 8, that the web ui does not
> allow an open-ended query. Are there other restrictions that the web ui
> places on queries that we can perhaps duplicate in the extension?

oh, right :)  nope, that's the only one we impose, and it's now applied to bzapi initiated requests too.
OK. So at least this angle is covered and requires no changes in the extension.
In my experience, the standard way to get around API queries that could take down a service is to limit and paginate by default.

I have no idea how bzapi works, but it seems like that would solve most of these problems. Of course, I think it would require a new backwards-incompatible version of the API.
I'm also not sure how we could limit this to known "good" queries. I could say find everything in the "Foo" component, and that could be 1 bug one day and thousands of bugs the next. I could say "everything changed in the last day", and in the mornings and on weekends that would be a quick query, but in the evenings on weekdays it would be fairly expensive.
> (In reply to Brandon Savage [:brandon] from comment #4)
> > Also, I believe that we ought to revise the way that the table requests
> > data, perhaps making an AJAX request after page load to prevent time out
> > issues, or spawning a separate process that can be monitored. I'll
> > investigate these options.
> Is there a bug for this investigation? Can we send multiple queries in
> parallel - without taking down Bugzilla?

MediaWiki has a job queue which the extension uses and I have code lying around here somewhere to do all the queries via ajax using https://github.com/harthur/bz.js FWIW. Not sure either would help
(In reply to Christian Legnitto [:LegNeato] from comment #16)
> MediaWiki has a job queue which the extension uses and I have code lying
> around here somewhere to do all the queries via ajax using
> https://github.com/harthur/bz.js FWIW. Not sure either would help

I pulled the job queue out early on in my development. The plugin was not loading data immediately, causing the first request to appear to fail. However, if using the job queue would improve the extension's stability I can reenable it.
Ah, yeah. I'm not sure it would help quite frankly, and could even hurt (more surges / concurrent queries).

We could also do stuff like:

* Store the amount of time the query took with the cache record, multiply that by some constant refresh time to make it so longer queries are refreshed less often

* Only request the ids for every query, and then padinate on the client and via ajax pull in detailed information as needed
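The first idea could be sketched roughly as follows. The constants are invented for illustration; the real extension is PHP and stores its cache in MediaWiki's database:

```python
BASE_TTL_S = 300       # the extension's current 5-minute default
TTL_PER_QUERY_S = 60   # hypothetical constant: each second of query time
                       # buys the cached result an extra minute of life

def cache_ttl(query_duration_s):
    """Slower queries get a longer TTL, so they are refreshed less often."""
    return BASE_TTL_S + query_duration_s * TTL_PER_QUERY_S

def is_fresh(stored_at, query_duration_s, now):
    """True if a cache row stored at `stored_at` is still within its TTL."""
    return (now - stored_at) < cache_ttl(query_duration_s)
```

Under this scheme, the 8-second autoland-style query mentioned later in the thread would be refreshed at most every 13 minutes instead of every 5.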
*paginate, sigh. Long day :-P
(In reply to Christian Legnitto [:LegNeato] from comment #18)
> * Only request the ids for every query, and then paginate on the client and
> via ajax pull in detailed information as needed

unfortunately this won't help with the load on bmo.

bzapi calls buglist.cgi to get the search results, and always requests all columns, so the database still has to do all the heavy lifting even if you only request the bug's id (see http://hg.mozilla.org/webtools/bzapi/file/90ea2d3966bf/lib/Bugzilla/API/Model/Bug.pm#l68)

pagination on sorted results doesn't relieve the amount of work the database has to do either :(
I've only just found this bug. If changes are needed to BzAPI behaviour to improve matters, let me know. 

If it had a significant effect on Bugzilla load, we could look at limiting the columns requested in some common cases. The problem is that we have to support custom fields, which have arbitrary names, so we can't necessarily hard-code a column list for everything. Still, if someone just wanted a list of IDs, or some other common pattern, we can detect that.

Which columns lead to extra joins and extra work?

Gerv
(In reply to Byron Jones ‹:glob› from comment #20)
> (In reply to Christian Legnitto [:LegNeato] from comment #18)
> > * Only request the ids for every query, and then paginate on the client and
> > via ajax pull in detailed information as needed
> 
> unfortunately this won't help with the load on bmo.
> 
> bzapi calls buglist.cgi to get the search results, and always requests all
> columns, so the database still has to do all the heavy lifting even if you
> only request the bug's id (see
> http://hg.mozilla.org/webtools/bzapi/file/90ea2d3966bf/lib/Bugzilla/API/
> Model/Bug.pm#l68)
> 
> pagination on sorted results doesn't relieve the amount of work the database
> has to do either :(

So if the same amount of work will be done by BMO regardless of what we request or how we request it, then it doesn't sound like much can be done on the extension's side to improve, reduce or otherwise help solve this issue. What am I missing here?

Obviously we can prevent users from making some of the more challenging queries (like All Bugs), but we can't accurately predict the total number of bugs returned (e.g. All Firefox Bugs), so we need to find some way to reduce the workload on Bugzilla to manageable levels.

@Byron, have you had success getting the access logs yet? I'm keen to know exactly what is causing long request times against Bugzilla.
(In reply to Brandon Savage [:brandon] from comment #22)
> So if the same amount of work will be done by BMO regardless of what we
> request or how we request it, then it doesn't sound like much can be done on
> the extension's side to improve, reduce or otherwise help solve this issue.
> What am I missing here?

the search criteria has an impact, but not what columns you request.

my comment was in response to "only request the ids.." .. only requesting the ids won't make anything better.

> Byron, have you had success getting the access logs yet? I'm keen to know
> exactly what is causing long request times against Bugzilla.

no (bug 731994).

(In reply to Gervase Markham [:gerv] from comment #21)
> I've only just found this bug. If changes are needed to BzAPI behaviour to
> improve matters, let me know. 

sorry gerv, we're still in the discovery phase here; i didn't want to suggest bzapi changes until we know they would help.

> Which columns lead to extra joins and extra work?

i'm waiting on the logs in order to determine exactly what queries were made -- if you have logs from bzapi that would help a lot too.

there were massive loads on the bmo db slaves (the load was over 30 for an extended period of time).

we know for certain that during this time requests were being made without any search criteria.  on my dev system a single open-ended request is enough to cause it much pain, so a few refreshes of a page which is triggering these requests would quickly be detrimental to bmo's health.  this hole has already been closed in bug 731664.

what is unclear is whether these requests alone were the cause of the issues; this is a question i'm hoping the logs will provide some answers to.
The BzAPI logs I have store both the full queries being sent to BzAPI and the corresponding URL sent to Bugzilla. The logs run to about 42MB per day, uncompressed. If you were able to give me a date and time range, plus an endpoint (/0.9, /1.0 and/or /latest - what does the extension use?) I could extract the logs for you.

I don't store originator IP information in my logs; I may be able to get data out of the webserver logs, and you may be able to correlate it if you want to find requests specifically sent by the MediaWiki extension.

Gerv
This query:

https://bugzilla.mozilla.org/buglist.cgi?keywords=sec-review-needed&keywords_type=allwords&status_whiteboard=%5Bsecr%3Acurtisk%5D&status_whiteboard_type=substring&columnlist=all&ctype=csv

got executed 852 times in 15 hours on Feb 27th. 

Was this the mediawiki extension, or does curtis have some personal script which has gone wild? :-)

Gerv
I believe this is the extension. I used this query
<bugzilla>
{
"keywords":  "sec-review-needed",
"whiteboard": "[secr:curtisk]",
"whiteboard_type": "contains"
}
</bugzilla>

on this page https://wiki.mozilla.org/Platform/2012-02-28

And was testing this query
<bugzilla type="count" display="bar">
    {
        "keywords":"sec-review-needed",
	"whiteboard":"[secr:curtisk]",
	"whiteboard_type":"contains",
        "x_axis_field":"whiteboard"
    }
</bugzilla>
on this page https://wiki.mozilla.org/Security/Radar/test

The first query is to find all bugs that have unscheduled security reviews. The page it is on is hit hard during the meeting, so that could be part of it.

The second query is a test page of mine I have not publicized to test out some features of the extension so I can learn how it works. Neither of these seems overly broad to me.
(In reply to Gervase Markham [:gerv] from comment #25)
> This query:
> 
> https://bugzilla.mozilla.org/buglist.cgi?keywords=sec-review-
> needed&keywords_type=allwords&status_whiteboard=%5Bsecr%3Acurtisk%5D&status_w
> hiteboard_type=substring&columnlist=all&ctype=csv
> 
> got executed 852 times in 15 hours on Feb 27th. 

That seems excessive given that the extension has caching built in. A cached entry is currently set to live for 5 minutes. This query should only be executed a maximum of ~180 times (there may be a few more requests if there are concurrent page loads) in 15 hours. Is the cache functioning correctly?

If we increase the cache time to 15 minutes and provide a refresh option (so that one person on the secreview meeting and similar other meetings can refresh) that should further cut down on the number of queries hitting Bugzilla.

Further questions are:
1. Does this query produce significant load on Bugzilla? 
2. Is the issue with the large number of queries or, as Byron has suggested, is the issue with a few very broad queries?
I suggest we add a separate bzapi call that includes pagination. That would avoid problems with BC and solve this issue if the extension moved to using it.
(In reply to Laura Thomson :laura from comment #28)
> I suggest we add a separate bzapi call that includes pagination?  That would
> avoid problems with BC and solve this issue if the extension moved to using
> it.

pagination doesn't do anything to relieve load placed on the database server -- it still has to fetch all rows, sort them, and only then return a limited subset.

also bzapi is a proxy in front of bmo -- when you perform a search using bzapi, you're hitting a service on a community server, which then executes a normal buglist.cgi query.  as bugzilla doesn't support pagination, changing bzapi won't help the load placed on bmo.  adding pagination to bugzilla is non-trivial, and won't help to alleviate db load issues.
(In reply to Gervase Markham [:gerv] from comment #24)
> The BzAPI logs I have store both the full queries being sent to BzAPI and
> the corresponding URL sent to Bugzilla. The logs run to about 42MB per day,
> uncompressed. If you were able to give me a date and time range, plus an
> endpoint (/0.9, /1.0 and/or /latest - what does the extension use?) I could
> extract the logs for you.

the ext was enabled on 2012-02-21 pst, not sure of the exact time.
disabled 2012-02-29 pst (comment 2).
i believe it uses /latest
(In reply to Lawrence Mandel [:lmandel] from comment #27)
> 
> That seems excessive given that the extension has caching built in. A cached
> entry is currently set to live for 5 minutes. This query should only be
> executed a maximum of ~180 times (there may be a few more requests if there
> are concurrent page loads) in 15 hours. Is the cache functioning correctly?

Taking a look at the configuration in the repo, caching is disabled by default. I'll patch it to change this to enabled by default.
Additionally, something to do with "autoland" is doing the following regexp search over the status whiteboard twice a minute, every minute, 24 hours a day:

Feb 28 07:54:11 [INFO] https://api-dev.bugzilla.mozilla.org/latest/bug/?whiteboard=\[autoland.*\]&whiteboard_type=regex&include_fields=id,whiteboard&username=release@mozilla.com&password=XXXX GET => https://bugzilla.mozilla.org/buglist.cgi?status_whiteboard=%5C%5Bautoland.*%5C%5D&status_whiteboard_type=regexp&columnlist=all&ctype=csv&Bugzilla_login=release%40mozilla.com&Bugzilla_password=XXXX

Each time it does it, it takes Bugzilla 8 seconds to return results. Perhaps we could encourage the maintainers of that system to back off a little, or switch to a keyword?

Also, this sort of query would be prime for some code to reduce the number of columns asked for, if we thought that would reduce the load on the database server.

Gerv
(In reply to Brandon Savage [:brandon] from comment #32)
> (In reply to Lawrence Mandel [:lmandel] from comment #27)
> > 
> > That seems excessive given that the extension has caching built in. A cached
> > entry is currently set to live for 5 minutes. This query should only be
> > executed a maximum of ~180 times (there may be a few more requests if there
> > are concurrent page loads) in 15 hours. Is the cache functioning correctly?
> 
> Taking a look at the configuration in the repo, caching is disabled by
> default. I'll patch it to change this to enabled by default.

Good find. This should alleviate some of the pain. Let's start by enabling the cache. If need be we can look to tweak the cache time (currently set to 5 minutes) after we see the effect of simply enabling it.

Gerv/Byron - Do you have any more information about the specific queries that are executed and the execution times so that we can provide more guidance on queries that should not be used or, if need be, provide some level of filtering? Are there any notable outliers that take excessive time to complete?
The caching issue has been fixed on master.

It's my typical practice to disable things like caching, especially where they might write to a user's database. Users should get to decide for themselves if they use a particular feature. Since this plugin was meant to be released to the general public as well as used internally, I made the assumption that IT would review the configuration to determine the optimal conflagration for their environment. I would recommend that we do this again before we reenable this plugin.

I also recommend we up the cache time to ten minutes for our implementation. Sure, you might miss a few bugs but this plugin isn't supposed to be a replacement for Bugzilla itself.
(In reply to Gervase Markham [:gerv] from comment #33)
> Additionally, something to do with "autoland" is doing the following regexp
> search over the status whiteboard twice a minute, every minute, 24 hours a
> day:
> 
> Feb 28 07:54:11 [INFO]
> https://api-dev.bugzilla.mozilla.org/latest/bug/?whiteboard=\[autoland.
> *\]&whiteboard_type=regex&include_fields=id,
> whiteboard&username=release@mozilla.com&password=XXXX GET =>
> https://bugzilla.mozilla.org/buglist.cgi?status_whiteboard=%5C%5Bautoland.
> *%5C%5D&status_whiteboard_type=regexp&columnlist=all&ctype=csv&Bugzilla_login
> =release%40mozilla.com&Bugzilla_password=XXXX
> 
> Each time it does it, it takes Bugzilla 8 seconds to return results. Perhaps
> we could encourage the maintainers of that system to back off a little, or
> switch to a keyword?
> 
> Also, this sort of query would be prime for some code to reduce the number
> of columns asked for, if we thought that would reduce the load on the
> database server.
> 
> Gerv

I'm contacting the autoland team about the possibility of switching to use Pulse as a source for this data, which seems like a service perfectly aligned with their use-case.
(In reply to Brandon Savage [:brandon] from comment #35)
> I also recommend we up the cache time to ten minutes for our implementation.
> Sure, you might miss a few bugs but this plugin isn't supposed to be a
> replacement for Bugzilla itself.

I also think that upping the cache time to ten minutes makes a lot of sense. One of the primary uses of this plug-in is in meeting status pages. In many cases this category of page will be hit by a large number of people over a ten-minute period as the meeting starts. Not only will increasing the cache time reduce the load, but it should also ensure that everyone is receiving the same information. Unless there are any objections, I will just make this change at this point.

Jake - Are you the right person to provide an IT review of the configuration?
(In reply to Brandon Savage [:brandon] from comment #35)
> It's my typical practice to disable things like caching, especially where
> they might write to a user's database. Users should get to decide for
> I made the
> assumption that IT would review the configuration to determine the optimal
> conflagration for their environment. 

It seems to me that "optimal conflagration" is a good description of what happened... :-)

I'm working on improvements to BzAPI so it's smarter about which columns it requests, which might well ease the load.

Gerv
(In reply to Gervase Markham [:gerv] from comment #38)
> I'm working on improvements to BzAPI so it's smarter about which columns it
> requests, which might well ease the load.

From Byron's comment 20, I don't think changing the columns that BzAPI requests will have any impact on the load.
Comment 20 was about _you_ telling BzAPI which columns you want. He is right that, without changes to BzAPI, that doesn't help. But if you do that, and I then make BzAPI smarter so that it requests less data from Bugzilla, then that will definitely help with the load. Compare the number of table joins in these two queries:

All columns:
https://bugzilla.mozilla.org/buglist.cgi?keywords=sec-review-needed;keywords_type=allwords;status_whiteboard=[secr%3Acurtisk];status_whiteboard_type=substring;columnlist=all;list_id=2515990&debug=1

One specified column (bug_id):
https://bugzilla.mozilla.org/buglist.cgi?keywords=sec-review-needed;keywords_type=allwords;status_whiteboard=[secr%3Acurtisk];status_whiteboard_type=substring;columnlist=bug_id;list_id=2515990&debug=1

Gerv
I see. Thanks for the clarification. The extension currently displays id, summary, status, and priority. We can limit to these columns by default.
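A sketch of what limiting the extension's requests to those four displayed columns might look like. The `/bug` search endpoint and `include_fields` parameter follow BzAPI's documented field-control mechanism, but the helper itself is hypothetical (the real extension is PHP):

```python
from urllib.parse import urlencode

# The four columns the extension currently renders, per the comment above.
DISPLAYED_FIELDS = ["id", "summary", "status", "priority"]

def build_search_url(base, criteria):
    """Build a BzAPI search URL that asks only for the displayed fields."""
    params = dict(criteria)
    params["include_fields"] = ",".join(DISPLAYED_FIELDS)
    return base + "/bug?" + urlencode(params)

url = build_search_url("https://api-dev.bugzilla.mozilla.org/latest",
                       {"keywords": "sec-review-needed"})
```

Whether this actually reduces database work depends on the BzAPI-side changes Gerv describes, since today BzAPI requests all columns from buglist.cgi regardless.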
I would like the ability to pick columns explicitly (have not figured that out). Is caching or throttling of how often not able to be governed when larger sets are requested?
Sorry, Curtis, you lost me :-) But the field control stuff for BzAPI is here:
https://wiki.mozilla.org/Bugzilla:REST_API#Field_Control

Gerv
I can review the config to see if it makes sense. I'm looking at the bottom of Bugzilla.php, in the "Default Settings" section. Let me know if there's anywhere else to look... I didn't see a "config.inc.php" or similar.

The copy I have in staging (wiki.allizom.org) and in prod already has this set:

$wgBugzillaUseCache    = TRUE;

However there is also this:

$wgCacheObject = 'BugzillaCacheDummy';

There is no description of what this setting does, so I can't tell if that's a sane value or not.


How does the caching work? I ask because there are 6 web servers in the cluster that serves wiki.mozilla.org... when it moves to PHX1 in the next couple months, it will be about *100* servers. Unless the cache is in Bugzilla or some separate shared thing (file cache on an NFS volume, memcache, redis, etc), it will be sub-optimal.
It writes to the database. However, I wrote the cache to be easily changed to something else. I can write a memcache option for you if you need.
(In reply to Curtis Koenig [:curtisk] from comment #42)
> I would like the ability to pick columns explicitly (have not figured that
> out). Is cacheing or throttling of how often not able to be goverened when
> larger sets are requested?

The UI doesn't currently show columns you specify in include_fields (AFAIK). On the ol' TODO.

In any case, outside the scope of this bug.
(In reply to Brandon Savage [:brandon] from comment #45)
> It writes to the database. However, I wrote the cache to be easily changed
> to something else. I can write a memcache option for you if you need.

Jake - Does the current mechanism (writing to the database) work in the wikimo cluster environment? (i.e. Is there a single database?) Do you need Brandon to work on a memcache or other option for caching to be effective?
There is a single shared database, so caching there is fine. It's rather funny to me that we would query one database and cache the results into another.... I understand the difference and it makes perfectly good sense, but I still find it amusing. :)

If you want to rewrite it using Memcache, that should be fine... wiki.mozilla.org already has config for that and uses it elsewhere:
$wgUseMemCached = true;
$wgMemCachedServers = array( <servers> );

But I don't see that it matters a lot, unless we start having performance problems with the wikimo database due to this caching... which seems unlikely.


As for the DB caching that currently exists... what does it take to enable it? As far as I can tell it didn't make a table. Can we put the setup info in the README somewhere?
(In reply to Jake Maul [:jakem] from comment #48)
> As for the DB caching that currently exists... what does it take to enable
> it? As far as I can tell it didn't make a table. Can we put the setup info
> in the README somewhere?

The DB caching does require a new table be created. From the readme,

"Run the MediaWiki update script to create the cache database table php /var/lib/mediawiki/maintenance/update.php. *Note that you may need to add $wgDBadminuser and $wgDBadminpassword to /etc/mediawiki/LocalSettings.php depending on your MediaWiki version"

Do you need more details?
Ah, I must have missed that on prod and only done it on stage. The bugzilla_cache table exists now.

How is this table maintained? Specifically:

How/when are expired rows removed?
How is the 'key' generated? Are multiple servers likely to generate the same key for the same query?
(In reply to Jake Maul [:jakem] from comment #50)
> How is this table maintained? Specifically:
> 
> How/when are expired rows removed?

I don't know that any maintenance is performed to clear expired entries. As cached entries have a short life I would suggest that we can simply wipe the table every day at a set time. (Some time off peak PST.)

> How is the 'key' generated? Are multiple servers likely to generate the same
> key for the same query?

Yes. The key is generated by serializing the query parameters and creating a sha1 hash of the results. See the function _generate_id() in
https://github.com/mozilla/mediawiki-bugzilla/blob/master/BugzillaQuery.class.php
The key is generated based on the request itself. In any event, the cache is designed to REPLACE the row, so it won't error out. Since the key will be generated the same way across all servers, the single database should serve our purpose of providing cached results to all servers.

Cache entries are expired lazily, on request. If the application identifies a cached result that has expired, it removes it from the database and acts as though it was never found. The application then regenerates the data from Bugzilla and caches it again.
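A Python sketch of the scheme just described, with a dict standing in for the bugzilla_cache table. `json.dumps` with sorted keys stands in for PHP's `serialize()`, and the TTL follows the 5-minute figure mentioned earlier; names here are illustrative, not the extension's actual code:

```python
import hashlib
import json
import time

def generate_key(params):
    """Stable key: every server serializes the same query to the same bytes."""
    serialized = json.dumps(params, sort_keys=True)
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()

class QueryCache:
    """Lazy-expiry cache: expired rows are dropped by the read that finds them."""
    def __init__(self, ttl_s=300):
        self.ttl_s = ttl_s
        self._rows = {}  # key -> (stored_at, data); stands in for the DB table

    def get(self, params, now=None):
        now = time.time() if now is None else now
        key = generate_key(params)
        row = self._rows.get(key)
        if row is None:
            return None
        stored_at, data = row
        if now - stored_at >= self.ttl_s:  # expired: remove, act as a miss
            del self._rows[key]
            return None
        return data

    def put(self, params, data, now=None):
        now = time.time() if now is None else now
        # REPLACE semantics: a second server writing the same key just overwrites.
        self._rows[generate_key(params)] = (now, data)
```

Because the key is a hash of the query parameters alone, all six (or a hundred) web servers converge on the same row for the same query, as Brandon notes.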
Brandon, my understanding is that one off queries (say, while someone is designing a page) or queries that change over time (like release management queries targeting specific Firefox versions) will never get expunged with the current mechanism. Is that correct?

If there isn't much concern about the size of the table we can look to cleanup on a less frequent schedule such as weekly or monthly to ensure the table doesn't continue to grow unchecked. I'll leave it up to IT to manage the table cleanup schedule.
Excellent... sounds mostly self-maintaining. I can see how we might want to wipe out old entries occasionally (as lmandel pointed out, queries that stop getting used), but given how you've described it I can't see that being significant for many months, if not years. I expect rows would build up pretty slowly over time.


With the caching in place (set to 15 minute TTL) and the BzAPI fix disallowing open-ended queries, I'm comfortable re-enabling this again to see how it goes. I will do that now and close this bug out.


I'm definitely looking forward to future versions of BzAPI and this plugin that allow for column specification. I think that might end up helping considerably, by reducing JOIN work on the Bugzilla database servers, and on bandwidth from them to Bugzilla to BzAPI to wikimo. I don't know if there are bugs for this yet or not. It seems to me that BzAPI would need to go first before the wikimo extension could be updated to make use of column specification... unless we already know what the proper syntax would be, in which case development could be in parallel. :)
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Hmm... the table exists, but isn't getting data when I visit pages that use the extension.

Is there some other thing that needs changed to make caching work, other than this one:
$wgBugzillaUseCache    = TRUE;
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Did you also switch out the dummy cache for BugzillaCacheMysql? See the config in master for more: https://github.com/mozilla/mediawiki-bugzilla/blob/master/Bugzilla.php#L147
Ah, okay... that's what I was getting at with comment 44... I didn't know what that should be set to. With it changed I see things in the table now. Thanks!
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard