Closed Bug 641633 Opened 14 years ago Closed 14 years ago

One long bzapi query blocks server

Categories

(Tree Management Graveyard :: OrangeFactor, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mcote, Unassigned)

Details

(Whiteboard: [sg:nse][ws:dos])

If you do a silly query like "NOT bug_id:655366" (which will cause woo_server to ask bzapi for the details on more than 600,000 bugs; please don't do this on brasstacks :), the entire server will freeze and all OF pages will block. In theory, something somewhere should be multithreaded/multiprocess/whatever to prevent one long query from blocking the entire site...
As this is a potential DOS, marking security sensitive. Please fix as soon as possible, even if it is just a band-aid to ensure we don't try to query and return 60,000 bugs.
Group: core-security
Whiteboard: [sg:nse][ws:dos]
Shouldn't this be filed against BzAPI? Fixing this on the brasstacks end seems to solve one instance of the problem, and leave the real bug in BzAPI open for the world to exploit...
Yeah, I was just gonna say, I think this has been blown out of proportion (or, at least, the problem is elsewhere). There are two problems here:

1. A long query blocks orangefactor. After some investigation, I think our server is too naive; we need to fix that, ideally by getting WSGI working once and for all on brasstacks (everything else is proxied or FastCGI).
2. It's possible to ask orangefactor to return an arbitrarily long list of bugs. So you can go to, say, Bug Count, and effectively ask it to return the details of 600 000+ bugs (although woo_server only asks for a few pieces of info on each bug).

Problem 1 is local to orangefactor and should (and will) be fixed. I guess you could call it a DOS attack because someone doing a long query could block orangefactor for everyone. However, it can't take down the whole server, and in fact this bug means you can only do one (albeit long) query on bzAPI at a time.

Fixing problem 1 means problem 2 will become more important, since more than one long query will then be able to run at the same time. But as Ehsan notes, this isn't a problem with orangefactor itself. Orangefactor provides a window to bzAPI, but anyone can replicate this by doing bzAPI queries directly, or for that matter just going to bugzilla.mozilla.org and typing "NOT bug_id:1" in the quicksearch box. If we're worried about a lot of people doing queries that return lots of data, then, as Ehsan said, the problem is with bzAPI and/or Bugzilla, not orangefactor, since you can repro this problem without orangefactor.
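To illustrate problem 1, here is a minimal sketch (not woo_server's actual code) using only the Python standard library. It compares a single-threaded HTTP server with a threaded one while a slow request is in flight. The `/slow` and `/fast` paths, `SLOW_SECONDS`, and the helper names are all made up for the demo; the sleep stands in for a long bzapi query.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer

SLOW_SECONDS = 1.0  # assumed stand-in for a very long bzapi query


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/slow":
            time.sleep(SLOW_SECONDS)  # simulate waiting on bzapi
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def get(port, path):
    """Issue one GET against the local demo server and return the body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    conn.request("GET", path)
    body = conn.getresponse().read()
    conn.close()
    return body


def fast_request_latency(server_cls):
    """Start a slow request, then time a fast one against the same server."""
    server = server_cls(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    slow = threading.Thread(target=get, args=(port, "/slow"))
    slow.start()
    time.sleep(0.2)  # let the slow request reach the server first

    start = time.monotonic()
    get(port, "/fast")
    elapsed = time.monotonic() - start

    slow.join()
    server.shutdown()
    server.server_close()
    return elapsed


if __name__ == "__main__":
    print("single-threaded:", round(fast_request_latency(HTTPServer), 2), "s")
    print("threaded:       ", round(fast_request_latency(ThreadingHTTPServer), 2), "s")
```

With `HTTPServer` the fast request has to wait for the slow one to finish; with `ThreadingHTTPServer` each request gets its own thread, so the fast request returns right away. A multi-process setup behind FastCGI or WSGI gets a similar effect by different means.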
Should we move this bug to BzAPI as it is, then?
Well, part 1 of this bug is still valid w.r.t. orangefactor. I guess the rest can be moved, but we would need to carefully consider exactly *what* the problem is. Should queries that would return a huge number of bugs not be executed at all? Or should they return an error if it appears that too many bugs will be returned? In fact, as I mentioned, this isn't even specific to bzAPI. You can even do a crazy search like this directly in Bugzilla: go to the main page and type in "-bug_id:641592" (strangely, entering "NOT bug_id:641592" returns Zarro Boogs, not sure why). I didn't bother waiting, but it certainly didn't return quickly, so I imagine it is trying to compile a list of all 600 000+ bugs that aren't 641592.
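One of the options raised here, returning an error when a query would match too many bugs, could be sketched roughly as follows. This is only an illustration: `count_matches` and `fetch_details` are hypothetical callables standing in for whatever the backend provides, and the `MAX_BUGS` value is an assumption, not a setting from orangefactor or bzAPI.

```python
MAX_BUGS = 10_000  # assumed cap; pick whatever the server can afford


class TooManyBugsError(Exception):
    """Raised when a query would return more bugs than the cap allows."""


def fetch_bug_details(query, count_matches, fetch_details, max_bugs=MAX_BUGS):
    """Run a cheap count first and refuse the expensive fetch if it is too big.

    count_matches(query) -> int and fetch_details(query) -> list are
    hypothetical hooks supplied by the caller.
    """
    matches = count_matches(query)
    if matches > max_bugs:
        raise TooManyBugsError(
            f"query matches {matches} bugs; the limit is {max_bugs}"
        )
    return fetch_details(query)
```

This only helps if counting matches is much cheaper than fetching details, which would need to be checked against the real backend.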
The BzAPI server has a limited number of processes per API endpoint (as each one consumes memory). For API endpoints pointed at the production server, there are 10 processes. AIUI, each will wait for the request it is serving to finish. So if you manage to block them all with long-running requests, everyone else will have to wait.

I can't significantly up this number without trying to get more RAM on the box, and it already has 4GB (which may be a limit, because I _think_ it's 32-bit Ubuntu). I could perhaps double it by reducing the number of processes for endpoints other than /latest (which most people seem to use).

Bugzilla 4.2 will have a limit of 10,000 bugs returned in a query, and results are paged. The release notes for unstable 4.1.1 say:

* By default, searches now only return 500 results. (You can click a link to see more.) Searches may also now never return more than 10,000 results.

http://bugzillaupdate.wordpress.com/2011/03/14/bugzilla-4-1-1-development-release/

Gerv
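The limits quoted from the release notes (500 results by default, never more than 10,000) amount to clamping the requested result count and paging through what remains. A rough sketch of that arithmetic is below; it is not Bugzilla's implementation, and the function names and the offset/size paging scheme are assumptions made for illustration.

```python
DEFAULT_LIMIT = 500  # default number of results, per the 4.1.1 release notes
MAX_LIMIT = 10_000   # hard ceiling, per the same notes


def effective_limit(requested=None):
    """Clamp a requested result count to the range [1, MAX_LIMIT]."""
    if requested is None:
        return DEFAULT_LIMIT
    return max(1, min(requested, MAX_LIMIT))


def page_offsets(total_matches, page_size=DEFAULT_LIMIT):
    """Return (offset, size) pairs covering at most MAX_LIMIT results."""
    capped = min(total_matches, MAX_LIMIT)
    return [
        (offset, min(page_size, capped - offset))
        for offset in range(0, capped, page_size)
    ]
```

Under these assumptions, a query matching 600,000 bugs would be served as 20 pages of 500 results and nothing beyond the first 10,000 matches.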
This is still an issue, albeit not a serious one. Unfortunately the framework we're using doesn't like FastCGI, and all attempts at getting brasstacks's webserver to play nicely with WSGI have failed. We might be able to get some sort of proxy going. When we switch to a newer, better brasstacks, this should all be a nonissue, whenever that happens.
Priority: -- → P1
Fixed by switching to a proper back end (bug 658078). nginx now serves the static files itself, and we use a proper FastCGI connection to serve the REST API. I verified that you can still load the site while a very long bugzilla query is running in another tab.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: Testing → Tree Management
Group: core-security
Product: Tree Management → Tree Management Graveyard