Closed Bug 557871 Opened 14 years ago Closed 14 years ago

Compare PHP vs. Python sphinx queries

Categories

(addons.mozilla.org Graveyard :: Search, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: clouserw, Assigned: davedash)

References

Details

(Whiteboard: [need:livepages][qa-])

A lot of queries time out when we run them from the zamboni API pages. This might indicate that more complex/slower queries are being built. Can you compare the speed of the queries you're running in remora vs zamboni?

Also, is there some sort of locking in sphinx? Despite 1 second timeouts (I'm still not convinced these are actually working), sphinx can't respond fast enough, and apache queries build up until apache dies. If sphinx has something locked and all the other queries are waiting on it, that would help explain this.

This is blocking any further launches, since we can't have apache crashing on us all the time, so I'm marking it as a P1. If we don't find anything with the remora/zamboni comparison, we can schedule a call with the sphinx guys.
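One way to approach the remora vs zamboni comparison requested above is a small timing harness. This is only a sketch in Python 3 (not the Python 2.6 used elsewhere in this bug); `remora_query` and `zamboni_query` are hypothetical stand-ins that would need to be replaced with the real query calls.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Return the median wall-clock seconds over `repeats` calls of fn()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Hypothetical stand-ins: replace with the real remora and zamboni queries.
def remora_query():
    return sum(range(1000))


def zamboni_query():
    return sum(range(100000))


timings = {
    "remora": time_call(remora_query),
    "zamboni": time_call(zamboni_query),
}
for name, seconds in sorted(timings.items()):
    print("%s: %.6f s" % (name, seconds))
```

Taking the median rather than the mean keeps one slow outlier (for example a cold cache) from dominating the comparison.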
After profiling the sphinx functionality, it's pretty clear that the python sphinx library is the cause of all this. The script below, with one query, takes about 4 seconds to run and keeps the CPU fully utilized most of the time. You'll notice, for example, that it calls ord almost 500k times and does 6k memcache lookups for one query.


How to profile:
python26 -m cProfile -s calls test_script.py

Script used (thanks clouserw):

from manage import settings
from search.client import Client as SearchClient, SearchError

query = 'test'
limit = 500  # maximum number of results to request


def go():
    """Run one search query and print the number of results."""
    sc = SearchClient()

    opts = {}

    try:
        results = sc.query(query, limit, **opts)
        print "Results: %s" % len(results)
    except SearchError:
        print "Fail! (could not connect to sphinx)"


if __name__ == "__main__":
    go()
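The same kind of profile can also be collected from inside Python instead of with `-m cProfile`. The sketch below uses Python 3 and the standard `cProfile` and `pstats` modules; `work()` is a hypothetical stand-in for `go()`, chosen only because it also calls `ord` many times.

```python
import cProfile
import io
import pstats


def work():
    # Hypothetical stand-in for go(); it calls ord() once per character.
    return sum(ord(c) for c in "test" * 1000)


profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Sort by call count, like "-s calls", and keep only the top 5 entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("calls").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by call count surfaces functions that are cheap individually but called very often, which is the pattern described for `ord` and the memcache lookups.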
wtf... thanks for this data - is this in my library or in sphinxapi.py?
At the end of search we do a big query to get all the add-ons.  That query gets cached like all our queries, and caching 500 objects (the limit in the script) takes a while.

bug 545460 will let us skip the cache for these queries, and bug 557941 will make our memcache library almost 2x faster.

We should then test whether it's more efficient to do a big query that never gets cached or a bunch of id lookups that will cache well.
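The trade-off between the two strategies can be illustrated with a toy model. This is a sketch under stated assumptions, not the zamboni code: `CountingDB`, `fetch_bulk_uncached` and `fetch_by_id_cached` are hypothetical names, the cache is a plain dict rather than memcache, and the model counts database queries only, ignoring the cost of each query and of the cache itself.

```python
class CountingDB:
    """Hypothetical in-memory stand-in for the add-ons table; counts queries."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def get_many(self, ids):
        self.queries += 1
        return [self.rows[i] for i in ids]

    def get_one(self, addon_id):
        self.queries += 1
        return self.rows[addon_id]


def fetch_bulk_uncached(db, ids):
    # One big query per search, never cached.
    return db.get_many(ids)


def fetch_by_id_cached(db, cache, ids):
    # One lookup per id; only cache misses reach the database.
    results = []
    for addon_id in ids:
        if addon_id not in cache:
            cache[addon_id] = db.get_one(addon_id)
        results.append(cache[addon_id])
    return results


rows = {i: "addon-%d" % i for i in range(500)}
# Three overlapping searches, each returning 300 add-on ids.
searches = [list(range(0, 300)), list(range(200, 500)), list(range(100, 400))]

bulk_db = CountingDB(rows)
for ids in searches:
    fetch_bulk_uncached(bulk_db, ids)

cached_db, cache = CountingDB(rows), {}
for ids in searches:
    fetch_by_id_cached(cached_db, cache, ids)

print("bulk queries:", bulk_db.queries)
print("per-id queries:", cached_db.queries)
```

The bulk strategy always issues one query per search, while the per-id strategy pays for every miss up front and nothing once the cache is warm. Which one wins in practice depends on how much searches overlap and on the real per-query cost, which is what the test proposed above would measure.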
Whiteboard: [need:livepages]
Right now in my search branch we are doing the latter: a bunch of id lookups that will cache well. I'm fairly confident that this is the best bet. Doing it all in one query would save 20-100 queries, but we generate 200-1000 queries overall, so it's not a huge savings. Not enough to justify the DB hit.

I ran cProfile in my environment with the cache set to local and it seemed alright. Otherwise, most of the time was spent doing mysql-ish things.
Status: NEW → ASSIGNED
Depends on: 557941
So we landed the search branch to pull addons singly - that should use the cache more efficiently.

There's a bug for IT to do some kernel-based virtualization (https://bugzilla.mozilla.org/show_bug.cgi?id=556177), and we're waiting on IT to give us an ETA.

Meanwhile, better caching and other changes should help us diagnose this problem with more clarity.
Summary:  I don't think this is blocking 5.10 from launching.
We have a bunch of new code, and the latest code is orders of magnitude faster (180+ r/s). If we still have this problem once that code hits, we'll investigate more.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [need:livepages] → [need:livepages][qa-]
Product: addons.mozilla.org → addons.mozilla.org Graveyard