live changes made to download day for db load



11 years ago
8 years ago


(Reporter: xb95, Assigned: xb95)


Firefox Tracking Flags

(Not tracked)



(1 attachment)



11 years ago

I had to make a change to the Download Day site live to ensure that we would live through the night and tomorrow.

Summary, this code is fundamentally flawed:

   $obj = get_from_memcache
   if ( ! $obj ) {
      $obj = load_from_database
      cache_in_memcache( $obj, 60 )
   return $obj

Notably, in the web environment, every 60 seconds you are guaranteed at least 1 and at our scale more likely dozens or hundreds of those "load_from_database" calls are going to be made.

Given that you're caching the query because it is pretty heavy (GROUP BY on 1.5 million rows = non-trivial, relatively speaking) having the query run dozens to hundreds of times every time it expires is a recipe for disaster.  Indeed, the database has buckled a few times today under the weight of these queries.  Sadly I didn't really get around to investigating it until late at night.

There are several ways to get around this problem.  I believe I suggested a few weeks ago having a cron job that populates the memcache key.  Well, tonight I enacted that change, but I went slightly different.  First up:

   CREATE TABLE pledge_counts (
      country_code CHAR(2) NOT NULL PRIMARY KEY,

Then, I created this cron:

   0,5,10,15,20,25,30,35,40,45,50,55 * * * * root
      echo "REPLACE INTO pledge_counts (pledges, country_code)
            SELECT COUNT(pledges.country_code) AS 'pledges',
                   countries.country_code FROM countries
              LEFT JOIN pledges
                     ON pledges.country_code = countries.country_code
               GROUP BY countries.country_code"
      | mysql sfx_v2

(Except imagine it without the whitespace which I added for, hopefully, readability.)  Net result: every 5 minutes this REPLACE INTO is done which does the original query Download Day was doing and then puts the results in the table.

I then made a patch to Download Day, as attached, to use this new table.

The net result of these actions is that Download Day loads its values from a stable table that is really fast to query and the table is updated every few minutes.  This reduces database load to a mere fraction, prevents runaway queries, and is scalable out.  Sure, ideally we would ensure only one process is stuffing data into memcache, but it's near irrelevant with how fast the query is now.  It should only happen a few times instead of hundreds.

At any rate, I'm sorry to butt in like this, but I felt the change couldn't wait for a member of webdev to be around for.  At this point, I have NOT committed this to SVN or anything.  I would appreciate someone reviewing it and then committing it (or I can commit it) just to keep things on the up-and-up as much as possible.


Comment 1

11 years ago
Created attachment 325373 [details] [diff] [review]
Download Day pledge_counts patch
Component: Webdev → Download Day
OS: Mac OS X → All
Product: → Webtools
QA Contact: webdev → download-day
Hardware: PC → All
Assignee: nobody → mark


11 years ago
Attachment #325373 - Flags: review?(rdoherty)
Comment on attachment 325373 [details] [diff] [review]
Download Day pledge_counts patch

Gotta r- this because at 10am we'll stop using this table. We should take the curl request from models/downloads.class.php and cron it instead.
Attachment #325373 - Flags: review?(rdoherty) → review-
Pledge counts are no longer shown, but this idea should be used in the future. Closing.
Last Resolved: 11 years ago
Resolution: --- → WONTFIX


8 years ago
Component: Download Day → Download Day
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.