Closed Bug 456687 Opened 12 years ago Closed 12 years ago

addons stats script unkind to databases

Categories

(addons.mozilla.org Graveyard :: Statistics, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: xb95, Assigned: morgamic)

References

Details

Attachments

(2 files, 3 obsolete files)

This query

     UPDATE download_counts SET count=count+1 WHERE addon_id=? AND date=?

needs to be done less often (i.e., +100 instead of +1 and do it 1/100th as often, etc).  It's part of the stats script and it causes large amounts of replication lag and general pain on the database slaves whenever it runs.

Please let me know if you need more information, graphs, supporting details, or whatever.  Thank you!  :)
23:18 <@nagios> tm-amo01-slave02:MySQL Replication Lag is CRITICAL:
                Replication lag: 1915 seconds

I am not intimately familiar with what exactly AMO uses slaves for, so I am not sure of the impact replication lag will have on users of this particular site.  But as a general rule it's bad and should be avoided for production slaves.

Additionally, this is causing the on-call sysadmin to be paged multiple times every night while the statistics script is running.
00:13 <@nagios> tm-amo01-slave02:MySQL Replication Lag is CRITICAL:
                Replication lag: 3116 seconds

It's paged the on-call five times so far this evening.  Will page a few more times before it catches up in an hour or two.
Blocks: 455655
Assignee: nobody → morgamic
Attached patch v1, mark's proposed patch (obsolete) — Splinter Review
I'm on the hook to review and test this change to try to move count updates to batches.
Attachment #340673 - Flags: review?(morgamic)
OS: Windows Vista → All
Hardware: PC → All
-> 4.0.2, since it is urgent
Target Milestone: --- → 4.0.2
Mark - which server has these logs?  Are you already running this in prod?
Hey -- tried logging into dm-stats01, no response.  dm-stats02 doesn't work either.  If I could grab a couple sample log files I could verify output between both versions of the script.
Nevermind, just can't get there from khan (that's annoying).  So, I've got some sample log files and am testing the patch on khan -- will post results when it's done on both versions.
Also, future reference:
https://wiki.mozilla.org/Update:Developers/Statistics

Explains how to use the scripts, etc. (thanks fligtar)
Pass 1, current revision (unpatched):
[morgamic@khan parse_logs]$ /usr/bin/php -f parse_logs.php logs=/home/morgamic/logs temp=/tmp type=downloads geo=SJ

---------- [ BEGIN ACCESS LOG PARSING FOR SJ] ----------

---------- [ Copying /home/morgamic/logs/access_2008-10-13-19.0.gz ] ----------

---------- [ Parsing /home/morgamic/logs/access_2008-10-13-19.0.gz in SJ] ----------

---------- [ Finished parsing /home/morgamic/logs/access_2008-10-13-19.0.gz ] ----------

---------- [ END ACCESS LOG PARSING ] ----------

downloads counted: 9281
        Skipped because of blacklist: 17773
        Skipped because of SJ: 0
        Skipped because of NL: 5012
        Skipped because of CN: 0

Execution time: 219.21580815315

mysql> select addon_id, count, date from download_counts where date='2008-10-13' order by count desc limit 20;
+----------+-------+------------+
| addon_id | count | date       |
+----------+-------+------------+
|     1865 |   363 | 2008-10-13 | 
|     3006 |   359 | 2008-10-13 | 
|     1368 |   253 | 2008-10-13 | 
|     3456 |   229 | 2008-10-13 | 
|      201 |   226 | 2008-10-13 | 
|      722 |   204 | 2008-10-13 | 
|     1419 |   192 | 2008-10-13 | 
|     3590 |   159 | 2008-10-13 | 
|       72 |   156 | 2008-10-13 | 
|     1843 |   154 | 2008-10-13 | 
|      138 |   153 | 2008-10-13 | 
|     2410 |   151 | 2008-10-13 | 
|      748 |   147 | 2008-10-13 | 
|       26 |   142 | 2008-10-13 | 
|      743 |   133 | 2008-10-13 | 
|     5579 |   130 | 2008-10-13 | 
|      636 |   111 | 2008-10-13 | 
|      220 |   102 | 2008-10-13 | 
|     1027 |    97 | 2008-10-13 | 
|     5791 |    91 | 2008-10-13 | 
+----------+-------+------------+
20 rows in set (0.00 sec)

mysql> select sum(count) from download_counts where date='2008-10-13';
+------------+
| sum(count) |
+------------+
|       9281 | 
+------------+
1 row in set (0.00 sec)
Comment on attachment 340673 [details] [diff] [review]
v1, mark's proposed patch

This had a syntax error in it.  Will repost a fixed version.
Attachment #340673 - Flags: review?(morgamic) → review-
Attached patch v2, without syntax error (obsolete) — Splinter Review
Attachment #340673 - Attachment is obsolete: true
Attachment #343015 - Flags: review?(morgamic)
Comment on attachment 343015 [details] [diff] [review]
v2, without syntax error

This won't work either, notices and warnings all over the place... will repost a third revision in a bit.
Attachment #343015 - Flags: review?(morgamic) → review-
Attachment #343015 - Attachment is obsolete: true
Pass 2, new patch:
[morgamic@khan parse_logs]$ /usr/bin/php -f parse_logs.php logs=/home/morgamic/logs temp=/tmp type=downloads geo=SJ

---------- [ BEGIN ACCESS LOG PARSING FOR SJ] ----------

---------- [ Copying /home/morgamic/logs/access_2008-10-13-19.0.gz ] ----------

---------- [ Parsing /home/morgamic/logs/access_2008-10-13-19.0.gz in SJ] ----------

---------- [ Finished parsing /home/morgamic/logs/access_2008-10-13-19.0.gz ] ----------

---------- [ END ACCESS LOG PARSING ] ----------

downloads counted: 9281
    Skipped because of blacklist: 17773
    Skipped because of SJ: 0
    Skipped because of NL: 5012
    Skipped because of CN: 0

Execution time: 36.570813894272


mysql> select addon_id, count, date from download_counts where date='2008-10-13' order by count desc limit 20;
+----------+-------+------------+
| addon_id | count | date       |
+----------+-------+------------+
|     1865 |   363 | 2008-10-13 | 
|     3006 |   359 | 2008-10-13 | 
|     1368 |   253 | 2008-10-13 | 
|     3456 |   229 | 2008-10-13 | 
|      201 |   226 | 2008-10-13 | 
|      722 |   204 | 2008-10-13 | 
|     1419 |   192 | 2008-10-13 | 
|     3590 |   159 | 2008-10-13 | 
|       72 |   156 | 2008-10-13 | 
|     1843 |   154 | 2008-10-13 | 
|      138 |   153 | 2008-10-13 | 
|     2410 |   151 | 2008-10-13 | 
|      748 |   147 | 2008-10-13 | 
|       26 |   142 | 2008-10-13 | 
|      743 |   133 | 2008-10-13 | 
|     5579 |   130 | 2008-10-13 | 
|      636 |   111 | 2008-10-13 | 
|      220 |   102 | 2008-10-13 | 
|     1027 |    97 | 2008-10-13 | 
|     5791 |    91 | 2008-10-13 | 
+----------+-------+------------+
20 rows in set (0.01 sec)

mysql> select sum(count) from download_counts where date='2008-10-13';
+------------+
| sum(count) |
+------------+
|       9281 | 
+------------+
1 row in set (0.00 sec)
Attached patch v3, working patch (obsolete) — Splinter Review
So, using __destruct here for some reason loses our database resource link in the database class.  If i use logFileParsedCallback() instead, it works fine.  I verified download counts working properly, but I need to go back and verify totals.  Will post those results shortly.
Attachment #343021 - Flags: review?(morgamic)
Plan to verify totals is to zero them out, run unpatched, zero them out, run patched.  I'll compare the top 20 as with download counts.
Unpatched:
mysql> select id, totaldownloads from addons order by totaldownloads desc limit 20;
+------+----------------+
| id   | totaldownloads |
+------+----------------+
| 1865 |            363 | 
| 3006 |            359 | 
| 1368 |            253 | 
| 3456 |            229 | 
|  201 |            226 | 
|  722 |            204 | 
| 1419 |            192 | 
| 3590 |            159 | 
|   72 |            156 | 
| 1843 |            154 | 
|  138 |            153 | 
| 2410 |            151 | 
|  748 |            147 | 
|   26 |            142 | 
|  743 |            133 | 
| 5579 |            130 | 
|  636 |            111 | 
|  220 |            102 | 
| 1027 |             97 | 
| 5791 |             91 | 
+------+----------------+
20 rows in set (0.03 sec)

Patched:
mysql> select id, totaldownloads from addons order by totaldownloads desc limit 20;
+------+----------------+
| id   | totaldownloads |
+------+----------------+
| 1865 |            363 | 
| 3006 |            359 | 
| 1368 |            253 | 
| 3456 |            229 | 
|  201 |            226 | 
|  722 |            204 | 
| 1419 |            192 | 
| 3590 |            159 | 
|   72 |            156 | 
| 1843 |            154 | 
|  138 |            153 | 
| 2410 |            151 | 
|  748 |            147 | 
|   26 |            142 | 
|  743 |            133 | 
| 5579 |            130 | 
|  636 |            111 | 
|  220 |            102 | 
| 1027 |             97 | 
| 5791 |             91 | 
+------+----------------+
20 rows in set (0.03 sec)
Comment on attachment 343021 [details] [diff] [review]
v3, working patch

Justin, do you have time to take a quick look at this?
Attachment #343021 - Flags: review?(morgamic) → review?(fligtar)
Attachment #343021 - Flags: review?(mark)
Mark - I'm doing an extra select, but only because the database class throws an error when an update fails.  It's not very Pythonic, unfortunately.  Since we're saving thousands of queries I opted to just keep the select rather than refactor database.class.php.
Yeah I can review it - we are pushing Thursday?
This remove that extra select, thanks to oremj for pointing out 'on duplicate key' syntax.  Still don't know why the ___destruct() approach doesn't work correctly, however.
Attachment #343021 - Attachment is obsolete: true
Attachment #343062 - Flags: review?(fligtar)
Attachment #343021 - Flags: review?(mark)
Attachment #343021 - Flags: review?(fligtar)
Attachment #343062 - Flags: review?(mark)
Justin - yeah, can ask oremj to review if you're busy.
Attachment #343062 - Flags: review?(fligtar) → review+
Comment on attachment 343062 [details] [diff] [review]
v4, using on duplicate key phrase in update query

Didn't test script but reviewed code and logic and everything looks good. Something keeps making me think there was a reason I didn't do this before when I wrote the update pings script, but I can't think of it and am going to stop worrying about it.
v4 looks great to me.
Attachment #343062 - Flags: review?(mark) → review+
Status: NEW → RESOLVED
Closed: 12 years ago
Keywords: push-needed
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Keywords: push-needed
Resolution: FIXED → ---
Attachment #343245 - Flags: review?(fligtar) → review+
r19145
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Keywords: push-needed
Resolution: --- → FIXED
Keywords: push-needed
The Lightning download counts now include 119 rows for 2008-10-17 !
https://addons.mozilla.org/en-US/statistics/csv/2313/downloads
Is this the intended behavior?

Looks like maybe "on duplicate key update" added by attachment 343062 [details] [diff] [review] is not working here, now all updates insert a new row.   (Maybe because there are no unique or primary keys in this table?)
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html

Should this bug be reopened?
We found some discrepancies as well, but the cron output isn't telling us anything.  Will have to re-run logs parsed after 7pm 10-16 once we find the problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
We're going to flatten the duped rows and fix the index.  Working on this today.
Duplicate rows fixed up with the following steps.  A backup was taken of download_counts before the work was done, so in case something is wrong, we can restore it.

    CREATE TABLE download_counts2 LIKE download_counts
    
    ALTER TABLE download_counts2 ADD UNIQUE (date, addon_id)

This left us with a second/extra index on date, so I removed it...

    ALTER TABLE download_counts2 DROP KEY `date`

Data was then migrated:

    INSERT INTO download_counts2 (id, date, addon_id, count)
        SELECT id, date, addon_id, SUM(count)
        FROM download_counts
        GROUP BY 2, 3
    
    RENAME TABLE download_counts TO download_counts_old,
                 download_counts2 TO download_counts

I flushed memcached servers, download counts now look correct.  And with the index there, morgamic's ON DUPLICATE KEY DO A JIG should work just fine and we'll not have a recurrence.
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
<3 xb95
Duplicate of this bug: 459027
Product: addons.mozilla.org → addons.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.