Sometimes the line chart isn't drawn

RESOLVED FIXED in 6.2.3

Status

addons.mozilla.org Graveyard
Statistics
P3
normal
RESOLVED FIXED
7 years ago
2 years ago

People

(Reporter: clouserw, Assigned: jbalogh)

Tracking

unspecified
6.2.3

Details

(Reporter)

Description

7 years ago
This is more reproducible when I change to custom dates, but it's not 100%.

To reproduce:

1) Load something like https://addons.mozilla.org/z/en-US/firefox/addon/1865/statistics/?start=20100701&end=20101112

2) If you're (un)lucky the line chart won't appear.  Mousing over it will show the numbers as you'd expect, it's just not drawn.

A force refresh will make it show up.
(Reporter)

Updated

7 years ago
Assignee: nobody → thepotch
Target Milestone: 5.12.4 → 5.12.7
(Reporter)

Comment 1

7 years ago
Tried this today in minefield and chrome and couldn't reproduce.
Target Milestone: 5.12.7 → Q1 2011
(Reporter)

Updated

7 years ago
Target Milestone: Q1 2011 → Q2 2011
(Reporter)

Comment 2

7 years ago
(In reply to comment #1)
> Tried this today in minefield and chrome and couldn't reproduce.

I can no longer reproduce the "mousing over it will show the numbers" part, but on adblock-plus I have yet to see a graph today for any combination of dates (or just the defaults)
(Reporter)

Updated

7 years ago
Target Milestone: Q2 2011 → Q3 2011
(Reporter)

Comment 3

7 years ago
Alright, the mouse over part isn't relevant anymore, but it's true that graphs aren't getting drawn.  Looks like the HTTP requests timeout after 45s or so.  I suspect too slow of SQL queries
Assignee: thepotch → nobody
(Reporter)

Updated

7 years ago
Assignee: nobody → ashort
Target Milestone: Q3 2011 → 6.1.8
Generating the JSON with downloads stats does two SQL queries. 

For the URL given in the reproduce instructions it does:

    SELECT `download_counts`.`id` FROM `download_counts` WHERE (`download_counts`.`date` BETWEEN '2010-07-01' and '2011-11-12' AND `download_counts`.`addon_id` = 1865 )


    SELECT `download_counts`.`id`, `download_counts`.`addon_id`, `download_counts`.`count`, `download_counts`.`date`, `download_counts`.`src` FROM `download_counts` WHERE (`download_counts`.`date` BETWEEN '2010-07-01' and '2010-11-12' AND `download_counts`.`addon_id` = 1865  AND `download_counts`.`id` IN (26384390, 26277676, 26375220, 25533443, 26013270, 25602550, 25731403, 26098479, 25996331, 26081796, 26393383, 25593805, 25290450, 25936430, 25340002, 25987851, 25273561, 25867377, 25576852, 26030431, 25611642, 26286886, 25374039, 25517084, 25222999, 26411093, 25823850, 25331878, 26140334, 25482718, 25231700, 25806371, 25256834, 26165711, 25491368, 26038932, 26348159, 25197989, 26330978, 26115108, 25567928, 26451040, 25919137, 25620249, 25962150, 25265211, 25391336, 26469694, 26251263, 26428818, 26322365, 25722902, 26056165, 25458065, 25499934, 25698241, 25953486, 26191228, 25441207, 25646057, 25789641, 26174238, 25910042, 26123530, 26233643, 26073261, 26242210, 26419779, 25239941, 26199783, 25797832, 25892235, 26460145, 25706410, 25525231, 26157160, 25206298, 26356844, 25508439, 25944962, 26304373, 25654233, 25875764, 25424290, 25214551, 26208239, 25689625, 25415923, 25382847, 26224910, 25979322, 25970703, 26268912, 26021745, 25281825, 25680843, 26148704, 25773021, 26339509, 26365719, 25764537, 25466370, 25306930, 25474507, 26402261, 26131993, 26295456, 25348484, 25663016, 25629078, 25671796, 26438003, 26216581, 25298637, 25365404, 25407708, 25815001, 25541725, 25585343, 25841520, 25550143, 26313313, 25637801, 25323768, 25740096, 25850273, 25884142, 25449538, 25714623, 25315280, 25399585, 25248429, 25432879, 26260037, 25558987));

The second query can take several seconds when the time range is large. I talked to jbalogh about this and he suggested the inclusion of the IDs in the query improves its cacheability.

These queries use the "addon_date_idx" index, so it's not clear how to immediately speed them up.
Target Milestone: 6.1.8 → Q3 2011
(Reporter)

Comment 5

7 years ago
Combining the two queries is still slow for me, for what it's worth.  (30s on cm-webdev01-master01.  comparable to separate queries).
Are the file ids actually needed for rendering purposes? I tried running the following similar query in our datawarehouse:
select addon_id, sum(requests), date, src from amo_download_requests_by_day a join dates d on a.utc_date_id = d.date_id where date between '2010-07-01' and '2010-11-12' and addon_id = 1865 group by 1,3,4;

That query takes 3 seconds to fetch the first row.  Most of that time is spent doing the join to the dates table. If I plug in the date ids instead:
select addon_id, sum(requests), utc_date_id, src from amo_download_requests_by_day a where utc_date_id between 3835 and 3969 and addon_id = 1865 group by 1,3,4;

Then it takes only 500ms.  These queries aren't optimized.  If we wanted to investigate performing these statistics type queries directly on the datawarehouse then we could create some projections (Vertica's form of indicies) that would keep all of the queries at a consistently fast speed.
(Reporter)

Comment 7

7 years ago
We don't use file ids in our graphs - See the link in comment 0 for an example of the stats we offer.

I'm very interested in outsourcing this to you as our data and offerings are just continuing to grow.  What are the next steps for investigating what that would take? :)
(Reporter)

Comment 8

7 years ago
If we can't get interaction with the metrics cluster here we'll have to start looking at alternatives to mysql it sounds like.  Do you have any suggestions?
(Assignee)

Updated

7 years ago
Assignee: ashort → jbalogh
Target Milestone: Q3 2011 → 6.2.2
(Reporter)

Comment 9

7 years ago
13:43 <jbalogh> I want one of those war propaganda posters with kumar encouraging testing
13:43 <jbalogh> and deletion
13:43 <jbalogh> clouserw: I have stats serving from elastic
13:43 <jbalogh> and fast
13:44 <jbalogh> I need to make some scripts to index all the old data and handle backfilling and all that


-> next week for some additional scripts
Target Milestone: 6.2.2 → 6.2.3
(Assignee)

Comment 10

7 years ago
This moves all the slow stuff to ES: https://github.com/jbalogh/zamboni/commit/e21f588 

Note that I have to kick off all the indexing before stats show up.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED

Comment 11

7 years ago
(In reply to Jeff Balogh (:jbalogh) from comment #10)
> This moves all the slow stuff to ES:
> https://github.com/jbalogh/zamboni/commit/e21f588 
> 
> Note that I have to kick off all the indexing before stats show up.

Let me know when this is done.
(Reporter)

Comment 12

7 years ago
I think it's done
Product: addons.mozilla.org → addons.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.