Closed Bug 384031 Opened 17 years ago Closed 17 years ago

Check /extensions download statistics on FTP, possibly remove it

Categories

(mozilla.org Graveyard :: Server Operations, task, P2)

All
Other

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: morgamic, Assigned: nthomas)

Attachments

(1 file)

We'd like to pull the /extensions directory from the FTP rsync image sometime in the future -- but not without understanding its effect on traffic.

So -- would it be possible for someone to get statistics on how many extensions are downloaded from /extensions?

We don't have to remove these immediately, but we should look at phasing them out in the next 2-3 months, since they are no longer being updated and have been migrated to the new /addons dir. Keeping them makes extra work for the rsync and takes up unnecessary space in the long run.
Assignee: server-ops → justdave
I made an aborted attempt to run this query yesterday (hitcount for extensions in the last week), but I didn't think to run screen first, and had to leave before it finished. It had been running for 2 hours at that point (the aggregated stats for the releases.m.o mirrors are all in a MySQL database).

I'm going to see if I can come up with some temporary table magic to make it quicker and try again.
Assignee: justdave → reed
Whiteboard: Need logs from mirrors
So, chizu got OSL to upload a ton of logs, so I've modified the process logs script to understand the directory layout, and I have let the script run loose. It's still processing 2005, so it'll be a while before it's done. I'll check on it tomorrow to see how far along it is. Once the logs have been processed, I should be able to give you some numbers as seen by OSL, at least.
Status: NEW → ASSIGNED
Whiteboard: Need logs from mirrors → Waiting on logs to be processed
Whiteboard: Waiting on logs to be processed → Need to figure out why stats script is broken
Assignee: reed → oremj
Status: ASSIGNED → NEW
What else needs to be done?
(In reply to comment #3)
> What else needs to be done?

Run queries against the mysql database to fulfill the original request?
What queries need to be run?  I know nothing about this database schema.
(In reply to comment #5)
> What queries need to be run?  I know nothing about this database schema.

You need to answer morgamic's original request of:
> So -- would it be possible for someone to get statistics on how many extensions
> are downloaded from /extensions?

I was working on getting this using the download stats scripts we already have (surf:/opt/webtools/download-stats/, I think), but there seems to be a bug in the script, as the result was http://download-stats.mozilla.org/extensions/extensions-january_to_current.html (note the huge number of blank cells or cells with just a 0). I recommend you look through how generate-stats.pl grabs the stats information and either 1) see if you can fix the script; or 2) run manual SQL queries based on the type of queries you see in generate-stats.pl or build your own queries based on the SQL schema found in schema.sql.

generate-stats.pl takes a while, so if you run it, make sure you do it in screen. Last time I tried it, it took between 10 and 20 minutes, iirc.

I'm not at all familiar with the schema myself, only knowing what I have gathered from reading the scripts and playing with them.
Here are some numbers from the OSL mirrors.


2007/Apr/01: 620
2007/Apr/02: 1878
2007/Apr/03: 1310
2007/Apr/04: 1515
2007/Apr/05: 896
2007/Apr/06: 407
2007/Apr/07: 1379
2007/Apr/08: 329
2007/Apr/09: 349
2007/Apr/10: 319
2007/Apr/11: 348
2007/Apr/12: 553
2007/Apr/13: 403
2007/Apr/14: 261
2007/Apr/15: 887
2007/Apr/16: 936
2007/Apr/17: 243
2007/Apr/18: 509
2007/Apr/19: 359
2007/Apr/20: 139
2007/Apr/21: 685
2007/Apr/22: 3234
2007/Apr/23: 1385
2007/Apr/24: 404
2007/Apr/25: 1967
2007/Apr/26: 2067
2007/Apr/27: 552
2007/Apr/28: 1091
2007/Apr/29: 378
2007/Apr/30: 2092
2007/Aug/01: 2677
2007/Aug/02: 3460
2007/Aug/03: 3536
2007/Aug/04: 2190
2007/Aug/05: 2864
2007/Aug/06: 3646
2007/Aug/07: 2153
2007/Aug/08: 2329
2007/Aug/09: 2015
2007/Aug/10: 2120
2007/Aug/11: 1631
2007/Aug/12: 1245
2007/Aug/13: 1580
2007/Aug/14: 1637
2007/Aug/15: 1987
2007/Aug/16: 1519
2007/Aug/17: 1700
2007/Aug/18: 733
2007/Aug/19: 1049
2007/Aug/20: 930
2007/Aug/21: 1009
2007/Feb/01: 663
2007/Feb/02: 863
2007/Feb/03: 822
2007/Feb/04: 196
2007/Feb/05: 784
2007/Feb/06: 688
2007/Feb/07: 232
2007/Feb/08: 386
2007/Feb/09: 150
2007/Feb/10: 376
2007/Feb/11: 292
2007/Feb/12: 526
2007/Feb/13: 625
2007/Feb/14: 640
2007/Feb/15: 231
2007/Feb/16: 452
2007/Feb/17: 507
2007/Feb/18: 980
2007/Feb/19: 162
2007/Feb/20: 2557
2007/Feb/21: 2087
2007/Feb/22: 802
2007/Feb/23: 1870
2007/Feb/24: 519
2007/Feb/25: 855
2007/Feb/26: 2169
2007/Feb/27: 216
2007/Feb/28: 2282
2007/Jan/01: 574
2007/Jan/02: 369
2007/Jan/03: 879
2007/Jan/04: 464
2007/Jan/05: 575
2007/Jan/06: 1568
2007/Jan/07: 988
2007/Jan/08: 1010
2007/Jan/09: 1285
2007/Jan/10: 502
2007/Jan/11: 1449
2007/Jan/12: 988
2007/Jan/13: 638
2007/Jan/14: 766
2007/Jan/15: 397
2007/Jan/16: 767
2007/Jan/17: 533
2007/Jan/18: 948
2007/Jan/19: 1060
2007/Jan/20: 751
2007/Jan/21: 1453
2007/Jan/22: 554
2007/Jan/23: 587
2007/Jan/24: 1622
2007/Jan/25: 1298
2007/Jan/26: 998
2007/Jan/27: 906
2007/Jan/28: 838
2007/Jan/29: 659
2007/Jan/30: 1495
2007/Jan/31: 1483
2007/Jul/01: 970
2007/Jul/02: 1023
2007/Jul/03: 1178
2007/Jul/04: 1183
2007/Jul/05: 943
2007/Jul/06: 1382
2007/Jul/07: 1677
2007/Jul/08: 1344
2007/Jul/09: 2021
2007/Jul/10: 1096
2007/Jul/11: 1291
2007/Jul/12: 2213
2007/Jul/13: 1655
2007/Jul/14: 4422
2007/Jul/15: 2281
2007/Jul/16: 2741
2007/Jul/17: 1372
2007/Jul/18: 1228
2007/Jul/19: 1490
2007/Jul/20: 1478
2007/Jul/21: 1687
2007/Jul/22: 1394
2007/Jul/23: 1337
2007/Jul/24: 1838
2007/Jul/25: 1920
2007/Jul/26: 1539
2007/Jul/27: 3508
2007/Jul/28: 1671
2007/Jul/29: 4593
2007/Jul/30: 3530
2007/Jul/31: 3755
2007/Jun/01: 2475
2007/Jun/02: 3446
2007/Jun/03: 2553
2007/Jun/04: 1155
2007/Jun/05: 3762
2007/Jun/06: 2403
2007/Jun/07: 1485
2007/Jun/08: 1339
2007/Jun/09: 1718
2007/Jun/10: 870
2007/Jun/11: 448
2007/Jun/12: 832
2007/Jun/13: 695
2007/Jun/14: 1016
2007/Jun/15: 1293
2007/Jun/16: 1066
2007/Jun/17: 1179
2007/Jun/18: 1028
2007/Jun/19: 1360
2007/Jun/20: 1115
2007/Jun/21: 129
2007/Jun/27: 42
2007/Jun/28: 2386
2007/Jun/29: 1672
2007/Jun/30: 1263
2007/Mar/01: 961
2007/Mar/02: 2048
2007/Mar/03: 3130
2007/Mar/04: 1041
2007/Mar/05: 786
2007/Mar/06: 2304
2007/Mar/07: 994
2007/Mar/08: 1650
2007/Mar/09: 2449
2007/Mar/10: 1810
2007/Mar/11: 2356
2007/Mar/12: 2702
2007/Mar/13: 1100
2007/Mar/14: 1053
2007/Mar/15: 1275
2007/Mar/16: 791
2007/Mar/17: 529
2007/Mar/18: 680
2007/Mar/19: 537
2007/Mar/20: 986
2007/Mar/21: 1430
2007/Mar/22: 2327
2007/Mar/23: 4179
2007/Mar/24: 2272
2007/Mar/25: 1987
2007/Mar/26: 2278
2007/Mar/27: 1709
2007/Mar/28: 869
2007/Mar/29: 3339
2007/Mar/30: 1139
2007/Mar/31: 969
2007/May/01: 765
2007/May/02: 330
2007/May/03: 605
2007/May/04: 785
2007/May/05: 442
2007/May/06: 7
2007/May/22: 2
2007/May/23: 1282
2007/May/24: 1252
2007/May/25: 1425
2007/May/26: 1446
2007/May/27: 4830
2007/May/28: 1245
2007/May/29: 2385
2007/May/30: 1892
2007/May/31: 2861
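
The list above is sorted alphabetically by month name (Apr, Aug, Feb, ...) rather than by date, which makes trends hard to read. Below is a minimal sketch, assuming the `YYYY/Mon/DD: count` line format shown above, of how the rows could be re-sorted chronologically and summed per month. The function names are illustrative and are not part of the download-stats scripts; the three sample rows are copied from the list.

```python
from collections import defaultdict
from datetime import datetime


def parse_daily_counts(text):
    """Parse 'YYYY/Mon/DD: count' lines into a date-sorted list of (date, count)."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        day, _, count = line.partition(":")
        # %b matches abbreviated month names; this assumes an English locale.
        parsed = datetime.strptime(day.strip(), "%Y/%b/%d").date()
        rows.append((parsed, int(count)))
    return sorted(rows)


def monthly_totals(rows):
    """Sum the daily counts into a {(year, month): total} mapping."""
    totals = defaultdict(int)
    for day, count in rows:
        totals[(day.year, day.month)] += count
    return dict(totals)


# Three rows taken from the list above, deliberately out of date order.
sample = """\
2007/Apr/01: 620
2007/Aug/01: 2677
2007/Apr/02: 1878
"""

rows = parse_daily_counts(sample)
totals = monthly_totals(rows)
for (year, month), total in sorted(totals.items()):
    print(f"{year}-{month:02d}: {total}")
```

Monthly totals also make gaps easier to spot, such as the days missing from the May and June rows.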
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Jeremy, would it be possible to generate the same list correlating paths with counts?  Might be helpful to know if a lot of files are getting hits or if there are just a few that are linked to from their developers' sites.
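
One way such a per-path breakdown could be produced directly from raw mirror logs is sketched below. This is only an illustration: it assumes Apache-style access log lines and a `/extensions/` path prefix, neither of which is confirmed in this bug, and the sample lines and file names are made up. The real numbers came from the existing stats database.

```python
import re
from collections import Counter

# Assumed request format: "GET <path> HTTP/x.y" <status> ...
REQUEST_RE = re.compile(r'"GET (?P<path>\S+)[^"]*" (?P<status>\d{3})')


def count_hits_by_path(lines, prefix="/extensions/"):
    """Count successful GET requests per path under the given prefix."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None or match.group("status") != "200":
            continue
        path = match.group("path")
        if path.startswith(prefix):
            hits[path] += 1
    return hits


# Hypothetical log lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [01/Aug/2007:00:00:01 +0000] "GET /extensions/example-a.xpi HTTP/1.1" 200 1024',
    '192.0.2.2 - - [01/Aug/2007:00:00:02 +0000] "GET /extensions/example-a.xpi HTTP/1.1" 200 1024',
    '192.0.2.3 - - [01/Aug/2007:00:00:03 +0000] "GET /extensions/example-b.xpi HTTP/1.1" 404 0',
    '192.0.2.4 - - [01/Aug/2007:00:00:04 +0000] "GET /addons/example-c.xpi HTTP/1.1" 200 2048',
]

hits = count_hits_by_path(sample_log)
for path, count in hits.most_common():
    print(f"{count:6d}  {path}")
```

Sorting by count puts the heavily linked files at the top, which is the distinction being asked about here.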
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This could help us out a bit for bug 390479 - it's 2.5 GB of little files in the 148 GB module size.
Blocks: 390479
Attached file Extensions stats
Status: REOPENED → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: Need to figure out why stats script is broken
Over to morgamic for a decision ...
Assignee: oremj → morgamic
Status: REOPENED → NEW
These extensions are outdated, so I think we should discontinue these dirs for that reason.  Users can go to /addons if they need them.  The stats showed an oddly even distribution, which suggests the traffic was mostly spiders/bots (the majority of the 4k add-ons were downloaded 30-70 times each).  One that stood out was html tidy, and I can contact the author about it.

So I think we should remove these directories from the rsync.
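
The "oddly even distribution" reasoning above can be checked mechanically. Here is a minimal sketch, assuming a mapping of file path to download count is available: it reports the share of files whose counts fall in the 30-70 band and lists anything above it. The function name and the sample counts are hypothetical, not taken from the attached stats.

```python
def summarize_distribution(hits, low=30, high=70):
    """Return (share of files with low <= count <= high, outliers above high)."""
    if not hits:
        return 0.0, []
    in_band = sum(1 for count in hits.values() if low <= count <= high)
    outliers = sorted(
        ((path, count) for path, count in hits.items() if count > high),
        key=lambda item: item[1],
        reverse=True,
    )
    return in_band / len(hits), outliers


# Hypothetical counts, for illustration only.
sample_hits = {
    "example-a.xpi": 45,
    "example-b.xpi": 52,
    "example-c.xpi": 38,
    "example-d.xpi": 61,
    "example-popular.xpi": 900,
}

share, outliers = summarize_distribution(sample_hits)
print(f"{share:.0%} of files fall in the 30-70 band")
for path, count in outliers:
    print(f"outlier: {path} ({count} downloads)")
```

A high in-band share with only a handful of outliers is consistent with crawler traffic, though it does not prove it.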
I went ahead and added extensions/ and themes/ to the exclude list for this rsync module. We're planning to remove the files in a couple of weeks if there are no issues.
Assignee: morgamic → nrthomas
Sounds good -- thanks Nick.
Closing - if there are issues, reopen.
Status: NEW → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
morgamic, should I go ahead with removing extensions/ and themes/ ?
Go for it, Nick.  Haven't heard anything from the community about it, and all the files have since been migrated anyway.  New files are added in /addons (since March) so we won't be missing anything newer than 6 months.
Done, and cleaned up the rsync exclusion list. Another 2.6GB off the mirror module :-)
Product: mozilla.org → mozilla.org Graveyard