See data on https://addons.mozilla.org/de/thunderbird/addon/lightning/statistics/ https://addons.mozilla.org/en-US/statistics/addon/2313 https://addons.mozilla.org/en-US/statistics/addon/1865 The number of ADUs is approximately 10% lower for the Monday numbers and approx. 50% to 60% lower for the Tuesday numbers than normal. Probably some cluster breakdown, data center move or something else. As with earlier bugs like bug 683439 this will probably not be fixed, but I still wanted to report it. Perhaps you'll surprise me and fix it.
Lots of stuff was broken on Tuesday while the Phoenix datacenter was having issues.
This information comes from logs and the site was actually disabled at the firewall (meaning, no logs) so this is a CANTFIX as far as I know. Daniel: if I'm missing something please reopen.
Wil, I know that the datacenter on Tuesday had issues, but the ADU data has been incorrect ever since. The hill structure of the ADU curve on https://addons.mozilla.org/en-US/statistics/addon/2313 indicates to me that even now something is still terribly wrong. Is that a good enough reason to reopen this bug and investigate the issue?
There's a new stats dashboard! https://addons.mozilla.org/en-US/thunderbird/addon/lightning/statistics/ I think it looks like it's following mostly the same trend line. Going down today and up next week I imagine.
Wil, I know about the new stats dashboard, but since it shows exactly the same data as the old one, I don't see what benefit it gives us here. And I really don't see how the data is following the same trend line. If you look at https://addons.mozilla.org/en-US/thunderbird/addon/lightning/statistics/?last=90 you'll notice that our daily ADUs during the week are in the high 1.3M or low 1.4M range, with a drop of 100K ADUs on Friday. This week's data

Monday: 1,254,787
Tuesday: 729,540
Wednesday: 1,323,269
Thursday: 1,095,863
Friday: 1,067,630

is totally off base and a clear indication of a bug. Trust me, I've been monitoring the Lightning usage data for more than two years. I know when something is severely wrong. And BTW, this is not something that happens just for Lightning. Just look at the Adblock Plus numbers, which are also down by 10% to 20% since the beginning of this week.
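The size of the deviation can be checked with a quick calculation. The sketch below is illustrative only: the daily figures are the ones quoted above, while the baseline of 1,390,000 ADUs (and 100,000 fewer on Friday) is an assumed reading of "high 1.3M or low 1.4M", not a measured value, and the function names are made up for this example.

```python
# Daily ADU figures for Lightning as quoted in the comment above.
observed = {
    "Monday": 1_254_787,
    "Tuesday": 729_540,
    "Wednesday": 1_323_269,
    "Thursday": 1_095_863,
    "Friday": 1_067_630,
}

# ASSUMPTION: a typical weekday baseline of about 1.39M ADUs, read off the
# "high 1.3M or low 1.4M" range, with roughly 100K fewer on Fridays.
ASSUMED_WEEKDAY_BASELINE = 1_390_000
ASSUMED_FRIDAY_DROP = 100_000


def expected_adus(day: str) -> int:
    """Return the assumed normal ADU count for a given weekday."""
    if day == "Friday":
        return ASSUMED_WEEKDAY_BASELINE - ASSUMED_FRIDAY_DROP
    return ASSUMED_WEEKDAY_BASELINE


def deviation_percent(day: str, adus: int) -> float:
    """Return how far the observed count is from the baseline, in percent."""
    baseline = expected_adus(day)
    return (adus - baseline) / baseline * 100.0


deviations = {day: deviation_percent(day, adus) for day, adus in observed.items()}

for day, pct in deviations.items():
    print(f"{day:<9} {observed[day]:>9,} ADUs  {pct:+.1f}% vs. assumed baseline")
```

Under that assumed baseline every day of the week comes out below normal, with Tuesday landing at roughly half the usual value.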
We have had Zeus and datacenter related problems on several days last week that resulted in lost data that would affect this metric. I definitely understand how annoying it can be, because these disruptions affect all of the Firefox and Thunderbird metrics as well. Let's wait and see what next week looks like. I suspect it will stabilize.
Okay, I just found out the rest of the problem here. IT brought a new vamo server online starting on the 7th and Metrics wasn't notified so we weren't processing the logs out of that server. Data on the 5th and 6th were messed up because of the outages, data from the 7th to the 9th were messed up because of missed processing for the new server. Data for today should be correct (although, of course, lower due to the weekend).
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #7)
> Data on the 5th and 6th were messed up because of the outages, data from the
> 7th to the 9th were messed up because of missed processing for the new
> server.

Does that mean Dec. 7-9 has no logs or they just need to be back-processed?
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #7)
> Okay, I just found out the rest of the problem here.
>
> IT brought a new vamo server online starting on the 7th and Metrics wasn't
> notified so we weren't processing the logs out of that server.
>
> Data on the 5th and 6th were messed up because of the outages, data from the
> 7th to the 9th were messed up because of missed processing for the new
> server.
>
> Data for today should be correct (although, of course, lower due to the
> weekend).

Thanks for the info, Daniel. The Saturday data looks alright to me. If (as Justin has already asked) the Wednesday to Friday data could be backprocessed, that would reduce the impact of this bug significantly. Would that be possible?
Daniel, any update on the possibility of back-processing the data from the 7th to the 9th of December?
It seems that something is again going wrong. :-( Today's (Thursday) data is again approx. 20% lower than usual for Lightning as well as for Adblock Plus.
IT was having lots of issues with datacenter connectivity yesterday (2011-12-15), and a lot of the log data did not arrive until well after the cutoff for processing it. It is not feasible for us to backprocess the data from the 7th to the 9th. We are going to try to backprocess the data missing from today, but even that is going to take hours to correct.
Okay, found the delay in processing. We reprocessed the missing data from 2011-12-15 and I am inserting it now. Please verify, or reopen tomorrow if the data is not correct for either 2011-12-15 or 2011-12-16.
Hey Daniel, unfortunately things haven't improved. Thursday's data went up a little bit (but not enough), but Friday's and especially Saturday's data is way lower than usual. I measure this again by looking at Adblock Plus (most popular add-on for Firefox) and Lightning (most popular add-on for Thunderbird). Therefore I'm reopening this bug :-(
Added more servers and backprocessed. Let me know how it looks tomorrow morning?
The data from the 17th to the 20th still seems to be far below normal. A general question: is there any monitoring on your side to make sure that statistics data processing is working without problems? If not, it would be much appreciated.
I've double checked the days in question and I can find no sign of missing data. I think this might be a combination of weekend + holiday trends. We have lots of checks and safeguards against errors during processing. However, discovering that there is new data that we aren't tracking is much more difficult and basically requires notification from the IT team. When that is compounded with the severe load problems the versioncheck.addons.mozilla.org site has been having, it causes a real mess that we have to sort through.
Thanks for your answer. Of course there will be fewer users than normal because of the Christmas holidays, but the difference is in my opinion too large. Is it possible that the lower numbers are connected to this problem? http://blog.mozilla.com/addons/2011/12/20/statistics-correction-in-developer-dashboards/ The post mentions that the total number of users shouldn't be affected, but it looks like the total number is just the sum of the users of all versions. Any idea whether the statistics data has already been re-indexed and whether this problem is fixed?
I don't know, I am not familiar with the system in question on that bug. I agree that the numbers feel too low to be the result of just a holiday dip. That said, I haven't been able to find any further causes. I will pick this up and continue investigation after the holidays when the numbers even out again. Of course we will continue the regular monitoring for any missing or late arriving data.
(In reply to Daniel Einspanjer :dre [:deinspanjer] from comment #20)
> I agree that the numbers feel too low to be the result of just a holiday
> dip. That said, I haven't been able to find any further causes. I will pick
> this up and continue investigation after the holidays when the numbers even
> out again. Of course we will continue the regular monitoring for any
> missing or late arriving data.

Daniel, I just compared last year's Lightning numbers (last 14 days) with this year's numbers. On average, last year's numbers were 5% higher than this year's. This does not track at all with the data from the rest of 2011, where the Lightning add-on saw significant growth of 30% to 40% in active users compared with 2010. So something *is* really wrong here, it doesn't just feel wrong. The number of active users should be considerably higher, by at least 30%. Looking at last year and at 2009, the ADUs on the Christmas weekend should at least be in the range of our weekend ADU numbers, but they are far below.
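The gap described here can be put into numbers. This is only a rough sketch: it normalizes this year's observed ADUs to 1.0, takes the "5% higher last year" and "30% to 40% growth" figures from the comment, and assumes the growth seen during the rest of 2011 would also have applied to the last 14 days. The function name is made up for illustration.

```python
# Normalize this year's observed ADUs (last 14 days) to 1.0.
observed_this_year = 1.0

# From the comment: last year's numbers were on average 5% higher.
last_year = observed_this_year * 1.05


def shortfall_percent(growth: float) -> float:
    """Percent by which the observed value falls short of the expected one.

    ASSUMPTION: the year-over-year growth seen during the rest of 2011
    would also apply to the last 14 days of the year.
    """
    expected_this_year = last_year * (1.0 + growth)
    return (1.0 - observed_this_year / expected_this_year) * 100.0


for growth in (0.30, 0.40):
    print(f"growth {growth:.0%}: observed is {shortfall_percent(growth):.1f}% below expected")
```

Under these assumptions the observed numbers sit roughly 27% to 32% below where the growth trend would put them.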
You can see the exact same pattern in the public stats for Adblock Plus: https://addons.mozilla.org/en-US/firefox/addon/adblock-plus/statistics/ If you look at the same time last year, they saw a huge drop on Christmas Day but bounced right back up. This drop started around 12/16 or 12/17 and has been sustained for two weeks straight. Also, the same pattern is showing for my add-on. In both cases it is about a 30% drop. I agree with Simon. Something definitely is really wrong here.
I agree too that there is some unknown problem. I've pinged a few more people, but didn't hear back over the holidays. I will post here once I hear back.
IT has found the problem. They started shifting a third of the versioncheck.addons.mozilla.org traffic (on which we base our add-on usage counts) to a datacenter that was not properly configured for request logging. That means that Metrics never received the data to process. IT corrected the problem at 4 PM Pacific today. Since 3 PM Pacific is the end of the day UTC, we should see a complete recovery starting tomorrow evening when tomorrow's EndOfDay is run. I'll update tomorrow evening.
Once these stats are re-indexed, we can re-run the stats indexing job for the affected time range to update the UI.
We are getting the additional traffic in our logs now and the numbers tonight should reflect that.

(In reply to Potch [:potch] from comment #26)
> Once these stats are re-indexed, We can re-run the stats indexing job for
> the affected timerange to update the UI.

Unfortunately, there isn't any data to re-index. The logs from the missing cluster were never generated, so the time period from Dec 16 through yesterday is permanently down by 33%.
The last day's numbers show a significant uptick in the range of 50% (which corresponds exactly to the 33% loss). Marking FIXED. Too bad that the data from the middle of December until now cannot be reindexed :(
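The relationship between the 33% loss and the 50% uptick is simple arithmetic: if one third of the traffic goes unlogged, only two thirds are counted, and getting back to the full amount means the counted number grows by half. A small sketch (the function name is made up for illustration):

```python
from fractions import Fraction


def recovery_uptick(lost_share: Fraction) -> Fraction:
    """Relative increase seen when a previously lost share is counted again.

    If `lost_share` of the true value was missing, the reported value was
    (1 - lost_share) of the truth; recovery multiplies it by 1 / (1 - lost_share).
    """
    counted_share = 1 - lost_share
    return 1 / counted_share - 1


uptick = recovery_uptick(Fraction(1, 3))
print(f"uptick after recovering a 1/3 loss: {float(uptick):.0%}")
```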