Closed Bug 668459 Opened 14 years ago Closed 14 years ago

Statistics Dashboard: Daily Users are 0 / No data yet

Categories

(Mozilla Metrics :: Data/Backend Reports, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Unreviewed

People

(Reporter: u232883, Unassigned)

References

()

Details

Report from the author of CE Downloader: On the Statistics Dashboard, the Active Daily Users are 0 (No data yet) since its release in February. I have access to the dashboard and can confirm the issue. https://addons.mozilla.org/statistics/addon/278546
Hi, can anyone take a look at this?
Component: Statistics → Data/Backend Reports
Product: addons.mozilla.org → Mozilla Metrics
QA Contact: statistics → data-reports
I'll be working on this today.
At some point in time, there was another addon_id created with the same guid, 278543. ADU is based off of only the guid, so we select the lowest addon_id number to associate with the matching guid. That means that all the ADU is reported for the wrong/old addon_id. Downloads however are based off of the file_id which has a strong tie through version_id to the specific addon_id, so they don't suffer from this guid mismatch problem. I don't know the best way to resolve this problem for you right now. I need some guidance from the amo guys. The first question I have is how this can happen, the next question I have is whether the old addon_id still exists in their system. It would probably be much easier for me if something could be done on the amo stats dashboard side to associate these two ids, but if that isn't feasible, then we'll work out a plan to get the data corrected for you.
guid is a unique key in the database. It's possible that the add-on id changes (an author can delete their add-on and then reupload it as a new add-on and that id will change). At no time can their be multiple ids associated with a guid. Are you caching the id->guid relationship on your end?
Yes. Because we are storing historical data, we can't delete records that disappear from the source systems because that would cause broken FK references for the historic data. Wil, is there any chance that the dashboards could be extended to easily associate alias addon_ids to retrieve them from the stats we export to you?
That sounds like putting the association in the wrong place. The guid is the canonical determination of what should be where. I'm not saying you need to backport historical data, but right now you're injecting data into the table for add-on X and it should be add-on Y. Honestly, I'm a little surprised there isn't an FK on the table to prevent this kind of thing. Should we add one and when your script fails you pull a new ID for the GUID you're trying to insert?
I can't have a unique key on the guid since it is legitimate that there can be multiple addon_ids with the same guid over time. Originally, we always retrieved the highest number addon_id when looking up the guid, but a while back, we had a couple of addons that were in the opposite situation as this one, there was a higher number addon_id that was created and then deleted and it caused us to mis-report the ADU for them, so we changed it to select the lowest number addon_id instead. It would be very easy if we exported the data using guid instead of addon_id. That way there wouldn't be any confusion on either side. Would that possibly work for you? In the longer term, what we should do for our historic table is to add a date column that tracks the deletion date. Then, when we are looking up a guid, we can use that to ensure that we don't select a addon_id that has been deleted. That change will take a bit of time and testing though so hopefully we can try to do the guid export or something to get us back to a functional state quickly.
(In reply to comment #7) > I can't have a unique key on the guid since it is legitimate that there can > be multiple addon_ids with the same guid over time. Originally, we always > retrieved the highest number addon_id when looking up the guid, but a while > back, we had a couple of addons that were in the opposite situation as this > one, there was a higher number addon_id that was created and then deleted > and it caused us to mis-report the ADU for them, so we changed it to select > the lowest number addon_id instead. I don't understand what you're doing here. It's only possible for a guid to match a single add-on id at any time. It shouldn't be a matter of finding the lowest or highest add-on id - there is only one. It sounds like your system is using add-on id as the master value and AMO is using GUID? > It would be very easy if we exported the data using guid instead of > addon_id. That way there wouldn't be any confusion on either side. Would > that possibly work for you? No, all the FKs are based on add-on ID.
My table contains a record of all addon_ids that have ever existed, regardless of whether they are currently deleted in your system. This is needed because we store historic data. If addon_id 5 was active for a month and then it was deleted and replaced with addon_id 6, I can't delete addon_id 5 because then I'd have broken foreign keys for all the downloads and usage that occurred before 5 was deleted.
why not just start storing new data for add-on id 6? I mentioned the historical data doesn't matter on our end in this case - I already imported it for the new add-on id.
I have hacked around the problem for now, future vamo pings will be associated with 278546. I will start working on longer term solution described in comment 6 of keeping track of deleted addon_ids so they are not used.
Wil, can you rerun your fix on yesterday's data since it was partially incorrect? The data for today and onward should be correct.
I didn't run any fixes on yesterday's data. I imported moved all the old data to the new ID and then I filed this bug. Anything since filing this bug I haven't touched. What do you need done?
Basically, there is further data exported using the old id since the last time you changed the data to the new id. If you can move any other data for id 278543 to 278546 then there shouldn't be any further "bad" data.
Add-on ID 278543 has 158 rows in the update_counts table going back to februrary. r? `UPDATE update_counts SET addon_id=278546 WHERE addon_id=278543 limit 158`
r+ from me, that sounds perfect. I'm gonna ping on the other thread that has the same issue so we can close these out.
Query OK, 158 rows affected (0.18 sec) Rows matched: 158 Changed: 158 Warnings: 0
Looks good to me -- I'll report back to the author. Thanks a lot!
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.