Closed Bug 681825 Opened 13 years ago Closed 13 years ago

Add "most active" facet to search

Categories

(addons.mozilla.org Graveyard :: Add-on Builder, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED
Builder 0.9.12

People

(Reporter: smcarthur, Assigned: smcarthur)

facet: most active (recency of last update indexed against number of total commits, accounting for add-on/library age)
What would the filter be, exactly, in the UI? A slider of... commits per day, or something?
Assignee: nobody → smcarthur
Blocks: 625947
Severity: normal → major
Priority: -- → P1
Target Milestone: --- → Builder 0.9.11
Most active would be the total # of updates in ratio to the running average of update recency.
btw, updates == revisions
Sure, but what would the units look like on the slider?

The other sliders are all easy: number of copies, number of times depended on...
Since this is a property that relies on time, we either need to call a function on the index (which is less performant since the search engine can't optimize and cache things), or run a cron perhaps every night to re-index all packages, updating their Activity info.
Target Milestone: Builder 0.9.11 → Builder 0.9.12
It would be useful to define how this filter should work. dbuc pointed out that Google Code uses an "Activity" label, and that wording could fit this filter.

Next up is how this should all work. A fairly simple approach is to just collect the number of revisions in the past X (month?) and store that as a property in the index. We would need a way to decide how "active" that many revisions is. Do we hard-code a scale, or base it on all the current activity?
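The simple "revisions in the past X" count can be sketched roughly as below. This is only an illustration, not FlightDeck code: the function name `recent_revision_count`, the `window_days` parameter and the 30-day default are assumptions, and the revision timestamps are made up.

```python
from datetime import datetime, timedelta


def recent_revision_count(revision_times, now, window_days=30):
    """Count revisions saved within the last `window_days` days.

    `revision_times` is any iterable of datetime objects (hypothetical
    input; the real data would come from the package's revisions).
    """
    cutoff = now - timedelta(days=window_days)
    return sum(1 for saved_at in revision_times if cutoff <= saved_at <= now)


# Made-up revision history for one package.
now = datetime(2011, 9, 1)
revisions = [
    datetime(2011, 8, 30),
    datetime(2011, 8, 15),
    datetime(2011, 6, 1),
]
count_last_month = recent_revision_count(revisions, now)
```

The resulting integer is what would be stored as a property in the index. It still leaves open the question of how to map the count onto an "active" scale.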
So for this you could take the total number of revisions a package has ever had and divide that by the moving average of time-between-updates.

To get this moving average you take the time between the very latest revision and the one before it, and count it at 100% weight. So if that time was 86400 seconds (1 day) you would retain that whole amount. Next you walk down the revision tree to the next most recent step: the time between the second-latest revision and the third. Say that is also 86400 seconds. This time you weight it down by a given factor, which we can decide upon (it just depends on how fast we want things to "go stale"). Each revision interval further from the latest one then matters less and less, as the down-weighting compounds each time.
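A minimal sketch of the weighting described above, under some assumptions: the down-weighting factor (`decay`, here 0.5) is a placeholder since the thread leaves it undecided, the weighted sum is normalized by the sum of the weights (the comment does not say whether to normalize), and the function names are hypothetical.

```python
def weighted_average_interval(intervals, decay=0.5):
    """Weighted average of seconds between revisions.

    `intervals` is ordered most recent first. The newest interval gets
    weight 1, the next one `decay`, the next `decay ** 2`, and so on.
    Normalizing by the sum of weights is an assumption.
    """
    if not intervals:
        return None
    weights = [decay ** i for i in range(len(intervals))]
    weighted_sum = sum(w * x for w, x in zip(weights, intervals))
    return weighted_sum / sum(weights)


def activity_score(total_revisions, intervals, decay=0.5):
    """Total revisions divided by the weighted average interval."""
    average = weighted_average_interval(intervals, decay)
    if not average:
        return 0.0  # no intervals yet (fewer than two revisions)
    return total_revisions / average


# Two intervals of one day each, as in the example above.
one_day = 86400
average = weighted_average_interval([one_day, one_day])
score = activity_score(3, [one_day, one_day])
```

With two equal one-day intervals the weighted average is still one day. The weighting only starts to matter when recent intervals differ from older ones, in which case the recent ones dominate.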

Does that make sense? Do you want me to post a simple computed revision history example?

Also, this doesn't have to be a cron. As long as we store the time between one revision and the previous one on each revision save, the code just needs to apply a simple computation to each revision when it does the indexing on revision save.
Target Milestone: Builder 0.9.12 → Builder 0.9.11
We probably want to use an exponentially-weighted moving average: http://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
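For illustration, an exponentially-weighted moving average can be updated incrementally, so only the previous average needs to be stored between saves. The sketch below uses the standard EMA recurrence; the smoothing factor `alpha` (0.3 here) and the function name `update_ema` are assumptions, not values taken from this bug.

```python
def update_ema(previous_ema, new_interval, alpha=0.3):
    """Return the updated exponential moving average of revision intervals.

    `previous_ema` is the stored average (None before the first interval),
    `new_interval` is the seconds since the previous revision, and `alpha`
    in (0, 1] controls how quickly older intervals stop mattering.
    """
    if previous_ema is None:
        return float(new_interval)
    return alpha * new_interval + (1 - alpha) * previous_ema


# Fold a made-up history of intervals, oldest first, into one value.
ema = None
for interval in [7 * 86400, 2 * 86400, 86400]:
    ema = update_ema(ema, interval)
```

A larger `alpha` makes the average track the most recent intervals more closely, which corresponds to activity "going stale" faster.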
We just pushed 0.9.11, so aiming this for the next push (0.9.12)
Target Milestone: Builder 0.9.11 → Builder 0.9.12
Regarding being a cron, we could of course update the data on each save. But after an add-on has been saved for the last time, its activity would grow stale as time goes on. Without a cron, however, that data wouldn't get updated until the add-on was saved again.
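To make the staleness problem concrete: a score computed only on save stays frozen, while a time-aware score has to be recomputed from the idle time, either by a nightly job or by a function evaluated at query time. The sketch below is purely hypothetical. The half-life decay, the 30-day default and the name `score_at` are assumptions, not what landed in FlightDeck.

```python
SECONDS_PER_DAY = 86400


def score_at(stored_score, seconds_idle, half_life_days=30):
    """Decay a score stored at save time by the time elapsed since.

    The score halves every `half_life_days` days without a new revision.
    """
    half_lives = seconds_idle / (half_life_days * SECONDS_PER_DAY)
    return stored_score * 0.5 ** half_lives


stored = 10.0  # made-up score written to the index at the last save
fresh = score_at(stored, 0)
after_a_month = score_at(stored, 30 * SECONDS_PER_DAY)
after_two_months = score_at(stored, 60 * SECONDS_PER_DAY)
```

Without a re-index or a query-time function, the index would keep reporting the stored score no matter how long the add-on sits untouched.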

We could consider a moving average. However, how is this data useful past a month or so?
landed in master https://github.com/mozilla/FlightDeck/commit/c943907515e4795650ce5ec62bc44316947692f8
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: addons.mozilla.org → addons.mozilla.org Graveyard