Closed Bug 1161858 Opened 9 years ago Closed 9 years ago

Adjust "New" and "Popular" algorithms to return a mix of apps and sites

Categories

(Marketplace Graveyard :: API, enhancement, P1)

Tracking

(Not tracked)

VERIFIED FIXED
2015-06-30

People

(Reporter: ddurst, Assigned: mat)

Because sites would overwhelm "New" and apps would overwhelm "Popular," we need to tweak the algorithms that drive those pages to deliver a (sensible) balance of sites and apps.
Severity: normal → enhancement
Priority: -- → P1
Blocks: 1165024
No longer blocks: 1165024
See Also: → 1166490
See Also: → 1167245
Assignee: nobody → mpillard
A proposal:

Assuming we are importing 1000 websites (sorted by rank from the CSV):
- Take the 1000 most popular apps
- For each website, steal the global popularity of the app in the same position, minus one. So if the most popular app has a "popularity" value of 9999 in region 0, the first website will get 9998 "popularity". *Also add that same value for restofworld*. If the second most popular app has 6666 popularity, the second website will get 6665, and so on.

- Problem:
  - The site won't be marked as popular in mature regions. Should we cheat and use the global score for popularity for websites at the moment, and only switch to the regional ones when we have enough data?

    Note that since we're adding our fake popularity in restofworld, when implementing bug 1166490, we probably don't want to overwrite data for restofworld.
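The popularity step of the proposal could be sketched roughly as follows. This is only an illustration, not the script that was eventually merged: the `Website` class, the `"global"` and `"restofworld"` dictionary keys, and the function name are all hypothetical stand-ins for whatever the real models use.

```python
from dataclasses import dataclass, field

# Hypothetical region keys; the real models may name these differently.
GLOBAL = "global"
RESTOFWORLD = "restofworld"


@dataclass
class Website:
    name: str
    # Maps a region key to a popularity value.
    popularity: dict = field(default_factory=dict)


def assign_fake_popularity(websites, app_global_popularity):
    """Give each website the popularity of the app at the same rank, minus one.

    `websites` must already be sorted by CSV rank (best first).
    `app_global_popularity` is a list of global app popularity values in any order.
    """
    # Keep only as many top apps as there are websites.
    top_apps = sorted(app_global_popularity, reverse=True)[:len(websites)]
    # zip() stops at the shorter list, so websites beyond the number of
    # available apps are left untouched.
    for site, app_score in zip(websites, top_apps):
        fake = app_score - 1
        site.popularity[GLOBAL] = fake
        # Also store the same value for restofworld, as proposed above.
        site.popularity[RESTOFWORLD] = fake
    return websites


sites = [Website("first-site"), Website("second-site")]
assign_fake_popularity(sites, [6666, 9999])
# first-site gets 9998 and second-site gets 6665, matching the example above.
```

Subtracting one keeps each website just below the app whose score it borrows, so a popularity sort lists an app, then a website, then the next app, and so on.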

We can then apply a similar hack to the last_updated field to get a nice distribution of websites in the "New" page.
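The details of the last_updated hack are not spelled out here. One possible reading, sketched below purely as an assumption, is to give each website the timestamp of the app at the same position in newest-first order, minus one second. The function name and the plain name-to-timestamp mapping are hypothetical.

```python
from datetime import datetime, timedelta


def assign_fake_last_updated(website_names, app_last_updated):
    """Map each website name to a borrowed, slightly older timestamp.

    `website_names` is in the order the websites should appear.
    `app_last_updated` is a list of datetime values for apps, in any order.
    """
    newest_first = sorted(app_last_updated, reverse=True)
    # zip() stops at the shorter input; websites beyond the number of apps
    # are simply not included in the result.
    return {
        name: stamp - timedelta(seconds=1)
        for name, stamp in zip(website_names, newest_first)
    }


fake_dates = assign_fake_last_updated(
    ["first-site", "second-site"],
    [datetime(2015, 6, 1, 12, 0, 0), datetime(2015, 6, 2, 12, 0, 0)],
)
# first-site is placed one second before the newest app,
# second-site one second before the next newest.
```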
The only issue I see with that is that it would result in a forced order of alternating apps & sites.

(The only thing I don't follow is the "Also add that same value for restofworld")

Agree on the problem: we should stick with 'global' data until we have "enough" data on a region (and until we've defined what that threshold is).

Adding Rob due to impact on 1166490. Rob, thoughts?
Flags: needinfo?(robhudson.mozbugs)
I struggle with this because we're trying to display apples and oranges together, with a sort that ends up being a bit arbitrary b/c popularity for apps and popularity for sites are derived from different things.

Even when we start getting data flowing in, the scales will be different, and we will have to continuously monitor and adjust the true values to get the sorting we expect.

Since they are different, why not show them separately but on the same page? Something like one column with the top apps and another column with the top sites? (Or some other idea that doesn't require munging the true underlying popularity data?)
Flags: needinfo?(robhudson.mozbugs)
Yeah, it's apples and oranges, but ... there's not a great solution for it.

This was documented to some degree by UX: https://docs.google.com/document/d/1mZTXgMQdyIRHRUgG2DcZqdw7N-OffyDDMZsOPEtJ0ME/edit

So we could segment the page, which gets weird with pagination (and tablet). We could forcibly intermix them, like mat suggests.

I agree that actually CHANGING the data that affects popularity and newness seems like a bad idea -- but even once we start getting data, we're loading way more sites than we're going to have enough data for anytime soon. So whatever we do now will have to suffice until real data can take its place -- or get replaced by real data, when it exists, per app.

I have just scared myself.
(In reply to David Durst [:ddurst] from comment #4)
> Yeah, it's apples and oranges, but ... there's not a great solution for it.
> 
> This was documented to some degree by UX:
> https://docs.google.com/document/d/1mZTXgMQdyIRHRUgG2DcZqdw7N-OffyDDMZsOPEtJ0ME/edit

I hadn't seen that doc before. Could we link to these docs if they exist when a bug is created?

> I agree that actually CHANGING the data that affects popularity and newness
> seems like a bad idea -- but even once we start getting data, we're loading
> way more sites than we're going to have enough data for anytime soon. So
> whatever we do now will have to suffice until real data can take its place
> -- or get replaced by real data, when it exists, per app.

Coding a fake popularity until we have real data sounds like some ugly code, and not true to the definition of what should be on this page. Having no websites under popular b/c we have no data seems correct to me. Those websites will be listed under "New" and will also show up when searching by a search term. When data comes in and a website truly is popular, it shows up.

Once we have some data we could probably normalize both scales to the max of one or the other. E.g., if the most popular app is installed 100 times and the most popular website is opened 500 times, we multiply all app popularity scores by 5 so they are on the same scale. Since we process these as a batch once per day, that seems possible.
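A rough sketch of that normalization, assuming scores are available as plain name-to-number dictionaries (the function name and data shapes are hypothetical). It rescales whichever side has the smaller maximum up to the larger one, which is one way to read "the max of one or the other":

```python
def normalize_to_common_scale(app_scores, site_scores):
    """Rescale both score sets so their maximum values are equal.

    Returns new dictionaries; the inputs are not modified.
    """
    app_max = max(app_scores.values(), default=0)
    site_max = max(site_scores.values(), default=0)
    # With no data on one side there is nothing sensible to scale against.
    if app_max == 0 or site_max == 0:
        return dict(app_scores), dict(site_scores)
    target = max(app_max, site_max)
    scaled_apps = {k: v * target / app_max for k, v in app_scores.items()}
    scaled_sites = {k: v * target / site_max for k, v in site_scores.items()}
    return scaled_apps, scaled_sites


apps, sites = normalize_to_common_scale(
    {"top-app": 100, "other-app": 40},
    {"top-site": 500, "other-site": 250},
)
# App scores are multiplied by 5 (100 -> 500, 40 -> 200); site scores are unchanged.
```

Because the daily batch recomputes the maxima each run, the scale factor would drift as real data arrives, which is the monitoring concern raised earlier in this bug.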

Another option is to disable the popularity nightly task (just for websites) until we have data, and include a popularity value in the website import. Until we enable the task to update this value from the data coming in from GA, we use the provided popularity. It remains static until we flip the switch for the nightly task for websites. This doesn't provide the normalizing we could get once we have real data, but maybe that's OK in the interim?

Other ideas?
> Another option is to disable the popularity nightly task (just for websites) until we have data, and include a popularity value in the website import. Until we enable the task to update this value from the data coming in from GA, we use the provided popularity. It remains static until we flip the switch for the nightly task for websites. This doesn't provide the normalizing we could get once we have real data, but maybe that's OK in the interim?

That's what I was suggesting in comment 1, sorry if I wasn't clear. Store a fake global popularity for each website (that gives us a fake, but good enough, mix of websites and apps in /popular), and only switch to the real popularity (normalizing as you propose) when we have enough data.

A similar hack needs to be applied to /new too, otherwise /new would only contain websites at launch.
Script merged in https://github.com/mozilla/zamboni/commit/72609babca8dc98bea32b0db884d3656d92f72a6

Since we have new data to import, I think we'll do both at the same time on altdev. This will need to be monitored a bit so it's probably going to be during the Whistler Work Week.
Status: NEW → ASSIGNED
QA: use altdev to verify this bug. The idea is that popular and new pages, when viewed on mobile with the filter set to "all content", should have a mix of websites and apps shown.

https://marketplace-altdev.allizom.org/popular
https://marketplace-altdev.allizom.org/new
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2015-06-30
Verified as fixed in marketplace-altdev.allizom.org.
On both the popular and new pages, websites and apps are displayed: http://screencast.com/t/EHcwPTwMnx http://screencast.com/t/YWRvhmRCPsY4
Closing bug.
Status: RESOLVED → VERIFIED