Bug 648752 - Add-ons causing higher real-world slowdown than measured on Talos need to be recognized

Status: RESOLVED WONTFIX (opened 13 years ago, closed 12 years ago)
Product: addons.mozilla.org Graveyard
Component: Administration
Type: defect
Priority: Not set
Severity: normal
Target Milestone: 4.x (triaged)
Reporter: jwkbugzilla
Assignee: Unassigned
Whiteboard: [ddn]

There are a number of add-ons that slow down Firefox startup significantly more in real-world scenarios than when tested with a clean state on Talos. The current approach is apparently to pick out some add-ons and make sure they are tested with a non-clean state (bug 639898, bug 647561). However, if this is being done, it needs to be done for all add-ons - otherwise it will give some add-ons an unfair advantage. I don't like the idea that tomorrow users will prefer another ad-blocking extension to Adblock Plus because its slowdown numbers on AMO are significantly lower - and that's just because Adblock Plus is tested with filters and the other extension isn't. Note also that it is not always obvious that an extension performs worse with a non-clean state.

I don't think that the current manual approach is viable. Asking users to tell you which add-ons require special activation is great but won't give you all of them. Add-on developers, on the other hand, have little incentive to turn themselves in. The only realistic approach I can think of is using the ping data to check whether the real-world data is significantly different from the measured data for an add-on. One option would be to take all the pings where the user has a given add-on installed and check how much they deviate from the overall average for users with the same number of add-ons. With a large enough sample, the average deviation should give you some idea of the real-world impact of an add-on.
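
To make this concrete, here is a minimal sketch of that deviation calculation. The ping format (a startup time in milliseconds plus the set of installed add-on IDs) and the function name are made-up stand-ins for whatever the real ping data actually looks like:

    from collections import defaultdict

    def average_deviation_per_addon(pings):
        """pings: iterable of (startup_ms, set_of_addon_ids) tuples."""
        pings = list(pings)

        # Baseline: mean startup time for users with the same number
        # of installed add-ons.
        sums = defaultdict(float)
        counts = defaultdict(int)
        for startup_ms, addons in pings:
            sums[len(addons)] += startup_ms
            counts[len(addons)] += 1
        baseline = {n: sums[n] / counts[n] for n in sums}

        # For each add-on, average how far the pings containing it
        # deviate from the baseline for that ping's add-on count.
        dev_sums = defaultdict(float)
        dev_counts = defaultdict(int)
        for startup_ms, addons in pings:
            delta = startup_ms - baseline[len(addons)]
            for addon_id in addons:
                dev_sums[addon_id] += delta
                dev_counts[addon_id] += 1
        return {a: dev_sums[a] / dev_counts[a] for a in dev_sums}

An add-on with a large average deviation here but a small Talos number would be an obvious candidate for re-testing with a non-clean profile.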

There might be more precise ways to do this of course, I am not a statistician. But there needs to be a fallback plan because right now gaming the system is way too easy - and you are working on giving add-on developers incentives to do it.

Comment 1

I'm against making the tests less accurate for the sake of fairness.

Our goal is to make the Talos tests as close to real-world usage as is viable, and we will try to set up the right preferences/data necessary to test add-ons in a realistic manner. It certainly won't scale to all add-ons, but we should be able to cover at least the ones that have the majority of usage.

We will definitely monitor the new usage data we receive to identify add-ons misrepresented on Talos tests. When possible, we will adjust the tests.
Target Milestone: --- → 4.x (triaged)

Comment 2

I don't object to realistic testing of Adblock Plus & Co.; in fact, I asked about that myself. What I'm asking for is a definitive and reliable way to recognize the add-ons with incorrect performance measurement results.

> We will definitely monitor the new usage data we receive to identify add-ons
> misrepresented on Talos tests.

How exactly? Is it going to be a manual process, based on suspicion? Or an automated plausibility test for all top add-ons?

Comment 3

(In reply to comment #2)
> I don't object to a realistic testing of Adblock Plus & Co., in fact I asked
> about that myself. I ask for a definitive and reliable solution to recognize
> the add-ons with incorrect performance measurement results.

I don't think it's possible to do "definitive", considering new add-ons are submitted and modified all the time. If we can't have a perfect system, I'd rather have a decent one, even if it is unfair.

This is not about comparing add-ons with each other; it's about making users aware of the possible causes of significant slowdowns in Firefox, and making developers aware of their add-ons' performance, hopefully guiding them in the right direction toward fixing any problems.

> > We will definitely monitor the new usage data we receive to identify add-ons
> > misrepresented on Talos tests.
> 
> How exactly? Is it going to be a manual process, on base of suspicion? Or an
> automated plausibility test for all top add-ons?

The former, for now. If you're willing to help on automating this process, we'll be happy to collaborate with you.

Comment 4

(In reply to comment #3)
> This is not about comparing add-ons with each other

It is - for the users. Intended or not, the users are already using these numbers to compare add-ons.

Comment 5

(In reply to comment #3)
> This is not about comparing add-ons with each other, it's about making users
> aware of the possible cause of significant slowdowns in Firefox

I know Wladimir already replied, but this seems a silly assertion to make. If this is not about comparing add-ons with each other, then why is the main feature of the campaign a page which shows the add-ons compared to each other in a graph, with a ranking given to each of them?

The page says it lists the slowest add-ons, implies that those add-ons can make it difficult to use Firefox, and lists them in ranked order. Those are pretty definitive statements. If it's not possible to "do definitive", that's fine, but then the page shouldn't imply that it is.

Comment 6

(In reply to comment #5)
> I know Wladimir already replied, but this seems a silly assertion to make. If
> this is not about comparing add-ons with each other, then why is the main
> feature of the campaign a page which shows the add-ons compared to each other
> in a graph, with a ranking given to each of them?

That's not the main feature of the campaign. The main feature of the campaign is making users and developers more aware of start-up performance, as described in the announcement post (http://blog.mozilla.com/addons/2011/04/01/improving-add-on-performance/). The performance page is intended for users to be able to diagnose why they might be having problems.

As Jorge said, we have the real-world performance data and are working on improving our reports and investigating the worst offenders. The way it will work is that we'll look at the add-ons with the biggest impact on start-up time and run our own tests on them to confirm that we get similar results, or find out what settings those users have that trigger the results. We'll then start the outreach and performance warnings just as we do for other add-ons.
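
As a rough illustration of what such a plausibility check could look like (the input format and the cut-offs below are illustrative assumptions, not actual AMO infrastructure):

    def flag_for_retest(talos_pct, realworld_pct, factor=2.0, floor=5.0):
        """Flag add-ons whose real-world slowdown far exceeds the Talos number.

        talos_pct, realworld_pct: dicts mapping addon_id -> startup
        slowdown in percent over a bare profile.
        """
        flagged = []
        for addon_id, measured in talos_pct.items():
            observed = realworld_pct.get(addon_id)
            if observed is None:
                continue  # no usage data yet for this add-on
            # Ignore small slowdowns, and only flag real-world numbers
            # that are clearly larger than what Talos measured.
            if observed >= floor and observed > factor * max(measured, 1.0):
                flagged.append(addon_id)
        return flagged

Anything flagged would then go through the manual re-test with a non-clean profile described above.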

Comment 7

There are some add-ons for which this type of performance testing is just not appropriate:

- those that use outside servers (like that little widget that sits in the corner and reports the weather); Talos runs proxied to localhost, so we can't get a reasonable performance metric here
- those that require some sort of user interaction (clicking a menu bar, or a button to generate some result); we don't currently have a means to send clicks to the browser
- add-ons that require user configuration of some sort to do anything
- etc. (I'm sure there are more examples)

Putting together the initial round of ts tests was an exercise to see if any sort of automated performance testing of add-ons was possible. I don't believe that it covers all the add-ons currently being tested in a 'fair' manner. We are trying to get the appropriate prefs/config set for as many add-ons as we can, so as to make the testing that we do have more accurate.
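
For example, seeding a non-clean profile before a startup run might look something like this (a hypothetical sketch; the file layout and the Python harness are made up, not how Talos is actually configured):

    import json
    import os
    import shutil

    def seed_profile(profile_dir, prefs, data_files):
        """Seed a test profile with prefs and pre-populated data files.

        prefs: dict of pref name -> value, written to user.js.
        data_files: dict of profile-relative path -> source file, e.g. a
        pre-populated Adblock Plus patterns.ini with real filter lists.
        """
        os.makedirs(profile_dir, exist_ok=True)
        with open(os.path.join(profile_dir, "user.js"), "w") as f:
            for name, value in prefs.items():
                # json.dumps renders booleans, numbers and strings as
                # valid prefs literals (true/false, 42, "text").
                f.write("user_pref(%s, %s);\n"
                        % (json.dumps(name), json.dumps(value)))
        for rel_path, src in data_files.items():
            dest = os.path.join(profile_dir, rel_path)
            parent = os.path.dirname(dest)
            if parent:
                os.makedirs(parent, exist_ok=True)
            shutil.copyfile(src, dest)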

I see this as generating a very raw idea of performance that we can then work with. Really, I very much want to work with add-on community members to refine and improve the system, along with identifying the pitfalls of assigning a 'performance number' based on an automated result.
Whiteboard: [ddn]
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Product: addons.mozilla.org → addons.mozilla.org Graveyard