Closed Bug 735486 Opened 12 years ago Closed 12 years ago

Establish a date for soft-blocking McAfee Site Advisor 3.4.1.195

Categories

(Toolkit :: Blocklist Policy Requests, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: justin.lebar+bug, Unassigned)

References

Details

I think we should establish a date after which McAfee Site Advisor 3.4.1.195 will be soft-blocked.  This add-on leaks compartments; see bug 729608.

To be clear, I am not advocating a hard-block of this add-on.  Users will still be able to use it, by clicking through a dialog.

No progress has been made on the bug in the past two weeks (bug 729608 comment 22).

At the moment, bug 729608 does not appear to me to be a bug in Firefox.  Even if it is, it appears that we'd need the assistance of the McAfee folks to track down the issue, so we're gated on McAfee either way.

Those who work with partners have suggested that we can get bugs such as this fixed in a timely manner.  If we're confident in this, then we should be able to easily agree on a date here, because if we chose a reasonable date, the add-on will be fixed by then and we won't have to soft-block it.

I propose that we soft-block the add-on on March 23, which is one month after bug 729608 was filed.

If a fix is pushed before March 23, I propose we soft-block previous, known-leaky versions on March 31.

Do we agree that these dates are reasonable?
Blocks: 729608
Whiteboard: [MemShrink]
Whiteboard: [MemShrink]
I disagree with blocking the add-on unless the leak is severe.
I agree with Justin Lebar. We should have a timeline for these things. It is completely reasonable to expect an add-on author to fix serious issues in weeks not months. I think two from notification to the vendor when we get no progress from the vendor is reasonable. If the vendor says they need additional time, we could extend it a week or two.
In the MemShrink project, we generally consider any content compartment leak severe.

A content compartment can take up an arbitrary amount of memory -- for example, tbpl is taking up 170mb, and Gmail is taking up 60mb.   I've seen Facebook and Google Reader compartments in the hundreds of megabytes.

Moreover, these compartment leaks often lead to long GC and CC times.  The GC and CC are currently in a state of flux, but I've observed 2s GC pauses in nightly builds (e.g. bug 734408) apparently due only to leaked compartments.  Previous iterations of the CC (such as, I suspect, what currently lives in the release channel) can, likewise, run very slowly in the presence of a large zombie compartment.

The whole browser freezes when we have a long GC/CC pause; it's quite bad.  I can't prove that Site Advisor is causing those kinds of freezes (they're in general very difficult to reproduce), but I don't think that's the right bar to set.   The fact that we know there are zombie compartments, which is a severe memory leak, and that zombies are leaked to severe, persistent jank is enough for me to consider this bug to be severe.
(In reply to Asa Dotzler [:asa] from comment #2)
> I agree with Justin Lebar. We should have a timeline for these things. It is
> completely reasonable to expect an add-on author to fix serious issues in
> weeks not months.

But we haven't established this is a serious issue yet.

Calling all memory leaks equally severe is certainly wrong, given that most of them go completely unnoticed and don't affect performance in any visible way. The first leak that they fixed was certainly severe, in that all visited pages were leaking. If the remaining leak is similarly bad, I'm OK considering this approach. If not, I don't think it warrants annoying millions of users.
(In reply to Jorge Villalobos [:jorgev] from comment #4)
> (In reply to Asa Dotzler [:asa] from comment #2)
> > I agree with Justin Lebar. We should have a timeline for these things. It is
> > completely reasonable to expect an add-on author to fix serious issues in
> > weeks not months.
> 
> But we haven't established this is a serious issue yet.

Compartment leaks are severe. 
 
> Calling all memory leaks equally severe is certainly wrong, given that most

All memory leaks are not equally severe. Compartment leaks, by definition, IMO, are severe though. 

> of them go completely unnoticed and don't affect performance in any visible
> way. The first leak that they fixed was certainly severe, in that all
> visited pages were leaking. If the remaining leak is similarly bad, I'm OK
> considering this approach. If not, I don't think it warrants annoying
> millions of users.

I don't think you have to leak all pages to be considered severe. That's not severe, that's outrageous. 

As for annoying millions of users, millions of users are annoyed when their browser uses more memory than they want and when their UI gets all janky because of GC and CC pauses that result from compartment leaks.
(In reply to Asa Dotzler [:asa] from comment #5)
> Compartment leaks are severe. 

So every dependency of bug 700547 is a severe problem? Should we consider blocking all of those add-ons if they aren't fixed promptly?

I strongly disagree. I'd like to see anyone trying to reproduce a poor experience out of any of them, except a couple of the most serious ones.

Or maybe we have different ideas of what a zombie compartment leak is, because the way I see them, in the majority of cases this is not a problem the user will ever know about. njn, can you give some input on this?

> I don't think you have to leak all pages to be considered severe. That's not
> severe, that's outrageous.

That I agree with. Though it doesn't help demonstrate the severity of this particular compartment leak.

> As for annoying millions of users, millions of users are annoyed when their
> browser uses more memory than they want and when their UI gets all janky
> because of GC and CC pauses that result from compartment leaks.

Do you think automatically disabling their security add-ons is any better? And that's even assuming they *are* having these performance problems, which as far as I know is unproven in this case.
(In reply to Jorge Villalobos [:jorgev] from comment #6)
> (In reply to Asa Dotzler [:asa] from comment #5)
> > Compartment leaks are severe. 
> 
> So every dependency of bug 700547 is a severe problem? Should we consider
> blocking all of those add-ons if they aren't fixed promptly?

If we've contacted them and they've not done anything to address the problem in a week or three? Yes. 

> > As for annoying millions of users, millions of users are annoyed when their
> > browser uses more memory than they want and when their UI gets all janky
> > because of GC and CC pauses that result from compartment leaks.
> 
> Do you think automatically disabling their security add-ons is any better?
> And that's even assuming they *are* having these performance problems, which
> as far as I know is unproven in this case.


Yes. I think disabling their security add-on is better. We must eventually draw that line in the sand and I think now is better than later. Shall we let them accrue more millions of users and only then take action when an even larger group is affected? I say no. I say it's time to put our foot down and say "you don't get to screw over millions of Firefox users with bad experience". Add-on authors need to get with the program, and fix the serious shortcomings we find (we're already doing more than we should have to in diagnosing their problems for them) of find a different platform to abuse. 

I know it's your job to keep add-on authors happy, but Mozilla's mission isn't about add-on authors. It's about users and web developers. Add-on authors are a nice to have but I have no problem saying "you must be this tall to ride" and discarding the ones that don't make the cut. I think it's long past time that we start doing that and stop coddling these giant business who are harming our users.  

I think it's especially easy in cases because the add-on is mostly scareware and does almost nothing to add to the security of the Firefox experience. We already have built in capabilities that mirror most of what this add-on is doing and we don't accept serious leaks in our own code that performs those functions.
(In reply to Asa Dotzler [:asa] from comment #7)
> I know it's your job to keep add-on authors happy

That is most certainly not my job.
(In reply to Jorge Villalobos [:jorgev] from comment #6)
>
> Or maybe we have different ideas of what a zombie compartment leak is,
> because the way I see them, in the majority of cases this is not a problem
> the user will ever know about. njn, can you give some input on this?

Most leaky add-ons create a single zombie compartment, usually when you do an action on a page.  If you do that action on another page, the zombie compartment is replaced with a new one.  SiteAdvisor 3.4.1.195 doesn't follow this pattern, it creates multiple zombie compartments.  So although it's much better than the old versions, it's still the second-leakiest add-on I've seen.

As for whether the user will notice:  that's a difficult question.  Memory consumption effects are deep into "YMMV" territory.  Zombie compartments will increase GC and CC times for everyone by some non-negligible amount;  how much depends on the mix of zombie and non-zombie compartments present.  For a smaller fraction of users, the presence of a zombie compartment could cause Firefox's working set to exceed physical memory size, whereupon paging will start and performance will fall off a cliff.

In summary, there's a definite non-zero effect, but quantifying it is nigh on impossible.

I totally feel Justin's frustation, because I feel it too.  Firefox is engineered to a particular standard, but add-ons are not.  And we only have one stick for punishing non-AMO add-ons -- blocklisting.  But it's such a severe stick that we're *really* reluctant to use it.  As a result, authors of non-AMO add-ons have no incentive to fix problems unless they are catastrophic.
> And we only have one
> stick for punishing non-AMO add-ons -- blocklisting.  But it's such a severe
> stick that we're *really* reluctant to use it.

I and others have proposed a number of new sticks we could add to arsenal, and those cc'ed on this bug know how little fruit those proposals have borne.  So I'm trying to work with what we have.

I don't feel like the soft-block is particularly severe, all things considered.  But inasmuch as the tools we currently have are insufficient to address this problem, it's not for a lack of proposed new tools.

> So every dependency of bug 700547 is a severe problem? Should we consider
> blocking all of those add-ons if they aren't fixed promptly?

Well, not every dependency of bug 700547 is a content compartment leak, so no.  But you sniffed out my bigger goal, yes.  I think any add-on with a "severe" leak (*) should be soft-blocked if it is not fixed promptly.  Either that, or we should implement some other way of notifying the user that the add-on is bad.  I was hoping to use this bug to explore what's the right policy.

(*) The MemShrink team has triaged hundreds of leaks for upwards of nine months now, so I think we're in a good position to classify what "severe" means and to judge which kinds of memory leaks are likely to lead to performance problems.
(In reply to Asa Dotzler [:asa] from comment #7)
> I say it's time to put our foot
> down and say "you don't get to screw over millions of Firefox users with bad
> experience". Add-on authors need to get with the program, and fix the
> serious shortcomings we find (we're already doing more than we should have
> to in diagnosing their problems for them) of find a different platform to
> abuse. 
> 
> I know it's your job to keep add-on authors happy, but Mozilla's mission
> isn't about add-on authors. It's about users and web developers. Add-on
> authors are a nice to have but I have no problem saying "you must be this
> tall to ride" and discarding the ones that don't make the cut.

I disagree with almost every line of that, but sticking to this particular issue, I would be willing to block this add-on if this is a serious enough problem and the vendor is not showing progress towards addressing it in a reasonable timeframe.

The part that seems to be in dispute is whether this is a serious enough issue to warrant blocklisting. The MemShrink folks think it is, but their focus is on eliminating memory leaks and from their perspective this is one of the worst they've seen. I don't think we can argue whether the leak is a really bad memory leak -- if the MemShrink team says it is, it is. What I am more interested in is how bad is this really bad memory leak relative to the blocklist bar.

Jorge says impact is not very serious, and he has a great understanding of our blocklisting bar and what technical issues rise to the level of forcibly turning off someone else's software.

Nick helped me understand the user impact a bit more, and the takeaway I got from it is that it's possible a small fraction of users may notice some issues but the actual impact can't really be quantified.

When we aren't sure or an important aspect like that can't be quantified except to say it's a small fraction of users, I can't justify turning off someone else's software and would never err on the side of blocklisting.

Blocklisting is a great solution to suggest when you aren't responsible for the consequences, and I'm sure I'd do the same were my focus to eliminate memory leaks. But I don't see enough evidence of the severity of the impact to be okay with blocking.
(In reply to Justin Scott [:fligtar] from comment #11)
> When we aren't sure or an important aspect like that can't be quantified
> except to say it's a small fraction of users, I can't justify turning off
> someone else's software and would never err on the side of blocklisting.

This is our software. We have every right to defend it. If we won't block for serious leaks, then we need a policy or tech that prevents software with serious leaks from getting installed in the first place. Not doing one of those two things is simply unacceptable. We're spending hundreds of man hours chasing down and eliminating leaks and making other memory improvements in our software only to have that work undone by irresponsible and unresponsive third parties. This cannot continue. 

> Blocklisting is a great solution to suggest when you aren't responsible for
> the consequences, and I'm sure I'd do the same were my focus to eliminate
> memory leaks. 

*It's all of our focus to eliminating memory leaks*. This is a top-line priority for Firefox in 2012 (and was in 2011). This is as important as eliminating crashes and security holes. It's well above most user-facing Firefox features in priority on our roadmap for this year. We cannot continue to let our efforts in this critical area be undermined by third parties. That's no longer acceptable.

If you don't think Firefox should be focused on eliminating memory leaks, we can have that discussion and you can try to change my mind and the current plan of record for Firefox priorities for 2012 (though I think a better time for that would have been months ago when we were committing to the now published roadmap which very much includes improving memory management,) but I don't think you can just dismiss this as if it was just one random engineer's focus.
(In reply to Justin Scott [:fligtar] from comment #11)
> 
> Nick helped me understand the user impact a bit more, and the takeaway I got
> from it is that it's possible a small fraction of users may notice some
> issues but the actual impact can't really be quantified.

Thinking about it some more, a good summary is this: the performance impact will range from negligible to catastrophic.  Where any individual user lies on that spectrum depends unpredictably on their machine configuration and browsing patterns.

Another factor is that people judge Firefox on its memory consumption as measured by utilities like the Task Manager.  So even if a 20MB leak has a negligible performance effect for a particular user, the leak may cause a negative reaction.
> Blocklisting is a great solution to suggest when you aren't responsible for the consequences, and 
> I'm sure I'd do the same were my focus to eliminate memory leaks. But I don't see enough evidence of 
> the severity of the impact to be okay with blocking.

Speaking only for myself: If you file a bug on doing something less drastic than a soft block but which still notifies users that the add-on is problematic and gives them an option to disable it, I would be happy to resolve this bug as wontfix.
(In reply to Justin Lebar [:jlebar] from comment #14)
> Speaking only for myself: If you file a bug on doing something less drastic
> than a soft block but which still notifies users that the add-on is
> problematic and gives them an option to disable it, I would be happy to
> resolve this bug as wontfix.

That speaks to one of the roots of this problem, which is that we don't have a reasonable in-between solution. Right now we're pretty much limited by either blocklisting or not blocklisting, and we set a high bar because we the blocklist was designed more as an emergency system rather than a notification system. This is one of the things I want to discuss soon with the AOM team as one of many solutions to address the add-on problems that have recently popped up.

Going back on topic, there's no middle ground at the moment, and according to McAfee bug 735486 has been resolved on their side. Once the fix is verified and deployed, we will follow a similar blocklisting process as in bug 729614.
> and according to McAfee bug 735486 has been resolved on their side.

I would still like to establish a date beyond which we think it is unacceptable to wait for a fix, if you don't mind.  We have no guarantee that the bug is actually resolved or that it will actually go out in early April.

In the meantime, we can work on an alternative means of notifying users that the add-on is buggy.
(In reply to Justin Lebar [:jlebar] from comment #14)
> Speaking only for myself: If you file a bug on doing something less drastic
> than a soft block but which still notifies users that the add-on is
> problematic and gives them an option to disable it, I would be happy to
> resolve this bug as wontfix.

I'm currently in data gathering mode, where I'm listening to everyone's comments about what is broken about the current add-on installation/blocklisting/ecosystem models. Many people are throwing out solutions to their specific problems which range from not allowing unreviewed add-ons to be installed in Firefox to publishing some guidelines we expect all add-ons, even non-hosted ones, to follow.

When I propose something it will try to address the larger root issues I see and hear about.
> When I propose something it will try to address the larger root issues I see and hear about.

Roughly when should we expect this proposal?
> I'm currently in data gathering mode, where I'm listening to everyone's
> comments about what is broken about the current add-on
> installation/blocklisting/ecosystem models. Many people are throwing out
> solutions to their specific problems which range from not allowing
> unreviewed add-ons to be installed in Firefox to publishing some guidelines
> we expect all add-ons, even non-hosted ones, to follow.

We're on the same wavelength -- I've been working on a summary of all the ideas that have been floating around :)

I just finished it and posted a link to dev-platform:  https://groups.google.com/forum/?fromgroups#!topic/mozilla.dev.platform/_wwnIkRjUPc.  Hopefully that'll help you.
(In reply to Justin Scott [:fligtar] from comment #17)
> 
> I'm currently in data gathering mode, where I'm listening to everyone's
> comments about what is broken about the current add-on
> installation/blocklisting/ecosystem models. Many people are throwing out
> solutions to their specific problems which range from not allowing
> unreviewed add-ons to be installed in Firefox to publishing some guidelines
> we expect all add-ons, even non-hosted ones, to follow.
> 
> When I propose something it will try to address the larger root issues I see
> and hear about.

Justin, I've heard that you don't have time to work on this.  Is that right?
Bug 729608 has been resolved. However, this was not a new full version release, but a 'content update', which is just a partial patch. Given that their uptake is fairly quick and now we have some 3.4.1.195 that leak and some 3.4.1.195 that don't, I don't think it's a good idea to block this version anymore.
Can you remind me how fast the uptake was the last time McAfee updated things?
Bug 729614 comment #2 says it was 90% uptake in one week.
Calling this resolved per comment #21.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Product: addons.mozilla.org → Toolkit
You need to log in before you can comment on or make changes to this bug.