Closed Bug 1067664 Opened 6 years ago Closed 6 years ago

Allow different leak thresholds for different types of processes

Categories

(Testing :: Mochitest, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla35

People

(Reporter: mccr8, Assigned: mccr8)

References

Details

(Whiteboard: [MemShrink:P2])

Attachments

(3 files, 2 obsolete files)

Right now, there's a single leak threshold that applies to all kinds of processes.  We're probably going to want to set a different threshold for a tab process (probably nonzero...) versus the parent process (still 0, or whatever it is on B2G).

Ideally, leakThreshold would be changed from a single integer to a dictionary mapping process names to integers.

I guess command line options should be set up for this, though I'm not sure who even uses those.
Just put in sane defaults and things should be fine.
Whiteboard: [MemShrink]
Whiteboard: [MemShrink] → [MemShrink:P2]
Instead of a single leak threshold, we pass around a dict from processtype to threshold. This keeps the existing stuff for default processes. It also sets a reasonable threshold for where we are with tab processes, though note that right now we never actually create a log for the tab processes.
Attachment #8495380 - Flags: review?(jmaher)
This probably doesn't really need to be a separate patch, but now that we have a reasonable threshold in place for tab processes, remove the existing code that always ignores leaks for tab processes.
Attachment #8495382 - Flags: review?(jmaher)
Comment on attachment 8495380 [details] [diff] [review]
part 1 - Allow different leak thresholds for different types of processes.

Review of attachment 8495380 [details] [diff] [review]:
-----------------------------------------------------------------

the concept of default as a process type is confusing.  Also leaving the cli flag name the same could imply that we are setting the threshold for all processes.  can we change the cli name?

::: build/automationutils.py
@@ +336,5 @@
>      log.info("WARNING | leakcheck | refcount logging is off, so leaks can't be detected!")
>      return
>  
> +  for processType, leakThreshold in leakThresholds.iteritems():
> +    log.info("TEST-INFO | leakcheck | %s process: leak threshold set at %d bytes" % (processType, leakThreshold))

if this is not set, it won't print 0.  Could we put this elsewhere?

::: testing/mochitest/mochitest_options.py
@@ +222,5 @@
>          }],
>          [["--leak-threshold"],
>          { "action": "store",
>            "type": "int",
> +          "dest": "defaultLeakThreshold",

in code, default implies the fallback option.  are you implying the 'default process'?  if so, this couls be confusing.

@@ +599,5 @@
>                      self.error('Missing binary %s required for --use-test-media-devices')
>  
> +        options.leakThresholds = {}
> +        if options.defaultLeakThreshold != 0:
> +            options.leakThresholds["default"] = options.defaultLeakThreshold

why not set it to zero here?
Attachment #8495380 - Flags: review?(jmaher) → review-
Attachment #8495382 - Flags: review?(jmaher) → review+
(In reply to Joel Maher (:jmaher) from comment #6)
> the concept of default as a process type is confusing.
Yeah, it is, but I don't really want to make up a new name for this existing Gecko concept.  I'll try to be more explicit in naming things that it is the default process.

> Also leaving the cli flag name the same could imply that we are setting the threshold for all
> processes.  can we change the cli name?
Sure.  I can call it --default-process-leak-threshold.  Though that is still kind of confusing.

> if this is not set, it won't print 0.  Could we put this elsewhere?
I was just preserving the existing behavior, where it doesn't print out anything if the value is zero, and it only prints out the threshold once no matter how many logs there are.  Do you think it is better to print out the threshold when it is zero?

> in code, default implies the fallback option.  are you implying the 'default
> process'?  if so, this couls be confusing.
Yeah.  I could change it to defaultProcessLeakThreshold, though that is still kind of confusing. :-/

> @@ +599,5 @@
> >                      self.error('Missing binary %s required for --use-test-media-devices')
> >  
> > +        options.leakThresholds = {}
> > +        if options.defaultLeakThreshold != 0:
> > +            options.leakThresholds["default"] = options.defaultLeakThreshold
> 
> why not set it to zero here?

The existing behavior is to never print out the value when it is zero. If we maintain the invariant that we never store zero, then we don't need to check.  But maybe I should just check in automationutils and then store a zero or something...
Maybe some kind of phrasing like leakThresholdForDefaultProcess is less ambiguous, though getting long.
Attachment #8493386 - Attachment is obsolete: true
(In reply to Andrew McCreight [:mccr8] from comment #7)

> > if this is not set, it won't print 0.  Could we put this elsewhere?
> I was just preserving the existing behavior, where it doesn't print out
> anything if the value is zero, and it only prints out the threshold once no
> matter how many logs there are.  Do you think it is better to print out the
> threshold when it is zero?

I think it is more explicit

> 
> > in code, default implies the fallback option.  are you implying the 'default
> > process'?  if so, this couls be confusing.
> Yeah.  I could change it to defaultProcessLeakThreshold, though that is
> still kind of confusing. :-/

agreed, the help does specify this, where do we stop on making things non confusing. Lets leaving this alone for now.

> 
> > @@ +599,5 @@
> > >                      self.error('Missing binary %s required for --use-test-media-devices')
> > >  
> > > +        options.leakThresholds = {}
> > > +        if options.defaultLeakThreshold != 0:
> > > +            options.leakThresholds["default"] = options.defaultLeakThreshold
> > 
> > why not set it to zero here?
> 
> The existing behavior is to never print out the value when it is zero. If we
> maintain the invariant that we never store zero, then we don't need to
> check.  But maybe I should just check in automationutils and then store a
> zero or something...

accepting zero here makes the code cleaner
> I think it is more explicit
Sounds reasonable.  I was thinking I'd have to print it out once per log type, but I guess I can just hard code "default", "plugin", "tab" and "geckomediaplugin" and then assert if we see one that doesn't match that.  There's also ipdlunittest, but I don't think we use that in any place where we are doing leak checking.

http://mxr.mozilla.org/mozilla-central/source/xpcom/build/nsXULAppAPI.h#366
Comment on attachment 8495513 [details] [diff] [review]
part 1 - Allow different leak thresholds for different types of processes.

Review of attachment 8495513 [details] [diff] [review]:
-----------------------------------------------------------------

looks a lot cleaner
Attachment #8495513 - Flags: review?(jmaher) → review+
Thanks for the reviews, it improved the patch quite a bit.  Here's another try run just in case I messed something up: https://tbpl.mozilla.org/?tree=Try&rev=0e15550dfb01
Backed out while mccr8 investigates some Android issues that showed up in the logs.
https://hg.mozilla.org/integration/mozilla-inbound/rev/8bc6dfe614be
Retriggers appear to be pointing at this push as the cause of the below intermittent mochitest-e10s failures:
https://tbpl.mozilla.org/php/getParsedLog.php?id=48995549&tree=Mozilla-Inbound
I'll land this folded into part 1.  Android doesn't do leak checking, but we end up grabbing values off of options before we check that there's no leak log.

A try run confirmed that the Android test harness exceptions are fixed.  I also retriggered Linux64 opt e10s M3 a bunch of times and it didn't show the WebRTC errors, so with my Ozymandias hat on I'm going to blame gremlins.

try run: https://tbpl.mozilla.org/?tree=Try&rev=409ec0a94407
Attachment #8497615 - Flags: review?(jmaher)
Comment on attachment 8497615 [details] [diff] [review]
part 1b - Add a dummy option for android reftests.

Review of attachment 8497615 [details] [diff] [review]:
-----------------------------------------------------------------

nice
Attachment #8497615 - Flags: review?(jmaher) → review+
You need to log in before you can comment on or make changes to this bug.