Closed Bug 791404 Opened 7 years ago Closed 7 years ago

Raise the sample interval to 10ms on android/gonk

Categories

(Core :: Gecko Profiler, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla18

People

(Reporter: cjones, Assigned: cjones)

Details

Attachments

(1 file, 1 obsolete file)

dbaron found that the profiler results seemed skewed at 1ms sampling.  Bumping it up to 10ms.
Last of the gecko support for b2g.
Assignee: nobody → jones.chris.g
Attachment #661401 - Flags: review?(bgirard)
(In reply to Chris Jones [:cjones] [:warhammer] from comment #0)
> dbaron found that the profiler results seemed skewed at 1ms sampling. 
> Bumping it up to 10ms.

I did?  When?

(I remember being worried about the costs of label enter/exit, but I think that's a distinct problem.)
Huh, I'm not sure anymore.

Anyway, I still think we should do this because
 - 1000Hz sampling drops perf by 10fps on the b2g homescreen.  So big overhead.
 - b2g will be sampling up to 10 OS processes or so simultaneously
 - the profiling results (what was hottest) changed when going 1000Hz->100Hz.  I would tend to trust the 100Hz results more.
Comment on attachment 661401 [details] [diff] [review]
Sample at 10ms on gonk/android for more realistic results

This patch will change the sampling interval for eideticker tests. From my results the information gathered from these tests have been accurate (it has rediscovered bugs diagnosed through other means) therefore I'm not looking to change it on anecdotal evidence.

About the performance regression I rather spend time and understand what is causing the regression then drop a blind patch like this.

We need to understand if the regression is coming from the signal delivery or from the JS instrumentation which is an optional feature. Dropping the frequency will not reduce the JS instrumentation overhead.

These default values will only be important for startup profiling+eideticker once past+glandium's changes land which are nearly ready. Having these will make it easy to play with the interval/buffer size to find a configuration that works well with b2g.
Attachment #661401 - Flags: review?(bgirard) → review-
(In reply to Benoit Girard (:BenWa) from comment #4)
> Comment on attachment 661401 [details] [diff] [review]
> Sample at 10ms on gonk/android for more realistic results
> 
> This patch will change the sampling interval for eideticker tests. From my

Ah, yes; this needs to happen only when MEMORY_MAY_BE_TIGHT.  Thanks for the catch.

> results the information gathered from these tests have been accurate (it has
> rediscovered bugs diagnosed through other means) therefore I'm not looking
> to change it on anecdotal evidence.
> 

Those are strong and rather insulting terms you're bandying about, young man!  I gave you concrete evidence of why 1ms is causing issues.  Who has done that work for fennec-android?

I'm sorry, but trying to criticize this patch by generalizing from maybe-or-maybe-not concrete testing on high-end phones/dev boards with a bazillion cores and fat memory buses running *one* gecko process, to b2g where we have one low-end core and *many* gecko processes, is totally bogus.

> About the performance regression I rather spend time and understand what is
> causing the regression then drop a blind patch like this.
> 

Be my guest.  I'll see you next year when you get oprofile running ;).  In the meantime, 1ms sampling is a big perf hit in b2g, and has been shown to give different results than 10ms sampling.

I'm going to repost a new patch that makes this only kick into effect when we think we're on crappy HW, like the sample-buffer limiter.
Doesn't touch android default.
Attachment #661401 - Attachment is obsolete: true
Attachment #661995 - Flags: review?(bgirard)
(In reply to Chris Jones [:cjones] [:warhammer] from comment #5)
> > results the information gathered from these tests have been accurate (it has
> > rediscovered bugs diagnosed through other means) therefore I'm not looking
> > to change it on anecdotal evidence.
> > 
> 
> Those are strong and rather insulting terms you're bandying about, young
> man! 

I'm not trying to offend, I am simply seeking more evidence to support that dropping the sampling interval is the right fix. (see below)

> I gave you concrete evidence of why 1ms is causing issues.
> 

The evidence you discuss in Comment 3 shows that 1ms profiling isn't working properly but it doesn't explain why it is so slow.

> Who has
> done that work for fennec-android?

Myself, Ehsan, Julian, glandium have done some of the work and discussed profiling performance on low spec devices. Furthermore I've done analysis to compare the sampling accuracy and precision on all platforms. 

> I'm sorry, but trying to criticize this patch by generalizing from
> maybe-or-maybe-not concrete testing on high-end phones/dev boards with a
> bazillion cores and fat memory buses running *one* gecko process, to b2g
> where we have one low-end core and *many* gecko processes, is totally bogus.
> 

The reason I want further investigation is because the performance impact you discuss in Comment 3 is much higher then we have anticipated. To me the first indication is that sampling code is more expensive then it needs to be. I want to make sure that we understand where the performance is lower then we predicted rather then dialing back the frequency at the first sight of trouble. Even on a low end 600 Mhz processor, if we budget a 5% performance impact at 1ms that leaves us a budget of 30,000 cycles. Recording a basic profile with no features active means we only need to record about 200 bytes so that leaves us say 29,000 cycles for context switching and signal between the main thread, the timer thread, signal handler. I can't find any data on how much this could cost. My calculations here are likely wrong but that's why we need to investigate the practical overhead of sampling, recording the sample, the javascript instrumentation and the other pieces.

We're only trying to sample one thread on a phone running multiple process (b2g or fennec both) and the memory bandwidth required to save a 200 bytes of profile data with JS profiling disabled doesn't require a 'fat bus' or a 'bazillion cores'.

> > About the performance regression I rather spend time and understand what is
> > causing the regression then drop a blind patch like this.
> > 
> 
> Be my guest.  I'll see you next year when you get oprofile running ;).  In
> the meantime, 1ms sampling is a big perf hit in b2g, and has been shown to
> give different results than 10ms sampling.
> 

I'm not working on oprofile so I don't understand. Dropping the interval is a last ditch solution so let's make sure the profiler is performing correctly first. From the data you provide that isn't enough data to convince me that this is the case.
This most recent patch only changes the interval for b2g.  Is that an acceptable compromise?

I am trying very hard to get "a" profiler working for b2g because partners are requesting this.  oprofile is not an option at this point.
Attachment #661995 - Flags: review?(bgirard) → review+
https://hg.mozilla.org/mozilla-central/rev/4cf7fbb68148
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in before you can comment on or make changes to this bug.