Closed Bug 1097235 Opened 10 years ago Closed 6 years ago

[Loop ]Firefoxhello! has poor performance when browsing other apps while in a Loop call.

Categories

(Firefox OS Graveyard :: Gaia::Loop, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:-, b2g-v2.0 unaffected, b2g-v2.1 affected)

RESOLVED WONTFIX
blocking-b2g -
Tracking Status
b2g-v2.0 --- unaffected
b2g-v2.1 --- affected

People

(Reporter: jthomas, Unassigned)

References

(Depends on 1 open bug, )

Details

(Keywords: regression, Whiteboard: [2.1-exploratory-3])

Attachments

(5 files)

Description:
If the user attempts to browse other apps (such as Gallery and Videos) while in a loop call the device will have a very low performance rate.

Repro Steps:
1) Update a Flame to 20141111001201
2) Connect to another device with a Loop call.
3) While in the Loop call, browse other high performance apps such as Gallery and Videos.
4) Observe framerate.

Actual:
Framerate and performance drops substantially.

Expected:
It is expected that the user will be able to enter these other apps with ease, or not at all until closing the Loop app.

Device: Flame 2.1 (319mb)(Kitkat Base)(Shallow Flash)
BuildID: 20141111001201
Gaia: 7154f9aa0a69af56c8bd89be0c98d9791449527b
Gecko: 452db1a0c9a0
Version: 34.0 (2.1)
Firmware Version: v188-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0


Repro frequency:100%
See attached: Logcat & Video
Video: https://www.youtube.com/watch?edit=vd&v=IxZoMkjouJI
Currently unable to connect while in a Loop message for the Flame 2.2 KK (319mb)

Flame 2.2

Device: Flame 2.2 Master (319mb)(Kitkat Base)(Shallow Flash)
BuildID: 20141110040206
Gaia: 5f8206bab97cdd7b547cc2c8953cadb2a80a7e11
Gecko: d380166816dd
Version: 36.0a1 (2.2 Master)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0

------------------------------------------------------------------
This issue is NOT occuring on the Flame 2.0 KK (319mb).

Result: Performance ran a bit slower than normal while the Loop app was running in the background, but not nearly as slow as the Flame 2.1 KK (319mb)

Flame 2.0

Device: Flame 2.0 (319mb)(Kitkat Base)(Shallow Flash)
BuildID: 20141110000204
Gaia: d3e4da377ee448f9c25f908159480e867dfb13f3
Gecko: 7198906837e7
Version: 32.0 (2.0)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(dharris)
[Blocking Requested - why for this release]:

The performance drop when Loop is open on 2.1 is very severe. This is also a regression. Nominating to block on 2.1
blocking-b2g: --- → 2.1?
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(dharris)
Not sure if this is related to https://bugzilla.mozilla.org/show_bug.cgi?id=1095106, :jesup?

Also NI oteo, to see if they have been any other related changes causing this.
blocking-b2g: 2.1? → 2.1+
Flags: needinfo?(rjesup)
Flags: needinfo?(oteo)
John/Derek,

I spoke offline with Maire and she does not think any recent landings on the platform side may have caused this. She also mentioned it will be of great help if the QA can run "adb shell top -m 15 -d 2 -t" on a system the problem and not showing the problem, that would give us a big clue. So NI'ing you folks for help.
Flags: needinfo?(jthomas)
Flags: needinfo?(dharris)
To expand on what Bhavana says is Comment 4: this is not due to bug 1095106 and unlikely to be caused by any other recent WebRTC landings.  I can't speak yet to other platform side changes that could impact Hello.

Jay -- Can you reproduce this problem? I'm looking to verify this is a real regression on the platform, and if so, narrow down the regression window.   Also, if you are able to repro, can you get us a profile of v2.0 vs v2.1? (results from "adb shell top -m 15 -d 2 -t" would be fine).   Do you think you could do this before our sync up meeting tomorrow at 9:30am Pacific?  Thanks!
Flags: needinfo?(rjesup)
Attached file Script Log
Flags: needinfo?(jthomas)
QA Whiteboard: [QAnalyst-Triage?]
bhavana: John has attached a text document in comment 6 that contains info from running the script you mentioned in comment 4. Hope that helps!
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(dharris)
Derek: we need output of the script from a system with the problem, and the output from a system not showing the problem (i.e. in this case as it's flagged as a regression a 2.1 system and a 2.0 system).  

Also: this must be done while the problem is being seen (or in the same situation as where the problem is seen on 2.0).  I the log, I see no indication that a webrtc call is active (or the loop/Hello app).

In addition: while this is being flagged as a regression and we'll look at it as such, I would expect a drop in performance in other apps while a loop call is active - it has to constantly send/receive/decode/encode video and audio data.  Under normal use, this can use 60-90% of the CPU (perhaps a big less if it's not actually being rendered to the screen).  This is part of why I need to see the 2.0 top results.

Thanks
Flags: needinfo?(dharris)
QA Contact: ckreinbring
Unfortunately it does not look like we will be able to get regression window for this bug due to several factors
1) The Loop app does not appear stable enough at this time - the tester was not able to get two devices to reliably and consistently connect to each other
2) This is a perf bug with no definitive 'first broken' and 'last working' state but instead displays an array of different inconsistent levels of lag / jankyness
QA Contact: ckreinbring
Attached file Flame 2.1 Log
Attached file loop log
here is the coinciding 2.0 log cat running the same case with the 2.1
Attached log with the requested script for the Flame 2.1 KK (319mb) and the Flame 2.0 KK (319mb)
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Randell, new logs have been attached in comment 10 and comment 11, using the command given to us in comment 4.
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(dharris) → needinfo?
Thanks; those help a lot more.

A few things:
* Compositor uses a lot more CPU on 2.1 than 2.0 (10-14% vs 3%)
* System uses a *lot* more CPU, ~30-44% vs ~18-25%
* b2g uses similar (18-22%) though that's still high...
* MediaStreamGraph is gone, and most of the work done by it is now on AudioTrack.  CPU for that and other WebRTC/GUM things are similar
* Overall, CPU use on 2.1 gets into the mid-upper 90's often, while on 2.0 is ~77% and much more stable.
  Most of the difference would fall to Compositor (8-11%) plus system (10-20%).

Normal profiling will not tell us anything about the system CPU use, so a tool like Perf would be needed to see that, or Qualcomm's profiler.  It will help with compositor.  Sotaro, Steven, and ideas?  Steven: Can we get a perf run for 2.0 and 2.1 to compare?  (or even just 2.1)
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(slee)
Flags: needinfo?
(In reply to Randell Jesup [:jesup] from comment #14)
> Thanks; those help a lot more.
> 
> A few things:
> * Compositor uses a lot more CPU on 2.1 than 2.0 (10-14% vs 3%)
> * System uses a *lot* more CPU, ~30-44% vs ~18-25%
> * b2g uses similar (18-22%) though that's still high...
> * MediaStreamGraph is gone, and most of the work done by it is now on
> AudioTrack.  CPU for that and other WebRTC/GUM things are similar
> * Overall, CPU use on 2.1 gets into the mid-upper 90's often, while on 2.0
> is ~77% and much more stable.
>   Most of the difference would fall to Compositor (8-11%) plus system
> (10-20%).
> 
> Normal profiling will not tell us anything about the system CPU use, so a
> tool like Perf would be needed to see that, or Qualcomm's profiler.  It will
> help with compositor.  Sotaro, Steven, and ideas?  Steven: Can we get a perf
> run for 2.0 and 2.1 to compare?  (or even just 2.1)
Perf build is broken when we're migrating to KK. :(

Peter, can you have some to look the high CPU usage of compositor?
Flags: needinfo?(slee)
Flags: needinfo?(pchang)
(In reply to Randell Jesup [:jesup] from comment #14)
> Thanks; those help a lot more.
> 
> A few things:
> * Compositor uses a lot more CPU on 2.1 than 2.0 (10-14% vs 3%)
> * System uses a *lot* more CPU, ~30-44% vs ~18-25%
> * b2g uses similar (18-22%) though that's still high...
> * MediaStreamGraph is gone, and most of the work done by it is now on
> AudioTrack.  CPU for that and other WebRTC/GUM things are similar
> * Overall, CPU use on 2.1 gets into the mid-upper 90's often, while on 2.0
> is ~77% and much more stable.
>   Most of the difference would fall to Compositor (8-11%) plus system
> (10-20%).
> 
> Normal profiling will not tell us anything about the system CPU use, so a
> tool like Perf would be needed to see that, or Qualcomm's profiler.  It will
> help with compositor.  Sotaro, Steven, and ideas?  Steven: Can we get a perf
> run for 2.0 and 2.1 to compare?  (or even just 2.1)
I think gecko profiler could help to tell the story about CPU usage in gecko side.

Derek, are you able to check are we using HWC for flame 2.0? You can disable HWC from developer option in the flame 2.0 device to see the CPU usage of compositor is higher or not.
Flags: needinfo?(pchang) → needinfo?(dharris)
(In reply to StevenLee[:slee] from comment #15)
> (In reply to Randell Jesup [:jesup] from comment #14)
> > Thanks; those help a lot more.
> > 
> > A few things:
> > * Compositor uses a lot more CPU on 2.1 than 2.0 (10-14% vs 3%)
> > * System uses a *lot* more CPU, ~30-44% vs ~18-25%
> > * b2g uses similar (18-22%) though that's still high...
> > * MediaStreamGraph is gone, and most of the work done by it is now on
> > AudioTrack.  CPU for that and other WebRTC/GUM things are similar
> > * Overall, CPU use on 2.1 gets into the mid-upper 90's often, while on 2.0
> > is ~77% and much more stable.
> >   Most of the difference would fall to Compositor (8-11%) plus system
> > (10-20%).
> > 
> > Normal profiling will not tell us anything about the system CPU use, so a
> > tool like Perf would be needed to see that, or Qualcomm's profiler.  It will
> > help with compositor.  Sotaro, Steven, and ideas?  Steven: Can we get a perf
> > run for 2.0 and 2.1 to compare?  (or even just 2.1)

> Perf build is broken when we're migrating to KK. :(

That's unfortunate....

Perhaps Qualcomm's profiler works?  Jay/Diego?
 
> Peter, can you have some to look the high CPU usage of compositor?
Flags: needinfo?(jaywang)
Flags: needinfo?(dwilson)
Is there no profile log of b2g main thread and compositor thread? It is very important to discuss about the performance.
Flags: needinfo?(sotaro.ikeda.g)
There is a lot of things mentioned in this bug, I'm not sure what it's about at this point.  Do we have a goal of running Loop and other "demanding" apps at the same time?  What are the targets?  Or is this about the performance change from 2.0 to 2.1, described in comment 14?
If compositor thread's cpu usage is high, it typically means compositor is doing more frequent composition. Typical example is Bug 1091775. In it, animated png request more frequent composition, it increased b2g main thread and compositor thread's cpu usage.

If compositor thread's cpu usage is high even when the Loop app is in the background, there seems 2 possibilities.
- [1] WebRTC video is still compositing even when the loop app is in background.
- [2] system app(status bar) request frequent composite like Bug 1091775.

[2] seems more possible than [1]. When application goes to backgorund, all layers related to the app should be destroyed. Therefore if [1] happens, it is a defect. We can confirm if [1] happens from layer dumps.
Leaving ni? for Jay as he has done all the profiling from the CAF side.
Flags: needinfo?(dwilson)
Attached file perf_loop_bg_v2.1.txt
Attach,please, find the perf profiling of Compositor.
Flags: needinfo?(jaywang)
Comment on attachment 8526426 [details]
perf_loop_bg_v2.1.txt

Sotaro - we have a profile of Compositor now
2.27% (I presume of total CPU, not of Compositor) for pthread_mutex_unlock  (and 1.48% for lock, 0.48% for mutex_lock_slowpath, and 1.08% for raw_spin_unlock_irqrestore) is really, really high.  And I stopped counting, but there are a lot more lock/unlock/etc down lower

The alloc/free/memcpy/memsets also are high, and then there are more arena/etc ones down lower.

I'm hoping these are %-of-Compositor, not %-of-total-CPU.  To find out more, we need more info from jay about what is measured here.  (If I was more familiar with perf it might be obvious.)
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(jaywang)
Flags: needinfo?(dharris)
(In reply to Randell Jesup [:jesup] from comment #23)
> Comment on attachment 8526426 [details]
> perf_loop_bg_v2.1.txt

Hmm, I expected a profile of using the following. From attachment 8526426 [details], it seem very difficult to say something...
 - http://people.mozilla.org/~bgirard/cleopatra/
Flags: needinfo?(sotaro.ikeda.g)
I tried to reproduce the problem by installing the firefox hello app from the following. But the app did not work at all. After showing the app start up screen, it does nothing. Is there a problem to the problem? or is there an additional step to install the Firefox hello app?

https://github.com/mozilla-b2g/firefoxos-loop-client
jesup, do you know something about comment 25?
Flags: needinfo?(rjesup)
(In reply to Sotaro Ikeda [:sotaro] from comment #25)
> I tried to reproduce the problem by installing the firefox hello app from
> the following. But the app did not work at all. After showing the app start
> up screen, it does nothing. Is there a problem to the problem? or is there
> an additional step to install the Firefox hello app?
> 
> https://github.com/mozilla-b2g/firefoxos-loop-client

Basically you need to associate it with an FxA account or a phone number (I don't have a SIM, I used my android cell's number; it sent a confirmation text to my phone).

Also, to make a call you need to add at least one main Contact using the Contacts app before you can add more in the loop client.

Once registered, you should be able to call from one phone to another (which I would do for this test).  Make sure both phones have builds with media.peerconnection.video.h264_enabled = true -- which is NOT the default since this is only supported for 8x10 devices which have a "good enough" DSP code bundle.  If need be, set it as a user_pref (or do so just to make sure).
CC-ing some tef people who can help more.
Flags: needinfo?(rjesup) → needinfo?(dcoloma)
(In reply to Randell Jesup [:jesup] from comment #27)
> (In reply to Sotaro Ikeda [:sotaro] from comment #25)
> > I tried to reproduce the problem by installing the firefox hello app from
> > the following. But the app did not work at all. After showing the app start
> > up screen, it does nothing. Is there a problem to the problem? or is there
> > an additional step to install the Firefox hello app?
> > 
> > https://github.com/mozilla-b2g/firefoxos-loop-client
> 

My case seems a different problem. After applications splash animation, the app do nothing. Even account setting menu does not appear.
(In reply to Sotaro Ikeda [:sotaro] from comment #28)
> (In reply to Randell Jesup [:jesup] from comment #27)
> > (In reply to Sotaro Ikeda [:sotaro] from comment #25)
> > > I tried to reproduce the problem by installing the firefox hello app from
> > > the following. But the app did not work at all. After showing the app start
> > > up screen, it does nothing. Is there a problem to the problem? or is there
> > > an additional step to install the Firefox hello app?
> > > 
> > > https://github.com/mozilla-b2g/firefoxos-loop-client
> > 
> 
> My case seems a different problem. After applications splash animation, the
> app do nothing. Even account setting menu does not appear.

You need to install the app via grunt as explained in https://github.com/mozilla-b2g/firefoxos-loop-client/blob/master/README.md

Let me know if it does not work either.
Flags: needinfo?(oteo)
Flags: needinfo?(dcoloma)
Thanks for the info! Therefore, does the following explanation not work now?
https://wiki.mozilla.org/Loop/Try_Loop#Firefox_OS_Loop_Client
(In reply to Daniel Coloma:dcoloma from comment #29)
> 
> You need to install the app via grunt as explained in
> https://github.com/mozilla-b2g/firefoxos-loop-client/blob/master/README.md
> 
> Let me know if it does not work either.

Thanks, the app works!
(In reply to Sotaro Ikeda [:sotaro] from comment #30)
> Thanks for the info! Therefore, does the following explanation not work now?
> https://wiki.mozilla.org/Loop/Try_Loop#Firefox_OS_Loop_Client

You are right, it's not up to date, we need to do it. Thanks for pointing it out. Adding a ni on me so I don't forget it.
Flags: needinfo?(dcoloma)
(In reply to Randell Jesup [:jesup] from comment #23)
> Comment on attachment 8526426 [details]
> perf_loop_bg_v2.1.txt
> 
> Sotaro - we have a profile of Compositor now
> 2.27% (I presume of total CPU, not of Compositor) for pthread_mutex_unlock 
> (and 1.48% for lock, 0.48% for mutex_lock_slowpath, and 1.08% for
> raw_spin_unlock_irqrestore) is really, really high.  And I stopped counting,
> but there are a lot more lock/unlock/etc down lower
> 
> The alloc/free/memcpy/memsets also are high, and then there are more
> arena/etc ones down lower.
> 
> I'm hoping these are %-of-Compositor, not %-of-total-CPU.  To find out more,
> we need more info from jay about what is measured here.  (If I was more
> familiar with perf it might be obvious.)

The profiling measures the Compositor thread and % here is representing the % of Compositor. I only include the big items here. There are more than 1000+ items in the complete profiling

I think __memcpy_base 2.92% and memset 2.14% imply there are a lot of memory access here. However, these are low-level functions so it is not straightforward to derive who triggered these.
Maybe, these operation?

 0.86%     0.00%     0.86%      b2g  libxul.so              [.] mozilla::gfx::Matrix4x4::operator*(mozilla::gfx::Matrix4x4 const&) const                                                                                                                                                                                                                                                                                                                                                                                   
     0.83%     0.00%     0.83%      b2g  libxul.so              [.] mozilla::layers::CompositorOGL::DrawQuad(mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&)                                                                                                                                                                                                               
       0.56%     0.00%     0.56%      b2g  libxul.so              [.] mozilla::layers::TiledContentHost::RenderTile(mozilla::layers::TileHost const&, gfxRGBA const*, mozilla::layers::EffectChain&, float, mozilla::gfx::Matrix4x4 const&, mozilla::gfx::Filter const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, nsIntRegion const&, nsIntPoint const&, nsIntSize const&)
Flags: needinfo?(jaywang)
I understand that one very big difference between b2gv2.0 and b2gv2.1 during showing Gallery app and doing WebRTC is AttentionIndicator. It seems to add extra cpu usage on b2g main thread and compositor.
AttentionIndicator is not shown on b2g v2.0 during showing Gallery app and doing WebRTC.
When css animation is used, the animation try to composite by maximum frame rate(60fps). I do not think it is a good idea to use css animation to Attention Indicator.
This Attention Indicator is animated by using AsyncCompositionManager on compositor side. But b2g main thread is still doing reflow. 

The following is a profile of b2g main thread during showing Galley app during doing WebRTC by using Firefox Hello app on b2g v2.1. In this use case, Gallery did nothing. Just Attention Indicator was shown on top of the display.

http://people.mozilla.org/~bgirard/cleopatra/#report=a93ff06eabc2c43134425283e0b4a866ac00c601
BenWa, can you give a comment to comment 36, comment 37 and comment 38?
Flags: needinfo?(bgirard)
(In reply to Sotaro Ikeda [:sotaro] from comment #37)
> When css animation is used, the animation try to composite by maximum frame
> rate(60fps). I do not think it is a good idea to use css animation to
> Attention Indicator.

I received an advice from :Benwa that steps() with CSS Animation could reduce the fps.
Just to summarize what I found looking at the app with the inspector and FPS counter:
- We should use steps to limit the FPS of the blue indicator to not run at 60 FPS but run at a lower rate. (filed bug 1103207)
- The handle3 animation we saw was not hitting the off-main-thread-animation path and is causing us to build a display list a 60 FPS thus hurting the CPU usage. (shown in profile from Comment 38). We can check this with the animation logging feature IsAnimationLoggingEnabled)
- The image bridge is causing composites even if the layer is not visible.
Flags: needinfo?(bgirard)
(In reply to Benoit Girard (:BenWa) from comment #41)
> Just to summarize what I found looking at the app with the inspector and FPS
> counter:
> - We should use steps to limit the FPS of the blue indicator to not run at
> 60 FPS but run at a lower rate. (filed bug 1103207)

This part is actually, composed by 2 things.
- CSS Animation trigger 60fps composition.
    + By using steps(), need to slow down animation.
- steps() does not slow down actual compositor's composition
    + Even when step is used to slow down animation, AsyncCompositionManager try to composite by 60fps.
Blocks: 1103213
(In reply to Benoit Girard (:BenWa) from comment #41)
> - The image bridge is causing composites even if the layer is not visible.

Bug 1103213  is created for the above.
No longer blocks: 1103213
Depends on: 1103213
Depends on: 1103207
(In reply to Sotaro Ikeda [:sotaro] from comment #38)
> This Attention Indicator is animated by using AsyncCompositionManager on
> compositor side. But b2g main thread is still doing reflow. 

One question (not sure if you are the right person to ask but just in case), what is the purpose of that indicator? i.e. when should it be shown?

Thanks!
Flags: needinfo?(sotaro.ikeda.g)
Depends on: 1104151
(In reply to Daniel Coloma:dcoloma from comment #44)
> (In reply to Sotaro Ikeda [:sotaro] from comment #38)
> > This Attention Indicator is animated by using AsyncCompositionManager on
> > compositor side. But b2g main thread is still doing reflow. 
> 
> One question (not sure if you are the right person to ask but just in case),
> what is the purpose of that indicator? i.e. when should it be shown?
> 
> Thanks!

Sorry, I do not know about it. I assume that it is a indicator of saying important thing is going on in the background. WebRTC could send video and audio to other places, therefore if user does not recognize it, it could becomes a risk to the privacy.

:alive, do you know who is the correct person to answer this question?
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(alive)
See user story in https://bugzilla.mozilla.org/show_bug.cgi?id=927862

Briefly it's saying there is an attention window running at background; it should be shown when you dismiss the attention window by home button and hide when you re-trigger the attention from the utility tray.
Flags: needinfo?(alive)
Joint triage: decided to minus it.
blocking-b2g: 2.1+ → -
Flags: needinfo?(dcoloma)
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: