Closed Bug 1110625 Opened 10 years ago Closed 6 years ago

Minimize the overhead from system app when launch an app

Categories

(Firefox OS Graveyard :: Performance, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(feature-b2g:3.0?, tracking-b2g:+)

RESOLVED WONTFIX
feature-b2g 3.0?
tracking-b2g +

People

(Reporter: ting, Unassigned)

References

Details

Attachments

(5 files, 3 obsolete files)

Based on bug 1094010 comment 6 and 10, the app launch animation occupies B2G process main thread more than 80ms. This could block sync IPC from content process, for example, SendGetVolumes() of Music app.

There're two major animations, one for closing homescreen and one for the splash, and some others especially the background of status bar.
Could you have someone to study this bug?
Flags: needinfo?(pchang)
chiajung, please have a look at the animation part to see we could improve this part or not.

I just copied some info from bug 1094010 comment 10.

Gallery: http://people.mozilla.org/~bgirard/cleopatra/#report=728451ecd3cd5f2cc93e4f5d494028c95947caf6
  - B2G process main thread has a very long repaint [1177,1364] and [1381,1491] after system app launch Gallery.
  - runFontFit() [1740,1912].
Flags: needinfo?(pchang) → needinfo?(chung)
Quoted from bug 1074783 comment 61:

(In reply to Ting-Yu Chou [:ting] from comment #51)
> I thought the only painting is splash
> icon enlarge animation which can be done from hardware and shouldn't take
> too much time on b2g main thread.

I found there're other updates: .gesture-panel (left/right), #statusbar, .statusbar-shadow, .notifications-shadow, .sb-icon-*.
The profile has 2 large part:
(1) 121ms Restyle #0[1177,1365]: Which is not triggered by RefreshDriver. I tried to debug and it seems element.clientWidth()/element.focus() calls make it resolve style.

(2) 109ms RefreshDriver::Tick[1381,1491]: It is combined of 40ms restyle and 20ms rasterize and others. Style/Raster time seems OK but where the extra cost comes from?
Flags: needinfo?(chung)
(In reply to Chiajung Hung [:chiajung] from comment #4)
> The profile has 2 large part:
> (1) 121ms Restyle #0[1177,1365]: Which is not triggered by RefreshDriver. I
> tried to debug and it seems element.clientWidth()/element.focus() calls make
> it resolve style.
> 
> (2) 109ms RefreshDriver::Tick[1381,1491]: It is combined of 40ms restyle and
> 20ms rasterize and others. Style/Raster time seems OK but where the extra
> cost comes from?

From the old profile, it should be PLayerTransaction::SendUpdate IPC call which usually implies Compositor busy, so I create another profile to show what's wrong there. 

http://people.mozilla.org/~bgirard/cleopatra/#report=40d7912b9943c27faa37de2eaf8ab8ce2134c1c7

However, my profile does not contain similar pattern, and from the profile, I think [2896,2933] is waiting for PLayerTransaction's actor creation while compositor busy. BTW, project silk may help such condition slightly since CompositorThread usually waiting for the vsync to swapBuffer currently.
Or we may need async PLayerTransaction construction.

In my profile, first paint for app seems missing label so I can not tell when app launch done. But there is a long Refresh[4083,4115], too. The last part(6ms) seems dominated by another sync IPC: PLayerTransaction::SendUpdate in [4109,4111](3ms).
(In reply to Chiajung Hung [:chiajung] from comment #5)
> (In reply to Chiajung Hung [:chiajung] from comment #4)
> > The profile has 2 large part:
> > (1) 121ms Restyle #0[1177,1365]: Which is not triggered by RefreshDriver. I
> > tried to debug and it seems element.clientWidth()/element.focus() calls make
> > it resolve style.
> > 
> > (2) 109ms RefreshDriver::Tick[1381,1491]: It is combined of 40ms restyle and
> > 20ms rasterize and others. Style/Raster time seems OK but where the extra
> > cost comes from?
> 
> From the old profile, it should be PLayerTransaction::SendUpdate IPC call
> which usually implies Compositor busy, so I create another profile to show
> what's wrong there. 
> 
> http://people.mozilla.org/~bgirard/cleopatra/
> #report=40d7912b9943c27faa37de2eaf8ab8ce2134c1c7
> 
> However, my profile does not contain similar pattern, and from the profile,
> I think [2896,2933] is waiting for PLayerTransaction's actor creation while

Could you confirm it?  And, do you have any idea to fix it?  For example project silk, or async PLayerTransaction construction.

Could you take this bug?
There's already a bug 1110625 dealing with the case of Gallery, in this bug what we want is to minimize the general overhead from launch animation:

  - homescreen closing,
  - app splash,
  - .gesture-panel (left/right),
  - #statusbar,
  - .statusbar-shadow,
  - .notifications-shadow,
  - .sb-icon-*.
(In reply to Thinker Li [:sinker] from comment #6)
> (In reply to Chiajung Hung [:chiajung] from comment #5)
> > (In reply to Chiajung Hung [:chiajung] from comment #4)
> > > The profile has 2 large part:
> > > (1) 121ms Restyle #0[1177,1365]: Which is not triggered by RefreshDriver. I
> > > tried to debug and it seems element.clientWidth()/element.focus() calls make
> > > it resolve style.
> > > 
> > > (2) 109ms RefreshDriver::Tick[1381,1491]: It is combined of 40ms restyle and
> > > 20ms rasterize and others. Style/Raster time seems OK but where the extra
> > > cost comes from?
> > 
> > From the old profile, it should be PLayerTransaction::SendUpdate IPC call
> > which usually implies Compositor busy, so I create another profile to show
> > what's wrong there. 
> > 
> > http://people.mozilla.org/~bgirard/cleopatra/
> > #report=40d7912b9943c27faa37de2eaf8ab8ce2134c1c7
> > 
> > However, my profile does not contain similar pattern, and from the profile,
> > I think [2896,2933] is waiting for PLayerTransaction's actor creation while
> 
> Could you confirm it?  And, do you have any idea to fix it?  For example
> project silk, or async PLayerTransaction construction.
> 
> Could you take this bug?

I will try to make PLayerTransaction construction async first.
If success, I will take this bug.
(In reply to Ting-Yu Chou [:ting] from comment #7)
> There's already a bug 1110625 dealing with the case of Gallery, in this bug

Correction: bug 1102715.
The profile Kanru captured at bug 1102715 comment 1 also shows two nsRefreshDriver::Tick() [2480,2570], and [2600,2670], which should be from the animation.
feature-b2g: --- → 2.2?
I just cpatured two profiles from launching SMS:

1. https://people.mozilla.org/~bgirard/cleopatra/#report=67d3672fac6183fb607441f85df9c436da7770a4

   cmd = |$ ./profile.sh start -p b2g && ./profile.sh start -p Homescreen && ./profile.sh start -p [preallocated pid]; sleep 5; ./profile.sh capture|

   nsRefreshDriver::Tick [1462,1469], [1524,1556]
   PresShell::Flush (FlushInterruptibleLayout) [1497,1524]

2. https://people.mozilla.org/~bgirard/cleopatra/#report=5bd53b6908fd6d697d620197cee2da9e00de09c9

   cmd = |$ ./profile.sh start -p b2g -t GeckoMain,Compositor && ./profile.sh start -p Homescreen && ./profile.sh start -p [preallocated pid]; sleep 5; ./profile.sh capture|

   nsRefreshDriver::Tick [1938,1946], [1994,2020]
   PresShell::Flush (FlushInterruptibleLayout) [1965,1994]
Summary: Minimize the overhead of app launch animation (>80ms) → Minimize the overhead of app launch animation (~60ms)
(In reply to Chiajung Hung [:chiajung] from comment #8)
> I will try to make PLayerTransaction construction async first.
> If success, I will take this bug.

Make PLayerTransaction cstor async is a different issue, we can file another bug. Note Chiajung told me he can't reproduce it consistently.
Chiajung, you can build FxOS with B2G_DEBUG=1 and let ShouldLogRestyle() [1] return true to enable LOG_RESTYLE*.

[1] http://dxr.mozilla.org/mozilla-central/source/layout/base/RestyleManager.h#421
(In reply to Ting-Yu Chou [:ting] from comment #13)
> Chiajung, you can build FxOS with B2G_DEBUG=1 and let ShouldLogRestyle() [1]
> return true to enable LOG_RESTYLE*.
> 
> [1]
> http://dxr.mozilla.org/mozilla-central/source/layout/base/RestyleManager.
> h#421

Debug build has to much noise for me, I created a small patch for restyle logging.
feature-b2g: 2.2? → 2.2+
I tried to break on nsPresShell::RecordStyleSheetChange and found its a JS side behaviour, the stack is:
$4 = 0xadb3b000 "0 anonymous(baseUrl = \"/shared/elements/gaia_progress/\") [\"app://system.gaiamobile.org/shared/js/component_utils.js\":22]\n
    this = [object HTMLElement]\n1 anonymous() [\"app://system.gaiamobile.org/shared/elements/gaia_progress/script.js\":19]\n
    this = [object HTMLElement]\n2 anonymous() [\"app://system.gaiamobile.org/js/app_chrome.js\":539]\n
    this = [object Object]\n3 AppChrome(app = [object Object]) [\"app://system.gaiamobile.org/js/app_chrome.js\":39]\n
    this = [object Object]\n4 onOpened([object CustomEvent]) [\"app://system.gaiamobile.org/js/app_window.js\":777]\n
    this = [object HTMLDivElement]\n5 aw_broadcast(event = \"opened\", detail = undefined) [\"app://system.gaiamobile.org/js/app_window.js\":1287]\n
    this = [object Object]\n6 anonymous(event = \"opened\") [\"app://system.gaiamobile.org/js/app_window.js\":1259]\n
    this = [object Object]\n7 atc_changeTransitionState(evt = \"complete\") [\"app://system.gaiamobile.org/js/app_transition_controller.js\":110]\n
    this = [object Object]\n8 atc_handleEvent(evt = [object CustomEvent]) [\"app://system.gaiamobile.org/js/app_transition_controller.js\":373]\n
    this = [object Object]\n9 aw_broadcast(event = \"loaded\", detail = undefined) [\"app://system.gaiamobile.org/js/app_window.js\":1287]\n
    this = [object Object]\n10 anonymous(event = \"loaded\") [\"app://system.gaiamobile.org/js/app_window.js\":1259]\n
    this = [object Object]\n11 aw__handle_mozbrowserloadend(evt = [object CustomEvent]) [\"app://system.gaiamobile.org/js/app_window.js\":937]\n
    this = [object Object]\n12 aw_handleEvent(evt = [object CustomEvent]) [\"app://system.gaiamobile.org/js/app_window.js\":1065]\n
    this = [object Object]\n13 anonymous(data = [object Object]) [\"jar:file:///system/b2g/omni.ja!/components/BrowserElementParent.js\":377]\n
    this = [object Object]\n14 anonymous(data = [object Object]) [\"jar:file:///system/b2g/omni.ja!/components/BrowserElementParent.js\":357]\n
    this = [object Object]\n15 anonymous(aMsg = [object Object]) [\"jar:file:///system/b2g/omni.ja!/components/BrowserElementParent.js\":227]\n
    this = [object ChromeMessageSender]\n"

I tried comment out the style() call at gaia_progress/script.js and it make the first huge RefreshDriver Tick vanish like:
https://people.mozilla.org/~bgirard/cleopatra/#report=0614f23d352da960c2ceb421c6b87a2f7080d4cc
(In reply to Chiajung Hung [:chiajung] from comment #15)
> I tried comment out the style() call at gaia_progress/script.js and it make
> the first huge RefreshDriver Tick vanish like:
> https://people.mozilla.org/~bgirard/cleopatra/
> #report=0614f23d352da960c2ceb421c6b87a2f7080d4cc

I am not sure which RefreshDriver::Tick is vanished by comparing the profile with the first one at comment 11?

Yours:
  [2045, 2054] nsRefreshDriver::Tick
  [2064, 2106] awm_switchApp
  [2122, 2263] PresShell::Flush (InterruptibleLayout)
  [2270, 2311] nsRefreshDriver::Tick

1st profile at comment 11:
  [1462, 1469] nsRefreshDRiver::Tick
  [1473, 1491] awm_switchApp
  [1497, 1524] PresShell::Flush (InterruptibleLayout)
  [1524, 1556] nsRefreshDRiver::Tick
Flags: needinfo?(chung)
OK, I logged RefreshDriver tick duration and enabled Restyle log, and found each big tick is observable in my log.

Those long tick log vanished after remove those code... Maybe I was wrong.
Flags: needinfo?(chung)
I had tried various change in gaia based on profile and log, and here is some result:

Gecko version: 0cf53b165b6b664939ba8c2834c9b869c83a82e7
Gaia version: 0e2876d476f6707669b6489095f3840c558a3e50

The profile for SMS cold launch: 
http://people.mozilla.org/~bgirard/cleopatra/#report=82fb91b40c9a0e1e4809538012aa1bf890ba5041

The first long refresh from profiler after "title changed" takes *150 ms*.

The profile of this patch (status bar broken):
https://people.mozilla.org/~bgirard/cleopatra/#report=95d0cacdc7f1367e44d26ba3c7792837dce1add9

The first long refresh from profiler 
  after "title changed" takes 8ms.

Profile of apply the patch but revert status bar change:
http://people.mozilla.org/~bgirard/cleopatra/#report=a7d8e38b12dae5e320eb81a7ba5cd28a22f18b0c

The first long refresh from profiler after "title changed" takes 3ms.

It is somewhat strange for me, or maybe this is a bad way to judge performance difference.

While investigating this bug, I found some strange things: I tried to call PrintJSStack on Restyle's registration, and mark out those JS and hoping those restyle time gone, but it just make the stack point to other position not remove the time. Maybe we need some tool to point out such problem precisely.

For example, if we can tell which JS trigger or what style change triggers Restyle/Reflow, and how long each Restyle/Reflow takes, maybe we can find the problem faster.
@ting
As talked yesterday, this bug want to deal with a Flush(Layout_Interruptible) after awm_switchApp in fact.
I tried the patch in comment 20, that patch produce a profile like:
https://people.mozilla.org/~bgirard/cleopatra/#report=cf6e2310d8880ea74994fdf4387b7fa403643df3

It takes only 5ms(vs 40ms) on the Flush, so I think this bug is a dup of bug 1102715.
Do you find any other problem in this profile?
Flags: needinfo?(tchou)
(In reply to Chiajung Hung [:chiajung] from comment #21)
> @ting
> As talked yesterday, this bug want to deal with a
> Flush(Layout_Interruptible) after awm_switchApp in fact.

Not only the layout flush but also the refresh driver tick after it.

> I tried the patch in comment 20, that patch produce a profile like:
> https://people.mozilla.org/~bgirard/cleopatra/
> #report=cf6e2310d8880ea74994fdf4387b7fa403643df3
> 
> It takes only 5ms(vs 40ms) on the Flush, so I think this bug is a dup of bug
> 1102715.
> Do you find any other problem in this profile?

The refresh driver tick now takes 73ms [4091,4165], I guess the flush just moves into it.
Flags: needinfo?(tchou)
For Record,
the total cost goes from 88ms to 73ms, and from the profile it takes 7ms to decode a image (sms_284.png), and takes 28ms on Styling(decoding included), 37ms on Painting.

This patch should further shorten the first tick. (This is not the best possible, but the simplest change. Maybe mul-alpha 1-row at once/NEON is better. And since RasterImage buffer should be read only after decode, we may able to do such alpha handling in compositor/DrawTarget directly.)

New profile(w/patch from comment 20): 
http://people.mozilla.org/~bgirard/cleopatra/#report=1562d77c72e68777c2a52b22972ddf648f1b4848

BTW, I think this profile is somewhat not accurate, since its first tick after awm_switchApp costs only 12ms, and after loadstart costs only 37ms.
blocking-b2g: --- → 2.2?
feature-b2g: 2.2+ → ---
feature-b2g -> blocker-b2g
blocking-b2g: 2.2? → 2.2+
New profile with some local modifications:
http://people.mozilla.org/~bgirard/cleopatra/#report=26818112ec24a9b6f9a7b299696ccdfc9103d324&filter=[{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A5222,%22end%22%3A5729},{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A5472,%22end%22%3A5491}]&selection=0,1,193,194,461,80,539

Gecko version: 0cf53b165b6b664939ba8c2834c9b869c83a82e7
Gaia version: 4dbd7fe81d0672c8bfcfdf3e252754ee7218fc2f

In this profile, I make the restyle tag based on the restyle root's id(e.g.[5473,5490] is for AppWindow) and current class, so you can find the restyle while app launch is mainly involved in "AppWindow" "homescreen" in this profile.

The Flush(InterruptibleLayout) we noticed in old profile is mainly comes from element.focus() call in [1]. Since the focus call will Flush Layout synchrounously and add the element who get focus into RestyleRoot than trigger RefreshDriver. In this profile, I removed that line, and makes those restyle merged into refresh driver tick.

The long Restyle there was originally 3 equal length restyles: homescreen, status bar, AppWindow. In this profile I removed the line in [2], so you can not find statusbar's restyle.

To find why AppWindow/homescreen takes so many time restyle, I added some log and found some of the resyle comes from [3]. From those code, I suspect it is used to remove all Animation related classes while the app launch done.

And In this profile 1 last thing takes long is the splash icon decoding. The icon is synchrous decoded while compute its style, it's bug 1121897.

@alive
Can we remove [2] and postpone [3]?

[1] http://mxr.mozilla.org/gaia/source/apps/system/js/app_transition_controller.js#286
[2] http://mxr.mozilla.org/gaia/source/apps/system/js/statusbar.js#555
[3] http://mxr.mozilla.org/gaia/source/apps/system/js/app_transition_controller.js#349
Flags: needinfo?(alive)
(In reply to Chiajung Hung [:chiajung] from comment #25)
> New profile with some local modifications:
> http://people.mozilla.org/~bgirard/cleopatra/
> #report=26818112ec24a9b6f9a7b299696ccdfc9103d324&filter=[{%22type%22%3A%22Ran
> geSampleFilter%22,%22start%22%3A5222,%22end%22%3A5729},
> {%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A5472,
> %22end%22%3A5491}]&selection=0,1,193,194,461,80,539
> 
> Gecko version: 0cf53b165b6b664939ba8c2834c9b869c83a82e7
> Gaia version: 4dbd7fe81d0672c8bfcfdf3e252754ee7218fc2f
> 
> In this profile, I make the restyle tag based on the restyle root's
> id(e.g.[5473,5490] is for AppWindow) and current class, so you can find the
> restyle while app launch is mainly involved in "AppWindow" "homescreen" in
> this profile.
> 
> The Flush(InterruptibleLayout) we noticed in old profile is mainly comes
> from element.focus() call in [1]. Since the focus call will Flush Layout
> synchrounously and add the element who get focus into RestyleRoot than
> trigger RefreshDriver. In this profile, I removed that line, and makes those
> restyle merged into refresh driver tick.
> 
> The long Restyle there was originally 3 equal length restyles: homescreen,
> status bar, AppWindow. In this profile I removed the line in [2], so you can
> not find statusbar's restyle.
> 
> To find why AppWindow/homescreen takes so many time restyle, I added some
> log and found some of the resyle comes from [3]. From those code, I suspect
> it is used to remove all Animation related classes while the app launch done.
> 
> And In this profile 1 last thing takes long is the splash icon decoding. The
> icon is synchrous decoded while compute its style, it's bug 1121897.
> 
> @alive
> Can we remove [2] and postpone [3]?
> 
> [1]
> http://mxr.mozilla.org/gaia/source/apps/system/js/app_transition_controller.
> js#286
> [2] http://mxr.mozilla.org/gaia/source/apps/system/js/statusbar.js#555
> [3]
> http://mxr.mozilla.org/gaia/source/apps/system/js/app_transition_controller.
> js#349

[2] is a question for systemsfe...Michael?
I will try out [3].
Flags: needinfo?(alive) → needinfo?(mhenretty)
Comment on attachment 8552162 [details] [review]
[PullReq] alivedise:bugzilla/1110625 to mozilla-b2g:master

The patch is to fix [3]
Attachment #8552162 - Flags: feedback?(chung)
Comment on attachment 8552162 [details] [review]
[PullReq] alivedise:bugzilla/1110625 to mozilla-b2g:master

I can still find the class iteration during app launch.
I log it by following step:
(1) adb logcat -c
(2) adb logcat -vthreadtime|grep {b2g pid} | tee log.txt,
(3) launch the app
(4) wait for the animation finish and app full loaded
(5) ctrl+C

Maybe we have to postpone it after some event like mozbrowser-loadend.

@ting
How do you think?
Flags: needinfo?(tchou)
Attachment #8552162 - Flags: feedback?(chung) → feedback-
Alive's patch is to skip the reset of the very first transition which I expect is from "closed" to "opening". But the reset will still happen for the others, such as "opening" to "opened", seeing class iteration from log doesn't clarify it is from which state transition. Better add some more logs to confirm whether it works.
Flags: needinfo?(tchou)
I could give it a try but I am not sure if there is side effect if we don't remove the class when opening->opened.
(In reply to Ting-Yu Chou [:ting] from comment #30)
> Alive's patch is to skip the reset of the very first transition which I
> expect is from "closed" to "opening". But the reset will still happen for
> the others, such as "opening" to "opened", seeing class iteration from log
> doesn't clarify it is from which state transition. Better add some more logs
> to confirm whether it works.

I went to log since the profiler shows similar cost.
It turns out that we need a way to associate the JSStack and profiler label first.
(In reply to Ting-Yu Chou [:ting] from comment #33)
> Are we doing transition animation like this?
> http://stackoverflow.com/questions/16646233/css3-transition-only-when-class-
> is-added-not-when-removed

I don't know what you want - we have different open/close transition. It's impossible to do the same thing as the article unless rules like this:

.appWindow.opening {
}

.appWindow.opening.closing {
}

.appWindow.opening.closing.opening2 {
}

.appWindow.opening.closing.opening2.closing2 {
}
(In reply to Ting-Yu Chou [:ting] from comment #30)
> Alive's patch is to skip the reset of the very first transition which I
> expect is from "closed" to "opening". But the reset will still happen for
> the others, such as "opening" to "opened", seeing class iteration from log
> doesn't clarify it is from which state transition. Better add some more logs
> to confirm whether it works.

I checked the codebase and what we have now is:
If system app is busy loading (top most window is not load-end), we will skip the animationend event but wait for mozbrowserloadend event to go to the opened state. That is to say, the resetClass will NOT happen before the page is loaded.

I don't exactly know what you want - if what you want is "never" iterating class removal please state cleaner and why it's overhead even after the app is launched.
(In reply to Ting-Yu Chou [:ting] from comment #30)
> Alive's patch is to skip the reset of the very first transition which I
> expect is from "closed" to "opening". But the reset will still happen for
> the others, such as "opening" to "opened", seeing class iteration from log
> doesn't clarify it is from which state transition. Better add some more logs
> to confirm whether it works.

I may wrong before, since it is hard to associate log and profile. To make it clear, check this profile (w/ alive's patch):
http://people.mozilla.org/~bgirard/cleopatra/#report=c56d167e384cd87684d221756f4b6696cd94e277&selection=0,1,216,325,354

In this profile, the Restyle is splitted into more labels based on the callstack so we can check the resyle source without log. {null} stack means the restyle root are not add into resyle table via RestyleTracker::AddPendingRestyleToTable or the function is not triggered by JS.

From this profile, 
* AppWindow restyle comes from setFrameBackground.(Decoding included)
* status bar restyle comes from an anonymous stack.
* homescreen restyle comes from atc_do_closing(): it seems the leaving animation restyle

By simply removing the FrameBackground, and I get 
http://people.mozilla.org/~bgirard/cleopatra/#report=a65b87cc3619d160d569ccf8f2205940a9cc93c5&filter=%5B{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A2974,%22end%22%3A3468}%5D&selection=0,1,8,206,207,298,300

It shows the AppWindow restyle from aw_changeState now. I suspect there are several stacks for each component, so I append each stack in the profiler label but they are truncated. BTW, I think these operations may needed, and it's still not clear for me about how these operations affect app launch in what way. Does it really block any IPC or other task for a real case? If yes, can we see them in the profile? Do we have a better way to measure the difference?

@ting
Do you know how to make profiler save longer label?
Flags: needinfo?(tchou)
Enlarge this buffer [1] if you're using PROFILER_LABEL_PRINTF.

[1] https://dxr.mozilla.org/mozilla-central/source/tools/profiler/GeckoProfilerImpl.h#382
Flags: needinfo?(tchou)
(In reply to Ting-Yu Chou [:ting] from comment #37)
> Enlarge this buffer [1] if you're using PROFILER_LABEL_PRINTF.
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/tools/profiler/
> GeckoProfilerImpl.h#382
It works, thanks.

For the sync decode image problem, we can change
https://dxr.mozilla.org/mozilla-central/source/image/src/RasterImage.cpp#1475

from FLAG_SYNC_DECODE to FLAG_NONE and get the profile:
http://people.mozilla.org/~bgirard/cleopatra/#report=874affb40aa3b4061341adbf69cf817e94c755aa&filter=%5B{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A17639,%22end%22%3A19796},{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A18859,%22end%22%3A18961}%5D&selection=0,1,6,77,402

The total Reflow becomes 50ms with it. And the statusbar spend most time on SelectorMatching, which should be improved by scoping CSS for statusbar.
(In reply to Chiajung Hung [:chiajung] from comment #25)
> Can we remove [2] [...]?
> 
> [2] http://mxr.mozilla.org/gaia/source/apps/system/js/statusbar.js#555

This was added as part of bug 1045017 for accessibility purposes. Not sure what the accessibility impact of removing this is.  ni? Yura to see if we can remove this or explore other approaches.
Flags: needinfo?(mhenretty) → needinfo?(yzenevich)
Sorry for the delay, we have a regression in Gecko atm that prevents me from trying things out with the screen reader, it should be resolved soon so I will check the above stuff asap.
I played around with the hidden class and the statusbar events touched in bug 1045017, and it looks like it does not affect a11y in a bad way. AFAIK, it was actually introduced to deal with some performance issues when transitioning between apps with edge gestures. In case the pats introduced in bug 1045017 will be removed/modified, please mark me for a11y-review so I could thoroughly test it with the screen reader. Thanks!
Flags: needinfo?(yzenevich)
(In reply to Chiajung Hung [:chiajung] from comment #25)
> New profile with some local modifications:
> http://people.mozilla.org/~bgirard/cleopatra/
> #report=26818112ec24a9b6f9a7b299696ccdfc9103d324&filter=[{%22type%22%3A%22Ran
> geSampleFilter%22,%22start%22%3A5222,%22end%22%3A5729},
> {%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A5472,
> %22end%22%3A5491}]&selection=0,1,193,194,461,80,539
> 

This change looks very interesting! I've been wanting to add something like this. Can you file a separate bug to track this modification? I'd like to see if we can get it landed :D.
Flags: needinfo?(chung)
I see it's already filed in bug 1129249.
Flags: needinfo?(chung)
Depends on: 1131446
Is it possible to get some profiles of this situation with call stacks enabled. ('-f stackwalk')
Flags: needinfo?(chung)
profile provided in comment 45.
Flags: needinfo?(chung)
So it looks like one approximation for restyle complexity is the number of times ContentEnumFunc is called.

Here are some counts for some of the restyles happening here:
'homescreen' 77
'statusbar' 478
Running this again, I get these counts:

'homescreen' 2182
'statusbar' 12749
and 'AppWindow_3' 2805

Here are some other statistics the results of CalcStyleDifference:
  id, # of no change results, # of change results
 'homescreen',  24, 3
 'statusbar',   47, 3
 'appWindow_2', 34, 3
 'appWindow_2', 36, 1
 'homescreen',  26, 1
Isn't CalcStyleDifference takes much of time even if status bar changes only visibility? 
(It somehow recursive into each child for diff, even if only statusbar container div changes)

BTW, you can see the diff result by enable RESTYLE_LOGGING. (It is currently depend on DEBUG, but it is easy to enable standalone.)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #48)
> Running this again, I get these counts:
> 
> 'homescreen' 2182
> 'statusbar' 12749

Or 'homescreen' 2160 w/ 36 FileRules, 12 ResolveStyleFor
   'statusbar' 12615 w/ 120 FileRules, 68 ResolveStyleFor
   'appWindow_2' 2777 w/ 46 FileRules, 14 ResolveStyleFor

You might ask why do we have 68 elements under statusbar to restyle. It looks like we basically have two copies of the statusbar. One as statusbar-maximized-wrapper and one as statusbar-minimized-wrapper. There are also a number of elements without ids under both statusbar-connections
(In reply to Chiajung Hung [:chiajung] from comment #50)
> Isn't CalcStyleDifference takes much of time even if status bar changes only
> visibility? 
> (It somehow recursive into each child for diff, even if only statusbar
> container div changes)

We call ResolveStyleFor on all of the children. This produces a new styleContext which we often need to compare to the old one.

> BTW, you can see the diff result by enable RESTYLE_LOGGING. (It is currently
> depend on DEBUG, but it is easy to enable standalone.)

If you have a log, do you mind attaching it to the bug?
What if we manipulate the style by JS directly instead of adding a "hidden" class? For instance:

  case 'homescreenopening':
  case 'appopening':
    this.element.style.opacity = 0;
    this.element.style.visibility = hidden;
//    this.element.classList.add('hidden');
    break;

Will this remove the rules matching?
(In reply to Ting-Yu Chou [:ting] from comment #53)
> What if we manipulate the style by JS directly instead of adding a "hidden"
> class? For instance:
> 
>   case 'homescreenopening':
>   case 'appopening':
>     this.element.style.opacity = 0;
>     this.element.style.visibility = hidden;
> //    this.element.classList.add('hidden');
>     break;
> 
> Will this remove the rules matching?

It should yes.
Attached file log.txt (obsolete) —
Here is a sample output by apply 0001-Enable-Restyle-log.patch. 
I filtered out all other processes log, so only b2g process log inside. And the origin log for RESTYLE_LOGGING flag is a little hard to figure out which element it is working on.
Attachment #8545717 - Attachment is obsolete: true
Attachment #8549314 - Attachment is obsolete: true
(In reply to Jeff Muizelaar [:jrmuizel] from comment #54)
> (In reply to Ting-Yu Chou [:ting] from comment #53)
> > What if we manipulate the style by JS directly instead of adding a "hidden"
> > class? For instance:
> > 
> >   case 'homescreenopening':
> >   case 'appopening':
> >     this.element.style.opacity = 0;
> >     this.element.style.visibility = hidden;
> > //    this.element.classList.add('hidden');
> >     break;
> > 
> > Will this remove the rules matching?
> 
> It should yes.

I don't think so. Such change just make another kind of style rule in nsStyleSet. And we in fact need go through all style rules in style set.

I think scoped style should be helpful if matching is the problem.
(However, AppWindow, statusbar style is rather hard to be scoped correctly, and in my test last time, it have no or little benefit here. Maybe I did something wrong in my test)
(In reply to Chiajung Hung [:chiajung] from comment #56)
> I don't think so. Such change just make another kind of style rule in
> nsStyleSet. And we in fact need go through all style rules in style set.
> 
> I think scoped style should be helpful if matching is the problem.
> (However, AppWindow, statusbar style is rather hard to be scoped correctly,
> and in my test last time, it have no or little benefit here. Maybe I did
> something wrong in my test)

Shouldn't setting the style directly avoid having to do style resolution on all of the child elements?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #57)
> (In reply to Chiajung Hung [:chiajung] from comment #56)
> > I don't think so. Such change just make another kind of style rule in
> > nsStyleSet. And we in fact need go through all style rules in style set.
> > 
> > I think scoped style should be helpful if matching is the problem.
> > (However, AppWindow, statusbar style is rather hard to be scoped correctly,
> > and in my test last time, it have no or little benefit here. Maybe I did
> > something wrong in my test)
> 
> Shouldn't setting the style directly avoid having to do style resolution on
> all of the child elements?

Setting the style attribute rather than a class should:

 * definitely avoid rerunning selector matching on descendants (which is a maybe with setting a class, thanks to RestyleHintForOp)
   * if this is avoided, we also potentially benefit from the optimization in bug 931668 once that relands


 * definitely avoid rerunning selector matching on the element itself (never avoided with class, presuming there's a style change on the element)
Attached file log.txt
The old log is wrong. 

(This log is done after change gaia side based on comment 53)
Attachment #8563264 - Attachment is obsolete: true
Attached file log.txt
Here is a log with element id, which I logged in ComputeStyleChangeFor.

In this log you can search statusbar and still find a lot of RestyleContentChildren follows. 

And the profile for it shows no improvement:
http://people.mozilla.org/~bgirard/cleopatra/#report=1b0f05cc88414ede3fa196387d832fb1e9184e71&filter=%5B{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A1913,%22end%22%3A2524},{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A2190,%22end%22%3A2322}%5D&selection=0,1,333,430,444
The same profile with stackwalk:

http://people.mozilla.org/~bgirard/cleopatra/#report=4c6d5ab47ad5ad75ffc05297f06f5a7b217a041c&filter=%5B{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A3194,%22end%22%3A3555}%5D&selection=0,1,45,181,182,94,1079

(BTW, my gaia change is 
       case 'appopening':
-        this.element.classList.add('hidden');
+        this.element.style.opacity = 0;

       case 'appioened':
         this.setAppearance(evt.detail);
-        this.element.classList.remove('hidden');
+        this.element.style.opacity = 1;
Attached patch statusbar.patchSplinter Review
Gaia change based on previous comment
(In reply to Chiajung Hung [:chiajung] from comment #60)
> In this log you can search statusbar and still find a lot of
> RestyleContentChildren follows. 

bug 1125391, which reenables bug 931668, is likely to help that, although not if there's something that makes it miss the optimized codepath.  If RestyleContentChildren is the bulk of the time, then the class selector change is probably a eRestyle_Self thanks to RestyleHintForOp.
Assignee: nobody → chung
Status: NEW → ASSIGNED
This now grows even higher, the nsRefreshDriver::Tick() [2950,3180] after awm_switchApp() [2850,2915] takes >200ms.

http://people.mozilla.org/~bgirard/cleopatra/#report=eade3b584f5ca40155a64079551f959177c84f65

This is captured on a Flame 319MB v18D while launching SMS (no load) on v2.2.
(In reply to Ting-Yu Chou [:ting] from comment #64)
> This now grows even higher, the nsRefreshDriver::Tick() [2950,3180] after
> awm_switchApp() [2850,2915] takes >200ms.
> 
> http://people.mozilla.org/~bgirard/cleopatra/
> #report=eade3b584f5ca40155a64079551f959177c84f65
> 
> This is captured on a Flame 319MB v18D while launching SMS (no load) on v2.2.

It should be possible to find a regression window. I expect that would be very helpful.
Are we actively finding the regression window? 

This bug doesn't have any update for 10 days. Chiajung, can you share the latest status with all of us? Thank you.
Flags: needinfo?(chung)
Last time, Jeff found the huge restyle is in fact unexpected huge amount of style matching. As :dbaron stated, the statusbar hidden/show should not involving children restyles but in fact the style in system app make it ask style matching for all decendents of status bar. Since status bar has huge amount of decendents, the style matching becomes expensive even if each match takes under 1ms.

The related style is a rule of "[something] not(.maximized) [some other thing]": every status bar's style change hit "not" part, and since these rules have [some other thing] part, which require style matches of its children, and make the status bar style change super expensive in terms of style matching.

To reduce such kind of cost, a huge change in system app is needed. Which involves naming all statusbar element and use id selector. And/Or style children via javascript logic rather than a super long tricky css selector.

In our test, avoiding the style matching of status bar's decendents reduces half style time of status bar.

This part is what I can tell. Let's ask Jeff for some more detail.

@Jeff
Can you help summary your findings for this bug?
Flags: needinfo?(chung) → needinfo?(jmuizelaar)
The summary is basically what Chiajung wrote. The CSS design for the status bar is too expensive. Some of the problems are:
- Too many elements in the status bar. When I was looking we had a duplicate copy of the status bar elements
- Complicated descendent selectors that cause us to do style resolution for all of these elements
- A large number of style rules that cause us to do a lot of work for each style resolution.

I've put up a small document at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Investigating_CSS_Performance that includes a patch that I used for analyzing the restyle preformance.
Flags: needinfo?(jmuizelaar)
As discussion with Chiajung, a persistent status bar could possibly mitigate this situation. Attached prototype of persistent status bar (by Chiajung): https://www.youtube.com/watch?v=Xgob3_lAaPw. However, it requires UX consensus to make further decision.
Rob, I think comment 69 is worth to be applied. could you have inputs per UX perspective?
Flags: needinfo?(rmacdonald)
(In reply to Bobby Chien [:bchien] from comment #69)
> As discussion with Chiajung, a persistent status bar could possibly mitigate
> this situation. Attached prototype of persistent status bar (by Chiajung):
> https://www.youtube.com/watch?v=Xgob3_lAaPw. However, it requires UX
> consensus to make further decision.

I had a chance to speak with my ux colleagues as well as the engineers who were involved in the original status bar development. There were some tweaks we'd like to see from the visual side and Etienne suggested that we contact Eitan regarding potential accessibility issues. But otherwise everyone was on board with the persistent status bar proposal.

Please flag Eric Pang and I when the changes are available for review or NI us if you have any additional questions. (NI'ing Eric as an FYI for now.)

Thanks!
Rob
Flags: needinfo?(rmacdonald) → needinfo?(epang)
Why is this blocking 2.2, have we done a regression check here? The per-app statusbar was introduced in 2.1, so we should've caught this earlier if it's a blocker.

(In reply to Bobby Chien [:bchien] from comment #69)
> As discussion with Chiajung, a persistent status bar could possibly mitigate
> this situation. Attached prototype of persistent status bar (by Chiajung):
> https://www.youtube.com/watch?v=Xgob3_lAaPw. However, it requires UX
> consensus to make further decision.

It sounds like we want to move away from the per-app status/rocketbars and move to a global statusbar again. This has huge UX implications, and is a non-trivial fix. There is no way that we can make that change in 2.2 at this point. 3.0 is a possibility, but we would need to see a UX spec first. We need to rewrite a large portion of rocketbar if this is what we want to do.
As UX perspective in comment 71 and engineering effort in comment 72, the persistent start bar proposal (comment 69) is worth to consider implementing. The proposal could reduce animation works during app startup, and could have new experience benefit in status bar. However, consider to last minute effort and potential risk in v2.2. I'd like suggest the implementation to next version (3.0?).
blocking-b2g: 2.2+ → ---
feature-b2g: --- → 3.0?
tracking-b2g: --- → +
Blocks: 1163471
Blocks: 1191339
Chiajung is no longer working on this.
Assignee: ffantasy1999 → nobody
Status: ASSIGNED → NEW
(In reply to Kevin Grandon :kgrandon from comment #72)
> It sounds like we want to move away from the per-app status/rocketbars and
> move to a global statusbar again.

No, it is not about per-app or global status/rocketbars. What we'd like to do is stop hiding and showing the statusbar while launching application, which the statusbar will be just there (video clip at comment 69).
Blocks: 1194121
sb_handleEvent() receives 3 stackchanged events before appwillopen, which will remove class maximized from #statusbar when it is still visible, and for some reasons I haven't figured out, will trigger #windows subtree restyle which takes time.
There're sibling selectors for #statusbar [1], which match when #statusbar:

  1. has class "maximized" removed
  2. has class "hidden" added
  3. has class "hidden" removed

RestyleTracker::DoProcessRestyles() [2] will add all #statusbar's later siblings (more than 30) to pending restyle with hint eRestyle_Subtree, and #windows is one of them.

I'll try MutationObserver instead.

[1] http://mxr.mozilla.org/gaia/source/apps/system/style/app_titlebar.css#65
[2] https://dxr.mozilla.org/mozilla-central/source/layout/base/RestyleTracker.cpp#301
(In reply to Ting-Yu Chou [:ting] from comment #77)
> I'll try MutationObserver instead.

Tried, but this doesn't seem a good idea.

There's no clear timing to disconnect the observer except when destroy AppWindow. That means if there're many apps running in background, each will receive AttributeWillChange() in platform when there's any attribute in the dom tree changed.

Any other ideas to monitor element A's class attribute to change element B/C's style?
I don't see any differences by simply removing the sibling selectors [1], the transition bug 1047988 would like to make still works.

Kevin, will there anything break if I remove them?

Note on Aries, removing those selectors can make the two big restyles after awf_launch() from ~60ms, 80ms to ~10ms, 20ms.

[1] http://mxr.mozilla.org/gaia/source/apps/system/style/app_titlebar.css#65
Flags: needinfo?(kevingrandon)
Get rid of statusbar hide/show when sibling selectors are removed doesn't make any differences.
I am not exactly clear on what you're trying to do here, can you provide a patch? Perhaps in another bug that blocks this one?
Flags: needinfo?(kevingrandon) → needinfo?(janus926)
Depends on: 1196604
(In reply to Kevin Grandon :kgrandon from comment #81)
> I am not exactly clear on what you're trying to do here, can you provide a
> patch? Perhaps in another bug that blocks this one?

Bug 1196604.
Flags: needinfo?(janus926)
Depends on: 1198657
Depends on: 1179723
Depends on: 1202528
Priority: -- → P1
Depends on: 1218844
No longer depends on: 1218844
Summary: Minimize the overhead of app launch animation (~60ms) → Minimize the overhead from system app when launch an app
Depends on: 1220550
(In reply to Rob MacDonald [:robmac] from comment #71)
> (In reply to Bobby Chien [:bchien] from comment #69)
> > As discussion with Chiajung, a persistent status bar could possibly mitigate
> > this situation. Attached prototype of persistent status bar (by Chiajung):
> > https://www.youtube.com/watch?v=Xgob3_lAaPw. However, it requires UX
> > consensus to make further decision.
> 
> I had a chance to speak with my ux colleagues as well as the engineers who
> were involved in the original status bar development. There were some tweaks
> we'd like to see from the visual side and Etienne suggested that we contact
> Eitan regarding potential accessibility issues. But otherwise everyone was
> on board with the persistent status bar proposal.
> 
> Please flag Eric Pang and I when the changes are available for review or NI
> us if you have any additional questions. (NI'ing Eric as an FYI for now.)
> 
> Thanks!
> Rob
Flags: needinfo?(epang)
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: