Closed Bug 1198266 Opened 5 years ago Closed 3 years ago

[Messages] Use ConversationService in the application

Categories

(Firefox OS Graveyard :: Gaia::SMS, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: julienw, Unassigned)

References

Details

(Whiteboard: [sms-sprint FxOS-S8 p=1][sms-sprint FxOS-S7 p=3][sms-sprint FxOS-S6 p=2])

Attachments

(3 files)

In this bug we want to start using ConversationService in the application, especially at startup.
p=2 in this sprint, but we don't intend to fix all this in this sprint. However we intend to have solutions ready.
Whiteboard: [p=2]
Assignee: nobody → azasypkin
Status: NEW → ASSIGNED
Here are some results from profiling with WebIDE (that most significantly affect us on startup):

1. l10n.js initialization takes about ~50ms (sometimes ~70):
  1.1 ~10-15 is spent on Pseudo locale initialization that even isn't used;
  1.2 ~20 - init resources, we can't likely avoid that;
  1.3 the rest is distributed among other things.

2. date_time_helper spends about 15ms to initialize, time mainly spent on Settings.createLock;

3. gaia_header spends ~85ms for it's "setupShadowRoots" function and next ~16ms for "injectLightCss" 

4. then gaia_header makes two nextTick calls that cost us:
  4.1 ~20 (boils down to getTextWidth);
  4.2 ~10ms (setTitleStyle).

5. then we see huge restyle/reflow ~165ms caused by gaia-header's sumButtonWidth;

6. then ~16ms for l10n.js io_onload followed by ~10ms reflow;

7. then I see some weird ~10ms layout + ~10ms paint that seems related to our hidden iframe - need to check, not sure what we paint here;

8. then a big pause (looks like it's one when SharedWorker is created), I see a lot of micro (<0.5ms) "Recalculate Style" entries during this pause, not sure what it means;

9. once connection is established, we see that mozMobileMessage.getThreads.onsuccess doesn't give chance to pass retrieved thread to the bridge (from 4 to 7 "onsuccess" callbacks on the stack before we pass message through the bridge).

10. interesting thing once first conversation is retrieved by the view, I see two big Non-Incremental GC events ~33ms and ~12ms with the reason "Too many 
allocations".

11. restyle/reflow for the single conversation node is ~10ms, paint ~15ms, mutation callback to localize (l10n) is about ~5ms

12. when we render first page we synchronously call "visually-loaded" that initializes lazy loading - we spend here ~80ms (mostly appending "script" nodes);

13. Once all lazy scripts are loaded, we initialize lazy objects during ~80ms, half of that time is spent on Settings.init.

14. After that I don't see anything suspicious, except for the fact that somewhere around the time when we start lazy loading I see exceptionally long Paints - 40-50ms, I suspect it may be related to the fact that Inbox starts to render conversations with gradient photo placeholder.

So action reminder for myself:
1 - Talk to l10n people, to see if we can get rid of this Pseud locale init - not a big win, but still looks not reasonable to always initialize Pseudo locale;

3 - 5, especially 5 - Talk to Julien to know if it's expected and if we can do smth with it;

7 - Try to understand if it's related to our iframe;

8 - Will see what happens if I make SharedWorker initialization as the first js statement when app is loading.

9 - Will play with different "setImmediate" shims mentioned in [1].

10 - Want to understand what causes these GC's, currently WebIDE crashes when I try to turn on "Record Allocations" option [2]. Will try other tools we have, and ask Gecko people if I can't figure this out by myself.

12 - Should we make build step that concatenates all this files into one?

14 - Will check if it's because of our gradient photo placeholders.

So the majority of the issues are the same as we have on master, the new thing that makes things worse is SharedWorker (MessageChannel) + iframe initialization.
Hey Zibi,

Is it expected that l10n.js lib spends 10-15ms (sometimes I even see that it takes ~20ms) to initialize Pseudo locales right on the startup [1] even if don't use it? Can it be initialized later or even when user switches Pseudo locale on (see comment 2 point 1 and attachment 8653486 [details] for the profile)?

[1] https://github.com/mozilla-b2g/gaia/blob/master/shared/js/l10n.js#L1131

Thanks!
Flags: needinfo?(gandalf)
Hey Julien,

As we talked offline could you please take a look at the points 3-5 in comment 2? At least confirm that it is expected or I misunderstood profile :)

Thanks!
Flags: needinfo?(felash)
Oleg: I'm currently investigating a PR to move Pseudo to lazy in l20n.js - https://github.com/l20n/l20n.js/pull/70

I'll probably land it this week, but you'll benefit from it only when you move from l10n.js to l20n.js (you're mostly ready, so it should be a painless switch - :stas is in charge of that). I don't think we'll backport those changes to l10n.js, which is basically in maintenance mode and all our effort is right now in improving l20n.js and converting apps to it.

I've been doing a lot of perf/mem investigations around l20n.js, and I would love to work with you on any data you can get me!
One bit that you posted is:

> 6. then ~16ms for l10n.js io_onload

That should not take 16ms, since it's asynchronous JSON load. I'm also investigating switching it to Fetch API, but waiting for bug 1196592 to be solved.
Flags: needinfo?(gandalf)
(In reply to Oleg Zasypkin [:azasypkin][⏰UTC+1] from comment #2)
> Created attachment 8653486 [details]
> profile-2015-08-27.json
> 

Just a note that I shared it via cleopatra: http://people.mozilla.org/~bgirard/cleopatra/#report=8be08798a63cbff71e168260ef3c3dbeff22f32d
Flags: needinfo?(felash)
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #5)
> Oleg: I'm currently investigating a PR to move Pseudo to lazy in l20n.js -
> https://github.com/l20n/l20n.js/pull/70
> 
> I'll probably land it this week, but you'll benefit from it only when you
> move from l10n.js to l20n.js (you're mostly ready, so it should be a
> painless switch - :stas is in charge of that). I don't think we'll backport
> those changes to l10n.js, which is basically in maintenance mode and all our
> effort is right now in improving l20n.js and converting apps to it.

Good to know, thanks!

> 
> I've been doing a lot of perf/mem investigations around l20n.js, and I would
> love to work with you on any data you can get me!

Sure, are you guys going to integrate l20n.js in SMS/any-other-app in v2.5?

> One bit that you posted is:
> 
> > 6. then ~16ms for l10n.js io_onload
> 
> That should not take 16ms, since it's asynchronous JSON load. I'm also
> investigating switching it to Fetch API, but waiting for bug 1196592 to be
> solved.

I'll double check what exactly takes time here.
Returning ni?, looks like you've accidentally removed it in comment 6 :)
Flags: needinfo?(felash)
Yes sorry, I'm gonna defer this to Wilson as I'm in PTO for 1 week.
Hey Wilson, can you have a look at comment 2 and see if the time spent in gaia-header looks normal ? This is on a page that uses title-start/title-end.
Flags: needinfo?(felash) → needinfo?(wilsonpage)
(In reply to Julien Wajsberg [:julienw] (PTO Sept 2nd -> Sept 9th) from comment #9)
> Yes sorry, I'm gonna defer this to Wilson as I'm in PTO for 1 week.
> Hey Wilson, can you have a look at comment 2 and see if the time spent in
> gaia-header looks normal ? This is on a page that uses title-start/title-end.

When using title-start/title-end gaia-header's `sumButtonWidths` function should not even be called [1], so something strange is going on here. I'd need some STRs so I can dig in further here.

I'm also surprised how long `.setupShadowRoot()` is taking, but would need platform help to find out where the time is being spent here. A rough guess is that it's causing some recalc style when we append the `<style>`.

[1] https://github.com/gaia-components/gaia-header/blob/master/gaia-header.js#L605
Flags: needinfo?(wilsonpage)
(In reply to Wilson Page [:wilsonpage] from comment #10)
> (In reply to Julien Wajsberg [:julienw] (PTO Sept 2nd -> Sept 9th) from
> comment #9)
> > Yes sorry, I'm gonna defer this to Wilson as I'm in PTO for 1 week.
> > Hey Wilson, can you have a look at comment 2 and see if the time spent in
> > gaia-header looks normal ? This is on a page that uses title-start/title-end.
> 
> When using title-start/title-end gaia-header's `sumButtonWidths` function
> should not even be called [1], so something strange is going on here. I'd
> need some STRs so I can dig in further here.

Seems I've found out what is causing `sumButtonWidths` - we remove "no-font-fit" from _edit mode_ header right inside `InboxView.beforeEnter`, we likely can postpone this until "visuallyLoaded".

> I'm also surprised how long `.setupShadowRoot()` is taking, but would need
> platform help to find out where the time is being spent here. A rough guess
> is that it's causing some recalc style when we append the `<style>`.

As Wilson figured out - we have 6 setupShadowRoots calls for every gaia-header we have in markup. Every call costs about ~20ms. We can try to add gaia-headers for edit modes, report and group views a bit later.

Still I'm profiling to understand where exactly we spend these 20ms for single setupShadowRoots, will follow up later.
Whiteboard: [p=2] → [sms-sprint FxOS-S7 p=3][sms-sprint FxOS-S6 p=2]
Hey Ting-Yu,

Right now I'm trying to figure out why "setupShadowRoot" [1] of gaia-header web component takes ~20ms, I've got profile with WebIDE [2], but it didn't really clarify things for me :)

So I hoped to get more details with Gecko profiler, but result profile [3] doesn't contain any gaia-header related calls at all, so there is no "setupShadowRoot".

I use my own B2G_PROFILING=1 build for flame-kk. The test case is the same for [2] and [3]: I use custom empty page with link as a startup Messages page, then I run profiling (via "./profile.sh start -p b2g -t Compositor,GeckoMain -f js,leaf,threads,stackwalk && ./profile.sh start -p Messages") and click the link that leads to a page with gaia-header web component defined, once page is fully loaded I capture profile with "./profile.sh capture".

Could you please help me to understand what I'm doing wrong with Gecko profiler or even maybe you have clue why "setupShadowRoot" takes that time?

Thanks a lot!

[1] https://github.com/mozilla-b2g/gaia/blob/47459eead04385e22f967012b824f5abdddcfb7c/shared/elements/gaia-header/dist/gaia-header.js#L252

[2] http://people.mozilla.org/~bgirard/cleopatra/#report=f5a70884e77522db5f5df3eca55f848cd222d51d

[3] http://people.mozilla.org/~bgirard/cleopatra/#report=460650cf2a8873361a8891ea9a021bb2e5911f38
Flags: needinfo?(janus926)
From [3], it seems the samples are dropped. Try "-e 2000000" or some larger number with |./profile.sh start| can avoid it.

For setupShadowRoot(), from the profiles in bug 1143580, it's parsing the inline style rules, I guess it's the rules within template.
Flags: needinfo?(janus926)
Well, setupShadowRoot adds styles, etc, so it's likely provoking a reflow...

I think the best would be to create a small testcase using gaia-header and profile this...
Hey guys,

Here is very WIP patch with "upgradable" client. PR consists of 3 commits:
* Removing draft handling from ConversationService for now, to have fair comparison with master - maybe it's better to add in the separate patch;

* Moving ConversationService to window context;

* Draft idea on how we can upgrade service to SharedWorker once we have time/resources available.

How do you feel about the idea? Any feedback will be more than helpful :)

Thanks!
Attachment #8659761 - Flags: feedback?(schung)
Attachment #8659761 - Flags: feedback?(felash)
One more thing: if this or similar idea sticks I want to try the same thing for mozMobileMessageShim, load it in window for startup and then upgrade it to alwaysLowered window.
Well, I completely redone previous PR yesterday :) In this WIP I've changed the way we handle Client/Service upgrade (aka switch to SharedWorkers) + added alwaysLowered window support.

It still has a bunch of issues and it can't be tested on device due to bug 1204299, but the idea should be clear.

What do you think about it?

Thanks!
Attachment #8660400 - Flags: feedback?(schung)
Attachment #8660400 - Flags: feedback?(felash)
Comment on attachment 8659761 [details] [review]
GitHub pull request URL (wip, upgradable client)

Deferring feedback on this one :)
Attachment #8659761 - Attachment description: GitHub brach URL (wip, upgradable client) → GitHub pull request URL (wip, upgradable client)
Attachment #8659761 - Flags: feedback?(schung)
Attachment #8659761 - Flags: feedback?(felash)
Comment on attachment 8660400 [details] [review]
GitHub pull request URL (WIP, ServiceManager)

Add some questions on github, I have some concern about the timing of the service/shim init at the cold launch path, but maybe I misunderstood it.
(In reply to Steve Chung [:steveck] from comment #19)
> Comment on attachment 8660400 [details] [review]
> GitHub pull request URL (WIP, ServiceManager)
> 
> Add some questions on github, I have some concern about the timing of the
> service/shim init at the cold launch path, but maybe I misunderstood it.

Replied at GitHub.
p=1 only to give feedback.
Whiteboard: [sms-sprint FxOS-S7 p=3][sms-sprint FxOS-S6 p=2] → [sms-sprint FxOS-S8 p=1][sms-sprint FxOS-S7 p=3][sms-sprint FxOS-S6 p=2]
Comment on attachment 8660400 [details] [review]
GitHub pull request URL (WIP, ServiceManager)

For me the solution should work. I think the original idea of integrate all the service related initialization and upgrade mechanism for avoiding possible performance regression is good, but I think it might need more memory for the update. I have some concern about the memory usage in this patch because it already regressed in master and we have a 2.5 blocker for it. Could we roughly measure the memory impact for the changes? Thanks!
Attachment #8660400 - Flags: feedback?(schung) → feedback+
(In reply to Steve Chung [:steveck] from comment #22)
> Comment on attachment 8660400 [details] [review]
> GitHub pull request URL (WIP, ServiceManager)
> 
> For me the solution should work. I think the original idea of integrate all
> the service related initialization and upgrade mechanism for avoiding
> possible performance regression is good, but I think it might need more
> memory for the update.

Thanks for the feedback!

> I have some concern about the memory usage in this
> patch because it already regressed in master and we have a 2.5 blocker for
> it. Could we roughly measure the memory impact for the changes? Thanks!

Sure, I'm going to take a look at the memory numbers closer.

Thanks
Comment on attachment 8660400 [details] [review]
GitHub pull request URL (WIP, ServiceManager)

at last gave a feedback !

With the "little-browser" logic, I wonder if this will be so useful.. or maybe we'll need to adjust to make some of it live in the content wrapper ? Things to think about in the future..
Attachment #8660400 - Flags: feedback?(felash) → feedback+
(In reply to Julien Wajsberg [:julienw] from comment #24)
> Comment on attachment 8660400 [details] [review]
> GitHub pull request URL (WIP, ServiceManager)
> 
> at last gave a feedback !

Yay!

> 
> With the "little-browser" logic, I wonder if this will be so useful..

It least in my head, I still see the value in "upgradable" part, so that we can use fast-and-dirty configuration on startup and switch to more memory-manageable/shareable configuration later.

> or maybe we'll need to adjust to make some of it live in the content wrapper ?
> Things to think about in the future..

Yeah, this will definitely have to be adjusted to live with content wrapper instead of always-lowered window.
Byt
Waiting for feedback/decision on the solution proposed in bug 1224514. If we move this PR forward it should be based on the outcome of bug 1224514 anyway.
Status: ASSIGNED → NEW
Depends on: 1224514
Unassigning myself here for now, hopefully will get back to it later on.
Assignee: azasypkin → nobody
Mass closing of Gaia::SMS bugs. End of an era :(
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
Mass closing of Gaia::SMS bugs. End of an era :(
You need to log in before you can comment on or make changes to this bug.