Closed Bug 809600 Opened 12 years ago Closed 6 years ago

l10n.js takes a lot of time during app startup

Categories

(Firefox OS Graveyard :: Gaia::L10n, defect, P3)

ARM
Gonk (Firefox OS)
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: Margaret, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: perf, Whiteboard: [c=progress p= s= u=])

Let's use this bug to track our work to improve l10n.js. I think we can file dependent bugs as we actually make patches to land.

Here are the things we've been working on so far:
-Remove .ini files and instead directly load .properties files to avoid extra file-load I/O
-Get rid of the synchronous XHR used to load .properties files
-Translate index.html files at build time to avoid needing to translate them during startup
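For reference, a rough sketch of what the async .properties load could look like; the inline parser here is deliberately simplified (no multi-line values or escapes) and the names are illustrative, not the actual l10n.js code:

function loadProperties(href, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', href, true); // async, instead of the current synchronous XHR
  xhr.overrideMimeType('text/plain; charset=utf-8');
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return;
    if (xhr.status !== 200 && xhr.status !== 0) return; // status 0 shows up for local loads
    var strings = {};
    xhr.responseText.split('\n').forEach(function (line) {
      line = line.trim();
      if (!line || line[0] === '#' || line[0] === '!') return; // skip comments
      var eq = line.indexOf('=');
      if (eq > 0) strings[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
    });
    callback(strings);
  };
  xhr.send(null);
}

// e.g. loadProperties('locales/settings.en-US.properties', storeStrings);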
Blocks: 797195
Depends on: 810448
Depends on: 810450
I filed some dependent bugs for the work jhford and I are doing.

I also have a branch here that adds some timers to l10n.js that I've been using to measure differences between approaches:
https://github.com/leibovic/gaia/tree/timers
So, jhford and I made some patches to include pre-processed l10n JSON data in the HTML of the page (first in comments in a hidden <span> element, then in <script type="text/l10n"> tags). This hugely improved the time spent in loadLocale() in l10n.js, but we found that app startup time was actually slower :( We hypothesize that the extra data in the HTML is slowing things down, but we're not exactly sure what to do about this.
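For clarity, the embedded data is read back roughly like this (the JSON payload is made up, and gL10nData just stands in for whatever structure l10n.js keeps internally):

// markup, at the bottom of index.html:
//   <script type="text/l10n">{"settingsTitle":{"textContent":"Settings"}}</script>
// and on the JS side:
var inline = document.querySelector('script[type="text/l10n"]');
var gL10nData = inline ? JSON.parse(inline.textContent) : {};
// no .ini/.properties file I/O and no synchronous XHR on this path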

Here's a profile of the settings app startup with only the en-US data in a <script type="text/l10n"> tag:
http://people.mozilla.com/~bgirard/cleopatra/#report=590da29076655e20c42656bc37b35f6144ca7565&filter=%255B%257B%2522type%2522%253A%2522RangeSampleFilter%2522%252C%2522start%2522%253A5909%252C%2522end%2522%253A6372%257D%255D&selection=%255B%2522Startup%253A%253AXRE_InitChildProcess%2522%252C%2522Timer%253A%253AFire%2522%255D
Margaret, why did you rule out the idea of fully pre-processing the files (without including l10n JSON data)?
To do that correctly, we'd need to use an HTML5 parser and serializer.  The HTML parser that I used to try that (html5lib, python) was reading the HTML5 fine, but changed attribute ordering and turned bare attribute names into null string assignments.  Basically, <a href="site" special> became <a special="" href="site">.  We'd also want to use the Gecko HTML5 parser to do this for real.  If we have a way to parse the markup, change the strings, and serialize back to markup that's unchanged aside from the strings, this is definitely an option.

I did some timing runs on the phones using the same markup, but with and without Margaret's JS changes to read the extra data, and it looks like Margaret's code path is much faster with identically sized markup.  That suggests that the file I/O is causing a lot of overhead.

The JSON data is very inefficiently encoded right now.  Possible changes:

1) Remove the nested dictionaries.  Almost all of them are {"textContent": realValue}.  Removing that level of nesting, at the expense of some branching in l10n.js, might be useful if we are I/O bound and not CPU bound (sketch below, after this list).  For the cases of x.y = z in the property file, we could fall back to the dictionary subkey.  We could also remove all unneeded whitespace from the JSON.

2) We currently insert all the JSON for all of the locales into a single piece of markup.  If we instead generated HTML files that include only a single locale and use the index.html.ab-CD trick, we would be doing fewer file opens and possibly reading less data than we currently do (due to collating and overriding en-US at build time instead of run-time).

3) We currently insert all of the JSON data after the </body> tag.  We could try different places in the markup to see if that has any effect, also different tag types.  We used display: none for hiding, but we could try other things as well.

4) If we really are I/O bound and not CPU bound, we could compress the L10N data and decompress it on the client side.
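To make (1) concrete, something roughly like this at build time, with a matching branch in l10n.js (names here are illustrative, not the real ones):

function flattenL10nData(data) {
  var out = {};
  Object.keys(data).forEach(function (key) {
    var entry = data[key];
    var props = Object.keys(entry);
    // collapse the common {"textContent": value} case into a bare string
    out[key] = (props.length === 1 && props[0] === 'textContent')
      ? entry.textContent
      : entry; // keep multi-property entries (e.g. the x.y = z cases) as objects
  });
  return out; // JSON.stringify(out) also drops all pretty-printing whitespace
}

// l10n.js side: branch on the type instead of always expecting an object
function getTextContent(data, key) {
  var value = data[key];
  return typeof value === 'string' ? value : value && value.textContent;
}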
A few comments:

Processing to a single processed locale makes it impossible to switch locales, I think we want to avoid that.

One thing we intend to do on the l20n project is to compile to JS code that returns strings for input. That might be more helpful than JSON. Shipping JSON often just means you're not compiling to something efficient: you parse the JSON and then compile it into something efficient anyway.
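To illustrate (this is just the general idea, not the actual l20n output format): instead of shipping JSON that gets parsed and then turned into lookup structures, ship something that's already executable:

var strings = {
  settingsTitle: function () { return 'Settings'; },
  unreadMessages: function (n) {
    return n === 1 ? '1 unread message' : n + ' unread messages';
  }
};
// strings.unreadMessages(3) -> "3 unread messages"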

Do we have comparisons of profiles before the current attempt and after?
(In reply to John Ford [:jhford] from comment #4)
> To do that correctly, we'd need to use an HTML 5 parser and serializer.  The
> HTML parser that I used to try that (html5lib, python) was reading the HTML5
> fine, but changed attribute ordering and turned bare attribute names into
> null string assignments.  

When using an HTML parser and serializer, are we able to handle HTML code as seen in the settings app, where due to performance reasons most of the code is commented out and the DOM is built dynamically?  See this commit:

https://github.com/mozilla-b2g/gaia/commit/978eda3b13d4326b3813a31f1928cf39cafbb8fc#L10R43

AIUI, we can't just get nodes by ID when they're hidden in a comment node.

(In reply to Axel Hecht [:Pike] from comment #5)
> Processing to a single processed locale makes it impossible to switch
> locales, I think we want to avoid that.

We could use the runtime localization approach for this scenario and go back to preprocessed HTML when the device (or the app?) is restarted.

preprocessed html 
  -> [locale change] -> runtime l10n (slower) 
  -> [restart] -> preprocessed html
Sounds like you're talking about a fast-load cache for system apps html, and that sounds like a big task to me.

Other hacky idea: Set body to display:none, and only show it after we went through with the DOM mods.
Other hacky idea: Use mutation observers to work on content as we get it, instead of trying to process it all post-content-ready.

Both attack slightly different causes of the perceived perf problem. The first tries to avoid redoing layout for each localized node. The second tries to tie in to incremental parsing, but I didn't check how much data you'd get at which point.
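The second idea would look roughly like this; getL10nData() is the existing lookup in l10n.js, the rest of the names are illustrative:

function translateElement(node) {
  // stand-in for the real per-node work in l10n.js
  var data = getL10nData(node.getAttribute('data-l10n-id'));
  if (data && data.textContent) node.textContent = data.textContent;
}

var observer = new MutationObserver(function (mutations) {
  mutations.forEach(function (mutation) {
    Array.prototype.forEach.call(mutation.addedNodes, function (node) {
      if (node.nodeType !== Node.ELEMENT_NODE) return;
      if (node.hasAttribute('data-l10n-id')) translateElement(node);
      Array.prototype.forEach.call(
        node.querySelectorAll('[data-l10n-id]'), translateElement);
    });
  });
});
// start observing before the parser reaches <body> so nodes are handled as they stream in
observer.observe(document.documentElement, { childList: true, subtree: true });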
(In reply to Axel Hecht [:Pike] from comment #7)
> Sounds like you're talking about a fast-load cache for system apps html, and
> that sounds like a big task to me.

Yeah, I think translating the HTML itself is a big task, and one that will probably have lots of bugs, something I don't think we can afford to do at this point for v1.

> Other hacky idea: Set body to display:none, and only show it after we went
> through with the DOM mods.

Currently, lots of apps wait for the "localized" event before showing any content. Some Gaia devs told me this was a best practice, but not everyone follows it.
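For reference, the pattern looks roughly like this (run from a script at the end of <body>; language.code and language.direction are what Gaia's l10n.js exposes, if I recall correctly):

document.body.style.visibility = 'hidden'; // keep content hidden until strings are in place
window.addEventListener('localized', function onLocalized() {
  window.removeEventListener('localized', onLocalized);
  document.documentElement.lang = navigator.mozL10n.language.code;
  document.documentElement.dir = navigator.mozL10n.language.direction;
  document.body.style.visibility = 'visible';
});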

> Other hacky idea: Use mutation observers to work on content as we get it,
> instead of trying to process it all post-content-ready.

Actually doing the DOM manipulation doesn't take very much time (so long as the app developer is smart and is only trying to load the HTML that is going to be shown). What's taking a long time is loading the JS dictionary structure that we use for translating the entities.

As jhford talked about in comment 4, we really seem to just be slowed down reading data, whether it's .properties files, or pre-processed JSON.

I'm going to do some more organized measurements today to gather data about our different options.

Here is a profile before any changes:
http://people.mozilla.com/~bgirard/cleopatra/#report=c6ec9dcd55c70c23ffd77a0c3a51ff7a2e7af94a&selection=%255B%2522Startup%253A%253AXRE_InitChildProcess%2522%252C%2522Timer%253A%253AFire%2522%255D

We definitely cut down on a lot of the l10n.js work between this profile and the "after" profile I posted before, but we either transferred the work somewhere else, or this work is irrelevant to first paint time (if apps are painting before getting the "localized" event).
Depends on: 811136
I've been experimenting with only including one locale in the HTML, and that speeds things up. In fact, even with all the locales, I'm seeing improvement from the numbers we were seeing last week. Maybe updating my gecko/gaia trees fixed something? 

In any case, users switch locales so infrequently that if we're at all worried about the extra overhead of including data for all the locales in every html file, we should just include the default locale.

I have a patch from fabrice to magically load locale-specific html files, so we can use that to only include the necessary data for each locale. jhford, can you try doing that? What we want is a <filename>.html.<locale> (e.g. index.html.en-US) for every processed html file, and we can just continue to stick the hidden strings for the appropriate locale at the bottom of that file.

Then we have the existing code path as a fallback if the user switches locales. This will still be slightly slow, but it is a case that the user rarely encounters, and it will only affect apps that are already running.
Great to see this bug!

Conveniently, I've just been working on performance metrics for the l20n project (bug 810463), and my first aim was to test the Settings app converted to use the l20n lib.

An obvious reference point is the current Gaia l10n lib, so I branched gaia, instrumented settings.js and l10n.js, and measured the whole localization process for the Settings app on an Unagi (gaia and B2G from Friday).

Here are my numbers:

Lib:
                                No  Min    Avg    Max    Cum
  ready:                                                 516.51
  pages:
    settings:                                            17.85

Context "main":
                                No  Min    Avg    Max    Cum
  resloading:                   6   98.08  129.61 160.83 777.68
    locales.ini                                          98.08
    date.ini                                             127.72
    permissions.ini                                      133.42
    settings-en-US.prop                                  115.17
    date-en-US.prop                                      142.46
    permissions-en-US.prop                               160.83

  parsing:                      6   1.92   107.29 235.14 643.77
    settings-en-US.prop                                  45.43
    date-en-US.prop                                      2.93
    permissions-en-US.prop                               1.92
    permissions.ini                                      171.81
    date.ini                                             186.49
    locales.ini                                          235.14
  execution:                    34  0.09   0.56   6.93   19.04


The perf test suite is still in the works, so these numbers will probably be useless as a comparison point later on since I'm still changing the lib, but they may be helpful to you guys.

Explanation:
 - "Lib" gives the numbers for the library usage itself
 - Context "main" is the only context you have - navigator.mozL10n

 - "ready" is the time between initializing locale loading for L10n and when "localized" event is fired. There must be a bug in my code since this number should not be lower than the total combined time of all resources loading. working on that.
 - "pages" is the cost of singe page localization (in this case, document.body of settings index.html)

 - resloading is the cost of loading resources.
 - parsing is the cost of parsing resources.
 - execution is the cost of each single node translation (getL10nData in l10n.js).

The current l10n.js library's parsing of an .ini file kicks off synchronous .properties loading for that file.
I did not have time to subtract this cost from the .ini parsing cost, so the parsing numbers for .ini files are the sum of the .ini file parsing cost, its .prop file's synchronous loading time, and its .prop file's parsing time. Subtract the latter two and you'll get the parsing cost of the .ini file :)

The overall conclusion confirms what you said before - resource loading costs the most. Then comes cheap parsing and execution.

I cannot measure the reflow cost with this, only raw time of JS (+DOM injection) cost.

My only suggestion for reflow would be to consider doing domFragment localization before the fragment is injected into the panel (in settings_loadPanel).
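i.e. something like this, assuming navigator.mozL10n exposes a translate(element) helper (loadPanel and buildNodes stand in for the real settings code):

function loadPanel(panel, buildNodes) {
  var fragment = document.createDocumentFragment();
  buildNodes(fragment);                   // build the panel's DOM off-document
  navigator.mozL10n.translate(fragment);  // localize while still detached
  panel.appendChild(fragment);            // one insertion, one reflow
}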

You can find the source code of the test lib (crappy, your eyes will hurt) and the placement of tickets in settings.js and l10n.js in my gaia branch here: https://github.com/zbraniecki/gaia/tree/perftest

The way to trigger the stats is to go to settings and click on the top header bar.
Depends on: 811540
So, my "ready" time is correct in the above metrics. The reason why the total resloading time is greater is because asynchronous loading overlaps.

Running this multiple times (after a cold reboot) gave me a huge variance - from 512ms to 1300ms - and slower resloading times correlated with slower parsing/execution times, which indicates to me that high system load slows down everything.

I'm wondering if slow file read time deserves a separate bug? Is it possible that it's on a kernel level? Shouldn't SSD be faster than 100ms per file?
(In reply to Zbigniew Braniecki [:gandalf] from comment #11)

> I'm wondering if slow file read time deserves a separate bug? Is it possible
> that it's on a kernel level? Shouldn't SSD be faster than 100ms per file?

I think it would be smart to file a new bug. I don't know how much of that is a fundamental limitation of the hardware vs. something that could be improved, but speeding up file read time would help a lot of things.
I'd like to get a superreview on the approach of doing pre-processed html. Personally, it rubs me backwards. I'd feel better if we had an exit strategy at least post-v1.

What we do here is close enough to an "API" on the web, at least if it's read as "best practices", so I think we should treat it like API-like architecture.
Is the potential super-reviewer CCed on this bug?
(In reply to Axel Hecht [:Pike] from comment #13)
> I'd like to get a superreview on the approach of doing pre-processed html.
> Personally, it rubs me backwards. I'd feel better if we had an exit strategy
> at least post-v1.

In parallel we can work to improve the existing dynamic loading code path, and if that gets super fast we can always remove this pre-processing step. This approach does not limit any options for the future.

> What we do here is close enough to an "API" on the web, at least if it's
> read as "best practices", so I think we should treat it like API-like
> architecture.

It's an "API", but it's a polyfill in a library that the app author chooses to include, so I don't think we need to worry too much about it. Also, the API still works if you don't have pre-processed files. This just adds an optimized fast-path for pre-built gaia apps (and if app developers want the same optimization, they can include the same pre-built JS in their HTML files, and beyond that they can include the same .locale HTML files in their .zip).

(In reply to Fabrice Desré [:fabrice] from comment #14)
> Is the potential super-reviewer CCed on this bug?

I discussed this approach with Andreas last week. Is he qualified to superreview?
(In reply to Margaret Leibovic [:margaret] from comment #15)
> (In reply to Axel Hecht [:Pike] from comment #13)
> > I'd like to get a superreview on the approach of doing pre-processed html.
> > Personally, it rubs me backwards. I'd feel better if we had an exit strategy
> > at least post-v1.
> 
> In parallel we can work to improve the existing dynamic loading code path,
> and if that gets super fast we can always remove this pre-processing step.
> This approach does not limit any options for the future.
> 
> > What we do here is close enough to an "API" on the web, at least if it's
> > read as "best practices", so I think we should treat it like API-like
> > architecture.
> 
> It's an "API", but it's a polyfill in a library that the app author chooses
> to include, so I don't think we need to worry too much about it. Also, the
> API still works if you don't have pre-processed files. This just adds an
> optimized fast-path for pre-built gaia apps (and if app developers want the
> same optimization, they can include the same pre-built JS in their HTML
> files, and beyond that they can include the same .locale HTML files in their
> .zip).

This is the "API" level that I'm talking about. I'm concerned that tons of app developers starting creating .de files just because we hack around a perf issue. Probably more along the lines of "Significant architectual refactoring", if I read the page referenced below.

> (In reply to Fabrice Desré [:fabrice] from comment #14)
> > Is the potential super-reviewer CCed on this bug?
> 
> I discussed this approach with Andreas last week. Is he qualified to
> superreview?

http://www.mozilla.org/hacking/reviewers.html doesn't list him. I'd try to get cycles from folks in the content/layout area.

I've chatted with gandalf just a little while ago, and he's off trying to get more numbers on how much time we spend on loading referenced files in general. It seems the l10n files are only a small fraction, and we're not sure whether they're special, or in which way they are.
(In reply to Axel Hecht [:Pike] from comment #13)
> I'd like to get a superreview on the approach of doing pre-processed html.
> Personally, it rubs me backwards. I'd feel better if we had an exit strategy
> at least post-v1.

We spoke about an l10n API in the platform in dev-webapi a while back. This would likely require some changes to the parser but would be much more efficient since it will be native. Does it sound good?
(In reply to Axel Hecht [:Pike] from comment #16)

> > It's an "API", but it's a polyfill in a library that the app author chooses
> > to include, so I don't think we need to worry too much about it. Also, the
> > API still works if you don't have pre-processed files. This just adds an
> > optimized fast-path for pre-built gaia apps (and if app developers want the
> > same optimization, they can include the same pre-built JS in their HTML
> > files, and beyond that they can include the same .locale HTML files in their
> > .zip).
> 
> This is the "API" level that I'm talking about. I'm concerned that tons of
> app developers starting creating .de files just because we hack around a
> perf issue. Probably more along the lines of "Significant architectual
> refactoring", if I read the page referenced below.

Okay, so let's not tell app developers they should do this. If we never communicate this as a supported way to do things, I don't think we need to worry about developers starting to rely on it. App developers won't know to make these .locale files unless they start poking around in the Gaia source (or read AppProtocolHandler.js). The one concern I would have is if developers include files with these extensions in their zip for some reason and then get unexpected behavior, but I think that's an edge case.

> > (In reply to Fabrice Desré [:fabrice] from comment #14)
> > > Is the potential super-reviewer CCed on this bug?
> > 
> > I discussed this approach with Andreas last week. Is he qualified to
> > superreview?
> 
> http://www.mozilla.org/hacking/reviewers.html doesn't list him. I'd try to
> get cycles from folks in the content/layout area.

So, there's 3 parts to this proposed change:
1) Change l10n.js to look for a pre-loaded JS object
2) Change the Gaia build system to generate these pre-processed HTML files
3) Change the app:// protocol to load different HTML files based on the locale

Parts 1 and 2 are in Gaia land, so technically that doesn't fall under these mozilla-central policies. We could do parts 1 and 2 without part 3, but that would involve potentially loading a lot more JS than necessary into each HTML file.
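Concretely, part 1 boils down to a check like this at the top of the locale-loading path (names are illustrative; loadLocaleFromFiles stands in for the existing .ini/.properties code):

function loadLocale(lang, callback) {
  var inline = document.querySelector('script[type="text/l10n"]');
  if (inline) {
    // fast path: pre-processed data was baked into the HTML at build time
    callback(JSON.parse(inline.textContent));
    return;
  }
  // fallback: the existing dynamic path, e.g. after a runtime locale switch
  loadLocaleFromFiles(lang, callback);
}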

(In reply to Vivien Nicolas (:vingtetun) from comment #17)
> (In reply to Axel Hecht [:Pike] from comment #13)
> > I'd like to get a superreview on the approach of doing pre-processed html.
> > Personally, it rubs me backwards. I'd feel better if we had an exit strategy
> > at least post-v1.
> 
> We spoke about an l10n API in the platform in dev-webapi a while back. This
> would likely require some changes to the parser but would be much more
> efficient since it will be native. Does it sound good?

I don't think we have the time for a change like that for v1, but that could be interesting to pursue in the future. I bet gandalf would have ideas about that.
(In reply to Margaret Leibovic [:margaret] from comment #18)
> (In reply to Vivien Nicolas (:vingtetun) from comment #17)
> > (In reply to Axel Hecht [:Pike] from comment #13)
> > > I'd like to get a superreview on the approach of doing pre-processed html.
> > > Personally, it rubs me backwards. I'd feel better if we had an exit strategy
> > > at least post-v1.
> > 
> > We spoke about an l10n API in the platform in dev-webapi a while back. This
> > would likely require some changes to the parser but would be much more
> > efficient since it will be native. Does it sound good?
> 
> I don't think we have the time for a change like that for v1, but that could
> be interesting to pursue in the future. I bet gandalf would have ideas about
> that.

Obviously I'm replying to the post-v1 plan :)
Keywords: perf
Whiteboard: [c= p= s= u=]
Priority: -- → P3
Whiteboard: [c= p= s= u=] → [c=progress p= s= u=]
FWIW, the l10n drivers team has been working on a complete refactor of l10n.js in bug 914414.  The goal is to make l10n.js more modular, extensible, scalable, secure and faster.
Depends on: 914414
Component: Gaia → Gaia::L10n
Priority: P3 → --
Hi Zibi,

Is there a reason why you cleared this bug's priority field?

We (FxOS Perf) set Priorities (P1-P5) on FxOS perf issues so we can more easily plan what we address in upcoming sprints. We track these via our backlog[1].

Mike

[1] https://scrumbu.gs/p/fxos-perf/
Flags: needinfo?(gandalf)
OS: Mac OS X → Gonk (Firefox OS)
Hardware: x86 → ARM
(In reply to Mike Lee [:mlee] from comment #21)
> Hi Zibi,
> 
> Is there a reason why you cleared this bug's priority field?
> 
> We (FxOS Perf) set Priorities (P1-P5) on FxOS perf issues so we can more
> easily plan what we address in upcoming sprints. We track these via our
> backlog[1].

We were doing a triage of our component (Gaia:L10n) and did not, at the moment, see much value in this bug. It has two unsolved dependencies: one with zero impact on bootstrap (the XHR one) and one with a very experimental approach that we don't see too much value in (localizing app:// files), because the performance cost we'd save is marginally low (below 30ms on Tarako).

Historically we kept triaging bugs in our component according to our priorities. If you think that this bug should have a high priority set and still remain in Gaia:L10n component, I'm totally ok with that.
Flags: needinfo?(gandalf)
Based on comment 22, going ahead and putting P3 back on since this showed up in our prioritization triage again. If this is causing an issue with your tracking, let us know.
Priority: -- → P3
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX