Last Comment Bug 903553 - JSON'ing objects for event handling is not efficient
: JSON'ing objects for event handling is not efficient
Status: NEW
:
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: unspecified
: All Android
: -- normal (vote)
: ---
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
: 823586 (view as bug list)
Depends on:
Blocks: 807322
  Show dependency treegraph
 
Reported: 2013-08-09 12:05 PDT by Nathan Froyd [:froydnj]
Modified: 2014-03-14 12:04 PDT (History)
17 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Chris's startup profile without restoring tabs (5.15 MB, text/plain)
2013-08-09 12:07 PDT, Nathan Froyd [:froydnj]
no flags Details
Chris's startup profile with restoring tabs (6.25 MB, text/plain)
2013-08-09 12:40 PDT, Nathan Froyd [:froydnj]
no flags Details
Try using a different JSON library (16.67 KB, patch)
2013-08-11 10:00 PDT, Kartikaya Gupta (email:kats@mozilla.com)
no flags Details | Diff | Splinter Review
start/stop method tracing in fennec (3.21 KB, patch)
2013-08-12 12:21 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
Chris' profile proliferation patch. (6.35 KB, patch)
2013-08-12 13:50 PDT, Chris Kitching [:ckitching]
no flags Details | Diff | Splinter Review
Decode JSON in C++. [Slightly borked] (28.90 KB, patch)
2013-08-23 01:52 PDT, Chris Kitching [:ckitching]
no flags Details | Diff | Splinter Review
Proof of concept, WIP: ByteArray communication (24.23 KB, patch)
2013-08-26 17:07 PDT, Shilpan Bhagat
no flags Details | Diff | Splinter Review

Description Nathan Froyd [:froydnj] 2013-08-09 12:05:37 PDT
Chris has done some nice profiling work showing that JSON'ifying things to pass them back and forth between Java / C++ / JS can be a significant portion of startup time.  The problematic areas fall into a couple distinct places:

* prefs
* session restore
* tabs
* telemetry histograms (!)

At least the first and the last should be addressable without too much trouble.
Comment 1 Nathan Froyd [:froydnj] 2013-08-09 12:07:58 PDT
Created attachment 788298 [details]
Chris's startup profile without restoring tabs

Load this up in $SDK/tools/traceview.
Comment 2 Wesley Johnston (:wesj) 2013-08-09 12:16:09 PDT
I want to be careful here. Using JSON gives us tons of flexibility. For things that aren't performance critical, I'd happily take the tradeoff (or a faster JSON parser).
Comment 3 Nathan Froyd [:froydnj] 2013-08-09 12:33:18 PDT
(In reply to Wesley Johnston (:wesj) from comment #2)
> I want to be careful here. Using JSON gives us tons of flexibility. For
> things that aren't performance critical, I'd happily take the tradeoff (or a
> faster JSON parser).

Can you elaborate on the flexibility we gain here?  I mean, I understand that creating broadcast events with arbitrary JSON'ified data is useful and easy to do.  But e.g. for the preferences bits:

http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/PrefsHelper.java#43

it seems like it'd be easy enough to wire up a separate GET_PREFERENCES event or similar that would be more efficient at transferring data.  Similarly for the Telemetry one; there's no need to bounce a telemetry histogram request from Java to C++ to JS.  Just have the C++ code take care of things.
Comment 4 Nathan Froyd [:froydnj] 2013-08-09 12:40:41 PDT
Created attachment 788314 [details]
Chris's startup profile with restoring tabs

Load this up in $SDK/tools/traceview.
Comment 5 Nathan Froyd [:froydnj] 2013-08-09 12:47:38 PDT
ni?'ing for a response to comment 3.
Comment 6 Wesley Johnston (:wesj) 2013-08-09 12:50:37 PDT
(In reply to Nathan Froyd (:froydnj) from comment #3)
> (In reply to Wesley Johnston (:wesj) from comment #2)
> > I want to be careful here. Using JSON gives us tons of flexibility. For
> > things that aren't performance critical, I'd happily take the tradeoff (or a
> > faster JSON parser).
> 
> Can you elaborate on the flexibility we gain here?  I mean, I understand
> that creating broadcast events with arbitrary JSON'ified data is useful and
> easy to do.  But e.g. for the preferences bits:
> 
> http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/
> PrefsHelper.java#43

Prefs is a fine example I guess. We don't always transfer straight up preferences with it. In places we've special cased certain "prefs" to query different data sources, yet make it Java unaware what's going on:

http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/browser.js#1025

We can do the same thing in C (heck we can do everything in C++), but its trivial to do here. We add and change "prefs" frequently and easily, and for the most part, prefs don't really change and aren't read that often so the flexibility is worth it. We could pretty easily move them out of the startup path by caching their values in the Native frontend if we need to.
Comment 7 Chris Kitching [:ckitching] 2013-08-09 13:08:46 PDT
While it'd be nice to start using a non-JSON serialisation format for transferring information between JS and Java, JS is essentially incapable of serialising to any other format with sane performance. This seems to be because JS' JSON serialisation is implemented in native code (Presumably some part of the runtime), and any attempt to write a serialiser in JS is many times slower than this.
But the problem is JSON is a really lousy serialisation format for objects. It's expensive to both create and parse - it is human readable, but it's not being read by humans...

It seems that switching to an alternative format is only really possible if we have a way of invoking native code with a pointer to a JS object in memory that we can then serialise however we like. This seems like it'd involve patching the JS runtime. This seems unlikely to be sane.

The majority of the cost of our current implementation is that we're doing JSON  encoding/decoding in Java. Java is really bad at this (Substantially slower than the native code in the JS runtime which does this, for example) - I strongly suspect we will get much better performance if we decode the JSON in some C code and then use JNI calls to build a deserialised object in Java. This approach would likely get us a large speedup with minimal work. It'd certainly be a worthwhile first step.

Another thing worth looking into would be trying to stop using JSON in the cases where it's truly pathologically awful to do so. For example, in several places we send image data between JS and Java as a base64 encoded string inside a JSON object. This is terrible. In such cases it'd be far better to use the JNI to call Java with the actual bytes.

As well as, of course, what's being discussed above along the lines of "Send fewer messages". Sending messages is never going to be cheap, but I suspect we can make it less extortionate.


Unless there's someone out there who knows more about this than I who can suggest nice ways of getting JS objects encoded/decoded in more Java-friendly ways?
Comment 8 Nathan Froyd [:froydnj] 2013-08-09 13:14:33 PDT
(In reply to Wesley Johnston (:wesj) from comment #6)
> Prefs is a fine example I guess. We don't always transfer straight up
> preferences with it. In places we've special cased certain "prefs" to query
> different data sources, yet make it Java unaware what's going on:
> 
> http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/
> browser.js#1025
>
> We can do the same thing in C (heck we can do everything in C++), but its
> trivial to do here.

This sounds like you agree with me now.  I am confused. :)

I was mostly just thinking we could do a better job with the data transfer, e.g. actually sending a Java array of strings to C++, converting those to nsString, and then passing an array to Java.  Though I'm not sure that would necessarily be faster...
Comment 9 Wesley Johnston (:wesj) 2013-08-09 13:48:06 PDT
(In reply to Chris Kitching [:ckitching] from comment #7)
I really didn't want to get into this argument. Just to say "You could do this faster, so you should do it faster" is a false argument that I strongly disagree with. There are plenty of good reasons to use something that's not the fastest in places where performance isn't the end all be all of everything. I'm fine with optimizing, but I don't want to see us making development hard just for sake of speeding up things that were already more than fast enough to begin with.

> it is human readable, but it's not being read by humans...

I print out and read the objects we're transferring frequently when I'm debugging.

> The majority of the cost of our current implementation is that we're doing
> JSON  encoding/decoding in Java. Java is really bad at this (Substantially

Why is bringing in a better JSON library not anywhere in your list of options here?

> Another thing worth looking into would be trying to stop using JSON in the
> cases where it's truly pathologically awful to do so. For example, in
> several places we send image data between JS and Java as a base64 encoded
> string inside a JSON object. This is terrible. In such cases it'd be far
> better to use the JNI to call Java with the actual bytes.

I'd be happy if we had better ways to transfer images between the two.

> Unless there's someone out there who knows more about this than I who can
> suggest nice ways of getting JS objects encoded/decoded in more
> Java-friendly ways?

You might try pinging azakai on this stuff. He's got a lot of experience dealing with storing js objects in things like TypedArrays or blobs:

https://developer.mozilla.org/en-US/docs/Web/API/Blob?redirectlocale=en-US&redirectslug=DOM%2FBlob
Comment 10 Mark Finkle (:mfinkle) (use needinfo?) 2013-08-09 19:32:06 PDT
I agree with the example of the Telemetry messages: If we can move to a Java -> C++ route for that specific situation, it would probably work out for the best. Let's pick and choose those wisely.

I also agree with the situation where Java <-> JS communication is currently simple and easy with JSON. If we have an example of something in a tight loop that could be ported to some other transport, we should file a bug on it. But for the basic glue between the two systems, JSON is our best bet currently.

I have talked in the past about using JNI/jsctypes for JS -> Java communication. I am still open to that, but we need to verify that JNI/jsctypes is actually faster.
Comment 11 Chris Kitching [:ckitching] 2013-08-11 02:14:23 PDT
(In reply to Mark Finkle (:mfinkle) from comment #10)
> I also agree with the situation where Java <-> JS communication is currently
> simple and easy with JSON. If we have an example of something in a tight
> loop that could be ported to some other transport, we should file a bug on
> it. But for the basic glue between the two systems, JSON is our best bet
> currently.

Yes, this is more or less the conclusion I came to, as well. While JSON is a truly lousy intermediate representation, switching to something better would be extremely difficult. What we have isn't slow enough to justify all the effort and inconvenience that would ensue.

> I have talked in the past about using JNI/jsctypes for JS -> Java
> communication. I am still open to that, but we need to verify that
> JNI/jsctypes is actually faster.

The problem with using jsctypes with JNI is that we lose the ability we currently have to use arbitrarily-structured JSON objects - we'd have to use an IDL to specify the message formats we'd be using (Due to the way jsctypes works - it maps JSON structures to C structs. You need to specify the structs.) We could do it, it would *probably* be faster (But maybe not, because you're suddenly doing extra processing in JS, which is very slow), but it's annoying to both implement and use.

So, how to make JSON faster?. 

Using a faster Java JSON library has been investigated in the past - Jackson's the only one that's much faster, and it's too large.

JSON is not bad enough to justify things being as slow as they are. Why is JS so good at encoding/decoding these JSON strings, and why is Java so awful at it?

Java, for a variety of reasons, is awful at the sort of processing needed to do JSON encoding/decoding. I suggest we just stop asking Java to do this entirely.
JS is fast at this, largely because it cheats and does the encoding/decoding in native code, somewhere inside the runtime. For this reason, implementing any other object serialisation format for JS that performs comparably well requires a similar implementation. So far as I can tell, this requires patching the JS runtime. Let's not do that.

But we can similarly cheat for Java - We could decode the JSON objects in C++, in the Android Bridge, prior to sending them to the VCM. We could then use the JNI to move the already-decoded object into Java (In the form of a hashmap-like object, semantically similar to the existing JSONObject).
Going the other way would be similar - dump the object into C++ using the JNI and let C do the actual encoding.

That way, we can get Java to have comparable JSON performance to JS. This approach would be a drop-in replacement for JSONObject on the Java side, and would require no changes at all to how we use message passing.
There already exist JSON libraries for C++, so the work involved in this approach is also not that difficult - take such a libary, write a little JNI code to glue it all together, and you're done. I think this is certainly worth trying - we can always try one of the more exotic approaches if this isn't fast enough.

> I agree with the example of the Telemetry messages: If we can move to a Java
> -> C++ route for that specific situation, it would probably work out for the
> best. Let's pick and choose those wisely.

This is a good idea. In general, we should reduce the amount of inter-language communication we have to do. However we solve this communication problem it will be quite expensive, so let's do as little of it as we can get away with (Sanely).




(In reply to Wesley Johnston (:wesj) from comment #9)
> (In reply to Chris Kitching [:ckitching] from comment #7)
> I really didn't want to get into this argument. Just to say "You could do
> this faster, so you should do it faster" is a false argument that I strongly
> disagree with.

This isn't at all what I was trying to say with my earlier posts. 
I agree with you, this is not a valid argument. We need to focus our efforts where they make a difference.

"Premature optimisation is the root of all evil" - Knuth. (A favourite quote. :P)

> There are plenty of good reasons to use something that's not
> the fastest in places where performance isn't the end all be all of
> everything. I'm fine with optimizing, but I don't want to see us making
> development hard just for sake of speeding up things that were already more
> than fast enough to begin with.

Also very true - but the IPC system that is used to make the two halves of our app talk to each other is somewhat important for performance (But not so much that it justifies compromising programmer convenience a large amount).
It is far from "more than fast enough", but it's also far from slow enough to warrant switching to a different format than JSON (Simply because getting JS to serialise to anything other than JSON in a way that performs nonterrible turns out to be essentially impossible). I outline above an approach that ought to give us most of the performance gain of the more exotic solutions, while allowing us to entirely keep our JSON and all the convenience it provies.


> > it is human readable, but it's not being read by humans...
> 
> I print out and read the objects we're transferring frequently when I'm
> debugging.

This wasn't a well thought out statement on my part. Let me try again...
The format used for the objects while they are in transit does not need to be human readable. That is not ever being read by humans. We read the objects we're sending for debugging purposes, but we do this either before or after it has been sent. We don't care about what happens in between provided it gets from A to B as fast as possible.
It just so happens to be that, for JSON objects, the serialisation format you use to print a representation of the object for debugging purposes is the same one we use for transmission. If we were instead using some opaque intermediate representation, your debugging can proceed equally well by you printing the JSON (Or whatever you like) representation of the object immediately before passing it to the magical encoder which turns it into the opaque format and sends it. Assuming there's no encoder/decoder bug, this change wouldn't affect your ability to debug.
In any case, we're probably not going to do that. Just felt I should clarify the apparently really ignorant statement I made earlier. :P


> Why is bringing in a better JSON library not anywhere in your list of
> options here?

Using a faster Java JSON library has been investigated in the past - Jackson's the only one that's much faster, and it's too large (Adds about 1MB to apk size). It does have many features we don't use. Proguard would actually probably end up deleting a large chunk of it, and we could manually extricate the parts we want. I'm reluctant to do this at first because I estimate that my proposed decode/encode-in-C++ approach ought to be around 5-6 times faster (Based on the usual speedup gained from switching to native code, then applying a very pessimistic view of how much the extra JNI calls will cost us) than our current approach. Jackson is measured as being twice as fast as our current approach.

> I'd be happy if we had better ways to transfer images between the two.

Likewise. How attached are you to base64 in JSON objects? This may be a naive question - why can't you just dump the actual image bytes into the JSON string? It wouldn't parse sanely using any character set.. But we don't care, do we? It would allow us, at the receiving end, to just go "Those bytes? They're a PNG. Go decode it." instead of having to run through it and decode the base64.

> You might try pinging azakai on this stuff. He's got a lot of experience
> dealing with storing js objects in things like TypedArrays or blobs:

This is perhaps worth a look. I did actually make a start implementing an encoder in JS which represents our messages as Java-friendly byte sequences in a TypedArray. I gave up on it when it turned out to be about 5x slower than JSON.stringify (Presumably because JSON.stringify is in native code and my thing wasn't.)
This is largely why we're stuck with either keeping JSON or patching the JS runtime to add some comparably fast encoder for a different format (Something which would be a huge job and probably not worth the effort).

(In reply to Nathan Froyd (:froydnj) from comment #8)
> I was mostly just thinking we could do a better job with the data transfer,
> e.g. actually sending a Java array of strings to C++, converting those to
> nsString, and then passing an array to Java.  Though I'm not sure that would
> necessarily be faster...
Something of this flavour would very likely be considerably faster than what we currently have, because of eliminating the need to decode a JSON string in the JVM entirely. I imagine the improvement of this would be slightly greater than that of my proposed solution in this post, but it would require the extra work of changing how we send messages.
There are also most likely overheads involved in calling the JNI from JS (XPCOM is the system used for this, right? Doesn't this serialise/deserialise the arguments when you make a call?) which may make it end up being slower.
So more unknown overheads, more fiddly, and maybe faster. I'm not sure how the cost of getting calls into native from from JS compares to the JNI. (Random note: Ostensibly a JNI function call from Java costs around 10x that of an ordinary Java method call.)
Comment 12 Kartikaya Gupta (email:kats@mozilla.com) 2013-08-11 10:00:54 PDT
Created attachment 788702 [details] [diff] [review]
Try using a different JSON library

It's not that hard to write a faster JSON parsing library in Java. I have one that I wrote a while ago and I shoehorned it into fennec to see how it fares against the built-in JSON parser. Mine was about ~20% faster, with only negligible time spent optimizing it for speed. Mine was originally written for parsing streams of JSON rather than fixed strings, so I'm pretty confident that with some more adjustments I can bring it down to at least 30%-40% faster, and also modify the interface to be more of a drop-in replacement to the existing code. I'm attaching the code as I have it now - feel free to steal the patch and hack on it if you feel it's worthwhile, or let me know and I can do it.

When I ran the attached patch on my N4 for startup + one page load, the times reported were

D/staktrace( 3392): Current times at 7156, 5762 : YAY!

which is 19.4% faster.
Comment 13 (dormant account) 2013-08-11 13:12:08 PDT
re java being slow at json: jackson is one of the fastest json libs around. It's only slower than the fastest c++ libs. 

We can also pass java objects to js using lazy serialization(eg they would look like JS objects, this is what I did in Treehydra to pass GIMPLE into JS). It would not be too much work to code that up: http://hg.mozilla.org/rewriting-and-analysis/dehydra/file/9a7704207d98/treehydra.c#l107 


You can do something similar on the JS->Java side, it shouldn't be uglier than using JSON in Java.
Comment 14 (dormant account) 2013-08-11 13:14:28 PDT
To be clear, the speedup in my proposal would mainly come from having to partial serialization/deserializiation to only transfer bits that matter across. There likely wont be a win for something that needs every field member.
Comment 15 Chris Kitching [:ckitching] 2013-08-11 18:30:36 PDT
(In reply to Taras Glek (:taras) from comment #14)
> To be clear, the speedup in my proposal would mainly come from having to
> partial serialization/deserializiation to only transfer bits that matter
> across. There likely wont be a win for something that needs every field
> member.

That's a really cool idea, but I think virtually every message we send between JS and Java uses every field - otherwise we could simply omit the field. I may be incorrect - if we are sending fields we don't need then we need to stop doing this.

(In reply to Taras Glek (:taras) from comment #13)
> re java being slow at json: jackson is one of the fastest json libs around.
> It's only slower than the fastest c++ libs. 

This is both correct and incorrect ;).
If you sit down and write a program that has Jackson do a bunch of JSON operations, and has some C++ code do a bunch of JSON operations, measures the times, and prints the results, then your results will support what you said.

The problem is that you've cheated - in your benchmarking program the Jackson code paths are being hammered, so even the stupidest JIT will realise that they need attention. After a few slow iteratoins, the VM catches on, Jackson gets JIT'd, and we're suddenly benchmarking native code against slightly different native code. It's no surprise that the results of benchmarking JIT'd Jackson against proper C++ JSON code tells us that Jackson is just a teeny bit slower.
But if you were to run such a benchmark with the JIT disabled, you'd very likely find that the C++ wins by a sizeable amount since native code has a pretty substantial performance advantage over interpreted bytecode (After all, that's why JITs exist).

Worse still, if you then made the not unreasonable decision to use Jackson after seeing such a benchmark you'd find it would perform much worse in Fennec than it did in your test, since in Fennec it's not being hammered nearly enough to justify being JIT'd as readily as it was in your benchmark.
And then you consider that Android's JIT is nowhere near as good as the one on the Hotspot VM, and is generally quite reluctant to compile things that aren't being hit extremely hard.
And that some versions of Android don't even HAVE a JIT...

And now we see why benchmarking Java stuff is such a pain. If you try and benchmark something in isolation in Java then the JIT jumps in and makes it perform really fast, even though when you profiled it with the rest of your program it was highly suckful.

I would be extremely surprised if a benchmark of Jackson with JIT disabled is anywhere near as good as a half-decent C++ JSON library. This is why I'm not in favour of just using Jackson and letting Proguard strip out all the stuff we don't need - Java benchmarks are almost always instances of implicit benchmarksmanship, and people rarely notice.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12)
> It's not that hard to write a faster JSON parsing library in Java. I have
> one that I wrote a while ago and I shoehorned it into fennec to see how it
> fares against the built-in JSON parser. Mine was about ~20% faster, with
> only negligible time spent optimizing it for speed. Mine was originally
> written for parsing streams of JSON rather than fixed strings, so I'm pretty
> confident that with some more adjustments I can bring it down to at least
> 30%-40% faster, and also modify the interface to be more of a drop-in
> replacement to the existing code. I'm attaching the code as I have it now -
> feel free to steal the patch and hack on it if you feel it's worthwhile, or
> let me know and I can do it.
> 
> When I ran the attached patch on my N4 for startup + one page load, the
> times reported were
> 
> D/staktrace( 3392): Current times at 7156, 5762 : YAY!
> 
> which is 19.4% faster.

Excellent! This is most encouraging. I'm not convinced such a system can ever be mashed into outperforming native encode/decode, but this may not be impossible if the JNI overheads dominate.
It seems that both approaches are worth experimentation and comparism.

For reference - Jackson is approximately 50% faster than the builtin one. Arguably if we're going to to with the faster in-VM encode/decode approach then our best bet is to use Jackson and Proguard away the extra weight. (We'd need to mark Bug 709230 as a dep of this one)

Now if I can just get the build system to stop shouting at me for having the audacity to try and add two new C++ files I can get some meaningful numbers about this native encode/decode approach...

I also need to stop writing >1k word Bugzilla comments...
Comment 16 Nathan Froyd [:froydnj] 2013-08-12 10:21:11 PDT
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12)
> Created attachment 788702 [details] [diff] [review]
> Try using a different JSON library
> 
> It's not that hard to write a faster JSON parsing library in Java. I have
> one that I wrote a while ago and I shoehorned it into fennec to see how it
> fares against the built-in JSON parser. Mine was about ~20% faster, with
> only negligible time spent optimizing it for speed. Mine was originally
> written for parsing streams of JSON rather than fixed strings, so I'm pretty
> confident that with some more adjustments I can bring it down to at least
> 30%-40% faster, and also modify the interface to be more of a drop-in
> replacement to the existing code. I'm attaching the code as I have it now -
> feel free to steal the patch and hack on it if you feel it's worthwhile, or
> let me know and I can do it.
> 
> When I ran the attached patch on my N4 for startup + one page load, the
> times reported were
> 
> D/staktrace( 3392): Current times at 7156, 5762 : YAY!
> 
> which is 19.4% faster.

That's a ridiculous speedup.  Are you sure that comes from just dropping in your code?  'Cause AFAICS from Chris's profiles, the time is mostly spent in JSON writing, not JSON reading.  And if I grok your patch correctly, your patch only affects reading, right?
Comment 17 Kartikaya Gupta (email:kats@mozilla.com) 2013-08-12 10:30:29 PDT
Yeah, my patch only affects reading. But I'm also only timing the JSON parsing code. This is a micro-benchmark, I don't know how it impacts end-to-end page load time.
Comment 18 Nathan Froyd [:froydnj] 2013-08-12 12:21:28 PDT
Created attachment 789079 [details] [diff] [review]
start/stop method tracing in fennec

I've been using this patch, both with Eideticker and simple startup/shutdown
with my own Fennec profile to try and see what impact JSON parsing has.  Even
with a decent number (10+) of tabs, I'm not seeing Java JSON serialization take
more than about 1.5%, if that, of startup.
Comment 19 Nathan Froyd [:froydnj] 2013-08-12 12:22:39 PDT
(In reply to Nathan Froyd (:froydnj) from comment #18)
> Created attachment 789079 [details] [diff] [review]
> start/stop method tracing in fennec
> 
> I've been using this patch, both with Eideticker and simple startup/shutdown
> with my own Fennec profile to try and see what impact JSON parsing has.  Even
> with a decent number (10+) of tabs, I'm not seeing Java JSON serialization
> take
> more than about 1.5%, if that, of startup.

Worth noting that this is without Kats's patch.  Chris, does my patch match up with where you were profiling?
Comment 20 Chris Kitching [:ckitching] 2013-08-12 13:43:02 PDT
(In reply to Nathan Froyd (:froydnj) from comment #19)
> Worth noting that this is without Kats's patch.  Chris, does my patch match
> up with where you were profiling?

No - I started method tracing in a static initialiser and ended method tracing as the final line in org.mozilla.gecko.GeckoApp.initialize.
Also, if restore tabs is disabled a large amount of the problem goes away on startup.

The JSON reading stuff in the profile shows up under JSONObject<init>(Ljava/lang/String)V - the constructor of JSONObject that takes a JSON string and spits out a decoded object.
Still, we're spending much less time there than we are stringifying JSONs in Java.

I'm similarly suspicious of the reported speedup from kats' patch. Those numbers you're quoting - are those from a Talos test? My past attempts to use Talos to measure speedup of things have been entirely unsuccessful, since the jitter on those tests is too large to make them useful.

I also wrote a patch which does decoding of JSON in C. It does seem to greatly accelerate that part of the process. Working on adding native encode today (As well as fixing it:)
https://tbpl.mozilla.org/?tree=Try&rev=a954678b8fac

Quite a long way to go on that one, it seems. :P
Comment 21 Chris Kitching [:ckitching] 2013-08-12 13:50:46 PDT
Created attachment 789158 [details] [diff] [review]
Chris' profile proliferation patch.

In fact, here's the dirty, dirty hack I use to get my profiles in general. This patch adds the startup profiling lines, as well as the extra robocop job I use to collect page load profiles.
Comment 22 Nathan Froyd [:froydnj] 2013-08-12 13:57:58 PDT
Ah, I must have been missing all the action that happens in initialize().  Just captured a profile where we took 15% of the time in JSONObject.toString(). :(
Comment 23 Kartikaya Gupta (email:kats@mozilla.com) 2013-08-12 14:24:15 PDT
(In reply to Chris Kitching [:ckitching] from comment #20)
> I'm similarly suspicious of the reported speedup from kats' patch. Those
> numbers you're quoting - are those from a Talos test? My past attempts to
> use Talos to measure speedup of things have been entirely unsuccessful,
> since the jitter on those tests is too large to make them useful.

If that question is directed at me, then no, my numbers are not from a Talos test. They are from the timing code that I included in my patch.
Comment 24 Nathan Froyd [:froydnj] 2013-08-14 09:41:23 PDT
I took most of yesterday and tried moving Tab:Load to a native Gecko event.  This process was tedious and (at least according to traces; I probably should try measuring with telemetry) didn't produce much of a win, even if there's less JSON dancing involved.

Profiling said that the real problem is doing Session:Restore, where we're reading in the sessionstore data, parsing and re-serializing *that*, and then sending it through Gecko.  Converting that would be even more tedious than converting Tab:Load, so I didn't try.

Since we probably can't avoid fiddling with JSON in session restore, I tried moving the parsing of it earlier and onto a background thread.  You can follow along in bug 905223 if you like.  The speedup is satisfying.

I still think it might be worth moving the telemetry bits away from JSON and coming up with a better scheme for prefs...but dealing with session restore is the problem WRT JSON and I don't think there's any way to make it much better...
Comment 25 David Teller [:Yoric] (please use "needinfo") 2013-08-14 09:57:11 PDT
(In reply to Nathan Froyd (:froydnj) from comment #24)
> Profiling said that the real problem is doing Session:Restore, where we're
> reading in the sessionstore data, parsing and re-serializing *that*, and
> then sending it through Gecko.  Converting that would be even more tedious
> than converting Tab:Load, so I didn't try.

Why do we even need to serialize the sessionstore data?
Comment 26 Nathan Froyd [:froydnj] 2013-08-14 10:01:11 PDT
(In reply to David Rajchenbach Teller [:Yoric] from comment #25)
> (In reply to Nathan Froyd (:froydnj) from comment #24)
> > Profiling said that the real problem is doing Session:Restore, where we're
> > reading in the sessionstore data, parsing and re-serializing *that*, and
> > then sending it through Gecko.  Converting that would be even more tedious
> > than converting Tab:Load, so I didn't try.
> 
> Why do we even need to serialize the sessionstore data?

Fennec parses the sessionstore data to stub the restored tabs ASAP, so we have UI.  It modifies the tab id for all the newly stubbed tabs, so we need to reserialize the data so that the JS side is aware of it...at least as far as I understand it.
Comment 27 Mark Finkle (:mfinkle) (use needinfo?) 2013-08-14 11:00:21 PDT
(In reply to Nathan Froyd (:froydnj) from comment #26)
> (In reply to David Rajchenbach Teller [:Yoric] from comment #25)
> > (In reply to Nathan Froyd (:froydnj) from comment #24)
> > > Profiling said that the real problem is doing Session:Restore, where we're
> > > reading in the sessionstore data, parsing and re-serializing *that*, and
> > > then sending it through Gecko.  Converting that would be even more tedious
> > > than converting Tab:Load, so I didn't try.
> > 
> > Why do we even need to serialize the sessionstore data?
> 
> Fennec parses the sessionstore data to stub the restored tabs ASAP, so we
> have UI.  It modifies the tab id for all the newly stubbed tabs, so we need
> to reserialize the data so that the JS side is aware of it...at least as far
> as I understand it.

That is a good summary
Comment 28 Chris Kitching [:ckitching] 2013-08-23 01:52:07 PDT
Created attachment 794557 [details] [diff] [review]
Decode JSON in C++. [Slightly borked]

For the sake of keeping the bug in sync with what I'm doing, here's my WIP patch for decoding JSON in C++. This does seem to be somewhat faster in this direction than the old way.
My implementation is far from optimal. It also segfaults on try. Lots:
https://tbpl.mozilla.org/?tree=Try&rev=92fa77c7d203

But locally seems to work pretty well :P

Pretty sure the problem relates to funkyness with JNI local variable frames (The distinctive 0xdeadd00d segfault lets us know this is the JNI telling us we're being stupid rather than a "proper" segfault).
If someone else wants to fix this, go nuts. I'll be back to this in due course.

It is also true that we actually have somewhat more to gain by doing a similar thing the other way (Essentially the opposite of what this code does)
It seems that using the JSON C++ library in my patch one could do so without vast amounts of effort. Have a method in C which takes a JSONObject and does essentially the opposite of what my patch does.
Preferably minus the segfaults.

Some additional logic could be added to deal with binary data and make the sending of favicons through the bridge suck less.
Comment 29 Shilpan Bhagat 2013-08-26 17:07:54 PDT
Created attachment 795749 [details] [diff] [review]
Proof of concept, WIP: ByteArray communication

A very rough patch just to show if we want to move away from JSON to something more faster. Using Uint8Array in js we can decode/encode in ByteArrays with headers much faster then JSON stringify/parse. Here is a small jsperf I made: http://jsperf.com/uint8array-encoding-vs-json-encoding

It's pretty obvious sending base 64 encoded images over this method would be pretty expensive but for our current use, it should be much faster then our current implementation.

This proof of concept is not meant to be a replacement for the structured communication we currently have thanks to JSON but it serves as another way of communication where structure is not a big issue, which is most of the communication we do currently. 

The idea is simple, your data is an array of strings which is recieved as an array of string on the other side to java. The in between will be a native method taking care of the conversions using the necessary headers to make parsing quick.

Again, the patch is meant to be a proof of concept, so you will see a lot of profiling code which I have kept intentionally incase you wish to measure it's efficiency yourself. 

Although, the profiling code doesn't do it justice since the JIT helps the JSON encoding when it's done 10,000 times while native code doesn't have that benefit. So expect the difference to be a lot more then what you see in the profile if used normally.
Comment 30 Mark Finkle (:mfinkle) (use needinfo?) 2013-08-26 19:58:07 PDT
(In reply to Shilpan Bhagat from comment #29)
> Created attachment 795749 [details] [diff] [review]
> Proof of concept, WIP: ByteArray communication

A few quick thoughts:
* We could probably use JSON.stringify(arrayOfStrings) and be a little faster than stringifying an object, if we wanted to drop the object's name/value pairs.
* In places where we *are* sending images as data: URI strings, we could immediately switch to ByteArrays and save space, as well as encoding/decoding time on the JS and Java sides. Please file a bug on this one.
Comment 31 Richard Newman [:rnewman] 2013-09-25 20:16:58 PDT
> > Fennec parses the sessionstore data to stub the restored tabs ASAP, so we
> > have UI.  It modifies the tab id for all the newly stubbed tabs, so we need
> > to reserialize the data so that the JS side is aware of it...at least as far
> > as I understand it.

To step back from this bug a bit: all of our expensive JSON handling would be much less significant if (a) we didn't send favicons as base64 in JSON, and (b) our startup path didn't involve parsing, manipulating, and serializing a pretty hefty chunk of JSON.

That is: if parsing and serializing JSON is a big part of our profiles, we're sending too much JSON.

Can we write the stubs that Fennec needs into a separate file, so it doesn't need to do the read-lots-transform-send-piss-around mess? And find an alternative to modifying tab IDs? Ideally one that allows Gecko and Java to just read independently, rather than having to chat verbosely in the common case?

More work in session store, yes, but it's not work that we need to do on startup.

This is not to say that making JSON parsing and serializing quicker isn't a fine goal, but if we can do better by eliminating work altogether, I'm all in favor of exploring that path.
Comment 32 Nathan Froyd [:froydnj] 2013-09-26 10:43:18 PDT
(In reply to Richard Newman [:rnewman] from comment #31)
> To step back from this bug a bit: all of our expensive JSON handling would
> be much less significant if (a) we didn't send favicons as base64 in JSON,

Where is the favicons sending?  I've seen this tossed around, but haven't stumbled upon it in my brief sojournings in the Android code.
Comment 33 Richard Newman [:rnewman] 2013-09-26 10:58:34 PDT
Chris might have evidence to hand.
Comment 34 Chris Kitching [:ckitching] 2013-09-26 13:51:31 PDT
(In reply to Nathan Froyd (:froydnj) from comment #32)
> Where is the favicons sending?  I've seen this tossed around, but haven't
> stumbled upon it in my brief sojournings in the Android code.

Here:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/browser.js#6512

That "iconURI" is a data URI - the base64-encoded favicon for the search engine, which is then decoded in Java.
Comment 35 Mark Finkle (:mfinkle) (use needinfo?) 2013-09-26 19:55:39 PDT
(In reply to Chris Kitching [:ckitching] from comment #34)
> (In reply to Nathan Froyd (:froydnj) from comment #32)
> > Where is the favicons sending?  I've seen this tossed around, but haven't
> > stumbled upon it in my brief sojournings in the Android code.
> 
> Here:
> http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/
> browser.js#6512
> 
> That "iconURI" is a data URI - the base64-encoded favicon for the search
> engine, which is then decoded in Java.

Yes, but sadly, this data only exists as a data: URI, saved in the search engine XML files. We could convert it to a ByteArray in JS and then send the array to Java via JSON, but I'm not sure that helps much.

It would be nice if the icon was a PNG to start.
Comment 36 Richard Newman [:rnewman] 2013-09-26 20:41:05 PDT
(In reply to Mark Finkle (:mfinkle) from comment #35)

To present a different perspective: it would be nice if we didn't depend on Gecko providing this data to the JS UI.

I'm a big fan of pre-processing where it makes sense. Can Gecko maintain a disk cache for search engine icons that Java can read from? Can we read and process the XML files directly?
Comment 37 Jim Chen [:jchen] [:darchons] 2014-03-14 10:25:32 PDT
*** Bug 823586 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.