Closed Bug 589905 Opened 10 years ago Closed 10 years ago

Investigate remote-prefs performance

Categories

(Core :: Preferences: Backend, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
fennec 2.0b1+ ---

People

(Reporter: cjones, Assigned: dougt)

References

Details

Attachments

(3 files, 1 obsolete file)

When looking at IPC logs a while back, I noticed that we fetch quite a few prefs during startup+creating <browser>+loading page, something like 900 IIRC.  I added some dumb instrumentation to the prefs code (attached patch), and ran a little test that loaded the browser, navigated to google.com, then shut down the browser.  The entire test took 6-7secs, and the instrumentation said we spent about 1sec in prefs code.  This setup is flawed for a lot of reasons, but 1sec seems high enough to investigate further.

I also instrumented to see how many requests were unique to see if caching would be worthwhile.  I think the instrumentation wasn't quite right, but out of ~1200 requests during start-load-shutdown, ~600 were unique, so caching won't much help startup perf.

Since we seem to have a total of ~1000 prefs, it seems like a good impl would be for the content to load or receive all the pref values at once, then add a listener for a change to any pref (on the assumption that pref changes are relatively rare).  As a stop-gap fix, we could have the content process just read in the pref file directly.  A better impl would be to serialize the DB and send it from chrome->content.  Best would probably be to map the pref file into shmem somehow, and send a ref to shmem to the content process, but bsmedberg tells me that prefs are in a JAR now, so the simple mmap() impl isn't going to cut the mustard.
This is a nice-to-have for fennec 2.0b1, but we need better profiling to know if blocking? and if so, what status, is appropriate.
Assignee: nobody → dwitte
Blocks: 592666
Knowing how much this affects us in the field is definitely a blocker.  If it doesn't affect us, optimizing isn't a blocker.  If it is, then optimizing is a blocker.
tracking-fennec: --- → ?
-> dougt per request.
Assignee: dwitte → doug.turner
Attached patch patch v.1Splinter Review
first pass at remoting all preference in one go.

1) removes existing e10s preferences api

2) adds new sync "readprefs" api that returns all of the preferences as a string.  This string is basically what we would write out to disk if we were flushing preferences.

3) adds a new api to the internal interface of the preferences service called "serializePreferences".  This enumerates the preferences hash table and returns a string (basically what we would write to disk...) of both user and default perfs.

4) added a new save option for pref_savePref to allow us get access to the default preferences (things that wouldn't normally be written to the user preferences file.

It is a pretty small change.. However, it is untested and we need to get some measurements on how it performs.  I am uploading it now to get some early feedback.
Attachment #477415 - Flags: feedback?
Attachment #477415 - Flags: feedback? → feedback?(dwitte)
Attached patch tests patches (obsolete) — Splinter Review
Attachment #477737 - Flags: review?
Attachment #477737 - Flags: review? → review?(dwitte)
Looks like it works as expected.
I guess you need to propagate dynamic pref changes in some manner?
Wait ... if we're going to load the entire pref DB in the content process, why not just re-use all the existing pref listener machinery and drop the code in ContentChild?  We could set up a single listener for the entire content process, that listens to all pref changes and updates the content-process DB.
If we don't do that, then the content-process pref and chrome-process DBs will get out of sync.
we did.  The only code that i left was: the stuff to prevent write access in the child, and the code that allow the child process to get notifications.

on synchronization:

when the child process starts, we hand off a serialization of the db.  from that point forward, any preference changes in the parent are reflected to the child via the remote observer.  The shouldn't be any sync problems.

Maybe I shouldnt have put those mxr links into this bug as it may have caused confusion.  Please check out the first wip patch, and let me know if there is an issue.
OK, cool.  The links threw me off, yeah.
Blocks: 588050
I really don't think serializing all this as a string makes a lot of sense. Can we use IPDL array/union types to do it instead? That should be less expensive in terms of memory and have less chance of serialization/parsing errors.
yeah, not very proud of that.

I so not think there is going to be any serialization/parsing errors -- this is the same format as what is written to the disk.  Do we have serialization/parsing errors now?

I'd hate to write more code to serialize since libpref already does this just fine and has been used for a very long time.

however, we probably can optimize by preallocating a buffer, instead.  I can put up another patch that changes the serialization to something less allocatey... however, lets get dwitte (or someone else) review this patch as-is.
Attachment #477415 - Flags: feedback?(dwitte) → review?(dwitte)
Comment on attachment 477415 [details] [diff] [review]
patch v.1

>diff --git a/modules/libpref/public/nsIPrefService.idl b/modules/libpref/public/nsIPrefService.idl

>+  string serializePreferences();

Make this return an ACString? No need to throw around extra allocations with char*.

>diff --git a/modules/libpref/src/nsPrefBranch.cpp b/modules/libpref/src/nsPrefBranch.cpp

> NS_IMETHODIMP nsPrefBranch::PrefIsLocked(const char *aPrefName, PRBool *_retval)
> {

>+  if (GetContentChild()) {
>+    NS_ERROR("cannot check lock pref from content process");
>+    return NS_ERROR_NOT_AVAILABLE;

Hmm, howcome? Is this going to be a problem?

>+    ContentChild* cpc = ContentChild::GetSingleton();
>+    nsCAutoString prefs;
>+    cpc->SendReadPrefs(&prefs);
>+
>+    PrefParseState ps;
>+    PREF_InitParseState(&ps, PREF_ReaderCallback, NULL);
>+    nsresult rv = PREF_ParseBuf(&ps, prefs.get(), prefs.Length());
>+    PREF_FinalizeParseState(&ps);

OK for now, but file a followup to serialize this directly as a hash? Doesn't need to block beta, but maybe final.

>+NS_IMETHODIMP nsPrefService::SerializePreferences(char** aPrefsOut)
>+{
>+  nsCAutoString prefs;

Yeah, just use an nsACString& arg, and ditch the intermediary.

>+  char** valueArray = (char **)PR_Calloc(sizeof(char *), gHashTable.entryCount);

new char*[gHashTable.entryCount]?

>+  if (!valueArray)
>+    return NS_ERROR_OUT_OF_MEMORY;

Infallible new.

>+  char** walker = valueArray;
>+  for (PRUint32 valueIdx = 0; valueIdx < gHashTable.entryCount; valueIdx++, walker++) {
>+    if (*walker) {

Should probably assert that it's nonnull, and not bother release-build checking.

>+  PR_Free(valueArray);

delete[] valueArray?

> PLDHashOperator
> pref_savePref(PLDHashTable *table, PLDHashEntryHdr *heh, PRUint32 i, void *arg)
> {

>+    if (!pref->key) {
>+        return PL_DHASH_NEXT;
>+    }

No braces, per file style.

I don't understand why this would happen with your patch but not without. Are you just being safe or is there a reason? If no reason, probably shouldn't change it. This is stable code. ;)

r=dwitte with followup bug.
Attachment #477415 - Flags: review?(dwitte) → review+
Comment on attachment 477737 [details] [diff] [review]
tests patches

>diff --git a/gfx/layers/ThebesLayerBuffer.cpp b/gfx/layers/ThebesLayerBuffer.cpp

>+  aTarget->ResetClip();

Probably not, right? :)

>diff --git a/modules/libpref/test/unit_ipc/test_simpleIPC.js b/modules/libpref/test/unit_ipc/test_simpleIPC.js

Need lots more tests here -- observers especially. Can you copy/paste the code from the existing xpcshell test and adapt the relevant bits?

Canceling request pending new patch.
Attachment #477737 - Flags: review?(dwitte)
Any chance this could be done on top of the patch in blocker bug 580790 so I don't have to unbitrot it from ipc changes again?
Feel free
tracking-fennec: ? → 2.0b2+
Whiteboard: [fennec-checkin-postb1]
I'm not sure what [fennec-checkin-postb1] means, but if it means what I think it means, this isn't ready. It needs tests which aren't finished yet.
[fennec-checkin-postb1] doesn't apply to this bug.
Whiteboard: [fennec-checkin-postb1]
(In reply to comment #20)
> [fennec-checkin-postb1] doesn't apply to this bug.

why?
(1) the patch needs to be cleaned up before pushing -- see the comments.
(2) the tests need to be more comprehensive.

I suppose it will be checked in post b1 (just like everything not on the b1 blocker list), but I don't want anyone seeing that flag and pulling the trigger on this before it is complete.
Dan, thanks for the review:

Regarding lock pref - since we aren't setting from the child, i didn't think locking was important.  I took a quick look and didn't see any actual users of it.

The walking code was cribbed from else where in this file, so it might be better to convert both, or leave this one as is.  The point being consistency.

I will file a follow up to address the serialization.  We can do better, but will be shadowed by the gain by simply caching.
(In reply to comment #23)
> Regarding lock pref - since we aren't setting from the child, i didn't think
> locking was important.  I took a quick look and didn't see any actual users of
> it.

Fair enough; as long as it throws I'm fine with that.

> The walking code was cribbed from else where in this file, so it might be
> better to convert both, or leave this one as is.  The point being consistency.

OK, fine as it is.

> I will file a follow up to address the serialization.  We can do better, but
> will be shadowed by the gain by simply caching.

Yeah. Maybe azakai can do a quick test of how long the serialization is taking (just wrap the parent walking code in PR_Now()'s, and the child reading code) as an upper bound on how much better we could make it? If it's significant we can final+ it.
(In reply to comment #24)
> Yeah. Maybe azakai can do a quick test of how long the serialization is taking
> (just wrap the parent walking code in PR_Now()'s, and the child reading code)
> as an upper bound on how much better we could make it? If it's significant we
> can final+ it.

For the entire serialization sync message + deserialization after, looks like 90ms in an optimized build, which is 2.8% of the entire time it takes to load cnn.com.
So that's including the sync message time? What if we take that out? (We're going to be paying that no matter what.)
in fennec, we startup the content process asynchronously when we startup.
The overhead of the sync message itself? That is close to negligible here, a few ms or less.
So the time just for serialization + deserialization is 85 - 90ms? That's significant. If 5% is a blocker, we should probably block final on 2.8% as well. :)
Attached patch test patch v.2Splinter Review
this tests preferences set after the child is created.

dan - many of the tests in test_libPrefs are not designed to work in the child.  For example, all of the set/lock preferences do not work.  This ipc test tests the bare minimum that is required of preferences in fennec. It will test the 3 basic types of preferences set before the child is created.  It will also test the same times set after the child is created as you suggested (good idea).

Also, it is fair to note, that the pattern i used to test in the child probably could be factored out.  It allows you to just run a "verification" function in the child process without having to have a separate file.
Attachment #477737 - Attachment is obsolete: true
Attachment #478450 - Flags: review?(jmaher)
Comment on attachment 478450 [details] [diff] [review]
test patch v.2

Nice, the observer tests look good.
create 599545 to follow up on serialization perf.
Comment on attachment 478450 [details] [diff] [review]
test patch v.2

>+function verify_existing_prefs() {
>+  const Ci = Components.interfaces;
>+  const Cc = Components.classes;
>+  dump("verify_existing_prefs\n");

Extraneous?

>+function verify_observed_prefs() {
>+  const Ci = Components.interfaces;
>+  const Cc = Components.classes;
>+  dump("verify_observed_prefs\n");

Here too.

r=dwitte
Attachment #478450 - Flags: review?(jmaher) → review+
the context that those two methods are being executing in doesn't have those defined.
I mean the dump. :)
then you would be correct!  I'll remove.
tracking-fennec: 2.0b2+ → 2.0b1+
http://hg.mozilla.org/mozilla-central/rev/f584b2db2a7e
http://hg.mozilla.org/mozilla-central/rev/a7300f9d11ae
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
The old pref observer code in ContentChild.cpp that I thought this patch would obsolete is still around.  Any reason for that?
yes, it needs to stay.  We serialize the entire preferences hashtable, then send it to the child process.  After this point, for all preference changes in the parent, we will notify the child using the existing "old pref observer code" in the ContentChild.

Do you see a way for us to remove this?
Blocks: 599900
Blocks: 600110
i'll remove in bug 600110!
Blocks: 599545
You need to log in before you can comment on or make changes to this bug.