Closed Bug 900551 Opened 11 years ago Closed 10 years ago

Provide a mechanism to get/set shared background image for privileged apps using mozSettings

Categories

(Core :: DOM: Device Interfaces, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
2.1 S4 (12sep)
feature-b2g 2.1
tracking-b2g backlog

People

(Reporter: vingtetun, Assigned: qdot)

References

Details

(Keywords: dev-doc-needed, Whiteboard: [systemsfe] [p=8])

Attachments

(6 files, 28 obsolete files)

1.10 KB, patch
bent.mozilla
: review+
Details | Diff | Splinter Review
17.61 KB, patch
Details | Diff | Splinter Review
946 bytes, patch
jaliu
: review+
Details | Diff | Splinter Review
4.06 KB, patch
dlee
: review+
hchang
: review+
Details | Diff | Splinter Review
67.50 KB, patch
Details | Diff | Splinter Review
6.46 KB, patch
bent.mozilla
: review+
Details | Diff | Splinter Review
Right now the homescreen depends on mozSettings to retrieve the user selected background. This makes it hard to write a third party homescreen since mozSettings is for certified apps only.

It would be nice to get rid of this dependency and/or find a way for third party homescreen to access the default background in a pretty way.
It looks like the wallpaper is managed by the System app.

Wallpapers can be chosen from the Settings app itself and the system app listens on the wallpaper.image setting. The image is attached to a #screen DOM element belonging to the System app and the Homescreen shows it through transparency.

Is that okay?
(In reply to Olav Nymoen [:olav] from comment #1)
> It looks like the wallpaper is managed by the System app.

This is true but it won't viable to rely on the system app for displaying the background if people wants parallax homescreen. For this to work the homescreen will have to display the background image by itself.
Maybe it can be move to a system datastore?
> Maybe it can be move to a system datastore?

Sounds good! Then this one is also blocked by 871445. Can't wait!
blocking-b2g: --- → 1.5+
What should be the permission model here? Should every privileged app be allowed to change the wallpaper or just homescreen apps? I guess we are switching to a 'homescreen-role' property in the manifest. Should we just allow homescreen and certified apps to change the wallpaper?
Whiteboard: [systemsfe]
Ideally I'd like to keep using settings here, but enable privileged apps to get this setting without also getting other settings.

The problem is the way that we've implemented the settings API right now we can't securely enforce that an app only has access to one setting. Fixing this requires rewriting the settings API implementation, though the API exposed to apps wouldn't need to change.
No longer blocks: vertical-homescreen
(In reply to Jonas Sicking (:sicking) from comment #7)
> Ideally I'd like to keep using settings here, but enable privileged apps to
> get this setting without also getting other settings.
> 
> The problem is the way that we've implemented the settings API right now we
> can't securely enforce that an app only has access to one setting. Fixing
> this requires rewriting the settings API implementation, though the API
> exposed to apps wouldn't need to change.

Evelyn, is this something the settings team wants to do in the near future?
Flags: needinfo?(ehung)
(In reply to Gregor Wagner [:gwagner] from comment #8)
> (In reply to Jonas Sicking (:sicking) from comment #7)
> > Ideally I'd like to keep using settings here, but enable privileged apps to
> > get this setting without also getting other settings.
> > 
> > The problem is the way that we've implemented the settings API right now we
> > can't securely enforce that an app only has access to one setting. Fixing
> > this requires rewriting the settings API implementation, though the API
> > exposed to apps wouldn't need to change.
> 
> Evelyn, is this something the settings team wants to do in the near future?

Yes, we should let privileged apps be able to get (or even set) some settings instead of all settings, wallpaper is just a case. I'm not sure how difficult to make it happen on mozSettings API. We brought up this topic on webapi meeting a couple of weeks ago, but it seems people didn't have a clear picture on this topic.

I also want to echo Vivien's comment 2. Background should be part of a homescreen app, so if we want the homescreen be replicable then homescreen needs to display the background image by itself.
Flags: needinfo?(ehung)
(In reply to Evelyn Hung [:evelyn] from comment #9)
> (In reply to Gregor Wagner [:gwagner] from comment #8)
> > (In reply to Jonas Sicking (:sicking) from comment #7)
> > > Ideally I'd like to keep using settings here, but enable privileged apps to
> > > get this setting without also getting other settings.
> > > 
> > > The problem is the way that we've implemented the settings API right now we
> > > can't securely enforce that an app only has access to one setting. Fixing
> > > this requires rewriting the settings API implementation, though the API
> > > exposed to apps wouldn't need to change.
> > 
> > Evelyn, is this something the settings team wants to do in the near future?
> 
> Yes, we should let privileged apps be able to get (or even set) some
> settings instead of all settings, wallpaper is just a case. I'm not sure how
> difficult to make it happen on mozSettings API. We brought up this topic on
> webapi meeting a couple of weeks ago, but it seems people didn't have a
> clear picture on this topic.
> 
> I also want to echo Vivien's comment 2. Background should be part of a
> homescreen app, so if we want the homescreen be replicable then homescreen
> needs to display the background image by itself.

Right now we implement the settings API in the child and let indexedDB take care of all the child-parent communication. If we want a system where settings can be accessed based on app privileges we probably have to to move all the logic to the parent. This might or might not be a complex task but it is not a small change.
I think our team should take this. Based on comment 7, there would be lots of Gecko work at least. Assign this to me first and we'll coordinate the resources.
Assignee: nobody → gene.lian
Hi Marco, I think it's an independant and interesting topic for newbiew(s). I can try to be the mentor to help them work it out.
OS: Linux → Gonk (Firefox OS)
Hardware: x86_64 → ARM
(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #11)
> I think our team should take this. Based on comment 7, there would be lots
> of Gecko work at least. Assign this to me first and we'll coordinate the
> resources.

So it seems we are going to extend mozSettings API for privileged apps. Should we rename the title to reflect it or file another bug and make this dependency?
Flags: needinfo?(gene.lian)
OS: Gonk (Firefox OS) → Linux
Hardware: ARM → x86_64
OS: Linux → Gonk (Firefox OS)
Hardware: x86_64 → ARM
Depends on: 846200
Flags: needinfo?(gene.lian)
(In reply to Evelyn Hung [:evelyn] from comment #13)
> (In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #11)
> > I think our team should take this. Based on comment 7, there would be lots
> > of Gecko work at least. Assign this to me first and we'll coordinate the
> > resources.
> 
> So it seems we are going to extend mozSettings API for privileged apps.
> Should we rename the title to reflect it or file another bug and make this
> dependency?

Either works for me. We'll solve this at bug 846200 from the Gecko side.
(In reply to Gene Lian [:gene] (needinfo? encouraged) from comment #12)
> Hi Marco, I think it's an independant and interesting topic for newbiew(s).
> I can try to be the mentor to help them work it out.

Yes, thanks for this good suggestion. And we can discuss this offline since it didn't be relevant to the bug itself.
Hi Gregor,

Since the original idea from Gene is to take this (and bug 846200) for a newbie's practice, may I know the exactly schedule about it before we make the decision?

As I know the due day of feature landing on 2.0 is 06/09/14, this time line is not suitable for a junior engineer. so
 
  1. if this is really a blocker of 2.0, I don't prefer to let junior engineer to take the risk and pressure. (ex: Is replaceable homescreen on 2.0 a privilege or certified apps).

  2. If this can be released from 2.0 then it is good for a practice.

Thanks.
Flags: needinfo?(anygregor)
(In reply to Marco Chen [:mchen] from comment #16)
> Hi Gregor,
> 
> Since the original idea from Gene is to take this (and bug 846200) for a
> newbie's practice, may I know the exactly schedule about it before we make
> the decision?
> 
> As I know the due day of feature landing on 2.0 is 06/09/14, this time line
> is not suitable for a junior engineer. so
>  
>   1. if this is really a blocker of 2.0, I don't prefer to let junior
> engineer to take the risk and pressure. (ex: Is replaceable homescreen on
> 2.0 a privilege or certified apps).
> 
>   2. If this can be released from 2.0 then it is good for a practice.
> 
> Thanks.

Hi Marco!
We need privileged apps being able to access the settings API because our new homescreen2 will be a privileged app. The new homescreen needs access to the settings API because the wallpaper is currently stored via the settings API.

Since homescreen2 is blocking our 2.0 release and this feature is blocking the new homescreen we need it rather sooner than later. So I would say the deadline for landing is one sprint before 2.0 FC.

The other option is to create a workaround where homescreen apps can set wallpapers but I would rather solve the original problem. We were talking about this for a while now and it's time to fix it.

I don't think its a very hard problem but I wouldn't call it a beginner project. It has security and performance implications so we need to get it right the first time.
Flags: needinfo?(anygregor)
After confirmation, we think this issue is not necessary for our team (stream-3) to take because our partner's homescreen apps don't have to be privileged for now. Sorry we don't really have resources to take such a v2.0 feature. Actually, we have some newbies be able to work on this but it might be too urgent for them to take. Hope either stream-1 or stream-2 team can take this.
Assignee: gene.lian → nobody
To correct one thing - "TV partner planed to have privileged homescreen app".
But as what Gene said the resource we can release here is newcomer, therefore he can't take it according to this time frame.
(In reply to comment #18)
> After confirmation, we think this issue is not necessary for our team
> (stream-3) to take because our partner's homescreen apps don't have to be
> privileged for now. Sorry we don't really have resources to take such a v2.0
> feature. Actually, we have some newbies be able to work on this but it might be
> too urgent for them to take. Hope either stream-1 or stream-2 team can take
> this.

Gene: Does this mean that all of the other requests based on the use case of privileged homescreen apps are not an issue any longer?
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #20)
> Gene: Does this mean that all of the other requests based on the use case of
> privileged homescreen apps are not an issue any longer?

Our TV partner still hopes to enable privileged homescreen apps but we won't make commitment to them, because as you know, we don't have good solutions to the widget issue until the nested OOP iframe comes up.

However, please refer to comment #17, this bug is still required for V2.0 (requested by non-TV product's time line) which wants to develop new homescreen2 app (has to be privileged) before May/June. Unfortunately, the TV team only has newbies be able to work on this at this moment and we think the V2.0 time line is too tight for them to take, so we have to reassign ourselves off this one.

In summary, the use case of privileged homescreen apps is still an important issue but the TV team doesn't have enough resources to make it happen for V2.0.
Whiteboard: [systemsfe]
Or maybe dom:device_interfaces?
Evelyn, can you take this one?
Component: Gaia::Homescreen → Gaia::Settings
Flags: needinfo?(ehung)
This shouldn't block - this is a feature, which is something we won't block on unless we're past FL & we're planning to still land the feature.
blocking-b2g: 2.0+ → backlog
(In reply to Gregor Wagner [:gwagner] from comment #22)
> Or maybe dom:device_interfaces?
> Evelyn, can you take this one?

As I know it's more like a API change as we've concluded from comments above, but it seems we don't have a Gecko dev who is able to working on it and meet 2.0 time frame. Let's file another bug for tracking the general case - exposing mozSetting API to privileged app.

So thing's coming to a Gaia workaround for this specific case (as the bug title said) - I think we need more discussion here and focusing on what Gaia system/homescreen/settings apps can do, then assign this bug to a right person.
Flags: needinfo?(ehung)
Blocks: 1003242
(In reply to Evelyn Hung [:evelyn] from comment #24)
> (In reply to Gregor Wagner [:gwagner] from comment #22)
> > Or maybe dom:device_interfaces?
> > Evelyn, can you take this one?
> 
> As I know it's more like a API change as we've concluded from comments
> above, but it seems we don't have a Gecko dev who is able to working on it
> and meet 2.0 time frame. Let's file another bug for tracking the general
> case - exposing mozSetting API to privileged app.
> 

bug 1003242 filed.
No longer blocks: 1003242
Depends on: 1003242
(In reply to Evelyn Hung [:evelyn] from comment #25)
> (In reply to Evelyn Hung [:evelyn] from comment #24)
> 
> bug 1003242 filed.

I found there is already a bug 846200 and it's better described what we want on API change! Sorry for the spam.
Assignee: nobody → kyle
Whiteboard: [systemsfe] [p=8]
Target Milestone: --- → 2.0 S2 (23may)
Going ahead and changing the title on this as said in comment #13, since I'm working on bug 846200 and this basically follows.
Summary: Provide a mechanism to have a shared background image that does not involved mozSettings → Provide a mechanism to get/set shared background image for privileged apps using mozSettings
feature-b2g: --- → 2.0
This is the oop settings API, /minus/ the ability to deal with transactions across processes. Right now I just open a new transaction for every call to the DB, which at least proves it's communicating correctly. Not sure if it passes tests yet.

fb? here is mostly for sanity checking while I get transactions in order.
Attachment #8423541 - Flags: feedback?(anygregor)
Other issues worth noting in WIP:

- Should probably move message handling for locks to the SettingsManager, otherwise we spam all the open locks in the process any time we return a transaction.

- We don't currently fail well on not having permissions when get/set, we just throw NS_ERROR_NOT_IMPLEMENTED. We should either throw something clearly or kill the whole app. Permissions issues usually mean kill whole app, but this seems like a place we could do something a bit gentler. Assuming we do just return an error, it should also invalidate the current transaction if one is open.
Comment on attachment 8423541 [details] [diff] [review]
Patch 1 (v1) - WIP: OOP Settings API

Patch violently broken. Removing until fixed.
Attachment #8423541 - Attachment is obsolete: true
Attachment #8423541 - Flags: feedback?(anygregor)
Due to the fact that we're going to be executing transactions out of process, I'm now going to have to chain multiple IDB transactions together in order to simulate the way we used to add to transactions in callbacks. This adds a new place for transactions to fail, since I'm going to have to batch all "set" operations until the settings lock is completely finished receiving commands, and run all of the sets against the database at once, while returning expected values for gets. This is basically implementing a transaction layer with delayed commit on top of IDB.

In order to relay this to the user, I think we need to turn SettingsLock into an EventHandler with two events: ontransactionsuccess and ontransactionfailure. One of these will fire once the lock has commited all of its changes to the database. That way, should any failure happen in calling set, we'll at least be able to let the user know.

There could be conflict issues in the DB since other transactions can happen to change values before we run our final set calls, but there's not much we can do about it as far as I can see.

Jonas, you got any objections to this?
Flags: needinfo?(jonas)
No objections really. Though I'm not sure that it's required to do in this bug. We actually already have that same problem. Even though we use a single transaction, we currently never signal if committing that transaction fails or succeeds.
Flags: needinfo?(jonas)
Target Milestone: 2.0 S2 (23may) → 2.0 S3 (6june)
fb? here is actually r? except that a few things still need to be finished up. I'm still working on failing transactions on permissions denials, and I just realized we also have to overhaul SettingsChangeNotifier, since it's currently built on an all or nothing type system, and needs to be able to know whether a window can be notified of a permissions change via the permissions it has. Even so, this patch passes the current mochis. I'll also be adding more mochis to test the new permissions failures, callbacks, etc.
Attachment #8430486 - Flags: feedback?(anygregor)
Integrated SettingsChangeNotifier into SettingsRequestManager. /Should/ now only fire notifications based on permissions of the child window, but still need tests in to confirm this. All current mochis pass on this though.
Attachment #8430486 - Attachment is obsolete: true
Attachment #8430486 - Flags: feedback?(anygregor)
Attachment #8431320 - Flags: feedback?(anygregor)
Added mochi for testing the new transaction callback
Attachment #8431371 - Flags: review?(anygregor)
Comment on attachment 8431320 [details] [diff] [review]
Patch 1 (v3) - WIP: OOP Settings API

Changing this to a full review. I was going to try to implement Bug 1015452 (failing gracefully on permissions check failures) here, but working around the "crash when permission check fails" strategy feels like it needs more discussion, and this needs to land soon, so for the moment we'll just continue to crash on failed perm checks.
Attachment #8431320 - Flags: feedback?(anygregor) → review?(anygregor)
Attached patch Patch 1 (v4) - OOP Settings API (obsolete) — Splinter Review
Lots of fixes. Also accidentally merged in mochitests to this patch but eh.
Attachment #8431320 - Attachment is obsolete: true
Attachment #8431371 - Attachment is obsolete: true
Attachment #8431320 - Flags: review?(anygregor)
Attachment #8431371 - Flags: review?(anygregor)
Attachment #8432909 - Flags: review?(anygregor)
Permissions failures, which at the moment still cause the requesting process to crash, will now wedge the whole system since SettingsRequestManager doesn't do anything with the resulting child-process-shutdown message. This has to be fixed before landing. :(
Also, the permissions checks are FAR too aggressive. We can't use the assertPermission function for everything, as it will crash out, meaning that currently any readonly settings app will crash when it tries to open a transaction lock. All of this passes on mochis because the non-oop tests always get all the permissions.
assertPermission is broken in many ways, so we should stop using it as soon as replacements are available.

The way we *should* be doing things is:

When the child sends the initial message to the parent using sendAsyncMessage, include the nsIPrincipal of the calling page as the fourth argument.

On the parent side the message manager will check that the principal received from the child isn't a lie. I.e. it will check that the appid in the principal matches the appid that we're running in the child. If the principal is a lie we kill the child process and don't fire the message.

Once you receive the message, check with the nsIPermissionManager that the child process has permission to do what it's trying to do.

If it doesn't have the permission, simply return an error or kill the child process as appropriate. Basically, kill the child process if the code in the child process should have already done the permission check, otherwise simply return an error.

All of this *should* already work. The only thing I'm not sure of is how to kill a child process from the parent.

And possibly we should have some helper functions that can do the "check if a principal has a given permission, kill the child and return false if it does not, return true if it does".
feature-b2g: 2.0 → 2.1
Attached patch Patch 1 (v5) - OOP Settings API (obsolete) — Splinter Review
Attachment #8432909 - Attachment is obsolete: true
Attachment #8432909 - Flags: review?(anygregor)
Attachment #8435451 - Flags: review?(anygregor)
Attachment #8431375 - Attachment is obsolete: true
Attachment #8431375 - Flags: review?(anygregor)
Attachment #8435452 - Flags: review?(anygregor)
Holding off on review for mochitests because there needs to be more of them.
Attached patch Patch 1 (v6) - OOP Settings API (obsolete) — Splinter Review
Fixed permissions caching so it'll work with mochis, clean up transactions on message manager death, fixed rebasing issue between this and bug 1015518
Attachment #8435451 - Attachment is obsolete: true
Attachment #8435451 - Flags: review?(anygregor)
Attachment #8437142 - Flags: review?(anygregor)
Added new mochitests for testing request failure cascades, fixed permissions issues.
Attachment #8435453 - Attachment is obsolete: true
Attachment #8437144 - Flags: review?(anygregor)
Target Milestone: 2.0 S3 (6june) → 2.0 S4 (20june)
Attached patch Patch 1 (v7) - OOP Settings API (obsolete) — Splinter Review
Found issue on phone where unclean process death caused locks to stall in the queue.
Attachment #8437142 - Attachment is obsolete: true
Attachment #8437142 - Flags: review?(anygregor)
Attachment #8437356 - Flags: review?(anygregor)
Attachment #8435452 - Flags: review?(anygregor) → review?(bent.mozilla)
Attachment #8437144 - Flags: review?(anygregor) → review?(bent.mozilla)
Attachment #8437356 - Flags: review?(anygregor) → review?(bent.mozilla)
Attachment #8435452 - Flags: review?(bent.mozilla) → review?(amarchesini)
Attachment #8437144 - Flags: review?(bent.mozilla) → review?(amarchesini)
Comment on attachment 8437356 [details] [diff] [review]
Patch 1 (v7) - OOP Settings API

I take these 3 patches for a first review.
Attachment #8437356 - Flags: review?(bent.mozilla) → review?(amarchesini)
Comment on attachment 8437356 [details] [diff] [review]
Patch 1 (v7) - OOP Settings API

Review of attachment 8437356 [details] [diff] [review]:
-----------------------------------------------------------------

I have to finish the review, but if you can re-upload it with the renaming of the file, it's easier for me :)
Otherwise, write a comment and I'll continue the review as it is.

::: dom/base/DOMRequestHelper.jsm
@@ +145,4 @@
>     *    will be added as a strong referred one by default.
>     */
>    initDOMRequestHelper: function(aWindow, aMessages) {
> +    debug("INIT");

Just to have the same kind of comments in SettingChangeNotifier.jsm and other, can you write them down lowercase?

::: dom/settings/SettingsChangeNotifier.jsm
@@ -1,5 @@
> -/* This Source Code Form is subject to the terms of the Mozilla Public
> - * License, v. 2.0. If a copy of the MPL was not distributed with this file,
> - * You can obtain one at http://mozilla.org/MPL/2.0/. */
> -
> -"use strict"

Can you upload a new patch with a rename instead a remove+add ?
(In reply to Andrea Marchesini (:baku) from comment #49)

> Just to have the same kind of comments in SettingChangeNotifier.jsm and
> other, can you write them down lowercase?

Oops, yeah, usually anything I add in all caps is meant to be taken out, but that one should just get fixed. Will do as followup

> Can you upload a new patch with a rename instead a remove+add ?

I'm not exactly sure what you want me to rename? SettingsChangeNotifier.jsm got sliced up into parts in SettingsRequestManager.jsm, but SettingsRequestManager also contains large parts of SettingsManager and a lot of its own code, so I'm not sure which file you want me to rename SettingsChangeNotifier to? I can try going SettingsChangeNotifier->SettingsRequestManager but I'm not sure how well that's going to line up.
Target Milestone: 2.0 S4 (20june) → 2.0 S5 (4july)
Depends on: 1030997
Attachment #8435452 - Flags: review?(amarchesini) → review?(bent.mozilla)
Attachment #8437144 - Flags: review?(amarchesini) → review?(bent.mozilla)
Comment on attachment 8437356 [details] [diff] [review]
Patch 1 (v7) - OOP Settings API

I'm in PTO, I cannot work on these patches in the next weeks.
Attachment #8437356 - Flags: review?(amarchesini) → review?(bent.mozilla)
Target Milestone: 2.0 S5 (4july) → 2.0 S6 (18july)
Comment on attachment 8437356 [details] [diff] [review]
Patch 1 (v7) - OOP Settings API

Review of attachment 8437356 [details] [diff] [review]:
-----------------------------------------------------------------

Overall this looks great! Though I'd still want to see another version with this stuff fixed (or just explained, there were a few questions I had here).

::: dom/base/DOMRequestHelper.jsm
@@ +17,5 @@
>   * when needed.
>   */
> +
> +const DEBUG = false;
> +function debug(s) { if (DEBUG) dump("-*- DOMRequestHelper: " + s + "\n"); }

As a general matter this really shouldn't check DEBUG inside the function. We've had perf problems in the past where people were calling things like:

  debug(doSomethingExpensive());

Since the arguments are always evaluated this causes a perf hit regardless of the value of DEBUG.

I know this isn't a big problem here because you're just passing simple string args, but still, we don't want anyone copying this pattern. So please change all your debug calls from |debug(...);| to |if (DEBUG) debug(...);|, in this and all the other files.

::: dom/settings/SettingsManager.js
@@ +44,5 @@
> + * to the parent when to release the transaction once the child is done with the
> + * lock. We keep a list of all open requests for a lock, and once the lock
> + * reaches the end of its receiveMessage function with no more queued requests,
> + * we consider it dead. At that point, we send a message to the parent to notify
> + * it to close the transaction.

Nit: This comment isn't very accurate... Each get() has its own idb transaction (roughly) and each set() doesn't actually run until the very end...

@@ +59,2 @@
>      this._open = false;
> +    this.runOrFinalizeQueries();

Hm, is this needed? I don't quite understand why, shouldn't the lock already be running?

@@ +77,5 @@
> +
> +  get ontransactionsuccess() {
> +    return this.__DOM_IMPL__.getEventHandler("ontransactionsuccess");
> +  },
> +  set ontransactionfailure(aHandler) {

Nit: want an extra newline here to match the others?

@@ +97,5 @@
> +  runOrFinalizeQueries: function() {
> +    if (!this._requests || Object.keys(this._requests).length == 0) {
> +      cpmm.sendAsyncMessage("Settings:Finalize", {lockID: this._id}, undefined, this._settingsManager._window.document.nodePrincipal);
> +    } else {
> +      cpmm.sendAsyncMessage("Settings:Run", {lockID: this._id}, undefined, this._settingsManager._window.document.nodePrincipal);

A helper function for this would be nice.

@@ +106,5 @@
> +    let msg = aMessage.data;
> +    // SettingsRequestManager broadcasts changes to all locks in the child. If
> +    // our lock isn't being addressed, just return.
> +    // TODO: Is this safe? Should we set up IPDL channel per lock?
> +    if(msg.lockID != this._id) return;

Nit: space between |if(| and always only one statement per line (otherwise it's easy to miss the return).

@@ +115,5 @@
> +      switch (aMessage.name) {
> +        case "Settings:Finalize:OK":
> +          debug("Lock finalize ok!");
> +          if (this.ontransactionsuccess) {
> +            this.ontransactionsuccess();

Hrm, maybe the DOMRequestHelper thing does some magic here, but shouldn't you be creating and passing in a success event?

@@ +121,5 @@
> +          break;
> +        case "Settings:Finalize:KO":
> +          debug("Lock finalize failed!");
> +          if (this.ontransactionfailure) {
> +            this.ontransactionfailure();

Same for the error case, you're not sending any info about what kind of error happened...

@@ +214,5 @@
>    classID: Components.ID("{60c9357c-3ae0-4222-8f55-da01428470d5}"),
>    contractID: "@mozilla.org/settingsLock;1",
> +  QueryInterface: XPCOMUtils.generateQI([Ci.nsISupports,
> +                                         Ci.nsIObserver,
> +                                         Ci.nsISupportsWeakReference])

Er, who needs the weak reference? It doesn't look like you're using it.

@@ +230,3 @@
>    },
>  
>    set onsettingchange(aHandler) {

This needs to handle the case where aHandler is null (should unegister for messages probably?)

::: dom/settings/SettingsRequestManager.jsm
@@ +14,5 @@
> +const Cu = Components.utils;
> +
> +this.EXPORTED_SYMBOLS = ["SettingsRequestManager"];
> +
> +Cu.import("resource://gre/modules/ObjectWrapper.jsm");

ObjectWrapper got nuked in bug 1022003 FYI.

@@ +27,5 @@
> +const kSettingsWriteSuffix             = "-write";
> +const kAllSettingsReadPermission       = "settings" + kSettingsReadSuffix;
> +const kAllSettingsWritePermission      = "settings" + kSettingsWriteSuffix;
> +const kSomeSettingsReadPermission      = "settings-api" + kSettingsReadSuffix;
> +const kSomeSettingsWritePermission     = "settings-api" + kSettingsWriteSuffix;

the '-api' thing doesn't make a whole lot of sense... This is all using the settings api... What about '-partial' or '-specific' instead?

@@ +50,5 @@
> +    }
> +    let perms = [];
> +    let principal;
> +    if (aMessage.principal.origin == "[System Principal]") {
> +      principal = Services.scriptSecurityManager.getSystemPrincipal();

Rather than get the system principal and then proceed to call into the permission manager a bunch of times I think you can just assume that the system principal can do everything.

@@ +52,5 @@
> +    let principal;
> +    if (aMessage.principal.origin == "[System Principal]") {
> +      principal = Services.scriptSecurityManager.getSystemPrincipal();
> +    } else {
> +      let appId = aMessage.principal.appId != 0 ? aMessage.principal.appId : null;

I think you should probably just always use whatever is on aMessage.principal.appId always.

@@ +53,5 @@
> +    if (aMessage.principal.origin == "[System Principal]") {
> +      principal = Services.scriptSecurityManager.getSystemPrincipal();
> +    } else {
> +      let appId = aMessage.principal.appId != 0 ? aMessage.principal.appId : null;
> +      let url = Services.io.newURI(aMessage.principal.origin, null, null);

Nit: s/url/uri/

@@ +61,5 @@
> +    }
> +    for (let i in AllPossiblePermissions) {
> +      let permName = AllPossiblePermissions[i];
> +      // We only care about permissions starting with the word "settings"
> +      if (permName.substring("settings") == -1) {

Shouldn't this be != 0? Otherwise you match things that contain "settings" anywhere.

@@ +83,5 @@
> +    if (!this._mmPermissions[aMsgMgr]) {
> +      debug("Manager not added!");
> +      return false;
> +    }
> +    return (this._mmPermissions[aMsgMgr].indexOf(kAllSettingsReadPermission) != -1);

All of these should probably be != 0.

@@ +160,5 @@
> +      this._transaction.oncomplete = function () {
> +        debug("Transaction for lock " + this.lockID + " closed");
> +        this._transaction = undefined;
> +      }.bind(this);
> +      this._transaction.onabort = function () {

Hm, I wonder if you shouldn't also mark this._failed = true?

@@ +179,5 @@
> +              "Settings:CreateLock", "Settings:RegisterForMessages"],
> +  // Cached permissions lookups based on the request originating window in the
> +  // child process. Allows us to only do permissions lookups once per window
> +  // creation.
> +  windowPermissionsInfo: {},

Looks like this is unused? In any case caching permissions won't buy you much perf because the permission database lives in memory already, and you also risk holding on to stale data.

@@ +190,5 @@
> +  children: [],
> +  init: function() {
> +    debug("init");
> +    this.settingsDB.init();
> +    this._messages.forEach((function(msgName) {

Hm, why does this one have an _ while the rest of your members don't?

@@ +224,5 @@
> +
> +  queueTask: function(aOperation, aData) {
> +    debug("Queueing task: " + aOperation);
> +
> +    var defer = {};

Nit: let everywhere, right?

@@ +262,5 @@
> +      return Promise.reject({task: aTask, error: "No permission to get " + data.name});
> +    }
> +
> +    // If the value was set during this transaction, use the cached value
> +    if(Object.keys(lock.queuedSets).indexOf(data.name) != -1) {

Can't you just use |if (data.name in lock.queuedSets) {| ?

@@ +275,5 @@
> +    let getReq;
> +    debug("Making get transaction request for " + data.name);
> +    try {
> +      store = lock.transaction.objectStore(SETTINGSSTORE_NAME);
> +    } catch (e if e instanceof InvalidStateError) {

This pattern is repeated a bunch, it might makes sense to clean this up to make the SettingsLockInfo have an 'objectStore' getter that handles this for you. It could just return null if an exception is thrown and you could still reject promises in that case.

@@ +305,5 @@
> +        debug(name + ": " + result.userValue +", " + result.defaultValue);
> +        var value = result.userValue !== undefined ? result.userValue : result.defaultValue;
> +        results[name] = value;
> +      }
> +      debug(results);

This will just spit out "[object]" right?

@@ +337,5 @@
> +        return Promise.reject({task: aTask, error: "No permission to set " + keys[i]});
> +      }
> +    }
> +
> +    let checkPromises = [];

unused

@@ +343,5 @@
> +      let key = keys[i];
> +      debug("key: " + key + ", val: " + JSON.stringify(data.settings[key]) + ", type: " + typeof(data.settings[key]));
> +      lock.queuedSets[key] = data.settings[key];
> +    }
> +    return Promise.resolve({task: aTask});

This looks problematic... You can't resolve without hitting the database thread or else your callbacks are going to get out of order. Consider:

  lock.get("foo").then(function() { alert("foo!"); });
  lock.set("bar").then(function() { alert("bar!"); });

As a user of this API I would expect "foo!" before "bar!", but that's not guaranteed here. If the db thread is fast and the threads switch at just the right moment you might get the expected order, but most likely you're going to get the callbacks in reverse order.

As we discussed, the simplest thing is to probably chain this to a idb.get() with a nonexistent key (number, null, etc). It's a little wasteful but not much and it's way way waaaaay simpler than trying to do the necessary queuing yourself.

@@ +348,5 @@
> +  },
> +
> +  // Removes the current lock from the queue, and starts transactions for the
> +  // next lock, assuming there is one.
> +  removeCurrentLock: function(aLockID) {

Passing in a lock id here seems risky... Can't you just get the current lock id from the settingsLockQueue array before you remove the first item?

@@ +367,5 @@
> +      return Promise.reject({task: aTask, error: "Lock failed a permissions check, all requests now failing."});
> +    }
> +
> +    let defer = {};
> +    let promiseWrapper = new Promise(function(resolve, reject) {

You can probably move this below all the places you early-resolve/reject.

@@ +415,5 @@
> +        } else {
> +          defaultValue = event.target.result.defaultValue;
> +        }
> +        lock.queuedSets[key] = { value: settings[key],
> +                                 defaultValue: defaultValue };

Hm, why update this here? Seems unnecessary, and also could lead to trouble if lock.queuedSets is ever accessed again since it now contains objects...

@@ +436,5 @@
> +
> +    // Once all transactions are done, or any have failed, remove the lock and
> +    // start processing the tasks from the next lock in the queue.
> +    Promise.all(checkPromises).then(function() {
> +        // If all commits were successful, notify observers

Nit: Indent looks wrong in this block

@@ +449,5 @@
> +    }.bind(this));
> +    return promiseWrapper;
> +  },
> +
> +  taskClear: function(aTask) {

Somewhere in here you should clear out queuedSets, right? Otherwise taskGet will hand them out still. Sounds like an easy thing to add a test for too.

@@ +479,5 @@
> +    catch (e if e instanceof InvalidStateError) {
> +      debug("Rejecting Clear Task on lock " + data.lockID);
> +      return Promise.reject({task: aTask, error: e.name});
> +    }
> +    let clearReq = store.clear();

So I'm a little confused... You defer all the sets until the end of the lock lifetime, but you run clear() immediately. Why is there a difference? Are you assuming that since you have 'all write' permission the rest of the requests in the lock will succeed? What about disk errors and such?

@@ +500,5 @@
> +    this.settingsDB.ensureDB(
> +      function() { defer.resolve(); },
> +      function(error) {
> +        debug("Cannot open Settings DB. Trying to open an old version?\n");
> +        defer.reject();

I think we might want to pass the error out here, at least so someone can log it?

@@ +529,5 @@
> +          p = this.finalizeSets(currentTask);
> +          break;
> +        default:
> +          debug("Invalid operation: " + currentTask.operation);
> +          return Promise.reject({task: currentTask});

This function doesn't normally return promises...

@@ +540,5 @@
> +      promises.push(p);
> +      currentTask = lock.tasks.shift();
> +    }
> +    //After we exhaust all queued tasks, make sure our transaction is freed.
> +    lock.transaction = undefined;

Is this needed? I can't really figure out when this would make much difference.

And if it isn't then you can remove the transaction setter from SettingsLockInfo.

@@ +544,5 @@
> +    lock.transaction = undefined;
> +  },
> +
> +  consumeTasks: function() {
> +    let lockID = this.settingsLockQueue[0];

If a child process died isn't it possible that settingsLockQueue is empty?

@@ +553,5 @@
> +    // child-process-shutdown event. But just in case we don't, we want to make
> +    // sure we never block on consuming.
> +    if (!lock) {
> +      debug("Lock not found");
> +      this.removeCurrentLock();

Hm, this doesn't look like it will work, you're passing no lock id here?

@@ +567,5 @@
> +    this.ensureConnection().then(
> +      function(task) {
> +        this.runTasks(lockID);
> +      }.bind(this), function(ret) {
> +        debug("Cannot make DB connection!");

This feels like something that should be logged regardless of the DEBUG flag.

@@ +623,5 @@
> +      this.children.splice(index, 1);
> +    }
> +  },
> +
> +  removeMessageManager: function(aMsgMgr){

This also needs to somehow start the next settings lock if the process that just crashed was blocking other locks.

@@ +627,5 @@
> +  removeMessageManager: function(aMsgMgr){
> +    debug("Removing message manager " + aMsgMgr);
> +    this.removeObserver(aMsgMgr);
> +    SettingsPermissions.removeManager(aMsgMgr);
> +    let closedLocks = [];

Nit: call this closedLockIDs?

@@ +665,5 @@
> +        this.removeMessageManager(mm);
> +        break;
> +      case "Settings:RegisterForMessages":
> +        SettingsPermissions.addManager(aMessage);
> +        if (!SettingsPermissions.hasSomeReadPermission(mm)) {

So we used to kill apps pretty hard if they didn't have the right permissions... Shouldn't we do that here?

@@ +671,5 @@
> +                         " from a content process with no 'settings-api-read' privileges.");
> +          return;
> +        }
> +        if (this.children.indexOf(mm) == -1) {
> +          this.addObserver(mm);

Nit: addObserver does this indexOf check already.

@@ +675,5 @@
> +          this.addObserver(mm);
> +        }
> +        break;
> +      case "Settings:CreateLock":
> +        debug("Received CreateLock for " + msg.lockID);

I don't think it's smart to rely on the child to provide you with a unique lockID. It could always send the same one, and then everything would be hosed. Maybe you need two ids, one that you create that you can guarantee is unique, and then the one that the child sent. Then if the child screws itself, well, too bad.

The ids that you generate here needn't be a uuid, just an int that always increases is probably sufficient.

@@ +680,5 @@
> +        this.settingsLockQueue.push(msg.lockID);
> +        SettingsPermissions.addManager(aMessage);
> +        this.lockInfo[msg.lockID] = SettingsLockInfo(this.settingsDB, mm, msg.lockID, msg.isServiceLock);
> +        break;
> +      case "Settings:Get":

Get/Set/Clear need to somehow guard against being called after Finalize (and kill hard if the order gets messed up).

@@ +727,5 @@
> +            errorMsg: error
> +          });
> +        });
> +        break;
> +      case "Settings:Finalize":

You probably want to guard against this being called multiple times, or at times you don't expect (before Settings::Run, for instance?).
Attachment #8437356 - Flags: review?(bent.mozilla) → review-
Attachment #8435452 - Flags: review?(bent.mozilla) → review+
Attachment #8437144 - Flags: review?(bent.mozilla) → review+
Target Milestone: 2.0 S6 (18july) → 2.1 S1 (1aug)
Attached patch Patch 1 (v8) - OOP Settings API (obsolete) — Splinter Review
> ::: dom/base/DOMRequestHelper.jsm
> @@ +17,5 @@
> >   * when needed.
> >   */
> > +
> > +const DEBUG = false;
> > +function debug(s) { if (DEBUG) dump("-*- DOMRequestHelper: " + s + "\n"); }

This actually shouldn't have even been left in DOMRequestHelper. That
instrumentation was for some run order issues I was having. I've
removed them from that file, and updated the other files to follow the
requested calls.

::: dom/settings/SettingsManager.js
> Nit: This comment isn't very accurate... Each get() has its own idb
> transaction (roughly) and each set() doesn't actually run until the
> very end...

Changed it from saying "close the transaction" to "finalize the
transaction", which I think makes more sense?

> @@ +59,2 @@
> >      this._open = false;
> > +    this.runOrFinalizeQueries();

> Hm, is this needed? I don't quite understand why, shouldn't the lock already be running?

Yeah, it's needed. Queries are only to be run once the lock has left
the function scope it was created in. So we set this function to be
called later in the event loop using a dispatch, so it'll execute all
of the queries requested that were queued in the function.

> @@ +214,5 @@
> >    classID: Components.ID("{60c9357c-3ae0-4222-8f55-da01428470d5}"),
> >    contractID: "@mozilla.org/settingsLock;1",
> > +  QueryInterface: XPCOMUtils.generateQI([Ci.nsISupports,
> > +                                         Ci.nsIObserver,
> > +                                         Ci.nsISupportsWeakReference])
>
> Er, who needs the weak reference? It doesn't look like you're using it.

DOMRequestHelper uses it, and we use the DRH prototype, therefore we
need to expose it in our QI, otherwise things break.

> @@ +230,3 @@
> >    },
> >  
> >    set onsettingchange(aHandler) {
>
> This needs to handle the case where aHandler is null (should unegister for messages probably?)

We aren't just an EventHandler through, we've also got
add/removeObserver functions that require being registered for
messages even though we may not have an event handler. I could try to
make sure we have neither an event handler /nor/ callbacks and
register/unregister accordingly, but is that worth the possibly
convoluted logic?

> :::dom/settings/SettingsRequestManager.jsm
> @@ +27,5 @@
> > +const kSettingsWriteSuffix             = "-write";
> > +const kAllSettingsReadPermission       = "settings" + kSettingsReadSuffix;
> > +const kAllSettingsWritePermission      = "settings" + kSettingsWriteSuffix;
> > +const kSomeSettingsReadPermission      = "settings-api" + kSettingsReadSuffix;
> > +const kSomeSettingsWritePermission     = "settings-api" + kSettingsWriteSuffix;
> 
> the '-api' thing doesn't make a whole lot of sense... This is all
> using the settings api... What about '-partial' or '-specific'
> instead?

This is something Jonas and I discussed and came up with. We need a
settings-api permission to give to apps so that we know they even need
to hit the mozSettings API, and so that navigator knows to expose it
to them. This is why it's called "settings-api", it means whatever
principal is requesting the settings-api-* permission can use
mozSettings. Both things that need specific and global permissions
need this. 

That said, it's an additional permission. If you request either
"settings" or "settings:[specific-setting-name]" permissions and it's
granted, you get settings-api with it.

Should I include this info in comments somewhere? Perhaps in the
permissions table?

> @@ +61,5 @@
> > +      if (permName.substring("settings") == -1) {
>
> Shouldn't this be != 0? Otherwise you match things that contain "settings" anywhere.

Far worse than that. This always returns true because substring takes
1-2 integers, so this always returns just the string, which will never
equal -1. Oops. Luckily this is just an optimization step and not an
actual permissions step. Fixed.

> @@ +83,5 @@
> > +    if (!this._mmPermissions[aMsgMgr]) {
> > +      debug("Manager not added!");
> > +      return false;
> > +    }
> > +    return (this._mmPermissions[aMsgMgr].indexOf(kAllSettingsReadPermission) != -1);
> 
> All of these should probably be != 0.

This is an array of arrays of strings. So we're searching for the
location of the permission string in an array of strings, not the
position of a substring in a string. So this should be != -1, like it
is now.

> @@ +449,5 @@
> > +    }.bind(this));
> > +    return promiseWrapper;
> > +  },
> > +
> > +  taskClear: function(aTask) {
> 
> Somewhere in here you should clear out queuedSets, right? Otherwise
> taskGet will hand them out still. Sounds like an easy thing to add a
> test for too.

This is in finalize. If we're here, we're guaranteed that all get/set
requests have been made (via SettingsManager in the child) and there's
no way to make more get or set requests. The lock is closed and we're
done. The problem is, I need the value in lockedSets once the initial
'get' transaction returns. I did at least clean up where it gets
changed, and I suppose I could bind it in, but since we won't get any
more 'get'/'set' requests at this point, I think we can leave it as
is?

> @@ +479,5 @@
> > +    catch (e if e instanceof InvalidStateError) {
> > +      debug("Rejecting Clear Task on lock " + data.lockID);
> > +      return Promise.reject({task: aTask, error: e.name});
> > +    }
> > +    let clearReq = store.clear();
> 
> So I'm a little confused... You defer all the sets until the end of
> the lock lifetime, but you run clear() immediately. Why is there a
> difference? Are you assuming that since you have 'all write'
> permission the rest of the requests in the lock will succeed? What
> about disk errors and such?

Clear is /only/ available to certified APIs, and is only used in tests
for resetting the DB. So the restrictions on timing/etc are not as
stringent.

> @@ +540,5 @@
> > +      promises.push(p);
> > +      currentTask = lock.tasks.shift();
> > +    }
> > +    //After we exhaust all queued tasks, make sure our transaction is freed.
> > +    lock.transaction = undefined;
> 
> Is this needed? I can't really figure out when this would make much difference.

WARNING: This did not get fixed because I cannot figure out what is
going on. If I comment out this line, mochitests stall, but I have no
idea why. If I leave it in, everything works fine. We should talk
about this when you get back.

> @@ +665,5 @@
> > +        this.removeMessageManager(mm);
> > +        break;
> > +      case "Settings:RegisterForMessages":
> > +        SettingsPermissions.addManager(aMessage);
> > +        if (!SettingsPermissions.hasSomeReadPermission(mm)) {
> 
> So we used to kill apps pretty hard if they didn't have the right permissions... Shouldn't we do that here?

I put a hard kill in here via making an assertPermissions check, but
this would be a very weird situation to hit, since an app would have
to have write only access to the settings API. We also fail
permissions softly elsewhere due to the fact that the whole reason
this exists is for privileged apps. But I think we can get away with
doing a hard kill here at least.

> > @@ +675,5 @@
> > +          this.addObserver(mm);
> > +        }
> > +        break;
> > +      case "Settings:CreateLock":
> > +        debug("Received CreateLock for " + msg.lockID);
> 
> I don't think  it's smart to rely  on the child to provide  you with a
> unique lockID. 

Instead of setting up yet another ID system, which I didn't quite get
the gist of, I just changed CreateLock to print an error and break if
it encounters a create request for an ID it already has, instead of
adding it to the queue. Then we'll just queue more tasks for the same
ID afterward. That seems like enough to me, but we can discuss it
more.
Attachment #8437356 - Attachment is obsolete: true
Attachment #8462290 - Flags: feedback?
bent: Will need to talk to you about this before it goes full back into review, just posting what I've got for the moment. The fact that we're required to manually null out the transaction for anything to work is really worrisome.
Flags: needinfo?(bent.mozilla)
Updating mochitests to use mozSettingsTransaction event.
Attachment #8437144 - Attachment is obsolete: true
Attached patch Patch 1 (v9) - OOP Settings API (obsolete) — Splinter Review
Ok, fixed the transaction issue. Spending a few days away gave the clarity that we didn't even need to expose the transaction like I was doing. All we need is the object store. The whole thing of reusing the transaction was an unneeded (I think) optimization, and trying to keep it around was what was causing the lock issue.

I also changed the way we create transactions. We only create an RW transaction during the finalize or clear step, all other steps use RO.
Attachment #8462290 - Attachment is obsolete: true
Attachment #8462290 - Flags: feedback?
Attachment #8465140 - Flags: review?(bent.mozilla)
Flags: needinfo?(bent.mozilla)
Target Milestone: 2.1 S1 (1aug) → 2.1 S2 (15aug)
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #53)
> I could try to
> make sure we have neither an event handler /nor/ callbacks and
> register/unregister accordingly, but is that worth the possibly
> convoluted logic?

Yeah, I think so. It cuts down on the work that the parent process has to do a little.

> Should I include this info in comments somewhere? Perhaps in the
> permissions table?

That sounds fine.

> This is an array of arrays of strings.

Ah right. Ok.

> 
> > @@ +449,5 @@
> > > +    }.bind(this));
> > > +    return promiseWrapper;
> > > +  },
> > > +
> > > +  taskClear: function(aTask) {
> > 
> > Somewhere in here you should clear out queuedSets, right? Otherwise
> > taskGet will hand them out still. Sounds like an easy thing to add a
> > test for too.
> 
> This is in finalize. If we're here, we're guaranteed that all get/set
> requests have been made (via SettingsManager in the child) and there's
> no way to make more get or set requests. The lock is closed and we're
> done.

Hm, I don't think we're on the same page here. The issue is if a client does this:

  lock.set("foo", 10);
  lock.clear();
  lock.get("foo");

I don't see where |taskClear| clears |queuedSets|, so it looks to me like the get() would return 10 here since you have an early-return in |taskGet| that pulls from |queuedSets|.

> Clear is /only/ available to certified APIs, and is only used in tests
> for resetting the DB. So the restrictions on timing/etc are not as
> stringent.

Sure, but it does mean that it's not possible to roll back any lock that had a clear in it. If your code did this:

  lock.clear();
  lock.set("foo", 10);

If the set fails for some reason then the db is still cleared...

> Instead of setting up yet another ID system, which I didn't quite get
> the gist of, I just changed CreateLock to print an error and break if
> it encounters a create request for an ID it already has, instead of
> adding it to the queue. Then we'll just queue more tasks for the same
> ID afterward. That seems like enough to me, but we can discuss it
> more.

Well, I'm just worried about a child spoofing the lock id. As long as it's only possible for a child to spoof its own locks (and not the locks of another child process) then I think your fix is fine.
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #57)
> (In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #53)
> > I could try to
> > make sure we have neither an event handler /nor/ callbacks and
> > register/unregister accordingly, but is that worth the possibly
> > convoluted logic?
> 
> Yeah, I think so. It cuts down on the work that the parent process has to do
> a little.

Ok. Will try to get this in, hopefully the mochis will keep the corner cases covered.

>   lock.set("foo", 10);
>   lock.clear();
>   lock.get("foo");
 
>   lock.clear();
>   lock.set("foo", 10);
> 
> If the set fails for some reason then the db is still cleared...

In reference to the two above examples: I'm going to make it so that if clear is called on a lock, that is ALL the lock can do. If it's not the first thing called on the lock, the lock fails. If anything is called after it, the lock fails. I'm not going to waste time with trying to get every version of this right for a command specifically implemented for testing alone.

I'm also tempted to add a settings-clear permission to make sure only tests call this, for safety sake.

> Well, I'm just worried about a child spoofing the lock id. As long as it's
> only possible for a child to spoof its own locks (and not the locks of
> another child process) then I think your fix is fine.

Actually, that's a good point. While the message manager is stored with the lock, we don't actually check that the message manager requesting a task on a certain lock ID actually /owns/ that. Will implement that check as part of receive message, and should probably completely kill an app on that case.
Flags: needinfo?(bent.mozilla)
Attachment #8465140 - Flags: review?(bent.mozilla)
Attached patch Patch 1 (v10) - OOP Settings API (obsolete) — Splinter Review
Fixed all the requested issues:

- If a lock runs a clear command, it is now the ONLY thing it can do, and the app has to have special permission to do it. This is to make sure it's only used in tests, and that we don't has atomicity issues with it

- SettingsManager now registers/unregisters for messages as needed, depending on whether or not it has observers or event handlers

- If a message manager requests a task on a lock ID that is valid but that it does not own, the app it is bound to killed.
Attachment #8465140 - Attachment is obsolete: true
Attachment #8472780 - Flags: review?(bent.mozilla)
Flags: needinfo?(bent.mozilla)
Updated tests to use new settings-clear permission, and quarantined clear into its own step.
Attachment #8462292 - Attachment is obsolete: true
Try run of Bug 900551, bug 846200, Bug 1015518.

Looks like there's a race in one of the tests, and I need to add the new MozSettingsTransactionEvent to a few places.

https://tbpl.mozilla.org/?tree=Try&rev=e53cee22e5bf
Fixed all test issues that were showing up with this on tbpl. However, this required editing a file that said I am REQUIRED to get a DOM peer to review this. I realize bent already r+'d this once, but am asking for re-review.
Attachment #8472781 - Attachment is obsolete: true
Attachment #8474072 - Flags: review?(bent.mozilla)
Here's the new try with my mochitest fixes. Note that while mochis greened up, oth still burns, but that's due to stuff in bug 1015518, so I've probably changed something that's making the callbacks I did in SettingsService unhappy, and can fix those over there.

https://tbpl.mozilla.org/?tree=Try&rev=06d5ed56dc1a
Comment on attachment 8472780 [details] [diff] [review]
Patch 1 (v10) - OOP Settings API

Review of attachment 8472780 [details] [diff] [review]:
-----------------------------------------------------------------

I might have a little more to read through here, but there's some stuff that still needs work around readonly transactions and ensuring correct lockIDs.

::: dom/apps/src/PermissionsTable.jsm
@@ +149,5 @@
>                             },
> +                           "settings-clear": {
> +                             app: DENY_ACTION,
> +                             privileged: DENY_ACTION,
> +                             certified: ALLOW_ACTION,

I wonder if you shouldn't leave this at DENY even for certified apps, just add a comment that says it's for testing only?

::: dom/base/DOMRequestHelper.jsm
@@ +16,4 @@
>   * to the child side of frame and process message manager and removing them
>   * when needed.
>   */
> +

Nit: looks like you only have whitespace changes to this file, I'd revert to avoid blame.

::: dom/settings/SettingsManager.js
@@ +109,5 @@
> +  receiveMessage: function(aMessage) {
> +    let msg = aMessage.data;
> +    // SettingsRequestManager broadcasts changes to all locks in the child. If
> +    // our lock isn't being addressed, just return.
> +    // TODO: Is this safe? Should we set up IPDL channel per lock?

Nit: You can remove this TODO, I don't think there's any security concern here since permissions are app-wide.

@@ +180,5 @@
>        throw Components.results.NS_ERROR_ABORT;
>      }
> +    let req = this.createRequest();
> +    let reqID = this.getRequestId({request: req});
> +    cpmm.sendAsyncMessage("Settings:Get", {requestID: reqID,

This could be made nicer by calling your helper function, |this.sendMessage()|. Same for set() and clear() below.

::: dom/settings/SettingsRequestManager.jsm
@@ +45,5 @@
> +    if (DEBUG) debug("Adding message manager permissions");
> +    let mm = aMessage.target;
> +    // In order for mochitests to work, we have to update permissions on every
> +    // lock creation or observer addition. This still means we can cache
> +    // permissions.

We should probably file a followup to do this only when running mochitests (probably by using a pref, "dom.settings.testing" or something).

@@ +103,5 @@
> +  hasClearPermission: function(aMsgMgr) {
> +    return this.checkPermission(aMsgMgr, kSettingsClearPermission);
> +  },
> +  assertSomeReadPermission: function(aMsgMgr) {
> +    return aMsgMgr.assertPermission(kSomeSettingsReadPermission);

Nit: No need for return here.

@@ +145,5 @@
> +    getObjectStore: function(isReadOnly) {
> +      if (DEBUG) debug("Getting transaction for " + this.lockID);
> +      let t;
> +      if (isReadOnly) {
> +        t = aDB._db.transaction(SETTINGSSTORE_NAME, "readonly");

As you noticed since these are all happening in different transactions that run in parallel you'll need to somehow make sure that the callbacks are delivered in the order they were received without relying on idb here.

@@ +154,5 @@
> +        }
> +        t = aDB._db.transaction(SETTINGSSTORE_NAME, "readwrite");
> +      }
> +      if (!t) {
> +        return null;

This shouldn't be necessary, the idb transaction() function will never return null unless it also throws.

@@ +162,5 @@
> +                     }.bind(this);
> +      t.onabort = function () {
> +                    if (DEBUG) debug("Transaction for lock " + this.lockID + " aborted");
> +                    this._failed = true;
> +                  }.bind(this);

Nit: crazy indent here (and a little above too).

@@ +221,5 @@
> +    let binaries = Object.create(null);
> +    let stringified = JSON.stringify(aObject, function(key, value) {
> +      value = this.settingsDB.prepareValue(value);
> +      if (needsUUID(value)) {
> +        let uuid = Cc["@mozilla.org/uuid-generator;1"].getService(Ci.nsIUUIDGenerator)

You already have a lazyServiceGetter for this.

@@ +329,5 @@
> +      }
> +      return defer.resolve({task: aTask, results: results});
> +    };
> +    getReq.onerror = function() {
> +      return defer.reject({task: aTask, error: getReq.error.name});

Do you want to call event.preventDefault() here to keep the IDB error from showing up in the error console? It seems to me like you're "handling" the error here.

@@ +383,5 @@
> +    // potentially receive promise responses out of expected order if
> +    // a get is called before a set. Therefore, we wrap our resolve in
> +    // a null get, which means it will resolves afer the rest of the
> +    // calls queued to the DB.
> +    let getReq = store.mozGetAll(null);

This will be pretty expensive (no key range means *everything*). I'd just do a count().

@@ +584,5 @@
> +  runTasks: function(aLockID) {
> +    if (DEBUG) debug("Running tasks for " + aLockID);
> +    let lock = this.lockInfo[aLockID];
> +    if (lock.finalizing) {
> +      debug("TASKED TRYING TO QUEUE AFTER FINALIZE CALLED. THIS IS BAD. Lock: " + aLockID);

Nit: s/TASKED/TASK/

@@ +589,5 @@
> +      return;
> +    }
> +    let currentTask = lock.tasks.shift();
> +    let promises = [];
> +    while(currentTask) {

Nit: Do a quick search/replace for "if(", "while(", "for(", and "switch(" to make sure you clean these up.

@@ +745,5 @@
> +        if (DEBUG) debug("Return message failed, " + name);
> +      }
> +    }
> +
> +    if (msg && msg.lockID && this.lockInfo[msg.lockID]) {

Hrm, so you should probably also kill the process if it sends a lockID that isn't in your map, and then you probably want to enforce that any of the message types that you expect to have a lockID included actually do. So rather than base the decision to kill or not on whether the message contains a lockID, maybe do something like this:

  switch (aMessage.name) {
    case "Settings:CreateLock":
    case "Settings:Get":
    case "Settings:Set":
    case "Settings:Clear":
    case "Settings:Run":
      if (!msg.lockID ||
          !this.lockInfo[msg.lockID] ||
          mm != this.lockInfo[msg.lockID]._mm) {
        // Kill
      }
    default:
      break;
  }

@@ +750,5 @@
> +      let lock = this.lockInfo[msg.lockID];
> +      if (mm != lock._mm) {
> +        // Kill the app by checking for a non-existent permission
> +        aMessage.target.assertPermission("message-manager-mismatch-kill");
> +        Cu.reportError("Process trying to access settings lock from another process. Killing.");

Below you kill the app after reporting the error, but here you kill first and report second. Does it matter?

@@ +777,5 @@
> +        break;
> +      case "Settings:CreateLock":
> +        if (DEBUG) debug("Received CreateLock for " + msg.lockID);
> +        if (msg.lockID in this.settingsLockQueue) {
> +          debug("ERROR: Trying to queue a lock with the same ID as an already queued lock. Ignoring.");

Seems like this should be fatal...
Comment on attachment 8474072 [details] [diff] [review]
Patch 3 (v4) - Mochitests for New Settings Permissions (r=bent)

Review of attachment 8474072 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on the tests, but:

::: dom/tests/mochitest/general/test_interfaces.html
@@ +706,4 @@
>  // IMPORTANT: Do not change this list without review from a DOM peer!
>      "MozSettingsEvent",
>  // IMPORTANT: Do not change this list without review from a DOM peer!
> +    "MozSettingsTransactionEvent",

I'm a little behind on how we're supposed to do this kind of thing nowadays. Ehsan would probably know better.

We already expose MozSettingsEvent (from long ago) but I wonder if we need to hid this new one behind a pref or something?
Attachment #8474072 - Flags: review?(ehsan)
Attachment #8474072 - Flags: review?(bent.mozilla)
Attachment #8474072 - Flags: review+
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #64)

> ::: dom/settings/SettingsRequestManager.jsm
> @@ +45,5 @@
> > +    if (DEBUG) debug("Adding message manager permissions");
> > +    let mm = aMessage.target;
> > +    // In order for mochitests to work, we have to update permissions on every
> > +    // lock creation or observer addition. This still means we can cache
> > +    // permissions.
> 
> We should probably file a followup to do this only when running mochitests
> (probably by using a pref, "dom.settings.testing" or something).

Actually, that comment is from when I still thought caching was required because permissions access was slow. It just needs to be removed.


> @@ +145,5 @@
> > +    getObjectStore: function(isReadOnly) {
> > +      if (DEBUG) debug("Getting transaction for " + this.lockID);
> > +      let t;
> > +      if (isReadOnly) {
> > +        t = aDB._db.transaction(SETTINGSSTORE_NAME, "readonly");
> 
> As you noticed since these are all happening in different transactions that
> run in parallel you'll need to somehow make sure that the callbacks are
> delivered in the order they were received without relying on idb here.

Ok. New strategy: Use the old strategy. I've got a version working now where we use the same transaction over and over until we can no longer create datastores on it. This is exactly what the version of settings before this did, works fine (all tests green now). All of this pain came from me trying to build transactions that would be set RO/RW depending on whether or not they were only getting "get" requests, but that's just causing too much bookkeeping right now.

> @@ +329,5 @@
> > +      }
> > +      return defer.resolve({task: aTask, results: results});
> > +    };
> > +    getReq.onerror = function() {
> > +      return defer.reject({task: aTask, error: getReq.error.name});
> 
> Do you want to call event.preventDefault() here to keep the IDB error from
> showing up in the error console? It seems to me like you're "handling" the
> error here.

Honestly, I'd doubt this would ever happen, and I'm ok with it showing up on the console because it's be an extraordinary case.

> @@ +383,5 @@
> > +    // potentially receive promise responses out of expected order if
> > +    // a get is called before a set. Therefore, we wrap our resolve in
> > +    // a null get, which means it will resolves afer the rest of the
> > +    // calls queued to the DB.
> > +    let getReq = store.mozGetAll(null);
> 
> This will be pretty expensive (no key range means *everything*). I'd just do
> a count().

Copy/paste accident, mean to turn it into a .get("dummyvalue"). However, that taking forever is what lead me to the out of order return bug, so yay, bugs finding bugs. :)

It's now a count. Works fine that way. Note that count() is not in MDN yet.

> @@ +750,5 @@
> > +      let lock = this.lockInfo[msg.lockID];
> > +      if (mm != lock._mm) {
> > +        // Kill the app by checking for a non-existent permission
> > +        aMessage.target.assertPermission("message-manager-mismatch-kill");
> > +        Cu.reportError("Process trying to access settings lock from another process. Killing.");
> 
> Below you kill the app after reporting the error, but here you kill first
> and report second. Does it matter?

Not actually sure, but I made them match just in case.

> @@ +777,5 @@
> > +        break;
> > +      case "Settings:CreateLock":
> > +        if (DEBUG) debug("Received CreateLock for " + msg.lockID);
> > +        if (msg.lockID in this.settingsLockQueue) {
> > +          debug("ERROR: Trying to queue a lock with the same ID as an already queued lock. Ignoring.");
> 
> Seems like this should be fatal...

And now it is!
Attached patch Patch 1 (v11) - OOP Settings API (obsolete) — Splinter Review
Attachment #8472780 - Attachment is obsolete: true
Attachment #8472780 - Flags: review?(bent.mozilla)
Attachment #8474723 - Flags: review?(bent.mozilla)
Comment on attachment 8474723 [details] [diff] [review]
Patch 1 (v11) - OOP Settings API

Review of attachment 8474723 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/webidl/MozSettingsTransactionEvent.webidl
@@ +4,5 @@
> + * You can obtain one at http://mozilla.org/MPL/2.0/.
> + */
> +
> +[Constructor(DOMString type, optional MozSettingsTransactionEventInit eventInitDict)]
> +interface MozSettingsTransactionEvent : Event

Please use [CheckPermission="settings-read settings-write"] on the interface to avoid exposing it to all web pages.
Comment on attachment 8474072 [details] [diff] [review]
Patch 3 (v4) - Mochitests for New Settings Permissions (r=bent)

Review of attachment 8474072 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/tests/mochitest/general/test_interfaces.html
@@ +706,4 @@
>  // IMPORTANT: Do not change this list without review from a DOM peer!
>      "MozSettingsEvent",
>  // IMPORTANT: Do not change this list without review from a DOM peer!
> +    "MozSettingsTransactionEvent",

Once you've done what I suggested before, you should add a permission property to this line.  See the other examples in the same file.

Note that MozSettingsEvent is a mistake we should not repeat here.
Attachment #8474072 - Flags: review?(ehsan) → review+
Try is green (oranges are the datastore intermittents):

https://tbpl.mozilla.org/?tree=Try&rev=40280e18f79b
Comment on attachment 8474723 [details] [diff] [review]
Patch 1 (v11) - OOP Settings API

Review of attachment 8474723 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/settings/SettingsRequestManager.jsm
@@ +162,5 @@
> +      // Create one transaction with a global permission. This may be
> +      // slightly slower on apps with full settings permissions, but
> +      // it means we don't have to do our own transaction order
> +      // bookkeeping.
> +      if (!SettingsPermissions.hasSomeWritePermission(this._mm)) {

This permission is guaranteed to be set if any individual write permission is granted?

@@ +163,5 @@
> +      // slightly slower on apps with full settings permissions, but
> +      // it means we don't have to do our own transaction order
> +      // bookkeeping.
> +      if (!SettingsPermissions.hasSomeWritePermission(this._mm)) {
> +        this._transaction = aDB._db.transaction(SETTINGSSTORE_NAME, "readonly");

So I don't quite understand how this gets around the parallel readonly transaction problem. The objectStore() call above will throw if you call it outside a success/error callback (just like all idb things) and then you'll create a new transaction. Those two transactions will run in parallel, and the order in which they complete is not guaranteed.

I wonder if the reason that your tests pass now is that we don't have tests of exclusive readonly settings apps? If we don't, and we don't have any apps that have exclusive readonly apps in b2g yet (and they're guaranteed to be certified only) then I think we can just fix this in a followup. Otherwise I guess we could just always make readwrite transactions in the short term...

@@ +291,5 @@
> +    // potentially receive promise responses out of expected order if
> +    // a get is called before a set. Therefore, we wrap our resolve in
> +    // a null get, which means it will resolves afer the rest of the
> +    // calls queued to the DB.
> +    let countReq = store.count();

Ugh, I'm really sorry, but it looks like I told you wrong and this is actually more expensive than a get() for a known-nonexistent key. Let's replace with get(false) or get(0) or something (null won't work).

@@ +304,5 @@
> +      return defer.resolve(aReturnValue);
> +    };
> +    countReq.onerror = function() {
> +      // I have no idea how this would fail, but just in case, we
> +      // reject.

Nit Pretty much anything can fail due to I/O errors, and mobile drivers aren't always awesome! So the comment can go.

@@ +746,5 @@
> +        closedLockIDs.push(lockIDs[i]);
> +      }
> +    }
> +    for (let i in closedLockIDs) {
> +      delete this.lockInfo[closedLockIDs[i]];

I just realized, you should probably abort any outstanding transactions here:

  let transaction = this.lockInfo[closedLockIDs[i]]._transaction;
  if (transaction ) {
    transaction.abort();
  }
  delete this.lockInfo[closedLockIDs[i]];
  // ... all the rest

@@ +775,5 @@
> +      }
> +    }
> +
> +    // For all message types that expect a lockID, we check to make
> +    // sure that we're access a lock that's part of our process. If

Nit: s/access/accessing/

@@ +790,5 @@
> +            !this.lockInfo[msg.lockID] ||
> +            mm != this.lockInfo[msg.lockID]._mm) {
> +          Cu.reportError("Process trying to access settings lock from another process. Killing.");
> +          // Kill the app by checking for a non-existent permission
> +          aMessage.target.assertPermission("message-manager-mismatch-kill");

This is right, but silly. We should have a followup to make a kill() function on nsIProcessChecker.
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #71)
> Comment on attachment 8474723 [details] [diff] [review]
> Patch 1 (v11) - OOP Settings API
> 
> Review of attachment 8474723 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/settings/SettingsRequestManager.jsm
> @@ +162,5 @@
> > +      // Create one transaction with a global permission. This may be
> > +      // slightly slower on apps with full settings permissions, but
> > +      // it means we don't have to do our own transaction order
> > +      // bookkeeping.
> > +      if (!SettingsPermissions.hasSomeWritePermission(this._mm)) {
> 
> This permission is guaranteed to be set if any individual write permission
> is granted?

Yes. For any settings permission, you also get the equivilent settings-api permission. So if you set settings:wallpaper.image-write for an app, it'll also get settings-api-write.

> I wonder if the reason that your tests pass now is that we don't have tests
> of exclusive readonly settings apps? If we don't, and we don't have any apps
> that have exclusive readonly apps in b2g yet (and they're guaranteed to be
> certified only) then I think we can just fix this in a followup. Otherwise I
> guess we could just always make readwrite transactions in the short term...

We actually already have quite a few settings apps with readonly access. FM, costcontrol, etc. But yeah, I don't think we've had any purely readonly tests though. That'd be a good thing to put together.

I'm not actually sure how I can get parallel transactions here though? In SettingsRequestManager, whenever I create a transaction, it'll be used for multiple queries in runTasks. Once everything has been loaded onto that transaction, runTasks will exit, and queueConsume will get called to queue the next lock to go through runTasks. queueConsume uses a thread dispatch to make sure we return to the event loop before running the next set of tasks, meaning we should exhaust our transaction before the next set of queries. That means we should never have two transactions alive at the same time, right?

> Ugh, I'm really sorry, but it looks like I told you wrong and this is
> actually more expensive than a get() for a known-nonexistent key. Let's
> replace with get(false) or get(0) or something (null won't work).

Easy enough fix.

> This is right, but silly. We should have a followup to make a kill()
> function on nsIProcessChecker.

Bug 1055427
Attached patch Patch 1 (v12) - OOP Settings API (obsolete) — Splinter Review
Fixed all nits and issues with dummy count versus get.
Attachment #8474723 - Attachment is obsolete: true
Attachment #8474723 - Flags: review?(bent.mozilla)
Attachment #8475004 - Flags: review?(bent.mozilla)
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #72)
> Yes. For any settings permission, you also get the equivilent settings-api
> permission. So if you set settings:wallpaper.image-write for an app, it'll
> also get settings-api-write.

Wait, this sounds like it defeats the purpose. The goal here is to enable an app to only have permission to read (for example) wallpaper.image, but not permission to use other aspects of the settings API.

Or does settings-api-write not grant permission to access other parts of the settings API?
(In reply to Jonas Sicking (:sicking) from comment #74)
> (In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #72)
> > Yes. For any settings permission, you also get the equivilent settings-api
> > permission. So if you set settings:wallpaper.image-write for an app, it'll
> > also get settings-api-write.
> 
> Wait, this sounds like it defeats the purpose. The goal here is to enable an
> app to only have permission to read (for example) wallpaper.image, but not
> permission to use other aspects of the settings API.
> 
> Or does settings-api-write not grant permission to access other parts of the
> settings API?

I'm also curious about this. Settings DB stores some personal and sensitive data like passcode, a privilege app should not be able to access this kind of data if it does not ask for the permission explicitly.
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #72)
> I'm not actually sure how I can get parallel transactions here though?

Well, you can definitely have multiple transactions alive at once for the same lock, but I think I have convinced myself that you can't actually have racing results any more.
Comment on attachment 8475004 [details] [diff] [review]
Patch 1 (v12) - OOP Settings API

Review of attachment 8475004 [details] [diff] [review]:
-----------------------------------------------------------------

I think Jonas is right, the permissions need to be tweaked a bit.

::: dom/settings/SettingsRequestManager.jsm
@@ +421,5 @@
> +      lock.queuedSets[key] = data.settings[key];
> +    }
> +
> +    let store = lock.objectStore;
> +    if (!store) {

This block looks redundant, you're going to do this at the beginning of queueTaskReturn().
(In reply to Jonas Sicking (:sicking) from comment #74)
> (In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #72)
> > Yes. For any settings permission, you also get the equivilent settings-api
> > permission. So if you set settings:wallpaper.image-write for an app, it'll
> > also get settings-api-write.
> 
> Wait, this sounds like it defeats the purpose. The goal here is to enable an
> app to only have permission to read (for example) wallpaper.image, but not
> permission to use other aspects of the settings API.
> 
> Or does settings-api-write not grant permission to access other parts of the
> settings API?

No. This is exactly what we talked about 3 months ago when I implemented it. To be able to access the wallpaper.image setting, you would need the settings:wallpaper.image-write permission, but you ALSO need to be able to someway access the settings API to be able to call set. So you get the settings-api-write permission, which opens up the API for calls and means, sure, you can call settings.get() for anything you please, but you're just going to get back errors (not crash, because, like we discussed 3 months ago, we said errors are ok now). The only call that will WORK is create lock then a set().

Are you wanting to block at the API level, so calls to get() don't even resolve as a function? I didn't think we could do checkpermissions at that level in webidl?
BTW, the permissions are tested in Patch 3 of this bug, the new test_settings_permissions.html file. Check that out, as it shows in code what things should happen.
Attached patch Patch 1 (v13) - OOP Settings API (obsolete) — Splinter Review
Removed extra store retrieval.
Attachment #8475004 - Attachment is obsolete: true
Attachment #8475004 - Flags: review?(bent.mozilla)
Attachment #8475245 - Flags: review?(bent.mozilla)
Comment on attachment 8475245 [details] [diff] [review]
Patch 1 (v13) - OOP Settings API

Review of attachment 8475245 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, this looks good to me. Thanks for ironing everything out!
Attachment #8475245 - Flags: review?(bent.mozilla) → review+
Kyle got me straightened out on irc. The permissions part sounds great.
Depends on: 1055898
Blocks: 1053733
Target Milestone: 2.1 S2 (15aug) → 2.1 S3 (29aug)
Right now, we're still held up on Marionette WebAPI tests. As far as I can tell, there's some lock in the system app that's going redundant and opening itself forever. I'm not sure why this is, as the emulator builds boot just fine, this only seems to happen during test runs. Once we get that ironed out I think we should be ready to go, but it's VERY slow going due to emulator build times and running in debug taking 10+ minutes
Finally identified the issue with Marionette WebAPI tests. In bug 968709, a change was checked in to "navigate" to the system app in order for some tests to work. However, "navigating" to the system app restarts it on a live process. Somehow, this still left us with a system that could pass marionettes before, but seems to asking for a lot of undefined behavior. With the settings api oop patch in, something about doing this now causes us to orphan a lock which we can't clean up. I suspect this is due to the system app living in the parent, and therefore talking to SettingsService, not SettingsManager, so we never receive an "child-process-shutdown" message when it dies, and we leave locks on the queue.

I have a try in that backs out what was landed in bug 968709. I can also /try/ to figure out a way to detect the system app shutdown and clear out its locks, but this seems like it should get fixed on both sides.

https://tbpl.mozilla.org/?tree=Try&rev=a6afc96afa60
Via :fabrice and :lightsofapollo, we /might/ be able to fix this on our side by using the system apps onunload event to clear its locks out of the SettingsRequestManager. However, currently locks have no idea of the domain/app they come from, since they index themselves on their message manager. Since system uses SettingsManager, it's just going to be the parent process cpmm, which never dies. :|
More talking to :lightsofapollo. SettingsManagers should register for the window.onunload event of their owning app. Most times for oop apps, I don't think they'll live long enough to do anything, but that'll be handled by the child-process-shutdown message in SettingsRequestManager. However, for apps that live in the parent process (system, keyboard, etc), this means we can catch restarts and clean up.
Jamin, just curious if you've figured out a cleaner way to switch to the system app for the bluetooth qemu tests? Using navigate() seems to be doing really weird things to the system since I /think/ it restarts the system app. It seems like we need something similar to switchToApp from marionette-js. I'm going to talk to the b2g automation team tomorrow about this too, just curious if you had any ideas.
Flags: needinfo?(jaliu)
Kyle,
Thank you for your time on this matter.
Currently, I'm not sure if I can provide a better way to switch to system app.
But I do think it's weird to use 'navigate()' for Bluetooth test set.
As you mentioned, it would be nice to have something similar to 'switchToApp'.

Currently, Bluetooth marionette tests can be performed in any order. 
However, BT tests can't be performed in staggered order with other marionette unit-tests which would change the URL of Marionette client.

The reason is that marionette client has to stay in system app to receive the system message "bluetooth-pairing-request".

The limitation should be fixed in future and it could be achieved by these ways. 
Method 1. Make sure every marionette test restore the URL before the test ends.
- It could be achieved by calling 'Marionette.go_back()' in the end of test.

Method 2. Reset the URL of marionette client at tearDown() or setUp() function in marionette_test.py
- Cache original URL at 'setUp()' by calling 'Marionette.get_url()' and restore URL at 'tearDown()' by 'Marionette.go_back()', 'Marionette.navigate' or something else.

Method 3. Provide an utility function which is similar 'switchToApp' and let whoever needs it to call it. 

Method 4. Remove system message from Bluetooth implementation.

(I'm not sure about this. But method 1 and 2 might be suffered by the same issue described at #Commet 89)

Bluetooth team are working on Bluetooth API refinement which would remove system message (i.e. method 4) and the new APIs are targeted on B2G 2.2.
Until then, I'm glad to try different approach if anyone has any suggestions. :)

Thanks.
Flags: needinfo?(jaliu)
May we know the ETA for this bug?
I'm asking because it's blocking bug 843452, which is also 2.1 committed feature.
(In reply to Jamin Liu [:jaliu] from comment #93)

> Method 3. Provide an utility function which is similar 'switchToApp' and let
> whoever needs it to call it. 

I talked to mdas this morning. In the python script, calling self.marionette.switch_to_frame() should work, as a null parameter will return us to the top frame, which in the case of b2g is the system app. I'm trying this now but unfortunately not having much luck.
ni?'ing mdas because she asked me to.
Flags: needinfo?(mdas)
Depends on: 1058158
While we work on getting bug 1058158 stable, I'm also trying to diagnose a permafailure in dom/system/gonk/test/test_data_connection.js. This is definitely something this patch is causing, but I can't quite figure out why. Those seem to be our two big issues now.
Hey Edgar, I seem to have broken your test (or, well, yours by way of git blame). :)

I'm seeing a permafailure in dom/system/gonk/tests/marionette/test_data_connection.js on the ICS emulator with the patches in bugs 846200, 900551, and 1015518 applied to m-c. I've put the gecko logs with RIL debug on in this bug. For some reason, with my settings API patches applied, the tests die in the non-default call test. It makes it through the mms call, then dies on supl. However, I can't figure out why. The logs make it look like we're expecting a message to come back from the RIL daemon that never shows up once my patches are applied. Could you take a look at the logs real quick and see if I'm missing anything obvious? I'm guessing something is going wrong with apn settings, or else one of my "internal message" changes in bug 1015518 is causing problems between settings and ril.
Flags: needinfo?(echen)
Hi Kyle:

I saw the sequence of onsuccess callback and observer event is changed. Please see below comments.
Not sure if it is an expected behavior.

log of working (Attachment 8478831 [details]):

// Marionette test changes the apn setting.
08-26 02:04:52.796 I/Gecko   (   44): -*- SettingsManager: creating lock
08-26 02:04:52.796 I/Gecko   (   44): -*- SettingsManager: settings lock init
08-26 02:04:52.816 I/Gecko   (   44): -*- SettingsManager: send: {"ril.data.apnSettings":[[{"carrier":"T-Mobile US","apn":"epc.tmobile.com","mmsc":"http://mms.msg.eng.t-mobile.com/mms/wapenc","types":["default","supl","mms","ims","dun"]}]]}

08-26 02:14:51.146 I/Gecko   (   44): MARIONETTE TEST RESULT:TEST-PASS | test_data_connection.js | aRequest is instanceof function DOMRequest() {

// RIL receive the observer event, mozsettings-changedm and apply the new apn.
08-26 02:14:51.216 I/Gecko   (   44): -*- DataConnectionManager: 'ril.data.apnSettings' is now [[{"carrier":"T-Mobile US","apn":"epc.tmobile.com","mmsc":"http://mms.msg.eng.t-mobile.com/mms/wapenc","types":["default","supl","mms","ims","dun"]}]]

// onsuccess callback is triggered and marionette start the test.
08-26 02:14:51.356 I/Gecko   (   44): MARIONETTE LOG: INFO: setSettings({"ril.data.apnSettings":[[{"carrier":"T-Mobile US","apn":"epc.tmobile.com","mmsc":"http://mms.msg.eng.t-mobile.com/mms/wapenc","types":["default","supl","mms","ims","dun"]}]]}) - success

======================

Log of broken (Attachment 8478833 [details]):

// Marionette test changes the apn setting.
08-26 02:04:52.796 I/Gecko   (   44): -*- SettingsManager: creating lock
08-26 02:04:52.796 I/Gecko   (   44): -*- SettingsManager: settings lock init
08-26 02:04:52.816 I/Gecko   (   44): -*- SettingsManager: send: {"ril.data.apnSettings":[[{"carrier":"T-Mobile US","apn":"epc.tmobile.com","mmsc":"http://mms.msg.eng.t-mobile.com/mms/wapenc","types":["default","supl","mms","ims","dun"]}]]}
08-26 02:04:52.825 I/Gecko   (   44): MARIONETTE TEST RESULT:TEST-PASS | test_data_connection.js | aRequest is instanceof function DOMRequest() {

// onsuccess callback is triggered and marionette start the test.
// But RIL doesn't apply the new apn yet.
08-26 02:04:53.156 I/Gecko   (   44): MARIONETTE LOG: INFO: setSettings({"ril.data.apnSettings":[[{"carrier":"T-Mobile US","apn":"epc.tmobile.com","mmsc":"http://mms.msg.eng.t-mobile.com/mms/wapenc","types":["default","supl","mms","ims","dun"]}]]}) - success

// RIL receive the observer event and apply the new apn.
08-26 02:04:53.526 I/Gecko   (   44): -*- SettingsRequestManager: Finished Broadcasting
08-26 02:04:53.536 I/Gecko   (   44): -*- DataConnectionManager: 'ril.data.apnSettings' is now [[{"carrier":"T-Mobile US","apn":"epc.tmobile.com","mmsc":"http://mms.msg.eng.t-mobile.com/mms/wapenc","types":["default","supl","mms","ims","dun"]}]]
Flags: needinfo?(echen)
Thanks Edgar, that was exactly the problem! Due to the way we're having to deal with Settings in an OOP context now, we instantly return on a .set() call, and cache the value until all calls on the lock have been made. We don't actually fire the observers until the end of the lock transaction. We now have new events to signal whether a lock transaction has succeeded or failed. I moved the setSettings function to use these events instead of just the promise on the .set(), and now things work fine! Going to do a bit more testing them will put this change in for review.
Attachment #8478831 - Attachment is obsolete: true
Attachment #8478833 - Attachment is obsolete: true
Attachment #8479224 - Flags: review?(echen)
Updating oop settings patch with transaction abort catch
Attachment #8475245 - Attachment is obsolete: true
No longer blocks: 1053733
Comment on attachment 8479224 [details] [diff] [review]
Patch 4 (v1) - Make dom/system/gonk marionette tests use transaction events

Review of attachment 8479224 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with below comments addressed.
Thank you. :)

::: dom/system/gonk/tests/marionette/head.js
@@ +101,5 @@
> +    deferred.resolve();
> +  };
> +  lock.onsettingstransactionfailure = function () {
> +    ok(aAllowError, "setSettings(" + JSON.stringify(settings) + ") - error");
> +    deferred.reject();

deferred.resolve() instead.
This helper function never rejects, it only reports error if aAllowError is false.
Attachment #8479224 - Flags: review?(echen) → review+
Depends on: 1059079
Found more settings timing issues in wifi and tethering, so Dimi and Henry, if you could check the files you each wrote and just make sure things look ok, that'd be great. (See Comment 101 and Comment 102 for more info on why this needs to happen this way)
Attachment #8479224 - Attachment is obsolete: true
Attachment #8479600 - Flags: review?(hchang)
Attachment #8479600 - Flags: review?(dlee)
Somehow the geolocation fixes got kicked out of this patch at some point, fixing in bug just to make sure they're replicated.
Attachment #8474072 - Attachment is obsolete: true
Disabling tests is about my least favorite thing to do, and the only reason I'm asking we do this is so I can land settings before FL on friday. We've got Bug 1058158 filed and in action right now, and we're getting close to running the B2G tests in a container which will fix this problem in the correct way. However, it's failing in weird ways on the emulator so our iteration time is very, very slow. I'm hoping to have Bug 1058158 finished by the end of the week though, at which point we'll reenable the tests and everything should hopefully run more smoothly.

I've talked to multiple people around gaia and automation about this and they're fine with it, just want to get Jamin's r+ on it too so the original author knows it's off and can poke me if it doesn't get reenabled quickly. :)
Attachment #8479605 - Flags: review?(jaliu)
Comment on attachment 8479605 [details] [diff] [review]
Patch 5 (v1) - Disable bluetooth push tests until Bug 1058158 is fixed

Review of attachment 8479605 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/bluetooth/tests/marionette/manifest.ini
@@ +3,5 @@
>  browser = false
>  qemu = true
>  
> +# Will uncomment these tests once they are fixed in Bug 1058158
> +# [test_navigate_to_default_url.py]

Nit: Use "disable" to skip the test might be better. E.g.
http://dxr.mozilla.org/mozilla-central/source/dom/mobileconnection/tests/marionette/manifest.ini#30

Doc:
http://mozbase.readthedocs.org/en/latest/manifestparser.html
Fixed tests to use disabled instead of just commenting them out. Thanks :TYLin!
Attachment #8479605 - Attachment is obsolete: true
Attachment #8479605 - Flags: review?(jaliu)
Attachment #8479611 - Flags: review?(jaliu)
Comment on attachment 8479600 [details] [diff] [review]
Patch 4 (v2) - Make marionette webapi tests use transaction events; r=edgar

Review of attachment 8479600 [details] [diff] [review]:
-----------------------------------------------------------------

I definitely grant this changes. However, as far as I know, some telephony 
test cases also make use of settings API. Do we need to modify them as well?
Attachment #8479600 - Flags: review?(hchang) → review+
Forgot to update settings-api-* permissions in tethering tests.
Attachment #8479600 - Attachment is obsolete: true
Attachment #8479600 - Flags: review?(dlee)
Attachment #8479622 - Flags: review?(hchang)
Attachment #8479622 - Flags: review?(dlee)
Comment on attachment 8479611 [details] [diff] [review]
Patch 5 (v2) - Disable bluetooth push tests until Bug 1058158 is fixed

Review of attachment 8479611 [details] [diff] [review]:
-----------------------------------------------------------------

Kyle,
Thank you for your effort. :)
I think it's fine to disable [test_dom_BluetoothAdapter_pair.js] for now since the progress of Bug 1058158 seems to be good.
Attachment #8479611 - Flags: review?(jaliu) → review+
Comment on attachment 8479622 [details] [diff] [review]
Patch 4 (v3) - Make marionette webapi tests use transaction events; r=edgar

Review of attachment 8479622 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me for tethering part, thanks Kyle!
Attachment #8479622 - Flags: review?(dlee) → review+
ah, I asked for ni? before we realized what the problem related to testing was. Been working on Bug 1058158 to fix the issue.
Flags: needinfo?(mdas)
As of bug 1059079, we no longer red Mnw on timeouts. I'm a little worried about how much startup time the settings api might be adding if we're tripping homescreen timeouts, but the emulator is a horrible way to test that.

It turns out that the tethering tests also suffer from not being run in a container with the new settings api (They open in an odd context that then gets screwed up when the system app goes away). Adding test_container = true (which comes in as part of bug 1058158) to the manifest made things run fine even through multiple runs, though. So, I'm going to ride the r+ I already got on the bluetooth to also disable the tethering test until 1058158 comes in. This SHOULD clear us to land settings as soon as the trees open again, though I'm running one last Mnw try to make sure.
Attachment #8479622 - Flags: review?(hchang) → review+
It's rare that I'll throw a patch back for review 2 weeks later, but there were some BIG holes in this.

- SettingsManager registered _callbacks as a prototype member instead of a object member. This meant that in in-process situations, ALL SettingsManagers shared the same notifier callback list. Don't know where we'd run this API in process, but it'd be a bad idea if we did. It was also causing issues with marionettes.

- Similarly, we never handled inner-window-destroyed in SettingsManager for in-process cases. This could leave locks orphaned if their SettingsManager just disappeared without finalizing them. SettingsManager now finalizes on inner-window-destroyed.

- We never handled the case of exceptions being called or program flow being altered (marionetteScriptFinish) in DOMRequest FireSuccess/Error in SettingsManager. Since runOrFinalize was only called after callbacks, this meant we could once again orphan locks. We now throw runOrFinalize on the event loop to run after callbacks, so even if something fails in them, we should complete.

When run locally, these last two things seem to alleviate many of the issues I was seeing with marionette tests. We'll see if that holds true on try.
Attachment #8479226 - Attachment is obsolete: true
Attachment #8481098 - Flags: review?(bent.mozilla)
Attached patch Patch 1 interdiff for v14 to v15 (obsolete) — Splinter Review
Attachment #8481394 - Flags: review?(bent.mozilla)
Attachment #8481394 - Attachment is obsolete: true
Attachment #8481394 - Flags: review?(bent.mozilla)
Attachment #8481401 - Flags: review?(bent.mozilla)
Attachment #8481401 - Attachment is obsolete: true
Attachment #8481401 - Flags: review?(bent.mozilla)
Attachment #8481402 - Flags: review?(bent.mozilla)
Comment on attachment 8481402 [details] [diff] [review]
Patch 1 interdiff for v14 to v15 (v3)

Review of attachment 8481402 [details] [diff] [review]:
-----------------------------------------------------------------

This all makes good sense to me.
Attachment #8481402 - Flags: review?(bent.mozilla) → review+
Keywords: dev-doc-needed
We think this has caused bug 1061107 again.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reclosing because I'm pretty sure bug 1061107 (which is actually bug 1053107) is caused by Bug 1059662, so they're backing out Bluetooth embedding in gaia (I don't have a bug for that). Once that happens, if settings is STILL crashing bluetooth, then we can reopen this or whatever. I think this landing just made the error from bug 1053107 show up in opt as well as debug due to the permissions changes to the API.
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
We filed bug 1061437 for backing out the patches of bluetooth embedding, FYR.
Component: Gaia::Settings → DOM: Device Interfaces
Product: Firefox OS → Core
Comment on attachment 8481098 [details] [diff] [review]
Patch 1 (v15) - OOP Settings API

Interdiff was reviewed so removing r?
Attachment #8481098 - Flags: review?(bent.mozilla)
Depends on: 1106324
blocking-b2g: backlog → ---
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: