Opening the Loop panel before automatic registration causes rooms not to be displayed

VERIFIED FIXED in Firefox 35

Status

defect
P1
normal
VERIFIED FIXED
5 years ago
4 years ago

People

(Reporter: standard8, Assigned: MattN)

Tracking

unspecified
mozilla36
Points:
2
Dependency tree / graph
Bug Flags:
firefox-backlog +
qe-verify +

Firefox Tracking Flags

(firefox35 verified, firefox36 verified)

Details

(Whiteboard: [rooms])

Attachments

(1 attachment, 4 obsolete attachments)

Reporter

Description

5 years ago
STR:

1) In a profile with rooms enabled, create some rooms.
2) Restart Firefox, and open the panel as soon as it comes up.
3) Switch to the Rooms display

Expected Results

After registration completes, the rooms are displayed

Actual Results

No rooms are displayed. Lots of errors on the console along the lines of:

Failed to register with Loop server with sessionType 1" Error: error: channel already registered:
onRegistered@resource:///modules/loop/MozLoopService.jsm:330:20
MozLoopPushHandler.register@resource:///modules/loop/MozLoopPushHandler.jsm:116:7
MozLoopServiceInternal.promiseRegisteredWithPushServer/registerForNotification/<@resource:///modules/loop/MozLoopService.jsm:344:9
Reporter

Updated

5 years ago
Iteration: --- → 36.2
Priority: -- → P1
Assignee: nobody → mdeboer
Status: NEW → ASSIGNED
Points: --- → 2
Whiteboard: [rooms]
Flags: qe-verify+
Flags: needinfo?(mmucci)
Flags: firefox-backlog+
Reporter

Comment 3

5 years ago
Comment on attachment 8518868 [details] [diff] [review]
Patch v2: deal with race condition when the push server returns an error when attempting to register the same channel twice

Review of attachment 8518868 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/loop/LoopRooms.jsm
@@ +130,5 @@
>        version = null;
>      }
>  
>      Task.spawn(function* () {
> +      yield MozLoopService.promiseRegisteredWithServers(LOOP_SESSION_TYPE.FXA);

The additional parameter here isn't used, and I can't understand why it would be needed.

Was this change a mistake?

::: browser/components/loop/MozLoopService.jsm
@@ +318,5 @@
>        log.debug("registerForNotification", channelID);
>        return new Promise((resolve, reject) => {
>          function onRegistered(error, pushUrl) {
>            log.debug("registerForNotification onRegistered:", error, pushUrl);
> +          let raceURL = MozLoopServiceInternal.pushHandler.registeredChannels[channelID];

This needs a comment to explain what's happening - the fact that we may be handling urls that have already been registered due to timing issues.
As discussed over Vidyo, things are a bit hairier than I expected:

 - `promiseRegisteredWithServers(1)` and `promiseRegisteredWithServers(2)` are racing.
 - Both invoke `promiseRegisteredWithPushServer()`. The first wins the race but takes longer to return to the client with the pushURL, so the second one - loser - gets back to the client earlier with an error message, saying the channel is already registered.
 - OUCH!
Attachment #8518868 - Attachment is obsolete: true
Attachment #8518868 - Flags: review?(standard8)
Attachment #8518919 - Flags: review?(standard8)
Added to IT 36.2
Flags: needinfo?(mmucci)
Reporter

Comment 6

5 years ago
This seems to fail xpcshell-tests:

 0:04.36 TEST_STATUS: Thread-37 test_register_websocket_success_loop_server_fail FAIL [test_register_websocket_success_loop_server_fail : 25] Expected no errors in websocket registration - "undefined" == "404"
/Users/mark/loop/gecko-dev/objdir-ff/_tests/xpcshell/browser/components/loop/test/xpcshell/test_loopservice_registration.js:test_register_websocket_success_loop_server_fail/<:25
resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:Handler.prototype.process:871
resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:this.PromiseWalker.walkerLoop:747
/Users/mark/loop/gecko-dev/testing/xpcshell/head.js:_do_main:184
/Users/mark/loop/gecko-dev/testing/xpcshell/head.js:_execute_test:372
Marco, please re-assign this bug to me.
Assignee: mdeboer → jaws
Flags: needinfo?(mmucci)
Added to IT 36.2
Flags: needinfo?(mmucci)
Comment on attachment 8518919 [details] [diff] [review]
Patch v3: deal with race condition when the push server returns an error when attempting to register the same channel twice

Review of attachment 8518919 [details] [diff] [review]:
-----------------------------------------------------------------

I think another solution to consider is changing promiseRegisteredWithPushServer to do the FxA and guest push registrations separately (with a sessionType arg). Then we would use deferredRegistrations instead of adding a new deferred variable that we have to keep in sync.

::: browser/components/loop/MozLoopService.jsm
@@ +106,5 @@
>  let gFxAOAuthClient = null;
>  let gErrors = new Map();
>  let gLastWindowId = 0;
>  let gConversationWindowData = new Map();
> +let gRegisterWithPushServerPromise = null;

I think we should stop adding globals and add this as a private member as discussed in bug 1079198 and as I already started. It's much easier to test a member over a non-exported global.
/r/383 - Bug 1095379 - WIP to separate push registrations by sessionType.

Pull down this commit:

hg pull review -r 228c6e602fc199e1ca99db7af85ee4711159ea52
Comment on attachment 8520253 [details]
MozReview Request: bz://1095379/MattN

I think this approach is easier to maintain and it's something I wanted to do anyways but left it as-is before to minimize changes.
Attachment #8520253 - Attachment description: MozReview Request: bz://1095379/MattN → MozReview Request: bz://1095379/MattN (Option 4)
Attachment #8520253 - Flags: feedback?(standard8)
Iteration: 36.2 → 36.3
We are already planning to fix this for Rooms Fx35, but the blocking flag wasn't set.

Currently, the decision is whether to go with Mike's approach or MattN's approach to fix this bug.
backlog: --- → Fx35+
(In reply to Maire Reavy [:mreavy] (Plz needinfo me) from comment #13)
> We are already planning to fix this for Rooms Fx35, but the blocking flag
> wasn't set.
> 
> Currently, the decision is whether to go with Mike's approach or MattN's
> approach to fix this bug.

Mark -- as reviewer, which approach do you prefer?
Flags: needinfo?(standard8)
Reporter

Comment 15

5 years ago
I'll look at this in the morning.
Flags: needinfo?(standard8)
Reporter

Comment 16

5 years ago
Comment on attachment 8520253 [details]
MozReview Request: bz://1095379/MattN

Ok, I like this version, as it's reducing the registrations when we don't need them.

However, I'm still slightly concerned that we're leaving ourselves open to a bug which Mike's second patch was trying to fix - namely that of stopping reentrancy failing.

Although I don't think that can happen with the current flow, I'd prefer to remove the possibility now, rather than finding another issue with it later.

Do you think it'd be possible to come up with a patch that combines the two approaches?
Attachment #8520253 - Flags: feedback?(standard8)
Reporter

Updated

5 years ago
Attachment #8518919 - Flags: review?(standard8)
Attachment #8520253 - Attachment description: MozReview Request: bz://1095379/MattN (Option 4) → MozReview Request: bz://1095379/MattN
Attachment #8520253 - Flags: review?(standard8)
/r/383 - Bug 1095379 - Separate push registrations by sessionType and prevent calling promiseRegisteredWithPushServer from outside of promiseRegisteredWithServers. r=Standard8

Pull down this commit:

hg pull review -r df971242191b196191d97e91523860fa2d98870d
Attachment #8518919 - Attachment is obsolete: true
Assignee: jaws → MattN+bmo
I confirmed that the patch fixes comment 0
Reporter

Updated

5 years ago
Attachment #8520253 - Flags: review?(standard8) → review+
Reporter

Comment 19

5 years ago
https://reviewboard.mozilla.org/r/381/#review241

Looks good, seems to work in the various scenarios I tested.
https://hg.mozilla.org/mozilla-central/rev/4a505fb25edb
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [rooms][fixed-in-fx-team] → [rooms]
Target Milestone: --- → mozilla36
qe-verify- because this already has test coverage. Please need-info me if this needs testing beyond that coverage.
Flags: qe-verify+ → qe-verify-
Anthony, it doesn't have test coverage. There were fixes to a test but no regression test for the fix.
Flags: qe-verify?
Flags: qe-verify-
Flags: needinfo?(anthony.s.hughes)
Flags: in-testsuite+
(In reply to Matthew N. [:MattN] (UTC+1 until Nov. 22; away Nov. 14-19) from comment #24)
> Anthony, it doesn't have test coverage. There were fixes to a test but no
> regression test for the fix.

Certainly something we can verify then, but can this have additional test coverage added? It's not really something we should be spending QE time on repeatedly checking indefinitely.
Flags: qe-verify?
Flags: qe-verify+
Flags: needinfo?(anthony.s.hughes)
Verified fixed 36.0a2 (2014-12-17), 35b4 Win 7
Status: RESOLVED → VERIFIED
Attachment #8520253 - Attachment is obsolete: true
Attachment #8618569 - Flags: review+
You need to log in before you can comment on or make changes to this bug.