Closed Bug 1112786 Opened 7 years ago Closed 6 years ago

cannot create new rooms

Categories

(Hello (Loop) :: Client, defect, P2)

x86
macOS
defect
Points:
1

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1150052
Iteration:
41.1 - May 25
Blocking Flags:
backlog backlog+

People

(Reporter: blassey, Assigned: standard8)

References

Details

(Whiteboard: [investigating][watch])

If I click on the Hello icon and then click on the "start a conversation" button, the dialog disappears and nothing happens.
jesup suggested restarting, which I did and it fixed this. We should probably have better error messaging though.
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #1)
> jesup suggested restarting, which I did and it fixed this. We should
> probably have better error messaging though.

Definitely.  My belief is that the server had an issue, and we didn't recover well from it. And the browser restart fixed it.  (0.13.2 has several bugs.)   The 0.14.1 server update is still not out.  It was going to go out today;  it should go out tomorrow.

You also may consider turning on additional prefs for debugging, which are described here: https://wiki.mozilla.org/Loop/Logging  (debug.loglevel set to All and debug.websocket set to True are probably the most useful.)
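For anyone who prefers setting these through a profile's user.js rather than about:config, here is a minimal sketch that prints the matching user_pref lines. The full pref names are an assumption: the "loop." prefix is taken from the loop.debug.* prefs mentioned elsewhere in this bug, and the wiki page above remains the authoritative list. The toUserJs helper is hypothetical and exists only for this illustration.

```javascript
// Assumed pref names: "loop." prefix inferred from the loop.debug.* prefs
// mentioned in this bug; check https://wiki.mozilla.org/Loop/Logging first.
const debugPrefs = {
  "loop.debug.loglevel": "All",
  "loop.debug.websocket": true,
};

// Hypothetical helper: turn a map of prefs into user.js lines.
function toUserJs(prefs) {
  return Object.entries(prefs)
    .map(([name, value]) =>
      `user_pref(${JSON.stringify(name)}, ${JSON.stringify(value)});`)
    .join("\n");
}

// Paste the printed lines into user.js in the profile directory.
console.log(toUserJs(debugPrefs));
```

The same values can be set by hand in about:config, which avoids touching user.js at all.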

During the work week and during the QPR with TokBox, we said we need to improve error messages (especially as a result of server problems --- and in the process verify that the client recovers well from those problems), improve our metrics, and pay down tech debt immediately (in Fx37 and Fx38).

If this happens again, please feel free to ping folks in #loop (me, standard8, dmose) while it's happening, and we'll try to track it down with you.  (I realize it's not always convenient/possible to do that, but when you can, that'd be appreciated.)
Getting debug output when clicking on the button, with all the loop.debug.* prefs set (except for loop.debug.sdk maybe) would be most useful.

It would also be useful to have all of the browser console logging on as well.

There are a few places the failure could be, so I can't point at anywhere specific without a bit more detail.
Depends on whether the latest server reconnect fixes help, whether there are logs when reproducing, and how many times it happens again...
backlog: --- → Fx36+
Priority: -- → P2
Whiteboard: [investigating]
I'm bumping this down to backlog and "watch" while we keep investigating.  The server has just seen a major upgrade (lots of bug fixes).  I'm very interested to know if anyone can repro this post-server upgrade.
backlog: Fx36+ → backlog
Priority: P2 → --
Whiteboard: [investigating] → [investigating][watch]
I've just been talking to Brad about his latest issue. This is characterised by:

"[Dispatcher] Dispatching action" [object DeadObject] dispatcher.js:71:8
"Failed to clone value:" TypeError: cb is null
Stack trace:
openChat@chrome://browser/content/socialchat.xml:541:1
Chat.open@resource://app/modules/Chat.jsm:119:19
MozLoopServiceInternal.openChatWindow@resource:///modules/loop/MozLoopService.jsm:900:5
this.MozLoopService.openChatWindow@resource:///modules/loop/MozLoopService.jsm:1236:12
LoopRoomsInternal.open@resource://app/modules/loop/LoopRooms.jsm:331:5
this.LoopRooms.open@resource://app/modules/loop/LoopRooms.jsm:541:12
injectObjectAPI/</injectedAPI[func]@resource://app/modules/loop/MozLoopAPI.jsm:157:54

Note: the cb stands for "chatbox".

From my reading of socialchat.xml, it looks like we may be getting a situation where a chatbox has been closed so the window has gone away, but not completely - socialchat is keeping a weak reference to the chatbox and that has gone away, but socialchat doesn't appear to have been fully updated.

I'm also a little worried about the dispatching of an action which is a DeadObject. My first assumption was that the console just didn't have access to the object by the time it got around to displaying the result (there's a known bug where the browser console doesn't display everything it should when a window is closed), but maybe it's a bit of a clue to what is going on.
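
To illustrate the stale weak reference theory above, here is a minimal sketch of the failure mode and one possible guard. It is not the socialchat.xml code: FakeWeakRef, chatboxes, createChatbox and this openChat are all hypothetical names that only model a registry holding weak references to chatboxes, where a reference can go dead while its registry entry remains.

```javascript
// Hypothetical stand-in for a weak reference. get() returns null once the
// target has gone away, which is how a "cb is null" style failure can arise.
class FakeWeakRef {
  constructor(target) { this.target = target; }
  get() { return this.target; }
  kill() { this.target = null; } // simulates the chat window being destroyed
}

// Hypothetical registry of open chatboxes, keyed by an identifier.
const chatboxes = new Map();

function createChatbox(key) {
  const chatbox = { key, opened: true };
  chatboxes.set(key, new FakeWeakRef(chatbox));
  return chatbox;
}

// Reuse a live chatbox; if the reference is dead, prune the stale entry
// and create a fresh chatbox instead of dereferencing null.
function openChat(key) {
  const ref = chatboxes.get(key);
  const existing = ref ? ref.get() : null;
  if (existing) {
    return existing;
  }
  if (ref) {
    chatboxes.delete(key); // stale entry: the chatbox is gone
  }
  return createChatbox(key);
}

const first = openChat("room-1");
chatboxes.get("room-1").kill();
const second = openChat("room-1");
console.log(first === second); // false: a new chatbox replaced the dead one
```

Without the null check, the second openChat call would hand a dead reference to the caller, which is the shape of the error in the stack trace. Whether the real code path lacks such a check is exactly what still needs confirming.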
Duplicate of this bug: 1124498
backlog: backlog+ → backlog-
I reached out to Mark and then Shell today to ask if we can pull this out of backlog in order to add better logging.  Everyone agrees this bug is real and potentially scary, but not actionable because we don't know what's going on from the current logs.  Adding more logs should give us the visibility we need.  I'm needinfo'ing Mark since he has done some investigation already and (I believe) has the best idea of what logging to add. 

Mark - If you can describe the additional logging you'd like to see, then anyone who has some time soon can code it up.  Once that's in place, I can ask blassey to retest with it.  Thanks!!
backlog: backlog- → backlog+
Rank: 29
Priority: -- → P2
Needinfo'ing Mark for real this time. :-)  (Mark -- See Comment 8 above.)
Flags: needinfo?(standard8)
I'm going to see what logging I can put in. I think, though, that I might have to do special try builds rather than add permanent logging, as I can't see any obvious place where this is going wrong.

I'm working on bug 1140547 first though, as that seems to be a case that may be slightly more reproducible.
Assignee: nobody → standard8
Iteration: --- → 39.2 - 23 Mar
Points: --- → 1
Flags: needinfo?(standard8)
Flags: firefox-backlog+
Rank: 29 → 20
Iteration: 39.2 - 23 Mar → 40.1 - 13 Apr
Blocks: 1152213
Iteration: 40.1 - 13 Apr → 40.2 - 27 Apr
Iteration: 40.2 - 27 Apr → 40.3 - 11 May
Iteration: 40.3 - 11 May → 41.1 - May 25
Without further details, and having investigated the code paths a few times a while ago, my best guess is that the patch in bug 1150052 that landed today is most likely to fix this.

Hence I'm going to mark it as a duplicate. If it occurs again with a build from after that bug, please reopen or file a new bug - including the browser console output will also help.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1150052