Closed Bug 1012869 Opened 6 years ago Closed 6 years ago

Sometimes, the app manager can't connect to the simulator. Exception in DebuggerTransport.prototype.onInputStreamReady

Categories

(DevTools :: General, defect, P1)

x86
macOS
defect

Tracking

(firefox31 unaffected, firefox32 fixed, firefox33 fixed)

RESOLVED FIXED
Firefox 33

People

(Reporter: paul, Assigned: jryans)

Attachments

(1 file, 1 obsolete file)

I sadly don't have access to the stack anymore, and I can't reproduce yet. But twice I couldn't connect to the simulator. The callback in client.connect(callback) didn't get called, and I remember that the error was about DebuggerTransport.prototype.onInputStreamReady.

Trying to reconnect worked (restarting the addon).

This happened after I updated my branch, so I guess it's related to bug 797639.
Managed to reproduce: Handler function DebuggerTransport.prototype.onInputStreamReady threw an exception: [Exception... "Component returned failure code: 0x804b000e (NS_ERROR_NET_TIMEOUT) [nsIInputStream.available]"  nsresult: "0x804b000e (NS_ERROR_NET_TIMEOUT)"  location: "JS frame :: resource://gre/modules/devtools/dbg-client.jsm -> resource://gre/modules/devtools/transport/transport.js :: DebuggerTransport.prototype.onInputStreamReady< :: line 335"  data: no]Line: 335, column: 0

Simulator 2.0
(In reply to Paul Rouget [:paul] (slow to respond. Ping me on IRC) from comment #1)
> Managed to reproduce: Handler function
> DebuggerTransport.prototype.onInputStreamReady threw an exception:
> [Exception... "Component returned failure code: 0x804b000e
> (NS_ERROR_NET_TIMEOUT) [nsIInputStream.available]"  nsresult: "0x804b000e
> (NS_ERROR_NET_TIMEOUT)"  location: "JS frame ::
> resource://gre/modules/devtools/dbg-client.jsm ->
> resource://gre/modules/devtools/transport/transport.js ::
> DebuggerTransport.prototype.onInputStreamReady< :: line 335"  data: no]Line:
> 335, column: 0
> 
> Simulator 2.0

Any special steps beyond "try to connect"?  What exact build of the simulator?

Do you have local patches applied or just tip of fx-team (to increase my repro chances)?
Tip of fx-team (with webide compiled). It happened after I installed the simulator. First time I ran it. On osx. After that, it appears to work.

I suspect a race condition triggered by a cold startup. The other failures also happened after I started the simulators when nothing was in memory/hd cache (I haven't used my mac for days).
Currently we only wait 2 seconds when asked to connect[1], so maybe that's not long enough on cold startup?  However, it's been that way for some time (before bulk data).  

Perhaps bulk data causes this error to be revealed where it would have been ignored before, though.  We now use the input stream in a different way than we did previously, so that's possible.

Also, we should catch this error type as we do other known error types.

I'll take a look at this more tomorrow.

[1]: http://dxr.mozilla.org/mozilla-central/source/toolkit/devtools/client/dbg-client.jsm#2601
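To illustrate why a short connect window might lose on a cold startup, here is a minimal sketch, not the actual dbg-client.jsm or connection-manager code. `connectWithRetry`, `retryDelayMs`, the fake clock, and the 5-second cold-start delay are all hypothetical; only the 2-second window (and the 30-second alternative suggested later in this bug) comes from the discussion. The real code is asynchronous; this version is synchronous with an injected clock so it can run instantly.

```javascript
// Hypothetical helper: keep calling attempt() until it succeeds or the
// connect window (timeoutMs) would be exceeded by waiting for another retry.
function connectWithRetry(attempt, { timeoutMs, retryDelayMs, now, sleep }) {
  const deadline = now() + timeoutMs;
  let tries = 0;
  while (true) {
    tries++;
    if (attempt()) {
      return { connected: true, tries };
    }
    if (now() + retryDelayMs > deadline) {
      return { connected: false, tries };
    }
    sleep(retryDelayMs);
  }
}

// Fake clock (assumption for the sketch) so no real time passes.
let clock = 0;
const fakeTime = { now: () => clock, sleep: (ms) => { clock += ms; } };

// Simulated cold start: the simulator only accepts connections after 5 s.
const attempt = () => clock >= 5000;

clock = 0;
const shortWindow = connectWithRetry(attempt, {
  timeoutMs: 2000, retryDelayMs: 500, ...fakeTime,
});

clock = 0;
const longWindow = connectWithRetry(attempt, {
  timeoutMs: 30000, retryDelayMs: 500, ...fakeTime,
});

console.log(shortWindow.connected, longWindow.connected);
```

Under these assumed numbers the 2-second window gives up before the simulated simulator is reachable, while the 30-second window connects. Whether that is what actually happens on a cold startup is exactly the open question in this comment.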
Blocks: build-am2
It happened again today with Firefox OS 1.2 and the old app manager (cold startup again). So it's not related to my patch queue or to the B2G.
Priority: -- → P1
Paul, can you try increasing the timeout value on the line I mentioned in comment 4?  Perhaps try a value of 30 instead of 2.

I think that would help... I was trying to reproduce by restarting my laptop, but that's when it died. :(
Flags: needinfo?(paul)
Blocks: enable-webide
No longer blocks: build-am2
Flags: needinfo?(paul)
Paul, if you could try a build with this patch, that would be great.  It appears better for me, but at the same time it's very hard to reproduce.

Try: https://tbpl.mozilla.org/?tree=Try&rev=26db96c9c2b7
Assignee: nobody → jryans
Status: NEW → ASSIGNED
Attachment #8435157 - Flags: review?(paul)
It's very hard to reproduce :( I applied the patch, I'll see if it still happens.
It still happens. For some reason, it happens more often now (not related to this patch). When this patch is applied though, I don't get any exception.
(forget my last comment)
So no, comment 9 is valid: It still happens. When this patch is applied though, I don't get any exception.

I managed to reproduce with 1.2, 1.3, 1.4 and 2.0. With the new app manager and the old one.
I didn't run into these issues on Linux. Might be a mac specific problem.
It happens very often now. But only with osx.
Attachment #8435157 - Flags: review?(paul) → review-
Okay, thanks for trying that patch.  I'll attempt to debug this in more depth today.  There's also bug 1020520 which seems similar, though the reporter there *is* having issues on Linux.
See Also: → 1020520
Alright, I am more confident in this fix.  I've seen the NET_TIMEOUT error occur after this patch, and it now connects successfully after retrying, which is what we want.

The core issue is that after bug 797639, the transport can encounter more exceptions (such as NS_ERROR_NET_TIMEOUT here) due to the changes in the way streams are used, but only a few exceptions would close the transport.  Since NS_ERROR_NET_TIMEOUT was not in the list, the transport remains open, and so connection-manager doesn't attempt to retry connecting like it needs to.

Now we're closing for all exceptions, except NS_BASE_STREAM_WOULD_BLOCK, which is safe for transport to continue with.
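The policy described above can be sketched roughly as follows. This is an illustration only, not the patch itself: `SketchTransport`, `readFn`, `onClosed`, and `makeError` are hypothetical names, and errors are matched by a plain `name` string rather than through the real XPCOM result codes.

```javascript
// Hypothetical stand-in for the transport; not the real DebuggerTransport.
class SketchTransport {
  constructor(readFn, onClosed) {
    this.readFn = readFn;     // simulates reading from the input stream
    this.onClosed = onClosed; // called with the failure reason on close
    this.closed = false;
  }

  // Called whenever the (simulated) input stream reports it is ready.
  onInputStreamReady() {
    if (this.closed) {
      return;
    }
    try {
      this.readFn();
    } catch (e) {
      if (e.name === "NS_BASE_STREAM_WOULD_BLOCK") {
        // No data available yet: safe to stay open and wait.
        return;
      }
      // Any other failure (e.g. NS_ERROR_NET_TIMEOUT) closes the
      // transport, so whoever owns the connection can notice and retry.
      this.close(e.name);
    }
  }

  close(reason) {
    if (this.closed) {
      return;
    }
    this.closed = true;
    this.onClosed(reason);
  }
}

// Hypothetical helper to build an error carrying a result-code name.
function makeError(name) {
  const e = new Error(name);
  e.name = name;
  return e;
}

const closeReasons = [];
const record = (reason) => closeReasons.push(reason);

const blocked = new SketchTransport(() => {
  throw makeError("NS_BASE_STREAM_WOULD_BLOCK");
}, record);
blocked.onInputStreamReady();

const timedOut = new SketchTransport(() => {
  throw makeError("NS_ERROR_NET_TIMEOUT");
}, record);
timedOut.onInputStreamReady();

console.log(blocked.closed, timedOut.closed, closeReasons);
```

The point of inverting the check (allow one known-safe error, close on everything else) is that a new, unanticipated exception type can no longer leave the transport open in a state where nothing retries.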

Try: https://tbpl.mozilla.org/?tree=Try&rev=9c94bbaebc6a
Attachment #8435157 - Attachment is obsolete: true
Attachment #8437179 - Flags: review?(paul)
Blocks: 1020520
Comment on attachment 8437179 [details] [diff] [review]
Allow STREAM_WOULD_BLOCK, close transport for others

Yes! Thanks a lot for hunting this one.

We'll need to make sure this patch also lands in 32.
Attachment #8437179 - Flags: review?(paul) → review+
No longer blocks: 1020520
See Also: 1020520
Duplicate of this bug: 1020520
https://hg.mozilla.org/mozilla-central/rev/35fbc8cf9f13
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 33
Comment on attachment 8437179 [details] [diff] [review]
Allow STREAM_WOULD_BLOCK, close transport for others

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 787639
User impact if declined: Connecting to b2g simulators fails intermittently
Testing completed (on m-c, etc.): m-c
Risk to taking this patch (and alternatives if risky): Low
String or IDL/UUID changes made by this patch: None
Attachment #8437179 - Flags: approval-mozilla-aurora?
(In reply to J. Ryan Stinnett [:jryans] from comment #20)
> Comment on attachment 8437179 [details] [diff] [review]
> Allow STREAM_WOULD_BLOCK, close transport for others
> 
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): bug 787639

This should be bug 797639.
Comment on attachment 8437179 [details] [diff] [review]
Allow STREAM_WOULD_BLOCK, close transport for others

Aurora approval granted.
Attachment #8437179 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Product: Firefox → DevTools