Closed Bug 1065916 Opened 10 years ago Closed 10 years ago

WifiTethering crashes b2g

Categories

(Firefox OS Graveyard :: Wifi, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:2.0+, firefox33 wontfix, firefox34 fixed, firefox35 fixed, b2g-v2.0 verified, b2g-v2.0M verified, b2g-v2.1 verified, b2g-v2.2 verified)

RESOLVED FIXED
2.1 S5 (26sep)
blocking-b2g 2.0+
Tracking Status
firefox33 --- wontfix
firefox34 --- fixed
firefox35 --- fixed
b2g-v2.0 --- verified
b2g-v2.0M --- verified
b2g-v2.1 --- verified
b2g-v2.2 --- verified

People

(Reporter: tkundu, Assigned: hchang)

References

Details

(Keywords: crash, Whiteboard: [b2g-crash])

Attachments

(5 files, 4 obsolete files)

We are seeing crashes during wifi tethering
I analyzed this and I am seeing below logs :

09-10 13:28:54.315   223   762 D TetherController: Sending update msg to dnsmasq [update_ifaces:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0]
09-10 13:28:54.315 20787 20787 E dnsmasq : command read from stdin pipe (update_ifaces:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0)


It seems like we are calling setWifiTethering multiple times without disabling it and it is causing stack corruption dnsmasq . 

Fix should be similar to bug 1062550
[Blocking Requested - why for this release]:
blocking-b2g: --- → 2.0?
Could you please give us a quick patch which can ensure that we are not calling setWifitethering true twice from gaia without disabling it ?
Flags: needinfo?(hchang)
The implementation of enabling/disabling wifi tethering should
have avoided consecutively calling setWifitethering true (in WifiWorker.js)

By the way, I cannot reproduce this issue on 2.0...
Flags: needinfo?(hchang)
I tried rapidly toggling wifi tethering and enabled NetworkUtils.cpp
debug message. According to my log, no dangling command 
'tether interface add wlan0' was sent. Don't know why you saw

'update_ifaces:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0'

in your log.
Component: Stability → Wifi
Keywords: crash
Whiteboard: [b2g-crash]
(In reply to Henry Chang [:henry] from comment #5)
> I tried rapidly toggling wifi tethering and enabled NetworkUtils.cpp
> debug message. According to my log, no dangling command 
> 'tether interface add wlan0' was sent. Don't know why you saw
> 
> 'update_ifaces:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:
> wlan0:wlan0:wlan0:wlan0:wlan0:wlan0:wlan0'
> 
> in your log.

Hey Henry, please NI :tapas if you need anyinfo else triage assumed this is on your plate for investigation.
Full logcat logs for following error:

09-11 21:05:24.959  6408  6408 E dnsmasq : CAF command read from stdin pipe (update_ifaces:wlan0:rndis0:wlan0)

STR:
    Open Settings App : 
    A)  Configure wifi and data connection
    B)  Do Wifi tethering on/off
    C)  Do USB tethering on/off
    D)  Repeat B and C till you see 
        update_ifaces:wlan0:rndis0:wlan0 in logcat logs.

Please don't waste time to crash system . We just need to investigate what is causing update_ifaces:wlan0:rndis0:wlan0 in logcat.
Flags: needinfo?(hchang)
Hi Henry, I'll assign to you first, thanks.
Assignee: nobody → hchang
It seems this issue is not solely a wifi tethering issue.
Not sure if the usb tethering patch would fix this.
Could you please take it another try after the patch is landed?

Thanks

(In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from comment #7)
> Created attachment 8488468 [details]
> logcat_01-01-1970-10-24-51.log.txt
> 
> Full logcat logs for following error:
> 
> 09-11 21:05:24.959  6408  6408 E dnsmasq : CAF command read from stdin pipe
> (update_ifaces:wlan0:rndis0:wlan0)
> 
> STR:
>     Open Settings App : 
>     A)  Configure wifi and data connection
>     B)  Do Wifi tethering on/off
>     C)  Do USB tethering on/off
>     D)  Repeat B and C till you see 
>         update_ifaces:wlan0:rndis0:wlan0 in logcat logs.
> 
> Please don't waste time to crash system . We just need to investigate what
> is causing update_ifaces:wlan0:rndis0:wlan0 in logcat.
Flags: needinfo?(hchang)
Henry means Bug 1062550 in Comment 9.
Flags: needinfo?(tkundu)
(In reply to howie [:howie] from comment #10)
> Henry means Bug 1062550 in Comment 9.

Logs in Comment 7 is after applying patch from Bug 1062550 . If you go through the logs then you will see that we are passing some wrong commands to netd at some point !
Flags: needinfo?(tkundu) → needinfo?(hchang)
Flags: needinfo?(hochang)
(In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from comment #11)
> (In reply to howie [:howie] from comment #10)
> > Henry means Bug 1062550 in Comment 9.
> 
> Logs in Comment 7 is after applying patch from Bug 1062550 . If you go
> through the logs then you will see that we are passing some wrong commands
> to netd at some point !

Nice information! I will continue to investigate based on that.
Thanks!
Flags: needinfo?(hchang)
Looking at the log snippet:

09-11 21:04:24.989   207   207 I Gecko   : -*- CAF NetworkManager: setWifiTethering: true
09-11 21:04:26.759   207   207 I Gecko   : -*- CAF NetworkManager: setWifiTethering: success
09-11 21:04:26.759   207   207 I Gecko   : -*- CAF NetworkManager: setWifiTethering: false
09-11 21:04:27.159   207   207 I Gecko   : -*- CAF NetworkManager: 'tethering.usb.enabled' is now true
09-11 21:04:27.159   207   207 I Gecko   : -*- CAF NetworkManager: handleUSBTetheringToggle: true
09-11 21:04:27.329   207   207 I Gecko   : -*- CAF NetworkManager: 'tethering.usb.enabled' is now false
09-11 21:04:27.519   207   207 I Gecko   : -*- CAF NetworkManager: 'tethering.usb.enabled' is now true
09-11 21:04:27.549   207   207 I Gecko   : -*- CAF NetworkManager: handleUSBTetheringToggle: true
09-11 21:04:27.549   207   207 I Gecko   : -*- CAF NetworkManager: Usb tethering already connecting/connected.
09-11 21:04:27.549   207   207 I Gecko   : -*- CAF NetworkManager: setWifiTethering: success

Gecko tried to enable usb tethering before the previously issued 
wifi tethering disable task finished. I will attempt to defer the
issued wifi tethering task whenever there is a pending usb tethering
task. I won't address this issue with a sophisticated solution
since TetheringManager has been introduced in the upstream and it
can be used to solve the entire problem perfectly.
(In reply to Henry Chang [:henry] from comment #13)
> I will attempt to defer the
> issued wifi tethering task whenever there is a pending usb tethering
> task. 

Thanks for help. Consider STR in comment 7. We are trying to enable USB tethering just enabling/disabling wifi tethering. We also need to handle this case. 

> I won't address this issue with a sophisticated solution
> since TetheringManager has been introduced in the upstream and it
> can be used to solve the entire problem perfectly.

Could you please point me to source code of TetheringManager in gecko. Is it in v2.0 ?
Flags: needinfo?(hchang)
Attached patch Bug1065916.diff (obsolete) — Splinter Review
This is the quickest fix that I could come out. It combines the usb tethering patch. However, since I am not in Taipei, I cannot verify this until I get
a SIM card which can connect to data network.
Flags: needinfo?(hchang)
(In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from comment #14)
> (In reply to Henry Chang [:henry] from comment #13)
> > I will attempt to defer the
> > issued wifi tethering task whenever there is a pending usb tethering
> > task. 
> 
> Thanks for help. Consider STR in comment 7. We are trying to enable USB
> tethering just enabling/disabling wifi tethering. We also need to handle
> this case. 
> 
> > I won't address this issue with a sophisticated solution
> > since TetheringManager has been introduced in the upstream and it
> > can be used to solve the entire problem perfectly.
> 
> Could you please point me to source code of TetheringManager in gecko. Is it
> in v2.0 ?

http://hg.mozilla.org/mozilla-central/file/56cba2986c61/dom/tethering/TetheringManager.js
(In reply to Henry Chang [:henry] from comment #16)
> > > I won't address this issue with a sophisticated solution
> > > since TetheringManager has been introduced in the upstream and it
> > > can be used to solve the entire problem perfectly.
> > 
> > Could you please point me to source code of TetheringManager in gecko. Is it
> > in v2.0 ?
> 
> http://hg.mozilla.org/mozilla-central/file/56cba2986c61/dom/tethering/
> TetheringManager.js

I can see TetheringManager.js only in Master , not in FFOS v2.0 . Does it mean that FFOS Master branch already has a fix for this kind of race issue ?
Flags: needinfo?(hchang)
Comment on attachment 8489390 [details] [diff] [review]
Bug1065916.diff

Review of attachment 8489390 [details] [diff] [review]:
-----------------------------------------------------------------

I cannot reproduce this issue anymore after applying this patch. But I would like to know whether this is best solution or not :) .

If you are confident that this will not create new issues then please land this patch asap :) . Thanks for your help .

Please also uplift this to both Master/2.1 branch if applicable.
Attachment #8489390 - Flags: feedback+
(In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from comment #17)
> (In reply to Henry Chang [:henry] from comment #16)
> > > > I won't address this issue with a sophisticated solution
> > > > since TetheringManager has been introduced in the upstream and it
> > > > can be used to solve the entire problem perfectly.
> > > 
> > > Could you please point me to source code of TetheringManager in gecko. Is it
> > > in v2.0 ?
> > 
> > http://hg.mozilla.org/mozilla-central/file/56cba2986c61/dom/tethering/
> > TetheringManager.js
> 
> I can see TetheringManager.js only in Master , not in FFOS v2.0 . Does it
> mean that FFOS Master branch already has a fix for this kind of race issue ?

TetheringManager would not fix this kind of racing issue but it's
a good place to manage all the racing condition. Current wifi/usb tethering
is controlled by settings and is scattered in different places. 
(WifiWorker.js observes wifi tethering settings change and NetworkManager.js 
observes usb tethering settings change.)

Besides, TetheringManager is not formally being used by gaia. wifi/usb tethering is
still controlled by setting at this time.

So I think the patch I just attached is the optimal solution until TetheringManager
is used by gaia to controll all kinds of tethering.

I will firstly try to land on master after the USB tethering patch (Bug 1062550) lands.

Thanks.
Flags: needinfo?(hchang)
Attached patch Bug1065916.patch (obsolete) — Splinter Review
Hi Chuck,

Could you please help review this patch regarding switching wifi/usb tethering quickly issue? There is another patch which fixes a similar issue (Bug 1062550) 
and this bug can be solved with these two patches. Thanks
Attachment #8489390 - Attachment is obsolete: true
Flags: needinfo?(hochang)
(In reply to Henry Chang [:henry] from comment #20)
> Created attachment 8489892 [details] [diff] [review]
> Bug1065916.patch
> 
> Hi Chuck,
> 
> Could you please help review this patch regarding switching wifi/usb
> tethering quickly issue? There is another patch which fixes a similar issue
> (Bug 1062550) 
> and this bug can be solved with these two patches. Thanks

This patch is not dependent on patch in Bug 1062550 but with both patches
applied could all wifi/usb tethering racing issue get fixed.
Comment on attachment 8489892 [details] [diff] [review]
Bug1065916.patch

Review of attachment 8489892 [details] [diff] [review]:
-----------------------------------------------------------------

If we have to pend Wifi Tethering while USB Tethering is running, do we have to do the same to make USB Tethering waiting for Wifi Tethering?
(In reply to Chuck Lee [:chucklee] from comment #22)
> Comment on attachment 8489892 [details] [diff] [review]
> Bug1065916.patch
> 
> Review of attachment 8489892 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> If we have to pend Wifi Tethering while USB Tethering is running, do we have
> to do the same to make USB Tethering waiting for Wifi Tethering?

I think we need.... Thanks for reminding!
This is blocking CAF-v2.0-CC-metabug but I let Release Management making this call.
Flags: needinfo?(bbajaj)
Attached patch Bug1065916_v2.patch (obsolete) — Splinter Review
To address all the potential wifi/usb tethering interruption issue.
Attachment #8489892 - Attachment is obsolete: true
Attachment #8491542 - Flags: review?(chulee)
Comment on attachment 8491542 [details] [diff] [review]
Bug1065916_v2.patch

Review of attachment 8491542 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.
But I am concerning that usb/wifi tethering might facing same, endless timing issue as wifi/hotspot interaction now.
All patches we are doing now and in the future are workarounds to deal with the restriction of using settings.

I think we should encourage, or push, Gaia to use newly-introduced APIs to control wifi/hotspot/tethering instead of settings, and control the interaction of these function(like mutex between wifi and hotspot mode, and between usb and wifi tethering) on their own.

Rapid tapping is not a normal use case for user, I don't think such issues are critical enough for uplifting.

::: dom/system/gonk/NetworkManager.js
@@ +710,4 @@
>  #endif
>    },
>  
> +  _usbTetheringRequestCount: 0,

It seems we don't need request count for usb tethering since we only cares about last request, as described in |handleLastUsbTetheringRequest|.
Use a flag to indicate we need to handle usb tethering request should be enough?
Attachment #8491542 - Flags: review?(chulee) → review+
(In reply to Chuck Lee [:chucklee] from comment #26)
> Comment on attachment 8491542 [details] [diff] [review]
> Bug1065916_v2.patch
> 
> Review of attachment 8491542 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good to me.
> But I am concerning that usb/wifi tethering might facing same, endless
> timing issue as wifi/hotspot interaction now.
> All patches we are doing now and in the future are workarounds to deal with
> the restriction of using settings.
> 
> I think we should encourage, or push, Gaia to use newly-introduced APIs to
> control wifi/hotspot/tethering instead of settings, and control the
> interaction of these function(like mutex between wifi and hotspot mode, and
> between usb and wifi tethering) on their own.
> 
> Rapid tapping is not a normal use case for user, I don't think such issues
> are critical enough for uplifting.
> 
> ::: dom/system/gonk/NetworkManager.js
> @@ +710,4 @@
> >  #endif
> >    },
> >  
> > +  _usbTetheringRequestCount: 0,
> 
> It seems we don't need request count for usb tethering since we only cares
> about last request, as described in |handleLastUsbTetheringRequest|.
> Use a flag to indicate we need to handle usb tethering request should be
> enough?

Thanks Chuck! Using TetheringManager is indeed easier to fix this kind of
racing issue.

_usbTetheringRequestCount is a kind of a tri-state (no request, 1 request, more than 1 requests)
so it cannot be replaced with a binary flag. Thansk!
Hi Tapas,

I did more work to make usb/wifi tethering request executed exclusively.
I manually tested it and it works well. If you have any specific testing script,
could you please try it again?

Thanks.
Flags: needinfo?(tkundu)
blocking-b2g: 2.0? → 2.0+
Flags: needinfo?(bbajaj)
This patch is definitely required by branched repos (2.0/aurora).
It prevents from NetworkService.setWifiTethering/setUsbTethering.
executing interleavely, which is the root cause of the crash.

However, for m-c, Bug 1071450 is filed to address the root cause
mentioned above. I am sure if the patch in this bug is still required
after we fix Bug 1071450. (The caller, NetworkManager, maintains
additional state across the use of these two functions.)

So, my plan is to land this patch on m-c/2.0/2.1 first. After
Bug 1071450 is fixed, we can check what we can either simplify 
or remove the patch in this bug.

Chuck, what do you think of my plan? Thanks!
Flags: needinfo?(chulee)
Sounds good to me.
I still prefer not to resolve such issue by adding command management mechanism like command queue or mutex in the follow up bug, but only return error to tethering API call while there's command already running(with modification of Gaia to use API instead of settings).
Flags: needinfo?(chulee)
FYI, gaia master branch has made usb/wifi tethering able to enable exclusively.
I think API should protect the system from incorrect API operation, not rely on app's behavior.
Most app developers are not aware of such calling limitations, returning error is easier for gecko implementation and apps to understand and handle.
Attachment #8493654 - Attachment description: Bug1065916 v4 → Bug1065916 v4 (r+'d)
Attachment #8492083 - Attachment is obsolete: true
Attachment #8491542 - Attachment is obsolete: true
Keywords: checkin-needed
(In reply to Henry Chang [:henry] from comment #35)
> https://tbpl.mozilla.org/?tree=Try&rev=e6cf032a0e62

This patch only touches dom/system/gonk and as such, an "all" run touching all platforms & test suites is extremely wasteful. Please be more conscientious of your Try usage so as to avoid contributing to backlogs that affect all developers and products.
https://wiki.mozilla.org/Sheriffing/How:To:Recommended_Try_Practices
Flags: needinfo?(hchang)
https://hg.mozilla.org/mozilla-central/rev/685ec81763f8
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.1 S5 (26sep)
Please request aurora and b2g32 approval on this patch when you get a chance.
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #36)
> (In reply to Henry Chang [:henry] from comment #35)
> > https://tbpl.mozilla.org/?tree=Try&rev=e6cf032a0e62
> 
> This patch only touches dom/system/gonk and as such, an "all" run touching
> all platforms & test suites is extremely wasteful. Please be more
> conscientious of your Try usage so as to avoid contributing to backlogs that
> affect all developers and products.
> https://wiki.mozilla.org/Sheriffing/How:To:Recommended_Try_Practices

Will be conscientious next time. Sorry for that!
Flags: needinfo?(hchang)
Comment on attachment 8493654 [details] [diff] [review]
Bug1065916 v4 (r+'d)

Approval Request Comment
[Feature/regressing bug #]: Wifi
[User impact if declined]: Rapidly switch on/off usb/wifi tethering may cause b2g to crash
[Describe test coverage new/current, TBPL]: Manually test and QC test script
[Risks and why]: No
[String/UUID change made/needed]: No
Attachment #8493654 - Flags: approval-mozilla-b2g32?
Attachment #8493654 - Flags: approval-mozilla-aurora?
Attachment #8493654 - Flags: approval-mozilla-b2g32?
Attachment #8493654 - Flags: approval-mozilla-b2g32+
Attachment #8493654 - Flags: approval-mozilla-aurora?
Attachment #8493654 - Flags: approval-mozilla-aurora+
This works fine for me !
Flags: needinfo?(tkundu)
This issue has been verified successfully on Flame 2.0, 2.1, 2.2 and woodduck 2.0.
See attachment: 1727.MP4
Reproducing rate: 0/5

Step:
1.Launch Settings.
2.Tap "Internet sharing".
3.Do Wifi tethering on/off.
4.Do USB tethering on/off.
5.Repeat step 3 and 4.

Woodduck version:
Gaia-Rev        d742e375aca6dc1bf3a36638000ad7f5338ef457
Gecko-Rev       d049d4ef127844121c9cf14d2e8ca91fd9045fcb
Build-ID        20141126050313
Version         32.0
Device-Name     jrdhz72_w_ff
FW-Release      4.4.2
FW-Incremental  1416949523
FW-Date         Wed Nov 26 05:05:47 CST 2014

Flame 2.1 version:
Gaia-Rev        1bdd49770e2cb7a7321e6202c9bf036ab5d8f200
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-2g34_v2_1/rev/db893274d9a6
Build-ID        20141125001201
Version         34.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141125.040617
FW-Date         Tue Nov 25 04:06:28 EST 2014
Bootloader      L1TC00011880

Flame 2.2 version:
Gaia-Rev        824a61cccec4c69be9a86ad5cb629a1f61fa142f
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/acde07cb4e4d
Build-ID        20141125040209
Version         36.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141125.113029
FW-Date         Tue Nov 25 11:30:43 EST 2014
Bootloader      L1TC00011880

Flame 2.0 version:
Gaia-Rev        99e4594c66aa3738d58b0cb44bd885a87a063b6e
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-2g32_v2_0/rev/f91abc6127d9
Build-ID        20141125000201
Version         32.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141125.035023
FW-Date         Tue Nov 25 03:50:34 EST 2014
Bootloader      L1TC00011880
Attached video 1727.MP4
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: