Closed Bug 1142384 Opened 5 years ago Closed 5 years ago
[MTBF] System crashed, wifi keep spinning
STR: Run mtbf-test for more than 12 hours.
Reproduce rate: low
Wifi keeps spinning and can't be brought up. Logcat keeps printing "[Parent][MessageChannel] Error: Channel error: cannot send/recv", which might be an IPC error.

Version info:
Build ID: 20150310162504
Gaia Revision: 5af6f8d5d6161dea02002634c6d0a570a122e5dd
Gaia Date: 2015-03-10 19:17:12
Gecko Revision: https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/ec87adb8cf13
Gecko Version: 37.0
Device Name: flame
Firmware(Release): 4.4.2
Firmware(Incremental): eng.cltbld.20150310.200728
Firmware Date: Tue Mar 10 20:07:39 EDT 2015
Bootloader: L1TC100118D0
Vincent, can you comment on this issue?
Crash ID: bp-3e3cbe1f-7e4a-4d3f-bc8a-45bb72150312
It seems wpa_supplicant and the wifi driver work fine. I verified this manually by starting wpa_supplicant and using wpa_cli. After restarting b2g, I could use the Settings app to turn wifi on and off, and I could get the AP list from the wpa_cli scan command. However, the scan list still did not show up in the Settings app. I'm not sure what happened here; we may need to add some debug logs.
Vincent - could you provide a build or patch so that we can get more information? Thanks.
Henry, Since Vincent isn't in Taipei, can you please check this issue? Thanks.
(In reply to Ken Chang[:ken](OOO from 2/18 to 3/1) from comment #8)
> Henry, Since Vincent isn't in Taipei, can you please check this issue?
> Thanks.

No problem. I'll take a look!
I actually don't see any connection between the crash and the wifi issue...
The device crashed and few logs were left, so I can't tell whether the log above appeared. I will try to reproduce it in the next round.
As Arthur suggested, move the window.performance.mark('wifiListStart') call to avoid a race condition.
Crash Signature: [@ mozilla::RefPtr<mozilla::AudioInitTask>::~RefPtr() ]
The bug mentioned in comment 13 is going to be fixed and landed in Bug 1146208, and the crash is not related to the WiFi issue in comment 10. So we would like to drop this bug and let other people jump in.
Assignee: hchang → nobody
Looks like a crash in audioTrack. Bobby, do we have a chance to look at this?
ni? Alastor for the audioTrack-related part
Hi Steven, do you have any comments? The call stack looks strange to me; it doesn't seem possible to crash in audioInitTask.
Flags: needinfo?(bchien) → needinfo?(slee)
(In reply to Bobby Chien [:bchien] from comment #17)
> Hi Steven, could you have comments? I saw call stack is strange, it looks
> not possible to crash in audioInitTask.

Agree.
1. As the call stack shows, it crashed at "libxul.so!mozilla::RefPtr<mozilla::AudioInitTask>::~RefPtr() + 0xa", but AudioInitTask runs on the "CubeInit" thread.
2. From the call stack, the crashing thread should be the SocketTransportService thread.
So I think it should not be an audio-related problem.
https://dxr.mozilla.org/mozilla-central/source/dom/media/AudioStream.h#422
cancel ni? per comment 18
Jason, this issue appears very rarely. However, it looks like it crashed in the HTTP stack. Could you comment on this? Thanks.
Doug, could you help find someone to take a look at this bug? Thanks.
There are a lot of FennecAndroid crash reports pointing to the same crash signature, "mozilla::RefPtr<mozilla::AudioInitTask>::~RefPtr()", and in each report the crash happens on a different kind of thread. Maybe this problem is not related to a specific thread.
The function names in the call stack might be wrong, because |Release| and the auto pointer destructors are optimized into one function instance. We can see that a lot of functions map to the same address. |AsyncLatencyLogger::Release()| and |mozilla::RefPtr<mozilla::AudioInitTask>::~RefPtr()| happen to be the first entries of that group of functions in the symbol file. After investigating the source code, I think the following call stack is more reasonable:
> 0 libxul.so!nsCOMPtr_base::~nsCOMPtr_base()
> 1 libxul.so!mozilla::net::EventTokenBucket::~EventTokenBucket() [nsCOMPtr.h : 344 + 0x7]
> 2 libxul.so!mozilla::net::EventTokenBucket::~EventTokenBucket() [EventTokenBucket.cpp:ec87adb8cf13 : 133 + 0x3]
> 3 libxul.so!mozilla::net::EventTokenBucket::Release()
> 4 libxul.so!mozilla::net::nsHttpConnectionMgr::OnMsgUpdateRequestTokenBucket(int, void*) [nsRefPtr.h : 47 + 0x5]
> 5 libxul.so!mozilla::net::nsHttpConnectionMgr::nsConnEvent::Run() [nsHttpConnectionMgr.h:ec87adb8cf13 : 631 + 0xb]
maybe garvan can take a look. bounce it back if you can't.
Flags: needinfo?(dougt) → needinfo?(gkeeley)
I don't know this code, and my plate is full ATM, so I'll have to bounce it.

One obvious thing to try would be to null check param in OnMsgUpdateRequestTokenBucket()
https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpConnectionMgr.cpp#530

It might be worth pinging Patrick McManus, the author of the code:
https://hg.mozilla.org/mozilla-central/diff/ecf37b2b9a96/netwerk/protocol/http/nsHttpConnectionMgr.cpp

Is there any useful/meaningful way to sanity check "param"? Is it possible that the EventTokenBucket had its refcount drop to zero before it got to OnMsgUpdateRequestTokenBucket? I assume there is some async behaviour in this code; perhaps that introduces that possibility.

If there is going to be guessing, it would be great to find some way to increase the probability of this crash. Not knowing the tests involved, I don't know if they can be kicked into overdrive to trigger this bug faster.
(In reply to Garvan Keeley [:garvank] from comment #25)
> I don't know this code, and plate is full ATM, so I'll have to bounce it.
>
> Some obvious things to try would be to null check param in
> OnMsgUpdateRequestTokenBucket()
> https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpConnectionMgr.cpp#530
>
> Might be worth pinging Patrick McManus, author of the code:
> https://hg.mozilla.org/mozilla-central/diff/ecf37b2b9a96/netwerk/protocol/http/nsHttpConnectionMgr.cpp
>
> Is there any useful/meaningful way to sanity check "param"? Is it possible
> that the EventTokenBucket had its refcount drop to zero before it gets to
> OnMsgUpdateRequestTokenBucket. I assume there is some async behaviour in
> this code, perhaps that introduces that possibility.
>
> If there is going to be guessing happening, it would great to find some way
> to increase to probability of this crash. Not knowing the tests involved, I
> don't know if they can kicked into overdrive to trigger this bug faster.

Paul, are we hitting this now? Or can you trigger a local run to see if we can catch the test results and get more info here?
We haven't seen this issue for a long time. I can try to catch it in our next triggered run.
see comment #25
So that member is only supposed to be assigned on the socket thread, and the stack trace looks fine. However, I did find one place where it is assigned on the main thread during a pref change, and that could be racing against the stack trace we see. The backtrace included here has only 2 seconds of uptime, so it makes sense that it is reading the startup prefs. I'm not certain this is your issue, but it's worth giving it a try.
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Please request b2g37 approval on this patch when you get a chance.
Comment on attachment 8597632 [details] [diff] [review]
eventtokenbucket thread management

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): long standing latent bug
User impact if declined: potential startup crashes. seen in qa mtbf test
Testing completed: regression only
Risk to taking this patch (and alternatives if risky): very low. it has had a month of platform coverage
String or UUID changes made by this patch: none
Attachment #8597632 - Flags: approval-mozilla-b2g37?
Per comment 34 and comment 35, ni? Josh to make him aware of this last-minute request for v2.2.
Attachment #8597632 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+