Closed Bug 1066874 Opened 10 years ago Closed 6 years ago

[Dialer] User can be put into a state where there is no one on the other line in a phone call and user is unable to end call

Categories

(Firefox OS Graveyard :: Gaia::Dialer, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(b2g-v2.0 affected, b2g-v2.1 affected, b2g-v2.2 affected)

RESOLVED WONTFIX
Tracking Status
b2g-v2.0 --- affected
b2g-v2.1 --- affected
b2g-v2.2 --- affected

People

(Reporter: jschmitt, Unassigned)


Details

(Whiteboard: [planned-sprint][in-sprint=v2.1-S6])

Attachments

(6 files)

Attached image Call.png
Description:
A Ghost call in which the other party is not listed and user is unable to exit the phone call.

Repro Steps:
1) Update an Unknown device to BuildID: 
2) Have another device call the DUT
3) Cancel the phone call exactly when the DUT accepts the call
4) Go to lock screen

Actual:
A 'Ghost' call is currently active.

Expected: 
Phone calls are properly working.

Flame 2.2
Environmental Variables:
Device: Flame Master (319mb)
Build ID: 20140912040204
Gaia: 6cb5e0100d70313e4922c8d34bf20dcdd66ef616
Gecko: 2db5b64f6d49
Version: 35.0a1 (Master)
Firmware Version: v123
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Notes:
Repro frequency: 1/5
See attached: screenshot, logcat
Attached file log.txt
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
steps-wanted to see if we can reproduce this.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Keywords: steps-wanted
STR are exactly as described in comment 0. I was able to get this ghost call by following those steps. You have to be quick: hang up on the originating phone just as you answer on the DUT.

Tested with Shallow Flash on 319mb

This bug repros on Flame KK builds: Flame 2.2 KK, Flame 2.1 KK, Flame 2.0 KK

Actual Results: Able to produce a ghost call where the call does not end when caller hangs up.

Repro Rate: 5/6

Environmental Variables:
Device: Flame Master KK
BuildID: 20141002101320
Gaia: d711d1e469eeeecf25a02b2407a542a598918b2c
Gecko: b85c260821ab
Version: 35.0a1 (Master) 
Firmware Version: L1TC10011800
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0
-----------------------------------------------------------------
Environmental Variables:
Device: Flame 2.1 KK
BuildID: 20141003001634
Gaia: 1cb6775fe67d7f71098d9c8b2fefb08bfe44ec87
Gecko: ebe8760a5f65
Version: 34.0a2
Firmware Version: L1TC10011800
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
-----------------------------------------------------------------
Environmental Variables:
Device: Flame 2.0 KK
BuildID: 20141002193135
Gaia: fa797854bfe708129ed54a158ad4336df1015c39
Gecko: 9b7fd1f78a15
Version: 32.0 (2.0) 
Firmware Version: L1TC10011800
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

-----------------------------------------------------------------
-----------------------------------------------------------------

This bug does NOT repro on Flame KK build: Flame 2.0 Base KK 

Actual Result: Device ends calls appropriately when one device hangs up.

Repro Rate: 0/4

Environmental Variables:
Device: Flame 2.0 Base KK
BuildID: 20140904160718
Gaia: 506da297098326c671523707caae6eaba7e718da
Gecko: 
Gonk: 
Version: 32.0 (2.0)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Keywords: steps-wanted
QA Contact: croesch
[Blocking Requested - why for this release]:not a regression but some pretty bad broken functionality creating horrible UX
blocking-b2g: --- → 2.0?
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell)
(In reply to Joshua Mitchell [:Joshua_M] from comment #4)
> [Blocking Requested - why for this release]:not a regression but some pretty
> bad broken functionality creating horrible UX

Never mind, I thought this was a regression after I saw that it didn't repro on one branch, but that branch is the base image.
Assignee: nobody → gtorodelvalle
Target Milestone: --- → 2.1 S6 (10oct)
Triage to set it to 2.0+
blocking-b2g: 2.0? → 2.0+
Hi guys! I spent most of my day trying to narrow the issue down but I have to confess that I made little progress. I used a Flame 2.2 KK build (Gecko-745dcf1.Gaia-9306e43) but I got a "much much" lower repro rate compared to what you mention in previous comments. In fact, I think I may have observed a decreasing repro rate as time went by and right now I am not currently able to reproduce it any longer (at least after many tries) :O I may have also observed some influence of rebooting the phone but once again I cannot state any concrete pattern or observed behaviour. Sorry for not being able to be more specific :( BTW, I was making the calls via Microsoft Lync in case this could also be related to the observed results.
Bringing in Tony as well.
Flags: needinfo?(tchung)
From what I understand, COM RIL was shorthand for Qualcomm RIL.  We have either Moz RIL or a vendor RIL.  The RIL package lives at /system/b2g/distribution/bundles/, and a shallow flash of Gecko removes everything under /system/b2g/.  After that we need to restore the RIL package, which means having it available locally at flash time.

QAnalysts are all full flashing now, and I believe they are on QC RIL. I also get: RIL Daemon version: Qualcomm RIL 1.0

To turn on RIL logging, in case you need it:
adb root
adb pull /system/b2g/defaults/pref/user.js .
echo 'pref("network.dns.disablePrefetch", true);' >> user.js 
adb remount
adb push user.js /system/b2g/defaults/pref
adb shell sync && adb reboot

If you want to force Moz Ril, do the following (which just removes that bundles directory listing)
adb root
adb remount
adb shell stop b2g
adb shell rm -r /system/b2g/distribution/bundles
adb shell sync
adb shell reboot
Flags: needinfo?(pbylenga)
Comment 8 was completely wrong, so I've tagged it as obsolete. I've done the same for comment 9, since it was requested based on incorrect information. We're looking into other possibilities.

(In reply to Peter Bylenga [:PBylenga] from comment #11)
> echo 'pref("network.dns.disablePrefetch", true);' >> user.js

I think this should be "ril.debugging.enabled".
Flags: needinfo?(mozillamarcia.knous)
Flags: needinfo?(gtorodelvalle)
Based on information from Tony, it seems that everyone is using MozRIL. Today, we eliminated the following as possible reasons why some people can repro this and others cannot:

* Different RIL stacks
* Specific phone networks, or at least both T-Mobile and AT&T both have this issue
* Use of a FxOS device to dial the DUT

Here are the people who are able to repro this:

* Tamara
* Germán, perhaps more intermittently than the others
* All of the QAnalysts

Here are the people who are not:

* Me
* Johan

We're pretty stumped here, so at this point any suggestions are welcome. For the time being, we'll have to have Germán and Tamara investigate.
Flags: needinfo?(tchung)
Tamara and I are investigating this. Unfortunately, I still can't repro it. Neither can Gabriele in Amsterdam or Anthony in Paris.

(In reply to Doug Sherk (:drs) from comment #15)
> * Specific phone networks, or at least both T-Mobile and AT&T both have this
> issue

This was definitely wrong as this doesn't repro on T-Mobile for Tamara.

We currently believe that this only repros on AT&T, and possibly even more specifically only with a phone on AT&T calling the DUT as well.

qawanted for a non-AT&T DUT.
Assignee: gtorodelvalle → thills
Keywords: qaurgent, qawanted
Here's everything Tamara has tried or is going to try:

    DUT = Flame 2.2 AT&T
    voip calling DUT - no repro
    iPhone (AT&T) calling DUT - repro on first call
    POTS FiOS home (VZW) - no repro
    iPhone (AT&T 3G) calling DUT - repro after few calls

    DUT = Flame 2.2 AT&T + T-Mobile (2 SIMs)
    iPhone (AT&T) calling DUT's AT&T SIM - 
        repros calling AT&T SIM(1st try).  
        Does not repro when calling Tmobile SIM.
    iPhone (AT&T 3G) calling DUT's AT&T SIM - ?
    iPhone (AT&T) calling DUT's T-Mobile SIM - ?
    iPhone (AT&T 3G) calling DUT's T-Mobile SIM -?

    DUT = Flame 2.2 TMobile SIM
    iPhone (AT&T 4G) calling Tmobile - no repro
    Tamara to try later iphone->iphone to test call clearing

    DUT = Hamachi 2.2 AT&T
    iPhone (AT&T) calling DUT - ?
    iPhone (AT&T 3G) calling DUT - ?
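
One way to read the matrix above is to tally which caller/DUT combinations actually reproduced. The sketch below is only an illustration: it transcribes the rows that have a recorded result (rows marked "?" are left out), reduces each to caller network and DUT SIM network, and uses a hypothetical `repro_pairs` helper that is not part of any test tooling.

```python
# Rows transcribed from the matrix above; entries marked "?" are omitted.
# Each tuple is (caller network, DUT SIM network, reproduced?).
observations = [
    ("VoIP", "AT&T", False),
    ("AT&T", "AT&T", True),        # iPhone (AT&T), repro on first call
    ("Verizon FiOS", "AT&T", False),
    ("AT&T", "AT&T", True),        # iPhone (AT&T 3G), repro after a few calls
    ("AT&T", "AT&T", True),        # dual-SIM DUT, call to the AT&T SIM
    ("AT&T", "T-Mobile", False),   # dual-SIM DUT, call to the T-Mobile SIM
    ("AT&T", "T-Mobile", False),   # T-Mobile-only DUT, iPhone (AT&T 4G)
]


def repro_pairs(rows):
    """Return the distinct (caller, DUT) network pairs that reproduced."""
    return sorted({(caller, dut) for caller, dut, repro in rows if repro})


print(repro_pairs(observations))
```

With the results recorded so far, the only pair left is an AT&T caller reaching an AT&T SIM on the DUT.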
After more testing, I believe that this is a problem unique to the carrier AT&T.  I tried the following tests:
1. AT&T iPhone calls AT&T iPhone.  
Result:  I was able to get the ghost call to linger for up to 21 seconds on the iPhone.

2.  AT&T iPhone calls Verizon FiOS phone. 
Result: I was able to get the ghost call to linger on my FiOS phone and eventually go to reorder

I believe that the network is delaying the sending of the disconnect message and this is causing the problem.  The logs from T-Mobile do not exhibit this problem (nor was I able to repro with TMobile as a DUT)

Hsin-Yi, if you could take a look at the logs I posted to confirm the hypothesis, that would be great.  There is one file that has the incoming messages (the REQUEST_ in ril_worker.js) with the timestamps and I pointed out in the logs the 20 second gap that I see in the AT&T hangup scenario, but NOT in the TMobile scenario.  The full logs for each of the scenarios are also included.
blocking-b2g: 2.0+ → 2.0?
Flags: needinfo?(htsai)
Keywords: qaurgent
(In reply to Tamara Hills [:thills] from comment #19)
> After more testing, I believe that this is a problem unique to the carrier
> AT&T.  I tried the following tests:
> 1. AT&T iPhone calls AT&T iPhone.  
> Result:  I was able to get the ghost call to linger for up to 21 seconds on
> the iPhone.
> 
> 2.  AT&T iPhone calls Verizon FiOS phone. 
> Result: I was able to get the ghost call to linger on my FiOS phone and
> eventually go to reorder
> 
> I believe that the network is delaying the sending of the disconnect message
> and this is causing the problem.  The logs from T-Mobile do not exhibit this
> problem (nor was I able to repro with TMobile as a DUT)
> 
> Hsin-Yi, if you could take a look at the logs I posted to confirm the
> hypothesis, that would be great.  There is one file that has the incoming
> messages (the REQUEST_ in ril_worker.js) with the timestamps and I pointed
> out in the logs the 20 second gap that I see in the AT&T hangup scenario,
> but NOT in the TMobile scenario.  The full logs for each of the scenarios are
> also included.

Hi Tamara,

Thanks for the thorough investigation. According to the log, the call state transitions are correct apart from the huge timing gap. Your inference makes sense to me. I'd also suggest this be closed as WONTFIX. Thank you :)
Flags: needinfo?(htsai)
Per comment 19 and comment 20, this looks like a carrier-specific issue. 
This shouldn't block. Leaving the call to Rachelle, who originally set the block.
Flags: needinfo?(ryang)
removing qa-wanted tag - it got left behind when the qaurgent tag was struck at comment 19
Keywords: qawanted
Whiteboard: [planned-sprint][in-sprint=v2.1-S6]
Target Milestone: 2.1 S6 (10oct) → 2.1 S7 (Oct24)
I/Gecko   ( 8853): -*- RadioInterface[1]: tamarag Received message from worker: {"rilMessageType":"callDisconnected","call":{"state":0,"callIndex":1,"toa":129,"isMpty":false,"isMT":true,"als":0,"isVoice":true,"isVoicePrivacy":false,"number":"4082069694","numberPresentation":0,"name":null,"namePresentation":0,"uusInfo":null,"isOutgoing":false,"isConference":false,"started":1412796218524,"failCause":"NormalCallClearingError"},"rilMessageClientId":1}1412796239301

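The call duration is around 20777 ms (about 20.8 seconds): the trailing timestamp 1412796239301 minus the call's "started" value 1412796218524.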
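
The figure can be recomputed from the log line itself. The following is a minimal sketch, assuming the number appended after the JSON payload is the epoch time in milliseconds at which the message was logged (it appears to be custom debug output, not standard Gecko logging). The `call_duration_ms` helper is hypothetical, and the sample line is the excerpt above trimmed to the fields the calculation needs.

```python
import json
import re


def call_duration_ms(log_line):
    """Return (trailing timestamp - call.started) in milliseconds."""
    # Assumption: the line ends with "<JSON object><epoch milliseconds>".
    match = re.search(r"(\{.*\})(\d+)\s*$", log_line)
    if match is None:
        raise ValueError("no JSON payload followed by a timestamp")
    payload = json.loads(match.group(1))
    received_ms = int(match.group(2))
    return received_ms - payload["call"]["started"]


# The excerpt above, trimmed to the relevant fields.
sample = (
    'I/Gecko   ( 8853): -*- RadioInterface[1]: tamarag Received message '
    'from worker: {"rilMessageType":"callDisconnected","call":'
    '{"callIndex":1,"started":1412796218524,'
    '"failCause":"NormalCallClearingError"},"rilMessageClientId":1}'
    '1412796239301'
)

print(call_duration_ms(sample))  # 20777
```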

I would like to say this is a network issue. However, it would be better to have the MT-side log to confirm when the disconnect request is sent, and the MO-side log to confirm when REL_CNF is received.

With that evidence, we could call this bug a network issue without any concern.
Per comment 23 from Shawn, triage is de-nominating it for now.
We need the logs from the MT side for clarification.
Thanks!
blocking-b2g: 2.0? → ---
Flags: needinfo?(ryang)
Hi Shawn,

Can you help me with which logs you are asking for?  The logs I provided are for the device that answers the call (B in the "A calls B scenario").  Are you looking to match up A's logs with B's logs?  

As I mentioned above, I can repro this problem with two AT&T iPhones calling each other.

Thanks,

-tamara
Flags: needinfo?(sku)
(In reply to Tamara Hills [:thills] from comment #25)
> Hi Shawn,
> 
> Can you help me with which logs you are asking for?  The logs I provided are
> for the device that answers the call (B in the "A calls B scenario").  Are
> you looking to match up A's logs with B's logs?  
> 
> As I mentioned above, I can repro this problem with two AT&T iPhones calling
> each other.
> 
> Thanks,
> 
> -tamara

Hi Tamara:
 Please provide two logs.
(Both A and B are Flames with AT&T SIMs.)

1. A calls B
2. B answers the call
3. A hangs up the call while B is answering

Please provide the logs from both device A and device B.
I can then check whether the symptoms match on both devices.
Flags: needinfo?(sku) → needinfo?(thills)
Hi Shawn,

Please see latest attachment.  I have captured A's logs and B's logs (as per the scenario you listed above).

Both are Flames with AT&T SIMs.  For the A log, it took me a few tries, so it's the last call in the log.

Thanks,

-tamara
Flags: needinfo?(thills)
Flags: needinfo?(sku)
Target Milestone: 2.1 S7 (24Oct) → ---
QA Contact: croesch
Hi Tamara:
 Thanks for your logs. However, they have no timestamps for checking the gap between hangup and disconnect.
Could you please capture the logs again using the command below on both devices?

// Command
adb logcat -b radio -b main -v threadtime > /tmp/test.log


I/Gecko   (  207): TelephonyService: Dialing 4082069694
I/Gecko   (  207): TelephonyService: handleCallStateChange: {"state":2,"callIndex":1,"toa":129,"isMpty":false,"isMT":false,"als":0,"isVoice":true,"isVoicePrivacy":false,"number":"4082069694","numberPresentation":0,"name":null,"namePresentation":0,"uusInfo":null,"isEmergency":false,"isOutgoing":true,"isConference":false}
I/Gecko   (  207): TelephonyService: handleCallStateChange: {"state":3,"callIndex":1,"toa":129,"isMpty":false,"isMT":false,"als":0,"isVoice":true,"isVoicePrivacy":false,"number":"4082069694","numberPresentation":0,"name":null,"namePresentation":0,"uusInfo":null,"isEmergency":false,"isOutgoing":true,"isConference":false}
...
I/Gecko   (  207): RIL Worker: [0] Received chrome message {"callIndex":1,"rilMessageClientId":0,"rilMessageToken":63,"rilMessageType":"hangUp"}
...
I/Gecko   (  207): -*- RadioInterface[0]: Received message from worker: {"rilMessageType":"callDisconnected","call":{"state":3,"callIndex":1,"toa":129,"isMpty":false,"isMT":false,"als":0,"isVoice":true,"isVoicePrivacy":false,"number":"4082069694","numberPresentation":0,"name":null,"namePresentation":0,"uusInfo":null,"isEmergency":false,"isOutgoing":true,"isConference":false,"hangUpLocal":true,"failCause":"NormalCallClearingError"},"rilMessageClientId":0}
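
Once logs are captured with `-v threadtime`, the gap between hangup and disconnect could be measured along these lines. This is a rough sketch, assuming each line starts with the usual threadtime `MM-DD HH:MM:SS.mmm` stamp and that both messages appear in the same device's log. The `hangup_to_disconnect_seconds` helper is hypothetical, and the timestamps in the sample lines are invented for illustration, not taken from any attached log.

```python
import re
from datetime import datetime

# threadtime lines begin with "MM-DD HH:MM:SS.mmm" (no year).
STAMP = re.compile(r"^(\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s")


def hangup_to_disconnect_seconds(lines):
    """Seconds from the first hangUp request to the next callDisconnected."""
    hangup_at = None
    for line in lines:
        match = STAMP.match(line)
        if match is None:
            continue  # skip lines without a threadtime stamp
        when = datetime.strptime(match.group(1), "%m-%d %H:%M:%S.%f")
        if hangup_at is None and '"rilMessageType":"hangUp"' in line:
            hangup_at = when
        elif hangup_at is not None and '"rilMessageType":"callDisconnected"' in line:
            return (when - hangup_at).total_seconds()
    return None  # one of the two messages was not found


# Invented timestamps; message bodies shortened from the excerpt above.
sample_lines = [
    '10-08 12:00:01.250   207   207 I Gecko   : RIL Worker: [0] Received '
    'chrome message {"callIndex":1,"rilMessageType":"hangUp"}',
    '10-08 12:00:01.900   207   207 I Gecko   : -*- RadioInterface[0]: '
    'Received message from worker: {"rilMessageType":"callDisconnected"}',
]

print(hangup_to_disconnect_seconds(sample_lines))
```

Running the same function over the logs from device A and device B would show whether the roughly 20 second delay appears on one side only.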
Flags: needinfo?(sku) → needinfo?(thills)
Clearing ni as I've moved off of this bug for other priorities and it's not currently assigned.
Assignee: thills → nobody
Flags: needinfo?(thills)
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX