Closed Bug 999334 Opened 6 years ago Closed 6 years ago

[Tarako][Dialer]Dial to invalid number will trigger two call out

Categories

(Firefox OS Graveyard :: RIL, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(blocking-b2g:1.3T+, b2g-v1.3T verified)

VERIFIED FIXED
2.0 S1 (9may)
blocking-b2g 1.3T+
Tracking Status
b2g-v1.3T --- verified

People

(Reporter: mlien, Assigned: aknow)

References

Details

(Whiteboard: [p=2][ETA: 4/28] [partner-blocker])

Attachments

(6 files, 2 obsolete files)

Attached file logcat_with_ril.txt
[Device]
  Tarako
---------------------------------------------
[Reproduction build] - 0417pac + PVT v1.3T
  Gaia      3f1d8332d127f1d6bc0fdbb08146ce1deb0efbc0
  Gecko     https://hg.mozilla.org/releases/mozilla-b2g28_v1_3t/rev/5da76152c2bd
  BuildID   20140421164001
  Version   28.1

---------------------------------------------
[Reproduce Steps]
  1. Launch Dialer app
  2. Dial out to invalid number
---------------------------------------------
[Expected Result]
  Only trigger one call out
---------------------------------------------
[Actual Result]
  Trigger two call out
---------------------------------------------
[Reproduce Rate]
  100%
Attached image 2014-04-22-16-50-12.png
triage: base on the symptom, 1.3T+
ni? Etienne/Sku for next steps. thanks
blocking-b2g: 1.3T? → 1.3T+
Flags: needinfo?(sku)
Flags: needinfo?(etienne)
Wondering if the root cause is the same as bug 997394?
There is only one call request, however, we saw two calls on UI. Need Gaia's effort to check what went wrong.

04-22 16:47:36.770 I/Gecko   (  658): MobileConnection initialized
04-22 16:47:36.770 I/Gecko   (  658): MobileConnection initialized
04-22 16:47:38.160 I/Gecko   (   84): MobileConnection initialized
04-22 16:47:38.170 I/Gecko   (   84): MobileConnection initialized

04-22 16:47:43.510 D/RILC    (   96): [0131]> DIAL (num=123,clir=0)
04-22 16:47:43.540 D/RILC    (   96): [0132]< GET_CURRENT_CALLS {[id=1,DIALING,toa=129,norm,mo,als=0,voc,noevp,123,cli=0,name='(null)',0}
04-22 16:47:43.560 D/RILC    (   96): [0131]< DIAL
Flags: needinfo?(sku)
ni? Gabriele
Flags: needinfo?(gsvelto)
Probably not gaia...
Flags: needinfo?(etienne)
Sounds a lot like a dup of bug 998044 too, I'm checking right now.
Flags: needinfo?(gsvelto)
I'm not able to reproduce this on my own build with gaia bfdf93ec79dc39c7db456af2998284349ba5eb48 and gecko 1cb161457a570424532df6c647a6602eec1a8c27. The phone was flashed with the 04/17 PAC file. I'll try to see if this reproduces with the version from comment 0, if it does then this might have been fixed already.
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
I just tested with the version in comment 0 and I cannot reproduce either. Mike can you still reproduce this? If not I'll close as WORKSFORME.
Flags: needinfo?(mlien)
I still can reproduce it on today's daily build
Gaia      9aa9b04049f0291b124c50e0f9c3ce0e1f547725
Gecko     https://hg.mozilla.org/releases/mozilla-b2g28_v1_3t/rev/2e40920b1d13
BuildID   20140422164001
Version   28.1
Flags: needinfo?(mlien)
OK, maybe we need a more elaborate STR? What I'm trying is the following:

- I've downloaded build 20140422164001 (https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2g_ril/tarako-mozilla-b2g28_v1_3t-20140422164001-ril01.02.00.019.102.zip)

- Flashed it with ./fullflash_gecko_ril_gaia.sh -r

- Turned on the phone

- Dialed the invalid number 123

I've tested with one or two SIMs, switching the active SIM for outgoing calls, etc... In every scenario I get only a single call. I'm starting to get the feeling that this might be carrier-specific or something else that's not under our control. Rik, can you give this a quick spin and see if it reproduces on your Tarako?
Flags: needinfo?(anthony)
I could reproduce but the reproduce rate is around 30%.

This seems a modem or gecko issue. I will attach a log right away.
Flags: needinfo?(anthony)
get_current_calls is coming before request_dial

snippet:
10672 04-23 15:00:28.583 D/RILC    (   95): [0324]> DIAL (num=123,clir=0)
10673 04-23 15:00:28.593 D/RIL     (   95): onRequest: DIAL sState=4
10674 04-23 15:00:28.593 D/RIL     (   95): channel1 state: '0'
10675 04-23 15:00:28.593 D/RIL     (   95): get Channel ID '1'
10676 04-23 15:00:28.593 D/RIL     (   95): requestDial isStkCall = 0
10677 04-23 15:00:28.593 D/AT      (   95): Channel1: AT> ATD123;
10678 04-23 15:00:28.603 D/AT      (   95): Channel0: AT< +ECIND: 0,2,1
10679 04-23 15:00:28.603 D/AT      (   95): Channel0: AT< ^DSCI: 1,0,2,0,0,123,129
10680 04-23 15:00:28.623 D/RILC    (   95): [UNSL]< UNSOL_RESPONSE_CALL_STATE_CHANGED
10693 04-23 15:00:28.623 I/Gecko   (   83): RIL Worker: [0] Unsolicited response for request type 1001
10694 04-23 15:00:28.623 I/Gecko   (   83): RIL Worker: [0] Handling parcel as UNSOLICITED_RESPONSE_CALL_STATE_CHANGED
10695 04-23 15:00:28.623 I/Gecko   (   83): RIL Worker: [0] New outgoing parcel of type 9
10696 04-23 15:00:28.623 I/Gecko   (   83): RIL Worker: Outgoing parcel: 0,0,0,8,9,0,0,0,69,1,0,0
10697 04-23 15:00:28.633 I/RILC    (   95): PCC  alloc one command p_record
10698 04-23 15:00:28.633 D/RILC    (   95): PCB request code 9 token 325
10699 04-23 15:00:28.633 D/RILC    (   95): [0325]> GET_CURRENT_CALLS
10700 04-23 15:00:28.633 D/RIL     (   95): onRequest: GET_CURRENT_CALLS sState=4
10701 04-23 15:00:28.633 D/RIL     (   95): channel2 state: '0'
10702 04-23 15:00:28.633 D/RIL     (   95): get Channel ID '2'
10703 04-23 15:00:28.633 D/AT      (   95): Channel2: AT> AT+CLCC
10704 04-23 15:00:28.633 D/AT      (   95): Channel2: AT< +CLCC: 1,0,2,0,0,"123",129
10705 04-23 15:00:28.633 D/AT      (   95): Channel2: AT< OK
10706 04-23 15:00:28.633 D/RILC    (   95): [0325]< GET_CURRENT_CALLS {[id=1,DIALING,toa=129,norm,mo,als=0,voc,noevp,123,cli=0,name='(null)',0}
10707 04-23 15:00:28.633 D/RIL     (   95): put Channel ID '2'
10708 04-23 15:00:28.633 I/RILC    (   95): -->CommandThread [337] free one command
10709 04-23 15:00:28.633 I/Gecko   (   83): RIL Worker: Next parcel size unknown, going to sleep.
10710 04-23 15:00:28.653 D/RILC    (   95): [0324]< DIAL

The response of [0325] is coming before [0324] that leads to a problem.
Gecko is expecting a subsequent response. The flow gecko expecting to be based on is: gecko is notified of REQUEST_DIAL success response --> gecko is notified UNSOLICITED_RESPONSE_CALL_STATE_CHANGED --> gecko queries GET_CURRENT_CALLS -->
gecko is notified of the response of GET_CURRENT_CALLS.

Based on the flow, Gecko creates a pending outgoing call when REQUEST_DIAL succeeds. And then update that pending outgoing call when the next get_current_calls comes. But the flow is disordered... 

Sam, 
I am wondering if this is a bug on modem/rild. Could gecko expect the order as mentioned above? Thank you.
Flags: needinfo?(sam.hua)
Attachment #8410810 - Attachment mime type: text/x-vhdl → text/plain
I've noticed while trying this on two different SIMs that the response I get from different carriers is also different. In one case the call connects but I'm welcomed with a message that states that the number does not exist, in the other case the call is never connected and just terminates.

Is it possible that I am unable to reproduce this because the bug is triggered by a carrier-dependent behavior? I'm unassigning myself in the meantime as I won't be able to help unless I can reproduce the bug.
Assignee: gsvelto → nobody
Status: ASSIGNED → NEW
Hi hsinyi.

modem replys "atd123" soon..          04-23 15:00:28.623 D/AT      (   95): Channel1: AT< OK,
and reply to gecko:          .  10710 04-23 15:00:28.653 D/RILC    (   95): [0324]< DIAL

the problem is before Channel1: AT< OK, RILC get the 
10678 04-23 15:00:28.603 D/AT      (   95): Channel0: AT< +ECIND: 0,2,1
10679 04-23 15:00:28.603 D/AT      (   95): Channel0: AT< ^DSCI: 1,0,2,0,0,123,129

and it will trigger UNSOL_RESPONSE_CALL_STATE_CHANGED

i am afraid modem can't arrange this sequence because DSCI/ECIND are independently with dial sequence.

could u check how android phone avoid it?
Flags: needinfo?(sam.hua)
Flags: needinfo?(htsai)
okay... I think we could do exactly the same as AOSP does [1]. That is, create a pending outgoing call object before sending REQUEST_DIAL, and update state upon the upcoming GET_CURRENT_CALLS result.

[1] http://grepcode.com/file/repository.grepcode.com/java/ext/com.google.android/android/4.4.2_r1/com/android/internal/telephony/gsm/GsmCallTracker.java/#219

Hi Aknow,
Would you mind taking care of this?
Flags: needinfo?(htsai) → needinfo?(szchen)
Component: Gaia::Dialer → RIL
Assignee: nobody → szchen
Flags: needinfo?(szchen)
WIP. Not yet tested on the issue itself.
Patch is generate from master branch, not 1.3T.
Attachment #8410941 - Flags: review?(htsai)
Comment on attachment 8410941 [details] [diff] [review]
Redesign pending outgoing call mechanism

Review of attachment 8410941 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me. Please co-work with QA/Mike to make sure this issue gets resolved before we land the patch. Thank you.

::: dom/system/gonk/ril_worker.js
@@ +3871,5 @@
>          if (!pendingOutgoingCall &&
>              (newCall.state === CALL_STATE_DIALING ||
>               newCall.state === CALL_STATE_ALERTING)) {
>            // Receive a new outgoing call which is already hung up by user.
> +          if (DEBUG) this.context.debug("pending outgoing call is hung up by user.");

nit: s/pending/Pending/

@@ +5353,5 @@
>    }
>    this._processCalls(calls);
>  };
>  RilObject.prototype[REQUEST_DIAL] = function REQUEST_DIAL(length, options) {
> +  // We already return a successful response before. Don't response it again!

s/response it/respond/ 
"response" is a noun.
Attachment #8410941 - Flags: review?(htsai) → review+
(In reply to Hsin-Yi Tsai  [:hsinyi] from comment #18)
> Comment on attachment 8410941 [details] [diff] [review]
> Redesign pending outgoing call mechanism
> 
> Review of attachment 8410941 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good to me. Please co-work with QA/Mike to make sure this issue gets
> resolved before we land the patch. Thank you.
> 
> ::: dom/system/gonk/ril_worker.js
> @@ +3871,5 @@
> >          if (!pendingOutgoingCall &&
> >              (newCall.state === CALL_STATE_DIALING ||
> >               newCall.state === CALL_STATE_ALERTING)) {
> >            // Receive a new outgoing call which is already hung up by user.
> > +          if (DEBUG) this.context.debug("pending outgoing call is hung up by user.");
> 
> nit: s/pending/Pending/
> 
> @@ +5353,5 @@
> >    }
> >    this._processCalls(calls);
> >  };
> >  RilObject.prototype[REQUEST_DIAL] = function REQUEST_DIAL(length, options) {
> > +  // We already return a successful response before. Don't response it again!
> 
> s/response it/respond/ 
> "response" is a noun.

I'll suggest pass try server first and I can verify the build after landed
Otherwise the verification effort it doubled
(In reply to Mike Lien[:mlien] from comment #19)
> (In reply to Hsin-Yi Tsai  [:hsinyi] from comment #18)
> > Comment on attachment 8410941 [details] [diff] [review]
> > Redesign pending outgoing call mechanism
> > 
> > Review of attachment 8410941 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Looks good to me. Please co-work with QA/Mike to make sure this issue gets
> > resolved before we land the patch. Thank you.
> > 
> > ::: dom/system/gonk/ril_worker.js
> > @@ +3871,5 @@
> > >          if (!pendingOutgoingCall &&
> > >              (newCall.state === CALL_STATE_DIALING ||
> > >               newCall.state === CALL_STATE_ALERTING)) {
> > >            // Receive a new outgoing call which is already hung up by user.
> > > +          if (DEBUG) this.context.debug("pending outgoing call is hung up by user.");
> > 
> > nit: s/pending/Pending/
> > 
> > @@ +5353,5 @@
> > >    }
> > >    this._processCalls(calls);
> > >  };
> > >  RilObject.prototype[REQUEST_DIAL] = function REQUEST_DIAL(length, options) {
> > > +  // We already return a successful response before. Don't response it again!
> > 
> > s/response it/respond/ 
> > "response" is a noun.
> 
> I'll suggest pass try server first and I can verify the build after landed
> Otherwise the verification effort it doubled

Indeed, the typical flow is dev. makes sure the issue is fixed, then patch landed, then verified by QA. However, since this is a timing issue, and the reproduce rate is quite low for me, to play it safer, I was thinking to ask QA's help in advance.
For master branch. I will provide 1.3T version later.
Attachment #8410941 - Attachment is obsolete: true
Attachment #8411657 - Flags: review+
I clear 'checkin-needed' because when Mike tried to verify the patch (I provided him a 1.3t version), he noticed that call has been dialed out successfully (remote party receives that) but nothing shown on call screen. However, he never see this case on pvt build. Let's investigate more before landing.

STR:
1) Click the green "dial" button quickly and repeatedly.
Keywords: checkin-needed
Attached file ril_worker.js for 1.3T (obsolete) —
(In reply to Hsin-Yi Tsai  [:hsinyi] from comment #23)
> I clear 'checkin-needed' because when Mike tried to verify the patch (I
> provided him a 1.3t version), he noticed that call has been dialed out
> successfully (remote party receives that) but nothing shown on call screen.
> However, he never see this case on pvt build. Let's investigate more before
> landing.
> 
> STR:
> 1) Click the green "dial" button quickly and repeatedly.

I am not quite sure about your STR. What do you mean quickly and repeatedly?
Is it a seldom issue or always reproducible.

You could refer to my 1.3T version in attachment 8411814 [details] .
I have verify it by normal use case. It goes well.
Attachment #8411734 - Attachment mime type: text/x-vhdl → text/plain
(In reply to Szu-Yu Chen [:aknow] from comment #26)
> (In reply to Hsin-Yi Tsai  [:hsinyi] from comment #23)
> > I clear 'checkin-needed' because when Mike tried to verify the patch (I
> > provided him a 1.3t version), he noticed that call has been dialed out
> > successfully (remote party receives that) but nothing shown on call screen.
> > However, he never see this case on pvt build. Let's investigate more before
> > landing.
> > 
> > STR:
> > 1) Click the green "dial" button quickly and repeatedly.
> 
> I am not quite sure about your STR. What do you mean quickly and repeatedly?
> Is it a seldom issue or always reproducible.

Not always reproducible.

> 
> You could refer to my 1.3T version in attachment 8411814 [details] .
> I have verify it by normal use case. It goes well.

Yup, normal cases work well. :) But as this bug results from a timing issue, and I am afraid that this new weird behavior results from another timing issue, which might be caused by this patch, as well, I feel safer to check more before landing, even though it is possible that the new observed case is nothing related to the patch.
Whiteboard: [ETA: 4/29]
Thanks for the deeper investigation, Mike and Aknow.

After reviewing and analyzing the log, everything looks right on gecko ril side. CallStateChanged event is dispatched even when the error, nothing shown on call screen, occurs. It's not obvious that the case mentioned in comment 23 results from this patch.

Normal cases and try server are all good. Let's ship it. And sorry for the false-alarm :p
Whiteboard: [ETA: 4/29] → [p=2][ETA: 4/29]
Target Milestone: --- → 2.0 S1 (9may)
Attachment #8411814 - Attachment is obsolete: true
Whiteboard: [p=2][ETA: 4/29] → [p=2][ETA: 4/28]
https://hg.mozilla.org/mozilla-central/rev/6234b7cdf8ce
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Please rebase on 1.3t.
Flags: needinfo?(szchen)
Depends on: 993255
Flags: needinfo?(szchen)
Blocks: 1000705
checkin-needed for 1.3T
Please land the patch after bug 993255. Thx.
Flags: needinfo?(szchen)
Keywords: checkin-needed
This is under suspicion for causing bug 1002538, so clearing checkin-needed to find out if it was the cause. If it is the cause, then this can't be uplifted until bug 1002538 is resolved.
Keywords: checkin-needed
I checked the log in bug 1002538 but cannot find the strange part related to this one. I also verified on tarako with 1.3T gecko (+ this patch) by the STR. The result is good and the issue is not reproducible.
Whiteboard: [p=2][ETA: 4/28] → [p=2][ETA: 4/28] [partner-blocker]
(In reply to Szu-Yu Chen [:aknow] from comment #35)
> I checked the log in bug 1002538 but cannot find the strange part related to
> this one. I also verified on tarako with 1.3T gecko (+ this patch) by the
> STR. The result is good and the issue is not reproducible.

Ok - we should be able to uplift here. bug 1002538 only affects JB, not ICS, so the fact that this does not reproduce on ICS makes sense.
Keywords: checkin-needed
This issue is verified fixed it today's Tarako build


1.3t Environmental Variables:
Device: Tarako 1.3t
BuildID: 20140502014001
Gaia: a8e0ff550de08e58e4bf75af3cecf175b9b71e70
Gecko: 71790bf476cb
Version: 28.1
Firmware Version: sp6821
FYI, I found it regressed on ICS/Nexus S as reported on bug 1005498.
As jason already suggested, this may also be the root cause of bug 1002538.
refer to comment 38, change to verified & fixed
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.