Closed Bug 976897 Opened 6 years ago Closed 6 years ago

[Sora][Message][MMS]The MMS sending too long time sometimes.

Categories

(Firefox OS Graveyard :: RIL, defect, P1)

defect

Tracking

(blocking-b2g:1.3+, firefox28 wontfix, firefox29 wontfix, firefox30 fixed, b2g-v1.3 fixed, b2g-v1.3T fixed, b2g-v1.4 fixed)

RESOLVED FIXED
1.4 S3 (14mar)
blocking-b2g 1.3+
Tracking Status
firefox28 --- wontfix
firefox29 --- wontfix
firefox30 --- fixed
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- fixed

People

(Reporter: sync-1, Assigned: bevis)

References

()

Details

(Keywords: regression)

Attachments

(14 files, 4 obsolete files)

3.05 MB, application/rar
Details
320.47 KB, text/x-log
Details
291 bytes, application/x-shellscript
Details
630.70 KB, application/x-executable
Details
413.27 KB, text/x-log
Details
651.88 KB, text/plain
Details
1.36 MB, text/plain
Details
693.89 KB, application/vnd.tcpdump.pcap
Details
585.45 KB, text/plain
Details
485.43 KB, application/vnd.tcpdump.pcap
Details
11.79 KB, patch
vicamo
: review+
Details | Diff | Splinter Review
9.73 KB, patch
vicamo
: review+
Details | Diff | Splinter Review
299.65 KB, video/3gpp
Details
299.65 KB, video/3gpp
Details
Firefox OS v1.3
 Mozilla build ID: 20140208004002
 
 Created an attachment (id=645833)
 QXDM log
 
 DEFECT DESCRIPTION:
 ->The MMS sending too long time .
 
  REPRODUCING PROCEDURES:
 ->Enter"Message"to create a MMS;
 ->Tap attachment icon to enter"camera",record a max video and add;
 ->Then send the MMS,the MMS in sending screen too long time(about 15 mins).(ko)
 
  EXPECTED BEHAVIOUR:
 ->Should sent it as soon as possible.
 
  ASSOCIATE SPECIFICATION:
 
  TEST PLAN REFERENCE:
 
  TOOLS AND PLATFORMS USED:
 
  USER IMPACT:
 
  REPRODUCING RATE:50%
 
  For FT PR, Please list reference mobile's behavior:Beetle lite FF can't reproduce.
Attached file QXDM log
originally this should have been filed as a followup to bug 962142, but it was never filed
blocking-b2g: --- → 1.3?
Component: Gaia::SMS → RIL
Hey Bevis, can you have a look to the log?
Flags: needinfo?(btseng)
Would you please follow the link of [1] and add 2 prefs to |true| to enable mms/ril related logs in adb logcat for further analysis?
ril.debugging.enabled
mms.debugging.enabled

To ensure that the correct logs are captured, please check if |MmsService| & |RadioInterfaceLayer| are printed in the captured logcat.

[1] https://wiki.mozilla.org/B2G/QA/Tips_And_Tricks#Changing_preferences

Thanks.
Flags: needinfo?(btseng) → needinfo?(sync-1)
This isn't going to be a Moz RIL bug. IOT testing uses the QC RIL, not the Moz RIL for testing. This is likely regressed actually from bug 947946's work.
Component: RIL → Gaia
(In reply to Jason Smith [:jsmith] from comment #5)
> This isn't going to be a Moz RIL bug. IOT testing uses the QC RIL, not the
> Moz RIL for testing. This is likely regressed actually from bug 947946's
> work.

Bug 947946 just updated the apn.json database with the last version of the AOSP APN database. The APNs from the AOSP APN database are actually validated for the carriers and I don't think this bug might be caused for a bad APN for MMS. Moreover if the MMS get sent eventually this is not regression from 947946. Anyway, may I know the MNC MCC codes in the ICC card has been used for this test? I could have a look and chech whether the APN for MMS has changed.
Adding Anshul to the CC list as he might know.
(In reply to Bevis Tseng [:bevistseng] (btseng@mozilla.com) from comment #4)
> Would you please follow the link of [1] and add 2 prefs to |true| to enable
> mms/ril related logs in adb logcat for further analysis?
> ril.debugging.enabled
> mms.debugging.enabled
> 
> To ensure that the correct logs are captured, please check if |MmsService| &
> |RadioInterfaceLayer| are printed in the captured logcat.
> 
> [1] https://wiki.mozilla.org/B2G/QA/Tips_And_Tricks#Changing_preferences
> 
> Thanks.

Ok, I will upload it as soon as possible.
Attached file test.log
I set both "ril.debugging.enabled" and "mms.debugging.enabled" as true in user.js ,but I just can see "MmsService" strings in the log.
Hi,

From the log, I saw the MMS was sent correctly.
However, there is no timestamp to prove how long it took to send the mms.
Would you please help to get the adb logcat with the follow options:
"adb logcat -v threadtime"

In addition,
Since this is QCRIL enabled ROM, would you please help to get the qxdm log at the same time.
It might be helpful to clarify if this is network-related problem.

Thanks
Bevis
Can we re-validate the regression window here from https://bugzilla.mozilla.org/show_bug.cgi?id=962142#c8 and indicate the last working gaia/first broken gecko info?
blocking-b2g: 1.3? → 1.3+
QA Contact: sarsenyev
I cannot get indicate the last working/gaia/first broken gecko.
The regression window from bug 962142 doesn't work for me. The issue to send MMS message appears everytime when sending MMS over WiFi. The issue is intermittent, sometimes it sends sometimes keeping spinning
I tried on 1.3, 1.2 and 1.1  the same behavior. Also sometimes error message about "service currently unavailable" appears
It works fine with AT&T data service
(In reply to sarsenyev from comment #12)
> I cannot get indicate the last working/gaia/first broken gecko.
> The regression window from bug 962142 doesn't work for me. The issue to send
> MMS message appears everytime when sending MMS over WiFi. The issue is
> intermittent, sometimes it sends sometimes keeping spinning
> I tried on 1.3, 1.2 and 1.1  the same behavior. Also sometimes error message
> about "service currently unavailable" appears
> It works fine with AT&T data service

I don't think you are testing the same bug. The issue TCL hit aligns with the nkot's original testing, so I think there's something different you need to do here. Check with nkot about this.
Flags: needinfo?(sync-1)
QA Contact: sarsenyev → jzimbrick
Bug 962142 is resolved WFM and the followup bug I was going to write up - an MMS sometimes takes too long sending unless the user sends another MMS right after and then both MMSs will be sent out immediately. I still haven't written it up because every time I ran the script to enable MMS debug for the logcat, and restarted device the issue no longer reproduces, that makes me think restarting fixes it...

The issue only reproduces with WiFi On. Another bug that seems to be relevant here is bug 949780 - device temporary locks up while sending MMS.
Duplicate of this bug: 978722
Regression Window:

Last Working Environmental Variables:
Device: Buri
BuildID: 20140120004002
Gaia: 63cc8ffd6f2acdce14eb8285e8822ebe62fcab41
Gecko: 37ec3c7fb511
Version: 28.0a2
Base Image: V1.2-device.cfg

First Broken Envrionmental Variables:
Device: Buri
BuildID: 20140121004137
Gaia: 47049555282a9a01fb60d1e1421b57e2810c96f5
Gecko: 6f7dfe36ab6c
Version: 28.0a2
Base Image: V1.2-device.cfg

Last Working Gaia/First Broken Gecko: Issue DOES reproduce.
Gaia: 63cc8ffd6f2acdce14eb8285e8822ebe62fcab41
Gecko: 6f7dfe36ab6c

First Broken Gaia/Last Working Gecko: Issue does NOT reproduce.
Gaia: 47049555282a9a01fb60d1e1421b57e2810c96f5
Gecko: 37ec3c7fb511

Gecko Pushlog: http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=37ec3c7fb511&tochange=6f7dfe36ab6c
Looks like this is a regression from bug 943215 then.
Blocks: 943215
Component: Gaia → RIL
Let's get logcat with these prefs set:

* ril.debugging.enabled
* mms.debugging.enabled

Make sure that the logcat is retrieved using "adb logcat -v threadtime"
Keywords: qawanted
Add related ppl to cc list to clarify if it's a regression of bug 943215 when log is available.
Bevis, assigned to you as a 1.3+ are a priority.
Assignee: nobody → btseng
could this sitll be a CommRil issue, or is it in Gecko?
Hi Zimbrick,

May I know it's Moz RIL or QC RIL included in the regression window mentioned in comment 16?
Flags: needinfo?(jzimbrick)
(In reply to Bevis Tseng [:bevistseng] (btseng@mozilla.com) from comment #22)
> Hi Zimbrick,
> 
> May I know it's Moz RIL or QC RIL included in the regression window
> mentioned in comment 16?

Moz RIL.

(In reply to Julien Wajsberg [:julienw] from comment #21)
> could this sitll be a CommRil issue, or is it in Gecko?

Probably gecko. The problem happens on Com RIL & Moz RIL, since TCL found it on their IOT build & we've found it in our own internal builds.
Flags: needinfo?(jzimbrick)
Attached file log (obsolete) —
Attaching log using the command and with the prefs enabled that were stated in Comment 18.
Keywords: qawanted
It seems that the mms/ril logs are not enabled properly.

I'll upload a script to ensure that the log is capture correctly.
Script to enable MMS/RIL logs.
More logs will be needed to capture at once. (adb logcat with mms/ril/networkmanager), tcpdump.
I'll update detail steps later.
Attached file tcpdump
update tool to dump tcpdump.

Step to dump tcpdump:
  1. Install attached tcpdump command into test device:
     $adb remount; adb push tcpdump system/bin/;adb shell chmod 777 /system/bin/tcpdump
  2. Start capturing tcpdump:
     $adb shell tcpdump -i any -p -s 0 -w /data/capture.pcap
  3. backup tcpdump result:
     $adb pull /data/capture.pcap .
Hi Zimbrick,

Because the adb logcat wasn't capture correctly, I've upload a script in comment 27 to ensure that the logs are enabled.

To ensure related logs are capture by "adb logcat -v threadtime", please help to check if |MmsService| & |RadioInterface| are printed in the captured log.

In addition, I think we also need the tcpdump to see if the sending request is routed to correct network interface.

Please open another terminal and follow the step in comment 28 to capture the tcpdump at the same time while start testing.
Flags: needinfo?(jzimbrick)
Keywords: qawanted
Attached file sms.log
adb log with ril and mms debug enable and timestamp
Hi 田旻,

After checking the log, I only saw there are 2 MMS message downloaded normally
from the attached log in Comment 30, and the timestamp is around 03-02 20:29:59.041 
and 03-02 20:30:43.282. 
Each MMS message took about 2 seconds to complete the download. 
There is no sending scenario from the log you provided.

Is there anything missing?
BTW, is QC RIL included in your test build?

Regards,
Bevis Tseng

--

03-02 20:29:59.041   298   298 I Gecko   : -@- MmsService: receiveWapPush: msg = {"headers":{"x-mms-message-type":130,"x-mms-transaction-id":"gYi5O0faehS0","x-mms-mms-version":16,"from":{"address":"15809249119","type":"PLMN"},"x-mms-message-class":"personal","x-mms-message-size":2287,"x-mms-expiry":259200,"x-mms-content-location":{"uri":"http://192.232.123.180:8080/iDhEN1"}},"type":130}
03-02 20:29:59.171   298   298 I Gecko   : -@- MmsService: acquire: buffer the MMS request and setup the MMS data call.
03-02 20:30:00.371   298   298 I Gecko   : -@- MmsService: Got the MMS network connected! Resend the buffered MMS requests: number: 1
03-02 20:30:01.791   298   298 I Gecko   : -@- MmsService: retrievedMessage = {"headers":{"x-mms-message-type":132,"x-mms-transaction-id":"gYi5O0faehS0","x-mms-mms-version":16,"message-id":"030515104581000907119","date":"2014-03-05T07:10:45.000Z","from":{"address":"+8615809249119","type":"PLMN"},"to":{"address":"8618629569640","type":"PLMN"},"x-mms-message-class":"personal","x-mms-priority":129,"x-mms-delivery-report":false,"x-mms-read-report":false,"content-type":{"media":"application/vnd.wap.multipart.related","params":{"start":"<start>","type":"application/smil"}}},"type":132,"parts":[{"index":0,"headers":{"content-type":{"media":"application/smil","params":{"charset":{"charset":"us-ascii"}}},"content-length":355,"content-id":"<start>","content-location":"smil.smi"},"content":"<smil><head><layout><root-layout width=\"320px\" height=\"480px\"/><region id=\"Image\" left=\"0\" top=\"0\" width=\"320px\" height=\"320px\" fit=\"meet\"/><region id=\"Text\" left=\"0\" top=\"320\" width=\"320px\" he
03-02 20:30:01.941   298   298 I Gecko   : -@- MmsService: Broadcasting the MMS system message: sms-received

03-02 20:30:43.282   298   298 I Gecko   : -@- MmsService: receiveWapPush: msg = {"headers":{"x-mms-message-type":130,"x-mms-transaction-id":"iok160CZFm20","x-mms-mms-version":16,"from":{"address":"15809249119","type":"PLMN"},"x-mms-message-class":"personal","x-mms-message-size":2287,"x-mms-expiry":259200,"x-mms-content-location":{"uri":"http://192.232.123.180:8080/Eo0p51"}},"type":130}
03-02 20:30:43.432   298   298 I Gecko   : -@- MmsService: acquire: buffer the MMS request and setup the MMS data call.
03-02 20:30:44.341   298   298 I Gecko   : -@- MmsService: Got the MMS network connected! Resend the buffered MMS requests: number: 1
03-02 20:30:45.001   298   298 I Gecko   : -@- MmsService: retrievedMessage = {"headers":{"x-mms-message-type":132,"x-mms-transaction-id":"iok160CZFm20","x-mms-mms-version":16,"message-id":"030515363781000902233","date":"2014-03-05T07:30:36.000Z","from":{"address":"+8615809249119","type":"PLMN"},"to":{"address":"8618629569640","type":"PLMN"},"x-mms-message-class":"personal","x-mms-priority":129,"x-mms-delivery-report":false,"x-mms-read-report":false,"content-type":{"media":"application/vnd.wap.multipart.related","params":{"start":"<start>","type":"application/smil"}}},"type":132,"parts":[{"index":0,"headers":{"content-type":{"media":"application/smil","params":{"charset":{"charset":"us-ascii"}}},"content-length":355,"content-id":"<start>","content-location":"smil.smi"},"content":"<smil><head><layout><root-layout width=\"320px\" height=\"480px\"/><region id=\"Image\" left=\"0\" top=\"0\" width=\"320px\" height=\"320px\" fit=\"meet\"/><region id=\"Text\" left=\"0\" top=\"320\" width=\"320px\" he
03-02 20:30:45.141   298   298 I Gecko   : -@- MmsService: Broadcasting the MMS system message: sms-received

(In reply to 田旻 from comment #30)
> Created attachment 8385924 [details]
> sms.log
> 
> adb log with ril and mms debug enable and timestamp
Flags: needinfo?(tianm)
I didn't get qxdm log because my software is expired, but I will give it by other's computer.
Flags: needinfo?(tianm)
Attached file adb_log.txt
This is the adb log you need.
Hi 田旻,

From the log,
I only saw an incoming MMS instead of an outgoing MMS and it took only 18 seconds to download.
The scenario is totally different from the scenario mentioned in this bug.
Can you also point out the timestamp when you sent the MMS. It will be more easier to see if log is captured.

In addition, for sending MMS, we suppose to see a log like this:
'-@- MmsService: send: aParams:'

Regards,
Bevis Tseng

--
03-05 18:18:38.331   291   291 I Gecko   : -@- MmsService: receiveWapPush: msg
03-05 18:18:38.641   291   291 I Gecko   : -@- MmsService: acquire: buffer the MMS request and setup the MMS data call.
03-05 18:18:46.571   291   291 I Gecko   : -@- MmsService: Got the MMS network connected! Resend the buffered MMS requests: number: 1
03-05 18:18:46.571   291   291 I Gecko   : -@- MmsService: sendRequest: register proxy filter to http://211.137.118.50:188/5eghi1
03-05 18:18:46.611   291   291 I Gecko   : -@- MmsService: applyFilter: MMSC/Content Location is ...
Flags: needinfo?(tianm)
Attached file MMS Log
Attaching new logcat using the script from Comment 26. 

As a side note, It's strange because that script includes the exact same steps I did manually, although in the previous log I realized "Console enabled" wasn't checked in the dev menu while running the log, could that be the cause?

Will attach the .pcap file momentarily.
Flags: needinfo?(jzimbrick)
Attachment #8385621 - Attachment is obsolete: true
Attached file TCPDump capture.pcap
Another Note: During this logcat, the MMS actually seemed to take too long to send, timing out around 30 minutes. In the past I've seen them send after 15-20 minutes when this issue occurs.

This was tested on the latest 1.3 Buri build.

1.3 Environmental Variables:
Device: Buri
BuildID: 20140305004003
Gaia: 9b29f80309212bd80fbdcf2882458b1ae8b30093
Gecko: b3478af0151e
Version: 28.0
Base Image: V1.2-device.cfg
Keywords: qawanted
Can't load qxdm log, because it is too large and system server reject for its exceed its limitation.
Flags: needinfo?(tianm)
Hi Zimbrick,

Thanks for providing the logs. 

After comparing logcat & tcpdump, we found that when the issue happened, 
the MMS transaction was actually sent via wifi instead of MMS PDP connection.

We wondered that there is some timing issue in current design between
NetworkManager (resolve mmsc/proxy host name and set its address into routing table after mms pdp is connected.) 
and MmsService(Send MMS after MMS PDP is connected).

To clarify this, we need to enable more logs in NetworkManager related modules.

Would you please help us to collect the adb logcat(mms/ril/NetworkManager enabled) & tcpdump again?

To enable NetworkManager related logs,
1. mkdir tmp
2. adb pull /system/b2g/omni.ja tmp/
3. cd tmp
4. unzip omni.ja
5. rm omni.ja
6. find out |NetworkManager.js|, |NetworkService.js|, |net_worker.js| in |components| folder
   and modify these files to change |DEBUG = false| to |true|.
7. zip -qr9XD omni.ja *
8. adb remount
9. adb push omni.ja /system/b2g/omni.ja
10. reboot the device to start the test. (Now you suppose to be able to see new logs tagged with |NetworkManager|.

Thanks,
Bevis
Flags: needinfo?(jzimbrick)
Keywords: qawanted
Attached file MMS Log 2
Attaching logcat with the extra steps from Comment 39 enabled.

In this log the MMS sent after about 20 minutes.
Attached file TCPDump capture.pcap 2
Flags: needinfo?(jzimbrick)
Keywords: qawanted
PM Triage: Stays 1.3+
Update finding so far.

After checking the logs in comment 40, comment 41,
There is one race condition that cause the routing table possibly wrong while sending mms.

Once a mms data connection is established:
1. RadioInterfaceLayer will register a new RadioInterface to NetworkManager and 
   notify observers of |network-interface-state-changed|
2. NetworkManager will try to resolve mmsc host and add it to route.
3. When NetworkManager receives |network-interface-state-changed|, 
   it will remove old routes and add new routes.
4. MmsService also relies on the same notification of 
   |network-interface-state-changed| to start MMS transactions.

The problem is that the order of receiving |network-interface-state-changed| is always be 
MmsService, then NetworkManager, even MmsService is created later then NetworkManager.
This might cause the scenario of this bug because 
there are some asynchronized requests ongoing under both MmsService and NetworkManager.

Possible solution is that:
1. New design to ensure everything is done before notify to outer modules 
   who cares the state of the connection. (Big change)
2. How to prioritize the order of the notification for MmsService and NetworkManager?
3. Don't remove old routes in step 3 mentioned above.
   Need to discuss possible impact if we keep old routes. (Likely to be a workaround)
4. Can NetworkManager provide a internal notification to MmsService for MmsService to 
   be aware of the correct state for processing MMS transaction? (Likely to be a workaround)

Need to more discussion internally figure out the solution.
(In reply to Bevis Tseng [:bevistseng] (btseng@mozilla.com) from comment #43)
> Possible solution is that:
> 1. New design to ensure everything is done before notify to outer modules 
>    who cares the state of the connection. (Big change)
> 2. How to prioritize the order of the notification for MmsService and
> NetworkManager?
> 3. Don't remove old routes in step 3 mentioned above.
>    Need to discuss possible impact if we keep old routes. (Likely to be a
> workaround)
> 4. Can NetworkManager provide a internal notification to MmsService for
> MmsService to 
>    be aware of the correct state for processing MMS transaction? (Likely to
> be a workaround)

5. Only remove/add routes when needed at step 3.
This is a workaround to ensure that the mms transaction is started after routing is set by NetworkManager.
Attachment #8387522 - Flags: review?(vyang)
Comment on attachment 8387522 [details] [diff] [review]
Patch: Define additional notification between NetworkManager and MmsService to prevent the race condition of setting routes and sending mms. r=vyang

Review of attachment 8387522 [details] [diff] [review]:
-----------------------------------------------------------------

I found there are still other places listening to that "network-interface-state-changed" event.  So this means most clients are using that original event name, and they suffer from the same problem for sure, while MmsService uses the new one.  Somehow my brain has only 32MB memory, so I can't remember as much things as all the exceptions to create here, and I bet, as time goes by, most people do the same because humans are meant to forget.  Please correct all "network-interface-state-changed" occurrences except those issued from nsINetworkInterface instances.
Attachment #8387522 - Flags: review?(vyang)
Roger that! 
We should not introduce potential maintenance problem while fixing bug.

(In reply to Vicamo Yang [:vicamo][:vyang] from comment #46)
> Comment on attachment 8387522 [details] [diff] [review]
> Patch: Define additional notification between NetworkManager and MmsService
> to prevent the race condition of setting routes and sending mms. r=vyang
> 
> Review of attachment 8387522 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I found there are still other places listening to that
> "network-interface-state-changed" event.  So this means most clients are
> using that original event name, and they suffer from the same problem for
> sure, while MmsService uses the new one.  Somehow my brain has only 32MB
> memory, so I can't remember as much things as all the exceptions to create
> here, and I bet, as time goes by, most people do the same because humans are
> meant to forget.  Please correct all "network-interface-state-changed"
> occurrences except those issued from nsINetworkInterface instances.
Attachment #8387522 - Attachment is obsolete: true
Attachment #8388318 - Flags: review?(vyang)
Comment on attachment 8388318 [details] [diff] [review]
Patch_v2: Define new notification between NetworkManager and outer modules to prevent the race condition of setting routes and using network interface. r=vyang.

Review of attachment 8388318 [details] [diff] [review]:
-----------------------------------------------------------------

RadioInterfaceLayer::observe() listens to "network-interface-state-changed" to send SNTP requests as well.

::: dom/system/gonk/NetworkManager.js
@@ +280,5 @@
> +#ifdef MOZ_B2G_RIL
> +        // Notify outer modules like MmsService to start the transaction after
> +        // the configuration of the network interface is done.
> +        Services.obs.notifyObservers(network, TOPIC_CONNECTION_STATE_CHANGED, null);
> +

nit: redundant empty line
Attachment #8388318 - Flags: review?(vyang)
Comment on attachment 8388913 [details] [diff] [review]
Patch v2: [1.3] Define new notification between NetworkManager and outer modules to prevent the race condition of setting routes and using network interface. r=vyang. a=1.3

Need to fix the problem mentioned in comment 50 for 1.3 as well.
Attachment #8388913 - Flags: review?(vyang)
(In reply to Vicamo Yang [:vicamo][:vyang] from comment #50)
> RadioInterfaceLayer::observe() listens to "network-interface-state-changed"
> to send SNTP requests as well.

Thanks for catching this. I'll fix and verify sntp part.
Attachment #8388913 - Attachment is obsolete: true
Comment on attachment 8389010 [details] [diff] [review]
Patch v3: [1.3] Define new notification between NetworkManager and outer modules to prevent the race condition of setting routes and using network interface. r=vyang. a=1.3+

Fix problem mentioned in comment 50 for 1.3 as well.
Attachment #8389010 - Flags: review?(vyang)
Attachment #8389009 - Flags: review?(vyang) → review+
Both master/1.3 patch has been verified fix.
Attachment #8389010 - Flags: review?(vyang) → review+
Target Milestone: --- → 1.4 S3 (14mar)
https://hg.mozilla.org/mozilla-central/rev/a6b384b72e10
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Please request approval-mozilla-b2g28 when you feel this is ready for uplift to 1.3.
Comment on attachment 8389010 [details] [diff] [review]
Patch v3: [1.3] Define new notification between NetworkManager and outer modules to prevent the race condition of setting routes and using network interface. r=vyang. a=1.3+

NOTE: This flag is now for security issues only. Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): A long-existing design defect between NetworkManager and other Modules (like MmsService) accessing data connection after network interface is established.
NetworkManager is responsible for setting the routing, etc after network-interface(WiFi, Mobile Data Connection) is established.
Modules like MmsService will start transaction after network-interface is established and configured.
However, both NetworkManager & MmsService observe the same notification which introduces potential race condition that cause MmsService fail to send/receive Mms message correctly.

User impact if declined: Take Mms service for example. Sometimes MMS can not be sent/retrieved because the routing might not be ready before transaction begins.
Testing completed: Yes
Risk to taking this patch (and alternatives if risky): No
String or UUID changes made by this patch: No
Attachment #8389010 - Flags: approval-mozilla-b2g28?
Attachment #8389010 - Flags: approval-mozilla-b2g28? → approval-mozilla-b2g28+
I use the last patch (Patch v3[1.3]) on v1.3 of Mozilla build ID: 20140304164005, but I can still reproduce this bug as before.
(In reply to 田旻 from comment #63)
> I use the last patch (Patch v3[1.3]) on v1.3 of Mozilla build ID:
> 20140304164005, but I can still reproduce this bug as before.

JZimbrick - Can you reproduce this on the latest 1.3?
Flags: needinfo?(jzimbrick)
This is strange. 
Please help to collect the logs with the steps in comment 26 and comment 39 for further analysis.
Can we reopen this bug? Because I think this problem hasn't been fixed now.
(In reply to 田旻 from comment #66)
> Can we reopen this bug? Because I think this problem hasn't been fixed now.

Can we have to corresponding log mentioned in comment 65 
to identify if they are the same root cause or not?

We are also waiting for the test result with Moz RIL to see if this is still reproducible in our build.
Flags: needinfo?(tianm)
(In reply to Jason Smith [:jsmith] from comment #64)
> (In reply to 田旻 from comment #63)
> > I use the last patch (Patch v3[1.3]) on v1.3 of Mozilla build ID:
> > 20140304164005, but I can still reproduce this bug as before.
> 
> JZimbrick - Can you reproduce this on the latest 1.3?

I was unable to reproduce this issue on the latest Buri 1.3 build. At no time did I see the phone take 20 or more minutes to send an MMS, most messages sent within a few (7-9) seconds.

Though out of thirty attempts I tried, I did see two of the messages take about one minute to send, this may be what Comment 63 is referring to.

Environmental Variables:
Device: Buri
BuildID: 20140319004003
Gaia: f2f2be55dcca1d6775592ef478948045935a91ed
Gecko: 87d90ecfe616
Version: 28.0
Base Image: V1.2-device.cfg
Flags: needinfo?(jzimbrick)
Maybe a QC-related fix is needed here as well? Sounds like we've got this working on Moz RIL right now, but it's still not working on the QC RIL.

mvines - Can you find someone to investigate this?
Flags: needinfo?(mvines)
Thanks, we're good
Flags: needinfo?(mvines)
Tested as working on QCRIL... with some notable issues.

Something seems to be blocking the main thread when trying to send an MMS while data isn't yet connected.
For some reason it is taking 15 seconds to get anything done on the main thread, which causes the MMS to take about 30 seconds longer to send than it should (15 sec to punt something to the main thread + 15 to send out the observer notification).
The exact same actions when MMS service isn't running takes less than 10 milliseconds.
The UI can also become completely unresponsive when this is happening (with both qc's and moz's ril). 
Initial signs points to the MMS service, but I will be looking into that some more and file a separate bug for it.

So for the purpose and scope of this bug, current patch is confirmed as working when using QCRIL. Every MMS was sent under a minute if data call was down, and in just a few seconds if the data call is already up.
Attached video VID_0006.3gp
If you use this video which is less than 300k, but you can't sent out successfully.
Flags: needinfo?(tianm)
Attached video VID_0006.3gp
It would be easier if you'd file new bugs for each issue.

Thanks!
Hi  田旻  -

Per discussion, please close this bug and file a new one to track the larger video cannot be sent via MMS issue

Thanks

Vance
(In reply to Vance Chen [:vchen][vchen@mozilla.com] from comment #75)
> Hi  田旻  -
> 
> Per discussion, please close this bug and file a new one to track the larger
> video cannot be sent via MMS issue
> 
> Thanks
> 
> Vance

OK, I will file a new bug.
Flags: needinfo?(tianm)
(In reply to Vance Chen [:vchen][vchen@mozilla.com] from comment #75)
> Hi  田旻  -
> 
> Per discussion, please close this bug and file a new one to track the larger
> video cannot be sent via MMS issue
> 
> Thanks
> 
> Vance
I have file a new bug about this problem, which no. is 987546. Please close this one.
This one is already closed, thanks !
Flags: in-moztrap?
Flags: in-moztrap? → in-moztrap+
You need to log in before you can comment on or make changes to this bug.