If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

[l-gonk] plugin-container can't connect to media server

RESOLVED WONTFIX

Status

()

Core
Audio/Video
--
critical
RESOLVED WONTFIX
3 years ago
3 years ago

People

(Reporter: Greg Grisco, Assigned: sotaro)

Tracking

(Blocks: 1 bug, {crash})

unspecified
ARM
Gonk (Firefox OS)
crash
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:2.2+)

Details

(Whiteboard: [b2g-crash][caf-crash 433][caf priority: p1][CR 785241], crash signature)

Attachments

(29 attachments, 4 obsolete attachments)

147.93 KB, text/plain
Details
24.13 KB, text/plain
Details
97.60 KB, text/x-log
Details
7.48 KB, patch
Details | Diff | Splinter Review
746.31 KB, text/x-log
Details
667.75 KB, text/x-log
Details
4.76 KB, patch
Details | Diff | Splinter Review
7.35 KB, patch
Details | Diff | Splinter Review
1.53 MB, text/x-log
Details
733 bytes, text/plain
Details
143.06 KB, text/plain
Details
733 bytes, text/plain
Details
143.06 KB, text/plain
Details
214.64 KB, text/plain
Details
161.74 KB, text/plain
Details
89.83 KB, text/plain
Details
149.38 KB, text/plain
Details
82.96 KB, text/plain
Details
164.91 KB, text/plain
Details
83.29 KB, text/plain
Details
157.70 KB, text/plain
Details
76.31 KB, text/plain
Details
163.39 KB, text/plain
Details
7.48 KB, patch
Details | Diff | Splinter Review
85.74 KB, text/plain
Details
165.71 KB, text/plain
Details
85.85 KB, text/plain
Details
157.92 KB, text/plain
Details
4.69 KB, patch
Details | Diff | Splinter Review
(Reporter)

Description

3 years ago
With v2.2, We have seen this crash signature hundreds of times over the past few weeks during monkey test.  No STR available.  cafbot will upload minidump and .extra file.

[@ mozalloc_abort | __android_log_assert | android::OMXClient::connect | android::OmxDecoder::AllocateMediaResources ]
Created attachment 8554035 [details]
EXTRA file attachment -
Created attachment 8554036 [details]
decoded minidump -
(Reporter)

Comment 3

3 years ago
Bhavana, here's another frequent crash we're seeing.
Flags: needinfo?(bbajaj)

Updated

3 years ago
Whiteboard: [CR 785241]

Updated

3 years ago
Whiteboard: [CR 785241] → [caf priority: p1][CR 785241]

Updated

3 years ago
Whiteboard: [caf priority: p1][CR 785241] → [b2g-crash][caf-crash 433][caf priority: p1][CR 785241]
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.044
Moz BuildID: 20150120002507
B2G Version: 2.2
Gecko Version: 37.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=f5b3d1b6cfa3e702033f613915ae637cb735cbfb
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=bb6c4d3fc51281f1a2ee0ff471221dbc1d0d1035
Patches: bug 1067629, bug 1117862, bug 1119980, bug 1091307, bug 1122162

Comment 5

3 years ago
copying Sotaro and Chris Pearce
Component: Gaia::Video → Video/Audio
Product: Firefox OS → Core
Thanks Hema!

Greg, hoping sotaro or chris can help start investigation here or atleast re-direct to the right owner to help here.
blocking-b2g: 2.2? → 2.2+
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(cpearce)
Flags: needinfo?(bbajaj)
(Assignee)

Comment 7

3 years ago
Greg, can we have logcat logs when the problem happens?
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(ggrisco)
(Assignee)

Comment 8

3 years ago
Greg, can we have logcat logs when the problem happens? And from when, does the problem happen?
(Assignee)

Comment 9

3 years ago
OMXClient did not have a change that seems to affect to this. Other places change seems to be affected.
(Assignee)

Comment 10

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #9)
> OMXClient did not have a change that seems to affect to this. Other places
> change seems to be affected.

correction:
OMXClient did not have a change that seems to affect to this.
(Assignee)

Comment 11

3 years ago
In attachment 8554036 [details], ASSERT happens at OMXClient::connect(). It says that media server did not exist because of a crash. Therefore, attachment 8554036 [details] does not have an information except media server crash.

Therefore, logcat log that catch the crash is necessary to investigate more.
(Assignee)

Comment 12

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #10)
> (In reply to Sotaro Ikeda [:sotaro] from comment #9)
> > OMXClient did not have a change that seems to affect to this. Other places
> > change seems to be affected.
> 
> correction:
> OMXClient did not have a change that seems to affect to this.

Sorry, OmxDecoder is correct workd...
(Assignee)

Updated

3 years ago
Flags: needinfo?(ggrisco)
(Assignee)

Comment 13

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #8)
> Greg, can we have logcat logs when the problem happens? And from when, does
> the problem happen?

Greg, can you also get kernel log(dmesg)?
Flags: needinfo?(ggrisco)
(Reporter)

Comment 14

3 years ago
Created attachment 8554768 [details]
dmesg.log
Flags: needinfo?(ggrisco)
(Reporter)

Comment 15

3 years ago
Created attachment 8554771 [details]
main.log
(Reporter)

Comment 16

3 years ago
Ok, uploaded dmesg and android logs.
Flags: needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 17

3 years ago
Greg, were attachment 8554768 [details] and attachment 8554771 [details] got same time to logs of Comment 1 and Comment 2?
And can you provide the hardware and RAM config information?
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(ggrisco)
(Assignee)

Comment 18

3 years ago
attachment 8554771 [details] says that b2g process was restarted during testing. It seems to be triggered by slogger error. slogger seems not mozilla's module.
 

The following is from attachment 8554771 [details].
--------------------------------------------------------------------------------------

01-01 00:03:31.358   960   962 E slogger : Raw Log write error. Peer has disconnected
01-01 00:03:31.360   960   962 E slogger : Raw Log shutdown. Waiting on reload signal...
01-01 00:03:31.363   960   960 W slogger : Active: 0
01-01 00:03:31.363   960   960 W slogger : Log cleanup: ref.log
01-01 00:03:31.366   960   960 W slogger : Log cleanup: dmesg.log
01-01 00:03:31.370   960   960 W slogger : Log cleanup: main.log
01-01 00:03:31.372   960   960 W slogger : Log cleanup: radio.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: events.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: meminfo.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: battery.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: ion.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: kgsl.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: buddyinfo.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: procrank.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: top.log
01-01 00:03:31.374   960   960 W slogger : Log cleanup: lsof.log
01-01 00:03:31.375   960   960 W slogger : Log cleanup: b2g-info.log
01-01 00:03:31.375   960   960 W slogger : Log cleanup: b2g-ps.log
01-01 00:03:31.375   960   960 W slogger : Log cleanup: b2g-procrank.log
01-01 00:03:31.375   960   960 W slogger : Log cleanup: b2gmemreport.log
01-01 00:03:32.360   960   962 W slogger : Raw Log stop
01-01 00:03:34.434   246   876 D QLOC-OSAgent: Subscriber already registerd XTWiFi-PE
01-01 00:03:35.880   246   876 D QLOC-OSAgent: Subscriber already registerd XTWiFi-PE
01-01 00:03:37.221   237   857 D AudioFlinger: setProcessedAudioState(trackId=55, processed=0)


The following says that b2g process was killed/re-started.

01-01 00:03:37.239   217   217 I ServiceManager: service 'media.resource_manager' died
01-01 00:03:37.239   217   217 I ServiceManager: service 'SurfaceFlinger' died
01-01 00:03:37.239   217   217 I ServiceManager: service 'permission' died
01-01 00:03:37.239   217   217 I ServiceManager: service 'display.qservice' died
(Assignee)

Comment 19

3 years ago
Greg, do you know if slogger is codeaurora's module?
(Reporter)

Comment 20

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #19)
> Greg, do you know if slogger is codeaurora's module?

Hi Sotaro, yes, slogger is an internal CAF utility.  Let me see if I can figure out why this is happening.  I'll close this for now and re-open if needed.  Thanks.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Flags: needinfo?(ggrisco)
Resolution: --- → INVALID
(Assignee)

Comment 21

3 years ago
no problem :-)
(Reporter)

Comment 22

3 years ago
Created attachment 8555441 [details]
decoded minidump of crash
Attachment #8554035 - Attachment is obsolete: true
Attachment #8554036 - Attachment is obsolete: true
Attachment #8554768 - Attachment is obsolete: true
Attachment #8554771 - Attachment is obsolete: true
(Reporter)

Comment 23

3 years ago
Created attachment 8555442 [details]
EXTRA file attachment (plus some logs)
(Reporter)

Comment 24

3 years ago
Created attachment 8555443 [details]
dmesg.log
(Reporter)

Comment 25

3 years ago
I reproduced this by first removing our slogger utility.  Then all I did was start Video app with an sdcard inserted (that contained some videos). The app opened but complained that there was another instance of the app running (which shouldn't be the case).  So I closed the app and that's when I saw the crash.  It's pretty repeatable, so please let me know if there's any extra information I can gather for you.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
(Assignee)

Comment 26

3 years ago
(In reply to Greg Grisco from comment #25)
> I reproduced this by first removing our slogger utility.  Then all I did was
> start Video app with an sdcard inserted (that contained some videos). The
> app opened but complained that there was another instance of the app running
> (which shouldn't be the case).  So I closed the app and that's when I saw
> the crash.  It's pretty repeatable, so please let me know if there's any
> extra information I can gather for you.

Greg, can you explain more explicitly or can you upload the screen shot when "there was another instance of the app running". It seems different problem than the crash.
Flags: needinfo?(ggrisco)
(Assignee)

Comment 27

3 years ago
(In reply to Greg Grisco from comment #25)
> I reproduced this by first removing our slogger utility.  Then all I did was
> start Video app with an sdcard inserted (that contained some videos).

Greg, Is the message is like the following?
- "Cannot play video: Another app is currently using the video player"

Video app actually does not have a such capability. Therefore, the video app just emulate it just by some timer. Similar bug to it is Bug 1112412.
(Assignee)

Updated

3 years ago
Assignee: nobody → sotaro.ikeda.g
(Assignee)

Comment 28

3 years ago
(In reply to Greg Grisco from comment #25)
> I reproduced this by first removing our slogger utility.  Then all I did was
> start Video app with an sdcard inserted (that contained some videos). The
> app opened but complained that there was another instance of the app running
> (which shouldn't be the case).  So I closed the app and that's when I saw
> the crash.  It's pretty repeatable, so please let me know if there's any
> extra information I can gather for you.

Greg, can you provide a set of logs when removing our slogger utility?
(Assignee)

Comment 29

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #28)
> (In reply to Greg Grisco from comment #25)
> > I reproduced this by first removing our slogger utility.  Then all I did was
> > start Video app with an sdcard inserted (that contained some videos). The
> > app opened but complained that there was another instance of the app running
> > (which shouldn't be the case).  So I closed the app and that's when I saw
> > the crash.  It's pretty repeatable, so please let me know if there's any
> > extra information I can gather for you.
> 
> Greg, can you provide a set of logs when removing our slogger utility?

Sorry, it is not necessary. logs of Comment 22, 23, 24 seems new log and does not have b2g process restart.
(Assignee)

Comment 30

3 years ago
attachment 8555442 [details] have the following. It seems to say that the device was oom.

> 01-01 02:52:14.148  1398  1452 E Gecko   : mozalloc_abort: Redirecting call to abort() to mozalloc_abort
(Assignee)

Comment 31

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #30)
> attachment 8555442 [details] have the following. It seems to say that the
> device was oom.
> 
> > 01-01 02:52:14.148  1398  1452 E Gecko   : mozalloc_abort: Redirecting call to abort() to mozalloc_abort

One strange thing is that the crash log(attachment 8555441 [details] ) say that there is no mediaserver.
(Reporter)

Comment 32

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #27)
> (In reply to Greg Grisco from comment #25)
> 
> Greg, Is the message is like the following?
> - "Cannot play video: Another app is currently using the video player"

Yes, that is the exact message.  I just reproduced and saw the message.  The app crashes right after that.  Now I have been able to reproduce only 50% of the time.  Other times I am able to play video (although no audio is being played).
Flags: needinfo?(ggrisco)
(Assignee)

Comment 33

3 years ago
(In reply to Greg Grisco from comment #25)
> I reproduced this by first removing our slogger utility.  Then all I did was
> start Video app with an sdcard inserted (that contained some videos). The
> app opened but complained that there was another instance of the app running
> (which shouldn't be the case).  So I closed the app and that's when I saw
> the crash.  It's pretty repeatable, so please let me know if there's any
> extra information I can gather for you.

Greg, did you reproduce the crash just by opening one video app? Can you write more detailed step(STR) to reproduce the crash?
Flags: needinfo?(ggrisco)
(Reporter)

Comment 34

3 years ago
Here are the steps I used to reproduce:

1. reboot phone
2. select Video app

video images are displayed

3. select a video to play
4. if the video starts to play, go to step 1.  But after only two iterations of this I saw the "Cannot play video" message followed by crashs.
Flags: needinfo?(ggrisco) → needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 35

3 years ago
Greg, thanks. I have an additional question. Can you answer the following questions?

- In which version of gonk is used for testing? (Gonk KitKat?).
- On which hardware does the test is done?
- Can you upload OMXClient.cpp file used for the ROM?
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(ggrisco)
(Assignee)

Comment 36

3 years ago
(In reply to Greg Grisco from comment #34)
> Here are the steps I used to reproduce:
> 
> 1. reboot phone
> 2. select Video app

Greg, in the above "1", how did you reboot the phone. Did you do power off and on? Or, did you just reboot b2g?
(Reporter)

Comment 37

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #35)
> Greg, thanks. I have an additional question. Can you answer the following
> questions?
> 
> - In which version of gonk is used for testing? (Gonk KitKat?).
> - On which hardware does the test is done?
> - Can you upload OMXClient.cpp file used for the ROM?

This was on L-based build on reference hardware.  Gonk version is listed in comment 4.

(In reply to Sotaro Ikeda [:sotaro] from comment #36)
> (In reply to Greg Grisco from comment #34)
> > Here are the steps I used to reproduce:
> > 
> > 1. reboot phone
> > 2. select Video app
> 
> Greg, in the above "1", how did you reboot the phone. Did you do power off
> and on? Or, did you just reboot b2g?

The crash occurs in both scenarios.
Flags: needinfo?(ggrisco)
(Assignee)

Comment 38

3 years ago
Used OMXClient.cpp seems like the following.
https://www.codeaurora.org/cgit/quic/la/platform/frameworks/av/tree/media/libstagefright/OMXClient.cpp?h=mozilla/b2g-5.0.0_r6
(Assignee)

Comment 39

3 years ago
Wired thing about this bug is that logcat says that the crash happened because of memory allocation failure.

> Gecko   : mozalloc_abort: Redirecting call to abort() to mozalloc_abort

But crash log says that android::OMXClient::connect() failure calls abort.
(Assignee)

Comment 40

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #38)
> Used OMXClient.cpp seems like the following.
> https://www.codeaurora.org/cgit/quic/la/platform/frameworks/av/tree/media/
> libstagefright/OMXClient.cpp?h=mozilla/b2g-5.0.0_r6

If the above code is correct, when abort was called in OMXClient::connect(), it succeeded to get IServiceManager and IMediaPlayerService. But failed to get IOMX. IOMX should always succeed when IMediaPlayerService exists.
(Assignee)

Comment 41

3 years ago
Another wired thing is that if the crash happened by OMXClient::connect() failure, the following log should exit in logcat log.

> F OMXClient: frameworks/av/media/libstagefright/OMXClient.cpp:406 CHECK(mOMX.get() != NULL) failed.
(Reporter)

Comment 42

3 years ago
Maybe mOMX is null?
(Assignee)

Comment 43

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #39)
> Wired thing about this bug is that logcat says that the crash happened
> because of memory allocation failure.
> 
> > Gecko   : mozalloc_abort: Redirecting call to abort() to mozalloc_abort
> 
> But crash log says that android::OMXClient::connect() failure calls abort.

There seems no inconsistency about the above. In b2g and b2g container process, abort() call is redirected to mozalloc_abort() call. attachment 8555441 [details] shows mozalloc_abort() call. And mozalloc_abort() prints the following log.

> Gecko   : mozalloc_abort: Redirecting call to abort() to mozalloc_abort
(Assignee)

Comment 44

3 years ago
(In reply to Greg Grisco from comment #42)
> Maybe mOMX is null?

There is such possibility. But MediaPlayerService::getOMX() always return valid OMX.

https://www.codeaurora.org/cgit/quic/la/platform/frameworks/av/tree/media/libmediaplayerservice/MediaPlayerService.cpp?h=LA.AF.1.1.1#n357
(Assignee)

Comment 45

3 years ago
Created attachment 8556062 [details] [diff] [review]
log patch - framework/av

Add some logs around OMXClient and OMX.
(Assignee)

Comment 46

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #45)
> Created attachment 8556062 [details] [diff] [review]
> log patch - framework/av
> 
> Add some logs around OMXClient and OMX.

I checked the patch on nexus5-l. Some how, I rarely saw log around OMXClient, although log around MediaPlayerService almost always seems to be printed to logcat.
(Assignee)

Comment 47

3 years ago
Greg, can you test again with attachment 8556062 [details] [diff] [review]? some log seems not always printed to logcat, but the additional log could help for debugging. Thanks.
Flags: needinfo?(ggrisco)
(Reporter)

Comment 48

3 years ago
Created attachment 8556165 [details]
video_works.log

This is what the log looks like when I'm able to play video without crash.  Extra logging from attachment 8556062 [details] [diff] [review] is applied.
Flags: needinfo?(ggrisco)
(Reporter)

Comment 49

3 years ago
Created attachment 8556166 [details]
video_crashes.log

This is what the log looks like when Video crashes.  Extra logging from attachment 8556062 [details] [diff] [review] is applied.
(Assignee)

Comment 50

3 years ago
Thanks! When the problem happens, somehow, an application seems to failed to call MediaPlayerService::getOMX(). The function call log exist only on video_works.log(attachment 8556165 [details]).

But it is still not clear why it failed.
(Assignee)

Comment 51

3 years ago
I re-checked decoded minidump of the crash attachment 8555441 [details]. I recognized one wired thing. It has all threads stacks in the crash process. But it does not have IPC Pool threads. They are created by android::ProcessState::self()->startThreadPool() call. The function is called during application process initialization.
(Assignee)

Comment 52

3 years ago
The problem seems to happen around android binder ipc.
(Assignee)

Comment 53

3 years ago
If we have msm8909 device, it might becomes easier to investigate this problem...
(Reporter)

Comment 54

3 years ago
Sotaro, is there any other information that I collect for you since you don't have 8909?
(Reporter)

Updated

3 years ago
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(cpearce)
(Assignee)

Comment 55

3 years ago
(In reply to Greg Grisco from comment #54)
> Sotaro, is there any other information that I collect for you since you
> don't have 8909?

Thanks, I will prepare a new log patch tomorrow(Thursday).
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 56

3 years ago
Created attachment 8559996 [details] [diff] [review]
log patch - gecko
Flags: needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 57

3 years ago
Created attachment 8559997 [details] [diff] [review]
log patch - framework/native
(Assignee)

Comment 58

3 years ago
Greg, can you get a log with the following patches applied?

  - log patch - gecko: attachment 8559996 [details] [diff] [review]
  - log patch - framework/native: attachment 8559997 [details] [diff] [review]
  - log patch - framework/av: attachment 8556062 [details] [diff] [review]
Flags: needinfo?(ggrisco)
(Reporter)

Comment 59

3 years ago
Hi Sotaro, some of these patches no longer apply.  I'm trying to fix them up but will have to finish it tomorrow.  I hope to have some logs for you to look at tomorrow morning, thanks.
(Reporter)

Comment 60

3 years ago
Created attachment 8560553 [details]
music_video.log

Now it's getting hard to reproduce this bug.  We were seeing hundreds of these crashes in past couple of weeks, but last two days automation hasn't produced any crashes.  However, I can see the "Another app is currently using the video player." message after I follow these steps:

1. Start Music app
2. Select music track to play

With some content, I'm seeing the Music app freeze (bug 1129685)

3. Kill the Music app
4. Start the Video app

Now I see the message.

I collected logs for this with logging patches applied.
Flags: needinfo?(ggrisco)
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.065
Moz BuildID: 20150205002503
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.065.xml?h=release
Gecko Version: 37.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=c2047a46e29696238e9b4c9caaba47736421449a
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=7927bc87af9e30bb8fc9cb00fc7608a1002ea3f0
Patches: bug 1067629, bug 1117862, bug 1091307, bug 1106637
(Assignee)

Comment 62

3 years ago
(In reply to Greg Grisco from comment #60)
> Created attachment 8560553 [details]
> music_video.log
> 
> Now it's getting hard to reproduce this bug.  We were seeing hundreds of
> these crashes in past couple of weeks, but last two days automation hasn't
> produced any crashes.  However, I can see the "Another app is currently
> using the video player." message after I follow these steps:
> 
> 1. Start Music app
> 2. Select music track to play
> 
> With some content, I'm seeing the Music app freeze (bug 1129685)
> 
> 3. Kill the Music app
> 4. Start the Video app
> 
> Now I see the message.
> 
> I collected logs for this with logging patches applied.

This seems like a different problem/bug. The log is not optimum for this symptom. But from the log, there seems to have a problem to "input data read" of OMXCodec from DataSource. It might be cause by same problem as bug 1129685.
(Assignee)

Comment 63

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #62)
> (In reply to Greg Grisco from comment #60)
> > Created attachment 8560553 [details]
> > music_video.log
> > 
> > Now it's getting hard to reproduce this bug.  We were seeing hundreds of
> > these crashes in past couple of weeks, but last two days automation hasn't
> > produced any crashes.  However, I can see the "Another app is currently
> > using the video player." message after I follow these steps:

Video tag does not expose such capability. Therefore, the above status is detected by heuristics by using event response time. Therefore, the above message normally say, gecko MediaDecoder is not emitting the event withing 2 second or so.
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.068
Moz BuildID: 20150205002503
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.068.xml?h=release
Gecko Version: 37.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=c2047a46e29696238e9b4c9caaba47736421449a
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=7927bc87af9e30bb8fc9cb00fc7608a1002ea3f0
Patches: bug 1130271, bug 1067629, bug 1125418, bug 1091307
(Reporter)

Comment 65

3 years ago
Although I have trouble reproducing this manually now, it still reproduces in monkey test.  I'm trying to get a log with the logging patches applied.  Will share those when I get them.
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.073
Moz BuildID: 20150212002504
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.073.xml?h=release
Gecko Version: 37.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=791e53728cd8018f1d7cf7efe06bbeb1179f0370
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2a90b4d5194b06c63ebcf0377703126b08349b92
Patches: bug 1130271, bug 1067629, bug 1125418, bug 1091307, bug 1130196
Created attachment 8564490 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.073
Created attachment 8564491 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.073
(Assignee)

Comment 69

3 years ago
Greg, Comment 67 and Comment 68 does not have logcat data. Can we have the logcat log?
Flags: needinfo?(ggrisco)
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.074
Moz BuildID: 20150214002504
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.074.xml?h=release
Gecko Version: 37.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=ea64caf6d4ab03fc4472eca9f41f20d651d55fa9
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=f0efef29cfa937da4d8497636cdae4b978d93674
Patches: bug 1091307, bug 1067629, bug 1125418, bug 1130271, bug 1130196
Created attachment 8565019 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.074
Created attachment 8565020 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.074
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.075
Moz BuildID: 20150216002503
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.075.xml?h=release
Gecko Version: 37.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=ea64caf6d4ab03fc4472eca9f41f20d651d55fa9
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=f9bcea955c11dd8e3268b1d84e011145eec8299f
Patches: bug 1130271, bug 1067629, bug 1125418, bug 1091307, bug 1130196
Created attachment 8565291 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.075
Created attachment 8565292 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.075
(Reporter)

Comment 76

3 years ago
It seems this exact crash is not reproducing anymore with the logging patch applied.  Could just be a coincidence, but I'll leave the patch in for a while longer to see if we get anything.
Flags: needinfo?(ggrisco)
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.088
Moz BuildID: 20150227002521
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.088.xml?h=release
Gecko Version: 37.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=eb6a5ac9081d3962198e0f4520b0743d716d7a27
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=1545ecabf0082e7bd38cd35577f757e0a517cbd0
Patches: bug 1067629, bug 1091307, bug 1132528
Created attachment 8571002 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.088
Created attachment 8571003 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.088
(Assignee)

Comment 80

3 years ago
attachment 8571002 [details] has the followings. From them, mediaserver seems not started yet.

> ServiceManager: Waiting for service media.camera...

> ServiceManager: Waiting for service media.player...
(Assignee)

Comment 81

3 years ago
From comment 80, there might be the following possibilities.

- [1] mediaserver's initialization takes too long time
- [2] mediaserver's initialization became stuck
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.089
Moz BuildID: 20150301002505
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.089.xml?h=release
Gecko Version: 37.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=77609916ca5ab721150fab2b7bc5c37f43ee3a5a
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=69679bf309a149fce64043bbdb78e87c3905b626
Patches: bug 1067629, bug 1133147, bug 1132528, bug 1091307
Created attachment 8571203 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.089
Created attachment 8571204 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.089
(Reporter)

Comment 85

3 years ago
BTW, I had removed the logging patch this weekend and now we're seeing crashes again.  It seems that the extra logging was preventing the crash from happening.
(Assignee)

Comment 86

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #81)
> From comment 80, there might be the following possibilities.
> 
> - [1] mediaserver's initialization takes too long time
> - [2] mediaserver's initialization became stuck

There is also another possibility.
- [3] Binder IPC sometimes does not work correctly from content process.
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.090
Moz BuildID: 20150301002505
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.090.xml?h=release
Gecko Version: 37.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=77609916ca5ab721150fab2b7bc5c37f43ee3a5a
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=69679bf309a149fce64043bbdb78e87c3905b626
Patches: bug 1091307, bug 1133147, bug 1132528, bug 1067629, bug 1138123
Created attachment 8571812 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.090
Created attachment 8571813 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.090
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.091
Moz BuildID: 20150302002504
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.091.xml?h=release
Gecko Version: 37.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=77609916ca5ab721150fab2b7bc5c37f43ee3a5a
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=69679bf309a149fce64043bbdb78e87c3905b626
Patches: bug 1091307, bug 1133147, bug 1132528, bug 1067629, bug 1138123
Created attachment 8572485 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.091
Created attachment 8572486 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.091

Comment 93

3 years ago
Hi Sotaro,

(In reply to Sotaro Ikeda [:sotaro] from comment #86)
> (In reply to Sotaro Ikeda [:sotaro] from comment #81)
> ...
>
> There is also another possibility.
> - [3] Binder IPC sometimes does not work correctly from content process.

Are you testing that possibility? What next steps can be taken to move this issue forward?

Thanks,
Mike
Flags: needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 94

3 years ago
(In reply to Mike Lee [:mlee] from comment #93)
> 
> Are you testing that possibility? What next steps can be taken to move this
> issue forward?

We do not have a environment for testing. Recently, only device was received, but correct software environment was not provided yet. Therefore, only thing is ask code aurora to add some logs to there environment. But adding a lot of log could cause comment 85.
Flags: needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 95

3 years ago
I am going to create a different patch to check mediaserver's service start up.
(Assignee)

Comment 96

3 years ago
Created attachment 8572867 [details] [diff] [review]
log patch - mediaserver service start up

Add log around mediaserver's service start up.
(Assignee)

Comment 97

3 years ago
Greg, can you add log patch attachment 8572867 [details] [diff] [review] to codeaurora's build? It add log around mediasercices' start up. It seems not add too much log.
Flags: needinfo?(ggrisco)
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.092
Moz BuildID: 20150303002527
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.092.xml?h=release
Gecko Version: 37.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=3d188c414e30acc392253d5389a42352fcfbc183
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=de066d8c423190a3326fb7331c8cac80ae8fbc66
Patches: bug 1067629, bug 1138123, bug 1091307, bug 1133147, bug 1132528, bug 1137875
Created attachment 8573094 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.092
Created attachment 8573095 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.092
(Reporter)

Comment 101

3 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #97)
> Greg, can you add log patch attachment 8572867 [details] [diff] [review] to
> codeaurora's build? It add log around mediasercices' start up. It seems not
> add too much log.

Ok, it should get built today and monkey will run this weekend.  If there are crashes, I (or cafbot) will send you the logs.
Flags: needinfo?(ggrisco)
(Assignee)

Comment 102

3 years ago
Thank you!
(Reporter)

Comment 103

3 years ago
Hi Sotaro, we're able to reproduce more reliably now and think this might be due to race condition at startup.  We're going to debug this now.  I marked as [POVB] for now, will loop you back in when we know more.  Thanks!
Whiteboard: [b2g-crash][caf-crash 433][caf priority: p1][CR 785241] → [POVB][b2g-crash][caf-crash 433][caf priority: p1][CR 785241]
(Assignee)

Comment 104

3 years ago
Thanks for the notification:-)
Observed on: 

Device: msm8909
Gonk Version: AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.093
Moz BuildID: 20150305002528
Manifest: https://www.codeaurora.org/cgit/quic/lf/b2g/manifest/tree/caf_AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.093.xml?h=release
Gecko Version: 37.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=89af288bad6751248ff84504fa898206fee127fe
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=e2d76608d1e418e9d0dc835809074f0cc16b1152
Patches: bug
Created attachment 8573762 [details]
EXTRA file attachment - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.093
Created attachment 8573763 [details]
decoded minidump - AU_LINUX_GECKO_LF.BR.1.2.3.00.00.00.000.093
It turns out that this doesn't really have much to do with media/OMX.  It now looks like on some (all?) L gonks, /system/b2g/plugin-container is unable to connect to the mediaserver.  This can also be reproduced in the Dialer app and Built-in Keyboard app, when they try to play audio.

Example using the Phone app instead of Video app, since it's easier to demonstrate the issue:

BAD
---
$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      260   1     273796 89472 ffffffff b6f43d08 S /system/b2g/b2g
(Nuwa)           0 root      541   260   67472  24452 ffffffff b6f43d08 S /system/b2g/b2g
Homescreen       2 u0_a2728  2728  541   167484 37608 ffffffff b6f43d08 S /system/b2g/b2g
(Preallocated a  2 u0_a3579  3579  541   70720  17596 ffffffff b6f43d08 S /system/b2g/b2g

$ adb shell kill 3579
$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      260   1     268924 88952 ffffffff b6f43d08 S /system/b2g/b2g
(Nuwa)           0 root      541   260   67472  24452 ffffffff b6f43d08 S /system/b2g/b2g
Homescreen       2 u0_a2728  2728  541   166444 38040 ffffffff b6f43d08 S /system/b2g/b2g

<<< Launch Phone app >>>

$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      260   1     244048 90124 ffffffff b6f43d08 S /system/b2g/b2g
(Nuwa)           0 root      541   260   67472  23772 ffffffff b6f43d08 S /system/b2g/b2g
Homescreen       2 u0_a2728  2728  541   98428  30132 ffffffff b6f43d08 S /system/b2g/b2g
Communications   2 u0_a3661  3661  260   91080  37004 ffffffff b6f3f070 R /system/b2g/plugin-container

$ adb logcat | grep Waiting
I/ServiceManager( 3661): Waiting for service media.audio_flinger...
I/ServiceManager( 3661): Waiting for service media.audio_flinger...
I/ServiceManager( 3661): Waiting for service media.audio_flinger...

<<<  The Phone app is completely wedged here >>

GOOD
----
$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      260   1     257484 91604 ffffffff b6f43d08 S /system/b2g/b2g
(Nuwa)           0 root      541   260   67472  23592 ffffffff b6f43d08 S /system/b2g/b2g
Homescreen       2 u0_a2728  2728  541   131876 35916 ffffffff b6f43d08 S /system/b2g/b2g
(Preallocated a  2 u0_a3694  3694  541   70720  17224 ffffffff b6f43d08 S /system/b2g/b2g

<<< Launch Phone app >>>

$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      260   1     246800 91928 ffffffff b6f43d08 S /system/b2g/b2g
(Nuwa)           0 root      541   260   67472  23404 ffffffff b6f43d08 S /system/b2g/b2g
Homescreen       2 u0_a2728  2728  541   98428  29672 ffffffff b6f43d08 S /system/b2g/b2g
Communications   2 u0_a3694  3694  541   89372  28908 ffffffff b6f43d08 S /system/b2g/b2g

$ adb logcat | grep Waiting
null

<<<  The Phone app is working fine here >>
Summary: Assertion failure in OMXDecoder::AllocateMediaResources during monkey test → [l-gonk] plugin-container can't connect to media server
Whiteboard: [POVB][b2g-crash][caf-crash 433][caf priority: p1][CR 785241] → [b2g-crash][caf-crash 433][caf priority: p1][CR 785241]
Once bug 1053634 is fixed plugin-container should never be used and perhaps this bug just disappears.

It would be interesting if the STR from comment 108 reproduce the issue on N5 though, I don't have one to try here.
Depends on: 1053634

Updated

3 years ago
Blocks: 1142220

Updated

3 years ago
No longer blocks: 1142220

Comment 110

3 years ago
Shawn,

Can someone on the Taipei team confirm whether this issue reproduces on the Nexus 5? CAF is trying to understand if this is only an issue with their device.

Thanks,
Mike
Flags: needinfo?(sku)

Comment 111

3 years ago
(In reply to Mike Lee [:mlee] from comment #110)
> Shawn,
> 
> Can someone on the Taipei team confirm whether this issue reproduces on the
> Nexus 5? CAF is trying to understand if this is only an issue with their
> device.
> 
> Thanks,
> Mike


Mike,
There is no such BAD symptom as comment 108 said. 

Blake,
 Once you got time, please can you try check this issue as well?



$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      190   1     314028 117936 ffffffff b6e5b490 S /system/b2g/b2g
(Nuwa)           0 root      385   190   66652  22216 ffffffff b6e5b490 S /system/b2g/b2g
Homescreen       2 u0_a822   822   385   105224 28620 ffffffff b6e5b490 S /system/b2g/b2g
Built-in Keyboa  2 u0_a969   969   190   89904  31744 ffffffff b6e5e490 S /system/b2g/plugin-container
Find My Device   2 u0_a1049  1049  385   74152  22112 ffffffff b6e5b490 S /system/b2g/b2g
(Preallocated a  2 u0_a1776  1776  385   70932  17800 ffffffff b6e5b490 S /system/b2g/b2g

<< Launch communication app >>
$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      190   1     364464 119004 ffffffff b6e5b490 S /system/b2g/b2g
(Nuwa)           0 root      385   190   66652  22216 ffffffff b6e5b490 S /system/b2g/b2g
Homescreen       2 u0_a822   822   385   108304 27896 ffffffff b6e5b490 S /system/b2g/b2g
Built-in Keyboa  2 u0_a969   969   190   89904  31744 ffffffff b6e5e490 S /system/b2g/plugin-container
Find My Device   2 u0_a1049  1049  385   74152  22112 ffffffff b6e5b490 S /system/b2g/b2g
Communications   2 u0_a1776  1776  385   98516  28224 ffffffff b6e5b490 S /system/b2g/b2g

<< Kill communication app >>
$ adb shell kill 1776
$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      190   1     357232 118708 ffffffff b6e5b490 S /system/b2g/b2g
(Nuwa)           0 root      385   190   66652  22216 ffffffff b6e5b490 S /system/b2g/b2g
Homescreen       2 u0_a822   822   385   130172 34664 ffffffff b6e5b490 S /system/b2g/b2g
Built-in Keyboa  2 u0_a969   969   190   89904  31744 ffffffff b6e5e490 S /system/b2g/plugin-container
Find My Device   2 u0_a1049  1049  385   74152  22112 ffffffff b6e5b490 S /system/b2g/b2g
(Preallocated a  2 u0_a2019  2019  385   70932  17996 ffffffff b6e5b490 S /system/b2g/b2g

<< Re-launch communication app >>
$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      190   1     366676 120680 ffffffff b6e5b490 S /system/b2g/b2g
(Nuwa)           0 root      385   190   66652  22216 ffffffff b6e5b490 S /system/b2g/b2g
Homescreen       2 u0_a822   822   385   112492 39960 ffffffff b6e5b490 S /system/b2g/b2g
Built-in Keyboa  2 u0_a969   969   190   89904  31744 ffffffff b6e5e490 S /system/b2g/plugin-container
Find My Device   2 u0_a1049  1049  385   74152  22112 ffffffff b6e5b490 S /system/b2g/b2g
Communications   2 u0_a2019  2019  385   97492  28392 ffffffff b6e5b490 S /system/b2g/b2g

<< Launch video app >>
$ adb shell b2g-ps
APPLICATION    SEC USER     PID   PPID  VSIZE  RSS     WCHAN    PC        NAME
b2g              0 root      190   1     367636 126220 ffffffff b6e5b490 S /system/b2g/b2g
(Nuwa)           0 root      385   190   66652  22216 ffffffff b6e5b490 S /system/b2g/b2g
Homescreen       2 u0_a822   822   385   108304 27636 ffffffff b6e5b490 S /system/b2g/b2g
Built-in Keyboa  2 u0_a969   969   190   89904  31744 ffffffff b6e5e490 S /system/b2g/plugin-container
Find My Device   2 u0_a1049  1049  385   74152  22112 ffffffff b6e5b490 S /system/b2g/b2g
Communications   2 u0_a2019  2019  385   101576 28216 ffffffff b6e5b490 S /system/b2g/b2g
Video            2 u0_a2210  2210  385   92732  25072 ffffffff b6e5b490 S /system/b2g/b2g
Flags: needinfo?(sku) → needinfo?(bwu)
Currently I don't have nexus 5 at my hand. mchiang should be able to help check it. :)
Flags: needinfo?(bwu) → needinfo?(mchiang)
Hi,

I follow the steps in comment 34 and try to reproduce the bug with Nexus 5.
Cannot reproduce it for 10 times.
Since Greg also cannot reproduce the bug manually now, we need to come out a monkey & stress test to hit this problem.

Hi Greg,

According to the log, seems the problem is related to the binder/IPC.
Does your cafbot also record kernel log?
Could you help do the favor:
1) Set cafbot to record all logs from the boot up (We expect to see the log like "MediaPlayerService::instantiate()", which ensures mediaplayerservice indeed instantiate successfully)
2) Provide both logcat log and kernel log.
3) Ask your system team to assist analyzing the kernel log and see if there is any IPC/binder issue. Is that possible there is a binder leakage?

Regards,
Munro
Flags: needinfo?(mchiang)
(Reporter)

Updated

3 years ago
Flags: needinfo?(ggrisco)
(Assignee)

Comment 114

3 years ago
Created attachment 8576783 [details] [diff] [review]
log patch - servicemanager

This log patch is a part of Attachment 8559997 [details] [diff]. Limit log to servicemanager.
(Assignee)

Comment 115

3 years ago
I do not think that "plugin-container can't connect to media server". Just before video decoder instantiation, content process access mediaserver by getMediaResourceManagerService().

getMediaResourceManagerService() access to the media server. IMediaResourceManagerService is used to reserve video decoder resource. The instance of IMediaResourceManagerService exits in b2g process and it is registered to mediaserver. The content process get the interface via mediaserver.

https://dxr.mozilla.org/mozilla-central/source/dom/media/omx/mediaresourcemanager/IMediaResourceManagerDeathNotifier.cpp#45
(Assignee)

Comment 116

3 years ago
Greg, can you add attachment 8576783 [details] [diff] [review]? It make sure if content process access mediasercver.
Flags: needinfo?(ggrisco)
(Assignee)

Comment 117

3 years ago
Another reason of comment 115 is in attachment 8554771 [details] in comment 15. Only this log has b2g process reboot log. During b2g process reboot. ServiceManager always emit some service died messages. But it does not have a died message of mediaserver's services.

Died logs of attachment 8554771 [details] was the following.
--------------------------------------------
01-01 00:03:37.239   217   217 I ServiceManager: service 'media.resource_manager' died
01-01 00:03:37.239   217   217 I ServiceManager: service 'SurfaceFlinger' died
01-01 00:03:37.239   217   217 I ServiceManager: service 'permission' died
01-01 00:03:37.239   217   217 I ServiceManager: service 'display.qservice' died

But we normally expect the followig died logs. b2g process reboot also trigger mediaserver prorocess reboot.
-------------------------------------------------------------------

03-12 13:06:41.790   175   175 I ServiceManager: service 'android.security.keystore' died
03-12 13:06:41.790   175   175 I ServiceManager: service 'media.resource_manager' died
03-12 13:06:41.790   175   175 I ServiceManager: service 'SurfaceFlinger' died
03-12 13:06:41.790   175   175 I ServiceManager: service 'permission' died
03-12 13:06:41.790   175   175 I ServiceManager: service 'display.qservice' died
03-12 13:06:41.791   193   193 D nfcd    :  4001536 of bytes to be sent... data=0x0 ret=0
03-12 13:06:41.893   175   175 I ServiceManager: service 'media.sound_trigger_hw' died
03-12 13:06:41.893   175   175 I ServiceManager: service 'media.audio_flinger' died
03-12 13:06:41.893   175   175 I ServiceManager: service 'media.player' died
03-12 13:06:41.893   175   175 I ServiceManager: service 'media.camera' died
03-12 13:06:41.893   175   175 I ServiceManager: service 'media.audio_policy' died
Flags: needinfo?(mvines)
Flags: needinfo?(ggrisco)
(In reply to mchiang from comment #113)
> I follow the steps in comment 34 and try to reproduce the bug with Nexus 5.
> Cannot reproduce it for 10 times.
 
Thanks for checking!

> Since Greg also cannot reproduce the bug manually now, we need to come out a
> monkey & stress test to hit this problem.

This can now be reproduced 100% over here, please see comment 108 for how.   It's something to do with fd management in the plugin-container process as the issue does not reproduce if plugin-container was not needed and the Video/Dialer/Keyboard app run in a b2g process and certainly may be a POVB.

However I think we should focus our attention on fixing bug 1053634 right now, as this will remove the plugin-container process entirely and as a fringe benefit should resolve this bug (fingers crossed).
Flags: needinfo?(mvines)
Bug 1053634 avoids this issue as hoped so we don't need a fix here.
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago3 years ago
Resolution: --- → WONTFIX
Depends on: 1155547
You need to log in before you can comment on or make changes to this bug.