Closed Bug 1073088 Opened 10 years ago Closed 10 years ago

pthread_kill | raise | __libc_android_abort | ? | close | mozilla::system::OpenFileFinder::Next(mozilla::system::OpenFileFinder::Info*)

Categories

(Firefox OS Graveyard :: Stability, defect, P2)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:2.0+, firefox33 wontfix, firefox34 fixed, firefox35 fixed, b2g-v2.0 unaffected, b2g-v2.1 fixed, b2g-v2.2 fixed)

RESOLVED FIXED
2.1 S6 (10oct)
blocking-b2g 2.0+
Tracking Status
firefox33 --- wontfix
firefox34 --- fixed
firefox35 --- fixed
b2g-v2.0 --- unaffected
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: dhirnyj, Assigned: dhylands)

References

Details

(Keywords: crash, Whiteboard: [caf-crash 372][b2g-crash][caf priority: p1][CR 730173])

Attachments

(6 files, 2 obsolete files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130329030832

Steps to reproduce:

UTRAN: W+G
Hardware used: 8926 MTP
Meta Build location: \\holi\builds516\INTEGRATION\M8926AAAAANFGD200007H.1
Logs content: QPST logs location
Mini dumps location: None
ADB logs & QPST Logs Location: \\range\TargetPST\B2G_8926_2.0\M8926AAAAANFGD200007.1\25sep\1a5a62a_W_Manual_B4_C17
Test steps:
1.Select USB storage ON
2.Check MTP radio button option in USB storage
3.Select media storage and do eject SD card
4.Select USB storage OFF, device got crashed, dumps got collected


Actual results:

Device Crashed, see attached files (to be added later).


Expected results:

Not crashed
[Blocking Requested - why for this release]:
blocking-b2g: --- → 2.1?
OS: All → Gonk (Firefox OS)
Priority: -- → P2
Hardware: All → ARM
Component: Gaia::System → General
Whiteboard: [CR 730173]
Whiteboard: [CR 730173] → [caf priority: p1][CR 730173]
Whiteboard: [caf priority: p1][CR 730173] → [b2g-crash][caf-crash 336][caf priority: p1][CR 730173]
Keywords: crash
Observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_2.0.01.04.00.114.092
Moz BuildID: 20140916000206
B2G Version: 2.1
Gecko Version: 34.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=713448b8963cd53c561f4b38640f8c63b655ce33
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=395370094caa386ebf8fe8831d030bc5484fa599
Patches: bug 1055299, bug 1068978, bug 1067205, bug 1070431, bug 1068394, bug 1051633, bug 1057220
Component: General → Stability
I am NI :dhylands to start here.. as he may be familiar the storage code or will be able to pass this around for investigation.
Flags: needinfo?(dhylands)
blocking-b2g: 2.1? → 2.1+
Flags: needinfo?(mwu)
We may have attached the wrong log, I'm looking into it.
Flags: needinfo?(ggrisco)
Attached file Decoded minidump
Attachment #8496085 - Attachment is obsolete: true
Attached file Extra file dump
Attachment #8496084 - Attachment is obsolete: true
Flags: needinfo?(ggrisco)
Whiteboard: [b2g-crash][caf-crash 336][caf priority: p1][CR 730173] → [b2g-crash][caf priority: p1][CR 730173]
Whiteboard: [b2g-crash][caf priority: p1][CR 730173] → [caf-crash 336][b2g-crash][caf priority: p1][CR 730173]
fwiw, on a flame running current trunk we are not crashing we the steps in comment 0. Can we get QA to do a branch test?
Keywords: qawanted
Dave, please investigate. Assigning this to you. 

Copying Eric Chou too
Assignee: nobody → dhylands
Flags: needinfo?(mwu)
Whiteboard: [caf-crash 336][b2g-crash][caf priority: p1][CR 730173] → [caf-crash 372][b2g-crash][caf priority: p1][CR 730173]
So this is crashing inside some custom version of close that CAF is using.

On our builds, close is found inside of close.S and is just a system call into the kernel.

Which would explain why we aren't seeing any crash - since the kernel will just return -1 with errno set to EBADF and doesn't raise an exception.

So, I'd like to see the source for the custom close to determine why it thinks it should be raising an exception.

In all likelyhood, this is still a bug on our side, but we can't reproduce the problem because we aren't running the same code.
Flags: needinfo?(dhylands)
ni mvines to find out if we'll be allowed to see the source, or if we have to debug this in the dark...
Flags: needinfo?(mvines)
It seems that we're hitting a double close here.

Android's MtpServer::run calls close here:
https://github.com/mozilla-b2g/platform_frameworks_av/blob/master/media/mtp/MtpServer.cpp#L240

and then we call close again in the ScopedClose descriptor:
http://dxr.mozilla.org/mozilla-central/source/dom/system/gonk/MozMtpServer.cpp#207

So we need to forget the file descriptor if we call run.

I'll put together a patch.
Dave -- I guess you figured out the issue but for reference here is the source we are using.
It's available on CAF website:
https://www.codeaurora.org/cgit/quic/la/platform/frameworks/av/tree/?h=LNX.LF.3.5.1_RB1.2
Flags: needinfo?(mvines)
MTP seems to behave the same as before this patch.

CAF will need to verify if it fixes their crash since we don't have the same libc.
Attachment #8496321 - Flags: review?(mwu)
Attachment #8496321 - Flags: review?(mwu) → review+
Comment on attachment 8496321 [details] [diff] [review]
Don't close the mtp file descriptor when MtpServer::run does

Approval Request Comment
[Feature/regressing bug #]: When MTP was added
[User impact if declined]: Most likely None.
[Describe test coverage new/current, TBPL]: I verified that MTP still behaves the same as before the patch (none of our tests catch this problem).
[Risks and why]: Low
[String/UUID change made/needed]: None

This affects MTBF tests being done by CAF. There is a small window where the wrong file handle could be closed (So the user impact isn't none, but it is still a bug).
Attachment #8496321 - Flags: approval-mozilla-aurora?
Keywords: qawanted
(In reply to Inder from comment #15)
> Dave -- I guess you figured out the issue but for reference here is the
> source we are using.
> It's available on CAF website:
> https://www.codeaurora.org/cgit/quic/la/platform/frameworks/av/tree/?h=LNX.
> LF.3.5.1_RB1.2

I meant the source for libc that you're using. Because the stack trace shows that it doesn't match the source that we're using for libc.
Flags: needinfo?(ikumar)
Yeah we have some additional stuff that help us catch issues like this quicker during stability runs.
Flags: needinfo?(ikumar)
https://hg.mozilla.org/mozilla-central/rev/81cc689e42b1
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Attachment #8496321 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
The description of this bug doesn't seem to match the crash dump.

The description says:

pthread_kill | raise | __libc_android_abort | ? | close | mozilla::system::OpenFileFinder::Next(mozilla::system::OpenFileFinder::Info*)

which implies to me that the crash occurred in OpenFileFinder. But OpenFileFnder doesn't show up in the attached crash dump.

Is there another problem?
Flags: needinfo?(dhirnyj)
We're seeing this on 2.0, would like it fixed there.
Flags: needinfo?(bbajaj)
Comment on attachment 8496321 [details] [diff] [review]
Don't close the mtp file descriptor when MtpServer::run does

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): When MTP was added
User impact if declined: Most likely None.
Testing completed: I verified that MTP still behaves the same as before the patch (none of our tests catch this problem).
Risk to taking this patch (and alternatives if risky): Low
String or UUID changes made by this patch: None
Attachment #8496321 - Flags: approval-mozilla-b2g32?
blocking-b2g: 2.1+ → 2.0+
Flags: needinfo?(bbajaj)
Attachment #8496321 - Flags: approval-mozilla-b2g32? → approval-mozilla-b2g32+
(In reply to Greg Grisco from comment #24)
> We're seeing this on 2.0, would like it fixed there.

Bug 1029533 didn't land on v2.0, so I'm not sure what I'm supposed to be uplifting to b2g32 here.
Flags: needinfo?(dhylands)
Greg - are you seeing the title of the bug, or the description?

The backtraces included in this bug are what was fixed and don't match the title.

I put a needinfo on comment 22 to get this clarified, but never received a response.

If you're seeing the title, then we should change the title of this one and open a new bug.
Flags: needinfo?(dhylands) → needinfo?(ggrisco)
Hi Dave, sorry for the confusion.  You're right, we messed up the title on this one.  Also, the minidump that we uploaded was for v2.1, but we see this crash on both v2.0 and v2.1.  Although we haven't seen this crash recur in past two weeks, I think we still might want to fix on v2.0.  Is there any harm in bringing it to 2.0?
Flags: needinfo?(ggrisco) → needinfo?(dhylands)
Greg - so the issue is that the minidump is for a crash in MTP. We never landed MTP on 2.0, so there is no way of uplifting this patch to 2.0 (the files to apply the patch to don't even exist in 2.0).

So that's why there is some confusion about how you could be seeing this in 2.0. Perhaps you're seeing the crash described in the title, and not the crash described by the minidump?
Flags: needinfo?(dhylands) → needinfo?(ggrisco)
(In reply to Dave Hylands [:dhylands] from comment #29)
> Greg - so the issue is that the minidump is for a crash in MTP. We never
> landed MTP on 2.0, so there is no way of uplifting this patch to 2.0 (the
> files to apply the patch to don't even exist in 2.0).
> 
> So that's why there is some confusion about how you could be seeing this in
> 2.0. Perhaps you're seeing the crash described in the title, and not the
> crash described by the minidump?

Ok, I see.  We are seeing same crash signature on v2.0 (cafbot just uploaded minidump for such) but I guess the root cause is different and the top part of crash signature is overly generic.  For the one just uploaded, we're not seeing the crash in past couple of weeks.  Since it's not an MTP issue then we will open another bug if it ever reproduces. Thanks!
Flags: needinfo?(ggrisco)
Comment on attachment 8496321 [details] [diff] [review]
Don't close the mtp file descriptor when MtpServer::run does

Removing b2g32 approval since this patch can't be uplifted to 2.0.
Attachment #8496321 - Flags: approval-mozilla-b2g32+
And changing v2.0 to unaffected (since MTP never landed on 2.0)
This issue has been verified successfully on Flame 2.1, 2.2. We have verified with or without USB plugged in.
See attachment: 1556.MP4
Reproducing rate: 0/5

Test steps:
1.Launch Settings, then turn on USB storage.
2.Check MTP radio button option in USB storage.
3.Select media storage and do eject SD card.
4.Select USB storage OFF.

Actual results:
Not crashed

Flame 2.1 version:
Gaia-Rev        ccb49abe412c978a4045f0c75abff534372716c4
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-2g34_v2_1/rev/18fb67530b22
Build-ID        20141201001201
Version         34.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141201.034405
FW-Date         Mon Dec  1 03:44:15 EST 2014
Bootloader      L1TC00011880

Flame 2.2 version:
Gaia-Rev        39214fb22c203e8849aaa1c27b773eeb73212921
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/08be3008650f
Build-ID        20141201040205
Version         37.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141201.074509
FW-Date         Mon Dec  1 07:45:20 EST 2014
Bootloader      L1TC00011880
This issue has been verified successfully on Flame 2.1, 2.2. We have verified with or without USB plugged in.
See attachment: 1637.MP4
Reproducing rate: 0/5

Test steps:
1.Launch Settings, then turn on USB storage.
2.Check MTP radio button option in USB storage.
3.Select media storage and do eject SD card.
4.Select USB storage OFF.

Actual results:
Not crashed

Flame 2.1 version:
Gaia-Rev        ccb49abe412c978a4045f0c75abff534372716c4
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-2g34_v2_1/rev/18fb67530b22
Build-ID        20141201001201
Version         34.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141201.034405
FW-Date         Mon Dec  1 03:44:15 EST 2014
Bootloader      L1TC00011880

Flame 2.2 version:
Gaia-Rev        39214fb22c203e8849aaa1c27b773eeb73212921
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/08be3008650f
Build-ID        20141201040205
Version         37.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141201.074509
FW-Date         Mon Dec  1 07:45:20 EST 2014
Bootloader      L1TC00011880
Attached video 1637.MP4
Clearning ni since this bug is resolved.
Flags: needinfo?(dhirnyj)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: