Closed Bug 1324312 Opened 3 years ago Closed 3 years ago

Crash in OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xmalloc | mozilla::gl::TexSubImage2DHelper

Categories

(Core :: Graphics, defect, P3, critical)

Unspecified
Android
defect

Tracking


RESOLVED FIXED
mozilla54
Tracking Status
firefox-esr45 --- wontfix
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- fixed
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: njn, Assigned: jnicol)

Details

(Keywords: crash, Whiteboard: [gfx-noted])

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-03955c39-f8be-4cb3-a719-6657b2161217.
=============================================================

This Fennec-only crash has been around for a long time and has moderately high
volume. For the past 7 days:

> Product        Version  Count Percentage Installations
> FennecAndroid  52.0a2   74    37.0%      15
> FennecAndroid  47.0     18     9.0%      13
> FennecAndroid  50.0.2   16     8.0%      10
> FennecAndroid  53.0a1   13     6.5%       3
> FennecAndroid  46.0.1   12     6.0%       5
> FennecAndroid  50.1.0   12     6.0%      14
> FennecAndroid  34.0.1   11     5.5%      10
> FennecAndroid  50.0      8     4.0%       2
> FennecAndroid  45.0.2    6     3.0%       3
> FennecAndroid  41.0.2    4     2.0%       5
> FennecAndroid  42.0.2    4     2.0%       1
> FennecAndroid  43.0      3     1.5%       3
> FennecAndroid  45.0.1    3     1.5%       3
> FennecAndroid  47.0b2    3     1.5%       1
> FennecAndroid  44.0.2    2     1.0%       2
> FennecAndroid  46.0      2     1.0%       2
> FennecAndroid  49.0      2     1.0%       1
> FennecAndroid  30.0      1     0.5%       1
> FennecAndroid  32.0.3    1     0.5%       1
> FennecAndroid  33.0      1     0.5%       1
> FennecAndroid  38.0.5    1     0.5%       1
> FennecAndroid  41.0b10   1     0.5%       1
> FennecAndroid  42.0      1     0.5%       1
> FennecAndroid  42.0.1    1     0.5%       1

It's an OOM crash. For the reports I looked at the allocation request sizes
were in the range 2--4 MiB, which is big enough that they should be fallible.

jgilbert, is this one you can take a look at?
Flags: needinfo?(jgilbert)
Priority: -- → P3
Whiteboard: [gfx-noted]
That's not a very high volume :)
Flags: needinfo?(jgilbert) → needinfo?(jnicol)
Sotaro, did bug 1245552 not make us align our texture data so that this shouldn't be a problem any more?
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Jamie Nicol [:jnicol] from comment #2)
> Sotaro, did bug 1245552 not make us align our texture data so that this
> shouldn't be a problem any more?

Bug 1245552 made mask layer data align to 4. From the following, it seems there are still cases where texture data is not aligned to 4.

https://crash-stats.mozilla.com/signature/?signature=OOM%20%7C%20large%20%7C%20mozalloc_abort%20%7C%20mozalloc_handle_oom%20%7C%20moz_xmalloc%20%7C%20mozilla%3A%3Agl%3A%3ATexSubImage2DHelper&date=%3E%3D2016-12-14T03%3A49%3A00.000Z&date=%3C2016-12-21T03%3A49%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-version&_sort=-date&page=1
Flags: needinfo?(sotaro.ikeda.g)
In recent Firefox crashes, I often saw TiledLayerBufferComposite in the stack.
One user crashed 17 times in Aurora 20170113004016 with this crash.
Assignee: nobody → jnicol
Flags: needinfo?(jnicol)
I was confused about why we're still hitting this after bug 1245552, since I thought mask layers would likely be the only place where stride != width*depth. But actually we hit this case any time we're doing a partial upload and the subregion width < full image width.

To fix this crash I'll make the alloc fallible, perhaps falling back to uploading row-by-row if the alloc fails. That will be a performance hit, but it won't crash, and ideally we wouldn't be in this low-memory situation anyway.

We should also consider avoiding partial uploads when UNPACK_ROW_LENGTH is unsupported, depending on the upload size. It might sometimes be faster to upload the entire thing than to use a temp buffer.
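
For illustration, the approach described above (try a fallible allocation for a tightly packed temp buffer, and fall back to row-by-row uploads if it fails) might look roughly like the sketch below. This is not the landed patch. `UploadSubImage`, `UploadFn`, `AllocFn` and `FallibleAlloc` are hypothetical names, and the `upload` callback merely stands in for a glTexSubImage2D-style call that expects tightly packed rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <new>
#include <vector>

// Hypothetical stand-in for a glTexSubImage2D-style call: it receives
// `height` rows of `width` pixels, tightly packed (no row padding).
using UploadFn = std::function<void(int x, int y, int width, int height,
                                    const uint8_t* pixels)>;

// Hypothetical allocator hook so that allocation failure can be simulated.
// It must return memory from new[] or nullptr on failure.
using AllocFn = std::function<uint8_t*(size_t bytes)>;

inline uint8_t* FallibleAlloc(size_t bytes) {
  // new (std::nothrow) returns nullptr on failure instead of aborting.
  return new (std::nothrow) uint8_t[bytes];
}

// Uploads a sub-rectangle whose rows are `stride` bytes apart in `src`.
// Returns true if a single upload was issued, false if the row-by-row
// fallback was needed because the temp buffer could not be allocated.
bool UploadSubImage(const uint8_t* src, int stride, int x, int y, int width,
                    int height, int bytesPerPixel, const UploadFn& upload,
                    const AllocFn& alloc = FallibleAlloc) {
  const size_t rowBytes = static_cast<size_t>(width) * bytesPerPixel;
  assert(static_cast<size_t>(stride) >= rowBytes);

  // Rows are already tightly packed: no temp buffer needed.
  if (static_cast<size_t>(stride) == rowBytes) {
    upload(x, y, width, height, src);
    return true;
  }

  // stride != width * depth: repack into a temp buffer if we can get one.
  uint8_t* packed = alloc(rowBytes * height);
  if (packed) {
    for (int row = 0; row < height; ++row) {
      std::memcpy(packed + row * rowBytes,
                  src + static_cast<size_t>(row) * stride, rowBytes);
    }
    upload(x, y, width, height, packed);
    delete[] packed;
    return true;
  }

  // Allocation failed: slower, but no crash. Each row on its own is
  // tightly packed, so upload one row at a time.
  for (int row = 0; row < height; ++row) {
    upload(x, y + row, width, 1, src + static_cast<size_t>(row) * stride);
  }
  return false;
}
```

The fallback issues one upload call per row, so it is expected to be slower than the single-call path, but it is only reached in cases where the allocation would previously have aborted.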
Comment on attachment 8829903 [details]
Bug 1324312 - Handle alloc failure when uploading texture.

https://reviewboard.mozilla.org/r/106866/#review107920
Attachment #8829903 - Flags: review?(sotaro.ikeda.g) → review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/8c80b15ea92b
Handle alloc failure when uploading texture. r=sotaro
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/8c80b15ea92b
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Please request Aurora/Beta approval on this when you get a chance.
Comment on attachment 8829903 [details]
Bug 1324312 - Handle alloc failure when uploading texture.

Approval Request Comment
[Feature/Bug causing the regression]: Long existing crash
[User impact if declined]: Occasional crashes (when low on memory)
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No.
[Why is the change risky/not risky?]: We simply upload texture data in a way that is probably slower than before. However, this only affects cases where we previously would have crashed.
[String changes made/needed]: None
Flags: needinfo?(jnicol)
Attachment #8829903 - Flags: approval-mozilla-release?
Attachment #8829903 - Flags: approval-mozilla-beta?
Attachment #8829903 - Flags: approval-mozilla-release? → approval-mozilla-aurora?
Whoops! I did of course mean aurora not release. Thanks!
Comment on attachment 8829903 [details]
Bug 1324312 - Handle alloc failure when uploading texture.

Make a buffer allocation fallible; aurora53+, beta52+
Attachment #8829903 - Flags: approval-mozilla-beta?
Attachment #8829903 - Flags: approval-mozilla-beta+
Attachment #8829903 - Flags: approval-mozilla-aurora?
Attachment #8829903 - Flags: approval-mozilla-aurora+