Crash in [@ core::slice::<T>::copy_from_slice] on Linux
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
| | Tracking | Status |
|---|---|---|
| firefox-esr115 | --- | unaffected |
| firefox121 | --- | unaffected |
| firefox122 | --- | unaffected |
| firefox123 | --- | fixed |
People
(Reporter: Sylvestre, Assigned: sotaro)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression)
Crash Data
Crash report: https://crash-stats.mozilla.org/report/index/2909ff05-38fb-4fc4-9a7c-a60c10231220
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libc.so.6 __memcpy_evex_unaligned_erms sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:265
1 libxul.so core::intrinsics::copy_nonoverlapping library/core/src/intrinsics.rs:2687
1 libxul.so core::slice::<impl [T]>::copy_from_slice library/core/src/slice/mod.rs:3619
1 libxul.so webrender::device::gl::TextureUploader::upload gfx/wr/webrender/src/device/gl.rs:4693
2 libxul.so webrender::renderer::upload::upload_to_texture_cache gfx/wr/webrender/src/renderer/upload.rs:180
2 libxul.so webrender::renderer::Renderer::update_texture_cache gfx/wr/webrender/src/renderer/mod.rs:1933
3 libxul.so webrender::renderer::Renderer::render_impl gfx/wr/webrender/src/renderer/mod.rs:1478
4 libxul.so webrender::renderer::Renderer::render gfx/wr/webrender/src/renderer/mod.rs:1235
5 libxul.so wr_renderer_render gfx/webrender_bindings/src/bindings.rs:619
6 libxul.so mozilla::wr::RendererOGL::UpdateAndRender gfx/webrender_bindings/RendererOGL.cpp:190
| Reporter | ||
Comment 1•1 year ago
I can't interpret the assembly in memmove, and the source pointer points to the start of that function, so it's not obvious which clause is failing. The copy_nonoverlapping qualifier further up the stack makes me wonder whether we are supplying an overlapping data pointer, and whether we could add checks here to ensure that we aren't. Glenn, what do you think?
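To make the kind of check being asked about concrete, here is a minimal sketch of an overlap test performed before a copy. The helper names `ranges_overlap` and `checked_copy` are hypothetical and are not part of WebRender. Note that safe Rust already guarantees a `&mut [T]` and a `&[T]` cannot alias, so a check like this would only matter where the slices are built from raw pointers.

```rust
/// Returns true when the byte ranges [src, src + len) and [dst, dst + len)
/// share at least one byte. Hypothetical helper, for illustration only.
fn ranges_overlap(src: usize, dst: usize, len: usize) -> bool {
    // Two equal-length ranges overlap exactly when their start addresses
    // are closer together than the length. Empty ranges never overlap.
    len != 0 && src.abs_diff(dst) < len
}

/// Copies `src` into `dst` after verifying the preconditions that
/// `copy_from_slice` relies on. Hypothetical wrapper, not WebRender code.
fn checked_copy(dst: &mut [u8], src: &[u8]) -> Result<(), String> {
    // `copy_from_slice` panics when the lengths differ, so report it instead.
    if dst.len() != src.len() {
        return Err(format!(
            "length mismatch: dst = {}, src = {}",
            dst.len(),
            src.len()
        ));
    }
    let src_addr = src.as_ptr() as usize;
    let dst_addr = dst.as_ptr() as usize;
    if ranges_overlap(src_addr, dst_addr, src.len()) {
        return Err("source and destination overlap".to_string());
    }
    dst.copy_from_slice(src);
    Ok(())
}
```

A check like this cannot detect a source range that is simply unmapped; it only rules out the overlapping case.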
Comment 2•1 year ago
I think we might be dealing with several different crashes under this signature. The nightly crashes all have in common that they seem to happen right past the end of a buffer, as if they were overflowing. In one case the buffer has a name: anon_inode:i915.gem. Note that the buffer is not always writable; in some crashes it is mapped but not accessible, as if something had changed its access permissions either before or right after the crash. This doesn't change the nature of the crash, as the address we're accessing is always one past the end of the buffer.
Comment 3•1 year ago
My PC was crashing from this very often. It's random, and I can't find any STR. Maybe it's because I had gfx.webrender.all set to true; I did that for some other testing a long time ago and never removed it. Otherwise I don't know anything that could help. Sometimes it happens when I'm using Firefox, sometimes when I'm not. It doesn't matter which page I look at. Maybe it depends on which sites I have loaded, since it started this evening but Firefox had been fine while the sun was up ;-).
I have:
NVIDIA RTX 2060
Ryzen 3900X
Linux Mint
Firefox Nightly.
Comment 4•1 year ago
gfx.webrender.all made no difference.
Comment 5•1 year ago
(In reply to Gabriele Svelto [:gsvelto] from comment #2)
I think we might be dealing with several different crashes under this signature.
There is definitely a recent nightly regression that is getting mixed up with pre-existing release and ESR volume. According to nightly volume aggregated over build ID the recent regression has most likely started with build ID 20231219231600 since that is when we've started to receive continuous volume:
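| Rank | Build ID | Count | Share |
|---|---|---|---|
| 6 | 20231028092407 | 1 | 3.33 % |
| 7 | 20231124214933 | 1 | 3.33 % |
| 8 | 20231215050055 | 1 | 3.33 % |
| 3 | 20231219231600 | 3 | 10.00 % |
| 2 | 20231220041048 | 4 | 13.33 % |
| 5 | 20231220221923 | 2 | 6.67 % |
| 4 | 20231221052522 | 3 | 10.00 % |
| 1 | 20231221170215 | 14 | 46.67 % |
| 9 | 20231222164453 | 1 | 3.33 % |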
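For anyone who wants to reproduce the percentages above, here is a rough sketch of the aggregation. The build IDs and counts are copied from the figures above. The function names and the threshold heuristic for "volume picked up" are assumptions made for illustration, not how crash-stats computes anything.

```rust
/// (build ID, crash count) pairs in chronological order, copied from the
/// nightly volume figures quoted in this comment.
const NIGHTLY_VOLUME: [(&str, u32); 9] = [
    ("20231028092407", 1),
    ("20231124214933", 1),
    ("20231215050055", 1),
    ("20231219231600", 3),
    ("20231220041048", 4),
    ("20231220221923", 2),
    ("20231221052522", 3),
    ("20231221170215", 14),
    ("20231222164453", 1),
];

/// Percentage of the total crash volume attributed to each build.
fn percent_shares(volume: &[(&str, u32)]) -> Vec<(String, f64)> {
    let total: u32 = volume.iter().map(|&(_, count)| count).sum();
    if total == 0 {
        return Vec::new(); // avoid dividing by zero on empty input
    }
    volume
        .iter()
        .map(|&(build, count)| {
            (build.to_string(), 100.0 * f64::from(count) / f64::from(total))
        })
        .collect()
}

/// First build (in input order) whose count reaches `threshold`.
/// Hypothetical heuristic for spotting where volume starts to climb.
fn first_build_at_or_above<'a>(volume: &[(&'a str, u32)], threshold: u32) -> Option<&'a str> {
    volume
        .iter()
        .find(|&&(_, count)| count >= threshold)
        .map(|&(build, _)| build)
}
```

With a threshold of 2, `first_build_at_or_above` picks out build 20231219231600, matching the build ID named above as the likely start of the regression.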
As far as I can tell the nightly crashes occur because we are providing to memcpy a source buffer that is not (or no longer) mapped. Given the call stack I think that would mean that our update_list contains a texture that we want to upload for which the source buffer is not (or no longer) mapped.
These two considerations make the two recent commits from bug 1829026 highly suspicious to me, since they are the only additions from 20231219152636 to 20231219231600 and one of them talks about removing waiting texture IDs, which I guess could lead to "no longer mapped". I'm setting it as the regressor, but feel free to change that if incorrect.
The nightly crashes all have in common that they seem to happen right past the end of a buffer, as if they were overflowing. In one case the buffer has a name: anon_inode:i915.gem. Note that the buffer is not always writable, in some crashes it's not accessible but it is mapped, as if something had changed its access permissions either before or right after the crash. This doesn't change the nature of the crash as the address we're accessing is always one past the end of the buffer.
Maybe what's after the end of the buffer you see used to be the start of another buffer? (the crashing addresses are always page-aligned)
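As a small illustration of the two observations being combined here (page-aligned crash addresses, and addresses exactly one past the end of a mapping), this is a sketch of how a faulting address could be classified against a mapping. The `Mapping` type, the `classify` function, and the 4 KiB page size are assumptions for illustration; real values would come from the memory map in the crash report.

```rust
/// Assumption: 4 KiB pages, which is typical on x86-64 Linux.
const PAGE_SIZE: usize = 4096;

/// A mapped address range; `end` is exclusive. Hypothetical type.
struct Mapping {
    start: usize,
    end: usize,
}

#[derive(Debug, PartialEq)]
enum FaultPosition {
    Inside,
    OnePastEnd,
    Elsewhere,
}

fn is_page_aligned(addr: usize) -> bool {
    addr % PAGE_SIZE == 0
}

/// Classifies a faulting address relative to a single mapping.
fn classify(fault: usize, mapping: &Mapping) -> FaultPosition {
    if fault >= mapping.start && fault < mapping.end {
        FaultPosition::Inside
    } else if fault == mapping.end {
        // First byte after the mapping. Mappings end on page boundaries,
        // so this address is page-aligned, and it would also be the first
        // byte of a neighbouring mapping if one used to sit there.
        FaultPosition::OnePastEnd
    } else {
        FaultPosition::Elsewhere
    }
}
```

A page-aligned fault at `OnePastEnd` is consistent with both readings: an overflow of the buffer before it, or an access to a buffer that used to start there and has since been unmapped.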
Comment 6•1 year ago
Set release status flags based on info from the regressing bug 1829026
Comment 8•1 year ago
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 desktop browser crashes on nightly
:gw, could you consider increasing the severity of this top-crash bug?
For more information, please visit BugBot documentation.
Comment 9•1 year ago
:lsalzman, since you are the author of the regressor, bug 1829026, could you take a look?
Comment 10•1 year ago
If the patch at fault is "Bug 1829026 - Remove waiting texture ids if nothing uses them.", then bug 1872522 should fix it. To be clear, it will not fix any pre-existing crashes here prior to that patch, since clearly this particular crash also has a conflated cause that preexists any of my work, though some of my work might have caused a spike. To make things simpler, I separated out bug 1872522 to address just that fix.
Comment 11•1 year ago
It seems like Sotaro's work in bug 1868928 improved this. The builds that include that patch aren't showing up in new nightly reports.
Comment 12•1 year ago
Sotaro, it really looks like bug 1868928 almost entirely fixed this. Can you guess why?
| Assignee | ||
Comment 13•1 year ago
(In reply to Lee Salzman [:lsalzman] from comment #12)
Sotaro, it really looks like bug 1868928 almost entirely fixed this. Can you guess why?
Sorry, I am not sure about the reason.
Comment 14•1 year ago
Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.
Comment 15•1 year ago
We don't have crashes in 123 beta for this signature, can we mark this bug as fixed by bug 1868928? Thanks