Crash in [@ neqo_common::codec::Encoder::encode_uint<T>]
Categories
(External Software Affecting Firefox :: Other, defect, P1)
Tracking
(relnote-firefox 131+, firefox-esr115 132+ fixed, firefox-esr128 132+ fixed, firefox131+ fixed, firefox132+ fixed, firefox133+ fixed)
People
(Reporter: release-mgmt-account-bot, Assigned: mail)
References
(Blocks 1 open bug)
Details
(Keywords: crash, Whiteboard: [necko-triaged])
Crash Data
Attachments
(3 files)
48 bytes, text/x-phabricator-request | phab-bot : approval-mozilla-release+
48 bytes, text/x-phabricator-request | phab-bot : approval-mozilla-esr128+
48 bytes, text/x-phabricator-request | phab-bot : approval-mozilla-esr115+
Crash report: https://crash-stats.mozilla.org/report/index/da8fc03b-0e89-4049-8f4e-57bf40240914
MOZ_CRASH Reason: assertion failed: n > 0 && n <= 8
Top 10 frames of crashing thread:
0 xul.dll MOZ_Crash mfbt/Assertions.h:317
0 xul.dll RustMozCrash mozglue/static/rust/wrappers.cpp:18
1 xul.dll mozglue_static::panic_hook mozglue/static/rust/lib.rs:102
2 xul.dll core::ops::function::Fn::call<void /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/ops/function.rs:166
3 xul.dll alloc::boxed::impl$50::call library/alloc/src/boxed.rs:2084
3 xul.dll std::panicking::rust_panic_with_hook library/std/src/panicking.rs:808
4 xul.dll std::panicking::begin_panic_handler::closure$0 library/std/src/panicking.rs:667
5 xul.dll std::sys::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::closure_env$0, never$> library/std/src/sys/backtrace.rs:168
6 xul.dll std::panicking::begin_panic_handler library/std/src/panicking.rs:665
7 xul.dll core::panicking::panic_fmt library/core/src/panicking.rs:74
By querying Nightly crashes reported within the last 2 months, here are some insights about the signature:
- First crash report: 2024-09-14
- Process type: Parent
- Is startup crash: No
- Has user comments: No
- Is null crash: No
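For readers unfamiliar with the failing code, here is a minimal standalone sketch (simplified signature, not the actual neqo source) of what the assertion in Encoder::encode_uint enforces: the requested byte width must be between 1 and 8, and anything else hits exactly the panic from the MOZ_CRASH reason above.

```rust
// Standalone sketch, not the actual neqo implementation: a simplified
// encode_uint that writes `v` as an `n`-byte big-endian integer. A width
// outside 1..=8 trips the panic from the crash signature.
fn encode_uint(buf: &mut Vec<u8>, n: usize, v: u64) {
    assert!(n > 0 && n <= 8);
    for i in 0..n {
        // Big-endian: most significant byte first.
        buf.push((v >> (8 * (n - 1 - i))) as u8);
    }
}

fn main() {
    let mut buf = Vec::new();
    encode_uint(&mut buf, 2, 0x1234);
    assert_eq!(buf, vec![0x12, 0x34]);
    // encode_uint(&mut buf, 0, 7); // would panic: "assertion failed: n > 0 && n <= 8"
}
```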
Comment 1•5 months ago
Hey Max, FYI
Do you have any idea what recent changes might have caused this?
Thank you for triaging, Sunil. I will take a more in-depth look.
Tracking upstream in github.com/mozilla/neqo#2132.
Comment 4•4 months ago
As :yjuglaret noted in the neqo issue, this crash is highly correlated with Avast, and the fact that it's showing up across many versions of Firefox strongly indicates an issue with Avast. I'm moving this bug to External Software Affecting Firefox and will ping our Avast contacts to see if they can take a look.
People are creating forum threads about this too. Since it is making Firefox unusable for them, shouldn't this issue be set to P1?
I don't have enough experience with crash-stats.mozilla.org to judge the importance of this issue.
Greg Stoll, maybe you can comment here?
We could handle the error more gracefully, failing the connection instead of panicking. Though I assume that would make it more difficult to actually find and fix the root cause of the issue.
For the record, we have extended the assert with additional metadata. This will be available with #1922479.
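As a rough illustration of that change (hypothetical names, not the actual neqo patch), the idea is to replace the bare condition with an assert that reports the offending values, so crash reports carry diagnostic metadata:

```rust
// Hypothetical sketch of the "extended assert" idea: include the offending
// values in the panic message instead of only the failed condition.
fn encode_uint_with_context(buf: &mut Vec<u8>, n: usize, v: u64) {
    assert!(
        n > 0 && n <= 8,
        "invalid encode width: n={n}, v={v:#x}, already buffered {} bytes",
        buf.len()
    );
    for i in 0..n {
        buf.push((v >> (8 * (n - 1 - i))) as u8);
    }
}

fn main() {
    let mut buf = Vec::new();
    encode_uint_with_context(&mut buf, 4, 0xdead_beef);
    assert_eq!(buf, vec![0xde, 0xad, 0xbe, 0xef]);
}
```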
Comment 8•4 months ago
It's the #2 topcrash in Nightly 133. Since there are also several public complaints, we have more evidence that this is affecting many people.
While this does seem to affect all channels, it looks worse in 133 Nightly, which could mean a Nightly-only feature or pref increases the chance of the crash. Or there could be some other new issue in 133 that makes it worse, in which case it will be a high-volume crash for 133 beta and release. So my tendency would be to bump up the priority here and in bug 1922479. jesup, you set the priority in bug 1922479, so maybe you can weigh in.
Reporter
Comment 9•4 months ago
The bug is marked as tracked for firefox133 (nightly). However, the bug still has low severity.
:gcp, could you please increase the severity for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Comment 10•4 months ago
> shouldn't this issue be set to P1?
That only really serves a purpose if we can fix it, not if the issue is with a third party. We can't force Avast to prioritize this. But maybe we can mitigate it...somehow?
> Since there are also several public complaints we have more evidence that this is affecting many people.
Nightly only or release?
Comment 11•4 months ago
Max - will bug 1923287 fix this crash? (or is it just adding more logging?)
Assignee
Comment 12•4 months ago
Bug 1923287 contains mozilla/neqo#2150, which terminates the connection instead of panicking when receiving an invalid ACK, i.e. an ACK for a packet that was never sent (rough sketch below).
While this will most likely resolve the panic itself, it is still an open question why we are seeing an increase in invalid ACKs.
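To make the shape of that change concrete, here is a rough sketch with hypothetical types and names (the real patch is mozilla/neqo#2150): the "ACK for an unsent packet" case becomes an error that the caller can turn into a connection close, rather than an assertion failure.

```rust
// Hypothetical sketch, not the actual neqo patch: report an ACK for a packet
// number we never sent as an error instead of panicking.
#[derive(Debug)]
enum AckError {
    AckedUnsentPacket(u64),
}

struct SentPackets {
    largest_sent: Option<u64>,
}

impl SentPackets {
    fn on_ack_received(&mut self, acked: u64) -> Result<(), AckError> {
        match self.largest_sent {
            Some(largest) if acked <= largest => Ok(()),
            // Previously this case would have hit an assertion; now it is
            // surfaced so the caller can close the QUIC connection gracefully.
            _ => Err(AckError::AckedUnsentPacket(acked)),
        }
    }
}

fn main() {
    let mut sent = SentPackets { largest_sent: Some(10) };
    assert!(sent.on_ack_received(5).is_ok());
    // An ACK for packet 42, which was never sent, yields an error instead of
    // a panic; the connection would be closed in response.
    assert!(matches!(
        sent.on_ack_received(42),
        Err(AckError::AckedUnsentPacket(42))
    ));
}
```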
Comment 13•4 months ago
Note that neqo_transport::connection::Connection::add_packet_number is the new signature for this crash after we added some additional metadata in neqo v0.9.1.
As of neqo v0.9.2, we believe that Firefox should close the connection if we get an unexpected ACK instead of panicking.
Comment 15•4 months ago
Looking at the Nightly data, the fix from bug 1923287 seems to have eliminated the crash, although presumably connections are now being closed instead, so the underlying issue remains.
Reporter
Comment 17•4 months ago
Copying crash signatures from duplicate bugs.
Assignee
Comment 18•4 months ago
For the record, I have reached out to Avast, informing them that the crash itself is fixed, but the underlying issue now surfaces as expected through connections being closed. They are investigating whether this is related to their logic that "intercepts Quic traffic".
Comment 19•4 months ago
Since we have fixed the crash, I'm closing this bug (and will request uplifts) and have filed bug 1923943 to cover the QUIC connection being closed as a result of the unexpected ACK.
Comment 20•4 months ago
Comment 21•4 months ago
release Uplift Approval Request
- User impact if declined: Possible crash
- Code covered by automated testing: yes
- Fix verified in Nightly: yes
- Needs manual QE test: no
- Steps to reproduce for manual QE testing: N/A
- Risk associated with taking this patch: Low
- Explanation of risk level: Patch is verified
- String changes made/needed: N/A
- Is Android affected?: yes
Comment 22•4 months ago
uplift
Comment 23•4 months ago
Comment 24•4 months ago
esr128 Uplift Approval Request
- User impact if declined: possible crash
- Code covered by automated testing: yes
- Fix verified in Nightly: yes
- Needs manual QE test: no
- Steps to reproduce for manual QE testing: N/A
- Risk associated with taking this patch: Low
- Explanation of risk level: Patch is verified
- String changes made/needed: N/A
- Is Android affected?: yes
Comment 25•4 months ago
uplift
Comment 26•4 months ago
uplift
Comment 27•4 months ago
Comment 28•4 months ago
esr115 Uplift Approval Request
- User impact if declined: Possible crash
- Code covered by automated testing: yes
- Fix verified in Nightly: yes
- Needs manual QE test: no
- Steps to reproduce for manual QE testing: N/A
- Risk associated with taking this patch: Low
- Explanation of risk level: This patch is straightforward
- String changes made/needed: N/A
- Is Android affected?: yes
Comment 29•4 months ago
uplift