Consider surviving crashes that originate from threads started by injected third-party libraries, e.g. aswJsFlt.dll
Categories
(External Software Affecting Firefox :: Other, enhancement)
Tracking
(Not tracked)
People
(Reporter: yannis, Unassigned)
Third-party injected libraries can cause big crash incidents such as bug 1794064, which originated from Avast's aswJsFlt.dll. That DLL creates a dedicated thread in the main process. This thread will listen on a named pipe and process the incoming messages.
While working on bug 1794064, I noticed that the three major browsers behave differently with respect to crashes originating from the aswJsFlt.dll thread:
- in Firefox and Chrome, the main process will crash;
- in Edge, the thread will get killed but the main process will (try to) survive.
We could consider having the same behavior as Edge with respect to third-party injected threads. That would limit the impact of incidents like bug 1794064.
Comment 1•2 years ago
I believe this would have helped bug 1837242 and bug 1837246 too.
Comment 2•2 years ago
I think this would be great to do, but I don't know how to do it :-) Some kind of SEH? Or maybe some way to configure Breakpad, or use SetUnhandledExceptionFilter()?
Comment 3•1 year ago
Note that a major concern here is that the thread could have been holding a lock when it terminated, which could lead to a deadlock later on. (For reference: WaitForSingleObject() will return WAIT_ABANDONED the next time another thread tries to acquire it.)
Comment 4•1 year ago
(In reply to Yannis Juglaret [:yannis] from comment #0)
- in Edge, the thread will get killed but the main process will (try to) survive.
My immediate reaction is that this sounds like the kind of thing that will generate multiple sec-bugs and then eventually get rolled back.
(In reply to Greg Stoll from comment #3)
Note that a major concern here is that the thread could have been holding a lock when it terminated, which could lead to a deadlock later on. (For reference: WaitForSingleObject() will return WAIT_ABANDONED the next time another thread tries to acquire it.)
I'd describe that as a special case of the more general concern that a crashed thread may — arguably, will — leave the process in an inconsistent state. Trying to continue from that may literally be trying to continue after heap-corruption.
I'd be all for (e.g.) a dialog at next startup, offering the option to block the DLL responsible for the crash; but I think once the injected thread actually goes off the rails, it's too late to try to catch it.
Comment 5•1 year ago
Yeah, I guess this is probably too likely to cause security bugs to realistically do.
We are looking into notifying the user that a specific DLL was responsible for a previous Firefox crash; hopefully that's something that will get worked on soon.
Comment 6•1 year ago
too likely to cause security bugs
You'd crash the users' anti-virus by finding a way to trigger a bug in it?
I'm not sure this would cause actual security bugs, but it may make exploitation easier. That is, you'd (only) have to find a bug that crashes the anti-virus (instead of finding one to exploit it directly), and can then trick the user into doing something the anti-virus would have blocked.
If you construct an exploit this way, I'm not sure I'd argue it's our fault or problem, quite frankly.
Comment 7•1 year ago
Even from a stability perspective alone, it would be quite surprising if nothing hung after the antivirus thread died, since the antivirus would then be a "non-responding man in the middle" for many things (JS? Network? Files?). But that's what seemed to be happening in Edge as far as I remember, as surprising as it sounds. Maybe we can start by testing what happens if we deliberately kill an AV thread? Maybe we need some discussion with AV vendors to get their opinion on what should happen when their thread crashes? For example, we may want some visual way for the user to know that they are no longer benefiting from antivirus features at this point and that they should restart their browser if they want them back.
From a security perspective, it seems rather common for antivirus DLLs (at least in the small sample of DLLs that I have analyzed after crash spikes) to create their own heap and use that, so a heap corruption in their code does not necessarily imply that our whole process is doomed. Yes, some products could be allocating from the default heap, and it could be a bad idea to try to recover from a heap corruption there. So indeed, I don't think we want to recover blindly for all products.
I think that if we can identify some specific widespread products where we know (because we tested and/or interacted with the vendor) that it seems reasonably safe and stable to recover from crashes, we should be able to gain some stability guarantees. The risk would be that we put a lot of effort into products that are already well engineered, and then the crash spikes happen with the ones that felt unsafe or unstable.
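To make the per-product idea a bit more concrete, here is a minimal sketch of the decision step only: given the start address of the crashing thread, find the module that contains it and recover only if that module is on a list of vetted products. Everything here is hypothetical and for illustration (`ModuleRange`, `CrashAction`, `FindOwningModule`, `DecideCrashAction`, and the addresses in the usage note); none of it is existing Firefox code. How the thread start address and the module list would be collected on Windows, and whether terminating the thread is actually safe, are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical record for a module loaded in the process: its name and the
// address range [base, base + size) that it occupies.
struct ModuleRange {
  std::string name;
  std::uintptr_t base;
  std::size_t size;
};

// Hypothetical outcomes for a crash on a given thread.
enum class CrashAction {
  CrashProcess,     // current Firefox behavior: the whole process crashes
  TerminateThread,  // Edge-like behavior: kill the thread, try to survive
};

// Returns the module whose range contains `address`, or nullptr if none does.
const ModuleRange* FindOwningModule(const std::vector<ModuleRange>& modules,
                                    std::uintptr_t address) {
  for (const ModuleRange& m : modules) {
    assert(m.size > 0 && "a loaded module cannot be empty");
    // Written as a subtraction so that base + size cannot overflow.
    if (address >= m.base && address - m.base < m.size) {
      return &m;
    }
  }
  return nullptr;
}

// Recover only when the crashing thread was started inside a module that has
// been explicitly vetted (tested and/or discussed with the vendor). Unknown
// modules and our own code keep the default behavior.
// Assumption: names are compared exactly (case-sensitive) in this sketch.
CrashAction DecideCrashAction(
    const std::vector<ModuleRange>& modules,
    const std::unordered_set<std::string>& vettedModules,
    std::uintptr_t threadStartAddress) {
  const ModuleRange* owner = FindOwningModule(modules, threadStartAddress);
  if (owner != nullptr && vettedModules.count(owner->name) != 0) {
    return CrashAction::TerminateThread;
  }
  return CrashAction::CrashProcess;
}
```

For example, with made-up ranges where aswJsFlt.dll occupies [0x40000, 0x42000) and is on the vetted list, a thread that started at 0x40010 would map to `TerminateThread`, while a thread that started inside xul.dll or in any unlisted module would still map to `CrashProcess`. Defaulting to `CrashProcess` keeps the concern from comment 4 as the baseline: recovery is opt-in per product, never the general rule.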