Open Bug 1781106 Opened 3 years ago Updated 3 years ago

Random crashes on Linux, dmesg hints at libxul.so

Categories

(Core :: JavaScript Engine: JIT, defect, P5)

Firefox 102
defect

Tracking

()

UNCONFIRMED

People

(Reporter: maksim.ivanov.mail, Unassigned)

References

(Blocks 1 open bug)

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0

Steps to reproduce:

Browse for a while.

Actual results:

A tab or (sometimes) the whole application crashes at random times, usually during page load.

dmesg shows the following:
[ 2431.861772] Isolated Web Co[10017]: segfault at 2b6e9e2ff800 ip 00002a90e0184687 sp 00007fffdbd6c3b0 error 4
[ 2431.861778] Code: f8 c1 ef 0a 81 c7 c0 d7 00 00 66 89 39 41 8b f8 81 e7 ff 03 00 00 81 cf 00 dc 00 00 66 89 79 02 4c 8b da 49 81 cb ff f7 0f 00 <49> 83 bb 01 00 f0 ff 00 0f 85 18 00 00 00 4c 8b de 49 81 cb ff ff
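
For readers unfamiliar with this dmesg format, here is a minimal Python sketch that pulls the fields out of the two lines above and decodes the x86 page-fault error code. The helper names (`parse_segfault`, `decode_error`, `faulting_bytes`) are made up for illustration; the bit meanings are the standard x86 page-fault flags, and the kernel prints the error value in hex. It only restates what the log already says and does not diagnose the crash.

```python
import re

# The two dmesg lines from this report, copied verbatim.
LINE = ("[ 2431.861772] Isolated Web Co[10017]: segfault at 2b6e9e2ff800 "
        "ip 00002a90e0184687 sp 00007fffdbd6c3b0 error 4")
CODE = ("[ 2431.861778] Code: f8 c1 ef 0a 81 c7 c0 d7 00 00 66 89 39 41 8b f8 "
        "81 e7 ff 03 00 00 81 cf 00 dc 00 00 66 89 79 02 4c 8b da 49 81 cb ff "
        "f7 0f 00 <49> 83 bb 01 00 f0 ff 00 0f 85 18 00 00 00 4c 8b de 49 81 "
        "cb ff ff")

# (mask, meaning if set, meaning if clear) for the low page-fault error bits.
FAULT_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]

SEGFAULT_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s+)?"              # optional dmesg timestamp
    r"(?P<comm>.+?)\[(?P<pid>\d+)\]: "        # thread name and pid
    r"segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+)"
)


def parse_segfault(line):
    """Return the fields of a 'segfault at' line as a dict, or None."""
    m = SEGFAULT_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    out = {"comm": fields["comm"], "pid": int(fields["pid"])}
    for key in ("addr", "ip", "sp", "err"):
        out[key] = int(fields[key], 16)  # all of these are printed in hex
    return out


def decode_error(code):
    """Describe the low bits of an x86 page-fault error code."""
    parts = []
    for mask, if_set, if_clear in FAULT_BITS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


def faulting_bytes(code_line):
    """Return the bytes from the faulting ip onward (the kernel wraps the first in <>)."""
    _, _, tail = code_line.partition("<")
    return bytes.fromhex(tail.replace(">", ""))


if __name__ == "__main__":
    info = parse_segfault(LINE)
    print(info["comm"], info["pid"], hex(info["addr"]), hex(info["ip"]))
    print(", ".join(decode_error(info["err"])))
    print(faulting_bytes(CODE).hex(" "))
```

For this report, error 4 decodes to a user-mode read of an address with no mapped page. The `ip` value is not inside a named library in the log line itself, which is consistent with the crash being filed against JIT-generated code.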

https://crash-stats.mozilla.org/report/index/a4cd3d32-3f79-471c-a5af-f3b380220725

Expected results:

No crashes

Setting the component to make this bug visible.
If this is not the correct component, please feel free to change it to a more appropriate one. Thanks

Component: Untriaged → JavaScript Engine: JIT
Product: Firefox → Core

Unfortunately, this kind of crash is rarely actionable in general.

I tried to make sense of the code listed in comment 0, but could not match it with any code that we might produce, either because it is not any kind of code we generate, or because it is mixed with data.

Out of curiosity, have you checked whether the RAM is behaving well?

Blocks: SadJit
Severity: -- → S4
Flags: needinfo?(maksim.ivanov.mail)
Priority: -- → P5

(In reply to Nicolas B. Pierron [:nbp] from comment #2)

One of the first things I checked was the RAM, as everything happened on a brand-new laptop (Dynabook Satellite Pro). I used memtest. Moreover, the crash report is always the same with the same version of Firefox, so it is not something random. It differs with a different version, but it is still in xul and still an access violation.

Later I found that other browsers (Konqueror, Chromium) also crash, with a different trace but still an access violation. All of this was under various flavors of Linux. Then I tried Windows 10, and both Firefox and Edge crashed. That was a big surprise. But under Windows 11 everything is stable, which surprised me even more.

So I think there is some issue with the WiFi drivers, as everything that did not involve networking worked great. I didn't check stability under wired networking, though. Also, under Linux I could only use alphanumeric symbols for the WiFi password (no slashes, bars, or whatnot). Under Win10 I had no WiFi until I installed drivers. Win11 worked out of the box. This is why I think there is some issue with drivers. I do not know how to debug it, though...

Flags: needinfo?(maksim.ivanov.mail)

This reminds me of a very old problem I had with a non-working PCMCIA WiFi card, which required some specific memory address ranges to be reserved by the kernel. Without reserving these address ranges, the WiFi would simply not work.

I presume something similar might be happening in reverse: the WiFi chip registers itself at some absolute memory addresses, and the kernel is not aware of it. Thus, when the kernel allocates a page overlapping those addresses to a process, the WiFi chip would cause some corruption.

Looking into Windows 11, you might be able to find the memory ranges that Windows 11 reserved for the driver. Using the same memory ranges on Linux might solve your issue. Unfortunately, I have long forgotten all the details on how to do that :/
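
As a starting point on the Linux side, the kernel's own view of physical address ranges is exposed in `/proc/iomem`. The following is a rough Python sketch, not a known fix, that parses that file and lists the ranges whose owner name contains a keyword. The function names, the sample text, and the `example_wifi` driver name are invented for illustration; the real driver name on the affected laptop is not known from this bug. Without root privileges the kernel reports all addresses as zeros.

```python
import re

# One /proc/iomem line: optional indentation, "start-end : name", hex addresses.
IOMEM_RE = re.compile(r"^(\s*)([0-9a-f]+)-([0-9a-f]+) : (.*)$")

# Made-up sample in /proc/iomem layout (two spaces of indent per nesting level).
SAMPLE = """\
00000000-00000fff : Reserved
00001000-0009ffff : System RAM
a0000000-a00fffff : PCI Bus 0000:02
  a0000000-a0003fff : 0000:02:00.0
    a0000000-a0003fff : example_wifi
"""


def parse_iomem(text):
    """Yield (depth, start, end, name) for each recognizable line."""
    for line in text.splitlines():
        m = IOMEM_RE.match(line)
        if m:
            indent, start, end, name = m.groups()
            yield len(indent) // 2, int(start, 16), int(end, 16), name


def ranges_matching(text, keyword):
    """Return (start, end, name) for entries whose name contains keyword."""
    needle = keyword.lower()
    return [(start, end, name)
            for _, start, end, name in parse_iomem(text)
            if needle in name.lower()]


if __name__ == "__main__":
    try:
        with open("/proc/iomem") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the made-up sample off Linux
    for start, end, name in ranges_matching(text, "reserved"):
        print(f"{start:#x}-{end:#x} {name}")
```

Comparing the output for the WiFi driver's entry against what Windows 11 reports for the same device would at least show whether the two systems disagree about the ranges.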

(In reply to Nicolas B. Pierron [:nbp] from comment #4)

I wish I knew how to debug such a situation, but my skills are too far removed from gdb and building kernels with debug symbols :(
