Closed Bug 1644159 Opened 4 years ago Closed 3 years ago

Crash in [@ libxul.so (deleted)@0xd2bb50] - MOZ_CRASH(Library preload failure: Failed to get binary file)

Categories

(Core :: Security: Process Sandboxing, defect, P3)

Tracking

()

RESOLVED INVALID
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- wontfix
firefox77 --- wontfix
firefox78 --- wontfix
firefox79 --- wontfix
firefox83 --- wontfix

People

(Reporter: Sylvestre, Assigned: shravanrn)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-f9e8e84a-efc4-4efd-af08-f6d2b0200608.

Top 10 frames of crashing thread:

0 libxul.so (deleted) libxul.so @0xd2bb50 
1 libxul.so (deleted) libxul.so @0xd2bc4a 
2 libxul.so (deleted) libxul.so @0x22e5566 
3 libxul.so (deleted) libxul.so @0x22e5edd 
4 libxul.so (deleted) libxul.so @0x2102f8c 
5 libxul.so (deleted) libxul.so @0x2102fe8 
6 libxul.so (deleted) libxul.so @0x212dadd 
7 libxul.so (deleted) libxul.so @0x212dd7c 
8 libxul.so (deleted) libxul.so @0x20b3e5d 
9 libxul.so (deleted) libxul.so @0x20a0f64 

Second crash report: bp-cedeac23-f615-4b4c-a42a-32c910200608

This assertion is in ipc/glue/LibrarySandboxPreload.cpp

Component: General → Security: Process Sandboxing

The assertion is part of RLBox. This looks like it's caused by an update happening underneath a running install — libxul.so is shown as unlinked, and the mozilla::BinaryPath::Get call will try to readlink /proc/self/exe which would fail in that case.

We have code to catch the case where a newly launched child process is from the wrong version due to an update and show an error UI, but I don't think we have anything for trying to use mozilla::BinaryPath::Get in an already-running process.

Regressed by: 1575985
Has Regression Range: --- → yes
Component: Security: Process Sandboxing → General

Nathan, can you prioritize this bug (and find an owner)?

Component: General → Security: Process Sandboxing
Flags: needinfo?(nfroyd)
Assignee: nobody → shravanrn

Sorry for the delay here.
@Sylvestre: Just checking: is this a crash report from a machine that you own, or is it from an unknown machine? I am asking because it would be useful to get some extra information if you have access to the machine. In particular, did you install Firefox from the package manager, from Mozilla-provided binaries, or from another source?

I'm also hoping to find out whether you see a file called "libgraphitewasm.so" in the Firefox directory. Let me know if you need more information.

Flags: needinfo?(sledru)
Priority: -- → P3

There don't seem to be many crash reports of this type yet, so I'm keeping this as a P3 for now. We can reassess if there are more reports.

I agree with Shravan on the P3 designation. As we sandbox more and more stuff, this is going to become more and more of a problem, but I don't think we have to look at this right away. (Also not quite sure how we're going to solve it, since we're a long way away from the IPC layer at this point...)

Flags: needinfo?(nfroyd)

I have libgraphitewasm.so

$ ls -al libgraphitewasm.so
-rwxr-xr-x 1 sylvestre sylvestre 764896 juin  17 03:06 libgraphitewasm.so

I do have firefox debian packages but no libgraphitewasm.so

Flags: needinfo?(sledru)

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #2)

The assertion is part of RLBox. This looks like it's caused by an update happening underneath a running install — libxul.so is shown as unlinked, and the mozilla::BinaryPath::Get call will try to readlink /proc/self/exe which would fail in that case.

Correction: it will suffix (deleted), or at least it does for me in a simple test, and I don't see anything in this code that tests whether the path actually exists before removing the last component, so this ought to actually work (or at least fail later) if the installation was deleted by an update such that there's a new version at the same path.
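To illustrate the behaviour described above, here is a minimal Python sketch. It is not the Firefox code; it assumes a Linux /proc filesystem, and the helper names (read_self_exe, is_unlinked, sibling_path) are hypothetical. It shows that the link target of an unlinked executable ends in " (deleted)", and that dropping the last path component without an existence check still yields the original directory.

```python
import os
import posixpath

# Suffix the Linux kernel appends to the /proc/<pid>/exe target once the
# executable has been unlinked.
DELETED_SUFFIX = " (deleted)"


def read_self_exe(link="/proc/self/exe"):
    """Return the readlink target, or None where the link is unavailable."""
    try:
        return os.readlink(link)
    except OSError:
        return None


def is_unlinked(target):
    """Heuristic only: a file whose real name ends in the suffix also matches."""
    return target.endswith(DELETED_SUFFIX)


def sibling_path(exe_target, library_name):
    """Drop the last path component and append library_name.

    No existence check is made, mirroring the behaviour described in the
    comment above.
    """
    return posixpath.join(posixpath.dirname(exe_target), library_name)
```

For a made-up target such as "/opt/firefox/firefox (deleted)", sibling_path(target, "libgraphitewasm.so") gives "/opt/firefox/libgraphitewasm.so", so the lookup only fails later if that file is missing from the new installation.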

Also, I missed this before, but this happens in the content process during startup (RecvSetProcessSandbox), which means that in order to get libxul.so (deleted) in the crash report, there would have to be a content process launch racing with an update; maybe that's significant.

The severity field is not set for this bug.
:gcp, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gpascutto)

S4 due to low crash volume.

Severity: -- → S4
Flags: needinfo?(gpascutto)

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME

I think it is just because the signature name changed

Status: RESOLVED → REOPENED
Crash Signature: [@ libxul.so (deleted)@0xd2bb50] → [@ libxul.so (deleted)@0xd2bb50] [@ libxul.so (deleted)@0xd7a33d]
Resolution: WORKSFORME → ---
Crash Signature: [@ libxul.so (deleted)@0xd2bb50] [@ libxul.so (deleted)@0xd7a33d] → [@ libxul.so (deleted)@0xd2bb50] [@ libxul.so (deleted)@0xd7a33d] [@ libxul.so (deleted)@0xd8248d] [@ libxul.so (deleted)@0xd81ffd] [@ libxul.so (deleted)@0xd6eead] [@ libxul.so (deleted)@0xd67b6d]

I have this crash ~once a week (but it still seems pretty rare).

MOZ_CRASH(Library preload failure: Failed to get binary file )

Ah I see. Going off @jld's earlier comment

The assertion is part of RLBox. This looks like it's caused by an update happening underneath a running install

This could possibly happen on Linux when updating from a Mozilla-provided distribution (which uses wasm-based sandboxes and must preload, i.e. dlopen, these files early during process startup) to a package-manager-provided distribution (the Debian-based OS releases have disabled wasm sandboxing by default).

Unfortunately, I am not sure of the best way to address this. It sounds like the browser is running in some sort of mixed state between two different distributions. One thing I can do is defer the error: rather than crashing the process, we can fail silently. This is not a fix though, because as soon as any tab needs functionality that requires us to create a wasm sandbox (currently rendering a graphite font or demuxing an ogg file), we would then fail and crash.
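A rough sketch of what deferring the error could look like, written in Python purely for illustration. The class and method names (DeferredPreload, preload, create_sandbox, SandboxLibraryUnavailable) are hypothetical and do not correspond to anything in ipc/glue/LibrarySandboxPreload.cpp. The preload step records a failure instead of crashing, and the failure only surfaces when a sandbox is actually requested.

```python
class SandboxLibraryUnavailable(RuntimeError):
    """Raised when a sandbox is requested but the preload had failed."""


class DeferredPreload:
    def __init__(self, loader):
        # loader is any callable that opens a library path and returns a
        # handle, raising OSError on failure (hypothetical stand-in for dlopen).
        self._loader = loader
        self._handle = None
        self._error = None

    def preload(self, path):
        """Try to load early; remember the failure instead of crashing."""
        try:
            self._handle = self._loader(path)
        except OSError as exc:
            self._error = exc

    def create_sandbox(self):
        """Fail only at the point where the library is really needed."""
        if self._handle is None:
            raise SandboxLibraryUnavailable(
                "sandbox library was not preloaded"
            ) from self._error
        return self._handle
```

As noted above, this only moves the failure: a process whose preload failed still cannot render a graphite font or demux an ogg file.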

Any suggestions?

Flags: needinfo?(sledru)
Flags: needinfo?(jld)

Maybe it is happening when I am updating my Debian (and I usually have the Firefox package installed).
I will try to see if I can reproduce.

Anyway, the error message could be updated to make it more clear.
At least provide the lib name?

Flags: needinfo?(sledru)
Crash Signature: [@ libxul.so (deleted)@0xd2bb50] [@ libxul.so (deleted)@0xd7a33d] [@ libxul.so (deleted)@0xd8248d] [@ libxul.so (deleted)@0xd81ffd] [@ libxul.so (deleted)@0xd6eead] [@ libxul.so (deleted)@0xd67b6d] → [@ libxul.so (deleted)@0xd2bb50] [@ libxul.so (deleted)@0xd7a33d] [@ libxul.so (deleted)@0xd8248d] [@ libxul.so (deleted)@0xd81ffd] [@ libxul.so (deleted)@0xd6eead] [@ libxul.so (deleted)@0xd67b6d] [@ libxul.so (deleted)@0xd8832d]
Crash Signature: [@ libxul.so (deleted)@0xd2bb50] [@ libxul.so (deleted)@0xd7a33d] [@ libxul.so (deleted)@0xd8248d] [@ libxul.so (deleted)@0xd81ffd] [@ libxul.so (deleted)@0xd6eead] [@ libxul.so (deleted)@0xd67b6d] [@ libxul.so (deleted)@0xd8832d] → [@ libxul.so (deleted)@0xd2bb50] [@ libxul.so (deleted)@0xd7a33d] [@ libxul.so (deleted)@0xd8248d] [@ libxul.so (deleted)@0xd81ffd] [@ libxul.so (deleted)@0xd6eead] [@ libxul.so (deleted)@0xd67b6d] [@ libxul.so (deleted)@0xd8832d] [@ libxul.so (del…

I am getting this crash every day now.
I am not upgrading the system that often.

Crash Signature: (deleted)@0x13867b4] → (deleted)@0x13867b4] [@ libxul.so (deleted)@0x1393e34]

At least provide the lib name?

Yup, we can do this. I think we initially decided that adding the libname string would not be safe for error reporting, because we weren't using static strings, but I think given this bug, we should at least statically include library name strings.
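One way to picture "statically include library name strings" is a fixed table of pre-written messages, so that nothing formatted at run time ends up in the crash reason. The Python sketch below only models the idea; the table and function names are hypothetical, and in the real C++ code these would have to be string literals.

```python
# Generic message, matching the crash reason quoted in this bug.
GENERIC_MESSAGE = "Library preload failure: Failed to get binary file"

# Hypothetical fixed table: one pre-written message per known library.
PRELOAD_FAILURE_MESSAGES = {
    "libgraphitewasm.so": (
        "Library preload failure: Failed to get binary file "
        "for libgraphitewasm.so"
    ),
}


def preload_failure_message(library_name):
    """Return a pre-written message; never format caller data into it."""
    return PRELOAD_FAILURE_MESSAGES.get(library_name, GENERIC_MESSAGE)
```

Unknown names fall back to the generic message, so a local path or other user data cannot leak into a crash report.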

I am getting this crash every day now.

Oh, interesting. I think this may actually be good, as it is now easy to repro. I am trying to think of the best way to see what is happening here. Ideally I would just attach a debugger to the code; I'll assume this is not possible given that this is your personal machine (however, if this is a machine you can provide access to, that would be fantastic).

Alternatively, I am wondering about the best way to collect some information. As a starting point, could you provide the output of running nm on firefox, libxul.so and libgraphitewasm.so?

nm firefox > /tmp/firefoxsymbols.txt
nm libxul.so > /tmp/libxulsymbols.txt
nm libgraphitewasm.so > /tmp/libgraphitesymbols.txt

This will help me figure out if there is some weird mismatch of binaries that is happening.
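As one possible way to look for such a mismatch in these dumps, the Python sketch below parses nm's default output and lists symbols that one binary leaves undefined and another does not define. The function names are hypothetical, and this is an assumption about how the files could be compared, not necessarily the check intended here. Many undefined symbols are legitimately provided by system libraries, so a non-empty result is a hint rather than proof.

```python
def parse_nm(text):
    """Split default-format nm output into (defined, undefined) name sets."""
    defined, undefined = set(), set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            # "<address> <type> <name>": the symbol has an address here.
            name, target = fields[2], defined
        elif len(fields) == 2:
            # "<type> <name>" with a blank address column: not defined here.
            name, target = fields[1], undefined
        else:
            continue  # blank lines, archive member headers, etc.
        # Drop a symbol-version suffix such as "@GLIBC_2.2.5".
        target.add(name.split("@", 1)[0])
    return defined, undefined


def unresolved(consumer_nm, provider_nm):
    """Names the consumer needs that the provider does not define."""
    _, needed = parse_nm(consumer_nm)
    provided, _ = parse_nm(provider_nm)
    return needed - provided
```

For example, unresolved(open("/tmp/libgraphitesymbols.txt").read(), open("/tmp/libxulsymbols.txt").read()) would list what libgraphitewasm.so expects from elsewhere. If a binary is stripped, plain nm reports no symbols; nm -D lists the dynamic symbol table instead.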

Flags: needinfo?(sledru)

@sledru: Actually, I just realized an easier and more helpful option could also be if you could just zip up your firefox installation and attach it here. Please let me know.

Attached file nm.tar.gz
Flags: needinfo?(sledru)
Crash Signature: (deleted)@0x13867b4] [@ libxul.so (deleted)@0x1393e34] → (deleted)@0x13867b4] [@ libxul.so (deleted)@0x1393e34] [@ libxul.so (deleted)@0x13a66b4]

Shravan, did it help? Thanks

Flags: needinfo?(shravanrn)

Duplicate of #1550074 ?

Closing because no crashes reported for 12 weeks.

Status: REOPENED → RESOLVED
Closed: 4 years ago → 3 years ago
Resolution: --- → WORKSFORME
Flags: needinfo?(shravanrn)
Flags: needinfo?(jld)

Still happening with a different signature

Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Crash Signature: (deleted)@0x13867b4] [@ libxul.so (deleted)@0x1393e34] [@ libxul.so (deleted)@0x13a66b4] → (deleted)@0x13867b4] [@ libxul.so (deleted)@0x1393e34] [@ libxul.so (deleted)@0x13a66b4] [@ main]

Sorry, I think this got de-prioritized a bit since we were working on Windows support. The upside is that a couple of other changes have landed (Bug 1700534) that should hopefully make this much less likely. If we still see this issue, I will look into eliminating the need for a separately loaded library by including all relevant components inside xul directly (Bug 1572618). I'll try to circle back to this bug in a week or so.

I had a crash with a similar error message:
https://crash-stats.mozilla.org/report/index/7d1f9cdf-8508-4719-a7d9-586ba0210928
"MOZ_CRASH(Library preload failure: Failed to get binary file )"

on:
src/cairo-spans-compositor.c:909
which seems bizarre ?!

This should have completely gone away by now. We no longer load any files dynamically now that Bug 1572618 has landed. My best guess for the crash you saw is that some other code in Firefox may be using the same helper functions/error messages as the RLBox sandboxing feature. In this case, it looks like something related to font files, given that this is cairo.

@Sylvestre: I think it would make sense to track this latest crash in a separate bug, so I will close this for now.

Status: REOPENED → RESOLVED
Closed: 3 years ago → 3 years ago
Resolution: --- → INVALID

I haven't seen it for a while indeed, thanks :)
