Closed Bug 1691347 Opened 3 years ago Closed 3 years ago

Crash in [@ _platform_strstr] from cups

Categories

(Core :: Printing: Setup, defect)

Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
95 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 --- wontfix
firefox85 --- wontfix
firefox86 --- wontfix
firefox87 --- wontfix
firefox93 --- wontfix
firefox94 --- wontfix
firefox95 --- fixed

People

(Reporter: aryx, Assigned: alaskanemily)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

Crash report: https://crash-stats.mozilla.org/report/index/6949fd22-228e-4f05-bc67-aae9e0210206

Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS

Top 10 frames of crashing thread:

0 libsystem_platform.dylib _platform_strstr 
1 libcups.2.dylib _cupsGetDestResource 
2 libcups.2.dylib cupsCopyDestInfo 
3 libcups.2.dylib _ippFindOption 
4 libcups.2.dylib cupsCheckDestSupported 
5 XUL semver_parser::common::numeric_identifier third_party/rust/semver-parser/src/common.rs:55
6 XUL <semver::version::Version as core::convert::From<semver_parser::version::Version>>::from third_party/rust/semver/src/version.rs:129
7 XUL semver::version::Version::parse third_party/rust/semver/src/version.rs:218
8 libsystem_kernel.dylib mach_msg 
9 libsystem_kernel.dylib kdebug_trace 

This stacktrace seems pretty suspect, I don't really see how the semver-parser could call into CUPS. Possibly stack smashing?

We also use constexpr function pointers to get the CUPS functions on OS X, I don't think that this could be the CUPS shim being trashed.

From a different crash report;

Crash report: https://crash-stats.mozilla.org/report/index/d4207f0f-21e9-4045-b0ba-7cc9c0210219

Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS

Top 10 frames of crashing thread:

0 libsystem_platform.dylib _platform_strstr 
1 libcups.2.dylib _cupsGetDestResource 
2 libcups.2.dylib cupsCopyDestInfo 
3 XUL nsPrinterCUPS::TryEnsurePrinterInfo const widget/nsPrinterCUPS.cpp:342
4 XUL nsPrinterCUPS::Supports const widget/nsPrinterCUPS.cpp:166
5 XUL void mozilla::SpawnPrintBackgroundTask<nsPrinterBase, bool, > const widget/PrintBackgroundTask.h:53
6 XUL mozilla::detail::RunnableFunction<void mozilla::SpawnPrintBackgroundTask<nsPrinterBase, bool, > xpcom/threads/nsThreadUtils.h:534
7 XUL nsThreadPool::Run xpcom/threads/nsThreadPool.cpp:303
8 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1152
9 XUL mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:302
Severity: -- → S2

Seems macOS specific, will try to file a radar for them to hopefully take a look?

Flags: needinfo?(emilio)

Another crash report with a similar backtrace as hiro's:
https://crash-stats.mozilla.org/report/index/0e1911ba-49ba-40d7-a19d-2b9480211009

The innermost mozilla code here is:

  // All CUPS calls that take the printer info do null-checks internally, so we
  // can fetch this info and only worry about the result of the later CUPS
  // functions.
  lock->mPrinterInfo = mShim.cupsCopyDestInfo(aConnection, mPrinter);

Emily -- looks like you wrote this comment -- I'm wondering if maybe this is a case where CUPS is not doing a null-check internally? Maybe we need to add some null checks here to avoid tripping this crash?

Flags: needinfo?(emcdonough)

Ah, ok, let's try ^ first before filing one :)

Flags: needinfo?(emilio)

I'll take a look. In the past it wasn't too awful to trace through the CUPS code itself to find where this kind of crash occurred, so I'm going to at least try to figure that out first. It's possible the NULL is coming from something the arguments point to, rather than the arguments themselves.

Flags: needinfo?(emcdonough)
Flags: needinfo?(emcdonough)

This is just a large conditional to check for the specific branches taken
that would result in the null string. The path in CUPS is as follows:

In cupsCopyDestInfo, CUPS_DEST_FLAG_DEVICE is set when the connection is not
null (same as CUPS_HTTP_DEFAULT), the print server is not the same as our
hostname and is not path-based (starts with a '/'), or the IPP port is
different than the global server IPP port.

https://github.com/apple/cups/blob/c9da6f63b263faef5d50592fe8cf8056e0a58aa2/cups/dest-options.c#L718-L722

In _cupsGetDestResource, CUPS fetches the IPP options "printer-uri-supported"
and "device-uri". Importantly, IPP options are returned as null when missing.

https://github.com/apple/cups/blob/23c45db76a8520fd6c3b1d9164dbe312f1ab1481/cups/dest.c#L1138-L1141

If the CUPS_DEST_FLAG_DEVICE is set or the "printer-uri-supported" option is
not set, CUPS checks for "._tcp" in the "device-uri" option without doing a
NULL-check first.

https://github.com/apple/cups/blob/23c45db76a8520fd6c3b1d9164dbe312f1ab1481/cups/dest.c#L1144

If we find that those branches will be taken, don't actually fetch the CUPS
data and instead just return an empty printer info.

Assignee: nobody → emcdonough
Status: NEW → ASSIGNED

I can't really test that this patch will work since I can't reproduce the bug, but looking at the CUPS code this appears to be the only way a null-pointer could be passed to strstr in that callstack.

Assignee: emcdonough → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(emcdonough)
Assignee: nobody → emcdonough
Status: NEW → ASSIGNED

Unfortunately, this fix can only work on CUPS >= 2.0

Since the crash appears to happen much more often on OS X for whatever reason, and our minimum OS X version has a recent enough CUPS version, this should hopefully still fix this crash for a majority of cases.

Pushed by emcdonough@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a7c37fd765d9
Work around a null argument bug in _platform_strstr in CUPS. r=dholbert
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 95 Branch

(In https://phabricator.services.mozilla.com/D128529#inline-706999, @alaskanemily wrote)

I'm still waiting on the CUPS PR comments and possibly an accompanying github issue, it seems like most PR's don't have an accompanying issue for CUPS so I didn't open an issue first. I can add a link to the PR, or just wait and see what the best link back to that is and add it when something happens in the CUPS repo.

This looks like it needs to be fixed in OpenPrinting CUPS too. It think that's where new development is happening.

https://github.com/OpenPrinting/cups/blob/b94651b24d277d5ff9fe2c3c4756001400bc957b/cups/dest.c#L1145

Flags: needinfo?(emcdonough)

Yes, I can submit the patch to the OP repo too. This seems to almost exclusively happen on OS X currently so that's been my priority. It might be that dynamically loading CUPS as we do on Linux but not OS X is foiling proper backtraces, or that most people printing on Linux aren't using Avahi/Bonjour/mDNS printers.

Flags: needinfo?(emcdonough)

The underlying bug is now fixed in OpenPrinting CUPS 2.4

I'm still waiting for action on the patch for the Apple CUPS repo.

I just noticed we still seem to be getting instances of this crash, in version 95 which should theoretically be protected via the mitigation that we landed in comment 10 - 11 here.

I filed bug 1746956 to follow up (though it's really a continuation of this same bug report, but just with a hypothetical-future-patch landing in a later release).

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: