Last Comment Bug 573039 - Segfault with CUPS 1.4.4 (Print and Print Preview do not work)
: Segfault with CUPS 1.4.4 (Print and Print Preview do not work)
Status: RESOLVED FIXED
[qa-examined-191] [qa-examined-192] [...
: crash
Product: Core
Classification: Components
Component: Printing: Output (show other bugs)
: unspecified
: x86 Linux
: -- critical with 5 votes (vote)
: ---
Assigned To: Matthew Gregan [:kinetik]
:
Mentors:
: 516459 607258 (view as bug list)
Depends on:
Blocks: 257381
  Show dependency treegraph
 
Reported: 2010-06-18 08:54 PDT by Richard Fuller
Modified: 2011-02-24 01:07 PST (History)
21 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
final+
.14-fixed
.17-fixed


Attachments
patch v0 (882 bytes, patch)
2010-12-21 18:34 PST, Matthew Gregan [:kinetik]
karlt: review-
Details | Diff | Review
patch v1 (5.14 KB, patch)
2010-12-21 21:02 PST, Matthew Gregan [:kinetik]
karlt: review+
christian: approval1.9.2.14+
Details | Diff | Review
1.9.2 patch v1 (4.93 KB, patch)
2010-12-28 08:53 PST, Jory A. Pratt
no flags Details | Diff | Review
v1 1.9.1 branch (4.93 KB, patch)
2011-01-05 06:36 PST, Jory A. Pratt
christian: approval1.9.1.17+
Details | Diff | Review

Description Richard Fuller 2010-06-18 08:54:10 PDT
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-GB; rv:1.9.2.3) Gecko/20100611 Namoroka/3.6.3
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3

Please see http://www.cups.org/str.php?L3606 for detailed information.

Firefox (3.6.3) and Thunderbird (3.0.4) crash when trying to print, or print preview, when using CUPS 1.4.4.

Reproducible: Always

Steps to Reproduce:
1. Open any page
2. File > Print Preview
Actual Results:  
Product exits with segmentation fault.

Expected Results:  
A print preview should be displayed.

The CUPS developer states:

"The problem is that the Mozilla apps are dlopen'ing libcups, which then initializes the SSL library. They then dlclose libcups after the print dialog goes away which leaves the OpenSSL threading stuff pointing at functions that are no longer in the process address space.

One of two things needs to happen - Firefox/Thunderbird need to stop using dlopen/dlclose (or at least dlclose) for libcups, or OpenSSL and GNU TLS need to actually support threading out of the box and not depend on the application or library to provide threading support."
Comment 1 Boris Zbarsky [:bz] (Out June 25-July 6) 2010-06-18 09:04:16 PDT
Looks like this has been a problem ever since bug 257381 first landed...

roc, can we avoid the dlopen stuff here?
Comment 2 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2010-06-18 17:20:02 PDT
No idea. Karl might know.
Comment 3 Karl Tomlinson (ni?:karlt) 2010-06-19 01:29:37 PDT
libcups is not a runtime dependency, so we can't avoid the dlopen by linking directly to the library.
https://wiki.mozilla.org/Linux/Runtime_Requirements

If we can't tell libcups to shut down, then we shouldn't dlclose.
I can't imagine any reason why it must be closed.

I'm assuming from Zach's comments in bug 394413 that libcups is actually still necessary until we add something to replace it?
Comment 4 Zack Weinberg (:zwol) 2010-06-19 22:22:37 PDT
All I know for sure is that when I ripped out the MOZ_ENABLE_POSTSCRIPT code (which is really about direct poking at CUPS) I couldn't make gtk printing work afterward.  That was two years ago, things may have changed.  In principle, Gtk-native apps don't seem to need to poke at CUPS directly AFAIK so we shouldn't need to do it either.

This is not something I have time to look at in the near future.
Comment 5 Octavian Petre 2010-07-17 01:02:10 PDT
As a workaround, until this is fixed, one might use cups 1.4.3
Comment 6 Martin Stránský 2010-07-19 06:56:22 PDT
We have cups-1.4.4 in Fedora 14 so I'm going to work on it.
Comment 7 Jory A. Pratt 2010-07-20 06:36:50 PDT
(In reply to comment #6)
> We have cups-1.4.4 in Fedora 14 so I'm going to work on it.

Martin If you need anything let me know, I am in process of working on this as well for gentoo.
Comment 8 Martin Stránský 2010-07-21 05:55:59 PDT
Hm, I'm not sure why but I'm unable to reproduce in with Fedora 14. 

Package cups-1.4.4-5.fc13.i686, I tested mozilla binary 3.6.7, 3.6.4 from distro and debug build of ff4.b2pre and no problems, printing works fine and print preview too.
Comment 9 Zack Weinberg (:zwol) 2010-07-21 09:36:07 PDT
Is your libcups actually doing anything with libssl?
Comment 10 Martin Stránský 2010-07-30 06:54:33 PDT
Ahh, you're right, we built cups with gnutls for now.
Comment 11 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2010-08-03 12:34:53 PDT
This is pretty trivial to work around (simply not call dlclose and leak the library), so I think we should probably block on it, assuming there are actual Linux distros that it affects or will affect.

What distros does this affect?


That said, it also seems pretty bad that using cups once leaves stuff running for the rest of the lifetime of the browser.
Comment 12 Jory A. Pratt 2010-08-03 16:35:15 PDT
(In reply to comment #11)
> What distros does this affect?

This effects just about all distro's and *bsds. We are all now forcing users to link cups to gnutls to work around the bug.
Comment 13 Jory A. Pratt 2010-09-14 17:02:15 PDT
(In reply to comment #12)
> (In reply to comment #11)
> > What distros does this affect?
> 
> This effects just about all distro's and *bsds. We are all now forcing users to
> link cups to gnutls to work around the bug.

I rolled a snapshot at changeset b397e6db5067, I am now unable to produce segfault when requesting printing or print preview with ssl built cups. I will try to make sometime to track down what commit has fixed this but seems to be working as expected at the moment.
Comment 14 Jory A. Pratt 2010-09-18 06:30:57 PDT
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > What distros does this affect?
> > 
> > This effects just about all distro's and *bsds. We are all now forcing users to
> > link cups to gnutls to work around the bug.
> 
> I rolled a snapshot at changeset b397e6db5067, I am now unable to produce
> segfault when requesting printing or print preview with ssl built cups. I will
> try to make sometime to track down what commit has fixed this but seems to be
> working as expected at the moment.

It actually looks like openssl-1.0.0a resolves this issue, when cups links with it all seems to work fine. If someone on another distro could confirm some findings would be appreciated.
Comment 15 Frédéric Buclin 2010-10-27 04:28:43 PDT
*** Bug 607258 has been marked as a duplicate of this bug. ***
Comment 16 Frédéric Buclin 2010-10-27 04:39:06 PDT
(In reply to comment #12)
> > What distros does this affect?
> 
> This effects just about all distro's and *bsds.

Yeah, we are getting many reports about this problem:

https://qa.mandriva.com/show_bug.cgi?id=61213
https://qa.mandriva.com/show_bug.cgi?id=61009
https://bugzilla.novell.com/show_bug.cgi?id=617026

This bug affects current stable Linux distros + stable Fx (3.6.11), so more and more people are affected. Any chance to see this problem fixed in 3.6.x too?

Updating the bug summary to make it easier to find.
Comment 17 Frédéric Buclin 2010-10-27 06:29:04 PDT
(In reply to comment #14)
> It actually looks like openssl-1.0.0a resolves this issue, when cups links with
> it all seems to work fine. If someone on another distro could confirm some
> findings would be appreciated.

Jory, Mandriva guys seem to say this doesn't help, see:

https://qa.mandriva.com/show_bug.cgi?id=61009#c25
Comment 18 Daniel Veditz [:dveditz] 2010-10-29 10:25:27 PDT
We're not going to block branch releases on this, but when a reviewed and tested patch lands on trunk we'll take it if it's not introducing too much risk.
Comment 19 Octavian Petre 2010-10-31 00:34:59 PDT
"It actually looks like openssl-1.0.0a resolves this issue."


Gentoo
cups 1.4.4
openssl 1.0.0a
firefox 3.6.10
Comment 20 Jory A. Pratt 2010-12-08 19:54:51 PST
(In reply to comment #17)
> (In reply to comment #14)
> > It actually looks like openssl-1.0.0a resolves this issue, when cups links with
> > it all seems to work fine. If someone on another distro could confirm some
> > findings would be appreciated.
> 
> Jory, Mandriva guys seem to say this doesn't help, see:
> 
> https://qa.mandriva.com/show_bug.cgi?id=61009#c25

Are you sure that all apps where relinked to openssl-1.0.1a when testing? Based off what I seen in the post it is not the case.
Comment 21 Jory A. Pratt 2010-12-08 19:55:29 PST
(In reply to comment #20)
> (In reply to comment #17)
> > (In reply to comment #14)
> > > It actually looks like openssl-1.0.0a resolves this issue, when cups links with
> > > it all seems to work fine. If someone on another distro could confirm some
> > > findings would be appreciated.
> > 
> > Jory, Mandriva guys seem to say this doesn't help, see:
> > 
> > https://qa.mandriva.com/show_bug.cgi?id=61009#c25
> 
> Are you sure that all apps where relinked to openssl-1.0.1a when testing? Based
> off what I seen in the post it is not the case.

err excuse me I mean 1.0.0a
Comment 22 Jonas Sicking (:sicking) PTO Until July 5th 2010-12-14 10:06:49 PST
Sounds like it's not really affecting distros.
Comment 23 Frédéric Buclin 2010-12-14 10:10:38 PST
(In reply to comment #22)
> Sounds like it's not really affecting distros.

You should re-read comment 16!
Comment 24 Matthew Gregan [:kinetik] 2010-12-21 18:34:29 PST
Created attachment 499215 [details] [diff] [review]
patch v0

I looked at doing something clever here, but it doesn't seem to be worth the complexity.  The NS_WARNINGs might be unnecessary, but I figured they'd serve as a reminder to clean this up one day.
Comment 25 Karl Tomlinson (ni?:karlt) 2010-12-21 18:57:47 PST
Comment on attachment 499215 [details] [diff] [review]
patch v0

nsCUPSShims get instantiated and destroyed multiple times.

When not unloading cups, it would be better to use a static variable for the library or static variables for the symbols, than to open the library multiple times.

>             msg.Append(" not found in CUPS library");
>             NS_WARNING(msg.get());
> #endif
>-            PR_UnloadLibrary(mCupsLib);
>+            NS_WARNING("Unable to safely unload CUPS library; leaking reference");

I'm not certain, but I would hope it is safe to unload the library in this case where the library has not been used.
Comment 26 Matthew Gregan [:kinetik] 2010-12-21 19:19:17 PST
dlopen is refcounted:
       "If the same library is loaded again with dlopen(), the same file handle
       is returned.  The dl library maintains  reference  counts  for  library
       handles,  so  a  dynamic library is not deallocated until dlclose() has
       been called on it as many times as dlopen() has succeeded on  it."

So there doesn't seem to be any benefit to using static variables so that we only open the library once.
Comment 27 Karl Tomlinson (ni?:karlt) 2010-12-21 19:39:47 PST
I expect dlopen to open the new library should it be reinstalled, which could have the possibility of problems with symbol conflicts when both are installed, but NSPR prevents that because it keeps a mapping from filenames to handles (with reference counts).

So what you are claiming is that we are safe to continue incrementing the reference count either because it is not incremented frequently enough to overflow during the lifetime of the system or because it doesn't get decremented and so is never tested.

That may be the case but it feels ugly to me and is difficult to prove given that these objects are instantiated on the stack.

Is there a problem with using static variable(s)?
Comment 28 Matthew Gregan [:kinetik] 2010-12-21 21:02:44 PST
Created attachment 499233 [details] [diff] [review]
patch v1

New patch attached that addresses your concerns.

As an aside:
(In reply to comment #27)
> I expect dlopen to open the new library should it be reinstalled, which could

dlopen is keyed on the filename so this does not happen.
Comment 29 Karl Tomlinson (ni?:karlt) 2010-12-22 14:10:34 PST
Comment on attachment 499233 [details] [diff] [review]
patch v1

>         nsCUPSShim() : mCupsLib(nsnull) { }

>+nsCUPSShim gCupsShim;

With gcc 4.4.4 even -O3 I'm seeing

% nm -C obj/widget/src/gtk2/nsPSPrinters.o | grep global
0000000000000000 t global constructors keyed to nsPSPrinters.cpp

so unless we can know for sure that the static constructor is being optimized away,
I think this approach needs the user-declared mCupsLib constructor removed,
and a commented added to the class indicating that it intended only for static storage (or that default initialization must be performed).
Comment 30 Karl Tomlinson (ni?:karlt) 2010-12-22 14:11:37 PST
(In reply to comment #29)
>  mCupsLib constructor

I mean nsCUPSShim constructor.
Comment 31 Karl Tomlinson (ni?:karlt) 2010-12-22 14:18:05 PST
Comment on attachment 499233 [details] [diff] [review]
patch v1

r+ with the static constructor issue addressed.
Comment 32 Matthew Gregan [:kinetik] 2010-12-22 18:47:26 PST
http://hg.mozilla.org/mozilla-central/rev/6f38fc526313
Comment 33 Matthew Gregan [:kinetik] 2010-12-22 18:47:54 PST
Comment on attachment 499233 [details] [diff] [review]
patch v1

Sounds like this is needed on the 1.9.2 branch as well.
Comment 34 christian 2010-12-27 10:24:01 PST
Comment on attachment 499233 [details] [diff] [review]
patch v1

a=LegNeato for 1.9.2.14. We won't block the release on this though.
Comment 35 Jory A. Pratt 2010-12-28 08:53:52 PST
Created attachment 500048 [details] [diff] [review]
1.9.2 patch v1

Patch needed to be updated to apply cleanly to 1.9.2 branch.
Comment 36 Mounir Lamouri (:mounir) 2010-12-29 04:24:51 PST
Pushed in 1.9.2 branch:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/ffa1ef8ab52b
Comment 37 Jory A. Pratt 2011-01-05 06:36:27 PST
Created attachment 501308 [details] [diff] [review]
v1 1.9.1 branch

Allow to be applied cleanly
Comment 38 christian 2011-01-05 09:31:05 PST
Comment on attachment 501308 [details] [diff] [review]
v1 1.9.1 branch

Assuming the status flag request was set to ask approval. Approving for 1.9.1.17.
Comment 39 Mounir Lamouri (:mounir) 2011-01-06 06:40:58 PST
Pushed in 1.9.1 branch:
https://hg.mozilla.org/releases/mozilla-1.9.1/rev/bf6ce381ea0e
Comment 40 [On PTO until 6/29] 2011-01-06 16:33:34 PST
This doesn't affect all distros by default since I print preview in a downloaded Firefox 3.6.13 on Ubuntu 10.10 without any crashing. Is there a particular mechanism to force the browser to use CUPS?
Comment 41 Frédéric Buclin 2011-01-06 16:40:32 PST
(In reply to comment #40)
> This doesn't affect all distros by default since I print preview in a
> downloaded Firefox 3.6.13 on Ubuntu 10.10 without any crashing. Is there a
> particular mechanism to force the browser to use CUPS?

Are you sure 1) that CUPS 1.4.4 is installed?, and 2) that Ubuntu didn't already fix the problem themselves? Several distros didn't wait for Mozilla to fix the problem.
Comment 42 [On PTO until 6/29] 2011-01-06 16:44:31 PST
I'm using CUPS 1.4.4. I double-checked. I have no idea of whether Ubuntu fixed it or not. I'm not using their distribution of Firefox. The comments above stated it affected all distros so I used what I had.
Comment 43 Jory A. Pratt 2011-01-06 17:37:01 PST
(In reply to comment #42)
> I'm using CUPS 1.4.4. I double-checked. I have no idea of whether Ubuntu fixed
> it or not. I'm not using their distribution of Firefox. The comments above
> stated it affected all distros so I used what I had.

It is not so much the use of cups but rather what lib cups is linked to, Ubuntu as I understand it links to gnutls and not openssl hense you will not see the failure.
Comment 44 [On PTO until 6/29] 2011-01-06 17:58:58 PST
What specific distros will show the failure?
Comment 45 Jory A. Pratt 2011-01-06 18:01:03 PST
Hard to say as I do not have time to keep up with all distros, I would check with mandrake or even fedora, easiest way would be to check their bug reports and see if they are closed or still open before you go to installing.
Comment 46 Frédéric Buclin 2011-01-07 04:15:52 PST
(In reply to comment #44)
> What specific distros will show the failure?

Mandriva and OpenSUSE had the issue for sure, see comment 16. But if you try now, you won't be able to trigger the problem as they fixed the issue meanwhile.
Comment 47 Daniel Holbert [:dholbert] (largely AFK until June 28) 2011-02-24 01:07:09 PST
*** Bug 516459 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.