Closed Bug 5459 Opened 25 years ago Closed 25 years ago

libjpeg, libpng functions don't resolve since ImageLib 2 checkin

Categories

(Core :: Graphics: ImageLib, defect, P3)

x86
Linux
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: newt, Assigned: pnunn)

Details

(Whiteboard: done, flushed, gone, fixed)

On first run after new build, apprunner emits error messages about every
function used in libjpeg and libpng:

nsDOMPropsCoreFactory::nsDOMPropsCoreFactory
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'png_set_progressive_read_fn'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'png_create_read_struct'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'png_set_interlace_handling'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'png_set_gray_to_rgb'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'png_process_data'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'png_get_valid'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'png_set_expand'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'png_destroy_read_struct'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'png_read_update_info'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'png_create_info_struct'
**************************************************
nsComponentManager: Load(/util/www/mozilla/dist/bin/components/libnspng.so)
FAILED with error: Unable to resolve symbol
**************************************************
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_resync_to_restart'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'jpeg_read_scanlines'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_calc_output_dimensions'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_start_decompress'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_destroy_decompress'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'jpeg_std_error'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_CreateDecompress'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_has_multiple_scans'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'jpeg_consume_input'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_set_marker_processor'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'jpeg_read_header'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol
'jpeg_finish_decompress'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'jpeg_finish_output'
/util/www/mozilla/dist/bin/apprunner: can't resolve symbol 'jpeg_start_output'
**************************************************
nsComponentManager: Load(/util/www/mozilla/dist/bin/components/libnsjpg.so)
FAILED with error: Unable to resolve symbol
**************************************************
nsFindComponent's NSRegisterSelf successful

The strange thing is that both libjpeg and libpng are statically linked (and
successfully):

c++  -o apprunner ./nsAppRunner.o ./nsSetupRegistry.o ./nsUnixStubs.o -Wall
-pipe -g -L../../dist/./bin -L../../dist/./lib -L../../dist/./bin -lnsappshell
-lxpcom -lraptorbase -lwidgetgtk -lraptorgfx -lgfxgtk -lgfxps -lgmbasegtk -lreg
-labouturl -lhttpurl -lsockstuburl -lfileurl -lgophurl -lftpurl -lremoturl -lxp
-lnetutil -lnetcache -lnetcnvts -lmimetype -lnetwork -lnetlib -lraptorwebwidget
-lraptorhtml -lraptorhtmlpars -lexpat -lxmltok -ljsdom -lraptorplugin -ljsurl
-lraptorbase -lsecfree -lmozjs  -lpref -limg ../../dist/./lib/libjpeg.a
../../dist/./lib/libpng.a -lmozutil -lxp -lxpcom -lz
-L/usr/local/experimental/lib -lplds3 -lplc3 -lnspr3   -lpwcac
-L/usr/local/experimental/lib -L/usr/X11/lib -lgtk -lgdk -rdynamic -lgmodule
-lglib -ldl -lXext -lX11 -lm  -ll -ldl -lm

Subsequent invocations don't produce the error message anymore (it's now in the
registry?), but no PNG or JPEG images are ever displayed.
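For triage, the wrapped messages above can be reduced to a plain list of symbol names. The following is a minimal sketch and not part of the original report: `unresolved_symbols` is a hypothetical helper name, and it assumes the loader output was saved to a file in the format quoted above, where a symbol name may wrap onto the next line.

```shell
#!/bin/sh
# Hypothetical helper: print the symbol names found in
# "can't resolve symbol" messages read from stdin. Names that
# wrapped onto the following line are handled too.
unresolved_symbols() {
    awk -v q="'" '
        # Join all lines so a wrapped name rejoins its message.
        { buf = buf " " $0 }
        END {
            pat = "resolve symbol +" q "[^" q "]+" q
            while (match(buf, pat)) {
                hit = substr(buf, RSTART, RLENGTH)
                sub("^resolve symbol +" q, "", hit)  # drop the message prefix
                sub(q "$", "", hit)                  # drop the closing quote
                print hit
                buf = substr(buf, RSTART + RLENGTH)  # continue after this match
            }
        }
    '
}

# Usage (apprunner.log is an assumed file name):
#   unresolved_symbols < apprunner.log | sort -u
```

Every name in the quoted output starts with `png_` or `jpeg_`, and the two failed loads are libnspng.so and libnsjpg.so.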
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → INVALID
Greg:
Have you
1> removed your libimg directory
2> repulled the whole libimg directory
   cvs co mozilla/modules/libimg
3> flushed out the old autoconfig stuff:
   rm -f configure config.log config.cache config.status
4> reconfigured:
   autoconf -l build/autoconf
   ./configure --with-pthreads --enable-debug

5> gmake -f client.mk clobber_all
6> gmake -f client.mk build_all

I didn't mention the reconfig stuff since I assumed
you would do that when you pulled the whole tree.
I'm listing it here just in case.

All the info you mentioned tells me you are not building
with the new makefiles in libimg.
-pn
Status: RESOLVED → VERIFIED
Rubber-stamping as Verified; Greg, please re-open if appropriate. Thank you!
Status: VERIFIED → REOPENED
I nuked and rebuilt the config files as indicated by Pamela, did a new clobber
build (make clobber; make), re-ran, and got exactly the same error.  As far as I
can tell, you are under the mistaken impression that I'm talking about a build
error; the quoted errors are in fact *runtime* errors.  A simple failure to
reconfigure properly would have bombed immediately when it got to public_com or
pngcom (for example), because there would be no Makefile and hence no export or
libs targets.  (I know because this is indeed what happened the first time after
I *only* nuked libimg and pulled a new sub-tree.  I subsequently pulled the
whole tree specifically to get the updated config stuff.)

So, to summarize:  the build (still from a 4/23 pull of the complete tree)
succeeds completely, and apprunner runs (well, aside from core-dumping the first
time if the registry is not present).  But the first run generates the
previously reported errors, and subsequent runs simply fail to display PNG or
JPEG images.

Differences from Pam's build:  gcc 2.7.2.3, libc5, no pthreads, build from
mozilla directory, no clobber_all target, no RPMs, results in mozilla/dist,
SeaMonkey builds but doesn't work.
Greg:
do the links in mozilla/dist/bin/components/libnsgif.so, libnspng.so,
libnsjpg.so point to good files? what are the file sizes on these?
-pn
Resolution: INVALID → ---
OK, the real question is why Pam's version works at all under Linux; as far as I
can see, it never should.  But anyway...

This all gets back to the dynamic, runtime resolving and loading of shared
libraries by XPCOM, which is what I suspected; linking libnspng.so with libpng
(static or shared, but I only tested the latter) mostly fixes the problem.  That
is, doing the "c++ blah blah libnspng.so blah blah -L../png -lpng" line by hand,
nuking the registry, and running apprunner only produced the error this time for
libjpeg symbols, and after the usual core dump, apprunner started up and ran on
IMG PNGs just fine.  Unfortunately, the new imagelib seems to break the
just-fixed OBJECT PNG code for some reason, but I'll open a separate bug on that
after I try out Pamela's official fix for this problem.  Oddly enough, -lpng is
sufficient; apparently zlib gets pulled in by libpng.so itself.  That may not be
the case if you put libpng into libnspng statically.

I'd root through the makefiles and submit a patch, but I have to get some baby
pics online. :-)
Sorry, Pamela, I missed your last message/entry last night.  Yes, they all point
at real files, and here are the sizes:

-rwxr-xr-x   1 roelofs    158798 Apr 26 22:51 libnsgif.so*
-rwxr-xr-x   1 roelofs    159389 Apr 26 22:51 libnsjpg.so*
-rwxr-xr-x   1 roelofs    185355 Apr 27 18:42 libnspng.so*

libnspng.so has a later date since that's the one I rebuilt by hand; the stock
version was:

-rwxr-xr-x   1 roelofs    185331 Apr 26 22:51 libnspng.so.stock*

...so an extra 24 bytes to refer to libpng.so explicitly.
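The explicit reference mentioned here is recorded in the component's dynamic section, so it can be listed directly instead of being inferred from file sizes. A minimal sketch, assuming a binutils `objdump` is available on the build machine; `needed_libs` is a hypothetical helper name, and the path in the usage comment is the one from the load errors in the original report.

```shell
#!/bin/sh
# Hypothetical helper: read `objdump -p` output on stdin and print
# the libraries the object asks the dynamic loader for (its
# NEEDED entries), one per line.
needed_libs() {
    awk '$1 == "NEEDED" { print $2 }'
}

# Usage (path assumed from the error messages in the report):
#   objdump -p dist/bin/components/libnspng.so | needed_libs
```

A component that was linked against libpng should list a libpng entry here; one that was not will show none, which matches the unresolved `png_` symbols at load time.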
Greg:
As far as I can tell, there are only 2 of you on linux with runtime
problems. So something is different in the autoconf building, the
environment settings, the installed libraries.

In autoconf, there is logic for deciding which PNG/JPG/ZLIB libraries to
use. My guess is that the logic is falling down for your case and coop's
case.
-pn
Target Milestone: M5
ok.
Here is a way to build on linux with jpeg and png image displays
for linux folk that were seeing problems. It statically links the
jpeg lib into the jpeg decoder component, so the jpeg lib is only loaded if
the decoder component is requested. Ditto for the png decoder.

To get the png lib to load correctly, I had to force the use of the
moz png lib not the native one. While this is not optimal, it will
get people who are blocked, unblocked. I will keep working on a better
solution.

To configure the build and specify use of the local moz png lib, do:
./configure --with-pthreads --enable-debug --without-png

The changes to the makefiles will be checked in after I get some
Xheads to review and approve the changes.

Just to add a doc note, the magic needed to force the libs to
be linked with the api interface is
EXTRA_DSO_LDOPTS += $(FOO_LIBS)       in Makefile.in in the decoder dir.
Whiteboard: waiting for feedback on code review...
I've verified the problem with loading the system libpng; unfortunately, I don't
have any way to do the equivalent of "ldd apprunner" anymore since the
executable itself isn't what actually loads libpng.  What I did discover is that
my "-L../png -lpng" hack produces a libnspng.so that differs subtly from the one
that doesn't work:  it refers to libpng.so, while the one that uses a system
library (and doesn't find it) refers to libpng.so.2.  Since the system directory
has both libpng.so.2 and libpng.so links while Mozilla only creates the latter,
I suspected that whatever XPCOM or libimg thing does the runtime loading was
only looking in LD_LIBRARY_PATH, not the rest of the system dynamic libraries;
this indeed seems to be the case:  after I added a libpng.so.2 symlink in
dist/bin, it worked again, and when I deleted both libpng.so and libpng.so.2
from dist/bin, it failed to go look in the normal system directories.

Just for completeness, I added /usr/lib (libpng.so.2 location) to
LD_LIBRARY_PATH explicitly, nuked the registry and retried; it still didn't find
libpng.  So the XPCOM loader code does *not* use the proper dynamic-library load
paths but instead appears to have hardcoded MOZILLA_FIVE_HOME as its sole search
location for these libraries.  I believe that's an XPCOM bug, no?
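To see which directory would supply a given library name, one can walk a colon-separated directory list by hand. This is only a sketch: `find_lib` is a hypothetical helper that checks plain file existence in each directory, in order. It does not reproduce what ld.so or the XPCOM loader actually do (no ld.so cache, no rpath handling), so it only helps to compare candidate search lists.

```shell
#!/bin/sh
# Hypothetical helper: print the first "$dir/$name" that exists,
# where $dir comes from a colon-separated list. Returns 1 if no
# directory in the list contains the name.
find_lib() {
    name=$1
    rest=$2:                      # trailing colon simplifies the loop
    while [ -n "$rest" ]; do
        dir=${rest%%:*}           # first entry in the list
        rest=${rest#*:}           # remainder after the first colon
        [ -n "$dir" ] || dir=.    # an empty entry means the current directory
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Usage (the directory list is an assumption for illustration):
#   find_lib libpng.so.2 "$MOZILLA_FIVE_HOME:$LD_LIBRARY_PATH:/usr/lib"
```

Running it once with only MOZILLA_FIVE_HOME and once with the longer list shows whether the name would be found only when the system directories are searched.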

In other news, I also found that, unlike in my original bug report, apprunner
and viewer don't report a runtime failure to load the PNG module on the first
invocation after deleting the registry.  This also seems like a bug.
> I've verified the problem with loading the system libpng; unfortunately, I
> don't have any way to do the equivalent of "ldd apprunner" anymore since the
> executable itself isn't what actually loads libpng.

in mozilla/dist/bin/components:
ldd -r libnspng.so

> Since the system directory
> has both libpng.so.2 and libpng.so links while Mozilla only creates the
> latter, I suspected that whatever XPCOM or libimg thing does the runtime
> loading was only looking in LD_LIBRARY_PATH, not the rest of the system
> dynamic libraries; this indeed seems to be the case:  after I added a
> libpng.so.2 symlink in dist/bin, it worked again, and when I deleted both
> libpng.so and libpng.so.2 from dist/bin, it failed to go look in the normal
> system directories.

ok. This is the big issue. I'll pass this on to the XPCOM engineers.

> Just for completeness, I added /usr/lib (libpng.so.2 location) to
> LD_LIBRARY_PATH explicitly, nuked the registry and retried; it still didn't
> find libpng.  So the XPCOM loader code does *not* use the proper
> dynamic-library load paths but instead appears to have hardcoded
> MOZILLA_FIVE_HOME as its sole search location for these libraries.  I believe
> that's an XPCOM bug, no?

I'm not sure this is so much a bug, as a current status of the design.
I'll add the XPCOM engineers to this so they can be aware of the issue.


> In other news, I also found that, unlike in my original bug report, apprunner
> and viewer don't report a runtime failure to load the PNG module on the first
> invocation after deleting the registry.  This also seems like a bug.

Nope. These components are only loaded if they are requested and are available
for loading in the components directory. This way (once I get the mimetype
passing issues fixed) ANY image format can be dropped in and supported as an
inline image.

If someone wants to only view jpgs (or gifs or fractally compressed images) and
they have low machine resources, they can remove the image decoding components
they are not interested in.

-----------------------------
I got an ok from Steve (my X reviewer) and plan to check in these changes today.
The bug report is really not the best place for discussions. (It's hard to
write in the little comment window. :->  ) I'd like to
propose taking this to email or a newsgroup, and I'd like to close this once I get
my changes checked in and verified.

-pn
Yes, this bug can be closed as soon as you check in--the minor Makefile.in
changes you previously noted fix the original problem.

As for the discussion, either e-mail or the newsgroups work, but the latter is
probably more in keeping with the open development model.  I'd suggest .unix and
.xpcom, with an initial crosspost (or a separate pointer) in .builds (where
there's already a thread).
Status: REOPENED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
checked in. posted notice of change to builds newsgroup.
pn
Greg, could you possibly handle the verification on this bug, as its originator?

(I'm really not qualified to verify this bug.)
I thought developers still WON'T be able to use their native libpng.so in
/usr/lib.  If this bug isn't that and there is another bug for that, I
understand. If not, this bug should be kept OPEN.
only if they reconfig with:
./configure --without-png

I talked to Leaf about making this default for awhile.
-pn
OK, I've verified that the Makefile.in modifications resolve the specific bug I
originally reported.

What is still unclear is whether we should consider this bug fixed/verified and
open a new one or reopen this one.  That is, it still doesn't fix my problem,
and I am now completely confused as to why not.  I'm still supposedly building
with system (native) PNG libraries enabled.  If I copy or link the previously
Mozilla-built libpng.so to dist/bin/libpng.so.2 (not libpng.so, which is what
the current build creates), it all works fine.  But as soon as I replace that
with a link or copy of the actual system libpng.so.2, it stops working.  There
are no error or debug messages; PNG images simply don't display.  Putting the
Mozilla-built version back makes it start working again, just as silently.  For
whatever reason, Mozilla just does *not* like my system libpng.so.2.  (I
verified that the system library works fine with other apps like pngtopnm,
though.)

Note that both libpngs are the same version (1.0.2); the only obvious difference
is that the system one is half the size, presumably because it was an optimized
build.  Does anyone know if shared libraries must all be debug or all optimized?
I didn't think there was any problem with mixing them, but I'm stuck on this.

Also note (as posted to mozilla.unix/mozilla.xpcom) that I'm using libc5 and
ld.so 1.9.5, so "ldd -r libnspng.so" doesn't work--apparently that's a GNUism or
something (or the failure is a BSD bug?).  There may be other libc5 issues at
work, too.
Status: RESOLVED → VERIFIED
Whiteboard: waiting for feedback on code review... → done, flushed, gone, fixed
Marking verified, but see related bugs 5841 and 5842.

--Greg