PR_LoadLibraryWithFlags may accidentally load an old libc

Status: NEW (Unassigned)
Product: Toolkit
Component: OS.File
Priority: --
Severity: major
Opened: 4 years ago
Updated: 3 years ago
Reporter: David A. Madore
Version: 30 Branch
Platform: x86, Linux
Keywords: regression
Points: ---
Bug Flags: firefox-backlog +
Firefox Tracking Flags: firefox30 affected, firefox33 affected
Attachments: 1 attachment

(Reporter)

Description

4 years ago
I encountered the following symptom when running Firefox 30.0: the Return key is not responding in the URL bar (nor is the right arrow symbol at the end of the URL bar).  Some details and analysis follow, culminating in the hypothesis that the search service is not properly initialized.

This is using the official Linux 32-bit build of Firefox-30.0 found at <URL: https://download-installer.cdn.mozilla.net/pub/firefox/releases/30.0/linux-i686/en-US/firefox-30.0.tar.bz2 >.  The PC is a fairly slow Pentium D running Linux Debian stable=wheezy (32-bit userland/distribution, on a 64-bit 3.10.41 kernel: probably irrelevant).  Previous versions of Firefox worked fine on this machine.

I tried using a pristine profile, running in safe mode, and removing all plugins: none of this has any impact on the problem.

I am aware that the exact same version of Firefox runs fine on extremely similar hardware and software configurations, and I strongly suspect that a subtle race condition of some kind is at play.  Even on the exact config, the bug is not completely reproducible: launching Firefox 2 or 3 times on the same (initially pristine) profile will eventually work (perhaps because the profile is eventually in the disk cache, making startup times slightly different).  Subtle timing issues could also involve the fact that a popup window appears while Firefox is not fully initialized (either to request the master password or, on a pristine profile, to notify that the specific version of Firefox being used is not the default browser, or some such thing; I encountered bug #997901 in a similar context).  I am unfortunately unable to give exact circumstances that will trigger this bug.

The following error messages occur in the browser console when the bug happens:

mutating the [[Prototype]] of an object will cause your code to run very slowly; instead create the object with the correct initial [[Prototype]] value using Object.create Preferences.jsm:378
Could not read chrome manifest 'file:///usr/local/opt/firefox-30.0.MOZ/browser/extensions/%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D/chrome.manifest'.
Cannot initialize search service, bailing out: 2147500037 search.xml:94
[Exception... "Failure'Failure' when calling method: [nsIBrowserSearchService::currentEngine]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://app/components/nsBrowserGlue.js :: BrowserGlue.prototype._syncSearchEngines :: line 356"  data: no] nsBrowserGlue.js:356
Error in AboutHome.sendAboutHomeData: [Exception... "Failure'Failure' when calling method: [nsIBrowserSearchService::defaultEngine]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://app/modules/AboutHome.jsm :: AboutHome.sendAboutHomeData/< :: line 205"  data: no] AboutHome.jsm:221

The first two are irrelevant (I think) and are also logged when the bug does not happen, but the "Cannot initialize search service, bailing out: 2147500037" at search.xml:94 seems to be the cause of the problem.

Indeed, after this, clicking on the right arrow at the end of the URL bar causes the following error to be logged in the browser console:

NS_ERROR_FAILURE: Failure'Failure' when calling method: [nsIBrowserSearchService::currentEngine] browser.js:11478

Similarly, running "Services.search.getEngines()" in a scratchpad in browser context returns the same error.

I do not know enough about the workings of nsSearchService.js to understand what is going on beyond some failure in initialization.
Hello David,

Could you provide the following information?
* A screenshot of about:config filtered by "browser.search.".
* The output from the Browser Console with browser.search.log turned on. I'd like to know whether you are hitting _syncInit or _asyncInit. You may also see "failure loading engines" with some relevant info.
* The search.json file from your profile folder, as an attachment.
* Whether the same problem occurs with Nightly from https://nightly.mozilla.org/ in a new profile.

Thanks
status-firefox30: --- → affected
status-firefox33: --- → ?
Flags: needinfo?(david+bugs)
Flags: firefox-backlog+
Keywords: regression
(Reporter)

Comment 2

4 years ago
I am using a pristine profile to trigger the bug, assuming it will be simpler to debug that way.  So prefs.js contains nothing pertaining to browser.search except browser.search.log set to true as per your instructions.  The profile directory does not contain any file named search.json (I don't know if it should; my normal profile has such a file but also encounters the same bug).

Here is the browser console log when triggering the bug from a pristine profile only modified to set browser.search.log to true (sorry for the bad formatting):

mutating the [[Prototype]] of an object will cause your code to run very slowly; instead create the object with the correct initial [[Prototype]] value using Object.create Preferences.jsm:378
Could not read chrome manifest 'file:///usr/local/opt/firefox-30.0.MOZ/browser/extensions/%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D/chrome.manifest'.
SearchService.init
metadata init: starting
metadata init: could not load JSON file Unix error 2 during operation open on file /home/david/.mozilla/firefox/qmycs0vk.pristine/search-metadata.json (No such file or directory)
metadata init: complete
_asyncInit start
_asyncLoadEngines: start
_asyncReadCacheFile: Error reading cache file: Unix error 2 during operation open on file /home/david/.mozilla/firefox/qmycs0vk.pristine/search.json (No such file or directory)
_asyncLoadEngines: Absent or outdated cache. Loading engines from disk.
_asyncLoadEnginesFromDir: Searching in /usr/local/opt/firefox-30.0.MOZ/browser/searchplugins for search engines.
Uncaught asynchronous error: Unix error 2 during operation stat on file /usr/local/opt/firefox-30.0.MOZ/browser/searchplugins/ing.xml (No such file or directory) at
undefined
_asyncInit: failure loading engines: Unix error 2 during operation stat on file /usr/local/opt/firefox-30.0.MOZ/browser/searchplugins/ing.xml (No such file or directory)
Cannot initialize search service, bailing out: 2147500037 search.xml:94
_ensureInitialized: failure
[Exception... "Failure'Failure' when calling method: [nsIBrowserSearchService::currentEngine]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://app/components/nsBrowserGlue.js :: BrowserGlue.prototype._syncSearchEngines :: line 356"  data: no] nsBrowserGlue.js:356
_asyncInit: Completed _asyncInit
_ensureInitialized: failure
Error in AboutHome.sendAboutHomeData: [Exception... "Failure'Failure' when calling method: [nsIBrowserSearchService::defaultEngine]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://app/modules/AboutHome.jsm :: AboutHome.sendAboutHomeData/< :: line 205"  data: no] AboutHome.jsm:221
_ensureInitialized: failure

and here is the output in the terminal in which Firefox was launched:

*** Search: SearchService.init
*** Search: metadata init: starting
*** Search: metadata init: could not load JSON file Unix error 2 during operation open on file /home/david/.mozilla/firefox/qmycs0vk.pristine/search-metadata.json (No such file or directory)
*** Search: metadata init: complete
*** Search: _asyncInit start
*** Search: _asyncLoadEngines: start
*** Search: _asyncReadCacheFile: Error reading cache file: Unix error 2 during operation open on file /home/david/.mozilla/firefox/qmycs0vk.pristine/search.json (No such file or directory)
*** Search: _asyncLoadEngines: Absent or outdated cache. Loading engines from disk.
*** Search: _asyncLoadEnginesFromDir: Searching in /usr/local/opt/firefox-30.0.MOZ/browser/searchplugins for search engines.
*** Search: Uncaught asynchronous error: Unix error 2 during operation stat on file /usr/local/opt/firefox-30.0.MOZ/browser/searchplugins/ing.xml (No such file or directory) at
undefined
*** Search: Uncaught asynchronous error: Unix error 2 during operation stat on file /usr/local/opt/firefox-30.0.MOZ/browser/searchplugins/ing.xml (No such file or directory) at
undefined
*** Search: _asyncInit: failure loading engines: Unix error 2 during operation stat on file /usr/local/opt/firefox-30.0.MOZ/browser/searchplugins/ing.xml (No such file or directory)
*** Search: _ensureInitialized: failure
*** Search: _asyncInit: Completed _asyncInit
*** Search: _ensureInitialized: failure
*** Search: _ensureInitialized: failure

(There is, indeed, no file called browser/searchplugins/ing.xml in the Firefox distribution tarball; there is, instead, a file called browser/searchplugins/bing.xml: is this what it's all about?  But why would this be a problem only on one particular PC?)

I will be trying a Nightly soon.
(Reporter)

Comment 3

4 years ago
Tried with a nightly ("Built from https://hg.mozilla.org/mozilla-central/rev/1dc6b294800d" according to about:buildconfig), with the same results.  Here is the browser console log output with browser.search.log set to true on a pristine profile:

mutating the [[Prototype]] of an object will cause your code to run very slowly; instead create the object with the correct initial [[Prototype]] value using Object.create Preferences.jsm:378
Could not read chrome manifest 'file:///usr/local/opt/firefox-33.0a1.MOZ/browser/extensions/%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D/chrome.manifest'.
SearchService.init
metadata init: starting
metadata init: could not load JSON file Unix error 2 during operation open on file /home/david/.mozilla/firefox/emf775qa.pristine/search-metadata.json (No such file or directory)
metadata init: complete
_asyncInit start
_asyncLoadEngines: start
_asyncReadCacheFile: Error reading cache file: Unix error 2 during operation open on file /home/david/.mozilla/firefox/emf775qa.pristine/search.json (No such file or directory)
_asyncLoadEngines: Absent or outdated cache. Loading engines from disk.
_asyncLoadEnginesFromDir: Searching in /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins for search engines.
_asyncInit: failure loading engines: Unix error 2 during operation stat on file /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/ing.xml (No such file or directory)
_ensureInitialized: failure
uncaught exception: 2147500037
NS_ERROR_FAILURE: Failure'Failure' when calling method: [nsIBrowserSearchService::currentEngine] nsBrowserGlue.js:375
_asyncInit: Completed _asyncInit
Cannot initialize search service, bailing out: 2147500037 search.xml:94
SearchService.init
Error in AboutHome.sendAboutHomeData: 2147500037 AboutHome.jsm:248
Key event not available on some keyboard layouts: key="c" modifiers="accel,alt" browser.xul
SearchService.init
_ensureInitialized: failure
uncaught exception: 2147500037
A promise chain failed to handle a rejection. Did you forget to '.catch', or did you forget to 'return'?
See https://developer.mozilla.org/Mozilla/JavaScript_code_modules/Promise.jsm/Promise

Date: Mon Jul 07 2014 16:41:43 GMT+0200 (CEST)
Full Message: Failure'Failure' when calling method: [nsIBrowserSearchService::currentEngine] ContentSearch.jsm:211


And here is the terminal output for the same run:

*** Search: SearchService.init
*** Search: metadata init: starting
*** Search: metadata init: could not load JSON file Unix error 2 during operation open on file /home/david/.mozilla/firefox/emf775qa.pristine/search-metadata.json (No such file or directory)
*** Search: metadata init: complete
*** Search: _asyncInit start
*** Search: _asyncLoadEngines: start
*** Search: _asyncReadCacheFile: Error reading cache file: Unix error 2 during operation open on file /home/david/.mozilla/firefox/emf775qa.pristine/search.json (No such file or directory)
*** Search: _asyncLoadEngines: Absent or outdated cache. Loading engines from disk.
*** Search: _asyncLoadEnginesFromDir: Searching in /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins for search engines.
*** Search: _asyncInit: failure loading engines: Unix error 2 during operation stat on file /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/ing.xml (No such file or directory)
*** Search: _ensureInitialized: failure
*** Search: _asyncInit: Completed _asyncInit
*** Search: SearchService.init
*** Search: SearchService.init
*** Search: _ensureInitialized: failure
*** Search: SearchService.init
*** Search: SearchService.init
1404744170925   addons.manager  ERROR   Exception calling provider shutdown: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://app/modules/experiments/Experiments.jsm :: this.Experiments.PreviousExperimentProvider.prototype<.shutdown :: line 2188"  data: no] Stack trace: this.Experiments.PreviousExperimentProvider.prototype<.shutdown()@resource://app/modules/experiments/Experiments.jsm:2188 < callProvider()@resource://gre/modules/AddonManager.jsm:194 < AMI_unregisterProvider()@resource://gre/modules/AddonManager.jsm:849 < AMP_unregisterProvider()@resource://gre/modules/AddonManager.jsm:2321 < Experiments.Experiments.prototype._unregisterWithAddonManager()@resource://app/modules/experiments/Experiments.jsm:500 < Experiments.Experiments.prototype.uninit<()@resource://app/modules/experiments/Experiments.jsm:446 < TaskImpl_run()@resource://gre/modules/Task.jsm:314 < TaskImpl_handleResultValue()@resource://gre/modules/Task.jsm:393 < TaskImpl_run()@resource://gre/modules/Task.jsm:322 < TaskImpl()@resource://gre/modules/Task.jsm:275 < createAsyncFunction/asyncFunction()@resource://gre/modules/Task.jsm:249 < Barrier.prototype<._wait()@resource://gre/modules/AsyncShutdown.jsm:571 < Barrier.prototype<.wait()@resource://gre/modules/AsyncShutdown.jsm:537 < Spinner.prototype.observe()@resource://gre/modules/AsyncShutdown.jsm:335 < <file:unknown>
WARNING: A completion condition encountered an error while we were spinning the event loop. Condition: Telemetry: shutting down Phase: profile-before-change2 State: (none)
WARNING: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/TelemetryPing.jsm :: detachObservers :: line 863"  data: no]
(Reporter)

Comment 4

4 years ago
OK, so I guess this is a bug in OS.File.DirectoryIterator: I tried running the following code in a scratchpad in browser context:

var testPath = "/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins";
var testIterator = new OS.File.DirectoryIterator(testPath);
var testPromise = testIterator.forEach(function(ent) {console.log(ent.path);});
testPromise.then(function(){testIterator.close();}, Components.utils.reportError);

and I got the following output:

"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/ing.xml" Scratchpad/1:3
"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/" Scratchpad/1:3
"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/mazondotcom.xml" Scratchpad/1:3
"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/Bay.xml" Scratchpad/1:3
"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/oogle.xml" Scratchpad/1:3
"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/ahoo.xml" Scratchpad/1:3
"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/witter.xml" Scratchpad/1:3
"/usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/ikipedia.xml" Scratchpad/1:3

In other words, there is one character missing at the start of every filename!

(This is, again, on Firefox nightly, with a pristine profile.)

I guess the bug should be reassigned and reclassified, then.  I don't know how to do this: I am awaiting further instructions.
Flags: needinfo?(david+bugs)
(In reply to David A. Madore from comment #4)
> OK, so I guess this is a bug in OS.File.DirectoryIterator

That was going to be my guess too.

David A. Madore, could you provide more info about your file system and mount options?

Yoric, is comment 4 a known issue?
status-firefox33: ? → affected
Component: Search → OS.File
Flags: needinfo?(dteller)
Flags: needinfo?(david+bugs)
Product: Firefox → Toolkit
I can't reproduce this. I suppose that this might be a problem with file path encodings (bug 978056).
Flags: needinfo?(dteller)
(Reporter)

Comment 7

4 years ago
I don't think there's anything remarkable about my mount options: here is the output of "mount" and of ls of the various paths leading up to the directory in question:

pleiades david ~ $ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=506069,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=405548k,mode=755)
/dev/md1 on / type ext3 (rw,relatime,sync,errors=continue,barrier=1,data=writeback)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=1967760k)
/dev/md0 on /boot type ext3 (rw,relatime,sync,errors=continue,barrier=1,data=writeback)
/dev/md2 on /var type ext3 (rw,relatime,errors=continue,barrier=1,data=writeback)
/dev/md3 on /home type reiserfs (rw,relatime,acl)
/dev/md4 on /usr type ext3 (rw,relatime,errors=continue,acl,barrier=1,data=writeback)
/dev/md6 on /tmp type ext2 (rw,relatime,errors=continue,user_xattr,acl)
/dev/md7 on /data type xfs (rw,relatime,attr2,inode64,noquota)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
/dev/md5 on /usr/local type ext3 (rw,relatime,errors=continue,acl,barrier=1,data=writeback)
pleiades david ~ $ ls -ld /opt /usr /usr/local /usr/local/opt /usr/local/opt/firefox-33.0a1.MOZ /usr/local/opt/firefox-33.0a1.MOZ/browser /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/bing.xml
lrwxrwxrwx  1 root root    13 Aug 25  2008 /opt -> usr/local/opt
drwxr-xr-x 14 root root  4096 Sep  2  2013 /usr
drwxr-xr-x 10 root root  4096 Jun 21  2009 /usr/local
drwxr-xr-x 16 root root  4096 Jul  7 16:38 /usr/local/opt
drwxr-xr-x  8 bin  root  4096 Jul  7 16:39 /usr/local/opt/firefox-33.0a1.MOZ
drwxr-xr-x  7 bin  root  4096 Jul  7 14:21 /usr/local/opt/firefox-33.0a1.MOZ/browser
drwxr-xr-x  2 bin  root  4096 Jul  7 14:21 /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins
-rw-r--r--  1 bin  root 13068 Jul  7 14:12 /usr/local/opt/firefox-33.0a1.MOZ/browser/searchplugins/bing.xml

The directory /usr/local/opt/firefox-33.0a1.MOZ contains exactly the contents of the nightly tarball.  I can't think of anything strange about my config.

In any case, if I try using OS.File.DirectoryIterator on any directory whatsoever (as per code snippet in comment 4), the result is always the same: the first character is removed from every filename and a spurious file with empty name is also returned.  I tried this across a number of different filesystems, so it's probably not filesystem-related.

There are no non-ASCII characters in any of the filenames involved, and I also reproduced the bug under a number of different locales (at least POSIX, en_US and en_US.UTF-8), so I don't think it's encoding-related.

I have now been able to reproduce the bug on two different PCs (with very similar configs), but I still don't know what the determining factor is.  Both have in common that they run a 64-bit Linux kernel with a 32-bit userland, so maybe that's relevant.  But even that is insufficient (I tried running different combinations of 64-bit kernel and 32-bit userland in virtual boxen, and none of them exhibited the bug).

Is there some way to get debug output from osfile_unix_front.jsm and osfile_unix_back.jsm (I can apply patches and recompile FF if necessary; I just don't know how these worker threads work)?
Flags: needinfo?(david+bugs)
You can activate OS.File logging by setting option `toolkit.osfile.debug` to `true`.

Also, can you run the same experiment, but displaying `entry.name` and `this._path` instead of (or in addition to) `entry.path`?
(Reporter)

Comment 9

4 years ago
I ran the test with toolkit.osfile.debug=true and also toolkit.osfile.log=true (not sure which was right, so I used both).  The directory /tmp/test contained exactly two (empty) files called foo and bar and I ran

var testPath = "/tmp/test";
var testIterator = new OS.File.DirectoryIterator(testPath);
var testPromise = testIterator.forEach(function(ent) {console.log(ent.path, ent.name, testIterator._path);});
testPromise.then(function(){testIterator.close();}, Components.utils.reportError);

The relevant part of the terminal output is this (I can provide full output if it's of any use):

OS Controller Message posted
OS Controller Expecting reply
OS Controller Received message from worker {"id":20}
OS Controller Posting message {"fun":"new_DirectoryIterator","args":[{"string":"/tmp/test"},null],"id":21}
OS Controller Message posted
OS Controller Expecting reply
OS Controller Received message from worker {"ok":3,"id":21}
OS Controller Posting message {"fun":"DirectoryIterator_prototype_next","args":[3],"id":22}
OS Controller Message posted
OS Controller Expecting reply
OS Controller Received message from worker {"ok":{"isDir":false,"isSymLink":false,"name":"","path":"/tmp/test/"},"id":22}
OS Controller Posting message {"fun":"DirectoryIterator_prototype_next","args":[3],"id":23}
OS Controller Message posted
OS Controller Expecting reply
OS Controller Received message from worker {"ok":{"isDir":false,"isSymLink":false,"name":"oo","path":"/tmp/test/oo"},"id":23}
OS Controller Posting message {"fun":"DirectoryIterator_prototype_next","args":[3],"id":24}
OS Controller Message posted
OS Controller Expecting reply
OS Controller Received message from worker {"ok":{"isDir":false,"isSymLink":false,"name":"ar","path":"/tmp/test/ar"},"id":24}
OS Controller Posting message {"fun":"DirectoryIterator_prototype_next","args":[3],"id":25}
OS Controller Message posted
OS Controller Expecting reply
OS Controller Received message from worker {"fail":{"exn":"StopIteration"},"id":25}
OS Controller Got error {"data":{"exn":"StopIteration"}}
OS Controller Posting message {"fun":"DirectoryIterator_prototype_close","args":[3],"id":26}

So the f in foo and the b in bar were cut already in the messages received from the worker thread.

The output in the browser console is:

"/tmp/test/" "" undefined Scratchpad/1:3
"/tmp/test/oo" "oo" undefined Scratchpad/1:3
"/tmp/test/ar" "ar" undefined Scratchpad/1:3

(Was undefined supposed to be something else?  I tried all of testIterator._path and testPromise._path and this._path in the code snippet above, none of them exists.)

I guess this gets us a little bit further down the rabbit-hole.  How many layers until the actual system call?
Summary: "Cannot initialize search service, bailing out: 2147500037" (nsSearchService.init returns NS_ERROR_FAILURE) → OS.File.DirectoryIterator removing first character of file names
(Reporter)

Comment 10

4 years ago
Ha!  I just had a stroke of genius, and I understood how this bug came to be and also why it affects so few people.

Here's the thing: my 32-bit Linux systems still have the old pre-glibc version of libc (libc5) installed, as /lib/libc.so.5.4.46

And sure enough, it appears in Firefox's /proc/$PID/maps

In other words, for some reason, Firefox's JavaScript foreign function interface dynlinks the wrong libc version.  (I'm surprised this doesn't instantly translate to a segfault, actually.)  And I imagine the dirent structure moved the d_name offset by 1 byte between libc5 and libc6 (probably for alignment), and the JavaScript interface binding is created from header data that relates to the libc6 structure.  So, one byte chopped off.
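To illustrate the hypothesis above, here is a minimal sketch in plain JavaScript (runnable outside Firefox, no OS.File involved). The offsets 10 and 11 for d_name are assumptions chosen for illustration, not values checked against either libc's headers, and the helper names makeRecord and readName are hypothetical.

```javascript
// Assumed layouts (illustrative only): the library that fills the record
// puts d_name at byte 10, while the binding expects it at byte 11.
const WRITER_NAME_OFFSET = 10; // assumption: layout used by the old libc
const READER_NAME_OFFSET = 11; // assumption: layout the binding was built for

// Build a fake directory record holding a NUL-terminated ASCII name.
function makeRecord(name, nameOffset) {
  // Uint8Array is zero-filled, so the trailing byte acts as the terminator.
  const record = new Uint8Array(nameOffset + name.length + 1);
  for (let i = 0; i < name.length; i++) {
    record[nameOffset + i] = name.charCodeAt(i);
  }
  return record;
}

// Read a NUL-terminated ASCII string starting at the given offset.
function readName(record, nameOffset) {
  let name = "";
  for (let i = nameOffset; i < record.length && record[i] !== 0; i++) {
    name += String.fromCharCode(record[i]);
  }
  return name;
}

// Names as the binding would see them when the two layouts disagree.
const seen = ["foo", "bar", ".", ".."].map(function (name) {
  return readName(makeRecord(name, WRITER_NAME_OFFSET), READER_NAME_OFFSET);
});
console.log(seen); // each name has lost its first character
```

Under these assumptions "." is read back as an empty name and ".." as ".", which would be consistent with the single spurious empty entry seen in comment 9 if the iterator filters "." and ".." by name. That last part is a guess, not something verified in the OS.File source.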

Sure enough, if I move /lib/libc.so.5.4.46 out of the way, the bug disappears.  And I guess very few Firefox users have such old versions of the libraries lying around (although I know someone who still has a working a.out libc4!).

So, to fix this, you "just" need to make sure you link to the exact version of libc that Firefox is already using.  (This should also prevent a similar potential problem on OpenBSD, which I believe has the habit of changing their libc interface every two weeks.)
Ah, thanks for the catch. I was triple-checking our code and couldn't find anything suspicious.

Mike, do you have any idea how we could solve this?
Flags: needinfo?(mh+mozilla)
Summary: OS.File.DirectoryIterator removing first character of file names → PR_LoadLibraryWithFlags may accidentally load an old libc
(Reporter)

Comment 12

4 years ago
A comment about the original symptom: even if the problem should, of course, be treated at its cause, wouldn't it be good, in the interest of robustness, to also mitigate the symptom?  I.e., even if nsSearchService.js somehow fails to initialize properly, it would be nice if the address bar didn't become completely non-functional.
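As a rough sketch of what such a mitigation could look like, here is some plain JavaScript (not actual Firefox code). Everything in it is hypothetical: brokenSearchService only mimics a service whose currentEngine getter throws, and resolveUrlbarInput is an invented helper, not a function in browser.js.

```javascript
// Hypothetical stand-in for a search service whose initialization failed.
const brokenSearchService = {
  get currentEngine() {
    throw new Error("NS_ERROR_FAILURE"); // mimics the failure in the logs
  },
};

// Hypothetical helper: decide what the URL bar should do on Return.
// If the search service is unusable, fall back to loading the typed text
// instead of letting the exception abort the key handler.
function resolveUrlbarInput(text, searchService) {
  const hasScheme = /^[a-z][a-z0-9+.-]*:/i.test(text);
  const looksLikeHost = /^[^\s]+\.[^\s]+$/.test(text);
  if (hasScheme || looksLikeHost) {
    return { action: "load", value: text };
  }
  try {
    const engine = searchService.currentEngine;
    return { action: "search", value: text, engine: engine.name };
  } catch (e) {
    // Search is unavailable: degrade gracefully rather than do nothing.
    return { action: "load", value: text, warning: String(e.message) };
  }
}

console.log(resolveUrlbarInput("https://example.org/", brokenSearchService));
console.log(resolveUrlbarInput("some words", brokenSearchService));
```

The point is only that a failure in the search service need not prevent plain URLs from loading; how the real URL bar code should handle the fallback is a separate design question.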
Created attachment 8453025 [details] [diff] [review]
Attempt to contain the error to nsSearchService (EXPERIMENTAL, DO NOT LAND)

Does this patch succeed at containing the error?
Flags: needinfo?(david+bugs)
(Reporter)

Comment 14

4 years ago
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo") from comment #13)
> Does this patch succeed at containing the error?

No.  Now the browser console says "uncaught exception: 2147500037" (and the URL bar still doesn't work).

According to the debugger (and assuming I'm using it properly), this exception seems to be thrown in nsSearchService.js in the function _ensureInitialized of SearchService.prototype, at line 2768 (or thereabouts), which says "throw this._initRV;"
Flags: needinfo?(david+bugs)
(In reply to David A. Madore from comment #10)
> Sure enough, if I move /lib/libc.so.5.4.46 out of the way, the bug
> disappears.  And I guess very few Firefox users have such old versions of
> the libraries lying around (although I know someone who still has a working
> a.out libc4!).

Do you have a libc.so symbolic link that points to that libc.so.5? If not, I don't see how it would get picked up.

(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo") from comment #11)
> Ah, thanks for the catch. I was triple-checking our code and couldn't find
> anything suspicious.
> 
> Mike, do you have any idea how we could solve this?

How about making osfile stop searching for all irrelevant libraries? libSystem.B.dylib is there only for Mac, libc.so only for Android, and a.out for the other Unices. Obviously, having libc.so before a.out can do surprising things on Linux.
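A minimal sketch of that idea in plain JavaScript, assuming the library names given in the comment above. The platform keys ("macosx", "android"), the helper names libcCandidates and openLibc, and the tryOpen callback are all hypothetical; this is not the osfile code.

```javascript
// Hypothetical mapping from platform to the only library name worth trying.
// The names are those mentioned in the comment; the keys are assumptions.
const LIBC_BY_PLATFORM = {
  macosx: ["libSystem.B.dylib"],
  android: ["libc.so"],
};
const DEFAULT_LIBC = ["a.out"]; // other Unix-like systems, per the comment

// Return the candidate names for one platform, never the full list.
function libcCandidates(platform) {
  if (Object.prototype.hasOwnProperty.call(LIBC_BY_PLATFORM, platform)) {
    return LIBC_BY_PLATFORM[platform];
  }
  return DEFAULT_LIBC;
}

// Try the candidates in order with a caller-supplied opener that throws
// on failure; names belonging to other platforms are never attempted.
function openLibc(platform, tryOpen) {
  for (const name of libcCandidates(platform)) {
    try {
      return { name: name, handle: tryOpen(name) };
    } catch (e) {
      // Fall through to the next candidate, if any.
    }
  }
  throw new Error("could not open libc for platform " + platform);
}

// Fake opener that records what was requested, for demonstration only.
const requested = [];
const fakeOpen = function (name) {
  requested.push(name);
  return "handle:" + name;
};
console.log(openLibc("linux", fakeOpen).name, requested);
```

With a per-platform list, a stray libc.so symlink on a Linux machine is never even looked up, because that name is only tried where it is expected to exist.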
Flags: needinfo?(mh+mozilla)
(Reporter)

Comment 16

4 years ago
I have /usr/i486-linuxlibc1/lib/libc.so belonging to the Debian libc5-altdev package which is a symlink to /lib/libc.so.5.4.46 and /usr/lib/i386-linux-gnu/libc.so (belonging to libc6-dev) which is a linker script as is standard.  It's probably the combination of both that causes problems: dlopen("libc.so") fails to load /usr/lib/i386-linux-gnu/libc.so because it's a text file, and succeeds in loading the (wrong version) /usr/i486-linuxlibc1/lib/libc.so

I may be wrong about this, but I believe the correct way to dlopen libc, at least under GNU/Linux, is to pass NULL as argument to dlopen(), rather than "libc.so" or even "libc.so.6": afaik, dlopen(NULL) returns the same handle as RTLD_DEFAULT, which means "the set of libraries that is currently dynlinked", and libc is always among them, so this ensures that the correct version will be used.  Documentation is very unclear, though.
(In reply to David A. Madore from comment #12)
> A comment about the original symptom: even if the problem should, of course,
> be treated at its cause, wouldn't it be good, in the interest of robustness,
> to also mitigate the symptom?  I.e., even if nsSearchService.js somehow
> fails to initialize properly, it would be nice if the address bar didn't
> become completely non-functional.

Yes, we should dive into this a bit more - can you file a separate bug about that and mention its number here?
BTW - thanks a lot for digging into this one, David.