Closed Bug 519357 (compdir-lockdown) Opened 15 years ago Closed 15 years ago

Only load known components from app directory

Categories

(Core :: XPCOM, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1
Tracking Status
status1.9.2 --- beta3-fixed
blocking1.9.1 --- -
status1.9.1 --- wontfix

People

(Reporter: vlad, Assigned: benjamin)

References

(Blocks 2 open bugs)

Details

(4 keywords)

Attachments

(2 files, 1 obsolete file)

We have a current problem where some non-malware apps are placing binary components in the Firefox app components dir.  Their components were written for 3.0, and are causing crashes in 3.5.  We need to lock this directory down -- they should be adding any extensions/components as an addon, that the user can then see and optionally disable, and that we can blocklist if necessary.
Attached patch something like this (obsolete) — Splinter Review
Something like this seems to work.

Manual maintenance of the list is a problem, but not a huge one; ideally in the future we would autogenerate the list from the packaging files, though that would have to be a step done right at the start of the build process, and there's some weirdness here -- e.g. we are locking down the app dir from xpcom, so in an ideal world the app would set a lockdown list of its own components at InitXPCOM time.
Attachment #407651 - Flags: superreview?
Attachment #407651 - Flags: review?(benjamin)
Yeeeeessssssss.
Flags: blocking1.9.2+
Priority: -- → P2
If we can do this safely on 1.9.1, I really, really want it.
blocking1.9.1: --- → ?
What legit things might we break doing it?

I'm not ready to say we'll block 1.9.1.5 on this, but I'd love to see some testing in 1.9.2 showing we can do this safely.
blocking1.9.1: ? → needed
We could break any number of things -- we don't tend to see the stuff that goes in there and doesn't crash or otherwise disrupt visibly -- but they are things that the users aren't necessarily aware of being installed.

I'm a little nervous, tbh, about 1.9.2 for this without a beta, since there won't be a chance for integrating-types to repackage as an add-on until they find out in the RC that their integration went away.  Not nervous enough to not do it, but I would be pretty receptive to ways to reduce that nervousness! :-)
(In reply to comment #4)
> What legit things might we break doing it?

Depends how you look at "legit" -- I'd say "none", since dropping things in our components dir was never a legit way to do anything...
How will users look at legit?  "Something was working before" -- if we don't have stability as a justification, "we told them so" isn't going to impress anyone.
I'd like to see how a Makefile.in can say "Allow this component name too".  Maybe they can be appended to a .h file which a tier_toolkit C++ file (or tools_tier_xpcom?) can read from when it's compiled?

I'm thinking in terms of a XULRunner app with its own custom binary components.  I'd hate to have "browserdirprovider.dll" on the list if I don't have a browserdirprovider built - especially if someone else later creates a binary component with that name and I didn't know about it!  That would defeat the purpose of restricting by file name.  

(Speaking of which, what prevents an extension from replacing an existing binary component?  If it's small enough, the user wouldn't necessarily notice any missing functionality.  Perhaps we could use this list of known binary components and mark them protected, or create a hash to validate them.  But that's me being paranoid.)

Forgive me if I've asked ignorant questions here.
Yeah, I don't think we'd turn this on for XULRunner apps, at least at first.  We can get some experience with Firefox first, since it's the bigger "target".  But if you are willing to do the Makefile diving that vlad describes above, I would certainly expect some XULRunner stuff in your patch to be gratefully accepted! :)

This isn't designed to protect against attacks on Firefox; that is a hard battle to win (though we do the hash check on every update, and pave over if there's a mismatch).  This is to close off an extension mechanism that "happened to work" but burned us really badly when we went from 3 to 3.5 and a bunch of jammed-in binary components started crashing people.  If it doesn't work, people will find another way: add-ons!
Yep, what shaver said.  Those are all good questions to ask, though; I'm not really happy with the mechanism that the patch proposes right now, though it's the simplest thing that will work with the least risk.  As a followup, I'd like to have the gecko/libxul/GRE components locked down, and then have each app opt-in to the lockdown by passing in a list at xpcom init time.  The generic xulrunner runtime could provide some way for apps to pass in that list, though keeping it in a file defeats the purpose since it's easy to add entries :-)
Mossop is working on a patch for 1.9.3 where components are registered using a manifest (eventually chrome.manifest), we are going to stop directory-enumerating to find manifests and just load a single one. So I think we should save the longer-term discussion for the newsgroups and eventually bug 505983 and its relations.
For 1.9.3 we may well not want to load binary components from extensions at all, given a good show by jsctypes, but that'll be good for JS components as well, I suppose!
Comment on attachment 407651 [details] [diff] [review]
something like this

>diff --git a/xpcom/components/Makefile.in b/xpcom/components/Makefile.in

>+DEFINES += -DMOZ_XPCOM_LOCKDOWN

What is this define for, and how does it differ from the makefile MOZ_COMPONENT_LOCKDOWN variable? The autoconf variable seems completely unused at this point.

>diff --git a/xpcom/components/nsAllowedComponentsList.h b/xpcom/components/nsAllowedComponentsList.h

Why are we doing this hardcoded instead of in, say an INI or plaintext file? Are we worried about bad guys appending to the text file?

Generating the list at build time is really really easy.

>+const char *sAllowedAppComponents[] = {
>+#if defined(XP_WIN)
>+    "browserdirprovider.dll",
>+    "brwsrcmp.dll",
>+#elif defined(XP_LINUX)
>+    "browserdirprovider.so",
>+    "brwsrcmp.so",
>+#elif defined(XP_MACOSX)
>+    "browserdirprovider.dylib",
>+    "brwsrcmp.dylib",
>+#endif
>+    NULL
>+};
>+
>+const char *sAllowedGREComponents[] = {
>+    NULL
>+};

This is definitely incorrect: missing libdbusservice.so, libimgicon.so, libmozgnome.so, libnkgnomevfs.so are missing.
Attachment #407651 - Flags: review?(benjamin) → review-
Finding the list at startup, cross-platform, in a way that doesn't cause a bunch of libraries to already be loaded is a lot of fragile work.  And deleting that file would cause Firefox to basically be non-functional, so I'm not really for it.  I don't think it adds much to use a text file for it, other than more synchronous I/O at startup.
Actually we could be better off with it, because we could use that file to list the components to register, rather than having to enumerate the directory listing. And finding the list is no problem at all... we can do it per directory-to-search. Proof upcoming.
(In reply to comment #14)
> Why are we doing this hardcoded instead of in, say an INI or plaintext file?
> Are we worried about bad guys appending to the text file?

I'd be worried that anyone already just dumping something in where it really shouldn't be would just find a way to continue to do so if it's easily possible.
They could replace one of our existing components if they care that much. We'll fix it at the next minor update in any case, I think.
I'm more concerned here with those that don't care than those that do. Replacing an existing component is something that clearly sounds bad, whereas just amending some file might sound like the new correct way to the lazy. I'm just advocating that whatever is done to kick out these components not be trivially overridable by someone to "make it work", even temporarily.
Yes, I don't think it's worth the time to put it in a separate file.  It doesn't solve a problem we have right now, and it increases the complexity of what we're doing.
This is the mostly-baked alternative. It builds and uses components/components.list instead of trawling through the components directory. This patch does not include:

* packaging changes (package-manifest.in)
* changes necessary for universal builds (sorting components.list before packaging)
Attachment #409165 - Flags: review?(vladimir)
I kind of explicitly don't want a components list -- there's no reason to have one in the GRE or the app components dir, and the thing we're trying to prevent becomes much harder if they have to hack the firefox binary instead of just appending a line to a text file.
I'm slightly worried about the maintenance burden, especially since the features that would be silently missing are almost all OS-integration features for which we have little automated testing. But overall I don't think we can take this as it is because it will break the -app feature of Firefox, which is used by Prism. We could certainly invent an API XRE_RegisterAllowedComponents and optionally call that with static lists of app components (and have a hardcoded list of XULRunner components somewhere), but that's a fair bit more work than the text list.

We can easily ensure that the components list is reset at every minor update: is that insufficient for ensuring that updates of the browser don't cause instability because of random unversioned components thrown into our application directories?
Hm, I don't understand why it breaks -app -- why would Prism ever have any binary components?

But, as you point out, the list would get reset at every minor update (though it would mean those users would end up getting a full mar) -- shaver, does that sound like a good compromise?
My motivation for compromise is small, but I'll go along to get this along and into the tree.  I suspect we'll have people who can't start because of some permission error on that file, or a virus thing, or I dunno from what, but if it's really going to make it hard to commit the patch with a built-in list...
Would it be within the realm of possibility to generate a list at build time but compile it in somehow so there's no separate file to worry about?
Yes, that's entirely possible.
(In reply to comment #26)
> Would it be within the realm of possibility to generate a list at build time
> but compile it in somehow so there's no separate file to worry about?

I believe that's what I suggested in comment 9... a header file.
(In reply to comment #24)
> But, as you point out, the list would get reset at every minor update (though
> it would mean those users would end up getting a full mar)

FWIW, we can force the list to be overwritten with a partial update too.
What's the next steps and ETA here? Hard to tell from the patches and what I'm being told on IRC, and we're waiting for this before the next beta refresh.
(In reply to comment #21)
> * packaging changes (package-manifest.in)

This needs fixing before landing, but it's simple.

> * changes necessary for universal builds (sorting components.list before
> packaging)

My patch in bug 526668 can fix this, we'll just need to add --unify-with-sort "components\.list$$" in the call to unify in build/macosx/universal/flight.mk.

I think bsmedberg's patch needs review by vlad and my patch needs review by bsmedberg, and we'll be good to go.
Comment on attachment 409165 [details] [diff] [review]
Alternative, read and build components.list

Ok, this looks good here -- we'll need followup bugs for:

1) Decide which direction we should fail if we don't have a components.list; right now this fails "safe", that is if that file goes away, it'll load everything in the directory.  This has the benefit that addons and other similar things can use components.list as well I think, and leaves us an easy workaround in case we end up breaking something, but it also gives an easy way for people to circumvent this.

2) Figuring out how to protect components.list -- at the security review meeting today, it was suggested we just store a hash in a separate file, and not obfuscate it; but just have a "corrupted" error in case the hash doesn't match.  Separate file location would be harder to find without reading the source, at which point we have multiple chances to wave people off what they're doing.
Attachment #409165 - Flags: review?(vladimir) → review+
On OS X we could actually hide the file with the kIsInvisible attribute, but that's probably overkill.
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/0cc47ba7304b

I'll file followups in a few.
Assignee: vladimir → benjamin
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Attachment #407651 - Attachment is obsolete: true
Attachment #407651 - Flags: review?(ted.mielczarek)
Depends on: 526760
Blocks: 526764
No longer depends on: 526764
Another bustage fix:
http://hg.mozilla.org/mozilla-central/rev/507ef6d6b979

This broke running reftest on packaged builds because of the hacky way we package httpd.js there.
OS: Windows NT → Windows 2000
Target Milestone: --- → mozilla1.9.3a1
Version: unspecified → Trunk
Rolled all the bustage fixes into one branch landing:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/8dddd6843317

We should get some tests for this. Should be easy enough to do with xpcshell.
Flags: in-testsuite?
OS: Windows 2000 → Windows NT
Target Milestone: mozilla1.9.3a1 → ---
Version: Trunk → unspecified
Backed out from 192:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/33c33143f5b7

This apparently was causing every test machine to leak. bsmedberg: can you take a look at this when you get a chance?
Comment on attachment 409165 [details] [diff] [review]
Alternative, read and build components.list


> nsresult
> nsComponentManagerImpl::AutoRegisterDirectory(nsIFile *inDirSpec,
>                           nsCOMArray<nsILocalFile>    &aLeftovers,
>                           nsTArray<DeferredModule>    &aDeferred)
> {
>+    nsCOMPtr<nsIFile> componentsList;
>+    inDirSpec->Clone(getter_AddRefs(componentsList));
>+    if (componentsList) {
>+        nsCOMPtr<nsILocalFile> lfComponentsList =
>+            do_QueryInterface(componentsList);
>+        lfComponentsList->AppendNative(NS_LITERAL_CSTRING("components.list"));
>+        PRFileDesc* fd;
>+        if (NS_SUCCEEDED(lfComponentsList->OpenNSPRFileDesc(PR_RDONLY,
>+                                                            0400, &fd)))
>+            return AutoRegisterComponentsList(inDirSpec, fd,
>+                                              aLeftovers, aDeferred);
>+    }

Missed this -- we end up leaking fd, don't we?  Need a PR_Close(fd) before we return?
Relanded with the PR_Close on 1.9.2, let's see what happens.
And backed out again: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/be6eea1b1402

There were still leaks on the M and E boxes and still widespread crashes on Linux talos: Tdhtml, Tgfx, Tsvg, Tjss, Tp4.  I looked at the stacks for the former 3, and they all look like this:

 0  0xffffe410
    eip = 0xffffe410   esp = 0xbf9dc274   ebp = 0xbf9dc298   ebx = 0xb3092bc0
    esi = 0xffffffff   edi = 0xb6784ff4   eax = 0xfffffffc   ecx = 0x00000008
    edx = 0xffffffff   efl = 0x00000246
 1  libglib-2.0.so.0.1400.1 + 0x32592
    eip = 0xb6a96593   esp = 0xbf9dc2a0   ebp = 0xbf9dc2f8
 2  libglib-2.0.so.0.1400.1 + 0x32ac4
    eip = 0xb6a96ac5   esp = 0xbf9dc300   ebp = 0xbf9dc318
 3  libxul.so!nsAppShell::ProcessNextNativeEvent(int) [nsAppShell.cpp:081ac0b8126b : 147 + 0x9]
    eip = 0xb7b58308   esp = 0xbf9dc320   ebp = 0xbf9dc338
That stack looks a lot like bug 519438 :/
Having slept on it, the only thing I can think of is that the component registration order changed, and some component is doing things when it loads/registers which it shouldn't, such as trying to get services which may not be registered yet.
Should we push a printf patch that spits out failed service access and the libraries that _aren't_ loaded? perf hit, but would give us some data if we can't repro off those machines or try server.

(Or we could grab one of those machines out of the pool; releng is double-plus awesome about that stuff.)
I did some tests on Linux:

Started up: the browser started up to the browser-chrome test window

compared components.list to the actual list of files: no discrepancies

Deleted registration info; sorted components.list using LC_ALL=C (which is often how posix systems will enumerate their directories); browser started up normally.

This definitely points to some components trying to get XPCOM services/components provided by other components when they load or register, which is bad. I tried to write a simple assertion for this case and was foiled by xpconnect. I'm looking at writing a slightly more complex assertion.
Depends on: 527058
There are a number of problems:

- crashes on linux Talos
- leaks on every platform
- odd Windows leaktest box assertion failure

I'm trying to reproduce the latter, but I'm not having any success; it looks like this in the logs (followed by the assertion stack trace):

...
*** Registering components in: xpconnect_test
*** Registering components in: BrowserDirProvider
*** Registering components in: nsBrowserCompsModule
nsNativeModuleLoader::LoadModule("e:\builds\moz2_slave\mozilla-1.9.2-win32-debug\build\obj-firefox\dist\bin\components\i18n.dll") - load FAILED, rv: 80004003, error:
	error 1114
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040154: file e:/builds/moz2_slave/mozilla-1.9.2-win32-debug/build/caps/src/nsScriptSecurityManager.cpp, line 3317
###!!! ASSERTION: Failed to initialize nsScriptSecurityManager: 'NS_SUCCEEDED(rv)', file e:/builds/moz2_slave/mozilla-1.9.2-win32-debug/build/caps/src/nsScriptSecurityManager.cpp, line 3395
nsStringStats
 => mAllocCount:          12902
 => mReallocCount:           22
 => mFreeCount:           12341  --  LEAKED 561 !!!
 => mShareCount:          12542
 => mAdoptCount:              8
 => mAdoptFreeCount:          6  --  LEAKED 2 !!!
Error 1114 is ERROR_DLL_INIT_FAILED, assuming that's a win32 error; that could be due to an unresolved symbol dependency.  Wonder if we're having some interaction with dll blocklisting here?  I still can't reproduce this problem on my own builds.
Hmm, that might be unrelated, because it also shows up in the previous checkin's leaktest, which was ted's explicit backout of this patch.  But doesn't show up after the latest backout.  This is being tracked by bug 527122 apparently, so unrelated to this.
Target Milestone: --- → mozilla1.9.3a1
Oh.  Of course this is failing -- all the talos/pageloader based tests drop the pageloader component into the components dir (tp-cmdline.js)!

So, this is "working as intended", but a better question is why it's not failing everywhere, including on win32 with the branch (OSX has the same failures) or on the trunk.  Looking into that now.
See bug 527458 -- tp-cmdline.js and other stuff is actually in components.list even though it probably shouldn't be, and that followup moves it into a separate bundle directory.
OS: Windows NT → All
Hardware: x86 → All
tp-cmdline.js and reftest-cmdline.js doesn't end up in components.list on 1.9.2 or try server, on Linux or OSX.  They do end up in there on win32, presumably because the win32 builder builds with --enable-tests and the others don't.

On the trunk, all of our builds seem to be done with --enable-tests, so tp-cmdline.js and reftest-cmdline.js end up in components.list on all platforms.

So one fix here would be to build with --enable-tests everywhere, or a better one (but harder) would be to change how tp-cmdline.js and reftest-cmdline.js get added to the build, potentially adding them to components.list at the same time.. that would likely be the cleanest fix, but I'm not sure how quickly we can make that change.
Previously we didn't --enable-tests in mac release builds because it didn't have a packaging manifest. That's fixed now, so I think that doing --enable-tests across the board is fine. That still leaves bug 527458, but as noted on IRC actually deploying that will require changes to Talos and can be put off... it's just a few console warnings in release builds.
Correct me if I'm being stupid here, but isn't this bug about avoiding unknown binary components?  Why does it affect JS components at all?
Because with the components.list approach, it turned into about avoiding all unknown components.  I don't think there's any reason to not do that, and it simplifies the implementation considerably.
tests were already enabled on Mac: this enables tests on Linux for 1.9.2
Attachment #412033 - Flags: review?(ccooper)
Attachment #412033 - Flags: review?(ccooper) → review+
Benjamin, what would be the easiest way to check if a .js module has been loaded? It would be great to have a way to check the loaded state of those modules. Tools like the Process Monitor only list DLL files.
Hardware: All → Other
One more thing: When I remove single entries from components.list or even place an empty file in this folder all modules get loaded in any way. Removing the compreg.dat file doesn't help either. So how do we check for changes and update our list of allowed modules?
So the leaks still seem to be there on Linux 1.9.2 only:

s: moz2-linux-slave18TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 4984 bytes during test execution
leaked 5 instances of nsBinaryOutputStream with size 16 bytes each (80 bytes total)
leaked 1 instance of nsBufferedInputStream with size 48 bytes
leaked 1 instance of nsBufferedOutputStream with size 56 bytes
leaked 2 instances of nsBufferedStream with size 40 bytes each (80 bytes total)
leaked 10 instances of nsFastLoadFileUpdater with size 240 bytes each (2400 bytes total)
leaked 10 instances of nsFastLoadFileWriter with size 228 bytes each (2280 bytes total)
leaked 1 instance of nsStringBuffer with size 8 bytes
leaked 1 instance of nsSystemPrincipal with size 32 bytes

Only thing I can think of is that this may be a component load order issue, as the patch here doesn't touch fastload at all.  Those leaks seem to imply a failure to clean up fastload stuff.
I just filed bug 528630 to track the Linux issue, and am #if 0'ing this out for Linux.  This bug is getting too crowded as it is.
(In reply to comment #57)
> One more thing: When I remove single entries from components.list or even place
> an empty file in this folder all modules get loaded in any way. Removing the
> compreg.dat file doesn't help either. So how do we check for changes and update
> our list of allowed modules?

This is covered by bug 528623 now.
For mozilla-1.9.2 this is now disabled on all OSes due to leaks:

 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/b695ec4ecbbc
To test this, I'd write an xpcshell test something like:
- test file xpcom/tests/unit/test_componentslist.js or something similar
- two simple JS components (don't need to actually do anything, just register factories) at xpcom/tests/unit/components/{one,two}.js
- xpcom/tests/unit/components.list listing only one.js
- test_componentslist.js calls Components.manager.autoRegister(do_get_file("components")), then checks that the contract ID specified in one.js is present, and the contract id specified in two.js is not.
We intentionally break software compatibility. Do we have informed any of the add-ons or software vendors who place some of their files into our components folder? There isn't so much time for those to update their extensions. As example Google Desktop Search will stop working from now on.
Keywords: late-compat, relnote
Summary: Only load known binary components from app directory → Only load known components from app directory
Henrik: we've blogged about this quite a bit, and I suppose that you can claim this is "intentionally breaking software compatibility" but throwing DLLs in a components directory and hoping that the software would load them isn't a documented or supported path.

We'll be reaching out to partners, of course, and there are alternative mechanisms for them to do what's needed in order to continue accessing Firefox resources.

You mention GDS; do you have evidence (ie: have tried with a nightly) that it no longer works?
This is documented now.
(In reply to comment #65)
> Henrik: we've blogged about this quite a bit, and I suppose that you can claim
> this is "intentionally breaking software compatibility" but throwing DLLs in a
> components directory and hoping that the software would load them isn't a
> documented or supported path.

Mike, I fully agree with you. I'm only a bit worried about that probably some of the extensions, which are using that not supported way, will not be updated when we release 3.6. Authors will not have much time to get their extensions working the proposed way. I have read Johnathans blog post yesterday and it will be a great help for anyone.

> You mention GDS; do you have evidence (ie: have tried with a nightly) that it
> no longer works?

GDS places three files (GoogleDesktopMozilla.dll, GoogleDesktopMozillaStub.js, and GoogleDesktopMozillaStub.xpt) into the components folder. None of those files will be registered. I'm not sure what extra feature those components offer but it will definitely not work.
Reading through a couple of German forums I have seen that users starting discussions about that feature. Mostly the question arise why the components.list has to exist at all. I was thinking about that too and given the following items why we cannot do the same as with DLL blacklisting? Or do I miss something?

* We always know which components will be whitelisted under the components folder. Otherwise we were not be able to fill that file with content.
* We always overwrite that file with an update.
* We do not want to allow other applications to modify that file.

What is the reason why we are using the components.list file? One downside I see is that we cannot create that list automatically but have update our source if new components get added or existing ones removed. But how often will that happen?
By "do the same" you mean why do we not just embed the list?  For simplicity, really; the list of components differs from platform to platform and configure args.  It's built up as part of the build process, so the net result is exactly what you built.  If we tried to hardcode it (like my original patch did), that file would likely end up being a painful mess to deal with.
(For posterity, since I've lost it a couple times, this was the checkin that eventually turned this feature back on - http://hg.mozilla.org/releases/mozilla-1.9.2/rev/57171f47b2bd which occurred just before tagging for 3.6b3).
Marking verified fixed on 1.9.2 based on our test results which are logged here: https://wiki.mozilla.org/QA/Firefox3.6/TestPlan:DLL_Blocklisting#Test_Results
Keywords: verified1.9.2
Hardware: Other → All
Version: unspecified → Trunk
Blocks: 533692
Depends on: 524649
No longer depends on: 524649
This is marked as wanted/needed on the 1.9.1 branch, but would we really make this change in a 1.9.1.x update? It would fix crashes, but would also mysteriously make some things users might want stop working which is not good behavior from a security update.
Not good behaviour except for the part where we stop the browser from being unstable.

How complex and risky is the backport? I remember this change being problematic on trunk ...
I don't think we should try to backport this: enough registration goop was reworked between 1.9.1 and 1.9.2 that there might be hidden side effects.
Updating flags appropriately!
blocking1.9.1: needed → -
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: