Closed Bug 656331 Opened 14 years ago Closed 14 years ago

XPCOM binary component registration ignores Module::kVersion

Categories

(Core :: XPCOM, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla5
Tracking Status
firefox5 + fixed
firefox6 + fixed
blocking2.0 --- -
status2.0 --- wanted

People

(Reporter: WeirdAl, Assigned: benjamin)

Details

Attachments

(2 files)

At APN, we compiled a binary component for FF4, using an old XULRunner SDK.  Module::kVersion was 1, and with the FF5 Aurora code changing that to 2, we expected we needed to recompile for FF5.

However, in smoke-testing, I discovered that XPCOM code is not checking the Module::kVersion field of the component it is registering.  This means that a binary component compiled for one version of XPCOM (including an invalid version like 220) would be enabled across all versions of XPCOM from Gecko 2.0+.

I recognize that Module::kVersion was changed for a reason, to force binary components to update.  Allowing mismatched kVersion fields means an incompatible DLL could register - or worse, supersede a properly compiled DLL for that version of Firefox (see bug 616056 for details).
Assignee: nobody → benjamin
Status: NEW → ASSIGNED
Attachment #531663 - Flags: review?(bzbarsky)
Comment on attachment 531663 [details] [diff] [review]
Actually do the version compare, rev. 1

r=me
Attachment #531663 - Flags: review?(bzbarsky) → review+
http://hg.mozilla.org/mozilla-central/rev/1706e681390c
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #531663 - Flags: approval-mozilla-aurora?
Discussed in triage today - we'd like to approve this, but given that we've missed it before, it needs a test.
Affects Fx4 but extremely unlikely we'll ship an extra release for this. Fx5 is coming soon.
blocking2.0: --- → -
If no one beats me to it, I can work on the testcase tonight, using Mozilla-only code.
Attachment #531837 - Flags: review?(benjamin)
Comment on attachment 531837 [details] [diff] [review]
testcase as patch

brilliantly simple
Attachment #531837 - Flags: review?(benjamin) → review+
Totally nitpicky, but our modern Makefile style is two-space indent after line continuations, like:
EXTRA_DSO_LDOPTS = \
  $(NSPR_LIBS) \
...
  $(NULL)
Check-in needed for trunk.  Per comment 4, this is a prerequisite for FF5 Aurora.
Flags: in-testsuite?
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/b23a920b981e
Flags: in-testsuite? → in-testsuite+
Keywords: checkin-needed
Comment on attachment 531663 [details] [diff] [review]
Actually do the version compare, rev. 1

a=me on this patch and the accompanying test once it runs clean on central. Thanks for the quick turnaround!
Attachment #531663 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Please land for Aurora by Monday May 16 or the approval will potentially be lost.  Please mark as status-firefox5 fixed when you do.
There is a bug in /xpcom/components/nsNativeComponentLoader.cpp!
>> if (mozilla::Module::kVersion == data.module->mVersion && ...

I think there should be "<=" instead of "==", cause:
* in Firefox7a - kVersion=7
* in Firefox6a - kVersion=6
* in Firefox5b - kVersion=2

For example XPCOM component built with old SDK supports Firefox7a with kVersion=7. If this component will be running under Firefox5b with the fix I wrote above - all will be fine:
  mozilla::Module::kVersion = 2;
  data.module->mVersion = 7;
  2 <= 7 == OK
No, this is intentionally ==. Binary components must be recompiled for each release per our new policy.
So you think that I should have for example 3 XPI files with different XPCOM files for 3 different versions of Firefox??? Or should I include 3 different XPCOM files (with similar source code, but different kVersion value in Module unit) in one XPI file? What's a reason for this? I think Firefox must have backwards compatibility with older versions.
It's a pain point, to be sure (particularly when waiting for a SDK weeks after it should have been posted, grumble, grumble), but it's actually a defensive decision on Mozilla's part.

Simply put, the interfaces (and all code) in FF4+ are never frozen.  So let's say you're relying on a XPCOM service in FF5.  In FF7, someone removes that service.  If you QI for that service, and then try to use it, that usually results in a null-dereference crash for binaries.

Also, this also frees Mozilla up to make compatibility-breaking changes.  Mozilla's had to maintain backwards-compatibility support for over 10 years on its first XPCOM model; that's pretty difficult to do, and means bad architectural decisions made 10 years ago have to be supported in perpetuity.  Can you say decisions you're making now should hold up for 10 years or more?  I can't - my crystal ball tends to break after 10 months.

Finally, do recall that binary components are a risk point for Mozilla:  they can cause crashes, memory leaks, and other instabilities.  Mozilla has very few ways to respond to a buggy DLL (and some of them quite draconian).  They're harder to maintain than JS-based components, generally speaking, and this forces them to be much more actively maintained than JS-based components.

---

Having to ship a new DLL every six weeks, per the new release schedule, is painful... and in my opinion, it should be.  Binaries are dangerous, much more so than JavaScript-based components.  You really should use binary components only as a last resort, and only to support the specific capabilities you need.

There are alternatives which I haven't explored yet (js-ctypes, the ipccode of bug 68702), which should be considered as well.  In theory, these alternatives would mean you ship a JS-based component which could access binary code without the XPCOM boilerplate... and that means shipping a single DLL.  In practice... I need code samples.

My apologies to everyone for the spammy bug post.
I understand all that you wrote. But for example my XPCOM dll has only 1 class which doesn't use any other FF interfaces. It just allows to connect from JS-code (through the methods of my own interface/class) with my stanalone EXE application to obtain some info & vice versa - my EXE can connect to JS-code.

P.S. If someone will delete any service from FF7 for example - I think you'll got the NULL as a result when you query that service. Am I right?
It sounds like you'd be better served using js-ctypes then, which will let you call your C++ code from JS without any XPCOM in between, and without a need to recompile for each Firefox release:
https://developer.mozilla.org/en/js-ctypes/Using_js-ctypes
"js-ctypes" is not usable for me, cause 1st of all from my JS-code I need to know where the DLL is located. 2nd - it's too difficult to use js-ctypes functions, in XPCOM I have scriptable interface which allows to use "in" & "out" params in a simplest way.
These are both solvable problems, but this discussion should move to a newsgroup, not a bug:
http://www.mozilla.org/about/forums/#dev-platform
or
http://www.mozilla.org/about/forums/#dev-extensions

In any event, you are welcome to continue using binary XPCOM components, you will just have to recompile them for every version. That's the trade-off, sorry.
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0

I was trying to see if this is fixed for Fx6 since the flag "status-firefox6" is set on "fixed", but I couldn't.
Is there a test case or any steps / guidelines for this bug that can be used to verify the fix? Thanks
Find a binary component compiled against an earlier version and try to register it in Firefox 6? I doubt this needs additional verification, though, since there has been plenty of discussion about it in the groups.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: