Closed Bug 297925 Opened 16 years ago Closed 10 years ago

Software update + xulrunner

Categories: Toolkit :: Application Update (defect)
Priority: Not set
Severity: normal
Status: RESOLVED WORKSFORME
Target Milestone: Future
Reporter: benjamin
Assignee: Unassigned

Software update currently works by updating a directory in-place. This is not
acceptable for xulrunner + app combinations where upgrading a xulrunner might
break other applications which use that xulrunner. My basic plan involves making
an extra mode for the software update executable: instead of updating in-place,
it will copy an entire xulrunner to another directory and update it there (i.e.
copy xulrunner-1.8 -> xulrunner-1.8.1, so that they both exist). Applications
would continue to be updated in-place.
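A minimal sketch of the copy-then-update mode proposed above, assuming the updater can be modeled as a callback that patches whatever directory it is given. The names `stage_side_by_side` and `apply_update` are hypothetical and are not part of the real software update executable.

```python
import shutil
from pathlib import Path


def stage_side_by_side(current_dir, new_dir, apply_update):
    """Copy a runtime directory and update the copy, leaving the original intact.

    `apply_update` is a hypothetical callback that patches the directory it is given.
    """
    current_dir, new_dir = Path(current_dir), Path(new_dir)
    if new_dir.exists():
        raise FileExistsError(f"{new_dir} already exists")
    shutil.copytree(current_dir, new_dir)
    try:
        apply_update(new_dir)
    except Exception:
        # A failed update must not leave a half-patched runtime behind.
        shutil.rmtree(new_dir, ignore_errors=True)
        raise
    return new_dir
```

Because the original directory is never written to, applications bound to the old runtime keep working whether or not the update succeeds.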
I assume you'd only do this if there were dependencies on the old version? 

(So as not to leave a trail of XULRunners on a system where the only thing that
exists is Firefox?)
Disk space is cheap, upgrade pain is not. I would prefer to leave old XULRunners
around by default. Here is the sequence of events that I'm worried about:

1) sysadmin/privileged user A installs Firefox 2.0 with XULRunner 1.9

2) unprivileged user B downloads and installs Foopy 1.0 which requires XULRunner
1.9. The program is installed inside Documents and Settings/UserB/something...
At this point, the Foopy installer can't set a Foopy -> XULRunner 1.9 dependency
in the HKLM registry.

3) user A upgrades Firefox 2.0 -> 2.1 with XULRunner 1.9 -> 1.9.1

Unless we keep XULRunner 1.9 around, Foopy 1.0 will break.

Disk space is cheap, let's use it: advanced users who really care can manually
remove it using the Control panel add/remove programs.
What if there are N (where N is on the order of 10 or more) security updates to
Firefox 1.5? On Windows, the 1.0 upgrade path to date (1.0.0-1.0.5) would have
taken up 132.6MB of disk space, and we're only eight or so months into the
post-release timeline. 

We still run with competitive performance on a lot of older hardware (e.g.
PII-233s, PIII-500s etc) which don't necessarily have gigs of disk to spare. 

Darin suggests keeping around XULRunners for major API changes only e.g. the
latest of 1.1.x, the latest of 1.5.x etc to prevent incompatibilities between
applications while optimizing disk use.
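Darin's suggestion can be sketched as a pruning rule: group installed runtimes by major.minor and keep only the newest of each group. This is an illustration only; `latest_per_branch` is a hypothetical helper, and it assumes purely numeric dotted versions.

```python
def parse(version):
    """Turn '1.5.2' into (1, 5, 2), padding short versions with zeros."""
    parts = tuple(int(p) for p in version.split("."))
    return parts + (0,) * (3 - len(parts))


def latest_per_branch(installed):
    """Keep only the newest release of each major.minor branch."""
    newest = {}
    for version in installed:
        branch = parse(version)[:2]
        if branch not in newest or parse(version) > parse(newest[branch]):
            newest[branch] = version
    return sorted(newest.values(), key=parse)


installed = ["1.1", "1.1.1", "1.5", "1.5.1", "1.5.2"]
print(latest_per_branch(installed))  # ['1.1.1', '1.5.2']
```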
As long as xulrunner apps identify themselves as "compatible with xulrunner
1.9.x" instead of "compatible with xulrunner 1.9.1", that would be a decent
compromise; I do worry about the ramifications of security updates, since we
have taken major updates along the 1.0.x path that I fear would break xulrunner
apps.

Perhaps we should come up with an intermediate minor-versioning strategy,
major.minor.release, where we have the major version bump with the yearly gecko
release, the minor version bumps every time there is a security/stability fix
that has a reasonable chance of breaking some xulrunner apps (native wrappers),
and a release version bump with small security/stability fixes:

xulrunner 2.0
xulrunner 2.0.1 -> zlib buffer overflow fix
xulrunner 2.0.2 -> networking crash fix
xulrunner 2.1 -> some xpcnativewrappers or XBL prototype change here!

That way we can upgrade 2.0 -> 2.0.2 without leaving any old version behind, but
we'll leave 2.0.2 on the system when we upgrade to 2.1.
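The rule in the example above can be sketched as follows, assuming purely numeric dotted versions: a new runtime replaces older installs that share its major.minor, and everything else stays on disk. `upgrade_plan` is a hypothetical name used only for illustration.

```python
def parse(version):
    """Turn '2.0.2' into (2, 0, 2), padding short versions with zeros."""
    parts = tuple(int(p) for p in version.split("."))
    return parts + (0,) * (3 - len(parts))


def upgrade_plan(installed, new):
    """Return (replaced, kept) for installing `new` alongside `installed`."""
    new_parts = parse(new)
    replaced = [
        v for v in installed
        if parse(v)[:2] == new_parts[:2] and parse(v) < new_parts
    ]
    kept = [v for v in installed if v not in replaced]
    return replaced, kept


print(upgrade_plan(["2.0"], "2.0.2"))  # (['2.0'], [])
print(upgrade_plan(["2.0.2"], "2.1"))  # ([], ['2.0.2'])
```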
We may not have done the best job in the past of maintaining binary
compatibility on our "stable" branches.  Ideally, a release branch is not
intended to be changed in a major way.  Stable implies bug-for-bug
compatibility.  Exceptions are security fixes, and those are almost always
resolved without API changes.  Now the xpcnativewrapper stuff of late is
definitely an exception where we have significantly changed our platform on a
"stable" branch.  I use quotes there for a reason ;-)

We haven't rev'd app.extension.version have we?  We aren't forcing old
extensions to break, right?  (<-- referring to the recent 1.0.x branch changes.)
 Basically, whatever mechanism we adopt for extensions should be good for
applications.  Our stable releases should not break extensions, and they should
not break applications.  The versioning scheme you describe could be used, but
then I'd argue that it really shouldn't need to be.
(In reply to comment #5)
> We haven't rev'd app.extension.version have we?  We aren't forcing old
> extensions to break, right?  (<-- referring to the recent 1.0.x branch
> changes.)

Right.

1.0.3 regressed DHTML support at MSN, among other places, but that bug was not
necessary for security, it was a logic bug not caught by narrowband branch QA.

>  Basically, whatever mechanism we adopt for extensions should be good for
> applications.  Our stable releases should not break extensions, and they
> should not break applications.  The versioning scheme you describe could be
> used, but then I'd argue that it really shouldn't need to be.

I agree.  We could elaborate version numbering by adding more degrees of
freedom, but in the end we have to stipulate compatibility, not just describe it
after the fact based on (market, painful) testing.  Developers will flee if we
keep breaking compatibility where we ought not.  So let's keep things simple and
unified, and not break compatibility on branches.

IOW, the DHTML bug in 1.0.3 is not something I would worry about or use as a
vague argument here.  If we ever find ourselves having to hack security fixes in
that kind of hurry, but without better QA, I will quit -- I've said this before.

(Maybe that doesn't help any of you not to worry :-/.  It works for me, though,
and, with Bob Clary leading the charge and now Coop on board driving QMO, there
are real grounds for believing we will have better QA coverage for platform
security fixes, which when made on branches such as the 1.0.x branches, should
be backward compatible.)

Suppose there were no way to fix a platform security hole without breaking
compatibility on such a branch, you say?  Then wouldn't we need something like
what Benjamin proposes in comment 4?

Yes, but that case can't match the example in comment 2 step 3: Firefox 2 (no
.0, note well! just "2") to 2.1 (or whatever we'll call it) forcing XULRunner to
go from 1.9 to 1.9.1 -- the XULRunner version change should go to something that
changes in the second decimal number, with one . (1.10 or 2.0).  And we would
not be making such a platform security fix on a branch.  We would be deprecating
the 1.9 branch at that point, for anyone subject to security concerns (some
XULRunner apps won't be networked).

There's no way around deprecation and obsolescence, in the worst case.  If we
have to keep patching the 1.0.x branch for too long, we'll (a) die from doubled
or trebled effort; (b) find ourselves unable to fix except by back-porting whole
architecture fixes from the trunk, or what was once the trunk and is now just a
more recent branch.  Both points say we should obsolete the older branch at that
point, otherwise we are just making more work, approximating a clone of a branch
of the trunk in an older branch.

BTW, this is a real risk for the 1.0.x branch right now, and I intend to push
for obsoleting it ASAP, including with trailing-edge Enterprise customers.  If
any care enough to back-port harder, someone else will have to do the work --
and if that work breaks compatibility, I will argue that we shouldn't host it in
our branch under a nominally-compatible version.

One question: I remember Benjamin saying XULRunner would track Gecko rv:
major.minor or major.minor.point numbering, but it sounds like there may be
valid need to diverge from Gecko numbering.  Is that the case?  If we keep API
compatibility within minor releases, then we can conserve disk space as Ben
says.  But is there any necessary relationship between XULRunner's version and
Gecko's rv: sent in every user-agent?

Another question: the dotted-arbitrary-precision-decimal notation should use
dots to indicate release (which is to say, CVS branch) points.  We should not
have another trunk milestone with two dots in its version.  Is this obvious and
agreed?  I want to check sanity early and often here, in case there's a new
requirement I've missed (or some other relevant conundrum).

/be
We have tried to marry the XULRunner and toolkit version to the Gecko version
for simplicity.  It hurts to have so many version numbers floating around.  I
think we can get by without diverging from this.


> We should not have another trunk milestone with two dots in its version. 

I'm not sure which trunk milestone you mean.  DPA1 is app version "1.0+" w/
Gecko version 1.8b2, which should probably have been 1.7+, but I don't know if
web authors are ready for our freakish "+" notation.  I have had a hard time
explaining it to people, that's for sure.  1.0+ not equal 1.0 branch!  huh?! :-(
(In reply to comment #7)
> We have tried to marry the XULRunner and toolkit version to the Gecko version
> for simplicity.  It hurts to have so many version numbers floating around.  I
> think we can get by without diverging from this.

Great, good to hear confirmed.

> > We should not have another trunk milestone with two dots in its version. 
> 
> I'm not sure which trunk milestone you mean.

I was referring to the ancient 0.9.6-0.9.10 run-up to Mozilla 1.0.

> DPA1 is app version "1.0+" w/
> Gecko version 1.8b2, which should probably have been 1.7+, but I don't know if
> web authors are ready for our freakish "+" notation.  I have had a hard time
> explaining it to people, that's for sure.  1.0+ not equal 1.0 branch!  huh?! :-(

If it were 1.0post or 1.0after would it be any better?  Not in my book.  We
could pick numbers and stick to them, using 1.1- instead, but that assumes more
risk on our part (assuming there's no risk in hard times explaining things ;-).

/be

> If it were 1.0post or 1.0after would it be any better?  Not in my book.  We
> could pick numbers and stick to them, using 1.1- instead, but that assumes more
> risk on our part (assuming there's no risk in hard times explaining things ;-).

The problem really is with parsing the version string: 1.0foo looks like 1.0.x.
Joke:

1.0 < 1.0.1 < 1.1 < 1.1.1 ...

Problem: what to designate the trunk after 1.0 has branched (for 1.0.1 etc.) but
before 1.1 has shipped?

Solution: use surreal numbers: 1.0 < 1.{0|} < 1.1 ("Every real number is
surrounded by surreals, which are closer to it than any real number" -- see
http://mathworld.wolfram.com/SurrealNumber.html).

This will choke the standard JS version number comparison functions I've seen!

/be
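The ordering problem in the last few comments can be made concrete with a small sort key. This is a sketch of one possible ordering, not the comparator the toolkit uses: it assumes a trailing "+" means "trunk after this branch point" and places it after every point release of that branch but before the next minor version. `version_key` is a hypothetical name.

```python
import math


def version_key(version, width=4):
    """Sort key where a trailing '+' means 'trunk after this branch point'."""
    on_trunk = version.endswith("+")
    parts = [int(p) for p in version.rstrip("+").split(".")]
    if on_trunk:
        # Sorts after every point release on the branch (1.0.1, 1.0.2, ...)
        # but before the next minor version (1.1).
        parts.append(math.inf)
    parts += [0] * (width - len(parts))
    return tuple(parts)


versions = ["1.1", "1.0+", "1.0.1", "1.0"]
print(sorted(versions, key=version_key))  # ['1.0', '1.0.1', '1.0+', '1.1']
```

A suffix such as "1.0foo" raises `ValueError` here instead of being silently read as a 1.0.x release, which is the parsing hazard raised above.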
> keep breaking compatibility where we ought not.  So let's keep things simple and
> unified, and not break compatibility on branches.

How do you intend to do that? We just broke compatibility (again) in nsFileSpec
for mozilla 1.7.9 (see bug 299133), and we have changed the following interfaces
along the aviary 1.0.1 branch:

nsIPrincipal
nsIFileSpec
nsISearchService (not part of the toolkit, thank goodness)
nsIAutoCompleteInput

The point of stable-release compatibility is that we can't really change our
internal interfaces without breaking apps, because apps do use them, out of
ignorance or necessity.

> changes in the second decimal number, with one . (1.10 or 2.0).  And we would
> not be making such a platform security fix on a branch.  We would be

I do not think this is a reasonable assertion. Security issues generally have a
TTL of days-to-weeks, while the trunk has a TTL ranging into months, depending
on what destabilizing changes have taken place. We are going to be forced to
make "internal API" changes on a branch at some point. I think we need to plan
now for how that is going to work; whether that involves bumping the xulrunner
version number or keeping older versions of xulrunner around on disk.
It was arguably a bad thing that those interfaces were changed on a supposedly
stable branch.  (It's not a _stable_ branch if interfaces change.)  In all of
those cases, an additional interface could have been introduced to preserve
interface stability on the branch.  That said, we do have to allow for internal
interfaces that should never be used by applications and can therefore be
changed more freely.  I think this really argues for more and better frozen
interfaces.  It's easy to make the argument that no one should be using
nsIFileSpec.  It is hard to apply that to nsIPrincipal.
You guys are talking about unfrozen APIs.  If you want to support and version
them, then freeze them as-is.  But don't change the long-standing rules.

Given non-zero unfrozen APIs in any real-world project of our size, we will
*always* have some people willing or foolish enough to use them for apps that
might break if they change.

If the answer is always "never change any API, frozen or not", then we should
just mark everything frozen, and have fun figuring out, specifying/documenting,
and maintaining the unfrozen ones (clue: it may be *impossible* to do so and fix
bad bugs -- I don't mean hard, as in having to mix in another interface -- I
mean it can't be done).

I think it is unrealistic to proceed with unknown, under- or unspecified, and
potentially over-constrained-to-impossible requirements on XULRunner's backward
compatibility.  I don't care what syntax you use to version things -- I object
to the idea that we have to maintain backward compatibility for all interfaces.

See bug 300008 comment 18 for more.

/be
I am talking about unfrozen interfaces, and about real-world applications that
use them. I do not expect these applications to be forwards-compatible with
changes we make to these unfrozen interfaces. I do expect these applications to
continue to work when firefox goes from 2.0 -> 2.0.1.

I don't really care about the syntax used to version things, but I do care that
we don't go arbitrarily removing old xulrunners.
> See bug 300008 comment 18 for more.

I responded in that bug as well.  Can you show me an extension for Firefox that
uses only frozen interfaces?  Right away you have to exclude any extension that
uses XUL.  install.rdf is not capable of discerning 1.0.4 from 1.0.5, which
means that interface changes may require users to resort to running Firefox in
safe mode in order to have any chance of recovering from a bad extension.
(In reply to comment #14)
> I am talking about unfrozen interfaces, and about real-world applications that
> use them. I do not expect these applications to be forwards-compatible with
> changes we make to these unfrozen interfaces. I do expect these applications to
> continue to work when firefox goes from 2.0 -> 2.0.1.

Well, this is the first I've heard that you care about tracking unfrozen API
compatibility.

How do you propose to do that?  If it's anything other than marking all APIs
frozen, I want no part of it.

/be
I intend for the xulapp to state "compatible with xulrunner 1.9.0", and only to
work with xulrunner 1.9.0. Application-update can change this to "compatible
with xulrunner 1.9.0->1.9.4" or whatever makes sense.
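A minimal sketch of the compatibility declaration described above, assuming the app carries an inclusive version range that an application update can widen. The function names and the min/max pair are hypothetical, chosen only to illustrate the check.

```python
def parse(version):
    """Turn '1.9.1' into (1, 9, 1), padding short versions with zeros."""
    parts = tuple(int(p) for p in version.split("."))
    return parts + (0,) * (3 - len(parts))


def is_compatible(runtime, min_version, max_version):
    """True if `runtime` lies inside the app's declared inclusive range."""
    return parse(min_version) <= parse(runtime) <= parse(max_version)


def pick_runtime(installed, min_version, max_version):
    """Newest installed runtime inside the declared range, or None."""
    usable = [v for v in installed if is_compatible(v, min_version, max_version)]
    return max(usable, key=parse) if usable else None


# As shipped: the app only claims 1.9.0.
print(pick_runtime(["1.9.0", "1.9.1"], "1.9.0", "1.9.0"))  # 1.9.0
# After an application update widens the range to 1.9.0 -> 1.9.4.
print(pick_runtime(["1.9.0", "1.9.1"], "1.9.0", "1.9.4"))  # 1.9.1
```

With an exact-version declaration, removing the old runtime leaves the app with nothing to run on, which is why this scheme depends on old xulrunners staying installed.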
(In reply to comment #15)
> > See bug 300008 comment 18 for more.
> 
> I responded in that bug as well.  Can you show me an extension for Firefox that
> uses only frozen interfaces?

I never said anything about Firefox extensions -- the new toolkit and all the
structural XUL in Firefox that extensions may use to overlay, e.g., is not
frozen. So that says you have to track those APIs' compatibility somehow,
changing whatever part of the version number string means "I'm not backward
compatible now" when you break compatibility in any of those APIs.

That's different from XPCOM APIs such as nsIPrincipal.  Why?  Just this: XPCOM
IDL-declared APIs have had a freeze process and @status comment convention for
what, six or so years.  That means something.  The new toolkit and front end
APIs have not.  XULRunner coming along does not automatically force the XPCOM
unfrozen interfaces into the same boat.

If we're arguing only that XULRunner's version string has to change for branch
API revisions, I give.  But I'm not interested in arguing about that right now.
What I heard is that we can't change unfrozen APIs on "stable branches", not that
we merely need to revise version numbers to say "incompatible now".  That's what
I object to.

(Mainly -- if there's time, I will also object to the idea that we must declare
compatibility broken for every change to an unfrozen API.  Even if we declare
all APIs frozen, who will notice when one changes and bump that indicator?  We
may as well bump it on every release, and then we are back to a dozen XULRunner
installations in a couple of years, which I agree with Ben is a problem.)

/be
Brendan:

The deal is simply this: people want to author extensions and applications on
our platform.  We want to make sure that they have a platform that is free of
security bugs.  When we want to push a security release, we want to make sure
that all of the extensions and applications are on board.  It would be a shame
if some organization had to choose between either a secure platform or a
functional application.  For this reason, we need to ensure that security
updates have minimal API impact.

The manner in which we promise frozen APIs is different depending on the APIs. 
For example, we mark our XPCOM APIs with @status FROZEN when we determine that
they will never change across major releases of Mozilla-based products.  NSPR
takes a different stand, promising complete API compatibility with all future
releases.  NSS is similar.  For the DOM, we promise compatibility with published
specifications.  The behavior of all of these APIs (including XUL, XBL, and any
other nifty API that we want people to use) taken together should remain as
stable as possible across security updates.  Otherwise, we severely hurt the
viability of our platform.

When we talk about XUL extensions or the extension manager in Firefox,
understand that we mean that to include all of the ways in which Firefox may be
extended.  That includes NPAPI plugins, XPCOM components, pref names, etc. 
Extension Manager (and the corresponding version checking in XULRunner) is so
critical to our platform.  Without it, we would have a very weak platform.  It
is critical to our ability to push security updates and so on.

Just as we have adopted a policy of changing UUIDs whenever XPCOM interfaces are
modified, we need to adopt the policy of not changing interfaces on release
branches unless it is absolutely necessary, and the cost to existing
applications and extensions has been determined to be minimal.  Adhering to this
rule will only increase in importance going forward.

I believe that we should not break API compatibility when we push a security
release, and if we go that route then we do not need to worry about changing the
extension compatibility version when we push security releases.  It just
simplifies version management across the board, and it also makes our platform
easier for would-be developers to grok.  In general, they shouldn't have to
worry about how our security releases may impact them.

Please tell me this makes sense?  I don't think we can afford to go any other
route, but I'm open to suggestions :-/
> It is critical to our ability to push security updates and so on.

Sigh... what I really meant was that version checking done by EM allows us to
push _major_ updates to Firefox without worry that untested extensions that use
unfrozen APIs will crash Firefox.  The same applies to XULRunner.  New platforms
(XULRunner or Firefox) that are not compatible with an existing extension or
application will not even attempt to load said extension or application.  This
gives us the ability to safely modify interfaces across major updates to the
platform.
Darin and I talked more over IRC.

First, the way 1.0.5 was rushed out today still bugs me, and it had better not
have yet another DHTML regression (but how do we know it doesn't?  The QMO team
was only engaged last night, and didn't get through a full run of their current
and insufficient automation).  If we had had more time, the API breakage we've
now shipped could have been undone.

Second, I agree completely with rationalizing our compatibility and versioning
via the Firefox (or XUL app, XULRunner, I mean) extension model.

Third, Darin makes a strong case for freezing a bunch of interfaces, set TBD,
that are not yet frozen.  In conjunction with my change to nsIPrincipal, this
means nsIPrincipal2 should have been done on the branch and trunk.  The reason
is that there's no win in doing nsIPrincipal2 on the branch, but morphing the
interface as I did both places only on the trunk.  That would make more work for
me and for any extensions (JVM plugins) that want to work in 1.0x and 1.1.

Alas, it's too late this time.  But I agree: never again.  The policy you guys
have figured out well enough to know better than I did, in the heat of hacking a
security patch, needs to be written down and communicated widely.  Who will do
that?  Darin's in line, with Benjamin as wingman.

Thanks for putting up with my protestations.

/be
-> future.  no plans to do anything here for Firefox 1.5
Target Milestone: --- → Future
Blocks: 299986
Blocks: 315452
Adding bug 326930 to the list of deps even though it's already fixed for easier tracking.
Blocks: 326930
Would it be a solution to store data in the "Common Application Data" folder, creating a subfolder for each user account that installs a XULRunner app (with the ACL on each folder modified so that other users, apart from admins, cannot access it), and to use this rather than the registry for figuring out which XULRunner versions need to stick around?

(I came up with this while chatting with someone about potential ways to read HKCU for other users from an admin account. That seems hard or impossible; at the least, a way to do it is not easy to find.)
About the version numbering issue (the version number of the trunk), my proposal is to use x.x.99 numbers, e.g. 3.9.99 for the trunk version that will eventually become Firefox 4.0: 3.9.99 -> 4.0a1 -> 4.0a2 -> 4.0b1, and so on. We should also change the way we number other releases. Currently, we have an always-zero digit before the revision/release number, introduced with Firefox 1.5 (e.g. Firefox 1.5.0.1, 1.5.0.6, etc.). It would be good to switch back to the numbering scheme used in 1.0: logically, after 1.0.1, 1.0.2, ... 1.0.x, the minor revisions of 1.5 and 2.0 should be 1.5.x and 2.0.x rather than 1.5.0.x and 2.0.0.x (1.5.3 instead of 1.5.0.3, 2.0.2 instead of 2.0.0.2). I do not want this applied retroactively to the previous "misnumbered" releases, but we should use this scheme for later releases: for Firefox 3, the first update after 3.0 should be 3.0.1, not 3.0.0.1, and if we decide to make a version 3.5, its first update should be 3.5.1. I see no reason why an extra zero was added after 1.5.
Is there still work going on about this bug?  The last update I can see in this bug is 10 months old, and it was posted by me...
Since the bug is assigned to nobody and there have been no updates, you may safely assume that nobody is working on it.
To solve the problem of what to do if an unprivileged user installs a XULRunner app for themselves and the system-wide xulrunner is then upgraded to an incompatible version, the new xulrunner could offer to download the latest version of xulrunner that works with that app and give the app its own private copy. When the app later updates to a version that supports the system-wide xulrunner, the private xulrunner can be removed as part of the update process.

The only issue I can see with this is that the user will be confused as to why this download is suddenly needed; however, it should not be a major issue, since the user only becomes aware of the problem when xulrunner offers to fix it for them.

This also has the advantage that only the xulrunner versions that are actually required will ever be installed on the system. System-wide apps can be tracked by the update system, I suppose, so you would still have multiple system-wide xulrunner versions, but only the ones needed for apps that are installed system-wide.
I think Benjamin Smedberg is a Windows-user. ;-)
On other systems the package management system can handle this perfectly.
On Windows XULRunner could do this task of a package management system.
But: No private XULRunner for Firefox, please! There is no reason for this.
Product: Firefox → Toolkit
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME