Closed Bug 977464 Opened 7 years ago Closed 5 years ago

Detect required XPIDL UUID updates during incremental builds

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(firefox46 fixed)

RESOLVED FIXED
mozilla46
Tracking Status
firefox46 --- fixed

People

(Reporter: gps, Assigned: ehsan)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

https://hg.mozilla.org/integration/mozilla-inbound/rev/de17e5deb35d required a clobber due to an updated XPIDL interface not being propagated.

The landing was clean. But it was a clobber build.
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=de17e5deb35d

However, the next push (and subsequent pushes were not):
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=88723262c85c

If we trace a failing log:
https://tbpl.mozilla.org/php/getParsedLog.php?id=35324611&tree=Mozilla-Inbound&full=1

We see it's doing all the right things:

/builds/slave/m-in-osx64-d-00000000000000000/build/obj-firefox/_virtualenv/bin/python /builds/slave/m-in-osx64-d-00000000000000000/build/config/pythonpath.py -I/builds/slave/m-in-osx64-d-00000000000000000/build/other-licenses/ply -I/builds/slave/m-in-osx64-d-00000000000000000/build/xpcom/idl-parser -I../../../xpcom/idl-parser /builds/slave/m-in-osx64-d-00000000000000000/build/python/mozbuild/mozbuild/action/xpidl-process.py --cache-dir ../../../xpcom/idl-parser ../../../dist/idl ../../../dist/include xpt .deps xpcom_system nsIBlocklistService nsICrashReporter nsIDeviceSensors nsIGConfService nsIGIOService nsIGSettingsService nsIGeolocationProvider nsIGnomeVFSService nsIHapticFeedback nsIXULAppInfo nsIXULRuntime

/builds/slave/m-in-osx64-d-00000000000000000/build/obj-firefox/_virtualenv/bin/python -m mozbuild.action.buildlist ../../dist/bin/components/interfaces.manifest 'interfaces xpcom_system.xpt'

../../dist/bin/nsinstall -L /builds/slave/m-in-osx64-d-00000000000000000/build/obj-firefox/xpcom/system -m 644 '../../config/makefiles/xpidl/xpt/xpcom_system.xpt' '../../dist/bin/components'

However, if you dump the xpt, the interface isn't updated (missing crashReporter.UpdateCrashEventsDir):

       - ::nsICrashReporter (4b525eab-d4b8-4f71-bf43-b698e306631b):
          Parent: ::nsISupports
          Flags:
             Scriptable: TRUE
             BuiltinClass: FALSE
             Function: FALSE
          Methods:
       G       uint32 enabled(out retval boolean);
         H     uint32 setEnabled(in boolean);
       G       uint32 serverURL(out retval nsIURL);
        S      uint32 serverURL(in nsIURL);
       G       uint32 minidumpPath(out retval nsIFile);
        S      uint32 minidumpPath(in nsIFile);
               uint32 annotateCrashReport(in CString &, in CString &);
               uint32 appendAppNotesToCrashReport(in CString &);
               uint32 registerAppMemory(in uint64, in uint64);
         H     uint32 writeMinidumpForException(in void *);
         H     uint32 appendObjCExceptionInfoToAppNotes(in void *);
       G       uint32 submitReports(out retval boolean);
        S      uint32 submitReports(in boolean);
          Constants:
             No Constants
Blocks: clobber
The patch didn't update the UUID.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
<glandium> gps: haha yes, Typelib.Interface doesn't compare methods
<glandium> gps: it compares names and iids
<glandium> gps: and the packager code relies on that comparison
This has a nice side effect ... it means that the build system basically validates whether you change the uuid like you are supposed to.
And that's when I learned that you always need to update UUIDs when changing interfaces.

That being said, this seems like something we should be able to detect better. I guess seasoned Gecko developers just don't make this mistake?
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #3)
> This has a nice side effect ... it means that the build system basically
> validates whether you change the uuid like you are supposed to.

Not on a clobber, though.
Arguably, we could make Typelib.Interface.__cmp__ actually compare the methods and make the packager bark if it hits the case where uiid and names are equal, but methods aren't. (or the packager could do the methods check). That still wouldn't catch anything on clobber builds, though. At least it would make a seamingly random test failure into a build failure.
(In reply to Mike Hommey [:glandium] from comment #6)
> Arguably, we could make Typelib.Interface.__cmp__ actually compare the
> methods and make the packager bark if it hits the case where uiid and names
> are equal, but methods aren't. (or the packager could do the methods check).
> That still wouldn't catch anything on clobber builds, though. At least it
> would make a seamingly random test failure into a build failure.

+1
Reshaping the bug.

Let's make the packager fail if it encounters an interface that's missing a UUID update.

That being said - could we do this sooner? I /think/ we should be able to catch this during .xpt generation. Just read in the old .xpt and compare members. Would that have undesirable implications on developer workflows? I suppose it's no different from detecting it during packaging. But I have a feeling a lot of Gecko developers aren't doing packaging.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Summary: de17e5deb35d required a clobber (missing added XPIDL method) → Detect required XPIDL UUID updates during incremental builds
+1!

This is a common cause of bustage that we hit on TBPL and it's often ambiguous when it occurs. Having a clear error during the build would make life much better.
Is somebody working on this bug?  I was burned by this yesterday in bug 866528...
Nobody is working on this bug.

Code for writing the XPT from the build system is at https://hg.mozilla.org/mozilla-central/file/f2c1c0a02595/python/mozbuild/mozbuild/action/xpidl-process.py#l60.
I've been guilty of this many times.

Would it be easier to to a pre-push hook that requires "[uuid]" in the patch description, similar to how bug numbers are required, if it includes an IDL change? That would notify the user and probably solve a lot of cases. It wouldn't catch everything of course.
Push hooks that look at code are fragile. For example, determining whether an actual UUID change is required means parsing the .idl (because we don't want comment-only changes to trigger). Parsing the IDL requires knowing about references interfaces. This means knowing about IDL files outside of the current directory. This requires the build system. That's not feasible to do during a code push.

Taras was mentioning that they want to move forward with "containerization" of automation builds. In theory, that would allow us to perform try pushes on top of existing builds. i.e. Try pushes could be incremental builds. That would help identify issues like this (try builds today are all clobber builds so clobber issues often aren't realized until actual landing).

We can also emphasize sets of client-side VCS hooks to prevent bad commits. Those client-side hooks would have access to the build system and could provide the kinds of "static analysis" you desire.

Let's try to focus on making the build system error on incremental builds first.
(In reply to Gregory Szorc [:gps] from comment #11)
> Nobody is working on this bug.
> 
> Code for writing the XPT from the build system is at
> https://hg.mozilla.org/mozilla-central/file/f2c1c0a02595/python/mozbuild/
> mozbuild/action/xpidl-process.py#l60.

I'm willing to write a patch if you can tell me what exactly needs to be done here.  I'm not sure if I grok the previous conversation on this bug.
(In reply to Gregory Szorc [:gps] from comment #11)
> Nobody is working on this bug.
> 
> Code for writing the XPT from the build system is at
> https://hg.mozilla.org/mozilla-central/file/f2c1c0a02595/python/mozbuild/
> mozbuild/action/xpidl-process.py#l60.

I'm willing to write a patch if you can tell me what exactly needs to be done here.  I'm not sure if I grok the previous conversation on this bug.  What happened was that this patch <https://hg.mozilla.org/integration/mozilla-inbound/rev/c0e333b0f7dc> changed the type of an attribute <https://hg.mozilla.org/integration/mozilla-inbound/rev/c0e333b0f7dc#l6.17> and also the type of an argument to a method <https://hg.mozilla.org/integration/mozilla-inbound/rev/c0e333b0f7dc#l4.20> without updating the uuids.
Err, not sure how I managed to tangle up the previous comment!  What I meant to say is:

Note that people are mostly talking about detecting a new method added to an interface here, but bug 866528 was more subtle than that. What happened was that this patch <https://hg.mozilla.org/integration/mozilla-inbound/rev/c0e333b0f7dc> changed the type of an attribute <https://hg.mozilla.org/integration/mozilla-inbound/rev/c0e333b0f7dc#l6.17> and also the type of an argument to a method <https://hg.mozilla.org/integration/mozilla-inbound/rev/c0e333b0f7dc#l4.20> without updating the uuids.
Either in the aforementioned xpidl-process.py file or in xpcom/typelib/xpt/tools/xpt.py, you'll want to add code that compares the existing typelib file (represented by the Python Typelib class from xpt.py) against the new one. If an interface has changed but its UUID hasn't an error should be raised.

You probably want to implement a method for the Typelib class that checks if a UUID bump is needed. See https://hg.mozilla.org/mozilla-central/file/d6e549d77ab2/xpcom/typelib/xpt/tools/xpt.py#l885 for how interfaces are compared.

khuey can provide more details.
(In reply to Gregory Szorc [:gps] from comment #17)
> Either in the aforementioned xpidl-process.py file or in
> xpcom/typelib/xpt/tools/xpt.py, you'll want to add code that compares the
> existing typelib file (represented by the Python Typelib class from xpt.py)
> against the new one. If an interface has changed but its UUID hasn't an
> error should be raised.
> 
> You probably want to implement a method for the Typelib class that checks if
> a UUID bump is needed. See
> https://hg.mozilla.org/mozilla-central/file/d6e549d77ab2/xpcom/typelib/xpt/
> tools/xpt.py#l885 for how interfaces are compared.
> 
> khuey can provide more details.

Kyle, what do you think about this plan?
Flags: needinfo?(khuey)
> Parsing the IDL requires knowing about references interfaces. This means
> knowing about IDL files outside of the current directory. This requires
> the build system.

I don't think this is true.
(In reply to Mike Hommey [:glandium] from comment #19)
> > Parsing the IDL requires knowing about references interfaces. This means
> > knowing about IDL files outside of the current directory. This requires
> > the build system.
> 
> I don't think this is true.

You're right, but a hook is a terrible option here either way.  We should just fix the build system.
Don't most (if not all) interfaces inherit from nsISupports?
Don't we need to bump the UUID of all interfaces that inherit from a changed interface?

Given those two facts, I don't understand how my comment about IDL discovery is inaccurate? I suppose a poor man's implementation could look at the file list in the commit for *.idl. But:

$ hg locate '*.idl' | xargs basename | sort | uniq -c | grep -v 1
   2 SessionStore.idl
   2 nsIShellService.idl

That's a lovely rat hole we've stumbled upon.
comment 18 sounds reasonable to me.
Flags: needinfo?(khuey)
The landscape has changed here a bit.  We no longer support binary extensions so revving the IID is not required for binary compatibility.  However the packaging code still considers an interface unchanged if its IID hasn't been updated.  If we fix Interface.__cmp__() to actually compare the full interface information, we no longer need to rely on IIDs being revved for incremental builds, and we can therefore relax the requirement of revving XPIDL interface IIDs every time you change one.

I have a patch for this.
Assignee: nobody → ehsan
Status: REOPENED → ASSIGNED
Be aware that changing Interface.__cmp__() is likely to have a noticeable impact on performance of the packaging code.
Since we no longer support binary extensions, revving an interface's IID
is not necessary for binary compatibility.  However, we currently skip
relinking XPT files if a change to an interface doesn't update its IID.

This patch fixes that requirement by comparing full interfaces against
each other, so that updating an XPIDL interface without rvving its IID
works well with incremental builds.

This paves the way to remove the requirement on revving interface IIDs
when making a change to an XPIDL interface.
Attachment #8706162 - Flags: review?(khuey)
(In reply to Mike Hommey (VAC: 31 Dec - 11 Jan) [:glandium] from comment #24)
> Be aware that changing Interface.__cmp__() is likely to have a noticeable
> impact on performance of the packaging code.

Can you suggest a good way to measure the impact before and after?
Comment on attachment 8706162 [details] [diff] [review]
Always relink XPT files for all changed XPIDL interfaces without requiring the IID to be revved

Review of attachment 8706162 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/typelib/xpt/tools/xpt.py
@@ +165,5 @@
> +            return c
> +        c = cmp(self.reference, other.reference)
> +        if c != 0:
> +            return c
> +        return 0

All that can be written as

return (
    cmp(type(self), type(other)) or
    cmp(self.pointer, other.pointer) or
    cmp(self.reference, other.reference)
)
(In reply to :Ehsan Akhgari from comment #26)
> (In reply to Mike Hommey (VAC: 31 Dec - 11 Jan) [:glandium] from comment #24)
> > Be aware that changing Interface.__cmp__() is likely to have a noticeable
> > impact on performance of the packaging code.
> 
> Can you suggest a good way to measure the impact before and after?

Run `make package` once, then compare the time spent in mozpack.XPTFile.copy when running `make package` again, with and without the patch.

(In reply to :Ehsan Akhgari from comment #25)
> Created attachment 8706162 [details] [diff] [review]
> Always relink XPT files for all changed XPIDL interfaces without requiring
> the IID to be revved
> 
> Since we no longer support binary extensions, revving an interface's IID
> is not necessary for binary compatibility.

That's actually not true.
*Firefox* no longer supports (third-party) binary extensions. Thunderbird still does (at the very least for enigmail).
(In reply to Mike Hommey (VAC: 31 Dec - 11 Jan) [:glandium] from comment #28)
> (In reply to :Ehsan Akhgari from comment #26)
> > (In reply to Mike Hommey (VAC: 31 Dec - 11 Jan) [:glandium] from comment #24)
> > > Be aware that changing Interface.__cmp__() is likely to have a noticeable
> > > impact on performance of the packaging code.
> > 
> > Can you suggest a good way to measure the impact before and after?
> 
> Run `make package` once, then compare the time spent in mozpack.XPTFile.copy
> when running `make package` again, with and without the patch.

This function gets called twice during `make package`.  Before:

XXXehsan the time spent was: 0.879724
XXXehsan the time spent was: 0.014546

After:

XXXehsan the time spent was: 1.055032
XXXehsan the time spent was: 0.014182

The times are in seconds, measured using time.clock().

> (In reply to :Ehsan Akhgari from comment #25)
> > Created attachment 8706162 [details] [diff] [review]
> > Always relink XPT files for all changed XPIDL interfaces without requiring
> > the IID to be revved
> > 
> > Since we no longer support binary extensions, revving an interface's IID
> > is not necessary for binary compatibility.
> 
> That's actually not true.
> *Firefox* no longer supports (third-party) binary extensions. Thunderbird
> still does (at the very least for enigmail).

Really?  This code seems to disagree: <https://dxr.mozilla.org/mozilla-central/source/toolkit/xre/nsXREDirProvider.cpp#123>
(In reply to :Ehsan Akhgari from comment #29)
> Really?  This code seems to disagree:
> <https://dxr.mozilla.org/mozilla-central/source/toolkit/xre/nsXREDirProvider.
> cpp#123>

That's the bundles directory. What you're looking for is MOZ_BINARY_EXTENSIONS which was added in bug 1165428.
TL;DR: Thunderbird has no problem with this, but it would be great it release drivers could inform us of changing IDLs in stable branches so that we can port the changes if appropriate.

Without binary extensions (at least in Firefox), the requirement to roll IID for .idl changes is not critical. But at least for the ESR directory, stability of extensions that might use a particular .idl, even legitimate JS extensions, still requires that there is some discipline in reducing changes to .idl files there. For example in the esr38 cycle, there was a particular security patch that renamed .idl methods that could have potentially broken JS extensions. After discussion, that patch was ported for esr38 in the same manner as was done prior to removal of binary extension support (though not without some discussion of the need).

For Thunderbird, although we still support binary extensions, what that really means is that they are heavily deprecated to the point that anybody using binary extensions had better be in close contact with the core team so that we can maintain compatibility with their particular needs. The only active binary extensions (Enigmail, Lightning, and ExQuilla) are maintained by people with close connections to the core team (or ARE the core team), and will take whatever steps are needed to maintain that compatibility. That would include special porting of patches landed on mozilla-* directories that might affect compatibility of the supported binary extensions. This is very different from the previous state where there was a guaranteed level of support, so that silent extension writers could expect binary compatibility. Also, I am not aware of anybody trying to support multiple versions of interfaces in the same binary file, that would require strict UUID reliability. Also, the current expectation is that there will be no binary extension support post-TB45, so this issue only affects uplifts to gecko45 or earlier, and not current trunk.

Ideally what Thunderbird would like would simply be no UUID changes at beta or later, or if necessary for Firefox then some sort of warning to us so that we could port any changes that might be needed for existing support binary extensions. But that is more of a release driver issue than a code issue, so should not have any effect on the proposals in this bug.
(In reply to Kent James (:rkent) from comment #31)
> TL;DR: Thunderbird has no problem with this, but it would be great it
> release drivers could inform us of changing IDLs in stable branches so that
> we can port the changes if appropriate.
> 
> Without binary extensions (at least in Firefox), the requirement to roll IID
> for .idl changes is not critical. But at least for the ESR directory,
> stability of extensions that might use a particular .idl, even legitimate JS
> extensions, still requires that there is some discipline in reducing changes
> to .idl files there. For example in the esr38 cycle, there was a particular
> security patch that renamed .idl methods that could have potentially broken
> JS extensions. After discussion, that patch was ported for esr38 in the same
> manner as was done prior to removal of binary extension support (though not
> without some discussion of the need).

Firefox 38 does support binary extensions.  Plus, the patches you're mentioning could break JS extensions, and that was more relevant in that context over their impact on binary extensions.

> For Thunderbird, although we still support binary extensions, what that
> really means is that they are heavily deprecated to the point that anybody
> using binary extensions had better be in close contact with the core team so
> that we can maintain compatibility with their particular needs. The only
> active binary extensions (Enigmail, Lightning, and ExQuilla) are maintained
> by people with close connections to the core team (or ARE the core team),
> and will take whatever steps are needed to maintain that compatibility. That
> would include special porting of patches landed on mozilla-* directories
> that might affect compatibility of the supported binary extensions. This is
> very different from the previous state where there was a guaranteed level of
> support, so that silent extension writers could expect binary compatibility.
> Also, I am not aware of anybody trying to support multiple versions of
> interfaces in the same binary file, that would require strict UUID
> reliability. Also, the current expectation is that there will be no binary
> extension support post-TB45, so this issue only affects uplifts to gecko45
> or earlier, and not current trunk.

Revving the UUID is really meant to make it possible to run binary extensions compiled against older versions of Gecko on newer ones (by making QueryInterface fail if it sees a UUID that it doesn't know about.)  All Thunderbird binary extensions need to do in order to keep working in future releases is to recompile against those releases and don't use an old binary against a newer Gecko.

> Ideally what Thunderbird would like would simply be no UUID changes at beta
> or later, or if necessary for Firefox then some sort of warning to us so
> that we could port any changes that might be needed for existing support
> binary extensions. But that is more of a release driver issue than a code
> issue, so should not have any effect on the proposals in this bug.

The point of these changes is to enable us to never change UUIDS, ever.

Also please note that this bug is about a specific problem in the build system and discussing Thunderbird extensions in the context of this bug is sort of off-topic...
Comment on attachment 8706162 [details] [diff] [review]
Always relink XPT files for all changed XPIDL interfaces without requiring the IID to be revved

Review of attachment 8706162 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/typelib/xpt/tools/xpt.py
@@ +165,5 @@
> +            return c
> +        c = cmp(self.reference, other.reference)
> +        if c != 0:
> +            return c
> +        return 0

What glandium said.  This applies to the other __cmp__ functions too.
Attachment #8706162 - Flags: review?(khuey) → review+
Err, please fix the indentation you added.
It's

return (
    foo or
    bar
)

not

return (
        foo or
        bar
)
https://hg.mozilla.org/mozilla-central/rev/cca3da96be3e
Status: ASSIGNED → RESOLVED
Closed: 7 years ago5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Depends on: 1240053
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.