include GTK3 version in update URL

RESOLVED FIXED in Firefox 44

Status

()

defect
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: karlt, Assigned: glandium)

Tracking

Trunk
mozilla45
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox44+ fixed, firefox45+ fixed, firefox46blocking fixed, firefox-esr38-, b2g-v2.5 fixed)

Details

Attachments

(1 attachment, 2 obsolete attachments)

For the update service to know whether GTK2 builds can update to a GTK3 build, it needs to be told whether GTK+ 3.4 or newer is installed on the system.

There are no plans to change GTK2 dependencies of GTK2 builds, so I expect the GTK2 version currently in the URL can be replaced with a GTK3 version if present.
Any recommendations on how to get this info via javascript across distros?
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(karlt)
Note that we have in the past added this info via C++ XPCOM when it has been necessary which is also an option.
I think we need new C code to find the GTK3 libraries.

The current GTK2 version comes from the "secondaryLibrary" property of the 
@mozilla.org/system-info;1 nsIPropertyBag2.

https://hg.mozilla.org/mozilla-central/rev/71a73815eab4

Assuming this is not used for anything else, I was imagining replacing this with the values of gtk_major_version, etc. from dlopen("libgtk-3.so.0", RTLD_LOCAL).

I'm assuming that libgtk-3.so.0 doesn't instantiate any GTK2-conflicting glib types from its .init method.  I can't find any use of __attribute__((constructor)) or G_DEFINE_CONSTRUCTOR in the source code.
Flags: needinfo?(karlt)
(In reply to Karl Tomlinson (ni?:karlt) from comment #3)
> I'm assuming that libgtk-3.so.0 doesn't instantiate any GTK2-conflicting
> glib types from its .init method.  I can't find any use of
> __attribute__((constructor)) or G_DEFINE_CONSTRUCTOR in the source code.

All its dependencies will also be loaded, and they may have constructors as well. I wonder if RTLD_NOLOAD still allows to dlsym(). The safest would be to fork().
Flags: needinfo?(mh+mozilla)
Assignee: nobody → mh+mozilla
Attachment #8692454 - Flags: review?(karlt)
(In reply to Mike Hommey [:glandium] from comment #6)
> Try push with Gtk+2:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=676fc3bccc68

The oranges are for the unrelated changes I pushed at the same time.
Comment on attachment 8692454 [details] [diff] [review]
Include the Gtk+3 version in update URL if available on Gtk+2 builds

Thanks, Mike.  That's providing what I had in mind.  However:

"A process shall be created with a single thread. If a multi-threaded process
 calls fork(), the new process shall contain a replica of the calling thread
 and its entire address space, possibly including the states of mutexes and
 other resources.  Consequently, to avoid errors, the child process may only
 execute async-signal-safe operations until such time as one of the exec
 functions is called. Fork handlers may be established by means of the
 pthread_atfork() function in order to maintain application invariants
 across fork() calls."

nsSystemInfo::Init() is called after spawning other threads:
https://hg.mozilla.org/mozilla-central/annotate/74c7941a9e22d50057800771ebae07f69deecc9f/xpcom/base/nsSystemInfo.cpp#l71

Can you be sure that dlopen()/dlsym() are safe even when other threads have
been created?  Similarly, even snprintf() may allocate memory, which tends to
use mutexes.

I wonder about piggy-backing on fire_glxtest_process() if the gtk work is done
on a different pipe before the GLX code may crash the process.  Having both
readers waitpid() would not be ideal due to possible re-use of the pid (even
though NSPR already gives us potential for that bug), but I think it is fine if only one reaps the child.

https://dxr.mozilla.org/mozilla-central/search?q=glxtest_pipe&case=true

>+    switch (pid) {
>+    case 0: {
>+      close(pipefd[0]);

Gecko style is to indent the case labels (and then the statements one indent
further).  See the other switch in this file.

>+        guint (*gtk_get_major_version)(void);
>+        guint (*gtk_get_minor_version)(void);
>+        guint (*gtk_get_micro_version)(void);
>+
>+        gtk_get_major_version = reinterpret_cast<guint (*)(void)>(
>+          dlsym(handle, "gtk_get_major_version"));
>+        gtk_get_minor_version = reinterpret_cast<guint (*)(void)>(
>+          dlsym(handle, "gtk_get_minor_version"));
>+        gtk_get_micro_version = reinterpret_cast<guint (*)(void)>(
>+          dlsym(handle, "gtk_get_micro_version"));

I would declare these auto, and initialize at declaration.

>+          if (len > 0 && len < 64) {
>+            // Assume this succeeds the first time.
>+            (void) write(pipefd[1], gtkver, len);
>+          }

Yes, that's a valid assumption because POSIX requires PIPE_BUF >= 512 > 64.

ArrayLength(gtkver) instead of "64".

>+      int len = read(pipefd[0], gtkver, 64);
>+      close(pipefd[0]);
>+      waitpid(pid, NULL, 0);
>+      if (len > 0 && len < 64) {

EINTR handling is required on read() and waitpid() because each of them may
block.

NUL-termination of the string depends on the atomic write for lengths less
than PIPE_BUF, and POSIX requiring PIPE_BUF >= 512 > 64.  Can you add a
comment to explain this, please?

ArrayLength(gtkver) instead of "64" in the read.

len < 64 is not necessary here because the writer ensures that it only writes
NUL-terminated strings (up to 64-bytes including the NUL).
Attachment #8692454 - Flags: review?(karlt) → review-
A different attempt. This piggies back on the glx test, which starts earlier than us initializing separate threads.
Attachment #8692454 - Attachment is obsolete: true
Attachment #8692825 - Flags: review?(karlt)
Comment on attachment 8692825 [details] [diff] [review]
Include the Gtk+3 version in update URL if available on Gtk+2 builds

GfxInfo uses NS_DECL_THREADSAFE_ISUPPORTS.
Is there something that ensures that GetData() is called on only one thread?

nsURLFormatter looks like it fetches secondaryLibrary only once.
Is there something that ensures that GfxInfo::GetData() is called before
this getter and any other clients that do similar?

https://dxr.mozilla.org/mozilla-central/rev/7883e81f3c305078353ca27a6b1adb8c769d5904/toolkit/components/urlformatter/nsURLFormatter.js#58

I'd feel safer with a separate pipe for the GTK version and reading from
nsSystemInfo::Init().

>+#if MOZ_WIDGET_GTK == 2
>+#include <gtk/gtk.h>
>+#endif

It's a bit weird including GTK2 headers for using GTK3 functions.
I assume this is for guint.  If so, please make this <glib.h>.

"typedef unsigned int guint" would also be fine.
The snprintf() use is assuming this anyway.

>+      if (len > 0 && size_t(len) < sizeof(gtkver) - 1) {

"- 1" is not required here, right?

"a return value of size or more means that the output was truncated."

>+    do {
>+        bytesread = read(glxtest_pipe, &buf,
>+                         buf_size-1); // -1 because we'll append a zero
>+    } while (bytesread < 0 && errno == EINTR);

Now that there is more than one write, a read() will not necessarily read the
bytes from both writes, and so read() needs to be repeated, with correct byte
accounting, until it returns zero or -1 with errno != EINTR.
Attachment #8692825 - Flags: review?(karlt) → review-
Tracking for 45 as we may be aiming this feature at the 45 release.
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #12)
> Tracking for 45 as we may be aiming this feature at the 45 release.

If we aim Gtk+3 at the 45 release, then *this* bug needs tracking for 44.
Flags: needinfo?(lhenry)
(In reply to Karl Tomlinson (ni?:karlt) from comment #11)
> Comment on attachment 8692825 [details] [diff] [review]
> Include the Gtk+3 version in update URL if available on Gtk+2 builds
> 
> GfxInfo uses NS_DECL_THREADSAFE_ISUPPORTS.
> Is there something that ensures that GetData() is called on only one thread?

I don't know, but either users of GfxInfo do, or it's not a problem in practice.

> nsURLFormatter looks like it fetches secondaryLibrary only once.
> Is there something that ensures that GfxInfo::GetData() is called before
> this getter and any other clients that do similar?

It's called indirectly from nsBaseWidget::CreateCompositor. Do you think nsURLFormatter can ever start before that?
Now, one problem is that this call only happens if accelerated layers are enabled, and they can be disabled via a pref, an env variable, or by safe mode. There are other code paths that can end up invoking it, but they either root to WebGL or DriverCrashGuard afaict. So here's the interesting thing: if accelerated layers are disabled by env variable, pref, or safe mode, it looks like we're zombifying a forked process, since we never waitpid() for it... but that's independent of this change. ... actually checking, there's a JS code path that triggers it too, indirectly, from a profile-after-change observer in TelemetryStartup.js. OTOH, nsURLFormatter's OSVersion is not called during startup until about:home is opened, which is well after profile-after-change.

> I'd feel safer with a separate pipe for the GTK version and reading from
> nsSystemInfo::Init().

Then we get into the problem of handling waitpid correctly. That seems worse than the alternative of having OSVersion computed before GetData() is called, which is *very* unlikely, if not impossible. For something that's not going to live longer than one release of Firefox, I'd say it's not worth the effort to try to plug all possible theoretical holes.
Flags: needinfo?(karlt)
(In reply to Mike Hommey [:glandium] from comment #14)
> (In reply to Karl Tomlinson (ni?:karlt) from comment #11)
> > GfxInfo uses NS_DECL_THREADSAFE_ISUPPORTS.
> > Is there something that ensures that GetData() is called on only one thread?
> 
> I don't know, but either users of GfxInfo do, or it's not a problem in
> practice.

Please add MOZ_ASSERT(NS_IsMainThread()) or appropriate, if depending on this,
so that we might discover if the assumption is incorrect.  Bugs are
expected and frequent in GL usage and we can't distinguish between Gecko bugs
and OS bugs, so we wouldn't know whether this was a problem in practice.

> > nsURLFormatter looks like it fetches secondaryLibrary only once.
> > Is there something that ensures that GfxInfo::GetData() is called before
> > this getter and any other clients that do similar?
> 
> It's called indirectly from nsBaseWidget::CreateCompositor. Do you think
> nsURLFormatter can ever start before that?

I don't know how nsURLFormatter is used, but I could imagine it used for first
run pages.  I haven't reviewed other secondaryLibrary clients.

> Now, one problem is that this call only happens if accelerated layers are
> enabled, and they can be disabled via a pref, an env variable, or by safe
> mode.

So-called "accelerated" layers usually describes GL layers in this port, which
are off by default.  The OMTC pref may be involved here too.

> There are other code paths that can end up invoking it, but they
> either root to WebGL or DriverCrashGuard afaict. So here's the interesting
> thing: if accelerated layers are disabled by env variable, pref, or safe
> mode, it looks like we're zombifying a forked process, since we never
> waitpid() for it... but that's independent of this change.

In the past, I saw that until about:support was opened.  I see it reaped now.

> ... actually
> checking, there's a JS code path that triggers it too, indirectly, from a
> profile-after-change observer in TelemetryStartup.js.

I wonder how prefs affect that, or whether prefs are even available at that
point.  Most telemetry (at least) is opt-in on release channels.

> OTOH, nsURLFormatter's
> OSVersion is not called during startup until about:home is opened, which is
> well after profile-after-change.

FWIW running code to gather unnecessary telemetry info about GL before showing
the first page sounds like a bug.

> > I'd feel safer with a separate pipe for the GTK version and reading from
> > nsSystemInfo::Init().
> 
> Then we get into the problem of handling waitpid correctly. That seems worse
> than the alternative of having OSVersion computed before GetData() is
> called, which is *very* unlikely, if not impossible. For something that's
> not going to live longer than one release of Firefox, I'd say it's not worth
> the effort to try to plug all possible theoretical holes.

What is the problem with handling waitpid?
The existing waitpid code is sufficient there I assume (apart from existing potential zombies)?

The pipe will still exist after the child is reaped.

The issue here is not so much a theoretical hole, but no clear mechanism to
provide that one function to be called before another.  That leaves every
chance that it will work for some people but not others.
Flags: needinfo?(karlt)
needinfoing nthomas to get his input on where in the update url this value should be included for balrog.
Flags: needinfo?(nthomas)
ni? re waitpid questions
Flags: needinfo?(mh+mozilla)
I ran the build from comment #10 (on Ubuntu 14.04 LTS) and it queried for updates like this:

*** AUS:SVC Checker:checkForUpdates - sending request to: https://aus5.mozilla.org/update/3/Firefox/45.0a1/20151126234334/Linux_x86_64-gcc3/en-US/default/Linux%203.19.0-33-generic%20(GTK%203.10.8)/default/default/update.xml?force=1

ie OS_VERSION is being set to 'Linux 3.19.0-33-generic (GTK 3.10.8)'. For comparison 42.0 sends 'Linux 3.19.0-33-generic (GTK 2.24.23)'

What would an install with only GTK 2 send ? I can't get the build to run on Ubuntu 10.04 LTS.
(In reply to Nick Thomas [:nthomas] from comment #18)
> I ran the build from comment #10 (on Ubuntu 14.04 LTS) and it queried for
> updates like this:
> 
> *** AUS:SVC Checker:checkForUpdates - sending request to:
> https://aus5.mozilla.org/update/3/Firefox/45.0a1/20151126234334/Linux_x86_64-
> gcc3/en-US/default/Linux%203.19.0-33-generic%20(GTK%203.10.8)/default/
> default/update.xml?force=1
> 
> ie OS_VERSION is being set to 'Linux 3.19.0-33-generic (GTK 3.10.8)'. For
> comparison 42.0 sends 'Linux 3.19.0-33-generic (GTK 2.24.23)'
> 
> What would an install with only GTK 2 send ? I can't get the build to run on
> Ubuntu 10.04 LTS.

It would send the GTK 2 version, like before.
Attachment #8692825 - Attachment is obsolete: true
Flags: needinfo?(mh+mozilla)
Attachment #8695618 - Flags: review?(karlt)
Ok, that should be fine for the purposes of blocking updates.  Supposing we ship GTK3 at 45, and this detection code in 44, we'd probably do this for beta and release:
* add an update watershed, so that all users would update to 44.0 first (either the last beta or 44.0.x as appropriate). We could probably limit this to only Linux, since there's no reason to penalise Windows or Mac
* adjust the existing blocking-rule for deprecated GTK 2.0-2.17 so that it's applied to versions < 44 (instead of all versions)
* add a new blocking rule for requests with version >= 44.0 and OS_VERSION containing GTK 2, GTK 3.0, GTK 3.1, GTK 3.2, or GTK 3.3

Does that sound correct ? I have a nagging feeling I'm missing something.

It would be helpful if the new detection code was present in 44.0b1, since betas all use a version like 44.0 and we'd have to add extra restrictions using buildID.
Flags: needinfo?(nthomas)
As I mentioned in an email, isn't blocking going to "silently" leave people on an insecure version forever, possibly without knowing? How did we handle the requirement for XPSP2 or SP3?
The update.xml can tell the client that it is unsupported and notify the user

See https://bugzilla.mozilla.org/show_bug.cgi?id=843497 for screenshots of the UI's.
Comment on attachment 8695618 [details] [diff] [review]
Include the Gtk+3 version in update URL if available on Gtk+2 builds

Looks good, thanks!

>+  extern int gtk_read_end_of_the_pipe;

Nice.  I didn't know these could be declared in function scope.

>-                               nsDependentCString(gtkver));
>-    PR_smprintf_free(gtkver);
>+                               nsDependentCString(gtkver, gtkver_len));

Good idea.  That makes things simpler and safe.

>+    } while (gtkver_len < 0 && errno == EINTR);
>+    close(gtk_read_end_of_the_pipe);
>+  }
>+#endif
>+
>+  if (gtkver_len == 0) {
>+    gtkver_len = snprintf(gtkver, sizeof(gtkver), "GTK %u.%u.%u",

gtkver_len <= 0 in the condition may be better, so as to provide a more consistent looking string if something is wrong with the read.  I don't know how well things would cope with a missing secondaryLibrary.
Attachment #8695618 - Flags: review?(karlt) → review+
(In reply to Nick Thomas [:nthomas] from comment #21)
> * add a new blocking rule for requests with version >= 44.0 and OS_VERSION
> containing GTK 2, GTK 3.0, GTK 3.1, GTK 3.2, or GTK 3.3

Sounds good, thanks, as long as GTK 3.12, GTK 3.22, etc. are not blocked.
Just in case... there isn't any reason for me to review this as well since it doesn't touch app update code which should speed up getting this landed.
(In reply to Mike Hommey [:glandium] from comment #13)
> (In reply to Liz Henry (:lizzard) (needinfo? me) from comment #12)
> > Tracking for 45 as we may be aiming this feature at the 45 release.
> 
> If we aim Gtk+3 at the 45 release, then *this* bug needs tracking for 44.

Also, we'd need to switch 44 back to Gtk+2 before beta1, ideally.
https://hg.mozilla.org/mozilla-central/rev/139130c7ed7a
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
(In reply to Karl Tomlinson (back Dec 21 ni?:karlt) from comment #25)
> (In reply to Nick Thomas [:nthomas] from comment #21)
> > * add a new blocking rule for requests with version >= 44.0 and OS_VERSION
> > containing GTK 2, GTK 3.0, GTK 3.1, GTK 3.2, or GTK 3.3
> 
> Sounds good, thanks, as long as GTK 3.12, GTK 3.22, etc. are not blocked.

They would be, thanks for noticing that. We can add trailing dots to avoid that, ie GTK 2,GTK 3.1.,GTK 3.2.,GTK 3.3.
Ritu, see comment 13, you likely want to track this for 44.
Flags: needinfo?(lhenry) → needinfo?(rkothari)
Comment on attachment 8695618 [details] [diff] [review]
Include the Gtk+3 version in update URL if available on Gtk+2 builds

Approval Request Comment
[Feature/regressing bug #]: This allows to properly handle users without Gtk+3 installed, so as to not upgrade Firefox and leave them with something that doesn't start.
[User impact if declined]: Firefox not starting for some users after upgrade to the first version we ship with Gtk+3.
[Describe test coverage new/current, TreeHerder]:
- Open browser console
- Choose to show XHR requests under Network
- Open About Firefox/Nightly dialog
- Check in the browser console what the aus5.mozilla.org XHR request says about the GTK version. Without Gtk+3 installed on the test system, it should say version 2.something, with Gtk+3 installed, it should say 3.something.
[Risks and why]: Relatively low. The worst thing that can happen is the Gtk+3 version not being detected properly and the Gtk+2 version being sent in the update request in that case, instead of that of Gtk+3.
[String/UUID change made/needed]: None
Attachment #8695618 - Flags: approval-mozilla-aurora?
Comment on attachment 8695618 [details] [diff] [review]
Include the Gtk+3 version in update URL if available on Gtk+2 builds

Since 44 is now on beta, sliding the approval request. See comment 32.
Attachment #8695618 - Flags: approval-mozilla-aurora? → approval-mozilla-beta?
Comment on attachment 8695618 [details] [diff] [review]
Include the Gtk+3 version in update URL if available on Gtk+2 builds

This is needed for GTK+3 support in FF45 AFAIU. Beta44+
Flags: needinfo?(rkothari)
Attachment #8695618 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Will this statistic be available anywhere in public?
I think this info will be useful for other projects in the same situation.
(In reply to Wes Kocher (:KWierso) from comment #37)
> Backed out in https://hg.mozilla.org/releases/mozilla-beta/rev/4fd821ab2a4f
> for build bustage:
> 
> https://treeherder.mozilla.org/logviewer.html#?job_id=689732&repo=mozilla-
> beta

Crossing bug 1219392. Just needs to change Unused to unused in https://hg.mozilla.org/releases/mozilla-beta/rev/ed9d79f506e3#l1.78
Flags: needinfo?(mh+mozilla) → needinfo?(wkocher)
Depends on: 1239516
(In reply to Nick Thomas [:nthomas] from comment #30)
> (In reply to Karl Tomlinson (back Dec 21 ni?:karlt) from comment #25)
> > (In reply to Nick Thomas [:nthomas] from comment #21)
> > > * add a new blocking rule for requests with version >= 44.0 and OS_VERSION
> > > containing GTK 2, GTK 3.0, GTK 3.1, GTK 3.2, or GTK 3.3
> > 
> > Sounds good, thanks, as long as GTK 3.12, GTK 3.22, etc. are not blocked.
> 
> They would be, thanks for noticing that. We can add trailing dots to avoid
> that, ie GTK 2,GTK 3.1.,GTK 3.2.,GTK 3.3.

I enabled this on the Beta channel today. However, we don't yet have any text to point newly unsupported users at, so this is a straight hard block that serves them no update. Once that exists I'll update Balrog to return it.
[Tracking Requested - why for this release]:

Over in bug 1234337 it was pointed out that we'll need this backported to ESR38, otherwise we won't be able to safely update from people from ESR38 -> ESR45, because we won't have accurate GTK information from them.
(In reply to Ben Hearsum (:bhearsum) from comment #42)
> Over in bug 1234337 it was pointed out that we'll need this backported to
> ESR38, otherwise we won't be able to safely update from people from ESR38 ->
> ESR45, because we won't have accurate GTK information from them.

Good point, thanks, but I think bug 1245476 will take care of that.
(In reply to Karl Tomlinson (ni?:karlt) from comment #43)
> (In reply to Ben Hearsum (:bhearsum) from comment #42)
> > Over in bug 1234337 it was pointed out that we'll need this backported to
> > ESR38, otherwise we won't be able to safely update from people from ESR38 ->
> > ESR45, because we won't have accurate GTK information from them.
> 
> Good point, thanks, but I think bug 1245476 will take care of that.

Ah, so we don't need to backport since we'll pick up the new update URL for ESR users when 45.0 ESR ships (which will still use GTK2). Great!
Can we put this into place for 46 with update server rules, before I turn updates back on for 45.0.2 and below ?  What would we put into the xml to let users know Firefox updates are blocked until they update to gtk3?
Marking this as blocking 46.0.1 until we have figured this out.
The client side was completed in this bug. The releng side still needs to be completed in a new bug.
Bug 1270358 for that, we're blocking updates and will add the desupport message once we deploy a code tweak to the update server.
Depends on: 1270358
You need to log in before you can comment on or make changes to this bug.