Closed Bug 639842 Opened 14 years ago Closed 14 years ago

Implement GfxInfo on X11 by running a small glxinfo-like program as separate process

Categories

(Core :: Graphics, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla6

People

(Reporter: bjacob, Assigned: bjacob)

References

Details

Attachments

(8 files, 5 obsolete files)

1.33 KB, text/plain
Details
5.78 KB, text/plain
Details
17.49 KB, patch
joe
: review+
Details | Diff | Splinter Review
17.48 KB, patch
bjacob
: review+
Details | Diff | Splinter Review
8.07 KB, patch
Details | Diff | Splinter Review
8.88 KB, patch
joe
: review+
Details | Diff | Splinter Review
11.26 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
23.76 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
Safely getting GL driver info on X11 is hard, because the only way [1] to do that is to create a GL context and call glGetString(), but with bad drivers, just creating a GL context may crash.

During this conversation [1] Brian Paul suggested [2] to just do that in a separate process, and that turns out to be the only realistic option.

A naive approach to doing this, which JP pointed out is used by Compiz, is to just run glxinfo and parse its output. However, since we need far less information than glxinfo returns and we want this to be as fast as possible, let's implement our own tiny glxinfo-like program.

To further clear the startup-time concerns, we could have the firefox startup script run this in parallel.

[1] http://lists.freedesktop.org/archives/mesa-dev/2011-February/005267.html
[2] http://lists.freedesktop.org/archives/mesa-dev/2011-February/005306.html
Blocks: 624593
...of course another option is to just fork(). That could be faster, and would also have the great advantage of reusing our GLContextProviderGLX code.
Here's a mini glxinfo-like program that just creates a pixmap-backed GL context and prints GL strings.

Problem: it is very slow to run, actually it's only slightly faster than glxinfo.

$ glxtest.cpp -o glxtest -lX11 -L /usr/lib/nvidia/ -lGL -lXext -O2
$ time ./glxtest 
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=Quadro FX 880M/PCI/SSE2
GL_VERSION=3.3.0 NVIDIA 260.19.21

real    0m0.131s
user    0m0.016s
sys     0m0.088s

Can anyone more knowledgeable than me see if there is room for improvement? Note that most of the time is sys. Here's what 'perf' has to say about it:

$ perf_2.6.37 record -f ./glxtest
$ perf_2.6.37 report
    37.08%    :6167  [kernel.kallsyms]  [k] smp_call_function_many
    10.23%  glxtest  [kernel.kallsyms]  [k] fget_light
     5.14%  glxtest  [kernel.kallsyms]  [k] copy_page_c
     3.64%    :6167  [kernel.kallsyms]  [k] __purge_vmap_area_lazy
     2.95%  glxtest  ld-2.11.2.so       [.] _dl_relocate_object
     2.88%    :6167  [nvidia]           [k] cache_flush
     2.66%    :6167       7fc45a08cddc  [.]     7fc45a08cddc
     2.47%    :6167  [kernel.kallsyms]  [k] intel_pmu_disable_all
     2.43%    :6167  [kernel.kallsyms]  [k] walk_system_ram_range
     2.25%    :6167  [kernel.kallsyms]  [k] pci_conf1_read
     2.23%  glxtest  [kernel.kallsyms]  [k] handle_mm_fault
     1.96%    :6167  [kernel.kallsyms]  [k] handle_edge_irq
     1.87%  glxtest  libc-2.11.2.so     [.] memcpy
     1.85%    :6167  [nvidia]           [k] _nv021528rm
     1.72%  glxtest  [kernel.kallsyms]  [k] link_path_walk
     1.62%  glxtest  [kernel.kallsyms]  [k] mark_page_accessed
     1.61%    :6167  [kernel.kallsyms]  [k] __rcu_read_lock
     1.52%    :6167  [nvidia]           [k] os_alloc_mem
     1.22%    :6167  [nvidia]           [k] _nv021858rm
     1.13%  glxtest  [kernel.kallsyms]  [k] find_vma
     1.06%    :6167  [nvidia]           [k] _nv009782rm
     1.03%    :6167  [nvidia]           [k] _nv014874rm
Attached file call-graph from perf
The slowness seems concentrated on the glXMakeCurrent() call. Without it, I get:
real    0m0.050s
user    0m0.021s
sys     0m0.016s

(the glGetString calls are not significant).

So what's taking all this sys time is glXMakeCurrent() in NVIDIA blob; I wonder how other drivers behave.
...well of course, 50 milliseconds is still far too much just to get 3 strings!
With the FGLRX driver I get:
real	0m0.211s
user	0m0.080s
sys	0m0.010s
so it's even worse in 'real' time, and removing the glXMakeCurrent call doesn't improve it. Perf says:
    50.39%  glxtest       7ff7a502ae0d  [.] 0x007ff7a502ae0d
    15.23%  glxtest  [kernel.kallsyms]  [k] clear_page_c
    14.05%  glxtest  libc-2.12.1.so     [.] 0x00000000127598
    11.80%  glxtest  fglrx_dri.so       [.] 0x00000001664b69
     4.78%  glxtest  [kernel.kallsyms]  [k] page_fault
     2.11%  glxtest  [kernel.kallsyms]  [k] __wake_up_bit
     1.24%  glxtest  [kernel.kallsyms]  [k] free_hot_cold_page
     0.34%  glxtest  [fglrx]            [k] 0x00000000073094
     0.06%  glxtest  [kernel.kallsyms]  [k] perf_event_comm_ctx
     0.00%  glxtest  [kernel.kallsyms]  [k] native_write_msr_safe

Would be nice to check Mesa now...
GL_VENDOR=X.Org R300 Project
GL_RENDERER=Gallium 0.4 on ATI RV515
GL_VERSION=2.1 Mesa 7.10.1

real    0m0.036s
user    0m0.023s
sys     0m0.007s

Removing XSynchronize saved only about 2ms
(In reply to comment #7)
> GL_VENDOR=X.Org R300 Project
> GL_RENDERER=Gallium 0.4 on ATI RV515
> GL_VERSION=2.1 Mesa 7.10.1
> 
> real    0m0.036s
> user    0m0.023s
> sys     0m0.007s

Much better!

> 
> Removing XSynchronize saved only about 2ms

Oops --- didn't mean to leave this in.
So if you apply these two patches, then running firefox will give this output:

$ dist/bin/firefox -P test -no-remote
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=Quadro FX 880M/PCI/SSE2
GL_VERSION=3.3.0 NVIDIA 260.19.21

and the gist is that this info was obtained in a separate process but was printed by the main firefox process at the place where we need GfxInfo.

Next steps:
  * check impact on startup time
  * implement the real part 2 == implement GfxInfo on X11 using that info
  * investigate how to get rid of the zombie process (does that matter?)
(In reply to comment #10)
>   * investigate how to get rid of the zombie process (does that matter?)

Yes, it matters.  You can wait on the process at the point where you need its output.  You'll need to do that anyway so you can get the exit code if that process crashed.

For that matter, you might seriously consider having the standalone program install appropriate signal handlers so it can capture crash information and backtraces, and put that information somewhere you can get at it later to report as a driver bug.
(In reply to comment #11)
> (In reply to comment #10)
> >   * investigate how to get rid of the zombie process (does that matter?)
> 
> Yes, it matters.  You can wait on the process at the point where you need its
> output.  You'll need to do that anyway so you can get the exit code if that
> process crashed.

Yep, that's what I figured meanwhile, thanks. Fixes it.

> 
> For that matter, you might seriously consider having the standalone program
> install appropriate signal handlers so it can capture crash information and
> backtraces, and put that information somewhere you can get at it later to
> report as a driver bug.

Hm, that could be done locally on the developer's machine, but then, any debugger already does the trick. It would be very hard to deploy onto the end users machine, because the point of this separate process is to be able to silently crash without showing the user.
(In reply to comment #12)
> (In reply to comment #11)
> > For that matter, you might seriously consider having the standalone program
> > install appropriate signal handlers so it can capture crash information and
> > backtraces, and put that information somewhere you can get at it later to
> > report as a driver bug.
> 
> Hm, that could be done locally on the developer's machine, but then, any
> debugger already does the trick. It would be very hard to deploy onto the end
> users machine, because the point of this separate process is to be able to
> silently crash without showing the user.

Seems useful to record crashes for the same reason Firefox has a crash reporter.  This kind of crash seems like a reasonable thing to record, to help find issues that need reporting upstream to driver developers.
(In reply to comment #13)
> Seems useful to record crashes for the same reason Firefox has a crash
> reporter.  This kind of crash seems like a reasonable thing to record, to help
> find issues that need reporting upstream to driver developers.

What one could do is install a simple signal handler that writes at least the signal number and any X errors to the pipe.
This is ready for review. I have yet to measure the performance impact.
Attachment #518836 - Attachment is obsolete: true
Attachment #519941 - Flags: review?(jmuizelaar)
GfxInfo on X11 is implemented in xpwidgets, as it doesn't care about the actual widget toolkit, and I wanted this to work regardless of gtk/qt.

However, the nsWidgetFactory.cpp files seem to be one-per-widget-directory, so I wasn't sure how to do that part in xpwidgets. Specifically, I didn't see how to add stuff into kWidgetContracts[] and kWidgetCIDs[]. So I did that part separately for gtk and for qt.
Attachment #518838 - Attachment is obsolete: true
Attachment #519944 - Flags: review?
Attachment #519944 - Flags: review? → review?(joe)
Bug 642502 gives us a nicely working Graphics section in about:support on all platforms, including X11. For X11, it depends on this bug.
Blocks: 642502
Comment on attachment 519941 [details] [diff] [review]
Part 1: the firefox-app-side code: implement probe, set up pipe, fire probe as separate process

I have concerns about code running before XRE_Main; I'd like Benjamin to take a look at this.
Attachment #519941 - Flags: review?(benjamin)
Blocks: 595557
Comment on attachment 519944 [details] [diff] [review]
Part 2: the libxul-side changes: implement GfxInfo on X11

>diff --git a/widget/src/xpwidgets/GfxInfoX11.cpp b/widget/src/xpwidgets/GfxInfoX11.cpp

>+ * Portions created by the Initial Developer are Copyright (C) 2010

2011 :)

>+NS_EXPORT int glxtest_pipe = 0;

Put a comment in here saying that this is initialized to the id of an fd in fire_glxtest_process() in some_other_file.cpp.

>+nsresult
>+GfxInfo::Init()
>+{
>+    GfxInfoBase::Init();
>+    return NS_OK;

return GfxInfoBase::Init() ?

>+GfxInfo::GetData()
>+    ssize_t bytesread = read(glxtest_pipe,
>+                             &buf,
>+                             buf_size-1); // -1 because we'll append a zero
>+    close(glxtest_pipe);

I'm not super-enthusiastic about the fact that it's dup()'d in one file, at one time, and closed in another file, at a different time, but I don't have a better suggestion.



>diff --git a/widget/src/xpwidgets/GfxInfoX11.h b/widget/src/xpwidgets/GfxInfoX11.h

>+ * Portions created by the Initial Developer are Copyright (C) 2010

2011 :)
Attachment #519944 - Flags: review?(joe) → review+
Comment on attachment 519941 [details] [diff] [review]
Part 1: the firefox-app-side code: implement probe, set up pipe, fire probe as separate process

Why are we doing this is nsBrowserApp instead of in XRE_main where all xulrunner apps can share it? I believe this is the wrong place for this generic code.
Attachment #519941 - Flags: review?(benjamin) → review-
This was indeed a good idea. Now I fire the probe right before where we set up error handlers, which is exactly what I want.
Attachment #521946 - Flags: review?(benjamin)
Carrying forward r+.
Attachment #521947 - Flags: review+
And here are timings.

First of all, global timings (these are stupid). For Firefox warm startup on the about:support URI, my patches actually REDUCES starup time, here from 880 ms down to 640 ms. The reason is probably the same as for Mike Hommey's famous 'most stupid patch ever':
   https://bug632404.bugzilla.mozilla.org/attachment.cgi?id=510621
In other words: the fork is effectively loading the same libraries that Firefox will have to load anyway, so it acts as a preloader.


Now let's look at more precise timings about the parts which one could have feared would have slowed things down. This is again for a Firefox warm-start on about:support.

begin main(), time =          1301088058.784292
firing glx process, time =    1301088058.784456
data is ready, time =         1301088058.859206
need data, time =             1301088059.311368
finished reading data, time = 1301088059.311392


The "data is ready" line is printed by the probe when it's done writing into the pipe. The "need data" line is printed by GfxInfo when it's about to start reading from the pipe, and the "finished reading data" when it's done.

So what we see here is that the data was ready more than 440 ms before it was needed, which is a nice safety margin; it would be even higher on a cold start (this was a warm start) and on a slower machine (this is a Core i7).

Only 0.02 milliseconds were spent reading from the pipe, so I think it's safe to say that this doesn't have a significant effect on startup time.
(Notice that here about:support is just an example of a page that forces calling GfxInfo. I simulated Firefox-without-my-patches by no longer firing the probe, and no longer reading from the pipe, but I kept GfxInfoX11)
This patch calls XSync() and uses a X error handler that writes X errors into the pipe. In that way, we catch the case where creating a GL context would generate X errors (a major source of crashes on linux). This required to make the write-end of the pipe a global variable.

This patch also writes into the pipe any error we might encounter.

At the other end of the pipe, in GfxInfoX11.cpp, this patch writes driver into into AppNotes; if driver info couldn't be read from the pipe, that means that an error occured, so we put the text read from the pipe into AppNotes, assuming that contains an error message.
Attachment #522145 - Flags: review?(joe)
Blocks: 645407
Comment on attachment 522145 [details] [diff] [review]
Part 3: complete error checking, write driver info into AppNotes as we do on Windows

>-  ///// Get GL vendor/renderer/versions strings, write into pipe /////
>+  ///// Get GL vendor/renderer/versions strings /////
>   enum { bufsize = 1024 };
>   char buf[bufsize];
>+  const GLubyte *vendorString = glGetString(GL_VENDOR);
>+  const GLubyte *rendererString = glGetString(GL_RENDERER);
>+  const GLubyte *versionString = glGetString(GL_VERSION);
>+  
>+  if (!vendorString || !rendererString || !versionString)
>+    fatal_error("glGetString returned null");
>+
>   int length = snprintf(buf, bufsize,
>                         "VENDOR\n%s\nRENDERER\n%s\nVERSION\n%s\n",
>                         glGetString(GL_VENDOR),
>                         glGetString(GL_RENDERER),
>                         glGetString(GL_VERSION));

Did you mean to replace the snprintf parameters with vendorString et al?
Thanks, indeed I didn't mean to call glGetString again. Updated patch.
Attachment #522145 - Attachment is obsolete: true
Attachment #522145 - Flags: review?(joe)
Attachment #522451 - Flags: review?(joe)
Comment on attachment 521947 [details] [diff] [review]
Part 2 UPDATED: applied Joe's comments

>+    close(glxtest_pipe);
>+    glxtest_pipe = 0;
>+    wait(NULL); // kill zombie process

This should really be waitpid().
wait() will collect any one child.  i.e. that could collect a different child and that child may well have an owner that wants to know the exit status.

It would be nice to check the exit status of this child here too, to check that it shutdown cleanly.
Poor shutdown could be a good indication that multiple contexts will not work well.
@ Karl / comment 29: OK, I am doing these changes in bug 645407. Sorry for intertwining these 2 bugs, I promise I'll land them both at once.
Attachment #519941 - Flags: review?(jmuizelaar)
Updated part 3 to also check for presence of the GLX extension, see bug 645407 comment 21.
Attachment #523020 - Flags: review?(joe)
Comment on attachment 523020 [details] [diff] [review]
Part 3 UPDATED: complete error checking, write driver info into AppNotes as we do on Windows

>+  // In case of X errors, our X error handler will exit() now.
>+  // We really want to make sure that the system is able to create a GL context without generating X errors,
>+  // as these would crash the application.
>+  XSync(dpy, False);

>   XCloseDisplay(dpy);

XCloseDisplay will check for errors (using XSync), so there should be no need for an additional XSync.
Attachment #522451 - Flags: review?(joe)
Comment on attachment 523020 [details] [diff] [review]
Part 3 UPDATED: complete error checking, write driver info into AppNotes as we do on Windows

In the case that there's an error in reading the results of the glxtest process, you probably want to mention that in the appnote field, prepending it to buf.
Attachment #523020 - Flags: review?(joe) → review+
Comment on attachment 521946 [details] [diff] [review]
Part 1 UPDATED: moved to XRE_main

glxtest_pipe does not need to be/should not be NS_IMPORT/NS_EXPORT now that we require libxul.

I did not review the actual graphics calls because I don't know anything about them. I am worried about the wait() call in the read half of the solution: you're calling wait(NULL), but I really think we want to wait on the specific PID of the process we launched, to avoid any race conditions of other processes firefox launches exiting in the meantime.

Sorry for the delayed review.

Did you explicitly decide not to worry about the possibility of a hung process?

Also the return value of read() should be checked before calling buf[bytesread] = 0 to avoid issues ;-)
Attachment #521946 - Flags: review?(benjamin) → review-
> glxtest_pipe does not need to be/should not be NS_IMPORT/NS_EXPORT now that we
> require libxul.

OK, updated patch coming.

> I did not review the actual graphics calls because I don't know anything about
> them. I am worried about the wait() call in the read half of the solution:
> you're calling wait(NULL), but I really think we want to wait on the specific
> PID of the process we launched, to avoid any race conditions of other processes
> firefox launches exiting in the meantime.

Yes, sorry for the confusion: this is being addressed in attachment 523032 [details] [diff] [review], see bug 645407 comment 24. Sorry for the two intertwined patch series scattered over two bugs. This patch does exactly what you describe, using waitpid().

> 
> Also the return value of read() should be checked before calling buf[bytesread]
> = 0 to avoid issues ;-)

Good point, updated patch coming.
This applies Benjamin's review comment about dropping NS_IMPORT; also, it folds the toolkit/ side of Part 3 and of the patches in bug 645407 so in particular it calls waitpid() instead of wait(), addressing the rest of Benjamin's review comments.
Attachment #519941 - Attachment is obsolete: true
Attachment #521946 - Attachment is obsolete: true
Attachment #526279 - Flags: review?(benjamin)
Note: no need to worry about the graphics specifics, that has been reviewed by other people, here and in bug 645407
This patch folds part 3 and patches from bug 645407.

The only thing here that still needs review is the read() return value checking. The rest has already received r+.
Attachment #526324 - Flags: review?(benjamin)
Attachment #526279 - Flags: review?(benjamin) → review+
Attachment #526324 - Flags: review?(benjamin) → review+
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Backed out due to crashes on Linux64 opt. But it was green on tryserver.
http://hg.mozilla.org/mozilla-central/rev/418b0b9985c6
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
http://hg.mozilla.org/mozilla-central/rev/dba982206658
http://hg.mozilla.org/mozilla-central/rev/3d3e24239251
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
Now that this change has gone in, will it appear in one of the 5.0 alpha (Aurora) builds?
No, it's only in Nightly at the moment, and will first get released in Firefox 6.
Target Milestone: --- → mozilla6
Depends on: 658840
Depends on: 660466
Assignee: nobody → bjacob
Depends on: 678372
Is it me or does this version lack the wait check that the version in GfxInfoX11 does (from changeset 809d3205d3b8), so that there's a zombie process for the lifetime of the application?
AIUI the zombie exists until GfxInfoX11 is used.
Right now, that might only be WebGL.  In the future, perhaps the opengl layers code will reap the child.
(In reply to neil@parkwaycc.co.uk from comment #44)
> Is it me or does this version lack the wait check that the version in
> GfxInfoX11 does (from changeset 809d3205d3b8), so that there's a zombie
> process for the lifetime of the application?

This bug is a bit old so I dont remember the specifics of what the state of things was at this time, but we did/do have a zombie process problem in certain cases, see below:

(In reply to Karl Tomlinson (:karlt) from comment #45)
> AIUI the zombie exists until GfxInfoX11 is used.
> Right now, that might only be WebGL.  In the future, perhaps the opengl
> layers code will reap the child.

The last bug in this area should be bug 681026 which just landed on inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb708067dd57
Great, I'll look forward to pulling it from m-c!
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: