Closed Bug 518506 Opened 10 years ago Closed 10 years ago

Horrible flicker when scrolling windowed plugins living out-of-process

Categories

(Core :: Widget: Gtk, defect, P2)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1
Tracking Status
status1.9.2 --- .4-fixed

People

(Reporter: cjones, Assigned: karlt)

References

()

Details

(Keywords: regression, verified1.9.2, Whiteboard: [3.6.x][fixed-lorentz])

Attachments

(3 files, 10 obsolete files)

5.62 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
42.36 KB, patch
roc
: review+
Details | Diff | Splinter Review
5.39 KB, patch
Details | Diff | Splinter Review
Navigate a recent (> 2009/09/23) Electrolysis build to the URL above.  The plugin windows appear to go white on every scroll event.

I have no idea how to fix this, and it's not urgent at all, but I'd like to make sure it's something we know how to avoid.
Blocks: OOPP
This is a regression from bug 339548 and affects out-of-process plugins such as the java plugin even on mozilla-central and 1.9.2. e.g.
http://java.sun.com/products/plugin/1.5.0/demos/plugin/applets/GraphLayout/example1.html
Assignee: nobody → mozbugz
Blocks: 339548
Flags: blocking1.9.2?
Summary: Horrible flicker when scrolling windowed plugins living out-of-process (Electrolysis) → Horrible flicker when scrolling windowed plugins living out-of-process
This is caused by the unmapping and re-mapping of widgets for plugins.  That
worked OK for in-process plugins that used GDK because GDK temporarily removes
the background from descendant GdkWindows when showing windows.

If the backgrounds are still present when the ancestor is mapped, the X
server paints the background when the windows become visible.  However, the
contents are not painted until glib processes idle events.

I think the solution is going to be dividing up the region passed to
gdk_window_move_region so that each sub region can be ordered wrt child
widgets such that no overlapping occurs.

roc and I discussed this and it sounds like this would be best done in XP code
as that is where the sorting algorithm currently is (SortBlitRectsForCopy, for
MS Windows blitting I assume, but only for blit rects) and having the child
widgets inter-sorted may also be useful on MS win32.
The default Window background is actually None with XCreateWindow but some color with XCreateSimpleWindow, black for GDK windows, and dependent on the theme for GDK windows in GTK widgets.  Some documentation:
http://tronche.com/gui/x/xlib/window/attributes/background.html
http://my.safaribooksonline.com/9780596806187/settable_attributes
http://www.sbin.org/doc/Xlib/chapt_04.html
Why would this block 1.9.2?
(In reply to comment #4)
> Why would this block 1.9.2?

It is a noticeable regression from 1.9.1 when scrolling with any of a number of plugins.

This includes out of process plugins such as the Java plugin, and plugins run through nspluginwrapper, which sounds reasonably common on Fedora.

It would also affect any non-GDK or non-XEmbed plugins, such as Swfdec, which AFAIK is the best open-source alternative to Adobe Flash Player.
Keywords: regression
Need to fix, even if it means backing out the fix that caused this (although that's compositor related, yes?)
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
Karl has a plan for this that will not require a backout.
Status update?
Karl, will you have this done by the freeze (Wed night pacific)? It's one of the only gfx blockers that hasn't seen much traction.
This is a record of an attempt at a quick solution using the Composite extension to redirect rendering of plugin windows during the scroll.  It possibly works quite well during the scroll, but when unredirecting rendering, the X server shows the background and sends an expose event -> same flicker problem.

Redirecting all (visible) plugin windows permanently would probably scroll well but I don't think we want the extra round trip from when the plugin draws to when we get notified and copy to the visible window.
For the record, last time I checked gdk_window_set_composited() wasn't in our minimum required gtk version.
Attached patch WIP 0.1.3 (obsolete) — Splinter Review
This is the work in progress.  It contains some proof of concept code and some
framework for doing this properly (as well as splashes of other ideas - ignore the nsRegion changes).

The approach is as described in comment 2 though its all just in GTK code at
the moment.  (It doesn't use Composite.)

Thus far it proves the idea works.  It orders the blit rects appropriately but
doesn't order the child widgets well.  Pixman is used to order the blit rects.
Pixman doesn't seem to have any documentation, but X relies on the same behavior so I'm not expecting any changes in what it provides.

It works well with rectangular plugin clip regions when scrolling vertically
or horizontally or when there are not many plugins or fixed position elements
around.

Remaining issues are particularly noticeable when scrolling unaligned adjacent plugins diagonally.  That is resolvable.

Still to do:
1) Finish ordering of child widgets for diagonal scrolling.
2) Work out what, if any, should be XP.
3) Properly fix build issues with pixman and --enable-shared.
4) Clean up.

A difference that is possibly not preferred is that, when plugins scroll into
view (either from out of the viewport or from behind a fixed position element)
the X server paints the background for the unpainted region before the plugin
gets the expose event.  Previously the previous screen content remained,
giving a blurring or tiling effect.  I think this difference is reasonable,
given that it is the plugin that chooses what background should be painted
when it is exposed.  What this approach is fixing is the hide and show that
the plugin can't control.

(In reply to comment #10)
> Karl, will you have this done by the freeze (Wed night pacific)?

I working hard on this and hoping to have this done by then.
Depends how reviews go, though.
Attachment #412774 - Attachment is obsolete: true
To the 1.9.2 drivers: This is both very scary and very important, and, were I in the position to make such a decision, I would _totally_ hold the release for this bug.
(In reply to comment #13)
> Still to do:
> 1) Finish ordering of child widgets for diagonal scrolling.
> 2) Work out what, if any, should be XP.
> 3) Properly fix build issues with pixman and --enable-shared.
> 4) Clean up.

Forgot to mention:
5) Fix bug 528852

Without that fixed, it could be quite obvious when scrolling with mulitple windowed plugins.  That bug shouldn't be as complex as what is happening here.
Depends on: 528852
Attached patch WIP 0.2.1 (obsolete) — Splinter Review
Filled in some gaps in child widget ordering and pruned out unused stuff.
Diagonal scrolling now usually works well.
Still need to resolve cycles better for complex geometries.
Attachment #412776 - Attachment is obsolete: true
I'm not sure this blocks; most Linux distros aren't going to pick up Firefox 3.6 for a while, and I'm told that Java on Linux isn't that popular.

Moving back to a nomination to discuss.
Flags: blocking1.9.2+ → blocking1.9.2?
(In reply to comment #17)
> I'm not sure this blocks; most Linux distros aren't going to pick up Firefox
> 3.6 for a while, and I'm told that Java on Linux isn't that popular.

The first part of that is not really correct. Ubuntu, for example, is already working out plans to upgrade its users to Firefox 3.6 due to upstream's (us) decision to most likely upgrade 3.5 users to 3.6 as a minor update.

However, I don't know how popular Java is, but asac may have some numbers.
Attachment #413042 - Flags: review? → review?(benjamin)
Attachment #413042 - Flags: review?(benjamin)
I think this is the simpler way to do what you want.
Attachment #413066 - Flags: review?(mozbugz)
Comment on attachment 413066 [details] [diff] [review]
Wrap pixman.h in non-libxul build, rev. 1

>diff --git a/config/system-headers b/config/system-headers

>+pixman.h

Thanks, Benjamin.  I tried this first and accidentally left this out of attachment 413042 [details] [diff] [review], but it wasn't enough on its own.

pixman definition source files include "pixman.h" not <pixman.h>, so they get the version in the same directory as the source file, not the dist/system_wrappers version.

I see now that PIXMAN_EXPORT is set up correctly for __GNUC__ and __SUNPRO_C and used in definitions, so that works well with -fvisibility=hidden (and -xldscope=hidden I assume), but PIXMAN_EXPORT is not used in the headers, and "#pragma GCC visibility push(hidden)" seems to affect the declarations which end up overriding the definitions.

I'm not sure whether sorting this out is essential as libpixman users should be using MOZ_CAIRO_LIBS (or similar) anyway, but it seems nice to pick up the definition from libthebes.so if it is already there rather than having another copy.
Attachment #413066 - Flags: review?(mozbugz)
(In reply to comment #17)
> I'm told that Java on Linux isn't that popular.

See also comment 5.  If nspluginwrapper is widely used, then this will show up with Adobe Flash Player.
Comment on attachment 413152 [details] [diff] [review]
make public pixman symbols available in --disable-libxul and --enable-system-cairo builds v2
[Checkin: See comment 38]

Ugh. That's really ugly makefile magic (please remove the $(warning) before checkin.)

Is there really no way to modify pixman.h so that for example it has something like:

#ifdef USE_VISIBILITY_PRAGMAS
#pragma GCC visibility push(default)
#endif

body of pixman.h

#ifdef USE_VISIBILITY_PRAGMAS
#pragma GCC visibility pop
#endif
Attachment #413152 - Flags: review?(benjamin) → review+
(In reply to comment #24)
> (please remove the $(warning) before checkin.)

Oh, yeah.  Thanks.

> Is there really no way to modify pixman.h so that for example it has something
> like:
> 
> #ifdef USE_VISIBILITY_PRAGMAS
> #pragma GCC visibility push(default)
> #endif
> 
> body of pixman.h
> 
> #ifdef USE_VISIBILITY_PRAGMAS
> #pragma GCC visibility pop
> #endif

Something like that could probably be done.
We are using pixman sources designed to use -fvisibility=hidden but dropping their configure that provides the flag and using our own Makefile and magic.  The issue results from our build system being different, so I'd like to fix in our build system if possible.

What about just setting VISIBILITY_FLAGS to -fvisibility=hidden for gcc?
Would there be any disadvantage in that for C code?
The linker seems to resolve DEFAULT UND internal references to the HIDDEN symbols.  Would the calls really go through the PLT?
(In reply to comment #25)
> The linker seems to resolve DEFAULT UND internal references to the HIDDEN
> symbols.  Would the calls really go through the PLT?

OK.  Sounds like that is not optimal.

http://gcc.gnu.org/wiki/Visibility
"This also helps producing more optimised code: when you declare something defined outside the current compilation unit, GCC cannot know if that symbol resides inside or outside the DSO in which the current compilation unit will eventually end up; so, GCC must assume the worst and route everything through the GOT (Global Offset Table) which carries overhead both in code space and extra (costly) relocations for the dynamic linker to perform. To tell GCC a class, struct, function or variable is defined within the current DSO you must specify hidden visibility manually within its header file declaration (using the example above, you declare such things with DLLLOCAL). This causes GCC to generate optimal code."
Attached patch patch v0.5 (obsolete) — Splinter Review
This is now fully functional.  This version includes support for non-rectangular plugin geometries and resolves cycles appropriately.  It does not include a fix for bug 528852.  It needs cleaning up and more comments.
Attachment #412821 - Attachment is obsolete: true
Jeff showed the effects of this to me today; doesn't block, IMO, but the fix is great to have, thanks Karl!
Flags: blocking1.9.2? → blocking1.9.2-
Whiteboard: [3.6.x]
Blocks: 530605
Attachment #413258 - Attachment is obsolete: true
Attachment #414135 - Flags: review?(roc)
+    if (absDelta.y >= absDelta.x) {

Slightly simpler to just use PR_ABS(aDelta.y) >= PR_ABS(aDelta.x) and dispense with the local variables.

+    // binary merge

You should probably explain why this is necessary.

+            // gdk_window_move_region, up to GDK 2.16 at least, has a ghastly
+            // bug where it doesn't restrict blitting to the given region, and
+            // blits its full bounding box.  So we have to work around that by
+            // blitting one rectangle at a time.

This comment is obsolete now that you've integrated plugin movement.
This patch is amazing. In fact, I think it's TOO amazing. Normal people wouldn't even have tried to do this good a job in the pathological situations you handle :-). But I think it's more complex than I really want to maintain (or test). I think we should focus on the cases where a) plugins don't overlap and b) plugin clip regions are at most a single rectangle. Both of these situations were handled very poorly or not at all pre-3.6 and they are very rare in the wild, as far as I know. It seems that given these two assumptions, the code can be vastly simpler; if the assumptions are violated, we'll get flicker but it won't be the end of the world. If we hit some important cases where these assumptions don't hold we can always revive this patch.
Comment on attachment 413152 [details] [diff] [review]
make public pixman symbols available in --disable-libxul and --enable-system-cairo builds v2
[Checkin: See comment 38]

check-sync-dirs.py wants js/src/config/system-headers to be the same as config/system-headers even though js won't use pixman, so that change will need to be made if this patch is still needed for v2.
This would save two sorts being done with MOZ_WIDGET_GTK2, but that would be the only platform to gain.
Blocks: 531142
This is just the algorithm of attachment 414135 [details] [diff] [review] merged with the fix for bug 528852, which required modification to deal with child window moves.
Attachment #414135 - Attachment is obsolete: true
Attachment #414135 - Flags: review?(roc)
This version only takes care about plugin widgets with rectangular clip
regions.  Overlapping rectangular clip regions will still end up going through
the sort, but the sort does not provide any particular order for those
widgets.

This uses essentially the same algorithm as used in SortBlitRectsForCopy(),
changing the code to use a linked list so that the same sort code is used on
all platforms.

(Pixman regions are used here mainly because pixman_region32_init_rects() is
useful, though the sort is also more efficient with banded rectangles.  Doing
an efficient version of pixman_region32_init_rects for nsIntRegion is
non-trivial when the rectangles can overlap, as they may do for child clip
regions.  I'm not sure whether it really makes sense to implement another
version for nsRegion or to use pixman_region32 to implement nsRegion rather
than having two different region implementations in the tree.  Right now it
seems easiest to use pixman regions here.)
Attachment #414195 - Attachment is obsolete: true
Attachment #416021 - Attachment is obsolete: true
Attachment #417177 - Flags: review?(roc)
Comment on attachment 417177 [details] [diff] [review]
order child widget moves (only) a little carefully when scrolling
[Checkin: See comment 39]

+    // rectangles is moving in the direction of decreasing x and y (left

Rectangles *are* moving

+  PRBool HaveMore() { return mHead != nsnull; }

We tend to use the name "IsDone" here.
Attachment #417177 - Flags: review?(roc) → review+
Comment on attachment 413152 [details] [diff] [review]
make public pixman symbols available in --disable-libxul and --enable-system-cairo builds v2
[Checkin: See comment 38]

Pushed with the $(warning) removed and the change to js/src/config/system-headers (comment 32).
http://hg.mozilla.org/mozilla-central/rev/d6cf9a15c50f
Comment on attachment 417177 [details] [diff] [review]
order child widget moves (only) a little carefully when scrolling
[Checkin: See comment 39]

Pushed with changes from review in comment 37.
http://hg.mozilla.org/mozilla-central/rev/274480cf21d7
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a1
Comment on attachment 417178 [details] [diff] [review]
mochitest
[Checkin: See comment 40]

I changed this from a chrome test to a simple mochitest.  I don't understand why, but the script in the chrome test doesn't seem to have access to the plugin object methods (with the plugin in a document loaded in the iframe).  There's no good reason for this to be a chrome test anyway.

http://hg.mozilla.org/mozilla-central/rev/f20c4c211ac8
http://hg.mozilla.org/mozilla-central/rev/808b422d1274

With MOZ_WIDGET_TOOLKIT == cocoa, the test failed with
OS X 10.5.2 try hg unit test
TEST-UNEXPECTED-FAIL |
 /tests/widget/tests/test_plugin_scroll_invalidation.html |
 no paint on scroll - got 17, expected 10
Perhaps that might be because nsChildView::Scroll is not fussy about the order in which child views are Resized, but then I didn't know that view resizing caused invalidations.

With MOZ_WIDGET_TOOLKIT == windows, the test failed with
TEST-UNEXPECTED-FAIL |
 /tests/widget/tests/test_plugin_scroll_invalidation.html |
 no paint on scroll - got 11, expected 5
I don't know why ScrollWindowEx would be causing invalidations there.

I don't know how to test for the widget implementation from javascript (testing for Linux is not right) in order to use a todo so I've only added the test for MOZ_WIDGET_TOOLKIT == gtk2.  Perhaps the consequences are not as bad on the other windowing systems as they are on X11 anyway.
Flags: in-testsuite+
This was an experimental change to the mochitest, to wait longer for painting to complete, in case resizing cause synchronous (or at least earlier) paints.
With this patch there was still the same number of unnecessary paints.

OS X 10.5.2 try hg unit test
TEST-UNEXPECTED-FAIL |
 /tests/widget/tests/test_plugin_scroll_invalidation.html |
 no paint on scroll - got 14, expected 7

WINNT 5.2 try hg unit test
TEST-UNEXPECTED-FAIL |
 /tests/widget/tests/test_plugin_scroll_invalidation.html |
 no paint on scroll - got 16, expected 10
Attachment #417438 - Attachment is obsolete: true
This breaks the Qt build with a compilation error caused by http://hg.mozilla.org/mozilla-central/diff/274480cf21d7/widget/src/qt/nsWindow.cpp since it includes an extra brace.
Depends on: 534650
Comment on attachment 417177 [details] [diff] [review]
order child widget moves (only) a little carefully when scrolling
[Checkin: See comment 39]

>+  for (ScrollRect** nextLink = aUnmovedLink; ScrollRect* otherRect = *nextLink; ) {
[Seeing a declaration in a conditional expression confused me, and VC7.1 too!]
Depends on: 536065
Depends on: 537870
Depends on: 547577
Depends on: 550211
Flags: in-litmus?
https://litmus.mozilla.org/show_test.cgi?id=11588 has been added to Litmus.
Flags: in-litmus? → in-litmus+
http://hg.mozilla.org/projects/firefox-lorentz/rev/b2e1a5533d2b

http://hg.mozilla.org/projects/firefox-lorentz/rev/b60caef6244c
Karl, please carefully check this one, especially the nsChildView.mm changes, since there were some branch changes that didn't apply directly.

http://hg.mozilla.org/projects/firefox-lorentz/rev/90524c2db1c1
http://hg.mozilla.org/projects/firefox-lorentz/rev/2630d5fc17f0
Whiteboard: [3.6.x] → [3.6.x][fixed-lorentz]
Attachment #417177 - Attachment description: order child widget moves (only) a little carefully when scrolling → order child widget moves (only) a little carefully when scrolling [Checkin: See comment 39]
Attachment #413152 - Attachment description: make public pixman symbols available in --disable-libxul and --enable-system-cairo builds v2 → make public pixman symbols available in --disable-libxul and --enable-system-cairo builds v2 [Checkin: See comment 38]
Attachment #417178 - Attachment description: mochitest → mochitest [Checkin: See comment 40]
(In reply to comment #45)
> http://hg.mozilla.org/projects/firefox-lorentz/rev/b60caef6244c
> Karl, please carefully check this one, especially the nsChildView.mm changes,
> since there were some branch changes that didn't apply directly.

Just one minor thing that only affects --enable-shared builds (not --enable-ipc):

http://hg.mozilla.org/projects/firefox-lorentz/rev/75e81b7bf3f3
Blanket approval for Lorentz merge to mozilla-1.9.2
a=beltzner for 1.9.2.4 - please make sure to mark status1.9.2:.4-fixed
Marking verified for 1.9.2 based on passing mochitests.
Keywords: verified1.9.2
Depends on: 622258
You need to log in before you can comment on or make changes to this bug.