Closed Bug 1056340 Opened 10 years ago Closed 10 years ago

e10s: Grooveshark crashes all tabs

Categories

(Core :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla35
Tracking Status
e10s m3+ ---

People

(Reporter: sstangl, Assigned: mchang)

References

()

Details

(Keywords: crash)

Attachments

(10 files, 4 obsolete files)

10.81 KB, text/plain
Details
15.38 KB, patch
Details | Diff | Splinter Review
76.55 KB, text/plain
Details
650.47 KB, text/plain
Details
324 bytes, text/html
Details
4.06 KB, text/plain
Details
734.44 KB, text/plain
Details
9.05 KB, text/plain
Details
15.83 KB, text/plain
Details
5.03 KB, patch
mchang
: review+
Details | Diff | Splinter Review
With e10s and Flash enabled, browsing to http://grooveshark.com causes the tab to display "Tab crashed". Additionally, all other open tabs crash with a similar page.

On the non-Grooveshark tabs, clicking the "Try Again" button reloads the tab as well as Grooveshark, causing the tab to again crash. Manually refreshing the tab allows it to load successfully.
Assignee: nobody → wmccloskey
Blocks: old-e10s-m2
Attached file stack
This seems to be a graphics issue. First we hit this assertion:

[Child 19290] ###!!! ASSERTION: creating Xlib surface failed!: 'Error', file /home/billm/moz/in1/gfx/layers/basic/TextureClientX11.cpp, line 113

Then we crash with the stack I've attached. Can any of you graphics guys see what's happening here?
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(davidp99)
Flags: needinfo?(bjacob)
The top line of the stack suggests that it's an OOM crash. Do you see memory usage spike right before it crashes?
Flags: needinfo?(matt.woodrow)
I think this OOM error is bogus. The small size, 16k, means that if this were an OOM then the system would be totally overwhelmed already so it would likely OOM elsewhere before hitting this nice OOM error reporting.

The code in the parent stack frame is

void
gfxContext::PushNewDT(gfxContentType content)
{
  Rect clipBounds = GetAzureDeviceSpaceClipBounds();
  clipBounds.RoundOut();

  clipBounds.width = std::max(1.0f, clipBounds.width);
  clipBounds.height = std::max(1.0f, clipBounds.height);

  SurfaceFormat format = gfxPlatform::GetPlatform()->Optimal2DFormatForContent(content);

  RefPtr<DrawTarget> newDT =
    mDT->CreateSimilarDrawTarget(IntSize(int32_t(clipBounds.width), int32_t(clipBounds.height)),
                                 format);

  if (!newDT) {
    NS_WARNING("Failed to create DrawTarget of sufficient size.");
    newDT = mDT->CreateSimilarDrawTarget(IntSize(64, 64), format);

    if (!newDT) {
      // If even this fails.. we're most likely just out of memory!
      NS_ABORT_OOM(BytesPerPixel(format) * 64 * 64);
    }
  }

So the condition that determines that we hit the NS_ABORT_OOM, is that CreateSimilarDrawTarget failed. There seem to be many ways that that could fail.

To make more progress, I would need to be able to reproduce, but I can't.
Flags: needinfo?(bjacob)
So BenWa helped me dump display lists, and this looks like a layoutish bug where we are not clipping correctly to the viewport dimensions a web page that has some huge / faraway pieces, and end up passing those to gfx where things blow up.

Here's the stack to the first point of failure in gfx:

Breakpoint 1, gfxASurface::CheckSurfaceSize (sz=..., limit=32767) at /home/bjacob/hack/mozilla-central/gfx/thebes/gfxASurface.cpp:394
394	        NS_WARNING("Surface size too large (exceeds caller's limit)!");
(gdb) bt
#0  gfxASurface::CheckSurfaceSize (sz=..., limit=32767) at /home/bjacob/hack/mozilla-central/gfx/thebes/gfxASurface.cpp:394
#1  0x00007f438d4db147 in CreatePixmap (screen=0x7f4382866300, size=..., depth=32, relatedDrawable=0)
    at /home/bjacob/hack/mozilla-central/gfx/thebes/gfxXlibSurface.cpp:103
#2  0x00007f438d4db4e4 in gfxXlibSurface::Create (screen=0x7f4382866300, format=0x7f43828fc0f0, size=..., relatedDrawable=0)
    at /home/bjacob/hack/mozilla-central/gfx/thebes/gfxXlibSurface.cpp:219
#3  0x00007f438d4d7340 in gfxPlatformGtk::CreateOffscreenSurface (this=0x7f437ad65780, size=..., contentType=COLOR_ALPHA)
    at /home/bjacob/hack/mozilla-central/gfx/thebes/gfxPlatformGtk.cpp:105
#4  0x00007f438d38e0b8 in mozilla::layers::TextureClientX11::AllocateForSurface (this=0x7f4375b6ef00, aSize=..., 
    aTextureFlags=mozilla::layers::ALLOC_CLEAR_BUFFER) at /home/bjacob/hack/mozilla-central/gfx/layers/basic/TextureClientX11.cpp:111
#5  0x00007f438d409880 in mozilla::layers::TextureClient::CreateForDrawing (aAllocator=0x7f4379b7e0c0, aFormat=mozilla::gfx::B8G8R8A8, 
    aSize=..., aMoz2DBackend=mozilla::gfx::CAIRO, aTextureFlags=mozilla::layers::ALLOW_REPEAT, 
    aAllocFlags=mozilla::layers::ALLOC_CLEAR_BUFFER) at /home/bjacob/hack/mozilla-central/gfx/layers/client/TextureClient.cpp:335
#6  0x00007f438d4095fa in mozilla::layers::CompositableClient::CreateTextureClientForDrawing (this=0x7f4375b4a7b0, 
    aFormat=mozilla::gfx::B8G8R8A8, aSize=..., aMoz2DBackend=mozilla::gfx::NONE, aTextureFlags=mozilla::layers::ALLOW_REPEAT, 
    aAllocFlags=mozilla::layers::ALLOC_CLEAR_BUFFER) at /home/bjacob/hack/mozilla-central/gfx/layers/client/CompositableClient.cpp:206
#7  0x00007f438d40a757 in mozilla::layers::ContentClientRemoteBuffer::CreateBackBuffer (this=0x7f4375b4a7b0, aBufferRect=...)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ContentClient.cpp:302
#8  0x00007f438d40a665 in mozilla::layers::ContentClientRemoteBuffer::BuildTextureClients (this=0x7f4375b4a7b0, 
    aFormat=mozilla::gfx::B8G8R8A8, aRect=..., aFlags=1) at /home/bjacob/hack/mozilla-central/gfx/layers/client/ContentClient.cpp:295
#9  0x00007f438d40a963 in mozilla::layers::ContentClientRemoteBuffer::CreateBuffer (this=0x7f4375b4a7b0, aType=COLOR_ALPHA, aRect=..., 
    aFlags=1, aBlackDT=0x7fff7b1461b8, aWhiteDT=0x7fff7b1461b0)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ContentClient.cpp:331
#10 0x00007f438d40abdb in non-virtual thunk to mozilla::layers::ContentClientRemoteBuffer::CreateBuffer(gfxContentType, nsIntRect const&, unsigned int, mozilla::RefPtr<mozilla::gfx::DrawTarget>*, mozilla::RefPtr<mozilla::gfx::DrawTarget>*) () at Unified_cpp_gfx_layers2.cpp:348
#11 0x00007f438d3b8bef in mozilla::layers::RotatedContentBuffer::BeginPaint (this=0x7f4375b4a7d8, aLayer=0x7f4375b63000, aFlags=4)
    at /home/bjacob/hack/mozilla-central/gfx/layers/RotatedBuffer.cpp:646
#12 0x00007f438d41b89c in mozilla::layers::ContentClientRemoteBuffer::BeginPaintBuffer (this=0x7f4375b4a7b0, aLayer=0x7f4375b63000, 
    aFlags=4) at ../../dist/include/mozilla/layers/ContentClient.h:214
#13 0x00007f438d4066c8 in mozilla::layers::ClientThebesLayer::PaintThebes (this=0x7f4375b63000)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ClientThebesLayer.cpp:55
#14 0x00007f438d406ddd in mozilla::layers::ClientThebesLayer::RenderLayerWithReadback (this=0x7f4375b63000, aReadback=0x7fff7b146600)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ClientThebesLayer.cpp:132
#15 0x00007f438d406fbf in non-virtual thunk to mozilla::layers::ClientThebesLayer::RenderLayerWithReadback(mozilla::layers::ReadbackProcessor*) () at Unified_cpp_gfx_layers2.cpp:134
#16 0x00007f438d429613 in mozilla::layers::ClientContainerLayer::RenderLayer (this=0x7f4375b62000)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ClientContainerLayer.h:69
#17 0x00007f438d42975c in non-virtual thunk to mozilla::layers::ClientContainerLayer::RenderLayer() () at Unified_cpp_gfx_layers2.cpp:76
#18 0x00007f438d41c985 in mozilla::layers::ClientLayer::RenderLayerWithReadback (this=0x7f4375b622d0, aReadback=0x7fff7b146710)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ClientLayerManager.h:337
#19 0x00007f438d429613 in mozilla::layers::ClientContainerLayer::RenderLayer (this=0x7f4374056000)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ClientContainerLayer.h:69
#20 0x00007f438d42975c in non-virtual thunk to mozilla::layers::ClientContainerLayer::RenderLayer() () at Unified_cpp_gfx_layers2.cpp:76
#21 0x00007f438d404caa in mozilla::layers::ClientLayerManager::EndTransactionInternal (this=0x7f437ade8500, aCallback=
    0x7f438f301740 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*)>, aCallbackData=0x7fff7b147140)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ClientLayerManager.cpp:220
#22 0x00007f438d404e5b in mozilla::layers::ClientLayerManager::EndTransaction (this=0x7f437ade8500, 
    aCallback=0x7f438f301740 <mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*)>, aCallbackData=0x7fff7b147140, 
    aFlags=mozilla::layers::LayerManager::END_NO_REMOTE_COMPOSITE)
    at /home/bjacob/hack/mozilla-central/gfx/layers/client/ClientLayerManager.cpp:246
#23 0x00007f438f374e20 in nsDisplayList::PaintForFrame (this=0x7fff7b147098, aBuilder=0x7fff7b147140, aCtx=0x0, aForFrame=0x7f437414c450, 
    aFlags=13) at /home/bjacob/hack/mozilla-central/layout/base/nsDisplayList.cpp:1340
#24 0x00007f438f374418 in nsDisplayList::PaintRoot (this=0x7fff7b147098, aBuilder=0x7fff7b147140, aCtx=0x0, aFlags=13)
    at /home/bjacob/hack/mozilla-central/layout/base/nsDisplayList.cpp:1188
#25 0x00007f438f393487 in nsLayoutUtils::PaintFrame (aRenderingContext=0x0, aFrame=0x7f437414c450, aDirtyRegion=..., aBackstop=0, 
    aFlags=772) at /home/bjacob/hack/mozilla-central/layout/base/nsLayoutUtils.cpp:3059
#26 0x00007f438f2cb6ee in PresShell::Paint (this=0x7f4371a68c00, aViewToPaint=0x7f437161b270, aDirtyRegion=..., aFlags=1)
    at /home/bjacob/hack/mozilla-central/layout/base/nsPresShell.cpp:6212
#27 0x00007f438e8717ae in nsViewManager::ProcessPendingUpdatesPaint (this=0x7f4371a0c180, aWidget=0x7f437ade7f00)
    at /home/bjacob/hack/mozilla-central/view/nsViewManager.cpp:443
#28 0x00007f438e87145e in nsViewManager::ProcessPendingUpdatesForView (this=0x7f4371a0c180, aView=0x7f437161b270, aFlushDirtyRegion=true)
    at /home/bjacob/hack/mozilla-central/view/nsViewManager.cpp:384
#29 0x00007f438e8725a4 in nsViewManager::ProcessPendingUpdates (this=0x7f4371a0c180)
    at /home/bjacob/hack/mozilla-central/view/nsViewManager.cpp:1075
#30 0x00007f438f2e9fc4 in nsRefreshDriver::Tick (this=0x7f4371a65800, aNowEpoch=1409174282532664, aNowTime=...)
    at /home/bjacob/hack/mozilla-central/layout/base/nsRefreshDriver.cpp:1280
#31 0x00007f438f2efa9c in mozilla::RefreshDriverTimer::TickDriver (driver=0x7f4371a65800, jsnow=1409174282532664, now=...)
    at /home/bjacob/hack/mozilla-central/layout/base/nsRefreshDriver.cpp:171
#32 0x00007f438f2ef972 in mozilla::RefreshDriverTimer::Tick (this=0x7f4377d52300)
    at /home/bjacob/hack/mozilla-central/layout/base/nsRefreshDriver.cpp:162
#33 0x00007f438f2ef821 in mozilla::RefreshDriverTimer::TimerTick (aTimer=0x7f4379b02740, aClosure=0x7f4377d52300)
    at /home/bjacob/hack/mozilla-central/layout/base/nsRefreshDriver.cpp:188
#34 0x00007f438c2026f2 in nsTimerImpl::Fire (this=0x7f4379b02740) at /home/bjacob/hack/mozilla-central/xpcom/threads/nsTimerImpl.cpp:618
#35 0x00007f438c202b51 in nsTimerEvent::Run (this=0x7f4377d7e110) at /home/bjacob/hack/mozilla-central/xpcom/threads/nsTimerImpl.cpp:711
#36 0x00007f438c1fd64a in nsThread::ProcessNextEvent (this=0x7f43828c84f0, aMayWait=false, aResult=0x7fff7b14831e)
---Type <return> to continue, or q <return> to quit---q
 at /home/bjacob/hack/mozQuit
(gdb) p sz
$1 = (const nsIntSize &) @0x7fff7b1459c0: {<mozilla::gfx::BaseSize<int, nsIntSize>> = {width = 121278, height = 50}, <No data fields>}
(gdb) p 121278*60
$2 = 7276680


^ See the size above, 121278 px, times 60 == 7276680 layout units.

Now the display lists, showing the same size up to 0.5 px difference, 7276650:

Painting --- before optimization (dirty -7183110,0,7276650,98040):
CanvasBackgroundColor p=0x7f4371a6b020 f=0x7f437414cbd0(Canvas(html)(-1)) bounds(0,0,93540,98040) visible(0,0,93540,98040) componentAlpha(0,0,0,0) clip(0,0,93540,98040)  uniform (opaque 0,0,93540,98040) (rgba 255,255,255,255)
WrapList p=0x7f4375b61c20 f=0x7f43750a23b0(HTMLScroll(body)(2)) bounds(0,0,10860,1200) visible(0,0,93540,98040) componentAlpha(-60,0,10920,1200) clip() 
  Text p=0x7f4371a6b088 f=0x7f4376826020(Text(0)"Grooveshark Mobile Applications") bounds(-60,0,10920,1200) visible(0,0,93540,98040) componentAlpha(-60,0,10920,1200) clip(0,0,93540,98040) 
BackgroundColor p=0x7f4375b60308 f=0x7f43768e7e30(HTMLScroll(div)(7)) z=300 bounds(0,3000,93540,95040) visible(0,3000,93540,95040) componentAlpha(0,0,0,0) clip(0,3000,93540,95040)  uniform (opaque 0,3000,93540,95040) (rgba 0.972549,0.435294,0.019608,1.000000)
nsDisplayTransform p=0x7f4375b601b0 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(-7183170,0,7276710,3000) visible(-7183110,0,7276650,3000) componentAlpha(-7183170,0,7276710,3000) clip(0,0,93540,98040) [ I ]
  BackgroundColor p=0x7f4371a6b0f8 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(0,0,93540,3000) visible(-7183110,0,7276650,3000) componentAlpha(0,0,0,0) clip(0,0,93540,2940)  uniform (opaque 0,0,93540,3000) (rgba 0.200000,0.200000,0.200000,1.000000)
  Background p=0x7f4371a6b188 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(0,0,93540,2940) visible(-7183110,0,7276650,3000) componentAlpha(0,0,0,0) clip(0,0,93540,2940)  (opaque 0,0,93540,2940)
  Border p=0x7f4371a6b238 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(0,2940,93540,60) visible(-7183110,0,7276650,3000) componentAlpha(0,0,0,0) clip() 
  Border p=0x7f4371a6b280 f=0x7f43768db0a8(Block(div)(3)) bounds(27150,600,60,1800) visible(-7183110,600,7210320,1800) componentAlpha(0,0,0,0) clip() 
  WrapList p=0x7f4375b60120 f=0x7f43768db9d8(Block(a)(1)) bounds(-7183170,600,7210320,1800) visible(-7183110,600,7210320,1800) componentAlpha(-7183170,900,4320,1200) clip() 
    Background p=0x7f4371a6b2e8 f=0x7f43768db9d8(Block(a)(1)) bounds(17070,900,7920,1260) visible(-7183110,600,7210320,1800) componentAlpha(0,0,0,0) clip(16170,600,10920,1800) 
    Border p=0x7f4371a6b398 f=0x7f43768db9d8(Block(a)(1)) bounds(27090,600,60,1800) visible(-7183110,600,7210320,1800) componentAlpha(0,0,0,0) clip() 
    Text p=0x7f4375b60020 f=0x7f43768e7418(Text(0)"Grooveshark") bounds(-7183170,900,4320,1200) visible(-7183110,600,7210320,1800) componentAlpha(-7183170,900,4320,1200) clip() 
  Background p=0x7f4375b60070 f=0x7f43768e74f8(Block(i)(3)) bounds(0,0,0,0) visible(25290,1020,960,960) componentAlpha(0,0,0,0) clip(25290,1020,960,960) 
Painting --- after optimization:
CanvasBackgroundColor p=0x7f4371a6b020 f=0x7f437414cbd0(Canvas(html)(-1)) bounds(0,0,93540,98040) visible(0,0,93540,98040) componentAlpha(0,0,0,0) clip(0,0,93540,98040)  uniform (opaque 0,0,93540,98040) (rgba 255,255,255,255)
WrapList p=0x7f4375b61c20 f=0x7f43750a23b0(HTMLScroll(body)(2)) bounds(0,0,10860,1200) visible(0,0,93540,98040) componentAlpha(-60,0,10920,1200) clip() 
  Text p=0x7f4371a6b088 f=0x7f4376826020(Text(0)"Grooveshark Mobile Applications") bounds(-60,0,10920,1200) visible(0,0,93540,98040) componentAlpha(-60,0,10920,1200) clip(0,0,93540,98040) 
BackgroundColor p=0x7f4375b60308 f=0x7f43768e7e30(HTMLScroll(div)(7)) z=300 bounds(0,3000,93540,95040) visible(0,3000,93540,95040) componentAlpha(0,0,0,0) clip(0,3000,93540,95040)  uniform (opaque 0,3000,93540,95040) (rgba 0.972549,0.435294,0.019608,1.000000)
nsDisplayTransform p=0x7f4375b601b0 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(-7183170,0,7276710,3000) visible(-7183110,0,7276650,3000) componentAlpha(-7183170,0,7276710,3000) clip(0,0,93540,98040) [ I ]
  BackgroundColor p=0x7f4371a6b0f8 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(0,0,93540,3000) visible(-7183110,0,7276650,3000) componentAlpha(0,0,0,0) clip(0,0,93540,2940)  uniform (opaque 0,0,93540,3000) (rgba 0.200000,0.200000,0.200000,1.000000)
  Background p=0x7f4371a6b188 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(0,0,93540,2940) visible(-7183110,0,7276650,3000) componentAlpha(0,0,0,0) clip(0,0,93540,2940)  (opaque 0,0,93540,2940)
  Border p=0x7f4371a6b238 f=0x7f4376827a60(Block(div)(5)) z=400 bounds(0,2940,93540,60) visible(-7183110,0,7276650,3000) componentAlpha(0,0,0,0) clip() 
  Border p=0x7f4371a6b280 f=0x7f43768db0a8(Block(div)(3)) bounds(27150,600,60,1800) visible(-7183110,600,7210320,1800) componentAlpha(0,0,0,0) clip() 
  WrapList p=0x7f4375b60120 f=0x7f43768db9d8(Block(a)(1)) bounds(-7183170,600,7210320,1800) visible(-7183110,600,7210320,1800) componentAlpha(-7183170,900,4320,1200) clip() 
    Background p=0x7f4371a6b2e8 f=0x7f43768db9d8(Block(a)(1)) bounds(17070,900,7920,1260) visible(-7183110,600,7210320,1800) componentAlpha(0,0,0,0) clip(16170,600,10920,1800) 
    Border p=0x7f4371a6b398 f=0x7f43768db9d8(Block(a)(1)) bounds(27090,600,60,1800) visible(-7183110,600,7210320,1800) componentAlpha(0,0,0,0) clip() 
    Text p=0x7f4375b60020 f=0x7f43768e7418(Text(0)"Grooveshark") bounds(-7183170,900,4320,1200) visible(-7183110,600,7210320,1800) componentAlpha(-7183170,900,4320,1200) clip() 
  Background p=0x7f4375b60070 f=0x7f43768e74f8(Block(i)(3)) bounds(0,0,0,0) visible(25290,1020,960,960) componentAlpha(0,0,0,0) clip(25290,1020,960,960) 
Painting --- retained layer tree:
ClientLayerManager (0x7f437ade8500)
  ClientContainerLayer (0x7f4374056000) [visible=< (x=0, y=0, w=1559, h=1634); >] [opaqueContent] [metrics={ cb=(x=0.000000, y=0.000000, w=1559.000000, h=1634.000000) sr=(x=0.000000, y=0.000000, w=1559.000000, h=1634.000000) s=(x=0.000000, y=0.000000) dp=(x=0.000000, y=0.000000, w=0.000000, h=0.000000) cdp=(x=0.000000, y=0.000000, w=0.000000, h=0.000000) scrollId=3 z=1.000 }]
    ClientThebesLayer (0x7f4374056400) [clip=(x=0, y=0, w=0, h=0)] [not visible]
    ClientColorLayer (0x7f4374056800) [visible=< (x=0, y=0, w=1559, h=1634); >] [opaqueContent] [color=rgba(255, 255, 255, 1)]
Flags: needinfo?(davidp99)
Flags: needinfo?(tnikkel)
Flags: needinfo?(matt.woodrow)
(In reply to Benoit Jacob [:bjacob] from comment #4)
>     Text p=0x7f4375b60020 f=0x7f43768e7418(Text(0)"Grooveshark")
> bounds(-7183170,900,4320,1200) visible(-7183110,600,7210320,1800)
> componentAlpha(-7183170,900,4320,1200) clip() 

Assuming the transform that contains this item is reasonable this display item should never have been created. We should have hit the early return in nsFrame::BuildDisplayListForChild at http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsFrame.cpp#2277
Flags: needinfo?(tnikkel)
Attached file log of rects (obsolete) —
Flags: needinfo?(tnikkel)
Attached patch log of rectsSplinter Review
Attachment #8480226 - Attachment is obsolete: true
Note that the transform here has an identity matrix [I]

It looks like the crazy values come from right at the start, the dirty rect has a width of 7276650. I'd expect that to be clipped to the size of the tab area.
Flags: needinfo?(matt.woodrow)
I just tested on nightly 35.0a1 (2014-09-03) and I was able to load grooveshark.com and listen to music. I'm not sure this is still a real issue. Tested on OS X.
(In reply to Mason Chang [:mchang] from comment #9)
> I just tested on nightly 35.0a1 (2014-09-03) and I was able to load
> grooveshark.com and listen to music. I'm not sure this is still a real
> issue. Tested on OS X.

Were you able to reproduce the crash on earlier versions? I just tested on nightly 35.0a1 and was able to reproduce the crash per the instructions in Comment 0. I assure you this is still an issue.
I tried a custom built nightly, mozilla-central 203417:acbdce59da2f on ubuntu and I can reproduce. Interesting that I can't reproduce on mac.
Attached file Display List dump
A log of the display list with some assertions failing then the sigsegv being handled.
I'm trying to dig through this but lots of things are happening that I don't understand yet. Here's what I have so far:

(gdb) f 3
#3  0x00007f1d22e09038 in nsLayoutUtils::PaintFrame (aRenderingContext=0x0, aFrame=0x7f1d08846450, aDirtyRegion=..., aBackstop=0, aFlags=772)
    at /home/masonchang/Projects/mozilla-central-desktop/layout/base/nsLayoutUtils.cpp:2905
2905	    aFrame->BuildDisplayListForStackingContext(&builder, dirtyRect, &list);
(gdb) print dirtyRect
$14 = {<mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>> = {x = -7198950, y = 0, width = 7260810, height = 33480}, <No data fields>}
(gdb) print aFrame->mRect
$15 = {<mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>> = {x = 0, y = 0, width = 61860, height = 33480}, <No data fields>}

We seem to be storing a very large overflow area here - http://dxr.mozilla.org/mozilla-central/source/layout/generic/nsHTMLReflowMetrics.h?from=nsOverflowAreas&case=true#37 - Which I'm not sure where it's set. I have to track that down.

From comment 5, that branch is never taken. According to gdb, I'm getting this:

$18 = (nsIFrame *) 0x7f1d08846ee0
(gdb) print child->GetStateBits()
$19 = 566248488304656
(gdb) print (child->GetStateBits() & NS_FRAME_FORCE_DISPLAY_LIST_DESCEND_INTO)
$21 = 1099511627776
(gdb) print !(child->GetStateBits() & NS_FRAME_FORCE_DISPLAY_LIST_DESCEND_INTO)
$22 = false

If anyone knows, is the error here that we have such a large dirty rect? Or is the problem that such large dirty rects should be ignored later in the process? Or that we should've never had or saved such a large rect in the first place?
I assume the dirtyRect comes from the call to GetVisualOverflowRectRelativeToSelf()?

For the non-e10s case, I'd expect the root frame to be treated as clipped such that the overflow area and the frame rect are the same thing.

I guess this isn't happening, so we're including the overflow area of descendents that is outside of the frame rect. We're not actually going to paint anything outside of the mozbrowser frame area, so we should make sure overflow computation reflects that.

roc or tn would probably know where this should happen.
(In reply to Matt Woodrow (:mattwoodrow) from comment #14)
> I assume the dirtyRect comes from the call to
> GetVisualOverflowRectRelativeToSelf()?
> 
> For the non-e10s case, I'd expect the root frame to be treated as clipped
> such that the overflow area and the frame rect are the same thing.

So I looked at the non-e10s case, and the visual overflow still exists. This seems to be occuring during reflow. Can you explain a bit more what you mean by this? So overflow areas that are much larger than the root frame are ok, just we should be clipping them to be the same? 

> 
> I guess this isn't happening, so we're including the overflow area of
> descendents that is outside of the frame rect. We're not actually going to
> paint anything outside of the mozbrowser frame area, so we should make sure
> overflow computation reflects that.

Are we supposed to be keeping the overflow area intact throughout the whole process then clip at the end and update the overflow computation? Or is the overflow computation always accurate and we clip when we actually need to paint?

> roc or tn would probably know where this should happen.

tn: Do you know where this should be happening? Thanks!
Flags: needinfo?(tnikkel)
Flags: needinfo?(tnikkel)
We should be able to modify the overflow area calculation in ViewportFrame::Reflow pretty easily to reflect this if that is what we do indeed want. Or maybe we just want to use frame's regular rect in nsLayoutUtils::PaintFrame?
Flags: needinfo?(tnikkel) → needinfo?(roc)
(In reply to Timothy Nikkel (:tn) from comment #16)
> We should be able to modify the overflow area calculation in
> ViewportFrame::Reflow pretty easily to reflect this if that is what we do
> indeed want. Or maybe we just want to use frame's regular rect in
> nsLayoutUtils::PaintFrame?

No.

The overflow area of the ViewportFrame should be limited to its border-box normally. The only case where that might not happen is if someone did frameLoader.clipSubdocument = false somewhere ... but that shouldn't happen here. The ViewportFrame should have all its descendants (other than scrollbars/scrollcorner) clipped to the scrollport.

A frame dump would show us which frame(s) are incorrectly extending ViewportFrame's overflow area.
Flags: needinfo?(roc)
There's a frame dump here already, the element that is very far off to the left is 7f0e1742cc38. It's in a fixed pos element. Looking at nsViewportFrame::Reflow it looks like we don't do anything special with the overflow areas we get from reflowing fixed pos frames. So it seems like we are getting what would be expected given the current code.
Attached file Sample Page
This small page causes almost the same behavior as grooveshark.com. On nightly on linux without e10s, Firefox doesn't actually crash, but it can't close until I kill a dead process. On e10s, it doesn't immediately crash, but once I load a new tab and load any new page, the tab will crash and the plugin-container will die. Thanks to tn for the idea of the small test case.
So should we just limit what fixed children can contribute to the overflow area in nsViewportFrame::Reflow then?
Flags: needinfo?(roc)
Interestingly, on the non-e10s case, we never hit the assertion from comment 1. Still digging to see where / why we diverge.
We don't call nsLayoutUtils::PaintFrame on content documents in non-e10s. In non-e10s mode the content document will be a subdocument, and so it will use nsSubDocumentFrame::BuildDisplayList, which gets passed in a dirty rect instead of using the visual overflow rect of its root frame. So it gets a sane dirty rect.
So even in the non-e10s case, we still call nsLayoutUtils::PaintFrame here - http://dxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell.cpp?from=nsPresShell.cpp&case=true#6224. We also still find a dirty rect from using the visual overflow rect here - http://dxr.mozilla.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp?from=nsLayoutUtils.cpp&case=true#2822. The difference is that when we call nsLayoutUtil::PaintFrame on the non-e10s case, we get paint a different frame from the nsViewportFrame::Reflow that has the non-clipped offset. For the e10s case, we paint the same nsViewportFrame. We also get different nsPresShells, which I'm not sure is by design or this is a bug.
nsLayoutUtils::PaintFrame is called on display root frames. Display root frames are generally the root frame in their process. In non-e10s the display root frame is the root frame of the chrome document (it contains the tabs and the address bar, etc). In the e10s case both the root chrome document and the root content document are display roots (because the root content document is now the root document in it's own process). So in the e10s case we will get nsLayoutUtils::PaintFrame called on the root frame of the root chrome document and the root content document.
(In reply to Timothy Nikkel (:tn) from comment #20)
> So should we just limit what fixed children can contribute to the overflow
> area in nsViewportFrame::Reflow then?

The fixed-pos frame's contribution to the viewport frame's overflow area should be clipped to the scrollport by the scrollframe.
Flags: needinfo?(roc)
Rendering of fixed-pos items is clipped to the scrollport since we paint through the placeholder which is a descendant of the scrollframe.

So the contributions of fixed-pos frames to the overflow area of the viewport should also be clipped to the scrollport, although it would be fine to just clip them to the viewport frame itself. But only if the scrollframe is actually doing clipping! Need to check IsIgnoringViewportClipping somewhere.
Assignee: wmccloskey → mchang
Status: NEW → ASSIGNED
This WIP works with the sample page that is listed in this bug. However, grooveshark.com still crashes as a dirty rect with the overflow text is still getting to nsLayoutUtils::PaintFrame.
We still have visual overflow areas of -7198800 at line 7f90a509de58, even though the top FixedList at 7f909ecb7660 seems to have been clipped correctly. This still becomes part of the DirtyRect at http://dxr.mozilla.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp#2888. It gets passed through down to http://dxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxContext.cpp?from=gfxContext.cpp&case=true#1592, which causes the crash.
So actually it seems like it's a race condition. Sometimes, we still have the unclipped visual overflow area and we crash. Sometimes, the visual overflow area is clipped correctly and we segfault here - http://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetCairo.cpp#171 - Which is called here - http://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetCairo.cpp#321. The stride that's passed in is 2048 and the stride that's usually chosen by Cairo is 4096, which is interesting. On non-e10s, we never hit this code path. The rect being passed in is:

(gdb) print aRect
$4 = (const mozilla::gfx::IntRect &) @0x7fff87d1d0d0: {<mozilla::gfx::BaseRect<int, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits>, mozilla::gfx::IntPointTyped<mozilla::gfx::UnknownUnits>, mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits>, mozilla::gfx::IntMarginTyped<mozilla::gfx::UnknownUnits> >> = {x = 0, y = 0, width = 1024, 
    height = 621}, <mozilla::gfx::UnknownUnits> = {<No data fields>}, <No data fields>}

Which doesn't seem unreasonable.
I found that we're reading past the source buffer here - http://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetCairo.cpp?from=DrawTargetCairo.cpp&case=true#171. aStride is 2048 and aRect is 1024 * 4 bytes per pixel, so we're copying 4096 bytes at a time from a buffer allocated with a stride in increments of 2048, which segfaults, which seems like a bug. Not sure if the real bug though is that we fail here - http://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetCairo.cpp#315. Or both. If i insert an assertion MOZ_ASSERT(aRect.width * pixelWidth > aStride), it fails exactly at the right spot. Any thoughts Matt? Thanks!
Flags: needinfo?(matt.woodrow)
Although my normal nightly build seems to hit this bug only on grooveshark, I hit it constantly in my development builds. It's starting to make it hard to do work since I crash before I can test things. Based on what we know now, is there a workaround that would at least make the crashes go away? I just need something to use locally.
You should be able to get around it temporarily by changing this line http://mxr.mozilla.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp#2853  Specifically change
  visibleRegion = aFrame->GetVisualOverflowRectRelativeToSelf();
to
  visibleRegion = aFrame->GetRectRelativeToSelf();
in nsLayoutUtils::PaintFrame.
Attached patch grooveshark.patch (obsolete) — Splinter Review
I have a patch that works with GrooveShark.com on Linux, Ubuntu 14.04 LTS. I have never been able to reproduce this on OS X. There were three bugs that occured with any determinism that are fixed in this patch.

1) We had a memcpy error where we were reading past the source buffer here - http://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetCairo.cpp?from=DrawTargetCairo.cpp&case=true#171. This seems like a real memory bug and we may want to uplift this. I'm going to wait for Matt's response in comment 31.
2) We clip the Viewport's overflow bounds to the root scollable's scrollport in ViewportFrame::Reflow from comment 26.
3) We had intermittent crashes, which manifested itself a lot on release builds, but not debug builds after doing #2. This is because some JS would flush restyles, triggering us to flush overflow changes here - http://dxr.mozilla.org/mozilla-central/source/layout/base/RestyleTracker.h?from=RestyleTracker.h&case=true#131 overwriting the previously clipped ViewportFrame::Reflow overflow areas. I fixed this by not updating overflow areas during restyle if the current frame is a ViewportFrame and is the frame for a nsView.

I'm not very familiar with any of this code, so I'm fairly sure this patch probably isn't the right way to clip the viewport or the right place to prevent / edit the RestyleTracker. But I'm able to refresh GrooveShark multiple times in a row and play music whereas before it would crash reliably on first load. Any feedback on the right way to do this would be appreciated! Thanks!
Attachment #8490454 - Attachment is obsolete: true
Attachment #8492459 - Flags: feedback?(roc)
(In reply to Mason Chang [:mchang] from comment #31)
> I found that we're reading past the source buffer here -
> http://dxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetCairo.
> cpp?from=DrawTargetCairo.cpp&case=true#171. aStride is 2048 and aRect is
> 1024 * 4 bytes per pixel,

You mean aRect.width is 1024? That makes no sense! We should never have aStride < aRect.width*pixelWidth! That means that rows of the image overlap in memory!
Comment on attachment 8492459 [details] [diff] [review]
grooveshark.patch

Review of attachment 8492459 [details] [diff] [review]:
-----------------------------------------------------------------

The ViewportFrame::Reflow change looks right but the others need more work. So let's split the patch up.

::: gfx/2d/DrawTargetCairo.cpp
@@ +169,5 @@
>  
>    for (int32_t y = 0; y < aRect.height; ++y) {
>      memcpy(surfData + y * surfStride,
>             source + y * aStride,
> +           aStride);

This change looks like a regression. The underlying bug must be somewhere else.

::: layout/generic/nsFrame.cpp
@@ +7268,5 @@
> +    nsIScrollableFrame* rootScrollFrame = PresContext()->PresShell()->GetRootScrollFrameAsScrollable();
> +    if (rootScrollFrame && !rootScrollFrame->IsIgnoringViewportClipping()) {
> +      return false;
> +    }
> +  }

I think the real fix for this issue is for ViewportFrame to override UpdateOverflow.

::: layout/generic/nsViewportFrame.cpp
@@ +263,5 @@
> +                    aPresContext->PresShell()->GetRootScrollFrameAsScrollable();
> +    if (rootScrollFrame && !rootScrollFrame->IsIgnoringViewportClipping()) {
> +      nsRect clipRect = rootScrollFrame->GetScrollPortRect();
> +      aDesiredSize.mOverflowAreas.SetAllTo(clipRect);
> +    }

This looks right :-)
Attachment #8492459 - Flags: feedback?(roc)
Clip the overflow areas to the viewport scroll rect during reflow and prevent ViewportFrame::UpdateOverflow if this viewport is the PresShell->RootScrollFrame isn't ignoring the viewport clipping. We no longer hit a situation where we DrawTargetCairo with aStride < aRect.width * pixelWidth, but I added an assert there just in case.

Try build: https://tbpl.mozilla.org/?tree=Try&rev=4b96913e4a1f
Attachment #8492459 - Attachment is obsolete: true
Attachment #8493480 - Flags: review?(roc)
Flags: needinfo?(matt.woodrow)
Comment on attachment 8493480 [details] [diff] [review]
Clip Viewport Fixed Position Overflow to Scroll Frame

Review of attachment 8493480 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsViewportFrame.cpp
@@ +298,5 @@
>  
> +bool
> +ViewportFrame::UpdateOverflow()
> +{
> +  if (HasView() && (GetView()->GetFrame() == this)) {

This will always be true for ViewportFrames. Just take this condition out.
Attachment #8493480 - Flags: review?(roc) → review+
Carrying r+. Updated to fix a reftest failure on debug builds. New try build - https://tbpl.mozilla.org/?tree=Try&rev=3f32c4982248
Attachment #8493480 - Attachment is obsolete: true
Attachment #8493843 - Flags: review+
Try looks good except for Mulet builds, but Mulet builds seem to be failing on try in general. Adding checkin-needed.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/de388c070eaf
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: