Closed Bug 591358 Opened 15 years ago Closed 13 years ago

Crash [@ mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface(gfxContext*, gfxASurface::gfxContentType, gfxPoint*) ]

Categories

(Core :: Graphics, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla18

People

(Reporter: chofmann, Assigned: ajones)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression, reproducible, Whiteboard: [STR in comment 18])

Crash Data

Attachments

(4 files, 12 obsolete files)

696 bytes, patch
roc
: review+
Details | Diff | Splinter Review
1.48 KB, patch
gal
: review+
Details | Diff | Splinter Review
34.35 KB, patch
roc
: review+
Details | Diff | Splinter Review
33.87 KB, patch
roc
: review+
Details | Diff | Splinter Review
currently #20 non-plugin related top crash in beta4 http://crash-stats.mozilla.com/report/index/cd0ff616-c9e2-49b6-81af-602d12100827 Frame Module Signature [Expand] Source 0 xul.dll mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface gfx/layers/basic/BasicLayers.cpp:897 1 xul.dll mozilla::layers::BasicLayerManager::EndTransaction gfx/layers/basic/BasicLayers.cpp:963 2 xul.dll nsDisplayList::PaintForFrame layout/base/nsDisplayList.cpp:395 3 xul.dll nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:1406 4 xul.dll PresShell::Paint layout/base/nsPresShell.cpp:5934 5 xul.dll nsViewManager::RenderViews view/src/nsViewManager.cpp:459 6 xul.dll nsViewManager::Refresh view/src/nsViewManager.cpp:425 7 xul.dll nsViewManager::DispatchEvent view/src/nsViewManager.cpp:912 8 xul.dll AttachedHandleEvent view/src/nsView.cpp:192 9 xul.dll nsWindow::DispatchEvent widget/src/windows/nsWindow.cpp:3455 10 xul.dll nsWindow::DispatchWindowEvent widget/src/windows/nsWindow.cpp:3486 11 xul.dll nsWindow::OnPaint widget/src/windows/nsWindowGfx.cpp:563 12 xul.dll nsWindow::ProcessMessage widget/src/windows/nsWindow.cpp:4661 13 xul.dll nsWindow::WindowProcInternal widget/src/windows/nsWindow.cpp:4251 14 xul.dll nsWindow::WindowProc widget/src/windows/nsWindow.cpp:4206 15 user32.dll InternalCallWinProc 16 user32.dll UserCallWinProcCheckWow 17 user32.dll DispatchClientMessage 18 user32.dll __fnDWORD 19 ntdll.dll KiUserCallbackDispatcher 20 xul.dll nsWindow::DealWithPopups widget/src/windows/nsWindow.cpp:8084 21 user32.dll DispatchMessageW 22 xul.dll nsBaseAppShell::OnProcessNextEvent widget/src/xpwidgets/nsBaseAppShell.cpp:294 23 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:517 24 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:118 25 xul.dll xul.dll@0xb7a45b 26 xul.dll MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:219 27 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:202 28 xul.dll _SEH_epilog4 29 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:176 30 xul.dll nsBaseAppShell::Run widget/src/xpwidgets/nsBaseAppShell.cpp:175 31 xul.dll nsAppShell::Run widget/src/windows/nsAppShell.cpp:243 http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=mozilla%3A%3Alayers%3A%3ABasicLayerManager%3A%3APushGroup&date=08%2F27%2F2010%2008%3A22%3A30&range_value=1&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=mozilla%3A%3Alayers%3A%3ABasicLayerManager%3A%3APushGroupWithCachedSurface%28gfxContext*%2C%20gfxASurface%3A%3AgfxContentType%2C%20gfxPoint*%29 checking --- mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface.gfxContext 20100826-crashdata.csv found in: 4.0b4 4.0b3 4.0b2 4.0b5pre release total-crashes mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface.gfxContext crashes pct. all 291046 89 0.000305794 4.0b4 17931 63 0.00351347 4.0b3 6642 23 0.00346281 4.0b2 1733 2 0.00115407 4.0b5pre 1819 1 0.000549753 os breakdown mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface.gfxContextTotal 89 Win5.1 0.16 Win6.0 0.09 Win6.1 0.67 not too much interesting in the url list.... facebook, orkut, google, but this site showed up a couple of time http://knifetanks.com/main.html
and a few more game sites and some interesting animation http://www.choiceofgames.com/dragon/ https://www.leagueoflegends.com http://honsolgen.de/
increasing in rank as the beta popluation grows. #14 top crash for 4.0b4. nominating
blocking2.0: --- → ?
looks like the first instance of this crash could be http://crash-stats.mozilla.com/report/index/d01d3057-089a-4768-bb19-128702100718 Firefox 4.0b2pre build from 2010 07 16 040830 A lot of stuff landed in this area around then http://hg.mozilla.org/mozilla-central/log/900fdd7fb8b2/gfx/layers/basic/BasicLayers.cpp
maybe the part #20 pushgroup part of the retained layers landing is the place to start. Matt Woodrow - Bug 564991. Part 20: Performance win by avoiding PushGroup for single-layer opacity. r=vlad
Matt, does any of this look familiar/sadmaking?
Assignee: nobody → matt.woodrow+bugzilla
blocking2.0: ? → final+
Not particularly. Judging by the crash line it looks like an inlined version of InheritContextFlags is crashing with a NULL pointer. We already use aTarget earlier so mCachedSurface.Get() must be failing. The only way this can happen is if CreateSimilarSurface fails. Id guess at OOM, or clip.size having a 0 dimension. STR would be nice to catch this in a debugger. We could add a catch for an empty clip region and skip the whole drawing stage and hope that fixes it.
Removed accidental trailing whitespace
Attachment #471709 - Attachment is obsolete: true
Attachment #471710 - Flags: review?(roc)
Attachment #471709 - Flags: review?(roc)
Comment on attachment 471710 [details] [diff] [review] Don't attempt to draw when the clip is empty v2 This patch is safe enough and might conceivably help, I suppose. But I doubt this is the problem. I thought that creating a 0x0 surface in gfxASurface::CreateSimilarSurface would work fine and return a valid surface, is that not true?
Attachment #471710 - Flags: review?(roc) → review+
Something which hasn't been mentioned above is the fact that the crash happens with text zoom. See bug 604665.
Severity: normal → critical
Version: unspecified → Trunk
It's still there in the top #100 on beta7 and in the trunk.
blocking2.0: final+ → -
looks like its about #127 on beta8.
It is #72 top crasher in 4.0b10 over the last week (#3 top changer). Some comments in 4.0b10 say: "always happens when I leave facebook open and the computer goes to sleep." "Opened a second window to Google Image Search, and the screen filled with the image that was hovered over before crasching." "There is a problem with CanvasPattern, perhaps a leak. I create my CanvasPattern from a Canvas, and then I reuse it across maybe 1-2 minutes worth of frames drawn. At some point, on FF 3.6, all brushes in memory begin drawing as blanks, but no crash. On FF 4, painting ceases. If I move the window, I see the window rapidly run out of resources, and FF fails to paint the chrome. It stalls, and then crash. My workaround is to cache new brushes every frame and allow the old ones to gc." "Here's how to replicate this in your labs: Use a Win7 x64 pc with 12GB or more ram. Open up 50 or 100 browser windows and then let it sit there until it crashes. Monitor RAM usage of Firefox and you will see Firefox will eventually crash even thought there is plenty of free RAM available on the pc." Stack traces are different from the one in comment 0: 0 xul.dll mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface gfx/layers/basic/BasicLayers.cpp:1116 1 xul.dll mozilla::layers::BasicLayerManager::EndTransactionInternal gfx/layers/basic/BasicLayers.cpp:1264 2 xul.dll mozilla::layers::BasicShadowLayerManager::EndTransaction gfx/layers/basic/BasicLayers.cpp:2658 3 xul.dll nsDisplayList::PaintForFrame layout/base/nsDisplayList.cpp:527 4 xul.dll nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:1560 5 xul.dll PresShell::Paint layout/base/nsPresShell.cpp:6144 More reports at: https://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&signature=mozilla%3A%3Alayers%3A%3ABasicLayerManager%3A%3APushGroupWithCachedSurface%28gfxContext*%2C%20gfxASurface%3A%3AgfxContentType%2C%20gfxPoint*%29
(In reply to comment #9) > Comment on attachment 471710 [details] [diff] [review] > Don't attempt to draw when the clip is empty v2 > > This patch is safe enough and might conceivably help, I suppose. But I doubt > this is the problem. I thought that creating a 0x0 surface in > gfxASurface::CreateSimilarSurface would work fine and return a valid surface, > is that not true? It should be mostly true. The semantics of 0x0 surfaces in cairo are sort of undefined, but they should function ok.
I've been able to crash with this stack by right clicking a couple times on the new Preschool IE demo (http://ie.microsoft.com/testdrive/Performance/Preschool/Default.html). After the demo finished I tried to right click to get a context menu and hit the crash. https://crash-stats.mozilla.com/report/index/bp-06610840-af7e-49a7-838d-faf132110210
It is now #22 top crasher in 4.0b11.
Still reproduceable with the RC nightly (Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b13pre) Gecko/20110303 Firefox/4.0b13pre) with the same steps as comment 16 1. go to the preschool demo page 2. click on start and wait for the demo to finish 3. at the final black page with results keep right clicking around various points of the page to open the context menu, after some click it will crash https://crash-stats.mozilla.com/report/index/a18844a0-fe25-45e7-9bcd-f40d42110304
blocking2.0: - → ?
Keywords: reproducible
Summary: Firefox 4.0b2,3,4 Crash Report [@ mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface(gfxContext*, gfxASurface::gfxContentType, gfxPoint*) ] → Crash [@ mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface(gfxContext*, gfxASurface::gfxContentType, gfxPoint*) ]
Scoobidiver: you nominated this without explanation or updating its position in crash-stats. Please don't do that again.
> Scoobidiver: you nominated this without explanation or updating its position in > crash-stats. Please don't do that again. It has been un-nomimated on December when it was #100 top crasher and unreproducible. Now it is #62 top crasher in 4.0b12 (I though it was #22 according to comment 17) and is reproducible. It is my mistake, it shouldn't have been renominate.
update on the stats. About 300 crashes per day on latest betas. checking --- mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface.gfxContext...gfxASurface::gfxContentType,.gfxPoint.. 20110302-crashdata.csv found in: 4.0b12 4.0b11 4.0b4 4.0b7 4.0b13pre 4.0b10 4.0b9 4.0b3 4.0b2 release total-crashes mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface.gfxContext...gfxASurface::gfxContentType,.gfxPoint.. crashes pct. all 358047 181 0.00050552 4.0b12 67246 121 0.00179936 4.0b11 15135 43 0.0028411 4.0b4 687 6 0.00873362 4.0b7 2419 3 0.00124018 4.0b13pre 1586 3 0.00189155 4.0b10 5515 2 0.000362647 4.0b9 2052 1 0.000487329 4.0b3 430 1 0.00232558 4.0b2 781 1 0.00128041 could go higher depending on how heavily the IE Preschool demo is promoted. we do see that in the daily urls 6 http://www.mtv3.fi/ 3 http://wonder-tonic.com/zombie/ 3 http://ie.microsoft.com/testdrive/Performance/Preschool/Default.html 3 http://cs-clubs.org.ru/load/23 as well as crashes on a few other high profile sites. 12 http://www.facebook.com 8 http://www.mtv3.fi 6 \N// 5 jar:file:// 3 http://www.google.fr 3 http://wonder-tonic.com 3 http://search.conduit.com 3 http://ie.microsoft.com 3 http://i1.scribdassets.com 3 http://cs-clubs.org.ru if this is low risk it should be considered for ride-along or .x
> could go higher depending on how heavily the IE Preschool demo is promoted. I don't think this is an interesting point, I just used it in my steps because it allows me to reliably reproduce. I suppose it's activated by something that happens there and that could happen in other websites as well.
Whiteboard: [STR in comment 18]
blocking2.0: ? → .x+
(In reply to comment #22) > > could go higher depending on how heavily the IE Preschool demo is promoted. > > I don't think this is an interesting point, I just used it in my steps because > it allows me to reliably reproduce. I suppose it's activated by something that > happens there and that could happen in other websites as well. I agree that it might happen on a variety of sites. The interesting part of this is that if *any* of those sites start to get a higher volume of traffic then the crash numbers increase significantly. http://ie.microsoft.com/testdrive/Performance/Preschool/Default.html is one of those sites where microsoft might though evangelism and press outreach drive a lot more traffic.
(In reply to comment #23) > Matt, can you reproduce this? I haven't been able to reproduce a crash, no. Memory usage is interesting though, it climbs steadily (2mb/s approx) while on the results page and drops with every GC. Right clicking as fast as I can makes this spike even faster. I managed to raise memory usage from 200mb -> 500mb by doing this, could possibly be an issue on a RAM limited 32bit machine?
In fact it kinda locks up temporarily for me, if I click enough. It does look like this is a memory consumption issue. After some right-clicking I get surface-VRAM up to 250MB in about:memory. Maybe the clicking around is allocating lots of canvases and we're not triggering GC because we don't know they use a lot of memory? I seem to remember a bug about that.
You can use this to report the allocation and cause a more timely GC: JS_PUBLIC_API(void) JS_updateMallocCounter(JSContext *cx, size_t nbytes)
(In reply to comment #25) > I managed to raise memory usage from 200mb -> 500mb by doing this, could > possibly be an issue on a RAM limited 32bit machine? Well I have 4GB memory on 64bit Windows 7, and I have to click about 10 times to crash, but my gpu has only 256MB in case this is related.
It took some more tries, but I've been able to crash in a debug build too, I had to run the test 4 times and each time click at a lazy-but-constant rate for some seconds. The memory gorws for me up to 500MB too, but it's gfx/win32/surface, I don't have d2d indeed. I must note that when I didn't crash, PushGroupWithCachedSurface doesn't seem to be called, indeed I was never able to breakpoint here but on the crash. I got these values as soon as entering the method: aContent = CONTENT_COLOR_ALPHA aSavedOffset = {x=3.1452189656166499e+124 y=3.395193265594e-313#DEN } aTarget has RefCnt = 4, has a mCairo (with ref_count = 1, status = CAIRO_STATUS_SUCCESS, user_data has 0 elements, gstate_freelist is the only thing that seems to point to garbage). this->mRefCnt = 3 this->mKeepAlive is empty this->mPhase = PHASE_DRAWING this->mXResolution = 1.0000000 this->mYResolution = 1.0000000 this->mCachedSurface seems to be uninitialized (mSurface is a null pointer, any other data is garbage) this->mDoubleBuffering= BUFFER_BUFFERED this->mTransactionIncomplete = false everything seems to proceed fine until we reach this point: nsRefPtr<gfxContext> ctx = mCachedSurface.Get(aContent, gfxIntSize(clip.size.width, clip.size.height), currentSurf); mCachedSurface here has still a null pointer to mSurface, I see that now mSize is setup, but other values seem garbage. /* Align our buffer for the original surface */ ctx->Translate(-clip.pos); then here I get the crash.
fwiw, we crash because ctx is a null pointer, mCachedSurface.Get failed.
something I forgot, I see above Matt said this could happen if clip has no size, so this is clip: { pos {x=0.00000000000000000 y=0.00000000000000000 } gfxPoint size {width=1696.0000000000000 height=1026.0000000000000 } gfxSize }
Sorry for bugspam, I try to post everything I can to help. I can easily enter a loop of warnings with the same steps: WARNING: Nv3DVStreaming CoCreateInstance failed (disabled).: file c:/mozilla/mozilla-central/gfx/layers/d3d9/Nv3DVUtils.cpp, line 85 _create_dc_and_bitmap: Operazione completata. WARNING: Invalid type to RecordMemoryUsedForSurfaceType!: file c:/mozilla/mozilla-central/gfx/thebes/gfxASurface.cpp, line 561 WARNING: Invalid type to RecordMemoryUsedForSurfaceType!: file c:/mozilla/mozilla-central/gfx/thebes/gfxASurface.cpp, line 561 JavaScript error: , line 0: uncaught exception: [Exception... "Component returned failure code: 0x8007000e (NS_ERROR_OUT_OF_MEMORY) [nsIDOMHTMLCanvasElement.getContext]" nsresult: "0x8007000e (NS_ERROR_OUT_OF_MEMORY)" location: "JS frame :: http://ie.microsoft.com/testdrive/Performance/Preschool/Demo.js :: <TOP_LEVEL> :: line 3909" data: no] _create_dc_and_bitmap: Operazione completata. WARNING: Invalid type to RecordMemoryUsedForSurfaceType!: file c:/mozilla/mozilla-central/gfx/thebes/gfxASurface.cpp, line 561 The last warning is then repeated a bunch of times, looks like infinite times, sometimes with other canvas oom errors. Looks like D3D9 layers use a lot of memory here?
and to confirm Matt's OOM theory, in CreateSimilarSurface if (GetContentType() == CONTENT_COLOR_ALPHA) { // [...] surface = cairo_win32_surface_create_with_dib(cairo_format_t(gfxASurface::FormatFromContent(aContent)), aSize.width, aSize.height); surface.status = CAIRO_STATUS_NO_MEMORY then surface gets destroyed and this returns nsnull. Browser memory doesn't seem excessive, I presume has something to do with D3D9 video memory?
It is #27 top crasher in 4.0 over the last 3 days.
This demo is creating temporary canvas contexts to measure text size and we are greedily allocating the default backing surface size (300x150) every time. This patch reports the allocated size to the GC and seems to reduce the problem for me. Memory usage still grows while right clicking fast, but it's collected much more often and I couldn't get the numbers anywhere near as high. Andreas: Do I need to also call JS_updateMallocCounter with negative values when the memory is freed? nsContentUtils::GetCurrentJSContext can be null during our Reset callback, and I don't know enough about context lifetimes to attempt caching the pointer. I'm also working on a second part to skip the surface allocation until it is actually used.
Attachment #522887 - Flags: review?(gal)
(In reply to comment #35) > Andreas: Do I need to also call JS_updateMallocCounter with negative values > when the memory is freed? nsContentUtils::GetCurrentJSContext can be null > during our Reset callback, and I don't know enough about context lifetimes to > attempt caching the pointer. I don't think we do. The JS engine just observes the memory churn and will decide to GC after we've allocated "lots" of data.
WIP patch. Appears to work on mac, need to test it properly.
Attachment #522887 - Flags: review?(gal) → review+
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.2
It is #25 top crasher in 4.0. What is the threshold to get if fixed in 2.0.x?
blocking2.0: .x+ → ?
Depends on: 656947
It still happens in 4.0.1 (#13 top crasher), 5.0b2 (#28 top crasher) and 6.0a1 (#58 top crasher), so it is not fixed. More reports at: https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Alayers%3A%3ABasicLayerManager%3A%3APushGroupWithCachedSurface%28gfxContext*%2C%20gfxASurface%3A%3AgfxContentType%2C%20gfxPoint*%29
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Crash Signature: [@ mozilla::layers::BasicLayerManager::PushGroupWithCachedSurface(gfxContext*, gfxASurface::gfxContentType, gfxPoint*) ]
Ehsan, we are still seeing this bug in 6.0.
(In reply to comment #41) > Ehsan, we are still seeing this bug in 6.0. I only happened to land this patch on mozilla-central, and I don't know anything about this bug...
The central link shows only one of the three patches as fixed. Matt, do we need to get reviews and land additional pieces to clean up here?
Blocks: 672114
The one patch that landed should have been all that was needed to fix the crash. I'm really not sure how this is still crashing, as the GC should be freeing unused contexts. I can relook at the third patch to reduce the memory impact of canvas objects that aren't used for any drawing.
It is #12 top browser crasher in 5.0 and #15 in 6.0b2.
blocking2.0: ? → ---
Ok, so this is still happening on 6.0 and we get enough of these that I think we need someone to figure out what is going on. It looks like those patches did not fix the crash. Maybe it's a different problem but I can't find any data that indicate we ever got rid of it and it's back due to some other issue. JP, can we get someone on this?
Yes, we actually already discussed at the gfx meeting, Matt W. is going to look at it.
Rebased this patch against tip. Fixed any bugs I found running layout/reftests/canvas. Pushed to Try as: http://tbpl.allizom.org/?tree=Try&usebuildbot=1&rev=1dcb4e41531e I'm assuming that the underlying bug here is the same as we determined earlier. Hopefully this patch to only allocate a full sized surface when it's actually need will help significantly. We probably need a version of this for Azure too.
Attachment #522952 - Attachment is obsolete: true
Attachment #555320 - Flags: review?(roc)
Comment on attachment 555320 [details] [diff] [review] Part 2: Don't create backing surfaces until they are needed v2 Review of attachment 555320 [details] [diff] [review]: ----------------------------------------------------------------- This basically looks good. r+ with the comments below. I think we should land this, because it improves perf on the Preschool test and maybe other sites. However, I'll be surprised if it affects this crash bug. I can't imagine many sites would be helped by this fix, and the memory accounting patch we already landed here should save us if a lot of not-drawn-to canvases are being created. We should do something like this for Azure too. We can probably do a better job there because the Azure DrawTarget doesn't carry much state, so we can avoid creating a DrawTarget until an actual draw operation occurs. ::: content/canvas/src/nsCanvasRenderingContext2D.cpp @@ +977,5 @@ > // nothing to do, this is already the set style > return; > } > > + if (!mValid || !CreateSurface()) { Don't bother checking mValid everywhere. Just check !CreateSurface everywhere since it returns false if !mValid anyway. CreateSurface should probably check for errors in the surface and mThebes and return false in those cases; that would simplify the call sites some more. Also I think CreateSurface should be called EnsureSurface since if there is already a surface, we won't create another one. @@ +1160,5 @@ > + mThebes->SetMiterLimit(10.0); > + mThebes->SetLineCap(gfxContext::LINE_CAP_BUTT); > + mThebes->SetLineJoin(gfxContext::LINE_JOIN_MITER); > + > + mThebes->NewPath(); This code to setup mThebes should be shared with the same code in CreateSurface. @@ +3009,5 @@ > + mSurface->CairoStatus() || > + mThebes->HasError()) > + return NS_ERROR_FAILURE; > + > + processor.mThebes = mThebes; You're relying on gfxTextRun::MeasureText not using mThebes, which is a bad assumption. It turns out that currently we can use mThebes if processor.mDoMeasureBoundingBox is true, because then we pass TIGHT_INK_EXTENTS to gfxTextRun::MeasureText. For the case where we haven't set up an mThebes yet, I think we could instead pass in the GetReferenceRenderingContext from the presshell.
Attachment #555320 - Flags: review?(roc) → review+
Comment on attachment 555320 [details] [diff] [review] Part 2: Don't create backing surfaces until they are needed v2 Review of attachment 555320 [details] [diff] [review]: ----------------------------------------------------------------- Actually I'd like to rereview the text change before this lands.
Attachment #555320 - Flags: review+ → review-
> You're relying on gfxTextRun::MeasureText not using mThebes, which is a bad > assumption. It turns out that currently we can use mThebes if > processor.mDoMeasureBoundingBox is true, because then we pass > TIGHT_INK_EXTENTS to gfxTextRun::MeasureText. > > For the case where we haven't set up an mThebes yet, I think we could > instead pass in the GetReferenceRenderingContext from the presshell. What do you mean by this? We always have at least a valid 1x1 surface which is used for measuring, and when we actually try to draw a full sized surface is created.
When DrawOrMeasureText has aOp==TEXT_DRAW_OPERATION_MEASURE, it can use processor.mThebes, which might be null there. Right?
I had that problem originally, so Initialize() now creates a 1x1 surface and a valid mThebes. mThebes should never be null, and should always wrap a valid surface. Only when we want to actually do drawing is it recreated with a surface of the correct size. If we used GetReferenceRenderingContext, it wouldn't necessarily give the correct measurements since it might be a different surface type to what will be created with EnsureSurface (in particular if MOZ_CANVAS_IMAGE_SURFACE is set).
Comment on attachment 555320 [details] [diff] [review] Part 2: Don't create backing surfaces until they are needed v2 Review of attachment 555320 [details] [diff] [review]: ----------------------------------------------------------------- Actually I'd like to rereview the text change before this lands. ::: content/canvas/src/nsCanvasRenderingContext2D.cpp @@ +467,5 @@ > > // our drawing surfaces, contexts, and layers > nsRefPtr<gfxContext> mThebes; > nsRefPtr<gfxASurface> mSurface; > + PRPackedBool mSurfaceCreated; Better document this to make it clear that there's a surface even when this is false; this is simply whether we've created a surface of the correct size.
Comment on attachment 555320 [details] [diff] [review] Part 2: Don't create backing surfaces until they are needed v2 Review of attachment 555320 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/canvas/src/nsCanvasRenderingContext2D.cpp @@ +2170,5 @@ > { > + if (!mValid || !CreateSurface() || > + mSurface->CairoStatus() || > + mThebes->HasError()) > + return NS_ERROR_FAILURE; Actually here, if we haven't created a surface, we don't need to, right? Perhaps not worth optimizing though.
(In reply to Matt Woodrow (:mattwoodrow) from comment #53) > I had that problem originally, so Initialize() now creates a 1x1 surface and > a valid mThebes. mThebes should never be null, and should always wrap a > valid surface. Only when we want to actually do drawing is it recreated with > a surface of the correct size. It's kind of unfortunate that we have to create the 1x1 surface, though, since it increases the overhead of using a canvas for management. Seems to me we could use gfxPlatform::ScreenReferenceSurface for the 1x1 surface instead of creating a surface. Or else we could add a layer manager method to return a shared 1x1 surface.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #56) > Actually here, if we haven't created a surface, we don't need to, right? > Perhaps not worth optimizing though. That's in clearRect.
Removed the 1x1 surface creation, and just used GetReferenceRenderingContext for measuring text. Fixed other review comments too!
Attachment #555320 - Attachment is obsolete: true
Attachment #556466 - Flags: review?(roc)
Comment on attachment 556466 [details] [diff] [review] Part 2: Don't create backing surfaces until they are needed v3 Review of attachment 556466 [details] [diff] [review]: ----------------------------------------------------------------- Great! We need a patch for Azure too...
Attachment #556466 - Flags: review?(roc) → review+
Landed the canvas fix on inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/d08db920b986 Will work on an Azure patch this week.
Comment on attachment 556466 [details] [diff] [review] Part 2: Don't create backing surfaces until they are needed v3 >--- a/content/canvas/src/nsCanvasRenderingContext2D.cpp >+++ b/content/canvas/src/nsCanvasRenderingContext2D.cpp > // If mCanvasElement is not provided, then a docshell is > nsCOMPtr<nsIDocShell> mDocShell; > > // our drawing surfaces, contexts, and layers > nsRefPtr<gfxContext> mThebes; > nsRefPtr<gfxASurface> mSurface; >+ PRPackedBool mSurfaceCreated; > > PRUint32 mSaveCount; > > /** > * Flag to avoid duplicate calls to InvalidateFrame. Set to true whenever > * Redraw is called, reset to false when Render is called. > */ > PRPackedBool mIsEntireFrameInvalid; Not very useful to use a PRPackedBool if you're going to leave the saved memory unused
Hrm, this is not great performance wise, can you explain why we need this and make very sure no canvas demo's are regressed by this?
This would be a bit more efficient if mTarget was null when !mValid. Then EnsureSurface would just load mTarget and check for null (and we're going to use mTarget later anyway). Can we do that? ClearRect doesn't need to call EnsureSurface; if there isn't one, we can just return. The methods that call EnsureWriteablePath don't need to call EnsureSurface themselves since you already have an EnsureSurface there. Ditto for EnsureUserSpacePath and TransformWillUpdate. I think we can maintain the invariant that if mTarget is null then so are mPath, mPathBuilder and mDSPathBuilder. Then, in EnsureWriteablePath, we only need to call EnsureSurface in the !mPath case. And EnsureUserSpacePath only needs to call EnsureSurface in the !mPath && !mPathBuilder && !mDSPathBuilder case. Similar in TransformWillUpdate. That should eliminate a lot of the EnsureSurface checks.
Attachment #557744 - Attachment is obsolete: true
Attachment #558212 - Flags: review?
Attachment #557744 - Flags: review?(bas.schouten)
Attachment #558212 - Flags: review? → review?(bas.schouten)
It might be a good idea to create the error target early, when initializing the canvas context.
Was it not possible to avoid checking mTarget in path creation methods?
Nick, Bill, this bug added JS_updateMallocCounter for canvases a while ago...
Comment on attachment 558212 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v2 Review of attachment 558212 [details] [diff] [review]: ----------------------------------------------------------------- Did you measure the performance impact of this patch? And if there is any, what is it? ::: content/canvas/src/nsCanvasRenderingContext2DAzure.cpp @@ +992,5 @@ > // Initialize our static variables. > PRUint32 nsCanvasRenderingContext2DAzure::sNumLivingContexts = 0; > PRUint8 (*nsCanvasRenderingContext2DAzure::sUnpremultiplyTable)[256] = nsnull; > PRUint8 (*nsCanvasRenderingContext2DAzure::sPremultiplyTable)[256] = nsnull; > +RefPtr<DrawTarget> nsCanvasRenderingContext2DAzure::sErrorTarget = nsnull; I'm not sure this destruction will work well, I'm not releasing COM objects from a global static is alright! In any case I think it'd be fine to have a per canvas error target so we can just store it on mTarget?
Attachment #558212 - Flags: review?(bas.schouten) → review+
Is bug 664764 a dupe of this?
This crash signature seems to have completely gone away. We still need the fix in part 3 though.
Assignee: matt.woodrow → ajones
I don't think we should bother with part 2. We are racing towards using only nsCanvasRenderingContext2DAzure, so only part 3 will be needed.
I've taken Matt's patch and applied it inbound. I've moved the EnsureTarget calls (formerly EnsureSurface) closer to using mTarget. This should make it lazier in a few more edge cases such as out of bounds co-ordinates.
Attachment #558212 - Attachment is obsolete: true
Attachment #628165 - Attachment is patch: true
Comment on attachment 628165 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v3 Review of attachment 628165 [details] [diff] [review]: ----------------------------------------------------------------- Can you add some comments around mTarget and mIsValid to clarify the invariants here. Namely, 1) mTarget null means the buffer is entirely clear 2) mValid false means that there was an error allocating the buffer and mTarget is the error target. Actually could we get rid of mValid and replace checks of mValid with mTarget == sErrorTarget? In your .hgrc use [defaults] qdiff=-p -U 8 diff=-p -U 8 so we get function names in the diff and plenty of context. Please do that and regenerate this patch, it'll become much easier to review. Also, when you submit a patch, it's best to submit a full hg changeset patch with user-name and commit message filled out. ::: content/canvas/src/nsCanvasRenderingContext2DAzure.cpp @@ +439,4 @@ > */ > static PRUint8 (*sPremultiplyTable)[256]; > > + static RefPtr<DrawTarget> sErrorTarget; We shouldn't add static constructors and destructors. sErrorTarget should be a DrawTarget* and we should use explicit NS_ADDREF and NS_RELEASE calls to manage it. You can release it in the last nsCanvasRenderingContext2DAzure::~nsCanvasRenderingContext2DAzure destructor where we release the static data tables. @@ +1245,5 @@ > + > + if (layerManager) { > + mTarget = layerManager->CreateDrawTarget(size, format); > + } else { > + mTarget = Factory::CreateDrawTarget(BACKEND_DIRECT2D, size, format); The code that you're moving uses target = gfxPlatform::GetPlatform()->CreateOffscreenDrawTarget(size, format); which will actually work on non-Windows :-). So copy that instead. @@ +1251,5 @@ > + } > + > + if (!mTarget) { > + EnsureErrorTarget(); > + mTarget = sErrorTarget; Flip the if blocks around so that we can write "if (mTarget)" @@ +4196,5 @@ > + if (sErrorTarget) { > + return; > + } > + > + sErrorTarget = Factory::CreateDrawTarget(BACKEND_DIRECT2D, IntSize(1, 1), FORMAT_B8G8R8A8); This isn't right, it only works on Windows. Use the same CreateOffscreenDrawTarget as above.
Made the suggested changes.
Attachment #628165 - Attachment is obsolete: true
Comment on attachment 628244 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v4 Review of attachment 628244 [details] [diff] [review]: ----------------------------------------------------------------- Don't forget to actually request review from someone when you want a patch to be reviewed :-). Matt Woodrow wrote most of this so it's probably fairest to put his name on it. That also means he's more likely to get blamed if it breaks something :-). ::: content/canvas/src/nsCanvasRenderingContext2DAzure.cpp @@ +511,5 @@ > > // If mCanvasElement is not provided, then a docshell is > nsCOMPtr<nsIDocShell> mDocShell; > > + // Our drawing surfaces, contexts, and layers. This is created lazily so it Just kill the "Our drawing surfaces, contexts, and layers" comment. @@ +1016,5 @@ > delete[] sUnpremultiplyTable; > delete[] sPremultiplyTable; > sUnpremultiplyTable = nsnull; > sPremultiplyTable = nsnull; > + NS_RELEASE(sErrorTarget); NS_IF_RELEASE since we don't want to crash if sErrorTarget is null. @@ +1302,5 @@ > state->colorStyles[STYLE_FILL] = NS_RGB(0,0,0); > state->colorStyles[STYLE_STROKE] = NS_RGB(0,0,0); > state->shadowColor = NS_RGBA(0,0,0,0); > > + return mTarget == sErrorTarget ? NS_OK : NS_ERROR_OUT_OF_MEMORY; I think you meant != @@ +1313,5 @@ > return NS_OK; > > mOpaque = isOpaque; > > + if (mTarget == sErrorTarget) { Here too. Actually I think we should just call SetDimensions unconditionally here now. Now that we're lazily creating the target, this won't cost us in the case where the target hasn't already been created. @@ +1331,5 @@ > return NS_OK; > > mIPC = isIPC; > > + if (mTarget != sErrorTarget) { Same here. @@ +1347,5 @@ > { > nsresult rv = NS_OK; > > + EnsureTarget(); > + if (mTarget == sErrorTarget) { Let's have an inline helper function "bool IsTargetValid()" that returns true when mTarget != sErrorTarget. That'll make this easier to read. However, in this case, if !mTarget then we should just return NS_OK since there is nothing to draw. @@ +1505,5 @@ > } > > mStyleStack.RemoveElementAt(mStyleStack.Length() - 1); > > TransformWillUpdate(); Move this up and replace EnsureTarget, since TransformWillUpdate already does it. @@ +1596,5 @@ > if (!JSValToMatrix(cx, matrix, &newCTM, &rv)) { > return rv; > } > > + EnsureTarget(); Actually this should call TransformWillUpdate... @@ +1623,5 @@ > } > > // XXX ERRMSG we need to report an error to developers here! (bug 329026) > if (newCTMInverse.Invert()) { > + EnsureTarget(); So should this! @@ +2052,5 @@ > return NS_OK; > } > + > + EnsureTarget(); > + if (mTarget == sErrorTarget) { I don't think we need to EnsureTarget here. If mTarget is null we can just return since it's already completely cleared. If it's sErrorTarget then we should just go ahead and not bother to check. @@ +2588,5 @@ > > void > nsCanvasRenderingContext2DAzure::TransformWillUpdate() > { > + EnsureTarget(); Document somewhere that TransformWillUpdate calls EnsureTarget so callers of TransformWillUpdate don't have to. @@ +4395,5 @@ > CanvasLayer *aOldLayer, > LayerManager *aManager) > { > + EnsureTarget(); > + if (mTarget == sErrorTarget) { Don't need to call EnsureTarget. If mTarget is null (or sErrorTarget) we can just return nsnull and nothing will be rendered. If we do that, plus all the changes above, then a blank canvas that hasn't been drawn to yet (but which has had getContext("2d") called) but is visible on the page won't have to allocate a buffer.
Added IsTargetValid() and fixed other review comments. Note that SetDimension will no longer cause an NS_ERROR_OUT_OF_MEMORY.
Attachment #628244 - Attachment is obsolete: true
Attachment #628572 - Flags: review+
Attachment #628572 - Flags: review+ → review?(roc)
Comment on attachment 628572 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v5 Review of attachment 628572 [details] [diff] [review]: ----------------------------------------------------------------- You lost -p -U8 from your diffs again. Please regenerate with that, it really helps with reviewing. ::: content/canvas/src/nsCanvasRenderingContext2DAzure.cpp @@ +1253,5 @@ > + } > + > + if (layerManager) { > + mTarget = layerManager->CreateDrawTarget(size, format); > + } else { You should test what happens if we create a really large canvas, say 20000x20. This will probably cause this CreateDrawTarget to fail, when D2D is enabled, because D2D can only handle texture-sized canvases. In that case we currently (in nsHTMLCanvasElement::GetContext) fall back to forceThebes=true and end up creating a cairo-based canvas context that will use main-memory CPU-based rendering. But this patch is basically disabling that fallback, we'll take the error path instead. This is not good. Instead I think we need to fall back here. Unfortunately it's too late to change the type of 2d context back to an nsCanvasRenderingContext2D (cairo/Thebes-based context) like we currently do. Instead we need to choose a different Azure backend, ideally the cairo-Azure backend that Nick has been trying to get working!
Depends on: 773097
Rebased patch now that it is no longer blocked by lack of cairo fallback.
Attachment #654084 - Flags: review?(roc)
Attachment #628572 - Attachment is obsolete: true
Attachment #628572 - Flags: review?(roc)
Comment on attachment 654084 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v6 v6 patch breaks fishbowl
Attachment #654084 - Flags: review?(roc)
Comment on attachment 654084 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v6 Review of attachment 654084 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/canvas/src/nsCanvasRenderingContext2DAzure.cpp @@ +917,5 @@ > state->globalAlpha = 1.0; > > state->colorStyles[STYLE_FILL] = NS_RGB(0,0,0); > state->colorStyles[STYLE_STROKE] = NS_RGB(0,0,0); > state->shadowColor = NS_RGBA(0,0,0,0); Shouldn't ClearTarget be setting mTarget to null?
Try run for 8bb1eb3b5c35 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=8bb1eb3b5c35 Results (out of 32 total builds): success: 26 warnings: 6 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-8bb1eb3b5c35
Try run for abd966a0c658 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=abd966a0c658 Results (out of 32 total builds): success: 28 warnings: 4 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-abd966a0c658
Try run for f49849874800 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=f49849874800 Results (out of 143 total builds): exception: 29 success: 20 failure: 94 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-f49849874800
Try run for a6b1c6ff6d54 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=a6b1c6ff6d54 Results (out of 250 total builds): success: 214 warnings: 36 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-a6b1c6ff6d54
Try run for ad7afc76b6d4 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=ad7afc76b6d4 Results (out of 48 total builds): success: 46 warnings: 1 failure: 1 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-ad7afc76b6d4
Try run for 6378f820ab32 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=6378f820ab32 Results (out of 250 total builds): success: 226 warnings: 22 failure: 2 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-6378f820ab32
Try run for dc20a1747a53 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=dc20a1747a53 Results (out of 252 total builds): success: 232 warnings: 16 failure: 4 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-dc20a1747a53
Need check-in for Part 3 v7 but needs to be checked in with or after bug 786913.
Depends on: 786913
Keywords: checkin-needed
Try run for dc20a1747a53 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=dc20a1747a53 Results (out of 254 total builds): success: 233 warnings: 17 failure: 4 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/ajones@mozilla.com-dc20a1747a53
(In reply to Ryan VanderMeulen from comment #95) > https://hg.mozilla.org/integration/mozilla-inbound/rev/013743bb609e > > Should this have a test? The patch is an optimisation rather than a change in functionality. Existing canvas tests should be sufficient to prevent regression in terms of functionality.
It doesn't appear that a crashtest was ever checked in for this bug.
Sorry, had to back this out for numerous Windows 7 mochitest failures that persisted even after a clobber. https://hg.mozilla.org/integration/mozilla-inbound/rev/72c85979e852
So the patch that passed try didn't get uploaded. Now it has.
Attachment #658696 - Flags: review?(roc)
Comment on attachment 658696 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v8 Review of attachment 658696 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/canvas/src/nsCanvasRenderingContext2DAzure.cpp @@ +545,5 @@ > // Initialize our static variables. > uint32_t nsCanvasRenderingContext2DAzure::sNumLivingContexts = 0; > uint8_t (*nsCanvasRenderingContext2DAzure::sUnpremultiplyTable)[256] = nullptr; > uint8_t (*nsCanvasRenderingContext2DAzure::sPremultiplyTable)[256] = nullptr; > +RefPtr<DrawTarget> nsCanvasRenderingContext2DAzure::sErrorTarget = nullptr; You can't have a static RefPtr. Need to manually NS_ADDREF/NS_RELEASE. Which we do, so I don't know why this is a RefPtr.
Attachment #657156 - Attachment is obsolete: true
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #100) > Comment on attachment 658696 [details] [diff] [review] > Part 3: Don't create backing surfaces until they are needed - Azure v8 > > Review of attachment 658696 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: content/canvas/src/nsCanvasRenderingContext2DAzure.cpp > @@ +545,5 @@ > > // Initialize our static variables. > > uint32_t nsCanvasRenderingContext2DAzure::sNumLivingContexts = 0; > > uint8_t (*nsCanvasRenderingContext2DAzure::sUnpremultiplyTable)[256] = nullptr; > > uint8_t (*nsCanvasRenderingContext2DAzure::sPremultiplyTable)[256] = nullptr; > > +RefPtr<DrawTarget> nsCanvasRenderingContext2DAzure::sErrorTarget = nullptr; > > You can't have a static RefPtr. Need to manually NS_ADDREF/NS_RELEASE. Which > we do, so I don't know why this is a RefPtr. But you can have a StaticRefPtr, if that floats your boat.
Comment on attachment 658696 [details] [diff] [review] Part 3: Don't create backing surfaces until they are needed - Azure v8 Patch needs to be rebased.
Attachment #658696 - Flags: review?(roc)
*fingers crossed* https://hg.mozilla.org/integration/mozilla-inbound/rev/71d1333ca77f Still wondering about a crashtest for this bug, too.
Keywords: checkin-needed
Target Milestone: mozilla5 → mozilla18
Status: REOPENED → RESOLVED
Closed: 14 years ago13 years ago
Resolution: --- → FIXED
Depends on: 794463
Depends on: 1177726
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: