Last Comment Bug 703317 - non-threadsafe allocator for nsIntRegion causes TestDataStructures to fail
: non-threadsafe allocator for nsIntRegion causes TestDataStructures to fail
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla11
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
Depends on: 774756 757933
Blocks: omtc 699319
  Show dependency treegraph
 
Reported: 2011-11-17 11:15 PST by Niko Matsakis [:nmatsakis]
Modified: 2012-07-17 13:13 PDT (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
TLSAllocator v1 (2.45 KB, patch)
2011-11-29 10:37 PST, Benoit Girard (:BenWa)
roc: review+
Details | Diff | Review

Description Niko Matsakis [:nmatsakis] 2011-11-17 11:15:41 PST
The test TestDataStructures uses nsIntRegions which are allocated via a non-threadsafe allocator.  The result is that the Test18() subroutine is not usable in a cross-thread context.  Presumably nsIntRegions are also used in other contexts which will also not be usable in a cross-thread context.  Making the allocator thread-local rather than global might be a possible solution.
Comment 1 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-11-17 12:25:31 PST
This is one of the bugs blocking off-main-thread compositing.

I don't have a good feel for how much this optimized allocator helps us.  My first inclination would be to try an implementation using malloc/free and see if anything regresses.
Comment 2 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-11-17 14:43:08 PST
We create and destroy regions *a lot*. I expect removing this would regress something. Maybe I'm wrong though.

So, TLS recycler?
Comment 3 Benoit Girard (:BenWa) 2011-11-28 09:19:45 PST
I'm looking at the allocator and I find this very concerning:
http://mxr.mozilla.org/mozilla-central/source/gfx/src/nsRegion.cpp#99

This means to indicate we can have an adversarial case where a page causes a lot of region allocation. Closing the tab wouldn't release the memory.

I see 3 solutions:
1) TLSing the allocator
2) Use malloc
3) Add MUTEX for the allocator

Should we investigate TLSing first? Solution 2 should fix the memory poll release problem if it doesn't regress performance.
Comment 4 Mozilla RelEng Bot 2011-11-28 13:00:33 PST
Try run for 20ae1971dd5d is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=20ae1971dd5d
Results (out of 6 total builds):
    exception: 4
    success: 2
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.com-20ae1971dd5d
Comment 5 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-11-28 15:01:16 PST
(In reply to Benoit Girard (:BenWa) from comment #3)
> I'm looking at the allocator and I find this very concerning:
> http://mxr.mozilla.org/mozilla-central/source/gfx/src/nsRegion.cpp#99
> 
> This means to indicate we can have an adversarial case where a page causes a
> lot of region allocation. Closing the tab wouldn't release the memory.
> 
> I see 3 solutions:
> 1) TLSing the allocator
> 2) Use malloc
> 3) Add MUTEX for the allocator

A good option might be TLS plus time-based expiration of the rect pool.

Maybe someone should just write a region micro-benchmark that creates and destroys a lot of regions with 1 and 2 rectangles in them, then see if there's a measurable speedup from the free rect pool. Maybe jemalloc optimizes these 16-byte allocations really well.
Comment 6 Mozilla RelEng Bot 2011-11-28 17:20:25 PST
Try run for d00b1a3dcccf is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=d00b1a3dcccf
Results (out of 49 total builds):
    success: 48
    failure: 1
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.com-d00b1a3dcccf
Comment 7 Benoit Girard (:BenWa) 2011-11-28 17:34:41 PST
I got a 2% tp4 regression on OSX/XP/Win7 so I think option 2 is out of the question.
Comment 8 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-11-28 18:21:12 PST
That's good to know :-)
Comment 9 Benoit Girard (:BenWa) 2011-11-29 10:37:21 PST
Created attachment 577667 [details] [diff] [review]
TLSAllocator v1

Pushed to try
Comment 10 Mozilla RelEng Bot 2011-11-29 15:10:51 PST
Try run for 0b82febeb6ea is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=0b82febeb6ea
Results (out of 104 total builds):
    success: 89
    warnings: 14
    failure: 1
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.com-0b82febeb6ea
Comment 11 Benoit Girard (:BenWa) 2011-11-29 15:23:09 PST
Comment on attachment 577667 [details] [diff] [review]
TLSAllocator v1

Good try run, worked well locally.
Comment 12 Marco Bonardo [::mak] 2011-11-30 03:58:17 PST
https://hg.mozilla.org/mozilla-central/rev/47500d6cbffd
Comment 13 Oleg Romashin (:romaxa) 2012-05-23 09:14:50 PDT
I've been playing recently with patches from bug 539356, and by default without last patches it introduces significant amount of region operations. and in couple with this fix I see significant performance hit caused by threadsafety checks on page which introduce lot of regions ops http://bubblemark.com/dhtml.htm
is there are anything we can do about it?
Comment 14 Benoit Girard (:BenWa) 2012-05-23 09:19:49 PDT
Do we have profiles to show what is slow?
Comment 15 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-05-23 11:50:07 PDT
Yeah, which threadsafety checks are you referring to?
Comment 16 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-23 12:22:51 PDT
Access to per-thread rectangle memory pool.
Comment 17 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-05-23 12:32:07 PDT
OK, TLS access is different than thread-safety checks.  I defer to comment 14 then.
Comment 18 Oleg Romashin (:romaxa) 2012-05-23 12:34:51 PDT
http://pastebin.mozilla.org/1648703 - here is opreport with fixed 757933, nspr calls are gone, bug pthread_getspecific still on top of profile
Comment 19 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-05-23 12:39:10 PDT
Yay ARM.  Definitely want to be __thread'ing this.
Comment 20 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-05-23 12:39:53 PDT
Thanks for the fix!
Comment 21 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-05-23 12:58:38 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #19)
> Yay ARM.  Definitely want to be __thread'ing this.

Filed bug 757969 on that.

Note You need to log in before you can comment on or make changes to this bug.