Closed
Bug 978631
Opened 11 years ago
Closed 11 years ago
Investigate a recycler bin for SharedBuffers
Categories
(Core :: Graphics, defect)
RESOLVED
WONTFIX
People
(Reporter: chiajung, Assigned: chiajung)
Attachments
(1 file)
6.31 KB,
patch
This is a follow-up bug for bug 959089:
On some platforms, in some cases, we may need a lot of shared buffers. The current architecture does not use a buffer queue and tends to create->use->release->create rapidly with the same configuration. However, such shared memory allocation can sometimes take a long time. A buffer recycler bin based on bug 959089 should help considerably in such cases.
Tiling work will introduce a TextureClientPool, which can manage this at least for TextureClients of the same size/format. We have two approaches to recycling, one more generic and one specialized to tiles.
Comment 2 • 11 years ago (Assignee)
Some logs of allocation/deallocation without the recycler:
03-06 12:24:17.709 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.67 ms
03-06 12:24:17.709 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.61 ms
03-06 12:24:17.729 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.25 ms
03-06 12:24:17.739 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.28 ms
03-06 12:24:17.769 624 694 I SBMParent: proc:684, alloc bufSize:(320, 75), format:0x1, usage:0x133, time: 0.52 ms
03-06 12:24:17.769 624 694 I SBMParent: proc:684, alloc bufSize:(320, 75), format:0x1, usage:0x133, time: 0.43 ms
03-06 12:24:18.090 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:18.090 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:18.180 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 2.90 ms
03-06 12:24:18.190 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.31 ms
03-06 12:24:18.640 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:18.640 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:18.830 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 1.65 ms
03-06 12:24:18.830 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.98 ms
03-06 12:24:19.271 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:19.271 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:19.371 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.25 ms
03-06 12:24:19.371 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.31 ms
03-06 12:24:20.242 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:20.252 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:20.362 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 7.32 ms
03-06 12:24:20.372 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.34 ms
03-06 12:24:20.852 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:20.852 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:21.043 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 1.10 ms
03-06 12:24:21.053 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.76 ms
03-06 12:24:21.543 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:21.543 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:21.613 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.25 ms
03-06 12:24:21.633 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 11.54 ms
03-06 12:24:22.114 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:22.114 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:22.464 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.67 ms
03-06 12:24:22.464 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.64 ms
03-06 12:24:22.924 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:22.934 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:23.165 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.46 ms
03-06 12:24:23.175 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.34 ms
03-06 12:24:23.655 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:23.655 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:24:24.136 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:24.136 624 694 I SBMParent: proc:684, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:24:24.136 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 75) format:0x1 usage:0x133
03-06 12:24:24.146 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 75) format:0x1 usage:0x133
It seems the parent side always drops the buffer, and any scrolling operation on the homescreen causes a lot of allocation/deallocation. Buffer allocation seems very fast for small buffers, with only occasional peaks (about 10 ms).
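To put numbers on the peaks, the alloc lines can be summarized per buffer size. The following is a minimal sketch, assuming the log format shown above; the helper name `summarize_allocs` and the regular expression are illustrative and not part of any patch in this bug.

```python
import re
from collections import defaultdict

# Matches the "alloc" lines of the SBMParent log format pasted above.
# "drop from parent" lines do not contain "alloc bufSize" and are skipped.
ALLOC_RE = re.compile(
    r"alloc bufSize:\((\d+), (\d+)\), format:(0x[0-9a-fA-F]+), "
    r"usage:(0x[0-9a-fA-F]+), time: ([\d.]+) ms"
)

def summarize_allocs(lines):
    """Return {(width, height): (count, mean_ms, max_ms)} for alloc lines."""
    times = defaultdict(list)
    for line in lines:
        m = ALLOC_RE.search(line)
        if m:
            size = (int(m.group(1)), int(m.group(2)))
            times[size].append(float(m.group(5)))
    return {
        size: (len(ts), sum(ts) / len(ts), max(ts))
        for size, ts in times.items()
    }

# Four lines copied from the log above.
sample = [
    "03-06 12:24:21.613 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 1.25 ms",
    "03-06 12:24:21.633 624 694 I SBMParent: proc:684, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 11.54 ms",
    "03-06 12:24:22.114 624 694 I SBMParent: proc:684, drop from parent bufSize:(320, 147) format:0x1 usage:0x133",
    "03-06 12:24:22.464 624 694 I SBMParent: proc:684, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.67 ms",
]
summary = summarize_allocs(sample)
```

Comparing the mean and the maximum per size shows whether a slow allocation is typical for that size or an outlier.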
Comment 3 • 11 years ago (Assignee)
Some logs of allocation/deallocation with a simple recycler:
03-06 12:37:53.744 786 813 I SBMParent: proc:786, alloc bufSize:(320, 32), format:0x1, usage:0x133, time: 0.37 ms
03-06 12:37:53.744 786 813 I SBMParent: proc:786, alloc bufSize:(320, 32), format:0x1, usage:0x133, time: 0.27 ms
03-06 12:37:53.984 786 813 I SBMParent: proc:786, alloc bufSize:(320, 480), format:0x1, usage:0x133, time: 5.37 ms
03-06 12:37:54.004 786 813 I SBMParent: proc:786, alloc bufSize:(320, 480), format:0x2, usage:0x133, time: 5.10 ms
03-06 12:37:54.004 786 813 I SBMParent: proc:786, alloc bufSize:(320, 480), format:0x2, usage:0x133, time: 5.31 ms
03-06 12:37:54.054 786 813 I SBMParent: proc:786, alloc bufSize:(247, 32), format:0x1, usage:0x133, time: 0.43 ms
03-06 12:37:54.054 786 813 I SBMParent: proc:786, alloc bufSize:(247, 32), format:0x1, usage:0x133, time: 0.24 ms
03-06 12:37:54.064 786 813 I SBMParent: proc:786, alloc bufSize:(320, 32), format:0x2, usage:0x133, time: 0.40 ms
03-06 12:37:54.064 786 813 I SBMParent: proc:786, alloc bufSize:(320, 32), format:0x2, usage:0x133, time: 0.27 ms
03-06 12:37:54.094 786 813 I SBMParent: proc:786, drop from parent bufSize:(247, 32) format:0x1 usage:0x133
03-06 12:37:54.094 786 813 I SBMParent: proc:786, drop from parent bufSize:(247, 32) format:0x1 usage:0x133
03-06 12:37:54.214 786 813 I SBMParent: proc:786, drop from parent bufSize:(320, 32) format:0x1 usage:0x133
03-06 12:37:54.214 786 813 I SBMParent: proc:786, drop from parent bufSize:(320, 32) format:0x1 usage:0x133
03-06 12:37:54.455 786 813 I SBMParent: proc:786, drop from parent bufSize:(320, 480) format:0x2 usage:0x133
03-06 12:37:54.455 786 813 I SBMParent: proc:786, drop from parent bufSize:(320, 480) format:0x2 usage:0x133
03-06 12:37:54.455 786 813 I SBMParent: proc:786, drop from parent bufSize:(247, 32) format:0x1 usage:0x133
03-06 12:37:54.455 786 813 I SBMParent: proc:786, drop from parent bufSize:(247, 32) format:0x1 usage:0x133
03-06 12:37:54.455 786 813 I SBMParent: proc:786, drop from parent bufSize:(320, 32) format:0x2 usage:0x133
03-06 12:37:54.455 786 813 I SBMParent: proc:786, drop from parent bufSize:(320, 32) format:0x2 usage:0x133
03-06 12:37:56.967 786 859 I SBMParent: proc:849, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.82 ms
03-06 12:37:56.977 786 859 I SBMParent: proc:849, alloc bufSize:(320, 147), format:0x1, usage:0x133, time: 0.76 ms
03-06 12:37:56.997 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 4.58 ms
03-06 12:37:57.007 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 6.13 ms
03-06 12:37:58.308 786 859 I SBMParent: proc:849, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:37:58.308 786 859 I SBMParent: proc:849, drop from parent bufSize:(320, 147) format:0x1 usage:0x133
03-06 12:37:58.589 786 859 I SBMParent: proc:849, alloc bufSize:(320, 75), format:0x1, usage:0x133, time: 0.85 ms
03-06 12:37:58.599 786 859 I SBMParent: proc:849, alloc bufSize:(320, 75), format:0x1, usage:0x133, time: 0.55 ms
03-06 12:37:58.739 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 6.38 ms
03-06 12:37:58.739 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 6.07 ms
03-06 12:37:58.809 786 859 I SBMParent: proc:849, drop from parent bufSize:(320, 75) format:0x1 usage:0x133
03-06 12:37:58.809 786 859 I SBMParent: proc:849, drop from parent bufSize:(320, 75) format:0x1 usage:0x133
03-06 12:37:59.099 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:37:59.099 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:37:59.450 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 0.03 ms
03-06 12:37:59.460 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 0.00 ms
03-06 12:38:00.851 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:38:00.861 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:38:00.951 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 0.00 ms
03-06 12:38:00.961 786 859 I SBMParent: proc:849, alloc bufSize:(316, 376), format:0x1, usage:0x133, time: 0.03 ms
03-06 12:38:01.452 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:38:01.472 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:38:01.962 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
03-06 12:38:01.962 786 859 I SBMParent: proc:849, drop from parent bufSize:(316, 376) format:0x1 usage:0x133
It seems to help suppress the peaks mentioned above. I will upload the patch later.
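The idea behind a simple recycler can be sketched as a free list keyed by the buffer configuration. This is a toy model only: a `bytearray` stands in for a shared buffer, and the class name `RecyclerBin`, the key layout, and the `max_per_key` cap are assumptions made for illustration, not the attached patch.

```python
from collections import defaultdict

class RecyclerBin:
    """Toy model of a recycler bin; bytearray stands in for a shared buffer."""

    def __init__(self, max_per_key=2):
        self._free = defaultdict(list)   # configuration key -> idle buffers
        self._max_per_key = max_per_key  # hypothetical cap per configuration
        self.hits = 0
        self.misses = 0

    def acquire(self, width, height, fmt, usage, bytes_per_pixel=4):
        """Reuse an idle buffer with the same configuration, else allocate."""
        key = (width, height, fmt, usage)
        idle = self._free[key]
        if idle:
            self.hits += 1
            return key, idle.pop()
        self.misses += 1
        return key, bytearray(width * height * bytes_per_pixel)

    def release(self, key, buf):
        """Keep the buffer for reuse; return False if it was dropped instead."""
        idle = self._free[key]
        if len(idle) >= self._max_per_key:
            return False
        idle.append(buf)
        return True
```

In this model a second acquire with the same size, format, and usage returns the pooled buffer without allocating, which is presumably what the 0.00 to 0.03 ms entries for (316, 376) in the log above reflect. A request with any differing field is a miss.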
Comment 4 • 11 years ago (Assignee)
It seems this implementation causes some rendering errors. Steps to reproduce:
1. Boot up and unlock the screen.
2. Launch another app.
3. Press the home key to go back to the homescreen.
4. Scroll on the homescreen.
My initial guess is that because buffers are always dropped from the parent side, we have no way of knowing that a buffer is still in use elsewhere on the content side, so the recycler dispatches it to another user.
Comment 5 • 11 years ago (Assignee)
Another problem here is that the "new GraphicBuffer" operation clears the buffer content before returning, so we may need to clear a buffer when recycling it from the recycler bin.
Comment 6 • 11 years ago
This buffer has to be hooked up to the out of memory hook and must be flushed on low memory. I would like to review any patch before it lands.
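A minimal sketch of that requirement follows, assuming a publish/subscribe hook for low-memory notifications. `ObserverService` is a hypothetical stand-in written for this example, and the topic string is used here only as a label. The point is that every idle buffer held by the bin must be released when the notification fires.

```python
class ObserverService:
    """Hypothetical stand-in for a publish/subscribe notification service."""

    def __init__(self):
        self._observers = {}

    def add_observer(self, topic, callback):
        self._observers.setdefault(topic, []).append(callback)

    def notify(self, topic):
        for callback in self._observers.get(topic, []):
            callback()


class FlushableBin:
    """Holds idle buffers and drops all of them on a low-memory notification."""

    def __init__(self, observer_service):
        self._idle = []
        observer_service.add_observer("memory-pressure", self.flush)

    def put(self, buf):
        self._idle.append(buf)

    def idle_bytes(self):
        """Bytes currently parked in the bin (could feed a memory report)."""
        return sum(len(b) for b in self._idle)

    def flush(self):
        """Drop every idle buffer and return the number of bytes released."""
        freed = self.idle_bytes()
        self._idle.clear()
        return freed
```

Only idle buffers are flushed in this model; buffers currently handed out to a user are not tracked by the bin and are unaffected.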
Comment 7 • 11 years ago
It might be better to evaluate the performance in a tiling-enabled environment.
Comment 8 • 11 years ago
(In reply to Sotaro Ikeda [:sotaro] from comment #7)
> It might be better to evaluate the performance with tiling enabled
> environment.
Tiling also has a recycling mechanism.
Comment 9 • 11 years ago
Gralloc buffer recycling across processes could become a security risk. Gralloc buffers are kernel objects, so even when the client side releases a TextureClient, the client could continue to hold the gralloc buffers. When a gralloc buffer is then reused by another process, the client could read that process's rendering result.
Comment 10 • 11 years ago (Assignee)
(In reply to Sotaro Ikeda [:sotaro] from comment #9)
> Gralloc buffer recycling across process could become a security risk.
> Gralloc Buffers are kernel objects, even when client side release
> TextureClient. The client could continue to hold the gralloc buffers. And
> when the gralloc buffer is reused by another process, the client could read
> another process' rendering result.
Sure, the next version will add a memset to clear the buffer before acquire. (In my test, the memset should not take too much time.)
In fact, there are some problems associated with the gralloc fd that make the recycler bin less simple than the WIP: serializing a gralloc buffer to the same process a second time causes GraphicBufferMapper::registerBuffer to fail unexpectedly, so I am trying to find a way to avoid that. (I cannot find a real case that recycles a buffer from another process, since buffer requests from different processes usually have different buffer configurations.)
An easier and safer implementation would be to recycle the buffers on the child side, which has several benefits:
1. No IPC is incurred when a buffer is found in the recycler bin.
2. The memset runs on the child thread, which should have more spare time than the parent thread.
3. It is easier to implement.
But the problems are:
1. We need another message from parent to child asking every child to free the buffers in its recycler bin under pmem memory pressure.
2. We need a better way to expose the memory report for gralloc buffers held in the recycler bin.
3. Recycling gralloc buffers across processes would maximize the chance of reuse, which a child-side bin gives up.
Another problem concerns PixelFormats with a bpp value of 0 (private formats; the value is returned from PixelFormat::bytesPerPixel). I think we should avoid recycling such buffers, since we do not know how to clear them correctly anyway. The only use case I can think of uses GonkBufferQueue, which never drops the buffers until cancelBuffer; however, recycling them should improve warm launch time.
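The child-side variant with the two safety rules above (clear before reuse, never pool formats that report 0 bytes per pixel) can be sketched as follows. This is a toy model under stated assumptions: a `bytearray` stands in for a gralloc buffer, and `ChildRecycler`, `can_recycle`, and `clear_for_reuse` are hypothetical names, not functions from the patch.

```python
def can_recycle(bytes_per_pixel):
    """Formats reporting 0 bytes per pixel are treated as private: not pooled."""
    return bytes_per_pixel > 0


def clear_for_reuse(buf):
    """Zero a recycled buffer in place (the memset step) and return it."""
    buf[:] = bytes(len(buf))
    return buf


class ChildRecycler:
    """Toy child-side bin: a hit needs no round trip to the parent."""

    def __init__(self):
        self._idle = {}  # configuration key -> list of idle buffers

    def release(self, key, buf, bytes_per_pixel):
        """Pool the buffer if its format can be cleared; else report a drop."""
        if not can_recycle(bytes_per_pixel):
            return False
        self._idle.setdefault(key, []).append(buf)
        return True

    def acquire(self, key):
        """Return a zeroed pooled buffer, or None when the bin has no match."""
        idle = self._idle.get(key)
        if not idle:
            return None  # caller falls back to a fresh allocation over IPC
        return clear_for_reuse(idle.pop())
```

Clearing at acquire time keeps the cost on the thread that wants the buffer, and it means a stale frame is never handed to the next user even if the previous user wrote to the buffer after releasing it.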
Comment 11 • 11 years ago
Let's not worry about recycling for now. I would like to see this landed asap and we can iterate from there.
Comment 12 • 11 years ago
Never mind. Comment 11 was for the wrong bug.
Comment 13 • 11 years ago
(In reply to Chiajung Hung [:chiajung] from comment #10)
> But the problem is:
> 1. Need another message from parent to child to ask all child to kill the
> buffer in recycler bin when pmem memory pressure.
The child side can already listen for memory pressure events without requiring an additional IPDL message. gfxPlatform.cpp contains an example of how to do that.
> 2. Need a better way to expose the memory report for gralloc buffer with
> recycler bin.
That should not be a big problem.
> 3. Recycle gralloc buffer across process should maximize the possibility to
> recycle the resource.
At the expense of synchronous IPC with the numerous context switches it implies (as you pointed out). There's a tradeoff.
With tiling we already recycle buffers for scrollable layers, and hardware-decoded videos reuse their gralloc buffers too, so we should measure the impact of not recycling the rest before we decide whether we need more recycling logic. It's harder to efficiently recycle buffers of arbitrary size.
Comment 14 • 11 years ago (Assignee)
(In reply to Nicolas Silva [:nical] from comment #13)
> (In reply to Chiajung Hung [:chiajung] from comment #10)
> > But the problem is:
> > 1. Need another message from parent to child to ask all child to kill the
> > buffer in recycler bin when pmem memory pressure.
>
> The child side can already listen for memory pressure events without
> requiring an additional IPDL message. gfxPlatform.cpp contains an example of
> how to do that.
I know memory pressure can be observed from the content side, but the problem is that I am not sure any code path triggers memory pressure when we run out of PMEM rather than normal heap or stack.
>
> > 2. Need a better way to expose the memory report for gralloc buffer with
> > recycler bin.
>
> That should not be a big problem.
Yes, it's quite a small problem, and I think there is a bug that tries to improve the gralloc memory report on the current architecture (not based on SharedBufferManager).
>
> > 3. Recycle gralloc buffer across process should maximize the possibility to
> > recycle the resource.
>
> At the expense of synchronous IPC with the numerous context switches it
> implies (as you pointed out). There's a tradeoff.
>
> With tiling we already recycle buffers for scrollable layers, and
> hardware-decoded videos reuse there gralloc buffers too, so we should
> measure the impact of not recycling the rest before we decide whether we
> need more recycling logic. It's harder to efficiently recycle buffers of
> arbitrary size.
Yes, I think requiring the buffer size/format/usage to match is quite a strict criterion and makes recycling less efficient than expected.
Hardware codec/camera does not recycle its buffers once it is de-inited. That's why I think such a recycle operation helps warm launch time for those components. (As a result, some manufacturers reserve buffers for them when Android boots up to reduce the launch time of the Camera app.)
Comment 15 • 11 years ago
From the comment below, you can see a lot of buffer creation in the browser app without tiles.
https://bugzilla.mozilla.org/show_bug.cgi?id=925616#c14
The recycler will hit low-memory pressure quickly because it keeps too many buffers of different sizes, and then it will try to free all the buffers it holds.
Now we have tiling, but I'm not sure whether the recycler will still hit this problem.
Comment 16 • 11 years ago (Assignee)
After discussing with Peter, we think the cost of allocation is not that large now that SharedBufferManager is implemented (without a recycler bin).
I measured the cost of buffer allocation and pure IPC. Buffer allocation seems to take over 90% of the time, and larger buffers take longer; we think this is because of the memset. The IPC time seems constant at 0.5~1 ms.
As a result, the recycler bin may not provide a noticeable performance boost. It may still be worth having a very simple recycler bin on the child side to avoid IPC, since such a bin is safe and the gain is a constant time that benefits small buffer allocations quite a lot (the new operation is very fast there, so IPC accounts for a higher share).
I think I have to profile content rendering before trying further.
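The point about small buffers can be made concrete with a quick calculation. This is a rough sketch that assumes the "time" values logged in comment 3 exclude IPC and takes the lower bound of the 0.5~1 ms IPC estimate; the function name `ipc_share` is illustrative.

```python
def ipc_share(alloc_ms, ipc_ms):
    """Fraction of the total request time spent in IPC rather than allocation."""
    return ipc_ms / (alloc_ms + ipc_ms)

# Allocation times taken from the log in comment 3:
# 0.27 ms for a (320, 32) buffer, 6.13 ms for a (316, 376) buffer.
small = ipc_share(alloc_ms=0.27, ipc_ms=0.5)
large = ipc_share(alloc_ms=6.13, ipc_ms=0.5)
```

Under these assumptions IPC is roughly 65% of the request for the small buffer and about 8% for the large one, so a child-side bin that skips the IPC helps small allocations proportionally far more.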
Comment 17 • 11 years ago
Agreed, let's close this. We will reopen if it shows up in profiles again.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX