Gralloc Allocation can take 25ms+

RESOLVED FIXED

Status

()

RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: BenWa, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
I though we already had a bug for this but no one can link me to it.

Here's a profile:
http://people.mozilla.org/~bgirard/cleopatra/#report=c0d4369e8bcde03937538e8e966127ac23797060&select=1223,1967

The interesting thing I'm seeing in this profile at least is the compositor is free to respond to the message in this case but takes most of the time memset-ing the gralloc buffer. Looks like the problem isn't entirely waiting for the other side to pick up the message.

CC'ing everyone I've heard discussing this issue.

Comment 1

5 years ago
where exactly is that memset coming from?
long time memset() duration was talked also in Bug 925616 Comment 19.
(Reporter)

Comment 4

5 years ago
Created attachment 8334051 [details] [diff] [review]
patch

dwilson are you the right person to review this?

I'm finding that the gralloc times are more inline with what I would expect by using MAP_POPULATE.
Assignee: nobody → bgirard
Status: NEW → ASSIGNED
Attachment #8334051 - Flags: review?(dwilson)

Comment 5

5 years ago
This has to be patched by the vendor upstream.

Comment 6

5 years ago
Diego, do you mind taking this?
Assignee: bgirard → dwilson
How did you come across this flag? I assume this somehow speeds up (or avoids?) the memset during allocation. How so?
Flags: needinfo?(bgirard)
(Reporter)

Comment 8

5 years ago
(In reply to Diego Wilson [:diego] from comment #7)
> How did you come across this flag? I assume this somehow speeds up (or
> avoids?) the memset during allocation. How so?

The problem is I'm seeing memset that is much slower then expected. We suspect it's because mmap sets up the page lazily when the allocation is large. In this case we always memset the entire area so the pages fault-in because they aren't setup. Since we will require each page it's better to force the pages to be populate and avoid the faulting.
Flags: needinfo?(bgirard)
OK, let me take it to my CAF graphics guys and I'll let you know if it sticks.
Attachment #8334051 - Flags: review?(dwilson)
This memset no longer exists in JB on CAF [1]. However it's still there on ICS.

Which target was this profiled and tested on?

[1] https://www.codeaurora.org/cgit/quic/la/platform/hardware/qcom/display/commit/libgralloc/ionalloc.cpp?id=74e4567e350ff76f2450a62e106ad3bf9aac568a
Flags: needinfo?(bgirard)
(Reporter)

Comment 11

5 years ago
Nexus 7
Flags: needinfo?(bgirard)
(Reporter)

Comment 12

5 years ago
(In reply to Diego Wilson [:diego] from comment #10)
> https://www.codeaurora.org/cgit/quic/la/platform/hardware/qcom/display/
> commit/libgralloc/ionalloc.cpp?id=74e4567e350ff76f2450a62e106ad3bf9aac568a

*cheers* Ok if the performance impact isn't moved somewhere else this is great.
(In reply to Benoit Girard (:BenWa) from comment #12)
> *cheers* Ok if the performance impact isn't moved somewhere else this is
> great.

According to Naseer the ion driver now guarantees giving us zero'd buffers and the assumption is that that is fast enough.
Looks like this patch is not needed in CAF at least.
Assignee: dwilson → nobody
Just to be clear, this issue should not affect any of the planned commercial devices that will ship with v1.3 since those devices use the CAF gonk.
(Reporter)

Comment 16

5 years ago
Resolving as fixed upstream (maybe should be WFM)
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Comment 17

5 years ago
Do you think we should print to logcat when allocating a gralloc buffer takes longer than X for a small X so we don't miss this with future hardware?
(Reporter)

Comment 18

5 years ago
(In reply to Andreas Gal :gal from comment #17)
> Do you think we should print to logcat when allocating a gralloc buffer
> takes longer than X for a small X so we don't miss this with future hardware?

Now that we have jld' stackwalk feature for the b2g profiler IMO it should be fairly obvious if we run into this type of issue in a profile. I mean it couldn't hurt to time and see if our allocation speed falls below say 100MB/sec. It would be noisy with context switching but still interesting to do.
The nice thing about logcat is that a lot more people know how to get it than people that can profile :)
(Reporter)

Comment 20

5 years ago
(In reply to Milan Sreckovic [:milan] from comment #19)
> The nice thing about logcat is that a lot more people know how to get it
> than people that can profile :)

Good point. Forked this over to bug 942351.
You need to log in before you can comment on or make changes to this bug.