977975 - composites running slower than 60fps on b2g 1.4

Reporter

Description

11 years ago

Lately on b2g v1.4 I have seen a number of profiles like this one from bug 975831 comment 54:

http://people.mozilla.org/~bgirard/cleopatra/#report=91799a05276e72b06beb7d1f038dd55221c95c18

Look at the time range 6316 to 6726. Here you can see a 70+ ms rasterize. If you look at what the child process is doing, you can see that IPC MessageChannel::SendAndWait shows up a few times waiting for a response. While it's unclear from the profile whether this is due to multiple messages or a single call, it shows up often enough that I started to wonder if IPC is getting delayed or stalling while delivering messages to the parent process.

To investigate further I hacked some instrumentation into SendAndWait() to print to logcat when a synchronous IPC call takes longer than 10ms. I then also enabled IPC logging (without DEBUG). The IPC logging shows that most messages are delivered in a timely fashion:

I/GeckoIPC(11713): [time:1393560810093386][11713->11550][PLayerTransactionChild] Sending Msg_PGrallocBufferConstructor([TODO])
I/GeckoIPC(11550): [time:1393560810093983][11550<-11713][PLayerTransactionParent] Received Msg_PGrallocBufferConstructor([TODO])
I/GeckoIPC(11550): [time:1393560810098459][11550->11713][PLayerTransactionParent] Sending reply Reply_PGrallocBufferConstructor([TODO])

In this case the PGrallocBufferConstructor message was delivered in less than 1ms.
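The SendAndWait() instrumentation described above amounts to timing a blocking call and logging only when it crosses a threshold. The actual change was made in Gecko's C++ and is not shown in this report, so the following is only a minimal Python sketch of the same idea. The name `warn_if_slow` and its parameters are hypothetical; only the 10 ms threshold and the shape of the log message come from the text and logs in this description.

```python
import time
from contextlib import contextmanager

# Threshold taken from the description: report calls slower than 10 ms.
SLOW_CALL_THRESHOLD_MS = 10.0


@contextmanager
def warn_if_slow(name, threshold_ms=SLOW_CALL_THRESHOLD_MS,
                 log=print, clock=time.monotonic):
    """Time the wrapped block; log it only if it ran longer than threshold_ms.

    Hypothetical helper for illustration, not the real Gecko patch.
    """
    start = clock()
    try:
        yield
    finally:
        elapsed_ms = (clock() - start) * 1000.0
        if elapsed_ms > threshold_ms:
            log(f"### ### SendAndWait() {name} took {elapsed_ms:.0f} ms")


if __name__ == "__main__":
    # time.sleep() stands in for a blocking synchronous IPC round trip.
    with warn_if_slow("PLayerTransaction::Msg_Update"):
        time.sleep(0.014)
```

Measuring with a monotonic clock keeps the elapsed time immune to wall-clock adjustments, and logging only above the threshold keeps logcat readable when most calls are fast.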
When my instrumentation fires, though, I see that a significant delay is coming from the IPC delivery:

I/GeckoIPC(11713): [time:1393560810100034][11713->11550][PLayerTransactionChild] Sending Msg_PTextureConstructor([TODO])
I/GeckoIPC(11713): [time:1393560810100983][11713->11550][PLayerTransactionChild] Sending Msg_PGrallocBufferConstructor([TODO])
I/GeckoIPC(11550): [time:1393560810113251][11550<-11713][PLayerTransactionParent] Received Msg_PTextureConstructor([TODO])
I/GeckoIPC(11550): [time:1393560810113503][11550<-11713][PLayerTransactionParent] Received Msg_PGrallocBufferConstructor([TODO])
E/memalloc(11550): /dev/pmem: No more pmem available
W/memalloc(11550): Falling back to ashmem
I/GeckoIPC(11550): [time:1393560810132156][11550->11713][PLayerTransactionParent] Sending reply Reply_PGrallocBufferConstructor([TODO])
I/Gecko (11713): ### ### SendAndWait() PLayerTransaction::Msg_PGrallocBufferConstructor took 31 ms

Here we see a total delay of 31ms. Of that delay, ~12.5ms is in IPC message delivery and ~18.5ms is in parent process work. In this case it appears the parent process work was quite expensive due to running out of pmem.

I also see delays in PLayerTransaction::Msg_Update:

I/GeckoIPC(11713): [time:1393560812133136][11713->11550][PLayerTransactionChild] Sending Msg_Update([TODO])
E/msm7627a.hwcomposer(11550): hwc_set: Unable to render by hwc due to non-pmem memory
I/GeckoIPC(11550): [time:1393560812146536][11550<-11713][PLayerTransactionParent] Received Msg_Update([TODO])
I/GeckoIPC(11550): [time:1393560812147369][11550->11713][PLayerTransactionParent] Sending reply Reply_Update([TODO])
I/Gecko (11713): ### ### SendAndWait() PLayerTransaction::Msg_Update took 14 ms

Here the entire Msg_Update() took 14ms, and ~13.5ms of that was in IPC delivery. Note that all of my delayed Msg_Update() instances seem to have the hwc message about non-pmem memory interleaved between the send and the receive.
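The split between delivery time and parent-process time quoted above can be reproduced from the log timestamps. The sketch below is an illustration only: `LINE_RE`, `parse`, and `breakdown` are hypothetical names, the regular expression was written against the GeckoIPC lines quoted in this description (with line-wrap spaces removed), and it assumes the `time:` field is in microseconds, which is what the ~12.5ms figure implies.

```python
import re

# Timestamp, direction verb, and message name of one GeckoIPC log line.
# "Sending reply" is listed before "Sending" so the longer verb wins.
LINE_RE = re.compile(
    r"\[time:(?P<usec>\d+)\]\[\d+(?:->|<-)\d+\]\[\w+\] "
    r"(?P<verb>Sending reply|Sending|Received) (?P<msg>\w+)\("
)

SAMPLE_LOG = """\
I/GeckoIPC(11713): [time:1393560810100034][11713->11550][PLayerTransactionChild] Sending Msg_PTextureConstructor([TODO])
I/GeckoIPC(11713): [time:1393560810100983][11713->11550][PLayerTransactionChild] Sending Msg_PGrallocBufferConstructor([TODO])
I/GeckoIPC(11550): [time:1393560810113251][11550<-11713][PLayerTransactionParent] Received Msg_PTextureConstructor([TODO])
I/GeckoIPC(11550): [time:1393560810113503][11550<-11713][PLayerTransactionParent] Received Msg_PGrallocBufferConstructor([TODO])
E/memalloc(11550): /dev/pmem: No more pmem available
I/GeckoIPC(11550): [time:1393560810132156][11550->11713][PLayerTransactionParent] Sending reply Reply_PGrallocBufferConstructor([TODO])
I/GeckoIPC(11713): [time:1393560812133136][11713->11550][PLayerTransactionChild] Sending Msg_Update([TODO])
I/GeckoIPC(11550): [time:1393560812146536][11550<-11713][PLayerTransactionParent] Received Msg_Update([TODO])
I/GeckoIPC(11550): [time:1393560812147369][11550->11713][PLayerTransactionParent] Sending reply Reply_Update([TODO])
""".splitlines()


def parse(line):
    """Return (usec, verb, message) for a GeckoIPC line, else None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return int(m["usec"]), m["verb"], m["msg"]


def breakdown(lines, name):
    """Return (delivery_ms, parent_ms) for the first exchange of `name`.

    delivery_ms: child "Sending" to parent "Received".
    parent_ms:   parent "Received" to parent "Sending reply".
    """
    stamps = {}
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue  # non-IPC lines such as memalloc or hwcomposer output
        usec, verb, msg = parsed
        if msg in (f"Msg_{name}", f"Reply_{name}"):
            stamps.setdefault(verb, usec)  # keep the first occurrence only
    delivery_ms = (stamps["Received"] - stamps["Sending"]) / 1000.0
    parent_ms = (stamps["Sending reply"] - stamps["Received"]) / 1000.0
    return delivery_ms, parent_ms


if __name__ == "__main__":
    for name in ("PGrallocBufferConstructor", "Update"):
        delivery, parent = breakdown(SAMPLE_LOG, name)
        print(f"{name}: delivery {delivery:.2f} ms, parent {parent:.2f} ms")
```

The log does not show when the reply arrives back in the child, so the two figures need not add up exactly to the total that SendAndWait() reports.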
It's not yet clear to me what is going on, but it seems we're going to have a bad time if we cannot deliver the gfx IPC messages quickly and reliably. It's hard to prioritize work in the parent process if we are delayed in finding out that we need to do it.

Now that APZ has made composition asynchronous, we may be running hwc more frequently than in the past. This is now keeping the compositor thread busy, preventing it from picking up child messages in a timely fashion.

Are the hwc operations synchronous? Can they be moved to a separate thread? Or are there other ways we could better coordinate hwc with the client layerizing? Are there other ways we could prioritize the IPC and gfx messages so that we have some limit on how long they will be delayed? Any other theories? Or is this behavior expected?

I'll take this for now to investigate.

Sotaro Ikeda [:sotaro]

Comment 1

11 years ago

On ICS, hwc operations are synchronous. Starting with gonk JB they become asynchronous through fence support. That work is bug 957323.

Sotaro Ikeda [:sotaro]

Comment 2

11 years ago

gralloc buffer allocation is synchronous. It could be mitigated by Bug 959089. Bug 950112 is a more aggressive approach, but it cannot be used on ICS because of a problem in the gralloc buffer drivers. On KK, it seems possible.

Sotaro Ikeda [:sotaro]

Comment 3

11 years ago

gralloc buffer allocation latency is an intrinsic problem. We expect that tiling could mitigate some of it, as in Bug 963073.

framebuffer_debug.patch, 11 years ago, Ben Kelly [:bkelly, not reviewing], 3.41 KB, patch
fence_debug.patch, 11 years ago, Ben Kelly [:bkelly, not reviewing], 5.16 KB, patch
Improve composition thread warning debug., 11 years ago, Ben Kelly [:bkelly, not reviewing], 3.59 KB, patch
Improve composition thread warning debug. (v2), 11 years ago, Ben Kelly [:bkelly, not reviewing], 3.68 KB, patch
profile_gralloc_unlock.png, 11 years ago, Ben Kelly [:bkelly, not reviewing], 12.71 KB, image/png
Improve composition thread warning debug. (v3), 11 years ago, Ben Kelly [:bkelly, not reviewing], 3.37 KB, patch, BenWa: review+
Improve composition thread warning debug. (v4), 11 years ago, Ben Kelly [:bkelly, not reviewing], 4.10 KB, patch, bkelly: review+
nexus-5 logs(manual logs added), 11 years ago, Sotaro Ikeda [:sotaro], 1.23 MB, text/plain
temporary patch - add log around ioctl(), 11 years ago, Sotaro Ikeda [:sotaro], 1.60 KB, patch