Closed
Bug 1163820
Opened 9 years ago
Closed 9 years ago
Video decode takes 30% more CPU compared to Android
Categories
(Firefox OS Graveyard :: Performance, defect)
Tracking
(tracking-b2g:-)
RESOLVED
WORKSFORME
tracking-b2g | - |
People
(Reporter: ntroast, Assigned: slee)
Details
(Keywords: perf, Whiteboard: [caf priority: p2][CR 830845])
Attachments
(2 files)
Gaia: 83a63e0e6fcc22c6a74b06ef77b88d5049719cad Gecko: 4f049c08e74aab39ee81e7c6e7fcee311a1868b0 Attached is a comparison between Android and Firefox OS during video playback. 5x 20second kernel traces were averaged while playing Big Buck Bunny trailer in 720p. http://mirror.cessen.com/blender.org/peach/trailer/trailer_720p.mov I disassembled the mov and reassembled in a mp4 container with the following avconv -i inputfile.mov -c copy outputfile.mp4
Reporter | ||
Updated•9 years ago
|
Summary: Video decode takes 44% more CPU compared to Android → Video decode takes 30% more CPU compared to Android
Reporter | ||
Comment 1•9 years ago
|
||
Sorry % Increase is probably not the right statistic here. This is better :D (1 - (1 / (664.732 / 461.537))) * 100 = ~30%
Comment 2•9 years ago
|
||
Is this a regression in 2.2 or the first time doing this kind of power consumption measurement?
Flags: needinfo?(ntroast)
Reporter | ||
Comment 3•9 years ago
|
||
2.2 is the first to support L, so I can not determine whether this is a regression or not.
Flags: needinfo?(ntroast)
Assignee | ||
Comment 5•9 years ago
|
||
Hi Nicholas, Can you tell me how you get the data? Thanks.
Flags: needinfo?(ntroast)
Reporter | ||
Comment 6•9 years ago
|
||
Hi Steven, Have a look at this tutorial about ftrace http://bamboopuppy.com/ftrace-android-profiling/ This one looks good too http://blog.fpmurphy.com/2014/05/kernel-tracing-using-ftrace.html Setting the "sched:*" event should be good enough to see how long each process spent on the CPU
Flags: needinfo?(ntroast)
Assignee | ||
Comment 7•9 years ago
|
||
Hi Nicolas, Thanks for providing the information. But it seems that we should change some kernel configs to enable the tracers? I cannot get any data from my nexus 5. Even I set the pid to set_ftrace_pid, set 1 to tracing_on. From comment 0, you mentioned that you get the result by testing 5 times and every time capturing 20 seconds of data. Then you average them, is it right? And the result shows that FxOS uses more 203.186 ms of CPU time than Android. So it means that when playing 20 seconds video clip, FxOS uses 203.186 ms more CPU time than Android. I checked attachment 8604335 [details]. Here are some reasons about why FxOS uses more CPU time than Android. 1. FxOS has more IPC transmitions. 2. FxOS has internal synchronization mechanism. For Android, the encoded data is sent to codec then render it. On FxOS, we get the decoded data from HW codec, keep it in a state machine, and decide when we can display it. The follow is a. mediaserver gets the frame from video app b. when the frame is decoded, mediaserver sends the data back to video app c. MediaDecoderStateMachine finds that the frame should be rendered, it sends to Compositor(b2g process) We have more IPCs than Android(b and c). That can explain why "Binder" and "ImageBridgeChild"(40.551+21.850-17.145=45.256ms) are higher in FxOS. For "Media Decode #1" and "Media S~hine #1"(74.514+26.194=100.708 ms), they are the state machine in FxOS for synchronization(c). These 2 items contribute about 3/4 of the time. For 2 things I still don't have any idea, 1. "gle.aac.decoder" uses 151.967 - 124.305 = 27.662ms on FxOS. 2. FxOS has "FastMixer" which costs 39.376 ms. I will try to find out why FxOS uses FastMixer first.
Flags: needinfo?(slee)
Reporter | ||
Comment 8•9 years ago
|
||
(In reply to StevenLee[:slee] from comment #7) > Hi Nicolas, > > Thanks for providing the information. But it seems that we should change > some kernel configs to enable the tracers? I cannot get any data from my > nexus 5. Even I set the pid to set_ftrace_pid, set 1 to tracing_on. This sets all sched events to be traced echo sched:* > /sys/kernel/debug/tracing/set_event This enables tracing echo 1 > /sys/kernel/debug/tracing/tracing_on This shows the trace output cat /sys/kernel/debug/tracing/trace Hopefully after those steps you will be able to get data. (In reply to StevenLee[:slee] from comment #7) > From comment 0, you mentioned that you get the result by testing 5 times and > every time capturing 20 seconds of data. Then you average them, is it right? > And the result shows that FxOS uses more 203.186 ms of CPU time than > Android. So it means that when playing 20 seconds video clip, FxOS uses > 203.186 ms more CPU time than Android. Sorry, I forgot to mention I also divided the results by 20, so in 1 second FxOS uses 203.186ms more CPU time than Android.
Updated•9 years ago
|
Whiteboard: [CR 836789]
Updated•9 years ago
|
Whiteboard: [CR 836789] → [caf priority: p2][CR 836789]
Reporter | ||
Updated•9 years ago
|
Whiteboard: [caf priority: p2][CR 836789] → [caf priority: p2][CR 830845]
Reporter | ||
Comment 9•9 years ago
|
||
Hi Steven, Thanks for pointing out fastmixer. I disabled fastmixer and saw a 93ms improvement. I see in Android that "At the time AudioFlinger creates a normal mixer thread, it decides whether or not to also create a fast mixer thread."[1] Does FxOS also do this decision making? [1] https://source.android.com/devices/audio/latency_design.html#mixerThreads
Flags: needinfo?(slee)
Assignee | ||
Comment 10•9 years ago
|
||
(In reply to Nicholas Troast [:ntroast] from comment #9) > I see in Android that "At the time AudioFlinger creates a normal mixer > thread, it decides whether or not to also create a fast mixer thread."[1] > Does FxOS also do this decision making? > [1] https://source.android.com/devices/audio/latency_design.html#mixerThreads Hi Nicholas, I found that FxOS does not explicitly request to use fast mixer thread. When we are creating an AudioTrack, the code flow just goes to the path of using the fast mixer thread. I will try to figure out whether we can or cannot prevent that. BTW, afterI flash the whole image, ftrace works. But I can enable only sched:* events or the kernel will hang then reboot. And the result only shows kernel functions. Should I need to add some flags when I compile the source code? Can you also upload an updated result? Thanks
Reporter | ||
Comment 11•9 years ago
|
||
Hi Steven, I tried on flame and I am able to collect the ftrace logs. You should see output like the following: Homescreen-1108 [000] ...2 144.521995: sched_switch: prev_comm=Homescreen prev_pid=1108 prev_prio=121 prev_state=S ==> next_comm=Timer next_pid=1169 next_prio=121 This raw output then needs to be run through a tool like pytimechart or kernelshark to give you a more human readable representation. I am not aware of any flags that need to be set. If you are still having trouble maybe I can ask around. If you have a patch I can certainly test it and provide feedback.
Assignee | ||
Comment 12•9 years ago
|
||
(In reply to Nicholas Troast [:ntroast] from comment #11) > Homescreen-1108 [000] ...2 144.521995: sched_switch: > prev_comm=Homescreen prev_pid=1108 prev_prio=121 prev_state=S ==> > next_comm=Timer next_pid=1169 next_prio=121 Hi Nicholas, Thanks for providing the information. I tried to install pytimechart on my Linux and pytimechart cannot parse my data correctly. I will try to profile Video app by our built-in profiler and see what functions consume much CPU.
Comment 13•9 years ago
|
||
It's not regression, remove 2.2?. Steven will continue the investigation.
Assignee: nobody → slee
blocking-b2g: 2.2? → ---
tracking-b2g:
--- → +
Keywords: perf
Whiteboard: [caf priority: p2][CR 830845] → [caf priority: p2][CR 830845][perf-wanted]
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(slee)
Comment 14•9 years ago
|
||
> 2. FxOS has "FastMixer" which costs 39.376 ms. > > I will try to find out why FxOS uses FastMixer first. There are 2 reasons why FxOS uses FastMixer in this test: 1. FastMixer is used when AudioTrack AUDIO_OUTPUT_FLAG_FAST is set in components OpenSL ES (frameworks/wilhelm), ToneGenerator, and SoundPool [1]. FxOS's libcubeb uses OpenSL ES. 2. To use FastMixer, the input sample rate needs to be equal to output sample rate. In this case, the sample rate in test video is 48000 which matches the output sample rate. If you use other video which the audio sample rate is not 48k, the normal mixer will be used instead of FastMixer. However, it just shifts the CPU loading to normal mixer, I don't think it has performance improvement unless there is bug in FastMixer. Because Android's video player uses AudioTrack directly without AUDIO_OUTPUT_FLAG_FAST, so FastMixer is not in your chart (Android part). [1] https://source.android.com/devices/audio/latency_design.html
Reporter | ||
Comment 15•9 years ago
|
||
(In reply to Alfredo Yang from comment #14) > If you use other video which the audio sample rate is not 48k, the normal > mixer will be used instead of FastMixer. However, it just shifts the CPU > loading to normal mixer, I don't think it has performance improvement unless > there is bug in FastMixer. From what I understand FastMixer improves audio latency at the cost of performance. Is this correct? Is audio latency a concern during video playback? > Because Android's video player uses AudioTrack directly without > AUDIO_OUTPUT_FLAG_FAST, so FastMixer is not in your chart (Android part). Why does Android use AudioTrack directly? Should FxOS be doing the same?
Flags: needinfo?(ayang)
Comment 16•9 years ago
|
||
(In reply to Nicholas Troast [:ntroast] from comment #15) > From what I understand FastMixer improves audio latency at the cost of > performance. Is this correct? > Is audio latency a concern during video playback? Audio latency is not concerned in video playback. On FxOS, we ask for high latency [1]; however, in Android's OpenSL implementation, it chooses to use FastMixer. It could be a bug in Android's OpenSL layer. > > > Because Android's video player uses AudioTrack directly without > > AUDIO_OUTPUT_FLAG_FAST, so FastMixer is not in your chart (Android part). > > Why does Android use AudioTrack directly? Should FxOS be doing the same? It is out of my scope to answer it. IMHO, it's better to support an open standard like OpenSL instead of a specific platform api. [1] https://dxr.mozilla.org/mozilla-central/source/dom/media/AudioStream.cpp?from=AudioStream.cpp#388 [2] http://androidxref.com/4.4.2_r2/xref/frameworks/wilhelm/src/android/AudioPlayer_to_android.cpp#1430
Flags: needinfo?(ayang)
Comment 17•9 years ago
|
||
Dylan, per comment 16, there is possible issue in Android. it should be listed as known issue.
Flags: needinfo?(doliver)
Whiteboard: [caf priority: p2][CR 830845][perf-wanted] → [caf priority: p2][CR 830845]
Comment 18•9 years ago
|
||
Removing dependency on CAF 2.2, per agreement with partner.
No longer blocks: CAF-v2.2-metabug
Flags: needinfo?(doliver)
Updated•9 years ago
|
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Updated•9 years ago
|
Blocks: CAF-v2.2-metabug
No longer blocks: CAF-v2.2-metabug
You need to log in
before you can comment on or make changes to this bug.
Description
•