Closed Bug 882552 Opened 7 years ago Closed 7 years ago

Slow OMX codec behavior

Categories

(Core :: Audio/Video, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED FIXED
1.1 QE3 (26jun)
blocking-b2g leo+
Tracking Status
firefox22 --- wontfix
firefox23 --- wontfix
firefox24 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- wontfix
b2g-v1.1hd --- fixed

People

(Reporter: roc, Assigned: sotaro)

References

Details

(Keywords: perf, Whiteboard: c= u=1.1 , [TD-42469])

Attachments

(3 files, 5 obsolete files)

See https://bugzilla.mozilla.org/show_bug.cgi?id=880601#c17.

Sometimes my dev phone gets into a state where calls into libstagefright codec 'read' calls are often slow (50-100ms or more, for audio or video). The data-source readAt calls are not blocking, they only read cached data, so it's probably not a network or caching issue. This makes video and audio playback intolerably jerky (although it does stay in sync now). Other times when I run the testcase, this doesn't happen at all and all the 'read' calls are fast.

See http://people.mozilla.com/~roc/troy.3gp.
One possibility is that omx codec driver task is blocked in kernel by other task. It might happen when data is written to flash/sd.
What would be the best way to try to debug this?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #2)
> What would be the best way to try to debug this?

If it is the case, write to MediaCache file's becomes slow. I checked write to the file. And it is not the problem in this case.
Attached patch patch - add logout (obsolete) — Splinter Review
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
> Created attachment 763127 [details] [diff] [review]
> patch - add logout

By applying the patch, I confirmed the OMXCodec::read() delay. I feel that Binder IPC overhead from cpu usage point of view is not significant, but IPC turn around time is big.
Similar thing already happened in Bug 864180. It happened because of Bug 879529.  Decreasing the thread priority causes serious IPC latency. From this, it become clear that thread scheduling affected IPC latency significantly.
The patch move audio codec to the app process and use sw codec if there is a video track. I confirmed that this patch reduces OMXCodec::read()'s time.

Current usage of audio hw codec is for long time music playback. No necessity to use hw audio codec for movie.
On ics_chocolate(unagi, inari), The fix of Bug 864230 is not applied. It is necessary to apply the patch to check OMXCodec::read() latency.
log of "adb shell ps -p -t" during Youtube video playback by video app.
(In reply to Sotaro Ikeda [:sotaro] from comment #9)
> Created attachment 763140 [details]
> thread priority log during youtube video playback
> 
> log of "adb shell ps -p -t" during Youtube video playback by video app.

Binder thread's priority is very low.
By applying the patch. Binder thread's default priority(nice) from 18 to 0.
By applying attachment 763164 [details] [diff] [review], OMXCodec::read()'s duration becomes much better. Almost all audio track's read() becomes within 40ms. Video track's read() also becomes better but still sometime takes longer time. From the log, it seems related to media cache's related thing.
The log is a hard to understand because you can't see when a ReadVideo call started, so you can't tell which readAts belong to which ReadVideo call. If ReadVideo calls never overlapped, I can't explain

06-15 18:31:55.137   449   458 I GonkOmx : MediaStreamSource::readAt() duration 14
06-15 18:31:55.477   449   458 I GonkOmx : MediaStreamSource::readAt() duration 14
06-15 18:31:58.069   449   525 I GonkOmx : MediaStreamSource::readAt() duration 19
06-15 18:32:02.774   449   525 I GonkOmx : MediaStreamSource::readAt() duration 12
06-15 18:32:02.804   449   460 I GonkOmx : MediaStreamSource::readAt() duration 21
06-15 18:32:09.260   449   515 I GonkOmx : MediaStreamSource::readAt() duration 24
06-15 18:32:13.374   449   458 I GonkOmx : MediaStreamSource::readAt() duration 49
06-15 18:32:13.374   449   512 I GonkOmx : OmxDecoder::ReadVideo() 2 duration 44

where it looks like the sum of the readAt durations is much greater than the time under ReadVideo...
Anyway I think we should land the patch in comment #11 and then try to debug the remaining problems separately.
No longer blocks: 883814
Duplicate of this bug: 883814
Attachment #763129 - Attachment is obsolete: true
Attachment #763164 - Flags: review?(mwu)
Attachment #763127 - Attachment is obsolete: true
blocking-b2g: --- → leo?
Assignee: nobody → sotaro.ikeda.g
Priority: -- → P1
Target Milestone: --- → 1.1 QE3 (24jun)
Comment on attachment 763164 [details] [diff] [review]
patch - change Binder Thread's default priority to 0

Looks fine to me, though I think jlebar might have more of an opinion on these things than I do.
Attachment #763164 - Flags: review?(mwu)
Attachment #763164 - Flags: review?(justin.lebar+bug)
Attachment #763164 - Flags: review+
blocking-b2g: leo? → leo+
This is pretty complicated...

The binder (kernel/drivers/staging/android/binder.c) reads and saves the process's niceness when it starts up.  It renices itself all the time, and it occasionally restores itself to the niceness it originally saved.

Without knowing what the binder thread actually does, I agree that restoring itself to niceness 0 probably makes more sense than restoring itself to niceness 18, just because the process happened to have been started with niceness 18.

My main complaint here is that the code in this patch will work only if the process is launched with root permissions.  If we're not root and we try to lower our priority, we'll silently fail.  I'd prefer if we made a ruckus about it, so that we have some chance of discovering if we regress this bug.

Can you add a debug-build fatal assertion that the renice succeeds, and also print to logcat if the renice fails in a release build?
Comment on attachment 763164 [details] [diff] [review]
patch - change Binder Thread's default priority to 0

I'd like to take another quick look here.
Attachment #763164 - Flags: review?(justin.lebar+bug)
From comment #18, add "debug-build fatal assertion" and "print to logcat".
Attachment #763164 - Attachment is obsolete: true
Fix nits.
Attachment #764997 - Attachment is obsolete: true
Comment on attachment 765000 [details] [diff] [review]
patch v3 - change Binder Thread's default priority to 0

Can you review the updated patch?
Attachment #765000 - Flags: review?(justin.lebar+bug)
Comment on attachment 765000 [details] [diff] [review]
patch v3 - change Binder Thread's default priority to 0

> +    LOGE_IF(err, "setpriority failed");

Please make this a more descriptive error message.  Ideally we want something that someone can see and understand that it's a bug that should be reported.

r=me with that fixed.
Attachment #765000 - Flags: review?(justin.lebar+bug) → review+
Apply the comment. Carry "r+ mwu,jlebar".
Attachment #765000 - Attachment is obsolete: true
Keywords: checkin-needed
Status: NEW → ASSIGNED
Keywords: perf
Whiteboard: c= ,
Whiteboard: c= , → c= u=1.1 ,
https://hg.mozilla.org/mozilla-central/rev/880c8be7f35f
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Flags: in-moztrap-
Whiteboard: c= u=1.1 , → c= u=1.1 , [TD-42469]
Duplicate of this bug: 885434
You need to log in before you can comment on or make changes to this bug.