Fenix starts video timer on android that never terminates
Categories
(GeckoView :: Media, defect)
Tracking
(Performance Impact:high, relnote-firefox 127+, firefox127+ fixed, firefox128+ fixed, firefox129+ fixed)
People
(Reporter: bas.schouten, Assigned: jhlin)
References
(Blocks 1 open bug)
Details
Attachments
(5 files)
1.23 KB,
text/plain
|
Details | |
4.14 KB,
text/plain
|
Details | |
2.91 KB,
text/plain
|
Details | |
4.05 KB,
text/plain
|
Details | |
48 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
pascalc
:
approval-mozilla-release+
|
Details | Review |
This leads a flawed Android OS battery stats calculator to attribute a lot of fake battery usage to Firefox.
Steps to reproduce:
- Have Firefox installed.
- Force stop. (App Info -> Force Stop, i.e. a full force stop)
- Wait for an hour or two.
Reporter | ||
Comment 1•8 months ago
|
||
I've verified using adb shell dumpsys media.resource_manager
that after watching a couple of videos and terminating the Firefox process, the Firefox media process will continue things like:
Client:
Id: -5476158183275902000
Name: c2.exynos.h264.decoder
Resources:
non-secure-codec/hw-video-codec:[]:1
battery/hw-video-codec:[]:1
These will not go away until a device restart as far as I can tell.
Reporter | ||
Comment 2•8 months ago
•
|
||
More specifically, this works for me:
- Restart phone
adb shell dumpsys media.resource_manager
shows no activity- Start firefox on a blank page and go to https://www.vimeo.com/
- Once the video on the front page (not logged in) starts
adb shell dumpsys media.resource_manager
shows an h264 decoder - Press the app view button and swipe app up to close
- Long press firefox -> App Info -> Force Stop
adb shell dumpsys media.resource_manager
shows the codec still there- Video number in batterystats --usage goes up
Reporter | ||
Comment 3•8 months ago
|
||
Relevant bit from the resource manager, after killing Firefox.
Comment 4•8 months ago
•
|
||
I tried the steps above on a pixel 6 and s21 with android 13 and couldn't get it to reproduce.
However, I tried it on a s24 ultra with android 14 and got the same results as Bas.
I added the logs I got with the s24 ultra.
Comment 5•8 months ago
|
||
I have Firefox Beta installed on my Samsung Galaxy S23. I think for me it's not only an issue of attribution but my battery really does drain faster. I also noticed that using Samsung's battery saving mode seems to "fix" the battery drain. Maybe this information is somehow useful.
Comment 6•8 months ago
|
||
Comment 7•8 months ago
|
||
jhlin has been able to reproduce this in the emulator and has noticed the following in the logcat
06-14 10:58:00.504 437 3189 W ResourceManagerService: addResource called with untrusted pid 12284 or uid 10190, using calling pid 12705, uid 10190
Updated•8 months ago
|
Reporter | ||
Comment 8•8 months ago
|
||
(In reply to kirill.rakhman from comment #5)
I have Firefox Beta installed on my Samsung Galaxy S23. I think for me it's not only an issue of attribution but my battery really does drain faster. I also noticed that using Samsung's battery saving mode seems to "fix" the battery drain. Maybe this information is somehow useful.
There should be a way to verify if this is true or a placebo effect if you have access to adb. (I felt the same and concluded it was probably not true on my Pixel)
Use adb shell dumpsys batterystats --usage
, it will have an estimated usage and an actual usage. For me the difference between the two was roughly equal to the 'video' attribution of Firefox. Suggesting at least that part of the reporting was 'not real'.
Assignee | ||
Comment 9•8 months ago
|
||
3 processes involved:
- tab/content process(
:tab*
), where thevideo
element is run. It callsMediaCodec
API remoted in media process via binder/service. - media process(
:media
) runs the AndroidMediaCodec
API that calls media resource manager service (also via binder). mediaserver
process, an Android system process where the resource manager is run.
Now, some findings (on a Pixel 7/Android 14 emulator):
First, client/resource info in media.resource_manager
:
Pid: 10806
Priority: <unknown>
Client:
Id: -5476376626430055712
Name: c2.goldfish.h264.decoder
Resources:
non-secure-codec/video-codec:[]:1
battery/video-codec:[]:1
Adding/Removing the above info is done by Android framework code, as shown in the event log (captured in anther test session):
06-14 15:12:34 removeResource(pid 4796, uid 10192 clientId -5476376626430051536)
06-14 15:12:34 removeResource(pid 4796, uid 10192 clientId -5476376626430051536)
06-14 15:12:34 markClientForPendingRemoval(pid 4796, clientId -5476376626430051536)
06-14 15:12:33 addResource(pid 4796, uid 10192 clientId -5476376626430051536, resources battery/video-codec:[]:1 )
06-14 15:12:33 addResource(pid 4796, uid 10192 clientId -5476376626430051536, resources graphic-memory/unspecified:[]:0 )
06-14 15:12:33 addResource(pid 4796, uid 10192 clientId -5476376626430051536, resources non-secure-codec/video-codec:[]:1 )
The implementation gets caller pid
from binder IPC state, so the event log shows tab/content process ID, but there is a hidden service permission that the resource manager checks, and when fails, pid
will be replaced with media process ID, as the 'untrusted pid' message shown in comment 7, hence the different pid
s in client info and event log. Android system media services always show matching pid
s because they have the special permission and can use private API.
Second, the persistent client/resource info issue:
Even though pid
seems problematic as shown above, the resource manager behaves as expected when the codec lifecycle is managed properly (i.e., create/release via MediaCodec
API) because it translates pid
automatically.
However, if the codec is not released explicitly, things could go wrong. The resource manager uses the binder death technique to clean the client/resource info when client process is killed, but it doesn't check and replace the pid
in this case. (Again, system media services work as expected because of the special permission.)
We also use the same way to handle 'tab/content process is killed unexpectedly' situation. Unfortunately, when closing the app via Recents screen UI, Android actually kills the processes in a "undesired" order:
06-14 16:19:08.308 562 581 I ActivityManager: Killing 9595:org.mozilla.geckoview_example:gpu/u0a194 (adj 905): remove task
06-14 16:19:08.308 562 581 I ActivityManager: Killing 9862:org.mozilla.geckoview_example:media/u0a194 (adj 925): remove task
06-14 16:19:08.308 562 581 I ActivityManager: Killing 9561:org.mozilla.geckoview_example:tab16/u0a194 (adj 905): remove task
06-14 16:19:08.308 562 581 I ActivityManager: Killing 9515:org.mozilla.geckoview_example/u0a194 (adj 915): remove task
Since tab/content process is killed after media process, binderDied()
wasn't called. Some device/Android version combinations might behave differently. I guess that's probably why Marc couldn't reproduce the issue earlier.
Android document mentions Service.onTaskRemoved() is the way to handle this "remove task" situation, but so far I have not been able to make it work. Will keep trying.
Comment 10•8 months ago
|
||
(In reply to John Lin [:jhlin][:jolin] from comment #9)
The implementation gets caller
pid
from binder IPC state, so the event log shows tab/content process ID,
Why is the IPC caller the tab process? Doesn't that mean that we're calling some Android media function from the tab process?
Updated•8 months ago
|
Assignee | ||
Comment 11•8 months ago
|
||
(In reply to Markus Stange [:mstange] from comment #10)
(In reply to John Lin [:jhlin][:jolin] from comment #9)
The implementation gets caller
pid
from binder IPC state, so the event log shows tab/content process ID,Why is the IPC caller the tab process? Doesn't that mean that we're calling some Android media function from the tab process?
Not directly. All Android media functions are called in media process, and tab process calls into media process through binder.
I've tried other emulators with different Android versions, so far only Android 14 has this issue. Android 12 and 13, while also pass tab process pid to resource manager, will use media process pid in overridePid()
and remove the resouce info successfully.
Comment 12•8 months ago
|
||
I see... I think we have to look for a way to set things up in such a way that AIBinder_getCallingPid()
will return the media process pid and not the tab process pid.
I don't think cleaning things up ourselves is an option. Android can kill any process at any time, so our cleanup would never run reliably.
Assignee | ||
Comment 13•8 months ago
•
|
||
(In reply to Markus Stange [:mstange] from comment #12)
I see... I think we have to look for a way to set things up in such a way that
AIBinder_getCallingPid()
will return the media process pid and not the tab process pid.
I tried to access the hidden method using Java reflection but got NoSuchMethodException
.
I'll see if moving MediaCodec
calls out of the binder thread can fix this.
I don't think cleaning things up ourselves is an option. Android can kill any process at any time, so our cleanup would never run reliably.
Good point. I couldn't get OnTaskRemoved()
to work anyway.
Comment 14•8 months ago
|
||
Updated•8 months ago
|
Comment 15•8 months ago
|
||
John, can you try running the code which calls MediaCodec.createByCodecName
asynchronously, outside the Binder.execTransact
call? Then AIBinder_getCallingPid
might return the media process's pid.
I also looked for a way to avoid getting into the branch that calls AIBinder_getCallingPid
, by pre-initializing mPid
, but it seems the only way to pre-initialize mPid
is to call MediaCodec.createByCodecNameForClient
which is a system-only API that requires special permissions.
Comment 16•8 months ago
|
||
Comment 17•8 months ago
|
||
(In reply to Markus Stange [:mstange] from comment #15)
John, can you try running the code [...] asynchronously
Heh, this might not be as easy as I initially envisioned; the calling side expects a synchronous return value and it's not clear where the asynchronicity should be introduced.
But Jeff just had another idea: We can maybe just wrap our call to createByCodecName
with a call to Binder.clearCallingIdentity
. He's testing this now.
Comment 18•8 months ago
|
||
Here's what I'm trying:
diff --git a/mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/LollipopAsyncCodec.java b/mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/LollipopAsyncCodec.java
index aaf8810bbb05..e55d52698b52 100644
--- a/mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/LollipopAsyncCodec.java
+++ b/mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/LollipopAsyncCodec.java
@@ -8,6 +8,7 @@ import android.media.MediaCodec;
import android.media.MediaCodecInfo.CodecCapabilities;
import android.media.MediaCrypto;
import android.media.MediaFormat;
+import android.os.Binder;
import android.os.Build;
import android.os.Bundle;
import android.os.Handler;
@@ -134,7 +135,9 @@ import org.mozilla.gecko.util.HardwareCodecCapabilityUtils;
}
/* package */ LollipopAsyncCodec(final String name) throws IOException {
- mCodec = MediaCodec.createByCodecName(name);
+ long token = Binder.clearCallingIdentity();
+ mCodec = MediaCodec.createByCodecName(name);
+ Binder.restoreCallingIdentity(token);
}
@Override
Comment 19•8 months ago
|
||
Android is getting confused about which process is owning a media resource;
in different places it stores either our media process's pid or our tab process's pid.
Then, when our processes shut down (or rather when they killed by Android), this
confusion leads Android to do improper cleanup of its video resource accounting.
This causes it to keep increasing the "video" time for Firefox (in
adb shell dumpsys batterystats --usage
) even while no Firefox processes
are running. This in turn causes the "Battery" UI in Android's system preferences to
show large values for Firefox.
To work around this attribution error, we try to ensure that Android tracks the
same PID in all the places.
Specifically, the problem is that the constructor of
android::MediaCodec::ResourceManagerServiceProxy::ResourceManagerServiceProxy
is getting the owning PID from AIBinder_getCallingPid()
. The Firefox media process
calls MediaCodec.createByCodecName
under a binder transaction from the tab process,
specifically while the tab process is calling ICodec$Stub$Proxy.configure
inside
CodecProxy.init
.
This fix makes it so that AIBinder_getCallingPid()
returns the media process's PID,
by wrapping the call to MediaCodec.createByCodecName
in calls to
Binder.clearCallingIdentity
/ Binder.restoreCallingIdentity
. The Android
documentation says the following about clearCallingIdentity
:
This can be useful if, while handling an incoming call, you will be calling
on interfaces of other objects that may be local to your process and need to
do permission checks on the calls coming into them (so they will check the
permission of your own local process, and not whatever process originally
called you).
This matches what we're trying to do here quite well.
Firefox media process stack showing MediaCodec.createByCodecName getting into
AIBinder_getCallingPid()
:
AIBinder_getCallingPid
android::MediaCodec::ResourceManagerServiceProxy::ResourceManagerServiceProxy(int, unsigned int, std::__1::shared_ptr<aidl::android::media::IResourceManagerClient> const&, android::MediaCodec*)[libstagefright.so]
android::MediaCodec::MediaCodec(android::sp<android::ALooper> const&, int, unsigned int, std::__1::function<android::sp<android::CodecBase>(android::AString const&, char const*)>, std::__1::function<int (android::AString const&, android::sp<android::MediaCodecInfo>*)>)
android::MediaCodec::CreateByComponentName(android::sp<android::ALooper> const&, android::AString const&, int*, int, unsigned int)
android::JMediaCodec::JMediaCodec(_JNIEnv*, _jobject*, char const*, bool, bool, int, int)
android_media_MediaCodec_native_setup(_JNIEnv*, _jobject*, _jstring*, unsigned char, unsigned char, int, int)
android.media.MediaCodec.<init>
android.media.MediaCodec.<init>
android.media.MediaCodec.createByCodecName
org.mozilla.gecko.media.LollipopAsyncCodec.<init>
Firefox tab process stack showing ICodec$Stub$Proxy.configure
binder call:
_ioctl
ioctl
android::IPCThreadState::talkWithDriver(bool)
android::IPCThreadState::waitForResponse(android::Parcel*, int*)
android::IPCThreadState::transact(int, unsigned int, android::Parcel const&, android::Parcel*, unsigned int)
android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)
android_os_BinderProxy_transact(_JNIEnv*, _jobject*, int, _jobject*, _jobject*, int)
art_quick_generic_jni_trampoline
art_quick_invoke_stub
android.os.BinderProxy.transact
org.mozilla.gecko.media.ICodec$Stub$Proxy.configure
org.mozilla.gecko.media.CodecProxy.init
org.mozilla.gecko.media.RemoteManager.createCodec
org.mozilla.gecko.media.CodecProxy.create
mozilla::java::CodecProxy::Create
mozilla::RemoteVideoDecoder::Init()
Comment 20•8 months ago
|
||
Comment 21•8 months ago
|
||
This patch fixes the issue in local testing on Android 14, and video still works.
We haven't done any testing on Android 13 yet.
Comment 22•8 months ago
|
||
Backed out for causing lint failures in LollipopAsyncCodec.java
- Backout link
- Push with failures
- Failure Log
- Failure line: TEST-UNEXPECTED-ERROR | /builds/worker/checkouts/gecko/mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/LollipopAsyncCodec.java:144:10 | Variable 'token' should be declared final. (com.puppycrawl.tools.checkstyle.checks.coding.FinalLocalVariableCheck)
Updated•8 months ago
|
Comment 23•8 months ago
•
|
||
I've updated the patch to add the final
.
Try push before that change: https://treeherder.mozilla.org/jobs?repo=try&revision=68c8e480baa6650bdffa261a06b3c884897f988d (thanks Kaya)
Comment 24•8 months ago
|
||
Comment 25•8 months ago
|
||
bugherder |
Updated•8 months ago
|
Comment 26•8 months ago
|
||
Markus, given the patch looks relatively simple and given the high impact on perceived performance, is this something you think we should get into the 127.0.2 dot release that is scheduled to happen on Monday?
If you want to do it, we'll need to request uplift today (both to mozilla-beta and mozilla-release) so that it's in tomorrow's beta.
Comment 27•8 months ago
•
|
||
Yes, the plan is to try to uplift to release. I talked with dmeehan yesterday and we can request uplift as late as Friday. I'll request beta uplift now.
Comment 28•8 months ago
|
||
Comment on attachment 9408514 [details]
Bug 1902077 - Work around a battery usage misattribution bug in Android 14. r=jolin
Beta/Release Uplift Approval Request
- User impact if declined: We get hit with a battery usage misattribution bug in Android 14 that can make it seem like Firefox is continually consuming battery even when it's not running.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): We're not certain what behaviour changes this could cause on all of the different Android versions and devices. It's possible that this could break video playback in some circumstances.
We've done some manual testing on Android 14 and 13 and everything seems fine. This code is triggered anytime there's video playback so at least it's well covered.
- String changes made/needed:
- Is Android affected?: Yes
Comment 29•8 months ago
•
|
||
Release Note Request (optional, but appreciated)
[Why is this notable]: It fixes a reporting error that many users are complaining about and that made us look worse than we are.
[Affects Firefox for Android]: yes, only
[Suggested wording]: Fix battery usage reporting in Firefox for Android: Firefox no longer triggers a bug in Android 14 which was causing Android to report massively inflated battery usage on certain devices when Firefox was not running. You may need to restart your phone for the fix to take effect.
[Links (documentation, blog post, etc)]: n/a
Comment 30•8 months ago
|
||
I've tested this change on Android 13. Video worked and there were no leaks of decoders when looking at adb shell dumpsys media.resource_manager
Comment 31•8 months ago
|
||
I also tested this with DRM video and that works too.
Comment 32•8 months ago
|
||
Comment on attachment 9408514 [details]
Bug 1902077 - Work around a battery usage misattribution bug in Android 14. r=jolin
Approved for 128.0b6.
Updated•8 months ago
|
Comment 33•8 months ago
|
||
uplift |
Comment 34•8 months ago
|
||
jnicol reports that video still works on Android 5.1
Comment 35•8 months ago
|
||
Comment on attachment 9408514 [details]
Bug 1902077 - Work around a battery usage misattribution bug in Android 14. r=jolin
Beta/Release Uplift Approval Request
- User impact if declined: We get hit with a battery usage misattribution bug in Android 14 that can make it seem like Firefox is continually consuming battery even when it's not running.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): We're not certain what behaviour changes this could cause on all of the different Android versions and devices. It's possible that this could break video playback in some circumstances, given the testing that we've done this doesn't seem super likely.
We've done some manual testing on Android 14, 13, 5.1 and everything seems fine. This code is triggered anytime there's video playback so at least it's well covered.
- String changes made/needed:
- Is Android affected?: Yes
Comment 36•8 months ago
|
||
Comment on attachment 9408514 [details]
Bug 1902077 - Work around a battery usage misattribution bug in Android 14. r=jolin
Approved for 127.0.2, thanks.
Comment 37•8 months ago
|
||
uplift |
Updated•8 months ago
|
Updated•8 months ago
|
Updated•8 months ago
|
Description
•