46 bytes, text/x-github-pull-request
|Details | Review | Splinter Review|
46 bytes, text/x-github-pull-request
|Details | Review | Splinter Review|
11.79 KB, text/plain
Gaia: 83a63e0e6fcc22c6a74b06ef77b88d5049719cad Gecko: 4f049c08e74aab39ee81e7c6e7fcee311a1868b0 STR: 1. Flash device 2. Have at least one video on the sdcard 3. Launch the Video app 4. Wait for the Video app to finish loading I can see from kernel trace logs that the CPU is 6x as busy only on the very first launch of the Video app. These are processes that are consuming CPU time after the Video app has finished loading for the very first time, and I do not see these processes consuming CPU on subsequent loads of the app: Video Gecko_IOThread Chrome_ChildThr IPDL Background hwcVsyncThread
Hi Steven, Would you be able to take a look at this bug? Thanks, Nick
Hi Nicholas, Can you provide more detail about your expected or acceptable results for this case? Or are you initially looking for investigation/information on why those processes are busy on first run?
Hi Dylan, I would expect very little activity on the CPU when the Video app is idle. I see that the CPU is busy running the processes I mentioned on the very first launch of the app. Even after minutes have gone by they are still spinning. Acceptable results would be for the Video app to behave the same as the second launch of the app, which means there would be no activity from the mentioned processes when idle.
Hi Nicholas, So it happens even there is only one video file in the SD card? In the first lunch, video app will search video files in the folder, seek to a time, decode and generate thumbnail. If there are bunch of video files, it should consume some time.
Hi Steven, Yes this happens with 1+ files. After the thumbnail is generated it is still spinning a minute later.
Hi Steven, I just checked and added another video to the sdcard then started the Video app and I am seeing the same behavior. So it seems that the processes mentioned keep spinning anytime the Video app detects a new video and needs to generate thumbnails.
Is this on QRD running L? Or also reproducible on Flame? No-Jun, can you please check this on Flame and if you have Nexus running L, on that as well. NI David as well.
Comment #4 is a good description of what the video app does when first launched and when it finds new videos to process. If that process is not ending, there is either a bug in the gaia code (which does have some somewhat complicated logic to serialize a bunch of async thumbnail creation operations) or gecko is doing something weird. I think it should be easy to rule out Gaia involvement. And if we can do that, the we'll need someone like Sotaro to see what is going on with the processes listed in comment #0. Leaving my needinfo set so I remember to come back and look at this.
I wonder if we can reproduce this bug on a flame just using: $ adb shell top -m 10 to see what processes are busy...
(In reply to David Flanagan [:djf] from comment #9) > I wonder if we can reproduce this bug on a flame just using: > > $ adb shell top -m 10 > > to see what processes are busy... I tried the top method you mentioned on our device, and I can see 1% CPU usage from the Video app process on the first launch; after one minute; after thumbnails have generated. I see 0% CPU usage on subsequent launches of the Video app. This is definitely more visible through kernel trace, but I guess with top it at least shows that it is doing something.
So I tested on both nexus-l and flame, with 50 videos preloaded, and ran adb shell top command since I'm not sure how to get the kernel process logs. When the video app is first loaded, the CPU load for the process reaches as high as 45%. Then while it generates video thumbnails, it maintains the high CPU usage of between 30% ~ 40%, until all thumbnails for the video clips are rendered. Once the thumbnail rendering is complete, then the CPU usage drop to 2% without any UI input. CPU usage will go up once the UI input is given. For example, fast scrolling causes CPU to use as much as 20%. When the Video app was killed and relaunched, the CPU process initially jumps up to 30% for the video process for a second, but it quickly drops down to 2% range again. When there was no thumbnail, CPU usage on initial launch was about 7~8% and quickly brought down to 2%. This seems in line with what :djf was suggesting that high CPU usage on initial launch is caused by thumbnail generation.
Hi No-Jun, Please see Comment #10. I am looking at CPU usage of the Video app process and not the total CPU usage. In addition I do not believe this is a good way to test this. Please try collecting kernel trace logs. I have provided information about kernel trace logs in bug 1163820 Please note even though the CPU usage is low (1%); a process that is scheduled on the CPU at a regular interval can affect power.
(In reply to Nicholas Troast [:ntroast] from comment #12) > Hi No-Jun, > > Please see Comment #10. I am looking at CPU usage of the Video app process > and not the total CPU usage. > In addition I do not believe this is a good way to test this. Please try > collecting kernel trace logs. > > I have provided information about kernel trace logs in bug 1163820 > > Please note even though the CPU usage is low (1%); a process that is > scheduled on the CPU at a regular interval can affect power. Oh, I was being unclear, the CPU % I got was strictly for the video process, not including any other b2g process. (top command will only give the uid and not the name of the video process, and uid of the video process can be seen with b2g-ps command) having said that, I'll also try out obtaining the kernel trace logs as well. keeping ni? flag
If I understood the use of ftrace correctly, what I see also matches what Nicholas is seeing. cpu_load value for Video process is much higher on initial launch than subsequent launches, which is same as what I mentioned in comment 11. The heavy CPU load for the video app was clearly seen in this case as well (in real-time). Similiar behavior was seen for the gallery app as well. But when using top command, it also showed that as soon as the thumbnail generation was completed, CPU load for the app process plummeted at the same time, suggesting that this heavy CPU load is due to the thumbnail processing (which does not happen if there is no new pictures loaded) The thumbnail processing takes place after the app is fully loaded, and depending on the number of images, it can take a while (approx 1~2 seconds for every 10 images, roughly)
I can reproduce what Nicholas is seeing. If the video app scans a new video, then once it is done, CPU usage never drops below 2% (using top). If I kill it and relaunch so that there are no new video files that are scanned, then usage spikes, then after a little while goes down to 0. Occassionally it pops back up to 2 or 3%, but it does drop down below 2%. The video app has some complicated thumbnail processing logic and there's a chance that this is a gaia bug. Taking this to investigate it from the Gaia end. If I can't find anything, I'll ask Sotaro or someone to take a look instead.
I should note that CPU usage does seem to drop to 0 when the video app goes to the background. So while this is worth tracking down, it shouldn't be a serious drain on battery life. Actually, it isn't clear if the extra 2% CPU usage continues while videos are playing or not.
The Gallery and Music apps also have this problem, but they have it even when they don't find any media. If there are no photos or songs on the sdcard, I can launch Gallery and Music and they'll both display their "no media" dialogs and sit there at 2% CPU utilization. This makes me think that this doesn't have anything to do with video and instead has to do with MediaDB. I'm not sure why it only affects Video when new files are found and not other times. I think that music and gallery use the autoscan feature of MediaDB and video does not, so perhaps that is a clue.
Most other apps do go to 0% CPU utilization (using top) when I launch them and do nothing else. Even the clock app only uses 1% CPU to move the second hand around the screen, so I think something is up here with the media apps. MediaDB keeps an indexeddb connection open... I wonder if that is expensive? The media apps are full-screen and aren't orientation locked. Could either of those cause more CPU usage? (That wouldn't explain why the video app doesn't always have high CPU, though)
Another difference between video and the other media apps is that video uses the old ready event while gallery and music have been modified to use enumerable instead. But switching gallery back to use ready doesn't seem to change anything: it still uses 2% CPU even if it does not find any new files.
If I modify the gallery app so it never calls MediaDB.scan() to look for new images, this bug does not occur. If I modify MediaDB.scan() so that it just calls endscan(), then this bug does not occur. If I modify it so it just does the quick scan or just does the full scan, then the bug does still occur. So now I'm beginning to wonder if it is the device storage enumeration that is causing the problem. (Though I still don't get why gallery is always affected and Video is only affected if there is a new file)
This does not appear to be related to shared/js/device_storage/enumerate_all.js If I remove the external SD card so that there is only one storage area on the phone, I still see the 2% CPU usage in gallery.
Changing the bug summary line to indicate that this is not just a video app bug.
Dave, Can you think of any reason why an app that calls storage.enumerate() would then end up using 2% CPU any time it is in the foreground, even if completely idle otherwise?
I cannot reproduce this just using the DS test app. It can enumerate and then go to zero. Maybe some interaction between IndexedDB and DeviceStorage? What if I modify mediadb to skip the db enumeration and just do the scan? Does that modify things? Maybe I should check with a marketplace file browser app to see if that has the same issue. And if not, then it is probably time to try to find a reduced test case that demonstrates the bug.
(In reply to David Flanagan [:djf] from comment #23) > Dave, > > Can you think of any reason why an app that calls storage.enumerate() would > then end up using 2% CPU any time it is in the foreground, even if > completely idle otherwise? Not off the top of my head. Has anybody done a profile to see what's consuming the 2%? Unless there is some type of "continuous" notification being delivered or something, but I would expect that to consume much more than 2%
Is the phone plugged into a machine with UMS or MTP enabled? Perhaps the host is doing some type of indexing?
I can reproduce on my nexus 5, too. As the STR on comment 0, I put only one video clip. In the first lunch of video app, it detects the video clip and generates a thumbnail. After that the main thread of video app still consumes about 1~2% of CPU for a while(using "adb shell top -m 10 -t" to get the top information with threads). BTW, I don't turn on MTP.
I check the CPU usage by built-in profiler, here is the result of Gallery app, http://people.mozilla.org/~bgirard/cleopatra/#report=965f6ee614a1e2ff2ea3265180e34623a28a16ef&selection=0,1,44. For Video app, I have similar result.
Just to be sure that this has nothing to do with the media apps being fullscreen and having unlocked orientation, I modified Gallery to have fixed orientation and not be fullscreen, and I see the same 2% usage when idle. (In reply to Dave Hylands [:dhylands] from comment #26) > Is the phone plugged into a machine with UMS or MTP enabled? Perhaps the > host is doing some type of indexing? If that was the case, we'd see the 2% when any app was running, and the CPU usage wouldn't be coming from the Gallery app process, right? (In reply to Dave Hylands [:dhylands] from comment #25) > Has anybody done a profile to see what's consuming the 2%? See comment #28. I don't know enough about Gecko to make any sense of it. > Unless there is some type of "continuous" notification being delivered or > something, but I would expect that to consume much more than 2% That was part of why I wanted to try the app with locked orientation; thinking that maybe the orientation sensors required to rotate the screen were taking up that time. For comparison, the process that runs the top program also takes 2% on my Flame.
If Steven can reproduce this on a nexus 5, it is probably not memory related. But to be sure, I converted my 319mb Flame to 1024mb, and the bug still reproduces.
I see a vsync tick happening in the profile. Perhaps this is being turned on as part of playing a video, and not being turned off at the end?
(In reply to Dave Hylands [:dhylands] from comment #31) > I see a vsync tick happening in the profile. > > Perhaps this is being turned on as part of playing a video, and not being > turned off at the end? One of the active threads that Nicholas pointed out in the origina comment is hwcVsyncThread, whatever that is. But note that Steven's profile is for the Gallery app. And I can reproduce the 2% CPU usage even when there are no media files on the phone at all. All I have to do is launch gallery and let it sit at the "no pictures found" screen. So playing videos is not required to reproduce this. The only clue I have so far is that if I comment out the code that scans device storage, then the elevated CPU usage does not happen in Gallery. But it isn't just a simple matter of enumerating device storage, because it doesn't happen for the DS test app. And it doesn't happen for the Video app unless the scan actually finds a new file. The apps that use MediaDB have this bug, and the apps that don't use it do not have the bug, so I'm assuming that it is some interaction between DeviceStorage and IndexedDB that is causing it. As far as I can tell, there isn't any JS code running to cause this. If I use the WebIDE debugger to pause the Gallery app, the eleveated CPU usage for that process continues. I don't know how WebIDE pauses an app, but I think it stops running the JS. Dave: is there some way to log IPC activity to show what device storage and indexed db are doing and what communication is occurring between the parent process and the child process? MediaDB does keep an indexeddb connection open for the life of the app... Or if I've somehow caused some sort of deadlock maybe there are threads that are staying busy when they shouldn't?
There are two file manager apps in the Marketplace "File Manager" and "Files". They appear to be clones of each other. Both use about 5% of CPU and keep the parent process pegged at 30% CPU usages or so. So I think they are doing something wrong, and I can't use those as reduced test cases.
I can create a patch that does some prints at a bunch of "likely" places in device storage where things might be happening.
(In reply to Dave Hylands [:dhylands] from comment #34) > I can create a patch that does some prints at a bunch of "likely" places in > device storage where things might be happening. I was hoping that there was already some kind of logging that could be switched on with a pref. But failing that, what you're proposing sounds like a good next step. Are you able to take this bug for further investigation? I've spent a day on it, and have not been able to create a reduced test case, and am not making any further progress investigating from the gaia side. I'm going to unassign myself and hope that you (or someone else) can take it.
Before I sign off this bug, I should add that all the investigation above has been on master. But I just tested on 2.2 and find basically the same thing as on master: - Gallery with 100 thumbnails takes 3% CPU all the time, even when idle. (This is higher than on master, but there are also more thumbnails than I tested on master so I'm not sure which variable is relevant.) - Music with one song takes 1% (lower than master) CPU, but never drops to 0 - Video takes 2% CPU on runs when it finds and processes a new file. But on runs where there are no new files, CPU drops to 0 after the initial spike.
Assigning to dhylands for adding the more logging and troubleshooting the problem. (Today is a holiday in CA)
Taking this back from Dave. I've isolated this to a CSS animation (the "crawling ants" display) that apparently keeps on running even when it is turned off. I suspect there is a Gecko bug, but I'll see if I can workaround it in gaia or at least produce a reduced test case for the gecko bug.
Gallery, Music and Video all use the same CSS animation code to display an indeterminate progress bar. It dates back to before we had shared CSS to do that. The only real difference is that the Gallery and Music apps start the throbber as soon as they start scanning the sdcard. The video app doesn't start the throbber until it first finds something during the scan. So that is why we were finding that the video app doesn't have this bug unless it finds something. All the apps try to turn the throbber off. Visually it disappears, but CPU usage continues. That is the bug here.
mhenretty points out that this sounds similar to bug 962594. That is marked fix, but maybe there is something to investigate there, still.
I can't reproduce this bug using the progress animations in shared/styles/progress_activity.css It looks like there is something about the media apps animation that is triggering this.
Here's what's going on, apparently: - The media apps have a crawling ants animation. There are two elements involved, an inner and an outer element. To stop the animation, we remove the CSS animation property on the inner element and set display:none on the outer element. It looks like there is a bug in gecko where if the container element is hidden, then removing the animation property on the inner element is ignored. I guess gecko is figuring that if the animation isn't visible anyway, it doesn't have to actually stop it or something. In any case, the gaia workaround is to first remove the CSS animation from the inner element, and then use a setTimeout to hide the outer element in a separate pass. Obviously it is a bug we'll want to fix in gecko, though. I'm hoping that I can look at bug 962594 and find someone from that bug to work on this gecko bug.
I've filed bug 1166500 for the underlying gecko bug, and will use this bug to prepare a workaround patch for Gallery, Music and Video
Created attachment 8607804 [details] [review] link to patch on github Jim, could you review this please? The gallery and video app changes are trivial. The music app changes are slightly more complicated. I've filed bug 1166500 for the underlying bug; this is just a workaround that I'm hoping to uplift to 2.2. The basic bug is that if we hide the container of an animated element before we stop the animation, then the animation doesn't really stop. So I stop the animation, then wait 100ms, then hide the containing element.
Comment on attachment 8607802 [details] [review] [gaia] davidflanagan:bug1163789 > mozilla-b2g:master Oops. I forget that I don't have to attach my patches manually anymore. Please review this one, Jim.
Will this be landing on master first? I ask because the patch I'm about to land (bug 1147173) will bitrot the music parts of this patch. If it does go on master, you'll need to move some of the code around. Sorry!
Comment on attachment 8607802 [details] [review] [gaia] davidflanagan:bug1163789 > mozilla-b2g:master Josh, This bug is on CAF's 2.2 list. This patch is only a workaround, but I'm assuming that we may not be able to get the underlying gecko bug (1166500) fixed safely enough or quickly enough for 2.2 uplift, so I'm nominating this patch instead. [Approval Request Comment] [Bug caused by] (feature/regressing bug #): I don't know if this is a regression or just a long-standing bug. [User impact] if declined: battery life will suffer because the media apps will never be fully idle. The impact is minimized because the bug only occurs when the app is in the foreground, but might still have significant battery life impact if the user is playing a long video. [Testing completed]: locally [Risk to taking this patch] (and alternatives if risky): not very risky. Mostly this patch just defers some DOM operations by 100ms using setTimeout() calls. [String changes made]: none
Hi David, Could you check the comment from Jim per comment 47? Since this is just workaround for 2.2, do we wan to land this on master as well? Thanks!
Josh: I was assuming that we would land this to master because I don't know how much effort it will be for Gecko engineers to fix 1166500, and the workaround is harmless once the underlying problem is fixed. Jim: Bitrot happens. No need to apologize. I'll prepare a new version of the PR that applies cleanly on master and will cherry-pick the current patch into a PR for 2.2. You may then need to review them both.
Both Gallery and Music have changed between 2.2 and 3.0 so I can't just cherry-pick the current version of the patch into 2.2. I'll have to prepare two separate versions of the patch for the two releases. Also: 2.2 does not have a fix for bug 962594. (It is nominated as a blocker, but uplift has not been requested on the patch). This means that the code I currently have in the patch for master is not enough to bring CPU usage down to zero, at least for Gallery, so the 2.2 patch is going to be a little more involved than I thought, and I'll have to check whether the camera app also needs a fix for 2.2.
Created attachment 8608321 [details] [review] [gaia] davidflanagan:bug1163789-v2.2 > mozilla-b2g:v2.2
Comment on attachment 8608321 [details] [review] [gaia] davidflanagan:bug1163789-v2.2 > mozilla-b2g:v2.2 Jim: this is the patch for 2.2. It turned out that I had to include workarounds for bug 962594 as well as bug 1166500 to make things work right with 2.2 gecko. So this patch is different enough that it needs a separate review. The major new thing here is that in 2.2 just hiding a <progress> element with display:none is not enough to stop the animation, so I'm setting value=0 on the element to turn off the CSS animation property (from shared/style/progress_activity.css).
Comment on attachment 8608321 [details] [review] [gaia] davidflanagan:bug1163789-v2.2 > mozilla-b2g:v2.2 Looks good overall. I have a few minor comments on GitHub.
Comment on attachment 8607802 [details] [review] [gaia] davidflanagan:bug1163789 > mozilla-b2g:master Clearing review for now, since this bitrotted.
Comment on attachment 8607802 [details] [review] [gaia] davidflanagan:bug1163789 > mozilla-b2g:master Clearing 2.2 approval request on the original (master) patch.
Comment on attachment 8608321 [details] [review] [gaia] davidflanagan:bug1163789-v2.2 > mozilla-b2g:v2.2 Requesting uplift on the 2.2-specific version of this patch. See comment #48 for the answers to the uplift request questions. I'll add that if we don't uplift this gaia patch, we need to uplift the gecko patch in bug 962594 and get the gecko team to fix and uplift bug 1166500. I had not previously realized that bug 962594 had not been fixed in 2.2.
Comment on attachment 8607802 [details] [review] [gaia] davidflanagan:bug1163789 > mozilla-b2g:master Jim, The master branch version of the patch is rebased and ready for your review as well.
Setting qawanted for help verifying that this bug is now fixed on 2.2. Needinfo for No-Jun, but I think he is OOO this week, so hopefully someone else from QA can take this. In order to verify this, you need to test Gallery, Music and Video, both in the case where there are no new files to scan and in the case where there are new files to scan. Before you start, use the Settings app to turn off the screen timeout, because verification requires the screen to stay on. To verify, use a terminal to run this command: $ adb shell top -m 20 This will show you the top 20 processes on the phone, sorted by how busy they are. The display will update about once a second. The 3rd column of the output shows how much of the CPU the process is using. Next, launch the app you want to verify. You'll see a new process appear in the output in your terminal. (It will just have a number and not a name, but it will probably be obvious which the new one is. You'll also see "fg" in the PCY column of output indicating that it is the foreground app) Note that the 'top' program that you're running to get this output is itself a process and will show up near the top of the list. While the app is starting up and scanning for new files, it will obviously have high CPU usage. But if this patch is good, and if you don't do anything with the app then you'll see that within 10 or 20 seconds the CPU usage should drop to 0 (or the app won't even appear on the list of the top 20 processes anymore). If the CPU usage for the app process stays at around 2% (on a flame), then the patch is not working.
Nicholas: this bug should now be fixed in 2.2, if you want to verify it on your end.
clearing the 2.2? blocking nomination since we just landed a fix for 2.2
Naoki, would you be able to take a look at this one if No-Jun is unavailable this week?
QAWanted to perform 2.2 verify given comment 60's instructions.
:doliver, I asked QAnalyst to take a look at this. If they have issues, then it will be reassigned/NI to me.
(In reply to David Flanagan [:djf] from comment #61) > Nicholas: this bug should now be fixed in 2.2, if you want to verify it on > your end. Yes this fixes the issue. Thanks!
Created attachment 8609396 [details] 2.2Verified Attached file is the result I've got running $ adb shell top -m 10 while following STR from Comment 0. Environmental Variables: Device: Flame 2.2 (KK, 319mb, full flash) Build ID: 20150522002508 Gaia: 9acbac7e6d4a2e3913af4aa202ea403501967fcd Gecko: a1e8f172523d Gonk: bd9cb3af2a0354577a6903917bc826489050b40d Version: 37.0 (2.2) Firmware Version: v18D-1 User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
nhirata, yeojinC, Thanks for the looking into this!
No-Jun: I'm not sure whether anyone verified this fix for Gallery and Music as well as Video. Yeojin says they followed the STR in comment 0 rather than my expanded STR in comment 60. Jim: ping for a review of the master version of this patch. Since this patch blocks a 2.2 metabug, I'd like to be able to land on master and close it.
Comment on attachment 8607802 [details] [review] [gaia] davidflanagan:bug1163789 > mozilla-b2g:master r=me with nits addressed (see GitHub).
(In reply to David Flanagan [:djf] from comment #69) > No-Jun: I'm not sure whether anyone verified this fix for Gallery and Music > as well as Video. Yeojin says they followed the STR in comment 0 rather than > my expanded STR in comment 60. > > Jim: ping for a review of the master version of this patch. Since this patch > blocks a 2.2 metabug, I'd like to be able to land on master and close it. I just retested with comment 60 STR on Video, Gallery and Music app, and I saw the CPU % of the app processes consistently being pushed out of top 20 lists, and this was the same when all three apps were launched one after another in quick succession, so this looks good on 2.2.
review comments addressed.
Pull request has landed in master: https://github.com/mozilla-b2g/gaia/commit/18872da6d759f297250949901903890629bbab91