Unacceptable performance and memory consumption on Apart Posters VR experience.
Categories
(Core :: Graphics: CanvasWebGL, defect, P2)
Tracking
()
Performance Impact | low |
People
(Reporter: rbarker, Unassigned)
References
(Depends on 2 open bugs)
Details
(Keywords: perf:responsiveness, Whiteboard: [fxr:p1][geckoview:p1])
Attachments
(11 files, 1 obsolete file)
STR:
- Visit https://show.apartposters.com/C4HMrDZ/showroom-3 in any GeckoView based browser (Fenix, GVE, FxR)
- Enter the room
Actual:
Browser will either crash or hang. If the room is entered, performance is very poor.
Expected:
Browser is able to enter the room and performance is good.
Notes:
This same page works without issue in Chrome-based Android browsers. Running on a Pixel 2, the frame rate is a fixed 60Hz in Chrome, while on a GV-based browser, if you are able to enter the room at all, the frame rate is 30Hz. On a Pixel 1 the gap is even greater.
From examining the logs and viewing in a profiler, Gecko seems to consume an excessive amount of memory, which causes out-of-memory errors and crashes in the web content, media, and main processes. The Android Studio profiler shows the web content process consuming over 2GB of graphics memory before a crash occurs.
Reporter | ||
Comment 1•4 years ago
|
||
FxR related issue: https://github.com/MozillaReality/FirefoxReality/issues/3395
Reporter | ||
Comment 2•4 years ago
|
||
Reporter | ||
Comment 3•4 years ago
|
||
Comment 4•4 years ago
|
||
I looked into this a little bit on macOS. I consistently get a 1GB memory increase when loading the page, and a corresponding 1GB decrease when navigating away from it.
I've been navigating between https://show.apartposters.com/#/ and https://show.apartposters.com/C4HMrDZ/showroom-3 by clicking the back/forward arrows, when reproducing this.
Profile: https://share.firefox.dev/3dtPWKj
It's not entirely clear to me what type of memory is being allocated here. It's mysterious to about:memory, too: The "explicit" allocations only increase by 360MB (16MB -> 383MB), but the "resident" number increases by 1.15GB (51MB -> 1227MB). The other number that increases is shmem-mapped, from 0 to 876.36 MB. But then when I navigate away again, shmem-mapped stays high, only decreasing by 60MB to 818.22 MB. But resident decreases to 164MB.
In any case, the profile gives some insight into what's happening: First we decode JPEG data, then WebGL converts it for texture upload, and then the GL driver converts it again during texture upload. And at least one of those three pieces seems to go into shared memory somehow, and what the driver does goes into mapped memory that is shared with the GPU but still accounted for in our resident size.
So this bug falls squarely within WebGL land, with maybe some contribution from image decoding.
Comment 5•4 years ago
|
||
Comment 6•4 years ago
|
||
Comment 7•4 years ago
|
||
Comment 8•4 years ago
|
||
Looking at the network requests, there's only one JPEG image in the list: https://hubs-5-assets.apart-internal.com/hubs/assets/waternormals-4418dde3f6abc21dc32506acf5f5b093.jpg
It's a 1024x1024 image. To allocate 1GB of image data, you'd need a 16k x 16k image.
I wonder if we're decoding the same image over and over.
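The arithmetic behind that estimate can be checked with a short sketch. It assumes decoded images are held as 4 bytes per pixel (RGBA8); that pixel format is an assumption for illustration, not something measured in this bug.

```python
# Rough decoded-size arithmetic for the estimate above.
# Assumption: decoded images use 4 bytes per pixel (RGBA8).
BYTES_PER_PIXEL = 4
MIB = 1024 * 1024
GIB = 1024 * MIB


def decoded_size_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return the size of one fully decoded image in bytes."""
    return width * height * bytes_per_pixel


# The only JPEG seen in the network list is 1024x1024.
water_normals_bytes = decoded_size_bytes(1024, 1024)

# A single image that would account for 1 GiB on its own.
huge_image_bytes = decoded_size_bytes(16 * 1024, 16 * 1024)

# How many decoded copies of the 1024x1024 image would add up to 1 GiB.
copies_for_one_gib = GIB // water_normals_bytes

print(water_normals_bytes / MIB, "MiB per 1024x1024 image")
print(huge_image_bytes / GIB, "GiB for a 16k x 16k image")
print(copies_for_one_gib, "copies of the 1024x1024 image per GiB")
```

Under that assumption, a single 1024x1024 image is far too small to explain the increase by itself, which is why repeated decoding (or other, larger images) is the natural suspect.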
Comment 9•4 years ago
|
||
Oh, image decoding is called from CreateImageBitmapFromBlob, so the JPEG image data presumably comes from some JS buffer, possibly extracted from the 40MB binary file at https://hubs-5-assets.apart-internal.com/files/a5ba573c-ce81-43a1-9060-253a095fccbf.bin .
Reporter | ||
Comment 10•4 years ago
|
||
I believe the page is using a glTF which contains the texture data.
Comment 11•4 years ago
|
||
Hey jgilbert, could you put a priority on this one?
Comment 12•4 years ago
|
||
It would be great to narrow down where these seemingly-extra allocations are coming from. (untracked mallocs in gecko? driver shmems? gpu vram mirroring?)
Comment 13•4 years ago
|
||
Hey Sotaro - could you spend some time next week taking a look at this to help narrow down what might be happening?
Comment 14•4 years ago
|
||
(In reply to Jessie [:jbonisteel] pls NI from comment #13)
Hey Sotaro - could you spend some time next week taking a look at this to help narrow down what might be happening?
OK, I confirmed that the GeckoView example app and Firefox Preview were killed by lowmemorykiller while visiting https://show.apartposters.com/C4HMrDZ/showroom-3
Comment 15•4 years ago
|
||
Thanks Sotaro - is there any more information you are able to determine to understand why that is happening?
Comment 16•4 years ago
|
||
Comment 17•4 years ago
|
||
With attachment 9157610 [details] [diff] [review], I looked into resource usage around WebGLTextureUpload.cpp on a Linux PC. The page loaded many ImageBitmaps and uploaded two 4K videos (3840x1920). The videos were huge, and they were continuously uploaded to GL textures even when they were not rendered. This seems to consume a lot of memory. I am going to look into it more tomorrow.
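To give a sense of scale for those uploads, here is a back-of-the-envelope sketch. The 3840x1920 size and the count of two videos come from the comment above; the 4 bytes per pixel (RGBA8) and the 30 frames per second are assumptions made only for illustration.

```python
# Back-of-the-envelope cost of uploading two 3840x1920 video frames.
# Assumptions (not measured in this bug): RGBA8 frames, 30 uploads per second.
BYTES_PER_PIXEL = 4
ASSUMED_FPS = 30
MIB = 1024 * 1024

VIDEO_WIDTH = 3840
VIDEO_HEIGHT = 1920
VIDEO_COUNT = 2

# Size of one uploaded frame for one video.
frame_bytes = VIDEO_WIDTH * VIDEO_HEIGHT * BYTES_PER_PIXEL

# Size of one frame from each of the two videos.
frame_pair_bytes = frame_bytes * VIDEO_COUNT

# Data copied per second if every frame of both videos is uploaded.
upload_bytes_per_second = frame_pair_bytes * ASSUMED_FPS

print(frame_bytes / MIB, "MiB per frame")
print(frame_pair_bytes / MIB, "MiB per pair of frames")
print(upload_bytes_per_second / MIB, "MiB/s at the assumed frame rate")
```

Every intermediate copy of a frame (for example a staging buffer plus the GL texture) multiplies the per-frame figure, so continuous uploads of unrendered videos are costly in both memory and bandwidth.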
Comment 18•4 years ago
|
||
GeckoView does not handle OS memory pressure yet. See Bug 1454752. Its support might reduce memory usage.
Comment 19•4 years ago
|
||
When I tested the page on a Pixel 3a, I normally saw a crash while ImageBitmaps were being uploaded to GL textures. 37 ImageBitmaps were uploaded while loading the page.
Comment 20•4 years ago
|
||
One difference from Chromium is AHardwareBuffer usage. Chromium has used it since Android Oreo. On Android, Gecko always uses a Shmem buffer for the texture buffer, so it needs an additional GL texture image for GL rendering. When AHardwareBuffer is used, we do not need the additional GL texture image and could reduce memory usage. Bug 1562818 is for adding AHardwareBuffer support. But adding that support is high risk, since it adds a totally new rendering and buffer allocation path.
Reporter | ||
Comment 21•4 years ago
|
||
This reproduces on Android 7 devices, which is N, so I would guess AHardwareBuffer is not the issue?
Comment 22•4 years ago
•
|
||
(In reply to Randall Barker [:rbarker] from comment #21)
This reproduces on Android 7 devices which is N so I would guess AHardwareBuffer is not the issue?
Because of Gecko's architecture, GeckoView uses more memory than Chromium. AHardwareBuffer could be one way to reduce memory usage, though it is not supported on Android 7.
Comment 23•4 years ago
|
||
Another problem is the way video frames are uploaded. Video frames use SurfaceTexture, which has a usage limitation. Bypassing that limitation makes data access very redundant (more memory and more latency). This path was added by Bug 1486659. Its architecture is like the following diagram.
Reporter | ||
Comment 24•4 years ago
•
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #22)
(In reply to Randall Barker [:rbarker] from comment #21)
This reproduces on Android 7 devices which is N so I would guess AHardwareBuffer is not the issue?
From gecko's architecture, GeckoView uses more memory than chromium, AHardwareBuffer could be one way to reduce memory usage. Though, AHardwareBuffer is not supported on Android 7.
Right, and yet chrome is not only able to load this page on android 7 without issue, it then gets more than double the frame rate of gecko (when gecko is actually able to load the page).
Comment 26•4 years ago
•
|
||
(In reply to Randall Barker [:rbarker] from comment #24)
From gecko's architecture, GeckoView uses more memory than chromium, AHardwareBuffer could be one way to reduce memory usage. Though, AHardwareBuffer is not supported on Android 7.
Right, and yet chrome is not only able to load this page on android 7 without issue, it then gets more than double the frame rate of gecko (when gecko is actually able to load the page).
On Android 7, single mode SurfaceTexture might be used to reduce memory usage. That is tracked in Bug 1413142. But it is not enabled yet because of several problems.
Comment 27•4 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #25)
:jhlin, can you comment to Comment 23?
Yes, the output buffers from the decoder are sent to a SurfaceTexture allocated on the parent side, and in order to use it in the WebGL context in the child/content process, each drawImage() call will make a copy of the current video frame from the parent [1].
The scene has two 4K videos, and the log I added shows each takes 1-2ms to copy the texture on a Pixel 2:
06-26 01:20:50.917 14440 14473 D GeckoSurfaceTexture: sync() took 2ms
06-26 01:20:50.920 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
...
06-26 01:20:50.981 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:50.983 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
...
06-26 01:20:51.049 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:51.052 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
...
06-26 01:20:51.112 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:51.115 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
And as Sotaro said, the additional memory usage is huge because of the 4K content. Unfortunately, I don't have a good solution to eliminate that and cannot find any documentation about how to use AHardwareBuffer in the MediaCodec API.
I'm not familiar with Chrome code. It's possible that they don't do extra copy for WebGL but it's hard to tell if that is the reason why it displays the scene smoothly.
[1] https://searchfox.org/mozilla-central/source/gfx/gl/AndroidSurfaceTexture.cpp#55
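The sync() timings in the log excerpt above can be summarized with a small script. This is only a sketch: the "GeckoSurfaceTexture: sync() took Nms" line format comes from the locally added logging quoted in this comment, and the helper name `sync_durations_ms` is made up for illustration.

```python
import re

# Matches the locally added log line quoted in this comment, e.g.
# "... D GeckoSurfaceTexture: sync() took 2ms"
SYNC_RE = re.compile(r"GeckoSurfaceTexture: sync\(\) took (\d+)ms")


def sync_durations_ms(log_text):
    """Return every sync() duration found in a logcat dump, in milliseconds."""
    return [int(match.group(1)) for match in SYNC_RE.finditer(log_text)]


# The lines quoted above; the elided ("...") parts of the log are not included.
SAMPLE_LOG = """\
06-26 01:20:50.917 14440 14473 D GeckoSurfaceTexture: sync() took 2ms
06-26 01:20:50.920 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:50.981 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:50.983 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:51.049 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:51.052 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:51.112 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
06-26 01:20:51.115 14440 14473 D GeckoSurfaceTexture: sync() took 1ms
"""

durations = sync_durations_ms(SAMPLE_LOG)
print("samples:", len(durations))
print("total ms:", sum(durations))
print("max ms:", max(durations))
```

Running the same helper over a full logcat capture would show whether the copy time stays in the 1-2ms range or spikes under memory pressure.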
Comment 28•4 years ago
•
|
||
(In reply to John Lin [:jhlin][:jolin] from comment #27)
And as Sotaro said, the additional memory usage is huge because of 4K contents. Unfortunately, I don't have a good solution to eliminate that and cannot find any document about how to use AHardwareBuffer in MediaCodec API.
I'm not familiar with Chrome code. It's possible that they don't do extra copy for WebGL but it's hard to tell if that is the reason why it displays the scene smoothly.
We could get each video frame by using AImageReader. It is a wrapper of BufferItemConsumer.
Chromium uses it since Android P. Though, it seems possible to enable it since Android O.
ImageReaderGLOwner creates it.
Comment 29•4 years ago
•
|
||
https://phabricator.services.mozilla.com/D81479 roughly enabled AHardwareBuffer usage on Layer buffers and on WebGL SharedSurface. But an OOM crash still happened while loading many ImageBitmaps on my Pixel 3a. From this, it seems necessary to reduce memory usage while loading the ImageBitmaps.
Comment 30•4 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #28)
We could get each video frame by using AImageReader. It is a wrapper of BufferItemConsumer.
Created Bug 1649110.
Comment 31•4 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #19)
When I tested the page with Pixel3a, I normally saw a crash during uploading ImageBitmaps to GL textures. 37 ImageBitmaps were uploaded during loading the page.
:jgilbert, do you have any ideas about how to reduce memory usage during loading many ImageBitmaps to GL textures for WebGL?
Comment 32•4 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #28)
(In reply to John Lin [:jhlin][:jolin] from comment #27)
And as Sotaro said, the additional memory usage is huge because of 4K contents. Unfortunately, I don't have a good solution to eliminate that and cannot find any document about how to use AHardwareBuffer in MediaCodec API.
I'm not familiar with Chrome code. It's possible that they don't do extra copy for WebGL but it's hard to tell if that is the reason why it displays the scene smoothly.
We could get each video frame by using AImageReader. It is a wrapper of BufferItemConsumer.
Chromium uses it since Android P. Though, it seems possible to enable it since Android O.
ImageReaderGLOwner creates it.
Thanks a lot for the info! Maybe Chromium uses ImageReader only for Android P and later because Image::getHardwareBuffer() is available since that version.
The document does mention some use cases are not supported and MediaCodec is one of them. However, Android source code suggests that the HardwareBuffer is created using GraphicBuffer and should be compatible with MediaCodec.
Comment 33•4 years ago
•
|
||
(In reply to John Lin [:jhlin][:jolin] from comment #32)
The document does mention some use cases are not supported and MediaCodec is one of them.
:jhlin, can you provide a link to the document?
Comment 34•4 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #33)
(In reply to John Lin [:jhlin][:jolin] from comment #32)
The document does mention some use cases are not supported and MediaCodec is one of them.
:jhlin, can you provide a link to the document?
Sorry for not pointing it out clearly. It's in the paragraph explaining the return value from [1]: ... null if this Image doesn't support this feature. (Unsupported use cases include Image instances obtained through MediaCodec, and...
[1] https://developer.android.com/reference/android/media/Image#getHardwareBuffer()
Comment 35•4 years ago
•
|
||
Hmm, the source code does not have a comment about MediaCodec, and Chromium's media source seems to use AImageReader. A limitation might exist. This needs more investigation.
- http://androidxref.com/9.0.0_r3/xref/frameworks/av/media/ndk/include/media/NdkImage.h#763
- https://source.chromium.org/chromium/chromium/src/+/master:media/gpu/android/video_frame_factory_impl.cc;l=103
- https://source.chromium.org/chromium/chromium/src/+/master:gpu/command_buffer/service/texture_owner.cc;l=43
- https://source.chromium.org/chromium/chromium/src/+/master:media/gpu/android/video_frame_factory_impl.cc;l=36
Comment 36•4 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #31)
(In reply to Sotaro Ikeda [:sotaro] from comment #19)
When I tested the page with Pixel3a, I normally saw a crash during uploading ImageBitmaps to GL textures. 37 ImageBitmaps were uploaded during loading the page.
:jgilbert, do you have any ideas about how to reduce memory usage during loading many ImageBitmaps to GL textures for WebGL?
Maybe de-duplicating them would help? I bet we naively always make a copy.
We also don't seem to be freeing them aggressively enough. Maybe the GC doesn't realize how big they are, which is a common problem we've had in other areas: If the GC thinks the objects are small, it'll defer running a GC pass until there's likely to be more garbage. We should check that these large objects are known to be large by the GC/CC.
Comment 37•4 years ago
|
||
Comment 38•4 years ago
•
|
||
I tested what happens while loading the page with several configurations, listed below, on my Pixel 3a. Only [8] succeeded in loading the page. All other configurations failed to load the page: the application was killed by lowmemorykiller while loading ImageBitmaps. From this, "Use AHardwareBuffer for layer buffer" could reduce memory usage. There were also objects that were waiting to be cycle collected. And using AHardwareBuffer for the layer buffer with CompositorOGL uses less memory than WebRender.
-[1] Use WebRender + No AHardwareBuffer use
-[2] Use WebRender + No AHardwareBuffer use + Add calling nsJSContext::CycleCollectNow() in FromImageBitmap()
-[3] Use WebRender + Use AHardwareBuffer for WebGL SharedSurface
-[4] Use WebRender + Use AHardwareBuffer for WebGL SharedSurface in FromImageBitmap()
-[5] Use CompositorOGL + No AHardwareBuffer use
-[6] Use CompositorOGL + No AHardwareBuffer + Add calling nsJSContext::CycleCollectNow() in FromImageBitmap()
-[7] Use CompositorOGL + Use AHardwareBuffer for layer buffer
-[8] Use CompositorOGL + Use AHardwareBuffer for layer buffer + Add calling nsJSContext::CycleCollectNow() in FromImageBitmap()
-[9] Use CompositorOGL + Use AHardwareBuffer for WebGL SharedSurface
-[10] Use CompositorOGL + Use AHardwareBuffer for WebGL SharedSurface + Add calling nsJSContext::CycleCollectNow() in FromImageBitmap()
-[11] Use CompositorOGL + Use AHardwareBuffer for layer buffer + Use AHardwareBuffer for WebGL SharedSurface
Comment 39•4 years ago
|
||
:gw, is there a way to reduce memory usage with WebRender on Android?
Comment 40•4 years ago
|
||
There's nothing specific I'm aware of, but it's hard to say for sure without doing some detailed profiling of texture and render target allocations.
It's possible that this case might be causing WR to incorrectly allocate a heap of render targets that are retained in the pool - it might be worth logging what texture allocations the renderer thread is making in this test case.
Reporter | ||
Comment 41•4 years ago
|
||
Firefox Reality has WebRender disabled since it does not work yet. Most of the devices we support are running Android 7. So neither fixing WebRender nor using AHardwareBuffer (requires Android 8, API Level 26) will reduce memory usage in Firefox Reality.
Comment 42•4 years ago
|
||
Just noting down some other things we discussed today that we will try to get more clarity into what could help:
- Try using DMD to see if that turns up anything
- Add logging to capture total amount of Android buffers
Comment 43•4 years ago
|
||
Comment 44•4 years ago
|
||
Comment 45•4 years ago
|
||
Comment 46•4 years ago
|
||
Comment 47•4 years ago
|
||
The build failed with "ac_add_options --enable-dmd" on Android. Created Bug 1651079.
Comment 48•4 years ago
|
||
The patch adds more logs around RasterImage to Attachment 9160926 [details] [diff].
Comment 49•4 years ago
•
|
||
With Attachment 9162060 [details] [diff], in the majority of cases the RasterImages of the ImageBitmaps were still alive while they were being uploaded to WebGL textures. In this case, SurfaceCache held 777139116 bytes during the upload. But there was a case where the RasterImages were destroyed before the WebGL upload. In that case, SurfaceCache held 22700824 bytes during the upload, and uploading the ImageBitmaps to WebGL textures succeeded.
From it, we want to destroy RasterImages explicitly before WebGL texture uploading. I wonder if it might be related to ImageDecoderHelper::~ImageDecoderHelper(). It just posts Image to main thread.
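For a sense of scale, the two SurfaceCache figures reported in this comment can be compared directly. The sketch below uses only those two byte counts; the variable names are illustrative.

```python
# Compare the two SurfaceCache sizes reported in this comment.
MIB = 1024 * 1024

# RasterImages still alive during the WebGL upload (the common, failing case).
cache_bytes_images_alive = 777139116
# RasterImages destroyed before the WebGL upload (the case that succeeded).
cache_bytes_images_destroyed = 22700824

# Memory that is no longer held when the RasterImages are released early.
saved_bytes = cache_bytes_images_alive - cache_bytes_images_destroyed
# How many times larger the cache is when the RasterImages stay alive.
ratio = cache_bytes_images_alive / cache_bytes_images_destroyed

print(f"alive:     {cache_bytes_images_alive / MIB:.1f} MiB")
print(f"destroyed: {cache_bytes_images_destroyed / MIB:.1f} MiB")
print(f"saved:     {saved_bytes / MIB:.1f} MiB")
print(f"ratio:     {ratio:.1f}x")
```

On a device that is already close to the lowmemorykiller threshold, a difference of this size during the upload phase is plausibly the gap between loading and being killed.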
Comment 50•4 years ago
•
|
||
:aosmond, is it possible to destroy RasterImage soon after its usage in CreateImageBitmapFromBlob::OnImageReady()? We want to ensure that the RasterImages are destroyed when the ImageBitmaps are used to reduce peak memory use.
Comment 51•4 years ago
•
|
||
The following is a call stack captured during an insertion into SurfaceCache.
-> image::SurfaceCacheImpl::StartTracking()
-> image::SurfaceCacheImpl::Insert()
-> image::SurfaceCache::Insert()
-> image::DecoderFactory::CreateDecoder()
-> image::RasterImage::Decode()
-> image::RasterImage::LookupFrame()
-> image::RasterImage::GetFrameInternal()
-> image::RasterImage::GetFrameAtSize()
-> image::RasterImage::GetFrame(s)
-> dom::CreateImageBitmapFromBlob::OnImageReady()
-> ImageDecoderHelper::Run()
-> SchedulerGroup::Runnable::Run()
-> RunnableTask::Run()
Comment 52•4 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro PTO July 13th-17th] from comment #50)
:aosmond, is it possible to destroy RasterImage soon after its usage in CreateImageBitmapFromBlob::OnImageReady()? We want to ensure that the RasterImages are destroyed when the ImageBitmaps are used to reduce peak memory use.
Created Bug 1651587 for the above.
Comment 53•4 years ago
|
||
With the patch from Bug 1651587, the GeckoView example app was not killed while loading ImageBitmaps, though the app was sometimes killed while loading videos.
Comment 54•4 years ago
•
|
||
I was surprised that SharedSurface_SurfaceTexture is not used by default in content process. See bug 1654459.