Closed Bug 1252929 Opened 8 years ago Closed 3 years ago

Reduce layers.max-active to 2 or 3

Categories

(Firefox for Android Graveyard :: Toolbar, defect, P5)

45 Branch
Unspecified
Android
defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: snorp, Unassigned)

Details

(Whiteboard: [gfx-noted])

Attachments

(2 files)

This is kind of a sledgehammer for trying to reduce memory usage. I think we should try setting layers.max-active to 2 or 3 and see if we can get any results from that. There will be a perf impact if a page has a bunch of animated elements, but I think most stuff should be fine. A better solution is needed, surely, but this might at least stop us from crashing everywhere.
Nical, Jamie, Milan, what do you think? I would like to try this in Nightly and Aurora now, and then we'll get some more feedback in the next Beta cycle too. If we decide it hurts perf too much or just ineffective, we can back it out.
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(milan)
Flags: needinfo?(jnicol)
What about just doing this when you're running on a "small memory" device?  Or do we see problems even on "large" phones?  But if we want to experiment, then let's do the telemetry experiment on the nightly population, as long as we have a way of getting the performance/crash information we need.
Flags: needinfo?(milan)
Whiteboard: [gfx-noted]
(In reply to Milan Sreckovic [:milan] from comment #2)
> But if we want to experiment,
> then let's do the telemetry experiment on the nightly population, as long as
> we have a way of getting the performance/crash information we need.

+1

See https://wiki.mozilla.org/Telemetry/Experiments
(In reply to Milan Sreckovic [:milan] from comment #2)
> What about just doing this when you're running on a "small memory" device? 
> Or do we see problems even on "large" phones?  But if we want to experiment,
> then let's do the telemetry experiment on the nightly population, as long as
> we have a way of getting the performance/crash information we need.

We see the OOM on all devices. What does the telemetry experiment get us? We don't need to collect any data and we don't want to restrict it to some subset of the population. All we want is to see if it reduces the crash rate, specifically OOM crashes.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)
> (In reply to Milan Sreckovic [:milan] from comment #2)
> > But if we want to experiment,
> > then let's do the telemetry experiment on the nightly population, as long as
> > we have a way of getting the performance/crash information we need.
> 
> +1
> 
> See https://wiki.mozilla.org/Telemetry/Experiments

Those don't work on Mobile, but we have our own A/B testing system. It would work for everything but associating crash data, which is not collected via telemetry on Mobile. Adding that support (a different bug) would be nice.
This preference is currently set to 20.  The proposal is to set it to 2 or 3.  I'd like to see what that does to the crash rate, whether it introduces other crashes, regresses other things, etc.  So, we can do it by making the change in nightly, waiting a week, and looking at the data, but if we can do it for half the people during that week, we take other variables out of the way (other bugs being introduced or fixed on nightly) and hopefully see the causality of any new problems or no problems at all.

Mark, is there a bug for the "adding support to associate crash data to experiments on mobile"?  Sounds like there may have been thoughts on the subject, and it sounds what I was hoping to get with the experiment I proposed is not currently available.
:snorp, will you look at the incoming data and see if it makes a difference?  Does this need to go beyond nightly to get enough data?
Attachment #8726374 - Flags: review?(mstange) → review+
One thing to consider, here, or in a separate bug, is if we can change the value of this preference on the fly, when we see the memory going down to low levels.  It probably isn't appropriate for the "memory pressure" type of event, but if we do have a way of detecting that things are getting bad, with the "does not require restart" patch above, at least we could try.
Comment on attachment 8726373 [details]
MozReview Request: Bug 1252929: Change default layers.max-active to 3 on nightly. r?snorp

https://reviewboard.mozilla.org/r/37937/#review34523
Attachment #8726373 - Flags: review?(snorp) → review+
(In reply to Milan Sreckovic [:milan] from comment #6)

> Mark, is there a bug for the "adding support to associate crash data to
> experiments on mobile"?  Sounds like there may have been thoughts on the
> subject, and it sounds what I was hoping to get with the experiment I
> proposed is not currently available.

I don't know of a bug. I only know we have been talking about it.

Also, I'm unsure if Telemetry Experiments work in Fennec. I don't know that we've ever tried one.
(In reply to Milan Sreckovic [:milan] from comment #8)
> :snorp, will you look at the incoming data and see if it makes a difference?
> Does this need to go beyond nightly to get enough data?

I'll keep an eye on things. We may want to put it into 46 beta for a cycle to see what happens there.
The patch in comment 14 is the non-moz-review one.  The other one will come through moz-review
Assignee: nobody → milan
OS: Unspecified → Android
Leaving needinfo on Jamie and Nicolas, to make sure they see what's been done here.
https://hg.mozilla.org/mozilla-central/rev/54e28eab4833
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(jnicol)
I'm guessing that since this bug is "Reduce layers.max-active to 2 or 3" rather than "Make layers.max-active preference not need a restart" that it is not in fact fixed or status-firefox47 fixed.
Target Milestone: Firefox 47 → ---
The failing test: dom/media/mediasource/test/test_MediaSource.html
Randomly, this passes on desktop with the setting of 3.
There were also a bunch of Android reftest failures which went away on the backout.
Based on https://treeherder.mozilla.org/#/jobs?repo=try&revision=3ecf35422887, the framebuffer status call in https://dxr.mozilla.org/mozilla-central/source/gfx/gl/ScopedGLHelpers.cpp#278 fails with LOCAL_GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT status, causing the MOZ_CRASH mentioned in comment 25.

This status (https://www.khronos.org/opengles/sdk/docs/man/xhtml/glCheckFramebufferStatus.xml) means "No images are attached to the framebuffer". The size of the buffer for the TexImage2D call (https://dxr.mozilla.org/mozilla-central/source/gfx/layers/GLImages.cpp#66) when this happens is 320x240.

Jeff, can you interpret this data?
Flags: needinfo?(jgilbert)
(In reply to Milan Sreckovic [:milan] from comment #27)
> Based on
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=3ecf35422887, the
> framebuffer status call in
> https://dxr.mozilla.org/mozilla-central/source/gfx/gl/ScopedGLHelpers.
> cpp#278 fails with LOCAL_GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT
> status, causing the MOZ_CRASH mentioned in comment 25.
> 
> This status
> (https://www.khronos.org/opengles/sdk/docs/man/xhtml/
> glCheckFramebufferStatus.xml) means "No images are attached to the
> framebuffer". The size of the buffer for the TexImage2D call
> (https://dxr.mozilla.org/mozilla-central/source/gfx/layers/GLImages.cpp#66)
> when this happens is 320x240.
> 
> Jeff, can you interpret this data?

Well, the driver thinks there's no succcessfully-allocated buffer attached to the framebuffer. Check that COLOR_ATTACHMENT0 has a texture or renderbuffer attached, and that the tex/rb was successfully allocated via texImage/renderbufferStorage. (Make sure these calls didn't fail)

Consider MOZ_GL_DEBUG_ABORT_ON_ERROR=1, which should catch in a debugger if one of these fails.
Flags: needinfo?(jgilbert)
MOZ_GL_DEBUG_ABORT_ON_ERROR=1 is good, but I think it caught an earlier error, with "context is not current":

01:16:08     INFO -   0  libxul.so!mozilla::gl::GLContext::BeforeGLCall [GLContext.h:7c79c672fe8d : 737 + 0x2]
01:16:08     INFO -   1  libxul.so!mozilla::gl::GLContext::fGetString [GLContext.h:7c79c672fe8d : 1314 + 0x9]
01:16:08     INFO -   2  libxul.so!mozilla::gl::GLContext::InitExtensions [GLContext.cpp:7c79c672fe8d : 1899 + 0x9]
01:16:08     INFO -   3  libxul.so!mozilla::gl::GLContext::InitWithPrefix [GLContext.cpp:7c79c672fe8d : 786 + 0x5]
01:16:08     INFO -   4  libxul.so!mozilla::gl::GLContextEGL::Init [GLContextProviderEGL.cpp:7c79c672fe8d : 256 + 0xb]
01:16:08     INFO -   5  libxul.so!mozilla::gl::GLContextEGL::CreateGLContext [GLContextProviderEGL.cpp:7c79c672fe8d : 529 + 0x5]
01:16:08     INFO -   6  libxul.so!mozilla::gl::GLContextEGL::CreateEGLPBufferOffscreenContext [GLContextProviderEGL.cpp:7c79c672fe8d : 952 + 0x5]
01:16:08     INFO -   7  libxul.so!mozilla::gl::GLContextProviderEGL::CreateHeadless [GLContextProviderEGL.cpp:7c79c672fe8d : 977 + 0x9]
01:16:08     INFO -   8  libxul.so!mozilla::widget::GfxInfo::GLStrings::EnsureInitialized [GfxInfo.cpp:7c79c672fe8d : 78 + 0x3]
01:16:08     INFO -   9  libxul.so!mozilla::widget::GfxInfo::EnsureInitialized [GfxInfo.cpp:7c79c672fe8d : 40 + 0x3]
01:16:08     INFO -  10  libxul.so!mozilla::widget::GfxInfo::GetAdapterDriver + 0x7
...

from https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c79c672fe8d&selectedJob=17812248
Assignee: milaninbugzilla → nobody
Re-triaging per https://bugzilla.mozilla.org/show_bug.cgi?id=1473195

Needinfo :susheel if you think this bug should be re-triaged.
Priority: -- → P5
We have completed our launch of our new Firefox on Android. The development of the new versions use GitHub for issue tracking. If the bug report still reproduces in a current version of [Firefox on Android nightly](https://play.google.com/store/apps/details?id=org.mozilla.fenix) an issue can be reported at the [Fenix GitHub project](https://github.com/mozilla-mobile/fenix/). If you want to discuss your report please use [Mozilla's chat](https://wiki.mozilla.org/Matrix#Connect_to_Matrix) server https://chat.mozilla.org and join the [#fenix](https://chat.mozilla.org/#/room/#fenix:mozilla.org) channel.
Status: REOPENED → RESOLVED
Closed: 8 years ago3 years ago
Resolution: --- → INCOMPLETE
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.