Reduce layers.max-active to 2 or 3

REOPENED
Assigned to

Status

()

Firefox for Android
Toolbar
REOPENED
2 years ago
2 years ago

People

(Reporter: snorp, Assigned: milan)

Tracking

45 Branch
Unspecified
Android
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [gfx-noted])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(2 attachments)

This is kind of a sledgehammer for trying to reduce memory usage. I think we should try setting layers.max-active to 2 or 3 and see if we can get any results from that. There will be a perf impact if a page has a bunch of animated elements, but I think most stuff should be fine. A better solution is needed, surely, but this might at least stop us from crashing everywhere.
Nical, Jamie, Milan, what do you think? I would like to try this in Nightly and Aurora now, and then we'll get some more feedback in the next Beta cycle too. If we decide it hurts perf too much or just ineffective, we can back it out.
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(milan)
Flags: needinfo?(jnicol)
(Assignee)

Comment 2

2 years ago
What about just doing this when you're running on a "small memory" device?  Or do we see problems even on "large" phones?  But if we want to experiment, then let's do the telemetry experiment on the nightly population, as long as we have a way of getting the performance/crash information we need.
Flags: needinfo?(milan)
Whiteboard: [gfx-noted]
(In reply to Milan Sreckovic [:milan] from comment #2)
> But if we want to experiment,
> then let's do the telemetry experiment on the nightly population, as long as
> we have a way of getting the performance/crash information we need.

+1

See https://wiki.mozilla.org/Telemetry/Experiments
(In reply to Milan Sreckovic [:milan] from comment #2)
> What about just doing this when you're running on a "small memory" device? 
> Or do we see problems even on "large" phones?  But if we want to experiment,
> then let's do the telemetry experiment on the nightly population, as long as
> we have a way of getting the performance/crash information we need.

We see the OOM on all devices. What does the telemetry experiment get us? We don't need to collect any data and we don't want to restrict it to some subset of the population. All we want is to see if it reduces the crash rate, specifically OOM crashes.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)
> (In reply to Milan Sreckovic [:milan] from comment #2)
> > But if we want to experiment,
> > then let's do the telemetry experiment on the nightly population, as long as
> > we have a way of getting the performance/crash information we need.
> 
> +1
> 
> See https://wiki.mozilla.org/Telemetry/Experiments

Those don't work on Mobile, but we have our own A/B testing system. It would work for everything but associating crash data, which is not collected via telemetry on Mobile. Adding that support (a different bug) would be nice.
(Assignee)

Comment 6

2 years ago
This preference is currently set to 20.  The proposal is to set it to 2 or 3.  I'd like to see what that does to the crash rate, whether it introduces other crashes, regresses other things, etc.  So, we can do it by making the change in nightly, waiting a week, and looking at the data, but if we can do it for half the people during that week, we take other variables out of the way (other bugs being introduced or fixed on nightly) and hopefully see the causality of any new problems or no problems at all.

Mark, is there a bug for the "adding support to associate crash data to experiments on mobile"?  Sounds like there may have been thoughts on the subject, and it sounds what I was hoping to get with the experiment I proposed is not currently available.
(Assignee)

Comment 7

2 years ago
Created attachment 8726373 [details]
MozReview Request: Bug 1252929: Change default layers.max-active to 3 on nightly. r?snorp

Review commit: https://reviewboard.mozilla.org/r/37937/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/37937/
Attachment #8726373 - Flags: review?(snorp)
(Assignee)

Comment 8

2 years ago
:snorp, will you look at the incoming data and see if it makes a difference?  Does this need to go beyond nightly to get enough data?
(Assignee)

Comment 9

2 years ago
Created attachment 8726374 [details] [diff] [review]
Bug 1252929: Make layers.max-active preference not require a restart. r=mstange
Attachment #8726374 - Flags: review?(mstange)
Attachment #8726374 - Flags: review?(mstange) → review+
(Assignee)

Comment 10

2 years ago
One thing to consider, here, or in a separate bug, is if we can change the value of this preference on the fly, when we see the memory going down to low levels.  It probably isn't appropriate for the "memory pressure" type of event, but if we do have a way of detecting that things are getting bad, with the "does not require restart" patch above, at least we could try.
Comment on attachment 8726373 [details]
MozReview Request: Bug 1252929: Change default layers.max-active to 3 on nightly. r?snorp

https://reviewboard.mozilla.org/r/37937/#review34523
Attachment #8726373 - Flags: review?(snorp) → review+
(In reply to Milan Sreckovic [:milan] from comment #6)

> Mark, is there a bug for the "adding support to associate crash data to
> experiments on mobile"?  Sounds like there may have been thoughts on the
> subject, and it sounds what I was hoping to get with the experiment I
> proposed is not currently available.

I don't know of a bug. I only know we have been talking about it.

Also, I'm unsure if Telemetry Experiments work in Fennec. I don't know that we've ever tried one.
(In reply to Milan Sreckovic [:milan] from comment #8)
> :snorp, will you look at the incoming data and see if it makes a difference?
> Does this need to go beyond nightly to get enough data?

I'll keep an eye on things. We may want to put it into 46 beta for a cycle to see what happens there.

Comment 14

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/54e28eab4833
(Assignee)

Comment 15

2 years ago
The patch in comment 14 is the non-moz-review one.  The other one will come through moz-review
(Assignee)

Updated

2 years ago
Assignee: nobody → milan
OS: Unspecified → Android
(Assignee)

Comment 16

2 years ago
Leaving needinfo on Jamie and Nicolas, to make sure they see what's been done here.

Comment 17

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/54e28eab4833
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox47: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47

Updated

2 years ago
Flags: needinfo?(nical.bugzilla)

Updated

2 years ago
Flags: needinfo?(jnicol)

Comment 18

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/32c6f4faef5a
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/119f66e82125 on suspicion of being the cause of https://treeherder.mozilla.org/logviewer.html#?job_id=23186558&repo=mozilla-inbound
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I'm guessing that since this bug is "Reduce layers.max-active to 2 or 3" rather than "Make layers.max-active preference not need a restart" that it is not in fact fixed or status-firefox47 fixed.
status-firefox47: fixed → ---
Target Milestone: Firefox 47 → ---
(Assignee)

Comment 21

2 years ago
The failing test: dom/media/mediasource/test/test_MediaSource.html
(Assignee)

Comment 22

2 years ago
Randomly, this passes on desktop with the setting of 3.
(Assignee)

Comment 23

2 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fcf4b3882f72
(Assignee)

Comment 24

2 years ago
Wrong try.  This one: https://treeherder.mozilla.org/#/jobs?repo=try&revision=91ec5bd7dd37
(Assignee)

Comment 25

2 years ago
We hit a MOZ_CRASH here: https://dxr.mozilla.org/mozilla-central/source/gfx/layers/GLImages.cpp#74
There were also a bunch of Android reftest failures which went away on the backout.
(Assignee)

Comment 27

2 years ago
Based on https://treeherder.mozilla.org/#/jobs?repo=try&revision=3ecf35422887, the framebuffer status call in https://dxr.mozilla.org/mozilla-central/source/gfx/gl/ScopedGLHelpers.cpp#278 fails with LOCAL_GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT status, causing the MOZ_CRASH mentioned in comment 25.

This status (https://www.khronos.org/opengles/sdk/docs/man/xhtml/glCheckFramebufferStatus.xml) means "No images are attached to the framebuffer". The size of the buffer for the TexImage2D call (https://dxr.mozilla.org/mozilla-central/source/gfx/layers/GLImages.cpp#66) when this happens is 320x240.

Jeff, can you interpret this data?
Flags: needinfo?(jgilbert)
(In reply to Milan Sreckovic [:milan] from comment #27)
> Based on
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=3ecf35422887, the
> framebuffer status call in
> https://dxr.mozilla.org/mozilla-central/source/gfx/gl/ScopedGLHelpers.
> cpp#278 fails with LOCAL_GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT
> status, causing the MOZ_CRASH mentioned in comment 25.
> 
> This status
> (https://www.khronos.org/opengles/sdk/docs/man/xhtml/
> glCheckFramebufferStatus.xml) means "No images are attached to the
> framebuffer". The size of the buffer for the TexImage2D call
> (https://dxr.mozilla.org/mozilla-central/source/gfx/layers/GLImages.cpp#66)
> when this happens is 320x240.
> 
> Jeff, can you interpret this data?

Well, the driver thinks there's no succcessfully-allocated buffer attached to the framebuffer. Check that COLOR_ATTACHMENT0 has a texture or renderbuffer attached, and that the tex/rb was successfully allocated via texImage/renderbufferStorage. (Make sure these calls didn't fail)

Consider MOZ_GL_DEBUG_ABORT_ON_ERROR=1, which should catch in a debugger if one of these fails.
Flags: needinfo?(jgilbert)
(Assignee)

Comment 29

2 years ago
MOZ_GL_DEBUG_ABORT_ON_ERROR=1 is good, but I think it caught an earlier error, with "context is not current":

01:16:08     INFO -   0  libxul.so!mozilla::gl::GLContext::BeforeGLCall [GLContext.h:7c79c672fe8d : 737 + 0x2]
01:16:08     INFO -   1  libxul.so!mozilla::gl::GLContext::fGetString [GLContext.h:7c79c672fe8d : 1314 + 0x9]
01:16:08     INFO -   2  libxul.so!mozilla::gl::GLContext::InitExtensions [GLContext.cpp:7c79c672fe8d : 1899 + 0x9]
01:16:08     INFO -   3  libxul.so!mozilla::gl::GLContext::InitWithPrefix [GLContext.cpp:7c79c672fe8d : 786 + 0x5]
01:16:08     INFO -   4  libxul.so!mozilla::gl::GLContextEGL::Init [GLContextProviderEGL.cpp:7c79c672fe8d : 256 + 0xb]
01:16:08     INFO -   5  libxul.so!mozilla::gl::GLContextEGL::CreateGLContext [GLContextProviderEGL.cpp:7c79c672fe8d : 529 + 0x5]
01:16:08     INFO -   6  libxul.so!mozilla::gl::GLContextEGL::CreateEGLPBufferOffscreenContext [GLContextProviderEGL.cpp:7c79c672fe8d : 952 + 0x5]
01:16:08     INFO -   7  libxul.so!mozilla::gl::GLContextProviderEGL::CreateHeadless [GLContextProviderEGL.cpp:7c79c672fe8d : 977 + 0x9]
01:16:08     INFO -   8  libxul.so!mozilla::widget::GfxInfo::GLStrings::EnsureInitialized [GfxInfo.cpp:7c79c672fe8d : 78 + 0x3]
01:16:08     INFO -   9  libxul.so!mozilla::widget::GfxInfo::EnsureInitialized [GfxInfo.cpp:7c79c672fe8d : 40 + 0x3]
01:16:08     INFO -  10  libxul.so!mozilla::widget::GfxInfo::GetAdapterDriver + 0x7
...

from https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c79c672fe8d&selectedJob=17812248
You need to log in before you can comment on or make changes to this bug.