Open
Bug 838845
Opened 11 years ago
Updated 1 year ago
[meta] Full Review of GfxInfo and feature Blocklisting
Categories
(Core :: Graphics, defect)
Core
Graphics
Tracking
()
NEW
People
(Reporter: akeybl, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Keywords: meta)
Once we're done with the fallout from bug 838603, a full review of blocklisting on Android is a blocker for all future mobile blocklist pushes. We also need to figure out what to do about pushing blocklists in the future that only apply to {version of Firefox vetted for blocklisting} and up.
Comment 1•11 years ago
|
||
The badness is not limited to Android --- I suggest we call this a full review of GfxInfo in general. My bias being that it needs to be rewritten from scratch.
Comment 2•11 years ago
|
||
Should bug 823253 be on here for a block-list removal again dealing with Stage-fright?
Updated•11 years ago
|
Summary: Full Review of Android Graphics Blocklisting → Full Review of GfxInfo and feature Blocklisting
Comment 3•11 years ago
|
||
(In reply to Aaron Train [:aaronmt] from comment #2) > Should bug 823253 be on here for a block-list removal again dealing with > Stage-fright? Removing a blacklist entry should be much safer than adding one... though at this point, I don't want to display to much confidence.
Reporter | ||
Comment 4•11 years ago
|
||
We should complete this review as soon as possible, so we can make any necessary changes in FF20 late in Aurora or early in its beta cycle.
Comment 5•11 years ago
|
||
We're discussing this today in the gfx team. We're making an internal postmortem/proposal asap. I'm not entirely sure that we should aim for FF20 though. At least, we should not rush this and not constrain the new design to accomodate a very near deadline.
Reporter | ||
Comment 6•11 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #5) > We're discussing this today in the gfx team. We're making an internal > postmortem/proposal asap. > > I'm not entirely sure that we should aim for FF20 though. At least, we > should not rush this and not constrain the new design to accomodate a very > near deadline. If possible, please try to separate the identification of issues with our current gfx blocklisting mechanism from a longterm solution. We should leave the door open to short term fixes, in case a critical graphics stability issue is identified prior to a redesign that forces us to react.
Comment 7•11 years ago
|
||
Absolutely. We'll keep these two things separate.
Comment 8•11 years ago
|
||
For reference here (bug 737176) is the previous postmortem we never really did seriously on the previous time that roughly the same thing blew up on us (bug 711656)
Comment 9•11 years ago
|
||
Here's what we discussed so far. It has an analysis of what went wrong, but does not yet have the proposal for the new design. https://etherpad.mozilla.org/gfxinfo
Comment 10•11 years ago
|
||
https://etherpad.mozilla.org/gfxinfo now has a proposal for the new design and a discussion of how it would address issues we identified.
Comment 11•11 years ago
|
||
The etherpad seems to have stabilized so let's dump it here: Part 1. Mini post mortem on the worst bugs we had Bug 838603 - android startup crashes Not fully understood yet, however we know that this was largely caused by a mixture of: - over-complication / absence of design leading to surprising bugs - absence of min-gecko-version field in XML blocklist entries - difficulty of QA'ing XML blocklist entries esp. on android - persistent gfxinfo state in user profile Bug 711656 - D2D crashes Was a combination of: - persistent gfxinfo state in user profile - GfxInfo API exposed to Gecko is too error-prone to use *** Part 2. What's bad with current GfxInfo / what we should make better 1. Poor/complicated design leads to surprising bugs e.g. an unknown feature is interpreted as "all features" by accident 2. Persistent state in user profile Leads to unexpected bugs on real-world user profiles Very hard to predict or QA at all Moreover caching hardware-related info in user profile fails when a profile is used on multiple machines i.e. corp environment, portable firefox, etc. Moreover caching blacklisting decisions in user profile also confuses people who try to upgrade drivers 3. Absence of min-Gecko-version in XML blocklist Means that we can't ship blocklist updates that would break old Firefoxen because of e.g. 1. 4. Should not have a built-in blocklist --- should just fix all reasons why one might not want to use the XML blocklist 5. Should improve XML blocklisting capabilities Not experienced in XML format design. Need help there. E.g. do we want to allow matching hardware fields for substrings, regexes, etc. 6. Bad QAbility Persistent state makes some bugs (see part 1) virtually impossible to QA at all without knowing beforehand what the bug is going to be! No easy way to ping a blacklist update, especially on Android should have a ChromeOnly DOM API, see 9. No easy way to see what blacklist file was used, if it was correctly updated etc. No way to tell what blacklist rule determined the result Cannot test the blocklist entries of platform X while running on platform Y. See 8. 7. Bad testability Cannot test the blocklist entries of platform X while running on platform Y. See 8. XPCOM-only interface is not easy to test. ChromeOnly DOM API would be nicer. See 9. We could use compiled-code tests to test a large number of cases in reasonable time. Especially as some fuzz-like testing could be useful there to test resilience to unexpected XML blocklist entries. XPCshell test gave us the ability to test a browser restart to test XML blocklist updates. But we still can't test what happens when in the real world a build is run against a XML blocklist it wasn't designed for. So we need to change design approach here so that we're much more defensive towards unexpected XML blacklists entries. Then this design requirement would hopefully go away, allowing us to do without the XPCshell test. See below how the new design could allow that. 8. Platform-specific parts should be kept to the minimum i.e. system detection only. In particular that entails that system detection should be a separate self-contained part, separate from general blacklist processing. likely a separate class. 9. Should be a ChromeOnly WebIDL interface, not a XPCOM interface Will allow for mochitest-chrome test Will allow for ease of QA'ing by exposing stuff in a privileged web page Will avoid spreading bad design from XPCOM Will allow to have both a nice JS api and a nice C++ api 10. Bad API exposed to Gecko code Having a WebIDL interface instead of XPCOM interface (9) will go a long way toward making the C++ interface more sane. The interface should also avoid exposing gritty details to code that doesn't need them (precise blacklisting reason code when the user just wants a bool yes/no --- see bug 711656) 11. Impossible to know what blocklist rule determined the result This should be visible in about:support and in crash reports *** Part 3. Proposed new design Let's replace GfxInfo by 3 separate classes: SysInfo Blocklist BlocklistingService A SysInfo instance would contain any information about the machine we're currently running on. One could get a SysInfo by calling a function that does actual system detection; or one could make up one's own custom SysInfo for testing purposes. SysInfo is the only part that would be platform-specific in any way. The SysInfo interface would still be platform-agnostic (otherwise, the whole blocklist processing would need to have platform-specific backends, which we don't want). The system detection code generating a SysInfo representing the host would be the only platform-specific part. A Blocklist would be a set of blocklist rules. One could get one representing the actual blocklist in use (blocklist.xml) or one could make up one's own custom Blocklist for testing purposes. Blocklist would be completely platform-agnostic. BlocklistingService would be a singleton --- not quite sure yet about that part. The BlocklistingService instance would be a good place to cache anything we want to cache for the duration of the Firefox session. BlocklistingService would be completely platform-agnostic. To get a blocklisting decision, the user would call a method on BlocklistingService, passing it a Feature (some enum thing), a SysInfo and a Blocklist. There would be a helper overload only taking a Feature enum and just getting the real SysInfo and Blocklist; while tests would instead provide their own SysInfo and Blocklist. These 3 classes would have ChromeOnly WebIDL bindings so they could be used in the same way by both JS and C++. Blocklist is where all the XML processing would happen. From the outside --- even from BlocklistingService --- it should not matter that the blocklist came from XML BlocklistingService would also offer some more testing/QAing features to force as blocklist ping, obtain the current blocklist and sysinfo, etc. To be determined: what kind of caching would we do, if any, to make the XML blocklist processing not slow down startup? We will cross that bridge when we're there. The key design principle here is that we won't cache anything system-specific; we would just at most cache a more efficient representation of the blocklist entries that could be relevant on the host OS. *** Part 4: How the new design (part 3) would solve the current problems (part 2) 1. Poor/complicated design leads to surprising bugs just taking this occasion to think the design through and code it cleanly should give us a fix for that 2. Persistent state in user profile By not caching anything beyond perhaps a system-agnostic form of the blocklist, we will avoid these problems 3. Absence of min-Gecko-version in XML blocklist We can make such a field mandatory. We can also add an optional max-Gecko-version field. 4. Should not have a built-in blocklist --- should just fix all reasons why one might not want to use the XML blocklist We can take all the existing built-in blocklist and turn it into XML entries, expanding our XML format as needed 5. Should improve XML blocklisting capabilities Will do as needed 6. Bad QAbility No persistent state DOM API will give access to useful features (i.e. ping the blocklist) which we'll be able to expose in privileged pages new design will reduce platform-specific parts to the minimum; will allow to test all platforms' blocklists on all platforms will return the information of which blocklist rule determined the result 7. Bad testability See part 6, the same applies here By not requiring a browser restart for anything, we will be able to everything except testing actual browser restart in mochitests as opposed to xpcshell tests 8. Platform-specific parts should be kept to the minimum i.e. system detection only. Proposed design does give that 9. Should be a ChromeOnly WebIDL interface, not a XPCOM interface Proposed design does give that 10. Bad API exposed to Gecko code New design + WebIDL bindings instead of XPCOM interface should give us that 11. Impossible to know what blocklist rule determined the result No reason why we couldn't write that new code in a way that makes that easy.
Reporter | ||
Comment 12•11 years ago
|
||
Which of the above (outside of the redesign) do we think are blockers for mobile gfx blocklisting vs new desktop gfx blocklisting? Can we get bugs on file for those already?
Comment 13•11 years ago
|
||
Mobile is no different from desktop as far as blocklisting. The above plan only details steps to a redesign; I didn't think of other things that could be done aside from the redesign, as I believe that the right approach is to just do that redesign.
Comment 14•11 years ago
|
||
Am I understanding this correctly, and you want a separate blocklist.xml only for driver blocks? Or is this about extending the current blocklist? I'm just confused because I see you mention XML processing, but I would assume that the current blocklist is already being read and parsed elsewhere.
Comment 15•11 years ago
|
||
This is a plan to redesign the C++ code that handles the blocklist.xml entries, and to expand the xml format there (add new tags). We haven't made plans so far to add a separate XML file, as I was under the assumption that it made sense to avoid adding more network traffic. When we get to the point where we are expanding or changing the XML format, we will definitely work with you to ensure that existing entries are correctly ported.
Reporter | ||
Comment 16•11 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #13) > I didn't think of other things that could > be done aside from the redesign, as I believe that the right approach is to > just do that redesign. Does that mean that the gfx team's recommendation is that we push no further blocks to desktop or mobile until this redesign is complete? If so, have we come to an ETA for the blocking work?
Comment 17•11 years ago
|
||
Not quite that bad. We can push blocks in general, though there may be some types that we couldn't. They should be evaluated with the knowledge gained with bug 824118 until the redesign makes things safer.
Comment 18•11 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #15) > This is a plan to redesign the C++ code that handles the blocklist.xml > entries, and to expand the xml format there (add new tags). We haven't made > plans so far to add a separate XML file, as I was under the assumption that > it made sense to avoid adding more network traffic. When we get to the point > where we are expanding or changing the XML format, we will definitely work > with you to ensure that existing entries are correctly ported. OK, I'll just note that it would be useful to reuse the nodes and attributes we are already using for extension and plugin blocks, which possibly have greater filtering capabilities than what you need for graphics blocks. It'll also make it easier to implement both in Firefox and on AMO.
Comment 19•11 years ago
|
||
@Jorge: interesting, please a link to some documentation about that? @Alex: Is there anything more that you want to happen in Firefox 20 timeframe? Implementation will not be ready for Firefox 20. In the very near term I think all we can do is be extremely careful if we have to add any downloadable blacklist entry. We can still more relaxedly add compiled-in blacklist entries.
Comment 20•11 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #19) > @Jorge: interesting, please a link to some documentation about that? https://wiki.mozilla.org/Blocklisting/Admin
Reporter | ||
Comment 21•11 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #19) > @Alex: Is there anything more that you want to happen in Firefox 20 > timeframe? Implementation will not be ready for Firefox 20. In the very near > term I think all we can do is be extremely careful if we have to add any > downloadable blacklist entry. We can still more relaxedly add compiled-in > blacklist entries. Nope, sounds like there's nothing to do in FF20. Will ask your opinion on bug 843273.
Comment 22•9 years ago
|
||
It would be nice if we could address that in a future release. We have been bitten by this in 33 and that hurt. Milan, could you prioritize this?
Flags: needinfo?(milan)
There hasn't been a lot of push on this; we got in trouble in 21, and now in 33, and were lucky in between, but when bad things happen, usually they're really bad, as blocklisting, especially the downloadable part, is our last line of defense.
Flags: needinfo?(milan)
Updated•9 years ago
|
Assignee: jacob.benoit.1 → nobody
Updated•1 year ago
|
Severity: normal → S3
Updated•1 year ago
|
Summary: Full Review of GfxInfo and feature Blocklisting → [meta] Full Review of GfxInfo and feature Blocklisting
You need to log in
before you can comment on or make changes to this bug.
Description
•