Closed Bug 1316187 Opened 3 years ago Closed 3 years ago

verify that all the chrome files we ship are actually referenced

Categories

(Firefox Build System :: General, defect)


Tracking

(firefox55 fixed)

RESOLVED FIXED
mozilla55

People

(Reporter: florian, Assigned: florian)

References

(Depends on 6 open bugs, Blocks 1 open bug)

Details

Attachments

(3 files, 5 obsolete files)

I think we can do this with a mochitest. We'll need a relatively long list of exceptions to take into account files referenced from C++ code, but I think it'll still be worth it.
Attached patch Patch v1 (obsolete) — Splinter Review
I included lists of hardcoded files and paths where there are files I can't find references for (searchplugins and lightweight themes), but there are still lots of reported files (locally on Mac it reports 89 browser files and 42 devtools files as unreferenced). Some unreferenced files are intentionally there (eg. for add-on compat), but most are probably just bloat.
Attachment #8808854 - Flags: feedback?(gijskruitbosch+bugs)
Assignee: nobody → florian
Status: NEW → ASSIGNED
Attached patch Patch v2 (obsolete) — Splinter Review
With little additional effort, we can also check the other way that all chrome files that are referenced actually exist.
Attachment #8808979 - Flags: feedback?(gijskruitbosch+bugs)
Attachment #8808854 - Attachment is obsolete: true
Attachment #8808854 - Flags: feedback?(gijskruitbosch+bugs)
Comment on attachment 8808854 [details] [diff] [review]
Patch v1

Review of attachment 8808854 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/test/general/browser.ini
@@ +146,5 @@
>  [browser_aboutHome_wrapsCorrectly.js]
>  [browser_addKeywordSearch.js]
>  [browser_alltabslistener.js]
> +[browser_all_chrome_files_referenced.js]
> +skip-if = debug # no point in running on both opt and debug, and will likely intermittently timeout on debug

Can we move this and the other js/css parsable stuff out into a separate directory? They take a long time (even on opt) and not further bloating general/ would be a Good Thing.

::: browser/base/content/test/general/browser_all_chrome_files_referenced.js
@@ +1,2 @@
> +/* Any copyright is dedicated to the Public Domain.
> + * http://creativecommons.org/publicdomain/zero/1.0/ */

Don't need this disclaimer.

@@ +10,5 @@
> +                       "chrome://browser/locale/searchplugins/"];
> +
> +var gHardcodedReferences = new Set([
> +  // browser/components/about/AboutRedirector.cpp
> +  "chrome://browser/content/aboutNetError.xhtml",

I'm not convinced hardcoding all of these is a good idea.

Is there some way to enumerate about pages?

@@ +91,5 @@
> +  "resource://gre/res/contenteditable.css",
> +  "resource://gre/res/designmode.css",
> +  "resource://gre-resources/counterstyles.css",
> +  "resource://gre-resources/quirk.css",
> +  "resource://gre/res/svg.css",

It seems all of these shouldn't need to be in here, either.

@@ +127,5 @@
> +  "chrome://global/content/devicestorage.properties",
> +
> +  // widget/nsBaseFilePicker.cpp
> +  "chrome://global/locale/filepicker.properties",
> +  "chrome://global/content/filepicker.properties",

Feels like maybe we should ignore locale .properties files more generally?

@@ +161,5 @@
> +  "chrome://pdf.js/locale/chrome.properties",
> +  "chrome://pdf.js/locale/viewer.properties",
> +
> +  // extensions/cookie/nsCookiePromptService.cpp
> +  "chrome://cookie/content/cookieAcceptDialog.xul",

We got rid of "ask" for cookies, didn't we? Is this really still used?

@@ +204,5 @@
> +  // netwerk/streamconv/converters/nsIndexedToHTML.cpp
> +  "chrome://global/skin/dirListing/dirListing.css",
> +]);
> +
> +function fetchFile(uri) {

Isn't this duplicated with some of the other tests? Can we move it into the parsing helper?

@@ +270,5 @@
> +  return fetchFile(fileUri.spec).then(data => {
> +    for (let line of data.split('\n')) {
> +      let urls = line.match(/url\([^()]+\)/g);
> +      if (!urls) {
> +        // @import rules can take directly a string instead of a url.

Slight Francophonism going on here, I think. In English, I would omit "directly".

@@ +286,5 @@
> +        if (url.startsWith("data:"))
> +          continue;
> +
> +        url = Services.io.newURI(url, null, fileUri).specIgnoringRef;
> +        gReferencesFromCode.add(convertToChromeUri(url));

So does this test supersede the checks we do on .css files right now for images etc.? It feels like we're duplicating work...

@@ +299,5 @@
> +      let urls =
> +        line.match(/chrome:\/\/[a-zA-Z09 -]+\/(content|skin|locale)\/[^"')\\?]*/g);
> +      if (!urls) {
> +        // If there's no absolute chrome URL, look for relative ones in
> +        // src and href attributes.

This being either/or seems confusing. Presumably if a chrome .xul file used src="chrome://foo/content/foo.something" that would match the regex, but it could then still also use src="foo.png" ?

@@ +333,5 @@
> +    }
> +  });
> +}
> +
> +function convertToChromeUri(fileUri) {

This seems... tricky. Can you explain how this is supposed to cope with subdirs?

@@ +409,5 @@
> +
> +  let unreferencedFiles = chromeFiles.filter(isUnreferenced).sort();
> +  is(unreferencedFiles.length, 0, "there should be no unreferenced files");
> +  for (let file of unreferencedFiles)
> +    info("unreferenced chrome file: " + file);

ok(false, ...) is probably easier in terms of treeherder displays.
Attachment #8808854 - Attachment is obsolete: false
Comment on attachment 8808979 [details] [diff] [review]
Patch v2

I think most of my feedback still applies, so re-marking without looking at this further.
Attachment #8808979 - Flags: feedback?(gijskruitbosch+bugs) → feedback+
Attachment #8808854 - Attachment is obsolete: true
Comment on attachment 8808979 [details] [diff] [review]
Patch v2

Review of attachment 8808979 [details] [diff] [review]:
-----------------------------------------------------------------

I love the intent of this patch. I have questions about whether this needs to be a mochitest.

Another issue I see is the giant list of files in browser_all_chrome_files_referenced.js. Having central lists of files that can be defined by components all over the tree is not scalable. Files like package-manifest.in are already a PITA to manage.

The preferred pattern for "global lists of files/path" is to define metadata somehow in the directories where the files themselves live. That way, if you delete or rename a directory, you don't have to go updating random references throughout the tree. It helps keep the barrier to future change low and therefore increases productivity.

Since the needed per-file metadata is related to files being packaged, the proper location for this annotation would be jar.mn files. We'd likely want to introduce a 1 character symbol to denote "referenced by native only" and have moz.build processing emit a file containing a list of said paths that this test can read. That's a bit of work, unfortunately. Code for jar.mn parsing lives in python/mozbuild/mozbuild/jar.mn. Getting a file written that can be read by the test in automation requires a bit of build system plumbing. A build peer can help out. We could possibly defer this work to a follow-up, as making the install small is more valuable than the build system being an ivory tower of architectural purity.
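For illustration (hypothetical syntax — jar.mn has no such marker today, and the symbol would need to be chosen when implementing this), such an annotated jar.mn entry might look like:

```
browser.jar:
%  content browser content/browser/
   content/browser/aboutDialog.xul         (content/aboutDialog.xul)
!  content/browser/aboutNetError.xhtml     (content/aboutNetError.xhtml)
```

Here `!` would mean "referenced by native code only", and the moz.build processing would collect all `!`-marked paths into a generated file the test can read instead of a hardcoded in-test list.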

::: browser/base/content/test/general/browser_all_chrome_files_referenced.js
@@ +1,4 @@
> +/* Any copyright is dedicated to the Public Domain.
> + * http://creativecommons.org/publicdomain/zero/1.0/ */
> +
> +var moduleLocation = gTestPath.replace(/\/[^\/]*$/i, "/parsingTestHelpers.jsm");

Does this need to be a mochitest? Why not an xpcshell test?

I'd also like glandium's opinion at implementing this at the packaging level, which is the correct layer for this check if it can be done there (following the principle of "fail fast").
A packaging level test, while ideal, is also not a workable solution if it involves running a xpcshell or mochitest test: it won't run for android builds, and in the near future, won't run for OSX builds.
Attached patch Patch v3 (obsolete) — Splinter Review
Now without a big list of references from binary code (it turns out the libxul binary file can also be read to extract chrome URLs from it).
Attachment #8808979 - Attachment is obsolete: true
Attached patch Patch v4Splinter Review
This is the patch that was pushed to try in comment 10; it produces lists of unreferenced chrome:// files, with few false positives.
Attachment #8809213 - Attachment is obsolete: true
Attached patch Patch for resource:// URLs (obsolete) — Splinter Review
Additional patch to apply after attachment 8813171 [details] [diff] [review] to also check resource:// URLs.
This is what I pushed to try in comment 11.

The lists of files reported as unreferenced are significantly longer, and there are more false positives, especially for devtools, as I couldn't correctly resolve all the relative paths given to require() (eg. when a baseURL is provided in a different .js file that's loaded in the same scope in normal use).
This is currently reporting 237 potentially unused files, plus 110 for devtools.

Some of these files are false positives for which the test can't find references in the code, either because the URL is built by string concatenation in the code, or because of bugs in the test. These files will need to be whitelisted.

A few files may be unreferenced from Firefox, but intentionally kept for add-on backward compatibility. These files can be whitelisted too.

Most of these files are actually unused and we should file bugs to remove them. Ideally these bugs should be fixed, but if fixing them takes time, we may want to whitelist a few files with a comment pointing to the relevant bug.

We are not going to whitelist 300+ files though, so I'll need help to sort through the list of reported files, and file bugs. I created a google spreadsheet to coordinate this effort:
https://docs.google.com/spreadsheets/d/1lthM18DHQUy2Spn1IN8w-nXHjxU9ion91p996GoAH94/edit
Depends on: 1319428
Depends on: 1319432
Depends on: 1319435
Depends on: 1319457
Depends on: 1319539
Depends on: 1319541
Depends on: 1319551
This seems directly applicable to not shipping unused desktop resources in Fennec: Bug 1044079. \o/
Blocks: 1044079
Looks like at least

resource://app/modules/devtools
resource://gre/modules/devtools

should be added to the set of DevTools prefixes.

I'll keep looking over the DevTools report, thanks for investigating this!
Polished the resource:// patch a bit to address some feedback and whitelist some false positives.
Attachment #8813173 - Attachment is obsolete: true
Depends on: 1319762
Depends on: 1319832
Depends on: 1319844
Depends on: 1319849
Depends on: 1319854
Depends on: 1319861
Depends on: 1320058
Depends on: 1320063
Depends on: 1320066
Depends on: 1320084
Depends on: 1320126
Depends on: 1320154
Depends on: 1320156
Depends on: 1320157
Depends on: 1279533
Depends on: 1330821
Depends on: 1313045
Depends on: 1335409
Depends on: 1337345
Depends on: 1339400
Depends on: 1339401
Depends on: 1339420
Depends on: 1339424
Depends on: 1339432
Depends on: 1339444
Depends on: 1343584
Depends on: 1343824
Depends on: 1343827
Depends on: 1343833
Depends on: 1343837
Depends on: 1343839
Depends on: 1343843
Depends on: 1344252
Depends on: 1344257
Depends on: 1344267
Depends on: 1345513
Depends on: 1347928
Depends on: 1347947
Depends on: 1348358
Depends on: 1348359
Depends on: 1348362
Depends on: 1348369
Depends on: 1348521
Depends on: 1348522
Depends on: 1348524
Depends on: 1348525
Depends on: 1348526
Depends on: 1348529
Depends on: 1348532
Depends on: 1348533
Depends on: 1348555
Depends on: 1348556
Depends on: 1348558
Depends on: 1348559
Depends on: 1348561
Depends on: 1348562
Depends on: 1348570
Attached patch Patch v5 (obsolete) — Splinter Review
I cut the scope a little so that we can land the main part sooner and start benefiting from it.

What remains to do for follow-up:
- handle resource:// URLs. I already have a patch for this attached here, but there's some work left to identify which reports are false positives and which are real bugs, and to file the bugs. I'll keep that for a follow-up coming soon so that it doesn't block adding test coverage for chrome:// URLs.
- catch files that are in the chrome folder inside the omni.ja files, but not reachable from any registered chrome:// URLs. I have simple code to handle this, but unfortunately it has LOTS (1000+) of false positives because somehow some files used with resource URLs end up in the chrome tree (this mostly happens for devtools and pdfjs files). So this should be easier to handle after resource:// URLs are covered.
- reduce the number of false positives and add a whitelist for nonexistent files that are referenced in our code, so that we can make new broken references fail the test. This should prevent bugs like bug 1340443 from happening again in the future.
- replace the simple reference check with a reference graph, so that we can catch files that reference each other but are otherwise unreferenced.
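The last follow-up item — replacing the flat "is this file referenced anywhere?" check with a reachability pass over a reference graph — could be sketched like this (hypothetical helper, not code from the patch):

```javascript
// Sketch of the proposed reference-graph check: start from known entry
// points and mark everything transitively reachable. Files never marked
// are dead code even if they reference each other.
function findUnreachable(graph, roots) {
  // graph: Map<file, Set<file>> of "file -> files it references".
  const reachable = new Set();
  const queue = [...roots];
  while (queue.length) {
    const file = queue.pop();
    if (reachable.has(file)) {
      continue;
    }
    reachable.add(file);
    for (const ref of graph.get(file) || []) {
      queue.push(ref);
    }
  }
  return [...graph.keys()].filter(f => !reachable.has(f));
}

// Two orphaned files referencing only each other pass the simple
// "referenced somewhere" check, but a reachability pass catches them.
const graph = new Map([
  ["browser.xul", new Set(["browser.css"])],
  ["browser.css", new Set()],
  ["orphanA.css", new Set(["orphanB.css"])],
  ["orphanB.css", new Set(["orphanA.css"])],
]);
console.log(findUnreachable(graph, ["browser.xul"]));
// → [ 'orphanA.css', 'orphanB.css' ]
```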
Attachment #8848770 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8848770 [details] [diff] [review]
Patch v5

Review of attachment 8848770 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/test/static/browser_all_files_referenced.js
@@ +1,2 @@
> +/* Any copyright is dedicated to the Public Domain.
> + * http://creativecommons.org/publicdomain/zero/1.0/ */

Don't need this to public domain a test, AIUI.

@@ +222,5 @@
> +      } else if (type == "category" &&
> +                 ["agent-style-sheets", "webextension-scripts",
> +                  "webextension-schemas", "webextension-scripts-addon",
> +                  "webextension-scripts-content",
> +                  "webextension-scripts-devtools"].includes(argv[0])) {

Factor this list of things out into a const Set, and use Set.has() ?

@@ +298,5 @@
> +          url = url.slice(0, -1);
> +
> +        // Make urls like chrome://browser/skin/ point to an actual file,
> +        // and remove the ref if any.
> +        url = Services.io.newURI(url).specIgnoringRef;

I wonder if we should make this not throw outright, but log a failure, if constructing the URL fails.

@@ +466,5 @@
> +        file.match("chrome://([^/]+)/content/([^/.]+)\.xul") ||
> +        file.match("chrome://([^/]+)/skin/([^/.]+)\.css");
> +      if (!pathParts || pathParts[1] != pathParts[2]) {
> +        // TODO: add a whitelist and make this reliable enough that we could
> +        // make the test fail when this catches something new.

Here and for the other todo, can you file bugs and include the bug reference here?
Attachment #8848770 - Flags: review?(gijskruitbosch+bugs) → review+
Depends on: 1349005
Depends on: 1349010
Attached patch Patch v6Splinter Review
I addressed comment 27, and also added support for ignorable whitelist entries, to make the test pass even when running on non-packaged builds. Hopefully it's OK to still carry forward the r+.

(In reply to :Gijs from comment #27)

> ::: browser/base/content/test/static/browser_all_files_referenced.js
> @@ +1,2 @@
> > +/* Any copyright is dedicated to the Public Domain.
> > + * http://creativecommons.org/publicdomain/zero/1.0/ */
> 
> Don't need this to public domain a test, AIUI.

I almost expected you to comment on this license header, as I remember you commenting on a test license header in the past. The reason why I decided to include it here is for consistency: all the other .js files in this folder have one.

Now if these headers are really useless, it would be nice to remove them all at once, and make a linter error when introducing new ones.
Attachment #8848770 - Attachment is obsolete: true
Pushed by florian@queze.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/009667806d2b
verify that all the chrome files we ship are actually referenced, r=Gijs.
Pushed by kwierso@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8483a6ffb5f2
Followup to fix incoming merge bustage before it happens a=bustage
Pushed by florian@queze.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/28893e8731b4
follow-up to fix failures on Linux x64 asan builds that have the crash reporter disabled, r=bustage-fix.
https://hg.mozilla.org/mozilla-central/rev/009667806d2b
https://hg.mozilla.org/mozilla-central/rev/8483a6ffb5f2
https://hg.mozilla.org/mozilla-central/rev/28893e8731b4
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
(In reply to Florian Quèze [:florian] [:flo] from comment #28)
> Now if [test public domain] headers are really useless, it would be nice to remove them all
> at once, and make a linter error when introducing new ones.

I think so. Gerv, can you confirm?
Flags: needinfo?(gerv)
> > Don't need this to public domain a test, AIUI.

Who thinks this, and why? I'm fairly sure you do need it.

I mean, there is a Mozilla policy that tests are PD, but that doesn't mean that explicitly marking them becomes a bad idea.

Gerv
Flags: needinfo?(gerv)
(In reply to Gervase Markham [:gerv] from comment #34)
> > > Don't need this to public domain a test, AIUI.
> 
> Who thinks this, and why? I'm fairly sure you do need it.

I do.

What does "need" mean in your sentence? Your blogpost (https://blog.gerv.net/2014/09/licensing-policy-change-tests-are-now-public-domain/ - from 2 years ago, upon which my understanding has been based all that time!) said:

> PD Test Code is Test Code which is Mozilla Code, which does not carry an explicit license header, and which was either committed to the Mozilla repository on or after 10th September 2014, ...

which seems to be pretty clear that test code in the repo without a license header ends up being public domain. Why do we then still "need" a license header, and how strong is that "need", really?

> I mean, there is a Mozilla policy that tests are PD, but that doesn't mean
> that explicitly marking them becomes a bad idea.

Adding boilerplate to every file in the tree is a bad idea, from my perspective. Recent case in point: we're switching Firefox's icons to SVG (bug 1347543). We have a lot of those icons, and they're so small that the MPL boilerplate is literally the majority of the file. Having that boilerplate in every file (so it becomes part of the DOM that loads in the image, if it's a comment, or so that we need to preprocess all those files, lengthening build times) affects the product we ship (and how we ship it).

Where boilerplate is legally necessary, I will grudgingly accept having it, but otherwise I would really, really like to just get rid of it. I don't open files to read the legalese at the top, but to look at the actual test. The closer to line 0 of the file that the actual test starts, the better.
Flags: needinfo?(gerv)
(In reply to :Gijs from comment #35)
> > PD Test Code is Test Code which is Mozilla Code, which does not carry an explicit license header, and which was either committed to the Mozilla repository on or after 10th September 2014, ...
> 
> which seems to be pretty clear that test code in the repo without a license
> header ends up being public domain. Why do we then still "need" a license
> header, and how strong is that "need", really?

Because per-file license documentation is a good idea:
http://lu.is/blog/2012/03/17/on-the-importance-of-per-file-license-information/

> Adding boilerplate to every file in the tree is a bad idea, from my
> perspective. 

I am not suggesting we take the time to go through the tree adding boilerplate to every file. But when creating new files, you should add the appropriate licensing information. It's intentionally designed to be only 2 lines. Files often end up somewhere other than where you first put them, and license-less files cause problems with reuse.

Gerv
Flags: needinfo?(gerv)
Depends on: 1349307
Depends on: 1349508
(In reply to Gervase Markham [:gerv] from comment #36)
> (In reply to :Gijs from comment #35)
> > > PD Test Code is Test Code which is Mozilla Code, which does not carry an explicit license header, and which was either committed to the Mozilla repository on or after 10th September 2014, ...
> > 
> > which seems to be pretty clear that test code in the repo without a license
> > header ends up being public domain. Why do we then still "need" a license
> > header, and how strong is that "need", really?
> 
> Because per-file license documentation is a good idea:
> http://lu.is/blog/2012/03/17/on-the-importance-of-per-file-license-
> information/
> 
> > Adding boilerplate to every file in the tree is a bad idea, from my
> > perspective. 
> 
> I am not suggesting we take the time to go through the tree adding
> boilerplate to every file. But when creating new files, you should add the
> appropriate licensing information. It's intentionally designed to be only 2
> lines. Files often end up somewhere other than where you first put them, and
> license-less files cause problems with reuse.

So then, do we really need linter errors for new files *without* license headers?
Flags: needinfo?(gerv)
I actually based my reviews on the same assumptions as comment 35, as did many other reviewers. We suggested avoiding PD license headers, since tests default to PD already.
Imo, it would be unfortunate to have to introduce pointless boilerplate in new tests again, and to have to ask people to add it when they create a new test from a copy of an existing one that has no license header. Sounds like a waste of time.
I think a linter error for new files without license headers would be a fine idea. Mozilla's policy is per-file license information. We hope your editor is capable of either auto-collapsing it, or displaying enough lines that 2/3 lines at the top are not an issue.

Gerv
Flags: needinfo?(gerv)
Depends on: 1356029
See Also: → 1377705
Component: Build Config → General
Product: Firefox → Firefox Build System
Target Milestone: Firefox 55 → mozilla55