Closed Bug 1386304 Opened 2 years ago Closed 2 years ago

Add asan-reporter to codebase

Categories

(Firefox Build System :: General, enhancement)

x86_64
Unspecified
enhancement
Not set

Tracking

(firefox57 fixed)

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: decoder, Assigned: decoder)

References

(Blocks 1 open bug)

Details

(Keywords: sec-want, Whiteboard: [adv-main57-])

Attachments

(1 file, 1 obsolete file)

Attached patch asan-reporter.patch (obsolete) — Splinter Review
For the ASan Nightly project, we need client-side code that is capable of picking up ASan traces that have been written to disk using ASan's "log_path" option and submit them to a special infrastructure run by us.

Attached is a first proof-of-concept patch that adds a system add-on for this purpose, as well as the build flag required to enable it (since we will need a separate build for this).

One problem I was not able to solve is the location of the traces. ASan reads its settings at earliest init, way before XPCOM is initialized. Therefore, I cannot access the temporary or profile directory provided by our codebase. Instead, I had to hardcode "/tmp" which should work fine for Linux and Mac (the main targets for this add-on version). 

The patch also doesn't add the prefs yet used by the add-on because the API endpoint is not determined yet.

Requesting a first feedback from Nathan on this before actually trying to land it.
Attachment #8892487 - Flags: feedback?(nfroyd)
Comment on attachment 8892487 [details] [diff] [review]
asan-reporter.patch

Review of attachment 8892487 [details] [diff] [review]:
-----------------------------------------------------------------

I assume you've tested this with some kind of local server or something?  Not sure if you can write tests that can be executed in-tree for this, but if you could, that would be most excellent.

::: browser/extensions/asan-reporter/README.md
@@ +1,2 @@
> +# firefox-asan-reporter
> +Internal ASan reporter addon for Firefox

I'm sure you realize this too, but it would be nice if this were a little more fleshed out.

::: browser/extensions/asan-reporter/bootstrap.js
@@ +25,5 @@
> +const PREF_API_URL = "asanreporter.apiurl";
> +const PREF_AUTH_TOKEN = "asanreporter.authtoken";
> +
> +function log(aMsg) {
> +  console.log("@ ASan Reporter: " + aMsg);

Logging should be hidden behind some sort of debugging flag; I think we even have a Log.jsm module for such purposes.

@@ +42,5 @@
> +  //
> +  // Instead, we hardcode the /tmp directory here, which should be fine in
> +  // most cases, as long as we are on Linux and Mac (the main targets for
> +  // this addon at the time of writing).
> +  processDirectory("/tmp");

We probably shouldn't do this at addon startup, it should be some sort of "wait until the browser isn't in use" sort of thing.

@@ +57,5 @@
> +  // Scan the directory for any ASan logs that we haven't
> +  // submitted yet. Store the filenames in an array so we
> +  // can close the iterator early.
> +  let promise = iterator.forEach(
> +    function onEntry(entry) {

I think the newish syntax is something like `(entry) => { ... }`, which is a little bit shorter.  Likewise for lots of other closures used below.

@@ +59,5 @@
> +  // can close the iterator early.
> +  let promise = iterator.forEach(
> +    function onEntry(entry) {
> +      if (entry.name.indexOf("ff_asan_log.") == 0
> +       && entry.name.indexOf("submitted") < 0) {

Nit: this is kind of weird indentation.

@@ +79,5 @@
> +        // ... then chain all other report submit calls
> +        for (let i = 1; i < results.length; ++i) {
> +          let f = function() { return submitReport(results[i].path) }
> +          promise = promise.then(f,f);
> +        }

Maybe just Promise.all(Array.map(...)) ?  Not quite the same behavior if one of the promises rejects, but that's probably OK?

@@ +98,5 @@
> +  }
> +  return OS.File.read(reportFile).then(submitToServer).then(markSubmitted);
> +}
> +
> +function submitToServer(data) {

I am not qualified to try and review procedures around submitting things to a server and not submitting PII and such.  You might look at what Telemetry/crash reporting requires for data review and poke some of those folks to see if they have ideas.

@@ +105,5 @@
> +      let cid = Preferences.get(PREF_CLIENT_ID);
> +      let api_url = Preferences.get(PREF_API_URL);
> +      let auth_token = Preferences.get(PREF_AUTH_TOKEN);
> +
> +      let decoder = gTextDecoder;

Why have a global variable for this, instead of creating what you need every time?  More efficient, maybe, but you're also setting up report objects and querying preferences on every submission, too...so it doesn't seem like avoiding a TextDecoder creation every time is saving you much.

@@ +108,5 @@
> +
> +      let decoder = gTextDecoder;
> +
> +      if (!cid) {
> +        cid = "unknown";

What is the cid for, anyway?  This sounds like a unique identifier, which I think we have some strict rules about?

::: build/moz.configure/old.configure
@@ +157,5 @@
>      '--cache-file',
>      '--datadir',
>      '--enable-accessibility',
>      '--enable-address-sanitizer',
> +    '--enable-address-sanitizer-reporter',

New options should be added in moz.configure where possible, and adding this one to moz.configure is definitely possible.

Checks should be added that --enable-address-sanitizer-reporter requires --enable-address-sanitizer.
Attachment #8892487 - Flags: feedback?(nfroyd) → feedback+
(In reply to Nathan Froyd [:froydnj] from comment #1)
> Comment on attachment 8892487 [details] [diff] [review]
> asan-reporter.patch
> 
> Review of attachment 8892487 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I assume you've tested this with some kind of local server or something? 
> Not sure if you can write tests that can be executed in-tree for this, but
> if you could, that would be most excellent.

This is tested with a remote system that we already have (and it is quite huge). Writing local tests for this would require writing a local server as well, so that is a lot of work that we are probably not willing to take at this stage of the project. Also, doing so would still not confirm that this works with our remote systems, so I see little use there.

> 
> ::: browser/extensions/asan-reporter/README.md
> @@ +1,2 @@
> > +# firefox-asan-reporter
> > +Internal ASan reporter addon for Firefox
> 
> I'm sure you realize this too, but it would be nice if this were a little
> more fleshed out.


I didn't write a more detailed README so far because this isn't meant to be used by anyone at the moment. But I will write a more comprehensive README before this actually goes into the tree and we have more details.

> 
> ::: browser/extensions/asan-reporter/bootstrap.js
> @@ +25,5 @@
> > +const PREF_API_URL = "asanreporter.apiurl";
> > +const PREF_AUTH_TOKEN = "asanreporter.authtoken";
> > +
> > +function log(aMsg) {
> > +  console.log("@ ASan Reporter: " + aMsg);
> 
> Logging should be hidden behind some sort of debugging flag; I think we even
> have a Log.jsm module for such purposes.

Ok, I will look that up and see if I can find it. I copied this from other addons though, so this is nothing that just I am doing. Most of the stuff around here I learned from other system addons we have.

> 
> @@ +42,5 @@
> > +  //
> > +  // Instead, we hardcode the /tmp directory here, which should be fine in
> > +  // most cases, as long as we are on Linux and Mac (the main targets for
> > +  // this addon at the time of writing).
> > +  processDirectory("/tmp");
> 
> We probably shouldn't do this at addon startup, it should be some sort of
> "wait until the browser isn't in use" sort of thing.

I disagree. Getting the crash trace to us has highest priority in this kind of build (the whole build is super slow anyway, so you shouldn't even notice these async operations). Waiting until the browser is further in use risks that we crash again (e.g. on tab restore) without having the previous crash submitted, so I would like to keep this in at startup.

> 
> @@ +57,5 @@
> > +  // Scan the directory for any ASan logs that we haven't
> > +  // submitted yet. Store the filenames in an array so we
> > +  // can close the iterator early.
> > +  let promise = iterator.forEach(
> > +    function onEntry(entry) {
> 
> I think the newish syntax is something like `(entry) => { ... }`, which is a
> little bit shorter.  Likewise for lots of other closures used below.

I copied this from our MDN documentation, but I can try to adjust it.

> 
> @@ +59,5 @@
> > +  // can close the iterator early.
> > +  let promise = iterator.forEach(
> > +    function onEntry(entry) {
> > +      if (entry.name.indexOf("ff_asan_log.") == 0
> > +       && entry.name.indexOf("submitted") < 0) {
> 
> Nit: this is kind of weird indentation.

Will fix.

> 
> @@ +79,5 @@
> > +        // ... then chain all other report submit calls
> > +        for (let i = 1; i < results.length; ++i) {
> > +          let f = function() { return submitReport(results[i].path) }
> > +          promise = promise.then(f,f);
> > +        }
> 
> Maybe just Promise.all(Array.map(...)) ?  Not quite the same behavior if one
> of the promises rejects, but that's probably OK?

I can try to use that instead, I have never used it before, so yay for learning new things \o/

> 
> @@ +98,5 @@
> > +  }
> > +  return OS.File.read(reportFile).then(submitToServer).then(markSubmitted);
> > +}
> > +
> > +function submitToServer(data) {
> 
> I am not qualified to try and review procedures around submitting things to
> a server and not submitting PII and such.  You might look at what
> Telemetry/crash reporting requires for data review and poke some of those
> folks to see if they have ideas.

Will do :)

> 
> @@ +105,5 @@
> > +      let cid = Preferences.get(PREF_CLIENT_ID);
> > +      let api_url = Preferences.get(PREF_API_URL);
> > +      let auth_token = Preferences.get(PREF_AUTH_TOKEN);
> > +
> > +      let decoder = gTextDecoder;
> 
> Why have a global variable for this, instead of creating what you need every
> time?  More efficient, maybe, but you're also setting up report objects and
> querying preferences on every submission, too...so it doesn't seem like
> avoiding a TextDecoder creation every time is saving you much.

I copied this from our own internals. It must be saving something if we do it in core code right?

> 
> @@ +108,5 @@
> > +
> > +      let decoder = gTextDecoder;
> > +
> > +      if (!cid) {
> > +        cid = "unknown";
> 
> What is the cid for, anyway?  This sounds like a unique identifier, which I
> think we have some strict rules about?

No, cid is just the "clientid" which is an arbitrary identifier. In our setup, this is either "unknown" or a value the users set through the pref (in this case, their Bugzilla email address so we can Cc them).

> 
> ::: build/moz.configure/old.configure
> @@ +157,5 @@
> >      '--cache-file',
> >      '--datadir',
> >      '--enable-accessibility',
> >      '--enable-address-sanitizer',
> > +    '--enable-address-sanitizer-reporter',
> 
> New options should be added in moz.configure where possible, and adding this
> one to moz.configure is definitely possible.
> 
> Checks should be added that --enable-address-sanitizer-reporter requires
> --enable-address-sanitizer.


I will try to move this. Thanks for the quick feedback and I'd be happy to get another round of responses to my comments here before I start working on the patch that can go into reviews. Thanks!
Flags: needinfo?(nfroyd)
(In reply to Christian Holler (:decoder) from comment #2)
> (In reply to Nathan Froyd [:froydnj] from comment #1)
> > I assume you've tested this with some kind of local server or something? 
> > Not sure if you can write tests that can be executed in-tree for this, but
> > if you could, that would be most excellent.
> 
> This is tested with a remote system that we already have (and it is quite
> huge). Writing local tests for this would require writing a local server as
> well, so that is a lot of work that we are probably not willing to take at
> this stage of the project. Also, doing so would still not confirm that this
> works with our remote systems, so I see little use there.

*shrug*  We have (or had) tests that telemetry pings got submitted and were approximately the right format in-tree, even if the server wasn't a complete mockup of our telemetry server.  I imagine Sync might have similar sorts of tests?  Might help catch silly bugs.

> > ::: browser/extensions/asan-reporter/README.md
> > @@ +1,2 @@
> > > +# firefox-asan-reporter
> > > +Internal ASan reporter addon for Firefox
> > 
> > I'm sure you realize this too, but it would be nice if this were a little
> > more fleshed out.
> 
> I didn't write a more detailed README so far because this isn't meant to be
> used by anyone at the moment. But I will write a more comprehensive README
> before this actually goes into the tree and we have more details.

Thank you.

> > 
> > @@ +42,5 @@
> > > +  //
> > > +  // Instead, we hardcode the /tmp directory here, which should be fine in
> > > +  // most cases, as long as we are on Linux and Mac (the main targets for
> > > +  // this addon at the time of writing).
> > > +  processDirectory("/tmp");
> > 
> > We probably shouldn't do this at addon startup, it should be some sort of
> > "wait until the browser isn't in use" sort of thing.
> 
> I disagree. Getting the crash trace to us has highest priority in this kind
> of build (the whole build is super slow anyway, so you shouldn't even notice
> these async operations). Waiting until the browser is further in use risks
> that we crash again (e.g. on tab restore) without having the previous crash
> submitted, so I would like to keep this in at startup.

I think if we are worried about the browser crashing so early, we probably have bigger issues.

You'll want to make sure that these builds can be filtered out of telemetry somehow (either on the client by turning telemetry off or some sort of identifier in the information they send to telemetry, so we can do filtering on the server or on queries); given that these builds are likely to be significantly slower, their existence is going to throw off some of the responsiveness metrics, etc. that we collect.

> > @@ +105,5 @@
> > > +      let cid = Preferences.get(PREF_CLIENT_ID);
> > > +      let api_url = Preferences.get(PREF_API_URL);
> > > +      let auth_token = Preferences.get(PREF_AUTH_TOKEN);
> > > +
> > > +      let decoder = gTextDecoder;
> > 
> > Why have a global variable for this, instead of creating what you need every
> > time?  More efficient, maybe, but you're also setting up report objects and
> > querying preferences on every submission, too...so it doesn't seem like
> > avoiding a TextDecoder creation every time is saving you much.
> 
> I copied this from our own internals. It must be saving something if we do
> it in core code right?

That would be assuming that our core code is everywhere and always ideal. ;)
Flags: needinfo?(nfroyd)
Attachment #8892487 - Attachment is obsolete: true
Here's the updated version of my ASan reporter patch. Changes I made:

* Replaced custom logging with Log.jsm and added a logging pref
* Added a comprehensive README documenting the purpose and prefs of the addon (see https://github.com/choller/firefox-asan-reporter)
* Used arrow functions for closures where applicable
* Restructured the request promise chaining with a forEach. Using Promise.all will cause promises to run in parallel rather than sequential chaining, so that is not a good option.
* Moved --enable-address-sanitizer-reporter to toolkit/moz.configure. Checking for --enable-address-sanitizer is currently not possible there due to sanitizer options still being in old.configure. I don't think this is a problem because this is an internal option right now, that we are only going to use in one TC build.
* Replaced gTextDecoder with local TextDecoder instances
* Various comment and indent fixes

The detailed list of commits affecting the addon itself is here: https://github.com/choller/firefox-asan-reporter/commits/master

After revisiting the code and potential ways to test it, I decided that writing a test for this would require additional test code that is substantially larger than the addon itself. Since we are going to do internal testing with this addon and it is unlikely to change later on, I don't think spending that amount of resources is justified at this point.

As for submitting crashes at startup: I consider it essential to submit crashes as early as possible on startup, in particular before the user hits session restore. Otherwise, the browser could crash again on session restore without existing traces being submitted. The behavior of this addon is even better than our regular crash reporter, because all the heavy parts of the submit process are async and you can actually browse normally while the submit process is still running (whereas regular crash reporting blocks the browser startup).

Regarding data review, I am already in contact with the necessary people to figure out if shipping this to anyone requires additional reviews. Since we are currently not shipping any of this as part of the official builds or even separate builds, I don't think the data situation needs to be evaluated at this point. We do however need to clarify this before starting any kind of program based on such builds. Also, since we are not submitting any more data than our regular crash reporting, this is probably more a question of server-side storage/handling, rather than client-side code.

In order to not mess with telemetry, the builds based on this will be built with MOZ_TELEMETRY_REPORTING=0 since we don't need any telemetry on this. Fwiw, I don't know the state of the regular ASan builds wrt to this matter. I couldn't find any particular switches turning telemetry off in ASan Nightly builds (which we have already). We should probably investigate if this is on or off by default in ASan nightly.

I would like to land this so we can start making a TC build based on the --enable-address-sanitizer-reporter build flag. Having that build is a requirement to unblock Release Engineering for providing updates for such a build, which could take some time.
Comment on attachment 8903669 [details]
Bug 1386304 - Add and integrate asan-reporter system add-on.

https://reviewboard.mozilla.org/r/175438/#review181978
Attachment #8903669 - Flags: review?(nfroyd) → review+
Pushed by choller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0c2b838c91e4
Add and integrate asan-reporter system add-on. r=froydnj
Pushed by choller@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9c581d290dc3
Add and integrate asan-reporter system add-on. r=froydnj
I just relanded this with eslint failures fixed. Locally ran eslint to confirm that all errors are gone. Keeping r+ since the fixes are only style/syntax changes.
Flags: needinfo?(choller)
https://hg.mozilla.org/mozilla-central/rev/9c581d290dc3
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Whiteboard: [adv-main57-]
Blocks: 1422269
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.