Closed Bug 916677 Opened 11 years ago Closed 11 years ago

Perform rooting analysis on full browser build

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: sfink, Assigned: sfink)

References

Details

(Whiteboard: [leave open)

Attachments

(13 files, 3 obsolete files)

idiot developer fixes 11 years ago Steve Fink [:sfink] [:s:] 13.69 KB, patch		Details \| Diff \| Splinter Review
Put spidermonkey_build options in the right place 11 years ago Steve Fink [:sfink] [:s:] 2.57 KB, patch	mozilla : review+ sfink : checked-in+	Details \| Diff \| Splinter Review
Run analysis on full browser build 11 years ago Steve Fink [:sfink] [:s:] 13.73 KB, patch	mozilla : review+ sfink : checked-in+	Details \| Diff \| Splinter Review
Split up the linux64-haz build into variants the analyze just the JS shell vs the full browser 11 years ago Steve Fink [:sfink] [:s:] 6.78 KB, patch	mozilla : review+	Details \| Diff \| Splinter Review
Downgrade to gcc 4.7.2 due to plugin error-reporting related bug. 11 years ago Steve Fink [:sfink] [:s:] 1.81 KB, patch	mozilla : review+	Details \| Diff \| Splinter Review
Fix --enable-threadsafe (the new default) JS shell builds 11 years ago Steve Fink [:sfink] [:s:] 821 bytes, patch	mozilla : review+	Details \| Diff \| Splinter Review
The analysis now produces some additional files. Upload them too. 11 years ago Steve Fink [:sfink] [:s:] 1.81 KB, patch	mozilla : review+	Details \| Diff \| Splinter Review
Split up the output files and append GC function stack to hazards. Also check against an expected hazard count. 11 years ago Steve Fink [:sfink] [:s:] 7.64 KB, patch	terrence : review+	Details \| Diff \| Splinter Review
Pass expected number of hazards into the analysis 11 years ago Steve Fink [:sfink] [:s:] 2.53 KB, patch	mozilla : review+	Details \| Diff \| Splinter Review
Upload failed analysis results 11 years ago Steve Fink [:sfink] [:s:] 1.47 KB, patch		Details \| Diff \| Splinter Review
Update expected number of hazards to accommodate some regressions 11 years ago Steve Fink [:sfink] [:s:] 830 bytes, patch	terrence : review+	Details \| Diff \| Splinter Review
Do not check expectations in explain.py (move to mozharness script) 11 years ago Steve Fink [:sfink] [:s:] 7.23 KB, patch	terrence : review+	Details \| Diff \| Splinter Review
Upload analysis results before checking expectations 11 years ago Steve Fink [:sfink] [:s:] 6.44 KB, patch	mozilla : review+	Details \| Diff \| Splinter Review
Upload analysis results before checking expectations 11 years ago Steve Fink [:sfink] [:s:] 9.88 KB, patch	mozilla : review+ sfink : checked-in+	Details \| Diff \| Splinter Review
unittests for read_from_file() 11 years ago Aki Sasaki (not active) 2.03 KB, patch	sfink : review+ mozilla : checked-in+	Details \| Diff \| Splinter Review
Turn the build orange on hazard expectation failure 11 years ago Steve Fink [:sfink] [:s:] 2.91 KB, patch	mozilla : review+ sfink : checked-in+	Details \| Diff \| Splinter Review

Steve Fink [:sfink] [:s:]

Assignee

Description

•

11 years ago

The stuff I previously landed only did the analysis on the code required to build the JS tree. We like having that number separately because (1) it's a different set of people who will fix the problems reported, and (2) it's way faster to generate and therefore easier to deal with.

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Depends on: 910919

Steve Fink [:sfink] [:s:]

Assignee

Comment 1

•

11 years ago

Attached patch idiot developer fixes (obsolete) — Details — Splinter Review

This patch is fixes to stuff that was kinda sorta horribly broken in what is currently landed, to the point that I can't figure out what the heck I tested before landing it.

Attachment #805160 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Comment 2

•

11 years ago

Comment on attachment 805160 [details] [diff] [review]
idiot developer fixes

Doh! Mixed up the patches.

Attachment #805160 - Attachment is obsolete: true

Attachment #805160 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Comment 3

•

11 years ago

Attached patch Put spidermonkey_build options in the right place — Details — Splinter Review

*This* patch fixes stuff that was kinda sorta horribly broken in what is currently landed, to the point that I can't figure out what the heck I tested before landing it.

Attachment #805162 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Comment 4

•

11 years ago

Attached patch Run analysis on full browser build — Details — Splinter Review

And this one splits things up into separate shell and browser builds, selected from buildbot by loading the appropriate config file.

Attachment #805163 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Comment 5

•

11 years ago

Attached patch Split up the linux64-haz build into variants the analyze just the JS shell vs the full browser — Details — Splinter Review

Split up the linux64-haz build into variants the analyze just the JS shell vs the full browser

Attachment #805164 - Flags: review?(aki)

Aki Sasaki (not active)

Updated

•

11 years ago

Attachment #805162 - Flags: review?(aki) → review+

Aki Sasaki (not active)

Comment 6

•

11 years ago

Comment on attachment 805163 [details] [diff] [review]
Run analysis on full browser build

>     def query_env(self, partial_env=None, replace_dict=None,
>+                  purge_env=[],

I know catlee once pointed out that setting a variable to a list in a method parameter can result in weirdness... e.g. calling query_env() with a purge_env set will leave purge_env set on subsequent calls.  His workaround was

    def query_env(..., purge_env=None, ...):
        if purge_env is None:
            purge_env = []

However, I haven't been able to trigger this behavior at all; maybe it's fixed in newer pythons?

Attachment #805163 - Flags: review?(aki) → review+

Aki Sasaki (not active)

Comment 7

•

11 years ago

Comment on attachment 805164 [details] [diff] [review]
Split up the linux64-haz build into variants the analyze just the JS shell vs the full browser

>diff --git a/mozilla/universal_master_sqlite.cfg b/mozilla/universal_master_sqlite.cfg
>--- a/mozilla/universal_master_sqlite.cfg
>+++ b/mozilla/universal_master_sqlite.cfg
>@@ -72,20 +72,21 @@ buildObjects = {'builders': [], 'status'
> 
> import passwords
> reload(passwords)
> 
> #ACTIVE_BRANCHES = ['mozilla-inbound']
> ACTIVE_THUNDERBIRD_BRANCHES = []
> ACTIVE_B2G_BRANCHES = []
> ENABLE_RELEASES = False
> ACTIVE_PROJECTS = []
> #ACTIVE_PROJECTS = ['spidermonkey_try__try']
>+BRANCHES['try']['platforms'] = dict([(name,p) for name,p in BRANCHES['try']['platforms'].items() if 'haz' in name])
> 
> for branch in ACTIVE_BRANCHES:
>     branchObjects = generateBranchObjects(BRANCHES[branch], branch,
>             getattr(passwords, 'secrets', None))
>     buildObjects = mergeBuildObjects(buildObjects, branchObjects)
> 
> for branch in ACTIVE_THUNDERBIRD_BRANCHES:
>     branchObjects = generateBranchObjects(THUNDERBIRD_BRANCHES[branch], branch)
>     # Strip away any duplicate change sources
>     usefulChangeSources = []

I'm going to guess this is staging/testing changes?
r=me without this file, assuming that's correct.

Attachment #805164 - Flags: review?(aki) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 8

•

11 years ago

(In reply to Aki Sasaki [:aki] from comment #6)
> Comment on attachment 805163 [details] [diff] [review]
> Run analysis on full browser build
> 
> >     def query_env(self, partial_env=None, replace_dict=None,
> >+                  purge_env=[],
> 
> I know catlee once pointed out that setting a variable to a list in a method
> parameter can result in weirdness... e.g. calling query_env() with a
> purge_env set will leave purge_env set on subsequent calls.  His workaround
> was
> 
>     def query_env(..., purge_env=None, ...):
>         if purge_env is None:
>             purge_env = []
> 
> However, I haven't been able to trigger this behavior at all; maybe it's
> fixed in newer pythons?

I can guess what he might have been referring to:

  def foo(x=[]):
    print("Incoming: " + str(x))
    x.append(3)
  foo() # Prints []
  foo() # Prints [3]

so the default value is the same object from call to call, which means mutating it is a bad idea. Hm... maybe I should make it purge_env=().

Steve Fink [:sfink] [:s:]

Assignee

Comment 9

•

11 years ago

(In reply to Aki Sasaki [:aki] from comment #7)
> > #ACTIVE_BRANCHES = ['mozilla-inbound']
> > ACTIVE_THUNDERBIRD_BRANCHES = []
> > ACTIVE_B2G_BRANCHES = []
> > ENABLE_RELEASES = False
> > ACTIVE_PROJECTS = []
> > #ACTIVE_PROJECTS = ['spidermonkey_try__try']
> >+BRANCHES['try']['platforms'] = dict([(name,p) for name,p in BRANCHES['try']['platforms'].items() if 'haz' in name])
> > 
> > for branch in ACTIVE_BRANCHES:
> >     branchObjects = generateBranchObjects(BRANCHES[branch], branch,
> >             getattr(passwords, 'secrets', None))
> >     buildObjects = mergeBuildObjects(buildObjects, branchObjects)
> > 
> > for branch in ACTIVE_THUNDERBIRD_BRANCHES:
> >     branchObjects = generateBranchObjects(THUNDERBIRD_BRANCHES[branch], branch)
> >     # Strip away any duplicate change sources
> >     usefulChangeSources = []
> 
> I'm going to guess this is staging/testing changes?
> r=me without this file, assuming that's correct.

Oh. Yes, sorry. And it's extra confusing because the context shows my testing changes is applied before this patch. I will remove this chunk before landing; thanks!

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Attachment #805162 - Flags: checked-in+

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Attachment #805163 - Flags: checked-in+

Steve Fink [:sfink] [:s:]

Assignee

Comment 10

•

11 years ago

https://hg.mozilla.org/build/buildbot-configs/rev/3c944d79c028
http://hg.mozilla.org/build/mozharness/rev/507ad3d948fd
http://hg.mozilla.org/build/mozharness/rev/1b9fe2818b84

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Depends on: 923374

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Depends on: 930802

Steve Fink [:sfink] [:s:]

Assignee

Comment 11

•

11 years ago

Attached patch Downgrade to gcc 4.7.2 due to plugin error-reporting related bug. — Details — Splinter Review

Bug is fixed in 4.8.0, and mysteriously does not trigger in 4.7.2.

Attachment #822579 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Comment 12

•

11 years ago

Attached patch Fix --enable-threadsafe (the new default) JS shell builds — Details — Splinter Review

Attachment #822581 - Flags: review?(aki)

Aki Sasaki (not active)

Updated

•

11 years ago

Attachment #822579 - Flags: review?(aki) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 13

•

11 years ago

Attached patch The analysis now produces some additional files. Upload them too. — Details — Splinter Review

The analysis now produces some additional files. Upload them too.

Attachment #822582 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Comment 14

•

11 years ago

Attached patch Split up the output files and append GC function stack to hazards. Also check against an expected hazard count. — Details — Splinter Review

Split up the output files and append GC function stack to hazards. Also check against an expected hazard count. (Count is in the tree so it can be updated with the tree.)

Attachment #822587 - Flags: review?(terrence)

Aki Sasaki (not active)

Updated

•

11 years ago

Attachment #822581 - Flags: review?(aki) → review+

Aki Sasaki (not active)

Updated

•

11 years ago

Attachment #822582 - Flags: review?(aki) → review+

Terrence Cole [:terrence]

Comment 15

•

11 years ago

Comment on attachment 822587 [details] [diff] [review]
Split up the output files and append GC function stack to hazards. Also check against an expected hazard count.

Review of attachment 822587 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

::: js/src/devtools/rootAnalysis/explain.py
@@ +27,5 @@
> +    with open(args.rootingHazards) as rootingHazards, \
> +        open(args.hazards, 'w') as hazards, \
> +        open(args.extra, 'w') as extra, \
> +        open(args.refs, 'w') as refs:
> +        in_hazard = None

Could you rename this to current_hazard, as in_hazard sounds like it should be a bool, not a key.

@@ +57,5 @@
> +                num_hazards += 1
> +                continue
> +
> +            if in_hazard:
> +                if re.match(r'^$', line):

|if not line.strip():| please. No need to spin up the regex machinery when a simple "is all whitespace" will do.

@@ +61,5 @@
> +                if re.match(r'^$', line):
> +                    # Blank line => end of this hazard
> +                    in_hazard = None
> +                else:
> +                    hazardousGCFunctions[in_hazard][-1] += line

I always found the "assignment to list[-1] appends" behavior to be in the set of things that are only obvious if you are dutch. Could you write this as |hazardousGCFunctions[in_hazard].append(line)|.

@@ +81,5 @@
> +                    if m.group(1) in hazardousGCFunctions:
> +                        current = (m.group(1), hazardousGCFunctions[m.group(1)], [line])
> +                elif current:
> +                    current[2][0] += line
> +            process(current)

This with block took a /long/ time to understand. It's pretty elegant though, even if it would make Cthulhu grin. Embedding a mutable list in the middle of a tuple and then modifying only the first line of it is particularly devious. Fhtagn!

@@ +105,5 @@
> +        sys.exit(1)
> +    else:
> +        print("%d fewer unsafe refs than expected! (expected %d, saw %d)" % (args.expect_refs - num_refs, args.expect_refs, num_refs))
> +
> +sys.exit(0)

Eh? Just fall off the end and let python return 0 itself. If you sys.exit, python won't clean up any of its internal OS proxying bits. For example, you won't block on background threads. Not really a problem for this program since you're closing your files properly, but a good thing to be aware of.

Attachment #822587 - Flags: review?(terrence) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 16

•

11 years ago

(In reply to Terrence Cole [:terrence] from comment #15)
> Comment on attachment 822587 [details] [diff] [review]
> Split up the output files and append GC function stack to hazards. Also
> check against an expected hazard count.
> 
> Review of attachment 822587 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me
> 
> ::: js/src/devtools/rootAnalysis/explain.py
> @@ +27,5 @@
> > +    with open(args.rootingHazards) as rootingHazards, \
> > +        open(args.hazards, 'w') as hazards, \
> > +        open(args.extra, 'w') as extra, \
> > +        open(args.refs, 'w') as refs:
> > +        in_hazard = None
> 
> Could you rename this to current_hazard, as in_hazard sounds like it should
> be a bool, not a key.

You are astute. It originally *was* a bool, until I realized I didn't need a separate variable but neglected to rename it.

> @@ +57,5 @@
> > +                num_hazards += 1
> > +                continue
> > +
> > +            if in_hazard:
> > +                if re.match(r'^$', line):
> 
> |if not line.strip():| please. No need to spin up the regex machinery when a
> simple "is all whitespace" will do.

Heh. I'm transliterating from Perl, and I would've done better with a more direct transliteration here...

> @@ +61,5 @@
> > +                if re.match(r'^$', line):
> > +                    # Blank line => end of this hazard
> > +                    in_hazard = None
> > +                else:
> > +                    hazardousGCFunctions[in_hazard][-1] += line
> 
> I always found the "assignment to list[-1] appends" behavior to be in the
> set of things that are only obvious if you are dutch. Could you write this
> as |hazardousGCFunctions[in_hazard].append(line)|.

That's not what's happening here. I'm appending to the string stored in the last element of the list.

I've added a comment to the initialization of hazardousGCFunctions:

        # Map from a GC function name to the list of hazards resulting from
        # that GC function
        hazardousGCFunctions = {}

Though this does make me realize that I'm totally screwing up the ordering of the hazards. I'm now outputting them ordered by GC function, instead of the function that the hazard is in. That's actually kind of awful. Fixing...

> @@ +81,5 @@
> > +                    if m.group(1) in hazardousGCFunctions:
> > +                        current = (m.group(1), hazardousGCFunctions[m.group(1)], [line])
> > +                elif current:
> > +                    current[2][0] += line
> > +            process(current)
> 
> This with block took a /long/ time to understand. It's pretty elegant
> though, even if it would make Cthulhu grin. Embedding a mutable list in the
> middle of a tuple and then modifying only the first line of it is
> particularly devious. Fhtagn!

That has more to do with my unfamiliarity with Python. I ran into the general trick before, when I needed to export a mutable variable from a module. The simplest workaround seemed to be do export a length-1 list. In Perl, I'd use a reference to a value, which is more what I want (even though the syntax for it is truly awful.)

Would it be better to just make 'current' be a list instead of a tuple? The [0] in current[2][0] is pretty horrible.

Or really, I suppose I should do 
  current = { 'function': m.group(1), 'hazards': ..., 'gcExplanation': line }
  ...
  current['gcExplanation'] += line

and stop packing my data structures so much. I am too disturbed by my suspicions about Python's ability to optimize such code. Bad sfink. Premature optimization. Fixing.

> @@ +105,5 @@
> > +        sys.exit(1)
> > +    else:
> > +        print("%d fewer unsafe refs than expected! (expected %d, saw %d)" % (args.expect_refs - num_refs, args.expect_refs, num_refs))
> > +
> > +sys.exit(0)
> 
> Eh? Just fall off the end and let python return 0 itself. If you sys.exit,
> python won't clean up any of its internal OS proxying bits. For example, you
> won't block on background threads. Not really a problem for this program
> since you're closing your files properly, but a good thing to be aware of.

Oh, sorry. Is there any better equivalent to sys.exit(1)? It feels wrong. I suppose I could throw an uncaught exception, but that'll make the output messy.

Steve Fink [:sfink] [:s:]

Assignee

Comment 17

•

11 years ago

remote:   https://hg.mozilla.org/build/mozharness/rev/c341a28bf5fc
remote:   https://hg.mozilla.org/build/mozharness/rev/0e572af957f6
remote:   https://hg.mozilla.org/build/mozharness/rev/153d9d8718cb
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/98cb3022ae10

Terrence Cole [:terrence]

Comment 18

•

11 years ago

(In reply to Steve Fink [:sfink] from comment #16)
> (In reply to Terrence Cole [:terrence] from comment #15)
> > @@ +81,5 @@
> > > +                    if m.group(1) in hazardousGCFunctions:
> > > +                        current = (m.group(1), hazardousGCFunctions[m.group(1)], [line])
> > > +                elif current:
> > > +                    current[2][0] += line
> > > +            process(current)
> > 
> > This with block took a /long/ time to understand. It's pretty elegant
> > though, even if it would make Cthulhu grin. Embedding a mutable list in the
> > middle of a tuple and then modifying only the first line of it is
> > particularly devious. Fhtagn!
> 
> That has more to do with my unfamiliarity with Python. I ran into the
> general trick before, when I needed to export a mutable variable from a
> module. The simplest workaround seemed to be do export a length-1 list. In
> Perl, I'd use a reference to a value, which is more what I want (even though
> the syntax for it is truly awful.)
> 
> Would it be better to just make 'current' be a list instead of a tuple? The
> [0] in current[2][0] is pretty horrible.
> 
> Or really, I suppose I should do 
>   current = { 'function': m.group(1), 'hazards': ..., 'gcExplanation': line }
>   ...
>   current['gcExplanation'] += line
> 
> and stop packing my data structures so much. I am too disturbed by my
> suspicions about Python's ability to optimize such code. Bad sfink.
> Premature optimization. Fixing.

Yes, that's way better. If you are worried about performance being an issue you should make it a class. In general though, I really wouldn't worry about python speed: they don't have a jit so swapping hashing for indexing is going to be totally unnoticeable below the interpreter overhead. Of course, if you ever run it on pypy, that will be less true. In fact, if you make it a class and don't use setattr, pypy will lay it out like a C struct and get you similar access speeds, which it can't do for a dict.

Carsten Book [:Tomcat]

Comment 19

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/98cb3022ae10

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

John Hopkins (:jhopkins)

Comment 20

•

11 years ago

in production

Steve Fink [:sfink] [:s:]

Assignee

Comment 21

•

11 years ago

Attached patch Pass expected number of hazards into the analysis — Details — Splinter Review

Attachment #824823 - Flags: review?(aki)

Aki Sasaki (not active)

Updated

•

11 years ago

Attachment #824823 - Flags: review?(aki) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 22

•

11 years ago

http://hg.mozilla.org/build/mozharness/rev/a197409f2ed9

Armen [:armenzg]

Comment 23

•

11 years ago

Live in production.

Steve Fink [:sfink] [:s:]

Assignee

Comment 24

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/594ddc2d9331

Steve Fink [:sfink] [:s:]

Assignee

Comment 25

•

11 years ago

Attached patch Upload failed analysis results (obsolete) — Details — Splinter Review

When we find more hazards than expected, upload the results.

Attachment #826396 - Flags: review?(aki)

Phil Ringnalda (:philor)

Comment 26

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/594ddc2d9331

Steve Fink [:sfink] [:s:]

Assignee

Comment 27

•

11 years ago

Attached patch Update expected number of hazards to accommodate some regressions — Details — Splinter Review

It looks like the only thing currently breaking the hazard builds is that we've regressed by a few more hazards. Just add them to the expectation counts for now. I guess we'll want to change the expect counts when fixing hazards or something?

Attachment #826614 - Flags: review?(terrence)

Steve Fink [:sfink] [:s:]

Assignee

Comment 28

•

11 years ago

Comment on attachment 826396 [details] [diff] [review]
Upload failed analysis results

Cancelling review request. The mozharness script should just do the upload before checking the expectations.

Attachment #826396 - Flags: review?(aki)

Terrence Cole [:terrence]

Comment 29

•

11 years ago

Comment on attachment 826614 [details] [diff] [review]
Update expected number of hazards to accommodate some regressions

Review of attachment 826614 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

Attachment #826614 - Flags: review?(terrence) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 30

•

11 years ago

Attached patch Do not check expectations in explain.py (move to mozharness script) — Details — Splinter Review

This probably doesn't really warrant a review, but whatever.

Attachment #827226 - Flags: review?(terrence)

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Depends on: 934698

Steve Fink [:sfink] [:s:]

Assignee

Comment 31

•

11 years ago

Attached patch Upload analysis results before checking expectations (obsolete) — Details — Splinter Review

Ok, better: this does the analysis and uploads the results *before* checking against the expected number of hazards.

Attachment #827230 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Attachment #826396 - Attachment is obsolete: true

Terrence Cole [:terrence]

Comment 32

•

11 years ago

Comment on attachment 827226 [details] [diff] [review]
Do not check expectations in explain.py (move to mozharness script)

Review of attachment 827226 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

Attachment #827226 - Flags: review?(terrence) → review+

Aki Sasaki (not active)

Comment 33

•

11 years ago

Comment on attachment 827230 [details] [diff] [review]
Upload analysis results before checking expectations

I'm overall ok with this, just some logging comments.

>+    def check_expectations(self):
>+        if 'expect_file' not in self.config:
>+            return

I'd prefer a self.info("No 'expect_file' in self.config; skipping!") before the return, so we know why nothing's running.


>+        data = json.load(file(expect_file, 'r'))
>+
>+        num_hazards = 0
>+        num_refs = 0
>+        with open(os.path.join(analysis_dir, "rootingHazards.txt")) as hazards_fh:

For both data and hazards_fh, self.read_from_file() would provide more logging.
http://hg.mozilla.org/build/mozharness/file/e8f03d2f9277/mozharness/base/script.py#l373

>+        expect_hazards = data.get('expect-hazards')
>+        if expect_hazards is not None and expect_hazards != num_hazards:
>+            if expect_hazards < num_hazards:
>+                self.fatal("%d more hazards than expected (expected %d, saw %d)" %
>+                           (num_hazards - expect_hazards, expect_hazards, num_hazards))
>+        else:
>+            self.info("%d fewer hazards than expected! (expected %d, saw %d)" %
>+                      (expect_hazards - num_hazards, expect_hazards, num_hazards))
>+
>+        expect_refs = data.get('expect-refs')
>+        if expect_refs is not None and expect_refs != num_refs:
>+            if expect_refs < num_refs:
>+                self.fatal("%d more unsafe refs than expected (expected %d, saw %d)" %
>+                           (num_refs - expect_refs, expect_refs, num_refs))
>+            else:
>+                self.info("%d fewer unsafe refs than expected! (expected %d, saw %d)" %
>+                          (expect_refs - num_refs, expect_refs, num_refs))

Do you want to self.fatal() or set your self.return_code?
Either's fine, though you may want to turn orange rather than red to differentiate between types of errors.
BuildbotMixin has some helper methods/constants if you want to turn orange.

Attachment #827230 - Flags: review?(aki) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 34

•

11 years ago

(In reply to Aki Sasaki [:aki] from comment #33)
> Comment on attachment 827230 [details] [diff] [review]
> Upload analysis results before checking expectations
> 
> I'm overall ok with this, just some logging comments.
> 
> >+    def check_expectations(self):
> >+        if 'expect_file' not in self.config:
> >+            return
> 
> I'd prefer a self.info("No 'expect_file' in self.config; skipping!") before
> the return, so we know why nothing's running.

Good call.

> >+        data = json.load(file(expect_file, 'r'))
> >+
> >+        num_hazards = 0
> >+        num_refs = 0
> >+        with open(os.path.join(analysis_dir, "rootingHazards.txt")) as hazards_fh:
> 
> For both data and hazards_fh, self.read_from_file() would provide more
> logging.
> http://hg.mozilla.org/build/mozharness/file/e8f03d2f9277/mozharness/base/
> script.py#l373

Hm. rootingHazards.txt used to be huge, though it's gotten far better. But it's totally possible that a small code change could trigger thousands of hazards, which would bloat it up to a very large file (tens, maybe hundreds, of megabytes.) Given that these are the sorts of things this builder is here to detect, that makes me a little nervous of sucking the whole file into memory.

The other stuff in self.read_from_file is good, though. Maybe I'll take a shot at doing a line-based version.

Btw, wouldn't self.read_from_file leak the filehandle if fh.read() throws?

> Do you want to self.fatal() or set your self.return_code?
> Either's fine, though you may want to turn orange rather than red to
> differentiate between types of errors.
> BuildbotMixin has some helper methods/constants if you want to turn orange.

I very very much want these to turn orange rather than red. I'll change this.

Aki Sasaki (not active)

Comment 35

•

11 years ago

(In reply to Steve Fink [:sfink] from comment #34)
> > >+        data = json.load(file(expect_file, 'r'))
> > >+
> > >+        num_hazards = 0
> > >+        num_refs = 0
> > >+        with open(os.path.join(analysis_dir, "rootingHazards.txt")) as hazards_fh:
> > 
> > For both data and hazards_fh, self.read_from_file() would provide more
> > logging.
> > http://hg.mozilla.org/build/mozharness/file/e8f03d2f9277/mozharness/base/
> > script.py#l373
> 
> Hm. rootingHazards.txt used to be huge, though it's gotten far better. But
> it's totally possible that a small code change could trigger thousands of
> hazards, which would bloat it up to a very large file (tens, maybe hundreds,
> of megabytes.) Given that these are the sorts of things this builder is here
> to detect, that makes me a little nervous of sucking the whole file into
> memory.
> 
> The other stuff in self.read_from_file is good, though. Maybe I'll take a
> shot at doing a line-based version.

That makes sense.

> Btw, wouldn't self.read_from_file leak the filehandle if fh.read() throws?

I think once you leave the method, there are no references to it and it gets gc'ed.
http://patshaughnessy.net/2013/10/24/visualizing-garbage-collection-in-ruby-and-python
If that's what you're asking.

> > Do you want to self.fatal() or set your self.return_code?
> > Either's fine, though you may want to turn orange rather than red to
> > differentiate between types of errors.
> > BuildbotMixin has some helper methods/constants if you want to turn orange.
> 
> I very very much want these to turn orange rather than red. I'll change this.

Ok.  There are a lot of examples from other test scripts on this; let me know if you need more pointers on this.

Terrence Cole [:terrence]

Comment 36

•

11 years ago

(In reply to Aki Sasaki [:aki] from comment #35)
> (In reply to Steve Fink [:sfink] from comment #34)
> 
> > Btw, wouldn't self.read_from_file leak the filehandle if fh.read() throws?
> 
> I think once you leave the method, there are no references to it and it gets
> gc'ed.
> http://patshaughnessy.net/2013/10/24/visualizing-garbage-collection-in-ruby-
> and-python
> If that's what you're asking.

The only reason that happens to work is that python has a terrible garbage collector. If we ever move to pypy, relying on GC to clean-up file handles is going to lead to all sorts of intermittent out-of-file-handles related failures.

Steve Fink [:sfink] [:s:]

Assignee

Comment 37

•

11 years ago

Attached patch Upload analysis results before checking expectations — Details — Splinter Review

Getting opened() to work with read_from_file() was a bit challenging. This seems to do everything except it doesn't split off a different log message if the file doesn't exist. Instead, it'll say

 ERROR - unable to open <filename>: File not found

which seems close enough. (If I do the explicit check and return None, the @contextmanager has a fit and says "generator didn't stop". I tried various things and couldn't get it to work in all cases.)

Attachment #827641 - Flags: review?(aki)

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Attachment #827230 - Attachment is obsolete: true

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 38

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/ae25b8cc8ff8
https://hg.mozilla.org/mozilla-central/rev/e69975bf9002

Aki Sasaki (not active)

Comment 39

•

11 years ago

Attached patch unittests for read_from_file() — Details — Splinter Review

Attachment #827759 - Flags: review?(sphink)

Aki Sasaki (not active)

Comment 40

•

11 years ago

Comment on attachment 827641 [details] [diff] [review]
Upload analysis results before checking expectations

Passes my unittests; r=me.
Guessing the orange-ness is coming later.

Attachment #827641 - Flags: review?(aki) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 41

•

11 years ago

Comment on attachment 827759 [details] [diff] [review]
unittests for read_from_file()

Review of attachment 827759 [details] [diff] [review]:
-----------------------------------------------------------------

::: test/test_base_script.py
@@ +315,5 @@
> +        contents = self.s.read_from_file(self.temp_file)
> +        self.assertEqual(contents, test_string)
> +
> +    def test_read_from_nonexistent_file(self):
> +        self.s = script.BaseScript(initial_config_file='test/test.json')

Just wondering -- why don't all of these use get_debug_script_obj()?

@@ +631,5 @@
>          self.assertEqual(self.s.post_action_1_args[0][1], dict(success=True))
>          self.assertEqual(self.s.post_action_1_args[1][0], ('build',))
>          self.assertEqual(self.s.post_action_1_args[1][1], dict(success=True))
>  
>          # post_action_3 should only get called for the action is is registered

While you're nearby, s/is is/it is/?

Attachment #827759 - Flags: review?(sphink) → review+

Steve Fink [:sfink] [:s:]

Assignee

Comment 42

•

11 years ago

(In reply to Aki Sasaki [:aki] from comment #40)
> Guessing the orange-ness is coming later.

Doh! Totally forgot about that.

Steve Fink [:sfink] [:s:]

Assignee

Comment 43

•

11 years ago

Attached patch Turn the build orange on hazard expectation failure — Details — Splinter Review

Seems pretty straightforward.

Attachment #827864 - Flags: review?(aki)

Aki Sasaki (not active)

Comment 44

•

11 years ago

Comment on attachment 827759 [details] [diff] [review]
unittests for read_from_file()

https://hg.mozilla.org/build/mozharness/rev/0dc9e4d95038

(In reply to Steve Fink [:sfink] from comment #41)
> Comment on attachment 827759 [details] [diff] [review]
> unittests for read_from_file()
> 
> Review of attachment 827759 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: test/test_base_script.py
> @@ +315,5 @@
> > +        contents = self.s.read_from_file(self.temp_file)
> > +        self.assertEqual(contents, test_string)
> > +
> > +    def test_read_from_nonexistent_file(self):
> > +        self.s = script.BaseScript(initial_config_file='test/test.json')
> 
> Just wondering -- why don't all of these use get_debug_script_obj()?

Dunno, get_debug_script_obj() was specifically for the log_level=debug setting, which is hardly ever used (never?).

Probably no good reason.

> @@ +631,5 @@
> >          self.assertEqual(self.s.post_action_1_args[0][1], dict(success=True))
> >          self.assertEqual(self.s.post_action_1_args[1][0], ('build',))
> >          self.assertEqual(self.s.post_action_1_args[1][1], dict(success=True))
> >  
> >          # post_action_3 should only get called for the action is is registered
> 
> While you're nearby, s/is is/it is/?

Sure.

Attachment #827759 - Flags: checked-in+

Aki Sasaki (not active)

Comment 45

•

11 years ago

Comment on attachment 827864 [details] [diff] [review]
Turn the build orange on hazard expectation failure

>-                self.fatal("%d more hazards than expected (expected %d, saw %d)" %
>-                           (num_hazards - expect_hazards, expect_hazards, num_hazards))
>+                self.warning("%d more hazards than expected (expected %d, saw %d)" %
>+                             (num_hazards - expect_hazards, expect_hazards, num_hazards))

You may want to TinderboxPrint this... it's a special string that Tinderbox (and now TBPL) respect and show up in the summary when you click on a build/test but don't yet go to the log file.  Same for the next edited block.

That would look like

self.warning("TinderboxPrint: %d more hazards...

>+                self.buildbot_status(TBPL_WARNING)

Ok, so I dug into this.
We're getting the right color for the other test scripts because this map
http://hg.mozilla.org/build/mozharness/file/0dc9e4d95038/mozharness/mozilla/buildbot.py#l40
matches up to this rc_eval_func in buildbotcustom.misc.generateTestBuilder():
http://hg.mozilla.org/build/buildbotcustom/file/d54d708fba06/misc.py#l506

HOWEVER, the hazards builders aren't in buildbot-configs/mozilla-tests/config.py.  They're in buildbot-configs/mozilla/config.py, e.g.
http://hg.mozilla.org/build/buildbot-configs/file/20bafe3167a0/mozilla/config.py#l530
I think you're calling buildbotcustom.misc.makeMHFactory():
http://hg.mozilla.org/build/buildbotcustom/file/d54d708fba06/misc.py#l392
There is no rc_eval_func for makeMHFactory(), and I'm not sure what adding one will do to the other mozharness-based builds (largely b2g atm I think).
Probably harmless...

What I'm getting at is this looks like a good mozharness patch for orangeness, but we probably need a buildbotcustom.misc patch for buildbot to do the right thing.
If you want to limit the scope of the rc_eval_func to only hazards builders, then we may also need a buildbot-configs component.

Sorry about that, I'm not intentionally moving the goalposts...

Attachment #827864 - Flags: review?(aki) → review+

John Hopkins (:jhopkins)

Comment 46

•

11 years ago

in production

Steve Fink [:sfink] [:s:]

Assignee

Comment 47

•

11 years ago

(In reply to Aki Sasaki [:aki] from comment #45)
> You may want to TinderboxPrint this... it's a special string that Tinderbox
> (and now TBPL) respect and show up in the summary when you click on a
> build/test but don't yet go to the log file.  Same for the next edited block.

Yes, I'd definitely like it to show up. I think I'll make it show up whenever it doesn't match the expectation, even if it is an unexpectedly low hazard count.

> >+                self.buildbot_status(TBPL_WARNING)
> 
> Ok, so I dug into this.

Oh! Thanks for doing that. I just kind of threw it in there and looked at the mozharness output. I didn't actually test it all the way through to a slave.

And ugh.

> I think you're calling buildbotcustom.misc.makeMHFactory():
> http://hg.mozilla.org/build/buildbotcustom/file/d54d708fba06/misc.py#l392
> There is no rc_eval_func for makeMHFactory(), and I'm not sure what adding
> one will do to the other mozharness-based builds (largely b2g atm I think).
> Probably harmless...

Hm. It seems to me like this is generally desirable. For example, the current (non-mozharness) builds can turn orange because they run post-build tests. Ignoring the fact that we kind of want to move most of those tests into a separate builder, we can't replicate current behavior via mozharness. So I'd argue for adding it into build jobs anyway.

> 
> What I'm getting at is this looks like a good mozharness patch for
> orangeness, but we probably need a buildbotcustom.misc patch for buildbot to
> do the right thing.
> If you want to limit the scope of the rc_eval_func to only hazards builders,
> then we may also need a buildbot-configs component.

I think it makes the most sense to have it globally available.

Though the whole status-via-log-parsing might not be the best way to do this in the first place. How would you feel about mozharness steps producing a "builder result JSON blob" that could be processed to produce the red/orange/retry/whatever status? I don't know how hard it would be, but it would also be a nice place to stick crazy stuff like "here's the set of tests that failed, if you retrigger you can request to only rerun those tests" or whatever. TinderboxPrint stuff could go there too, and eventually maybe you could use it to get the master completely out of the log processing business.

> Sorry about that, I'm not intentionally moving the goalposts...

But I can land as-is anyway, since I'm setting the return code. It'll just produce a red build for now.

(And the above "builder result json blob" stuff is me picking up a goalpost and wandering off with it...)

Let me test my understanding. For a test mozharness job:

- the mozharness script calls self.buildbot_status(TBPL_WARNING)
- aka self.buildbot_status('WARNING')
- which prints "# TBPL WARNING #" to the log file
  - which is entirely ignored
- and also sets the result code to 1 (aka EXIT_STATUS_DICT[TBPL_WARNING])
- this result code is transmitted back to the master
- the master is running this script as a ScriptFactory with log_eval_func set to rc_eval_func({1: WARNINGS})
- the latter is a closure that maps the command result code (1) to a step outcome (WARNINGS)

And so the risk is that other mozharness build jobs are returning nonzero status codes that currently map to ERRORS, and might switch to WARNINGS. (Or worse, to RETRY or something.) Do I have all that right?

Steve Fink [:sfink] [:s:]

Assignee

Comment 48

•

11 years ago

Oops, should have asked for this to be left open.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Whiteboard: [leave open

Aki Sasaki (not active)

Comment 49

•

11 years ago

(In reply to Steve Fink [:sfink] from comment #47)
> > I think you're calling buildbotcustom.misc.makeMHFactory():
> > http://hg.mozilla.org/build/buildbotcustom/file/d54d708fba06/misc.py#l392
> > There is no rc_eval_func for makeMHFactory(), and I'm not sure what adding
> > one will do to the other mozharness-based builds (largely b2g atm I think).
> > Probably harmless...
> 
> Hm. It seems to me like this is generally desirable. For example, the
> current (non-mozharness) builds can turn orange because they run post-build
> tests. Ignoring the fact that we kind of want to move most of those tests
> into a separate builder, we can't replicate current behavior via mozharness.
> So I'd argue for adding it into build jobs anyway.

I think it's generally desirable, but may have unforeseen side effects in the short term.

> Though the whole status-via-log-parsing might not be the best way to do this
> in the first place.

I think the rc_eval_func only looks at the exit status so it's not really parsing logs on the master.  We are parsing logs in mozharness itself when an error_list is set for a run_command(), but I think that's a necessary evil when we can't control all the commands we run.

> How would you feel about mozharness steps producing a
> "builder result JSON blob" that could be processed to produce the
> red/orange/retry/whatever status? I don't know how hard it would be, but it
> would also be a nice place to stick crazy stuff like "here's the set of
> tests that failed, if you retrigger you can request to only rerun those
> tests" or whatever. TinderboxPrint stuff could go there too, and eventually
> maybe you could use it to get the master completely out of the log
> processing business.

As above, the master's out of the log processing for rc_eval_func, but does upload logs to ftp.m.o.  Ideally we could do slave-side log uploading, but then we lose the last part of the log (if something fails after/while we upload the log)

The status json may have other good properties, so I'm not shooting this down.  If we pursue this, I would want this to be the global standard for things, though, not another way of doing things.  http://xkcd.com/927/

> > Sorry about that, I'm not intentionally moving the goalposts...
> 
> But I can land as-is anyway, since I'm setting the return code. It'll just
> produce a red build for now.

Yes.

> Let me test my understanding. For a test mozharness job:
> 
> - the mozharness script calls self.buildbot_status(TBPL_WARNING)
> - aka self.buildbot_status('WARNING')
> - which prints "# TBPL WARNING #" to the log file
>   - which is entirely ignored

Yes, because sometimes logs are truncated, plus master-side log parsing is less desirable than looking at an exit status.

> - and also sets the result code to 1 (aka EXIT_STATUS_DICT[TBPL_WARNING])
> - this result code is transmitted back to the master
> - the master is running this script as a ScriptFactory with log_eval_func
> set to rc_eval_func({1: WARNINGS})
> - the latter is a closure that maps the command result code (1) to a step
> outcome (WARNINGS)

Yes, you have a very good understanding here :)

> And so the risk is that other mozharness build jobs are returning nonzero
> status codes that currently map to ERRORS, and might switch to WARNINGS. (Or
> worse, to RETRY or something.) Do I have all that right?

I think infinite RETRY is probably the worst potential unintentional side effect here.  And yes, I think that's all correct :)

Steve Fink [:sfink] [:s:]

Assignee

Comment 50

•

11 years ago

https://hg.mozilla.org/build/mozharness/rev/9b794101516c
https://hg.mozilla.org/build/mozharness/rev/6a43c6a7c271

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Attachment #827864 - Flags: checked-in+

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Attachment #827641 - Flags: checked-in+

John Hopkins (:jhopkins)

Comment 51

•

11 years ago

in production

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

11 years ago

Blocks: 937771

Steve Fink [:sfink] [:s:]

Assignee

Comment 52

•

11 years ago

I don't think I need this bug anymore. It's working. Will use new bugs for enhancements.

Status: REOPENED → RESOLVED

Closed: 11 years ago → 11 years ago

Resolution: --- → FIXED

Gregory Szorc [:gps]

Updated

•

10 years ago

Blocks: 961314

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: General Automation → General

You need to log in before you can comment on or make changes to this bug.