Closed Bug 1262970 Opened 3 years ago Closed 3 years ago

34.32 - 93.02% compiler_metrics num_constructors (linux32, linux64) regression on push da8d6c4eab61 (Thu Apr 7 2016)

Categories

(Core :: General, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: jmaher, Assigned: pehrsons)

References

Details

(Keywords: perf, regression)

Attachments

(2 files)

we have some big changes in num contstructors:
https://treeherder.mozilla.org/perf.html#/alerts?id=760

as we are recording different build types, it is hard to determine exactly what the root cause is, but it falls somewhere here:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&group_state=expanded&filter-searchStr=artifact&tochange=4d0f975a23119a61a6c8e5856125de2db5713c49&fromchange=efaed594f0fdad3825e4bbd9bc1efacc17fccf50

probably in bug 1208371.
(In reply to Joel Maher (:jmaher) from comment #0)
> we have some big changes in num contstructors:
> https://treeherder.mozilla.org/perf.html#/alerts?id=760

How does the 64-bit linux opt record a *fractional* number of constructors?  That makes no sense.

> as we are recording different build types, it is hard to determine exactly
> what the root cause is, but it falls somewhere here:
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-
> inbound&group_state=expanded&filter-
> searchStr=artifact&tochange=4d0f975a23119a61a6c8e5856125de2db5713c49&fromchan
> ge=efaed594f0fdad3825e4bbd9bc1efacc17fccf50
> 
> probably in bug 1208371.

Every build in that page lists compiler_metrics_num_constructors as 86, so it seems kind of unlikely that the problem is in that range...unless that page isn't showing what you thought it was going to show?
Flags: needinfo?(jmaher)
looking on the graph, I see linux64 opt:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,45b561e2236d7631af037d960a78410227317660,1%5D

and this yields no fractions- the problem is we have a lot of from jobs like AB, S, B- all reporting to the same graph.

:wlach, I cannot figure out how linux64 opt shows:
83.71 	< 	149.67

those are odd, maybe it is averaging the bi-modal data?  how can we get the raw replicates that make up the numbers we see?
Flags: needinfo?(jmaher) → needinfo?(wlachance)
Product: Firefox → Core
num_constructors are static constructors that are invoked on process start I take it?

I can't recall adding many of those, though there are a lot of changes in bug 1208371. Hard to remember them all. Is it possible to narrow the range to commit level?
as a note, pgo results are in showing the identical regression for both linux32 and linux64.
I have done a lot of try pushes to help bisect this:
https://treeherder.mozilla.org/#/jobs?repo=try&author=jmaher@mozilla.com&fromchange=d557a1ef56e9


results in a couple of hours
a lot of compiler errors, but I narrowed it down to between:
2f5a37a6a06d01cd1d8c8eea6b48fb54408197e4 - 86
... - compiler errors
7e8bffba9688fd968efaf7329f8b2e34fafa1c5d - 166

:pehrsons, can you redo the bisection on the narrowed range and ensure the builds work?
Flags: needinfo?(pehrsons)
Thanks. I'll try to bisect with fixes.
Hmm. Here's one constructor I introduced: https://dxr.mozilla.org/mozilla-central/rev/d62963756d9a9d19cbbb5d8f3dd3c7cfa8fdef88/dom/media/MediaSegment.h#74

To see how big difference it makes I changed it to a #define and am building at https://treeherder.mozilla.org/#/jobs?repo=try&revision=e0322e05efdd (disregard the other commit, please).
(In reply to Andreas Pehrson [:pehrsons] (Telenor) from comment #9)
> Hmm. Here's one constructor I introduced:
> https://dxr.mozilla.org/mozilla-central/rev/
> d62963756d9a9d19cbbb5d8f3dd3c7cfa8fdef88/dom/media/MediaSegment.h#74
> 
> To see how big difference it makes I changed it to a #define and am building
> at https://treeherder.mozilla.org/#/jobs?repo=try&revision=e0322e05efdd
> (disregard the other commit, please).

So.

> {"name": "num_constructors", "value": 86}

Any clues how that const introduced 80 constructors?
Flags: needinfo?(pehrsons) → needinfo?(jmaher)
thanks for looking into this.  I know next to nothing about the number of constructors- froydnj is the main guy for that measurement.
Flags: needinfo?(jmaher) → needinfo?(nfroyd)
(In reply to Andreas Pehrson [:pehrsons] (Telenor) from comment #10)
> (In reply to Andreas Pehrson [:pehrsons] (Telenor) from comment #9)
> > Hmm. Here's one constructor I introduced:
> > https://dxr.mozilla.org/mozilla-central/rev/
> > d62963756d9a9d19cbbb5d8f3dd3c7cfa8fdef88/dom/media/MediaSegment.h#74
> > 
> > To see how big difference it makes I changed it to a #define and am building
> > at https://treeherder.mozilla.org/#/jobs?repo=try&revision=e0322e05efdd
> > (disregard the other commit, please).
> 
> So.
> 
> > {"name": "num_constructors", "value": 86}
> 
> Any clues how that const introduced 80 constructors?

Thanks for tracking this down.

PrincipalHandle is just an alias for nsMainThreadPtrHolder.  nsMainThreadPtrHolder has a non-trivial constructor and a non-trivial destructor: that's standardese for saying that they're user-defined.  And even though the compiler could theoretically optimize away the constructor (as it's simply storing a constant value), you still have to register the destructor to be run at shutdown time, so you need a static constructor to do that registration.  Which is actually useless work...we should do something about that.

Anyway, that explains one static constructor.  Where do 80 static constructors come from?

We have |const PrincipalHandle PRINCIPAL_HANDLE_NONE| in a header.  That means that every file that includes MediaSegment.h gets a separate copy of PRINCIPAL_HANDLE_NONE, since |const| variables are defined to only have visibility to the translation unit (i.e. file) in which they're defined.  I imagine that the desired effect was to only have this thing defined once, but that's not what happened.

Indeed, on a reasonably current Firefox for x86-64 Linux, if I run:

@nightcrawler:~$ readelf -sW firefox/libxul.so |grep -c PRINCIPAL_HANDLE_NONE
94

showing that we have even more instances now, which is not so great.

The easy way out here is to change the header to declare:

extern const PrincipalHandle PRINCIPAL_HANDLE_NONE;

and then define PRINCIPAL_HANDLE_NONE someplace convenient.  We'll still have a single static constructor, but that's better than having a bunch plus new ones introduced whenever MediaSegment.h gets included someplace else.

The better way, I think, is to add:

  bool operator==(decltype(nullptr)) const { return mPtr == nullptr; }

to nsMainThreadPtrHandle and then replace all instances of PRINCIPAL_HANDLE_NONE with nullptr.  I guess you might need:

  MOZ_IMPLICIT nsMainThreadPtrHandle(decltype(nullptr)) : mPtr(nullptr) {}

to make things work nicely for when the code expects an actual object.
Flags: needinfo?(nfroyd)
(In reply to Joel Maher (:jmaher) from comment #2)
> looking on the graph, I see linux64 opt:
> https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,
> 45b561e2236d7631af037d960a78410227317660,1%5D
> 
> and this yields no fractions- the problem is we have a lot of from jobs like
> AB, S, B- all reporting to the same graph.
> 
> :wlach, I cannot figure out how linux64 opt shows:
> 83.71 	< 	149.67
> 
> those are odd, maybe it is averaging the bi-modal data?  how can we get the
> raw replicates that make up the numbers we see?

Yes, it's averaging the data, which is kind of weird. Perfherder kind of makes the assumption that performance measures things that are non-deterministic in things like this calculation. We could file a bug to handle things like this better, but I honestly don't know when I'd have time to look into it. The thing to do in this case is look at the graph as the source of truth. It's quite obvious what's going on:

https://treeherder.mozilla.org/perf.html#/graphs?timerange=1209600&series=%5Bmozilla-inbound,45b561e2236d7631af037d960a78410227317660,1%5D&selected=%5Bmozilla-inbound,45b561e2236d7631af037d960a78410227317660%5D
Flags: needinfo?(wlachance)
Assignee: nobody → pehrsons
Status: NEW → ASSIGNED
Comment on attachment 8741336 [details]
MozReview Request: Bug 1262970 - Allow nsMainThreadPtrHandle to be constructed by and compared to nullptr. r?bholley

Bouncing this to Nathan.
Attachment #8741336 - Flags: review?(bobbyholley) → review?(nfroyd)
Attachment #8741337 - Flags: review?(nfroyd) → review+
Comment on attachment 8741337 [details]
MozReview Request: Bug 1262970 - Change PRINCIPAL_HANDLE_NONE to a nullptr #define. r?nfroyd

https://reviewboard.mozilla.org/r/46361/#review43341

Apologies for the delay in reviewing this.
Attachment #8741336 - Flags: review?(nfroyd) → review+
Comment on attachment 8741336 [details]
MozReview Request: Bug 1262970 - Allow nsMainThreadPtrHandle to be constructed by and compared to nullptr. r?bholley

https://reviewboard.mozilla.org/r/46359/#review43343
num_contructors sees a win here:
https://treeherder.mozilla.org/perf.html#/alerts?id=885

good stuff!
https://hg.mozilla.org/mozilla-central/rev/6bbf05ba56d8
https://hg.mozilla.org/mozilla-central/rev/24eb2f310602
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Version: unspecified → Trunk
Moving from Core::Untriaged to Core::General https://bugzilla.mozilla.org/show_bug.cgi?id=1407598
Component: Untriaged → General
You need to log in before you can comment on or make changes to this bug.