1451591 - Optionally flush CacheIR spew more frequently

Reporter

Description

•

6 years ago

One of the downsides to the elegance of emitting JSON CacheIR logs the way it is done right now is that a crashed process means no log is emitted. This eliminates the log as a quick way to check what IC could perhaps be the crasher while developing (when multiple might be in play) 

I do wonder if there's a way to make this more useful, but I'm not sure.

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 1

•

6 years ago

I see truncated logs in browser builds pretty frequently. Tom's CacheIR analyzer seems to attempt to fix that up, but it would be nice to get some more fidelity when desired. 

Thinking on it, I think the right way to do this would be to add a counter to CacheIRSpewer, and then flush the output on a regular interval of log entries. The flush could be disabled by default, or set to a high value. 

When accurate logs are desired, for example in a crashing test case, the frequency could be set to 1, and the logs flushed for every entry. 

An ideal place to put the flush logic would be in the destructor of the CacheIRSpewer::Guard class [1]

[1]: https://searchfox.org/mozilla-central/rev/e0c879c86b95bdc752b1dbff6088169735674e4a/js/src/jit/CacheIRSpewer.h#48

Mentor: mgaudet

Summary: Make CACHEIR_LOGS more useful in crashing test cases → Optionally flush CacheIR spew more frequently

Whiteboard: [lang=c++]

Adam Holm

Assignee

Comment 2

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #1)

I'm curious about this bug and see that it's mentored (sorry if I'm incorrect there), but I'm having a hard time gauging how difficult a fix for this bug will be. What would you guess the difficulty of this bug to be? I think I could handle a step up from good-first-bug.

Thanks

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 3

•

6 years ago

(In reply to Adam Holm from comment #2)
> (In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #1)
> 
> I'm curious about this bug and see that it's mentored (sorry if I'm
> incorrect there), but I'm having a hard time gauging how difficult a fix for
> this bug will be. What would you guess the difficulty of this bug to be? I
> think I could handle a step up from good-first-bug.
> 
> Thanks

And I would say this is pretty much that difficulty level :) I'd be glad to help you get this done. 

So, to have an idea of what you're working on, I'd read these two blog posts [1][2]. I don't know about you, but I find context helps. If you don't understand it all, no worries, you can still work on this bug.

Once you're ready to start working on it, here's what I'd recommend you do: 

1. Make sure you can build a SpiderMonkey shell stand alone [3]. You'll want to build with debug. 
2. Try generating a CacheIR log; run a test case from js/src/jit-tests/cacheir/ with the environment variable CACHEIR_LOGS set to an absolute path to a file. I usually use CACHEIR_LOGS=$PWD/cacheir_log
3. Take a look at the generated file, and take note of one of the later CacheIR stubs attached. 
3. Now, in js/src/jit/CacheIR.cpp, find where that stub was attached. The logs are json, and you'll look for the 'attached' property, then find it in the file; it'll be inside a call to |trackAttached|. 
4. Just before that, insert |MOZ_CRASH("hit");|, and recompile. If you re run the test, it should now crash. If you set CACHEIR_LOGS, and look in the log file, you'll note it is empty. 

The reason I have you do steps 2-4 is because I want you to be able to tell when you've got the bug fixed.

Ok, so fixing the bug, you'll flush the output file used by the CacheIRSpewer (which currently is only flushed in its destructor) on a regular cadence. You'll need to add a counter and a limit. 

Searchfox [4] will be your friend for finding things! 

When you have something working, or are close and need feedback, feel free to attach a patch here.


[1]: https://www.mgaudet.ca/technical/2018/6/5/an-inline-cache-isnt-just-a-cache
[2]: https://jandemooij.nl/blog/2017/01/25/cacheir/
[3]: https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Build_Documentation
[4]: https://searchfox.org/

Adam Holm

Assignee

Comment 4

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #3)
> (In reply to Adam Holm from comment #2)
> > (In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #1)
> > 
> > I'm curious about this bug and see that it's mentored (sorry if I'm
> > incorrect there), but I'm having a hard time gauging how difficult a fix for
> > this bug will be. What would you guess the difficulty of this bug to be? I
> > think I could handle a step up from good-first-bug.
> > 
> > Thanks
> 
> And I would say this is pretty much that difficulty level :) I'd be glad to
> help you get this done. 
> 
> So, to have an idea of what you're working on, I'd read these two blog posts
> [1][2]. I don't know about you, but I find context helps. If you don't
> understand it all, no worries, you can still work on this bug.
> 
> Once you're ready to start working on it, here's what I'd recommend you do: 
> 
> 1. Make sure you can build a SpiderMonkey shell stand alone [3]. You'll want
> to build with debug. 
> 2. Try generating a CacheIR log; run a test case from
> js/src/jit-tests/cacheir/ with the environment variable CACHEIR_LOGS set to
> an absolute path to a file. I usually use CACHEIR_LOGS=$PWD/cacheir_log
> 3. Take a look at the generated file, and take note of one of the later
> CacheIR stubs attached. 
> 3. Now, in js/src/jit/CacheIR.cpp, find where that stub was attached. The
> logs are json, and you'll look for the 'attached' property, then find it in
> the file; it'll be inside a call to |trackAttached|. 
> 4. Just before that, insert |MOZ_CRASH("hit");|, and recompile. If you re
> run the test, it should now crash. If you set CACHEIR_LOGS, and look in the
> log file, you'll note it is empty. 
> 
> The reason I have you do steps 2-4 is because I want you to be able to tell
> when you've got the bug fixed.
> 
> Ok, so fixing the bug, you'll flush the output file used by the
> CacheIRSpewer (which currently is only flushed in its destructor) on a
> regular cadence. You'll need to add a counter and a limit. 
> 
> Searchfox [4] will be your friend for finding things! 
> 
> When you have something working, or are close and need feedback, feel free
> to attach a patch here.
> 
> 
> [1]:
> https://www.mgaudet.ca/technical/2018/6/5/an-inline-cache-isnt-just-a-cache
> [2]: https://jandemooij.nl/blog/2017/01/25/cacheir/
> [3]:
> https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/
> Build_Documentation
> [4]: https://searchfox.org/

This is great and very helpful, thank you! I'll get started on the steps that you've outlined.

Adam Holm

Assignee

Comment 5

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #3)
> 4. Just before that, insert |MOZ_CRASH("hit");|, and recompile. If you re
> run the test, it should now crash. If you set CACHEIR_LOGS, and look in the
> log file, you'll note it is empty. 

I'm having some difficulty with this step. I run the test compare.js [1] with the method you describe above. I get a json file of the cache as expected. In the cacheIR log, at line 6203 (out of 6456), there is an |"attached":"Number"|. Then I insert |MOZ_CRASH("hit");| above |trackAttached("Number");| in CacheIR.cpp [2]. I'm not exactly sure what to recompile, I've tried |./mach build js/src| and |./mach build| (I run both from ~/src/mozilla-central/). After doing both, I run compare.js[1] again, but Spidermonkey doesn't crash and the cacheIR log isn't empty (the json file has the same stubs and the same number of them). Any suggestions?

[1]: https://searchfox.org/mozilla-central/source/js/src/jit-test/tests/cacheir/compare.js
[2]: https://searchfox.org/mozilla-central/source/js/src/jit/CacheIR.cpp#5426

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 6

•

6 years ago

Hrm. I'm not sure what mach uses for its default build settings, but it seems like the |MOZ_CRASH| has been compiled out. 

Here's what I do to build (these instructions should work on Mac and Linux, I'm less sure about windows, but the ~ suggests they should work for you)

    cd ~src/mozilla-central/js/src/
    mkdir build_DBG.OBJ # This name is used just because it matches one of the default .hgignore rules. 
    cd build_DBG.OBJ
    ../configure --enable-warnings-as-errors  --enable-tests --disable-optimize --enable-debug # Good for debugging and development
    make -j9

Now you can run the created shell in ./dist/bin/js 

If you wanted to continue using mach, you'd want to read about MOZCONFIGs [1]

[1]: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Configuring_Build_Options

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 7

•

6 years ago

Oh, and to run compare.js in the configurations we typically care about, you could use

    ../jit-test/jit_test.py --jitflags=all ./dist/bin/js cacheir/compare.js 

The tool is moderately intelligent about test locations.

Adam Holm

Assignee

Comment 8

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #6)
> Hrm. I'm not sure what mach uses for its default build settings, but it
> seems like the |MOZ_CRASH| has been compiled out. 
> 
> Here's what I do to build (these instructions should work on Mac and Linux,
> I'm less sure about windows, but the ~ suggests they should work for you)

You're correct, I'm running Ubuntu 16.04.

> 
>     cd ~src/mozilla-central/js/src/
>     mkdir build_DBG.OBJ # This name is used just because it matches one of
> the default .hgignore rules. 
>     cd build_DBG.OBJ
>     ../configure --enable-warnings-as-errors  --enable-tests
> --disable-optimize --enable-debug # Good for debugging and development
>     make -j9
> 
> Now you can run the created shell in ./dist/bin/js 
> 
> If you wanted to continue using mach, you'd want to read about MOZCONFIGs [1]
> 
> [1]:
> https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/
> Build_Instructions/Configuring_Build_Options

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #7)
> Oh, and to run compare.js in the configurations we typically care about, you
> could use
> 
>     ../jit-test/jit_test.py --jitflags=all ./dist/bin/js cacheir/compare.js 
> 
> The tool is moderately intelligent about test locations.

This is very helpful, thank you! I am seeing changes, but the log file for running the test with |MOZ_CRASH("hit");| (same line number as I described earlier) is just truncated, not empty. Is there a step I'm neglecting to include?

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 9

•

6 years ago

(In reply to Adam Holm from comment #8)
> This is very helpful, thank you! I am seeing changes, but the log file for
> running the test with |MOZ_CRASH("hit");| (same line number as I described
> earlier) is just truncated, not empty. Is there a step I'm neglecting to
> include?

I think that empty logs would happen for shorter programs, and truncated ones for longer (buffered output); so if you're seeing truncation, that's sufficient (I see more empty files, but that might be because I'm on a Mac).

Next up: Try to get it so that the only truncation would be the missing final ']' of the array by flushing on every output, and I think we'll be good.

Adam Holm

Assignee

Comment 10

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #9)
> (In reply to Adam Holm from comment #8)
> Next up: Try to get it so that the only truncation would be the missing
> final ']' of the array by flushing on every output, and I think we'll be
> good.

Do you mean create this level of truncation by flushing the output file that CacheIRSpewer uses, or by inserting |MOZ_CRASH| statements into CacheIR.cpp?

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 11

•

6 years ago

(In reply to Adam Holm from comment #10)
> (In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #9)
> > (In reply to Adam Holm from comment #8)
> > Next up: Try to get it so that the only truncation would be the missing
> > final ']' of the array by flushing on every output, and I think we'll be
> > good.
> 
> Do you mean create this level of truncation by flushing the output file that
> CacheIRSpewer uses, or by inserting |MOZ_CRASH| statements into CacheIR.cpp?

Time to start adding flushes :) As I suggested, the destructor of the CacheIRSpewer::Guard class would be the place (See Comment 2).

Adam Holm

Assignee

Comment 12

•

6 years ago

Attached patch bug1451591.patch (obsolete) — Details — Splinter Review

Here's what I have so far. Let me know if it's what you had in mind or if I'm not on the right track, and if there's anything I'm doing incorrectly/improperly.

Thanks!

Flags: needinfo?(mgaudet)

Attachment #9025902 - Flags: feedback?(mgaudet)

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 13

•

6 years ago

Comment on attachment 9025902 [details] [diff] [review]
bug1451591.patch

Review of attachment 9025902 [details] [diff] [review]:
-----------------------------------------------------------------

Very much on the right path, but still some work to be done :)

::: js/src/jit/CacheIRSpewer.cpp
@@ +66,5 @@
> +// Value of spew_interval_ must be set to >= 1, values
> +// of 0 will create errors.
> +// Will be set high by default (a value of 6 currently), meaning caches will be
> +// spewed at less frequent intervals.
> +int CacheIRSpewer::Guard::spew_interval_ = 6;

If we make this non-static, as I think we should, these will both go away.

- A default interval should likely be substantially higher; think 1000 or 10000. 
- We need to be able to vary this without recompiling the source code. I'd suggest an environment variable like CACHEIR_LOG_FLUSH; In the spewer constructor you can do something like 

   spewInterval_ = getenv("CACHEIR_LOG_FLUSH") ? atoi(getenv("CACHEIR_LOG_FLUSH")) : 10000;

It is here you can also apply some sanitization to the value. i.e. if set to zero, perhaps you rewrite that to 1, along with a comment indicating they both spew on every output.

@@ +227,5 @@
> +void
> +CacheIRSpewer::flushCache()
> +{
> +    MOZ_ASSERT(enabled());
> +    output.flush();

I don't think we need an extra member functin here, just access output flush directly.

::: js/src/jit/CacheIRSpewer.h
@@ +51,4 @@
>          const IRGenerator& gen_;
>          const char* name_;
>  
> +        // Counters for flushing caches.

Note: We're not flushing the caches themselves, but we are instead flushing the outputs which are recording said caches. 

So we'll want to rename |flushCache| et. al.

@@ +53,5 @@
>  
> +        // Counters for flushing caches.
> +
> +        // Counts number of times Guard class is called.
> +        static int count_iterations_;

- I'd prefer this not be a static; given the guard class has access to the spewer, and IIRC nested classes have access to private members, this should be easily made.

- The name isn't great, because what we are counting aren't 'iterations' per se, but instead are spewed attachment attempts. Perhaps guardCount_? 

- SpiderMonkey style (as you'll note from the rest of the file) has members as camelCase.

@@ +60,5 @@
> +        // by developer. For e.g., a value of 1 means
> +        // print every time Guard destructor is called,
> +        // a value of 2 means print every other time
> +        // ~Guard() is called, etc.
> +        static int spew_interval_;

While this can be static, given we have a singleton, there's no real need. 

nit: You can wrap the comment wider than you are (use 80 columns as the width). When the comments are so narrow they get hard to read.

@@ +64,5 @@
> +        static int spew_interval_;
> +
> +        // Member to increment iteration counter,
> +        // count_iterations_.
> +        const int count_iterations_adder_;

I don't understand what this particular member is providing?

@@ +89,5 @@
>              sp_.endCache();
>              sp_.lock().unlock();
>            }
> +          MOZ_ASSERT(spew_interval_>0, "spew_interval_ needs to be set >= 1");
> +          if ((count_iterations_ % spew_interval_) == (spew_interval_ - 1)) {

I would combine the increment and simplify the check here: 

    if (sp_.guardCount_++ % spewInterval_ == 0) {...} 

and then we don't have to touch the initializer list. 

I'd also move this code inside the locked section above, to neatly bypass any threading issues.

Attachment #9025902 - Flags: feedback?(mgaudet)

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 14

•

6 years ago

(PS: No need to ni? and f?; one flag suffices :P )

Flags: needinfo?(mgaudet)

Adam Holm

Assignee

Comment 15

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #13)

Thanks for the awesome feedback! 

> @@ +64,5 @@
> > +        static int spew_interval_;
> > +
> > +        // Member to increment iteration counter,
> > +        // count_iterations_.
> > +        const int count_iterations_adder_;
> 
> I don't understand what this particular member is providing?
> 

|count_iterations_adder_| increments |count_iterations_| in the initializer list of Guard to help count how many times the Guard class is called - if I understand what I'm doing there correctly. Admittedly it may not be the best solution - it's just the way I ended up getting a counter working (for determining the cadence). Any suggestions for improvements/better approaches?

Adam Holm

Assignee

Comment 16

•

6 years ago

Attached patch bug1451591.patch (obsolete) — Details — Splinter Review

Here is what I have so far based on your feedback.


* You can disregard my above comment, I figured out how to implement the counter without extraneous member variables.

* For setting |spewInterval_|, I wasn't sure if you wanted me to put the definition in the constructor in CacheIRSpewer.h or CacheIRSpewer.cpp. I put it in CacheIRSpewer.cpp as the destructor in that file had some logic, so I tried to follow that convention. 

* Let me know if the message for values of |spewInterval_| less than 1 isn't what you had in mind, or if |MOZ_ReportAssertionFailure| isn't a good way to display a warning.

* I just realized that I've completely forgotten to implement the limit you mentioned. By a limit, do you mean a value where when |spewInterval_| equals that limit, Guard will stop flushing output?

Attachment #9025902 - Attachment is obsolete: true

Attachment #9028150 - Flags: feedback?(mgaudet)

Adam Holm

Assignee

Comment 17

•

6 years ago

Attached patch bug1451591.patch (obsolete) — Details — Splinter Review

The previous patch was weird because of some mistakes I made with updating. This one should be good, sorry about that!

Attachment #9028150 - Attachment is obsolete: true

Attachment #9028150 - Flags: feedback?(mgaudet)

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 18

•

6 years ago

Comment on attachment 9028154 [details] [diff] [review]
bug1451591.patch

Review of attachment 9028154 [details] [diff] [review]:
-----------------------------------------------------------------

Looks pretty good! One last couple of nits to address, and then we're ready for review!

- Notice how you have sp_.guardCount++ in both the constructor and destructor? This means you're double-incrementing for each guard. I'd just get rid of the increment in the constructor and use the one in the destructor.

::: js/src/jit/CacheIRSpewer.cpp
@@ +40,5 @@
> +                        10000;
> +
> +    if (spewInterval_ < 1) {
> +        spewInterval_ = 1;
> +        MOZ_ReportAssertionFailure("spewInterval_ has been set to a value less"

I've actually never seen MOZ_ReportAssertionFailure before. Searchfox suggests [1] it's not used much at all (which likely means you're using the wrong thing). 

I'd actually get rid of the diagnostic message entirely. This kind of thing can be corrected silently with no trouble. 

[1] https://searchfox.org/mozilla-central/search?q=MOZ_ReportAssertionFailure&path=

::: js/src/jit/CacheIRSpewer.h
@@ +30,5 @@
> +    // Counter to record how many times Guard class is called. This is used to
> +    // determine when to flush outputs based on the given interval value.
> +    // For example, if |spewInterval_ = 2|, outputs will be flushed on
> +    // guardCount_ values 0,2,4,6,...
> +    int guardCount_;

Both guardCount_ and spewInterval_ should be uint32_t. 

We prefer types with a specified width. As well, in this case, it's nonsensical for either value to be negative, so unsigned types are the correct choice.

Attachment #9028154 - Flags: feedback+

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Updated

•

6 years ago

Assignee: nobody → asorholm

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Updated

•

6 years ago

Blocks: 1510603

Adam Holm

Assignee

Comment 19

•

6 years ago

Attached patch bug1451591.patch (obsolete) — Details — Splinter Review

Here's a patch implementing those changes. Let me know if there are any problems and I'll get them fixed.

Thanks for all your help!

Attachment #9028154 - Attachment is obsolete: true

Attachment #9028438 - Flags: review?(mgaudet)

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 20

•

6 years ago

Comment on attachment 9028438 [details] [diff] [review]
bug1451591.patch

Review of attachment 9028438 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great, and I appreciate the commenting!

Attachment #9028438 - Flags: review?(mgaudet) → review+

Adam Holm

Assignee

Comment 21

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #20)
> Comment on attachment 9028438 [details] [diff] [review]
> bug1451591.patch
> 
> Review of attachment 9028438 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks great, and I appreciate the commenting!

Do I just add "checkin-needed" to the keywords field of this bug now? I know the usual process can involve pushing to the Try Server before that, but I don't have commit access at this point.

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 22

•

6 years ago

Hey Adam, 

Here's a try build: 

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e4a57b03cd89f0dc99c82d9fba417c459e95c72d

I did notice though that the commit message isn't great. Can I get a new patch uploaded with this as the commit message: 

"Bug 1451591 - Flush CacheIR Spewer on regular intervals"

Adam Holm

Assignee

Comment 23

•

6 years ago

Attached patch bug1451591.patch — Details — Splinter Review

Here you go!

Attachment #9028438 - Attachment is obsolete: true

Flags: needinfo?(mgaudet)

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Comment 24

•

6 years ago

Comment on attachment 9028736 [details] [diff] [review]
bug1451591.patch

Thanks! You can set checkin-needed now, or I can if you'd like.

Flags: needinfo?(mgaudet)

Attachment #9028736 - Flags: review+

Adam Holm

Assignee

Updated

•

6 years ago

Keywords: checkin-needed

Adam Holm

Assignee

Comment 25

•

6 years ago

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #24)
> Comment on attachment 9028736 [details] [diff] [review]
> bug1451591.patch
> 
> Thanks! You can set checkin-needed now, or I can if you'd like.


Done.

Pulsebot

Comment 26

•

6 years ago

Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3a2c2eb8f481
Flush CacheIR Spewer on regular intervals. r=mgaudet

Keywords: checkin-needed

Razvan Maries

Comment 27

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/3a2c2eb8f481

Status: NEW → RESOLVED

Closed: 6 years ago

status-firefox65: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla65

Ryan VanderMeulen [:RyanVM]

Updated

•

5 years ago

status-firefox61: affected → ---

bug1451591.patch 6 years ago Adam Holm 3.05 KB, patch		Details \| Diff \| Splinter Review
bug1451591.patch 6 years ago Adam Holm 3.87 KB, patch		Details \| Diff \| Splinter Review
bug1451591.patch 6 years ago Adam Holm 2.43 KB, patch	mgaudet : feedback+	Details \| Diff \| Splinter Review
bug1451591.patch 6 years ago Adam Holm 1.98 KB, patch	mgaudet : review+	Details \| Diff \| Splinter Review
bug1451591.patch 6 years ago Adam Holm 1.93 KB, patch	mgaudet : review+	Details \| Diff \| Splinter Review