Closed Bug 1084651 Opened 5 years ago Closed 5 years ago

Make SliceBudget's interface clearer and harder to misuse

Categories

(Core :: JavaScript: GC, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: ehoogeveen, Assigned: ehoogeveen)

Details

Attachments

(7 files, 14 obsolete files)

5.06 KB, patch
billm
: review+
Details | Diff | Splinter Review
11.22 KB, patch
billm
: review+
mccr8
: review+
Details | Diff | Splinter Review
2.81 KB, patch
billm
: review+
Details | Diff | Splinter Review
5.15 KB, patch
ehoogeveen
: review+
Details | Diff | Splinter Review
7.26 KB, patch
billm
: review+
Details | Diff | Splinter Review
6.29 KB, patch
ehoogeveen
: review+
Details | Diff | Splinter Review
22.42 KB, patch
ehoogeveen
: review+
Details | Diff | Splinter Review
I was looking at SliceBudget and I noticed that its work mode seems to be broken: SliceBudget::checkOverBudget() returns true if |PRMJ_Now() > deadline|, and since |deadline| is initialized to 0 in the work mode we'll always be 'over budget' immediately.

In addition, you're supposed to initialize a SliceBudget by passing the result of SliceBudget::TimeBudget or SliceBudget::WorkBudget to it, but nothing enforces this - SliceBudget's single argument constructor just takes an int64_t.
This patch fixes the work mode and additionally changes the interface a little. It adds two structs, TimeBudget and WorkBudget, that must be passed to SliceBudget to construct a limited slice, and removes the old constructor.

One thing this might change is how 'isOverBudget' should be interpreted for work mode (it's hard to tell because it was broken): '0 work' now matches '0 time' in that it creates an unbounded budget, and 'isOverBudget' will return true if the allotted budget has been reached, rather than if it has been exceeded. That way if you budget '1 work', isOverBudget will return true if after you call step(1) once. This made more sense to me, and the equivalent situation in 'time mode' seems like it matters less (since it's measured in microseconds, the difference between >= and > is small).

Because of the interface changes, this temporarily changes a couple of places that used the WorkBudget function to simply explicitly pass a negative integer. This is fixed in part 2.
Attachment #8507257 - Flags: review?(wmccloskey)
This changes GCRuntime::collect, GCRuntime::gcCycle, GCRuntime::budgetIncrementalGC and GCRuntime::incrementalCollectSlice to take a SliceBudget& instead of an int64_t so that the right type of budget can be chosen at the source (and not inferred from the POD budget's sign).

I haven't checked if these patches conflict at all with bhackett's work in bug 958492, but they can be easily rebased if so (unless he's changing SliceBudget itself).
Attachment #8507262 - Flags: review?(wmccloskey)
Actually, I guess I'm not sure what [1] is intended to do. Is it intended to do the minimal amount of work, or trigger an unbounded slice?

I'm considering making a part 3 that changes SliceBudget::Unlimited to a large number like UINT32_MAX (i.e. the largest number that GCRuntime::getParameter can return) so we can be more explicit about what '0' actually means. Personally I think it's meaningless for work-based slices, so those could assert that budget > 0. For time-based slices it could be meaningful, since we only check the time after decrementing the counter by 1000 - but maybe it shouldn't be.

Finally, I'm wondering if we should even allow ridiculously long slices - for instance, we could assert that we don't budget slices longer than 1 second (TimeBudget(1000)), in case something ends up passes a very wrong value. But maybe that would make the interface too specific to the time scales involved in GC and CC.

[1] http://hg.mozilla.org/mozilla-central/annotate/ae1dfa192faf/js/src/jsgc.cpp#l6188
Flags: needinfo?(jcoppeard)
Oh wait, I'm stupid - SliceBudget's work mode isn't broken at all, I just got thrown off by the sign swapping going on in the constructor. I still think it's needlessly confusing though; let me redo the patches (I'll probably swap parts 1 and 2 since they make more sense the other way around).
Attachment #8507257 - Attachment is obsolete: true
Attachment #8507257 - Flags: review?(wmccloskey)
Attachment #8507262 - Attachment is obsolete: true
Attachment #8507262 - Flags: review?(wmccloskey)
Summary: Fix SliceBudget's 'work' mode and make SliceBudget harder to misuse → Make SliceBudget's interface clearer and harder to misuse
Oops, wrong file.
Attachment #8507833 - Attachment is obsolete: true
Attachment #8507833 - Flags: review?(wmccloskey)
Attachment #8507834 - Flags: review?(wmccloskey)
Compared to the first version, this changes the work mode back to how it used to work (which was more efficient). It does change isOverBudget to check for |counter > 0|, so that it will fall through after CounterReset ticks and not |CounterReset + 1| ticks, and likewise changes checkOverBudget so that it returns true if the current time happens to exactly match the deadline. That way we'll stop when the allotted budget has been exhausted, not when it has been explicitly exceeded.

In addition, it changes SliceBudget::Unlimited to INT32_MAX and makes all larger values imply an unlimited slice. The cycle collector exposes a work-based slice to JS for testing purposes, so I figured clamping in the constructor wasn't unreasonable. It does assert that |budget > 0| so we don't end up trying to do 0 work. I don't think any places currently did this - the cycle collector allowed passing in 0, but I think it was assuming this to mean 'unlimited'.
Attachment #8507837 - Flags: review?(wmccloskey)
Calling GCSlice(0) from tests seems to be fairly common, and it doesn't seem reasonable for JS code which doesn't know about SliceBudget::Unlimited to need to pass in a random high value (but I still think we can adhere to using SliceBudget::Unlimited internally), so this adds a check for |objCount > 0| in GCRuntime::gcDebugSlice. Sorry about the churn.
Attachment #8507837 - Attachment is obsolete: true
Attachment #8507837 - Flags: review?(wmccloskey)
Attachment #8507865 - Flags: review?(wmccloskey)
Attachment #8507834 - Flags: review?(wmccloskey) → review+
Comment on attachment 8507865 [details] [diff] [review]
Part 2 v3: Clean up SliceBudget and require explicitly choosing between TimeBudget and WorkBudget.

Review of attachment 8507865 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/SliceBudget.h
@@ +37,5 @@
>      intptr_t counter;
>  
>      static const intptr_t CounterReset = 1000;
>  
> +    static const int64_t Unlimited = INT32_MAX;

Why is this INT32_MAX when the type is an int64_t?

Also, with your changes, can we just remove Unlimited and rely on the SliceBudget() constructor for that purpose?

::: xpcom/base/nsCycleCollector.cpp
@@ +3612,5 @@
>          continueSlice = false;
>          break;
>      }
>      if (continueSlice) {
> +      continueSlice = !aBudget.isOverBudget();

Please get mccr8 to review this file. It seems like you're changing something here, and I'm not sure it's what he intended.
Attachment #8507865 - Flags: review?(wmccloskey)
Asking for re-review because I decided to SliceBudgetify gcDebugSlice as well to get rid of the silly boolean argument, which required adjusting a few more functions. Straightforward, but I felt weird just carrying forward r+ with that.
Attachment #8507834 - Attachment is obsolete: true
Attachment #8508874 - Flags: review?(wmccloskey)
I decided to split the cleanup out into its own part. This shouldn't change the behavior anywhere except if JS calls GCSlice with a negative value to get a time-based budget (which would be a bad idea for a test and isn't done anywhere in the tree).

Andrew, asking for your review on the changes to nsCycleCollector.cpp. The only change of note should be the change from SliceBudget::checkOverBudget to SliceBudget::isOverBudget.

I made the former private since it isn't really meant to be used by external consumers: it does the wrong thing for work-based budgets (always returns true), and SliceBudget::isOverBudget will always fall through to it if the budget has actually been exhausted.

If you're worried that a significant amount of time has passed since the last call to SliceBudget::isOverBudget, and it might only return false because its internal counter hasn't reached 0 yet, you should just represent that time by calling SliceBudget::step instead. But I don't know if this was a conscious decision or a typo ;)
Attachment #8508874 - Attachment is obsolete: true
Attachment #8508874 - Flags: review?(wmccloskey)
Attachment #8508890 - Flags: review?(wmccloskey)
Attachment #8508890 - Flags: review?(continuation)
Comment on attachment 8508874 [details] [diff] [review]
Part 1 v4: Thread SliceBudget through several functions to choose the budget type at the source.

Ugh, misclick.
Attachment #8508874 - Attachment is obsolete: false
Attachment #8508874 - Flags: review?(wmccloskey)
Attachment #8507865 - Attachment is obsolete: true
This part actually changes how SliceBudget works. Before this patch, the following assertion would fail:

SliceBudget budget(WorkBudget(1));
budget.step(1);
MOZ_ASSERT(budget.isOverBudget());

This is because the counter in isOverBudget() used >= instead of >. You could argue that this is correct in the sense that you aren't *over* budget yet - but you are about to be, the allotted budget has been exhausted. I personally think this is unhelpful, and existing uses in tests seem to agree with me - some use 1 and some use 0, but they probably all expect to do the minimal amount of work.

In this patch I made it so that WorkBudget(0) and WorkBudget(1) are equivalent - they both budget one unit of work. WorkBudget(2) budgets 2 units of work, and so on. Alternatives are to (1) stick with the old behavior, (2) simply allow 0 as a way to 'do nothing', or (3) explicitly disallow 0. Given the number of tests that use 0 or 1, I figured (2) and (3) would unintuitive.

In addition, SliceBudget::Unlimited is now -1, and any value < 0 is interpreted as an unlimited slice. This matches the cycle collector's logic, and enables a simplification in that area (part 5). I kept SliceBudget::Unlimited around for GCRuntime::sliceBudget, letting GCRuntime::setParameter special case 0 to mean unlimited (as before). GCRuntime::setParameter takes an uint32_t, so passing -1 directly would be a bit questionable.

I decided to allow SliceBudget(TimeBudget(0)) as an alias for SliceBudget(WorkBudget(CounterReset)), even though it's not very useful, as a simple way to 'do a tiny bit of work'. Nothing currently uses this, and I could go either way on it.

In any case, let me know if the new behavior agrees with your expectations for this class.
Flags: needinfo?(jcoppeard)
Attachment #8508909 - Flags: review?(wmccloskey)
Attachment #8508909 - Flags: review?(continuation)
This just adjusts some tests that called GCSlice with unique-looking numbers, incrementing them by 1 to match the new behavior. I left the tests that call it with 0 or 1 alone, as well as tests that use some huge round number.
Attachment #8508912 - Flags: review?(wmccloskey)
This moves SliceBudget construction to nsJSContext::RunCycleCollectorSlice and nsJSContext::RunCycleCollectorWorkSlice and gets rid of nsCycleCollector_collectSliceWork, since it ended up looking exactly the same as nsCycleCollector_collectSlice. The logic no longer needs to check for negative values since those will get an unlimited slice by default now.
Attachment #8508915 - Flags: review?(continuation)
Try run for parts 1 and 2:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c2a87fec6db0

Try run for all parts combined:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=9f956aa30baf

I pushed those in the middle of some infra failure, but the only actual failures look like pre-existing orange.
(I cancelled the needinfo? because it's pretty obvious to me now that the GC_ZEAL work budget just wants to do the minimum amount of work; I was confused before because of the sign swapping going on for work budgets).
Comment on attachment 8508890 [details] [diff] [review]
Part 2 v4: Clean up SliceBudget and require explicitly choosing between TimeBudget and WorkBudget.

Review of attachment 8508890 [details] [diff] [review]:
-----------------------------------------------------------------

The idea of this code is that we should always check immediately if we have exceeded the budget, without regard for how expensive it is to check.

I guess this is an abuse of the interface, because only one of the phases of the CC actually is good about bumping the counter, because it is the only one that can be incrementalized in a fine-grained way.
Attachment #8508890 - Flags: review?(continuation) → review-
(In reply to Andrew McCreight [:mccr8] from comment #19)
> I guess this is an abuse of the interface, because only one of the phases of
> the CC actually is good about bumping the counter, because it is the only
> one that can be incrementalized in a fine-grained way.

Ah, so it *was* intentional. Would you mind if I changed it to just do the following?

> // Force SliceBudget::isOverBudget to check the time.
> aBudget.step(SliceBudget::CounterReset);
> continueSlice = !aBudget.isOverBudget();

SliceBudget::checkOverBudget really shouldn't be called directly as it always returns true for work-based budgets. Of course, stepping by SliceBudget::CounterReset isn't really correct for work-based budgets either - it would be better if the CC had an idea of how much work it actually did (even if it isn't incremental), so we could just bump it by the full amount. I suspect we would need a better idea of how work units correspond to time to do that, though (this is something I intend to explore, but not in this bug).
Flags: needinfo?(continuation)
> Would you mind if I changed it to just do the following?
Yeah, that sounds good to me.  It is kind of crummy, but I'm just not really sure how to properly deal with a work-based workload in ICC without coming up with some kind of cost accounting in phases we don't really care about anyways.
Flags: needinfo?(continuation)
With that change.

> Yeah, that sounds good to me.  It is kind of crummy, but I'm just not really
> sure how to properly deal with a work-based workload in ICC without coming
> up with some kind of cost accounting in phases we don't really care about
> anyways.

Yeah, SliceBudget kind of expects everything that it's in scope for to do fine-grained accounting, and so tries to avoid doing anything expensive itself. That's kind of at odds with what you want here, though. Although this workaround is kind of awful, on the plus side it might give more power to the present and future fuzzers to check slices of varying lengths (assuming they know to try ccSlice()).
Attachment #8508890 - Attachment is obsolete: true
Attachment #8508890 - Flags: review?(wmccloskey)
Attachment #8509802 - Flags: review?(wmccloskey)
Attachment #8509802 - Flags: review?(continuation)
Comment on attachment 8509802 [details] [diff] [review]
Part 2 v5: Clean up SliceBudget and require explicitly choosing between TimeBudget and WorkBudget.

Review of attachment 8509802 [details] [diff] [review]:
-----------------------------------------------------------------

r=me for the CC changes
Attachment #8509802 - Flags: review?(continuation) → review+
Comment on attachment 8508909 [details] [diff] [review]
Part 3 v1: Tighten up the bounds of SliceBudget to work as a budget should (but make an exception for WorkBudget(0)).

Review of attachment 8508909 [details] [diff] [review]:
-----------------------------------------------------------------

This should be okay from the CC's perspective.
Attachment #8508909 - Flags: review?(continuation) → review+
Attachment #8508915 - Flags: review?(continuation) → review+
Thanks for the reviews, Andrew!

It occurred to me that SliceBudget::reset() is a total misnomer considering what it actually does. A SliceBudget couldn't reset itself even if it wanted to, because it doesn't store the original budget anywhere! This patch renames reset to the not very creative, but hopefully more descriptive 'makeUnlimited'.
Attachment #8510050 - Flags: review?(wmccloskey)
Bill: Gentle review ping :)
Attachment #8508874 - Flags: review?(wmccloskey) → review+
Attachment #8509802 - Flags: review?(wmccloskey) → review+
Attachment #8510050 - Flags: review?(wmccloskey) → review+
Comment on attachment 8508909 [details] [diff] [review]
Part 3 v1: Tighten up the bounds of SliceBudget to work as a budget should (but make an exception for WorkBudget(0)).

Review of attachment 8508909 [details] [diff] [review]:
-----------------------------------------------------------------

The purpose of gcslice(0) was to scan roots, but not to mark any heap objects at all. It sounds like that's broken. If so, I'd rather fix it than have this weird behavior. And gcslice(1) should do 1 unit of marking work. Let's do that.
Attachment #8508909 - Flags: review?(wmccloskey)
Comment on attachment 8508912 [details] [diff] [review]
Part 4a: Adjust some tests to preserve previous behavior.

Review of attachment 8508912 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay here.
Attachment #8508912 - Flags: review?(wmccloskey) → review+
Rebased on tip; carrying forward r=billm.
Attachment #8508874 - Attachment is obsolete: true
Attachment #8516727 - Flags: review+
(In reply to Bill McCloskey (:billm) from comment #27)
> The purpose of gcslice(0) was to scan roots, but not to mark any heap
> objects at all. It sounds like that's broken. If so, I'd rather fix it than
> have this weird behavior. And gcslice(1) should do 1 unit of marking work.
> Let's do that.

Aha! If it's useful, by all means :) This version just removes the special casing from the WorkBudget constructor.

I'll post another patch that changes the existing tests that call gcslice(0) to do gcslice(1) instead. Some of these may want the existing behavior, but I think most do not (comments are sparse, and I think they pass either way). I think there's about 10 cases, so hopefully it shouldn't take too long to review them.
Attachment #8508909 - Attachment is obsolete: true
Attachment #8516730 - Flags: review?(wmccloskey)
Rebased across the change to part 3; carrying forward r=billm.
Attachment #8510050 - Attachment is obsolete: true
Attachment #8516732 - Flags: review+
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #30)
> I'll post another patch that changes the existing tests that call gcslice(0)
> to do gcslice(1) instead. Some of these may want the existing behavior, but
> I think most do not (comments are sparse, and I think they pass either way).
> I think there's about 10 cases, so hopefully it shouldn't take too long to
> review them.

No, I'd like to keep the current gcslice(0) usage.
(In reply to Bill McCloskey (:billm) from comment #32)
> No, I'd like to keep the current gcslice(0) usage.

Hmm, I was about to post this when I saw this comment. Could you have a look anyway? The tests I changed are basically fuzz tests that I'm worried won't test anything anymore otherwise. Of course, maybe there have been enough changes since they were made that they're useless anyway, or maybe the exact budget doesn't need to be so specific.
Attachment #8516753 - Flags: feedback?(wmccloskey)
Attachment #8516730 - Flags: review?(wmccloskey) → review+
Comment on attachment 8516753 [details] [diff] [review]
Part 4b: Adjust and clarify some tests using gcslice(0).

Review of attachment 8516753 [details] [diff] [review]:
-----------------------------------------------------------------

Actually, this looks fine. I was only concerned about the tests in gc/.
Attachment #8516753 - Flags: feedback?(wmccloskey) → review+
Attachment #8508912 - Attachment description: Part 4 v1: Adjust some tests to preserve previous behavior. → Part 4a: Adjust some tests to preserve previous behavior.
this failed to apply cleanly:

renamed 1084651 -> slicebudget-thread.patch
applying slicebudget-thread.patch
patching file js/src/gc/GCRuntime.h
Hunk #2 FAILED at 499
1 out of 2 hunks FAILED -- saving rejects to file js/src/gc/GCRuntime.h.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working dir
errors during apply, please fix and refresh slicebudget-thread.patch

that was when i started with part 1.

Could you take a look, thanks!
Flags: needinfo?(emanuel.hoogeveen)
Rebased on m-i tip.
Attachment #8516727 - Attachment is obsolete: true
Flags: needinfo?(emanuel.hoogeveen)
Attachment #8517288 - Flags: review+
Keywords: checkin-needed
I can reproduce the error locally using the version of clang that the builders use, but the error itself makes no sense. I think this is a bug in clang :\

I'm trying to create a minimal patch that demonstrates the problem; we can decide what we want to do from there.
Correction: The error message is bogus, but the problem is caused by the |namespace js { struct SliceBudget; }| I added to xpcom/base/nsCycleCollector.h in part 5. That gets hidden visibility, which conflicts with later includes of js/SliceBudget.h.

I think the right fix here is to just include js/SliceBudget.h in xpcom/base/nsCycleCollector.h instead, since that will ensure that anyone who uses the header gets the intended visibility. js/SliceBudget.h is tiny anyway, so it shouldn't hurt anything. Patch coming up.
Another small rebase, carrying forward r=billm.
Attachment #8517288 - Attachment is obsolete: true
Attachment #8518289 - Flags: review+
Carrying forward r=mccr8. This version removes the wrong forward declaration of SliceBudget and simply includes js/SliceBudget.h instead (and removes the redundant include from the module). This builds with clang + ASAN locally, including non-unified.

I'll send this through try once the infra bustage clears up.
Attachment #8508915 - Attachment is obsolete: true
Attachment #8518293 - Flags: review+
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=ae94761d496b (hazard analysis build looks pretty busted, but I really don't think this could affect that)
Keywords: checkin-needed
sorry had to back this out again, seems the hazard failures were related to errors like 

https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3686142&repo=mozilla-inbound

backed out from m-c and integration trees
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Ack, I'm really sorry about this, I should have looked at the try results a bit more closely. Part 1 changed the signature of GCRuntime::collect (replacing the int64 parameter with a SliceBudget& parameter), and it seems the hazard analysis was checking for the exact signature.

Steve, how can I adjust the signature the analysis checks for? Is it in-tree somewhere?
Flags: needinfo?(sphink)
Resolved via IRC.
Flags: needinfo?(sphink)
This version changes the static analysis to expect the new signature of GCRuntime::collect (from int64_t to js::SliceBudget*). Looks good on try, I'll hold off asking for checkin again until more results come back:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=88315dd7542a
Attachment #8518289 - Attachment is obsolete: true
Attachment #8519155 - Flags: review+
Let's try this again.
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.