Closed Bug 1652126 Opened 4 years ago Closed 4 years ago

Cancelling off thread compilation requests are not successful if the parse task has not finished

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED FIXED
82 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox78 --- unaffected
firefox79 --- disabled
firefox80 --- disabled
firefox81 --- disabled
firefox82 --- fixed

People

(Reporter: denispal, Assigned: denispal)

References

(Regression)

Details

(Keywords: regression)

Crash Data

Attachments

(3 files, 2 obsolete files)

In scriptloader, there is a race condition present between the cancellation of a request and the off thread compilation of a script. If the cancellation occurs before the parse task is finished then there is no token available to provide to JS::CancelOffThreadScript and the compiled script will leak.

Off thread compilations from the scriptloader can run into a potential race condition if the request is canceled before the parse is finished. This can lead to a leak since the off-thread compiled script is never properly cleaned up.

The attached patch fixes the leaks, but I'm seeing if there's another possible way by cancelling tasks asynchronously instead so we don't block the main thread waiting.

Set release status flags based on info from the regressing bug 1606652

Summary: Wait before off thread parse is finished before cancelling requests → Cancelling off thread compilation requests are not successful if the parse task has not finished
Attachment #9162904 - Attachment is obsolete: true

Bug 1606652 introduced a regression where speculatively compiled JS scripts can leak if they are cancelled while the compilation is not finished. This occurs because the off thread token is not available until the callback occurs so when the cancel request is made, it simply returns because the token is null. The patch proposes a few changes:

  1. Add an optional parameter to the JS::Compile* and JS::Decode* functions that will return the parse task token right away.
  2. Wait for the off thread compilation to finish before cancelling any request. Most of the time, these will already be finished.
  3. Cancel off thread script compilations for any preload scripts when the load event fires to avoid leaks as they are likely not going to be executed.

Hi Denis, can you add a reviewer to your patch to see if it fixes bug 1265637?

Flags: needinfo?(dpalmeiro)

(In reply to Andreea Pavel [:apavel] from comment #5)

Hi Denis, can you add a reviewer to your patch to see if it fixes bug 1265637?

Hi Andreea. Yes, it should fix the bug but after talking to smaug about it I am looking to try another approach first that does this slightly differently. Unfortunately it's not a trivial fix and only reproducible on try so it's been taking a bit of time. However, if the failures are becoming a problem I can turn the original feature off that caused this until this bug is resolved.

Flags: needinfo?(dpalmeiro)

(In reply to Denis Palmeiro [:denispal] from comment #6)

(In reply to Andreea Pavel [:apavel] from comment #5)

Hi Denis, can you add a reviewer to your patch to see if it fixes bug 1265637?

Hi Andreea. Yes, it should fix the bug but after talking to smaug about it I am looking to try another approach first that does this slightly differently. Unfortunately it's not a trivial fix and only reproducible on try so it's been taking a bit of time. However, if the failures are becoming a problem I can turn the original feature off that caused this until this bug is resolved.

There are 325 total failures in the last 7 days on all trees, this a very high failure rate. If the fix takes longer than expected, then I would say turn the feature off if that's a safe approach.

Joel any insight on how to proceed here?

Flags: needinfo?(jmaher)

this is territory to backout, I assume getting the fix right and passing tests is not quick, so lets backout tomorrow if this is not up for review.

:denispal, do you need help?

Flags: needinfo?(jmaher)

(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #8)

this is territory to backout, I assume getting the fix right and passing tests is not quick, so lets backout tomorrow if this is not up for review.

:denispal, do you need help?

I think I should have a patch ready by EOD for review. Just doing some wider testing now. If the tests don't look good then I'll post a patch to flip dom.script_loader.external_scripts.speculative_omt_parse.enabled which should address these failures.

Denis, are there any updates here?

Flags: needinfo?(dpalmeiro)

(In reply to Cosmin Sabou [:CosminS] from comment #10)

Denis, are there any updates here?

Yes, I did disable the offending feature in bug 1654357.

Flags: needinfo?(dpalmeiro)

Thank you, then we should see a drop in failure rate in Bug 1265637 in the next few days.

No longer blocks: 1265637

It appears that the feature disabled in bug 1654357 was re-enabled by bug 1659483. Denis, do you have plans to work on this in the near future?

Blocks: 1265637
Flags: needinfo?(dpalmeiro)

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #13)

It appears that the feature disabled in bug 1654357 was re-enabled by bug 1659483. Denis, do you have plans to work on this in the near future?

Yes, I am actively working on this now. The patch I have so far dramatically reduces the frequency in which the leaks appear, but it still shows up occasionally. Just trying to understand what I'm missing atm.

Flags: needinfo?(dpalmeiro)
Attachment #9164378 - Attachment is obsolete: true
Pushed by dpalmeiro@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c50c7898c277 Provide OffThreadToken's immediately and allow cancellation of parse tasks that are not completely finished. r=jonco https://hg.mozilla.org/integration/autoland/rev/294c4c7bb3e8 Obtain an OffThreadToken immediately so parse tasks can be canceled anytime, and clean up dangling Runnables during cancellation. r=smaug https://hg.mozilla.org/integration/autoland/rev/59cc6a4d3a55 Re-enable browser_pdfjs_preview.js for asan builds. r=jmaher
Crash Signature: [@ mozilla::LinkedList<js::ParseTask>::~LinkedList()]
See Also: → 1666724
Regressions: 1665724
Regressions: 1673567
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: