A pain point for the debugger is that it may be opened long after the page was loaded. By that time, ScriptSourceObjects may already have been GCed, so when you open the debugger, it cannot easily find all the sources that existed in the past.
We currently have an approach to avoid this where we accumulate an array of URLs for sources, and then refetch and reparse them manually. Personally I don't love this approach because:
- We have no way to guarantee that what we re-fetch is what ran on the page
- If that source had logged a console message, we lose the link between the console message and the source
- ScriptSourceObjects have a lot of properties other than URLs, like line number, introductionType, and script vs module parse goal, among others, and these can at best be inferred from the URL and the content.
These issues mean that we can't be 100% confident that the sources we are working with accurately reflect what is running in the page, and consistent, stable source file handling is something we're actively working toward.
I'm filing this issue because I think we should revisit this approach and talk with the SpiderMonkey team to get thoughts for alternatives. For instance:
How do people view the current GC behavior? Could we instead strongly hold any source that has an explicit URL? Could we have a CompileOption that controls how aggressively sources are GCed? GCing evalled sources is fine because we don't show those in the debugger's list of files anyway. We'd only want to hold onto things that have a useful URL. Maybe we only clear them on memory pressure. I have no sense for how often sources are even GCed on the average page. I'd imagine if a source is simple enough to be GCed, it probably was pretty small anyway. Even pages using hot-reloading likely end up keeping scripts alive pretty often, because only some of the files inside a bundle will end up being hot-reloaded over time; the ones that haven't changed will still be keeping the old bundle file alive in memory.
Is this even something that SpiderMonkey views as its responsibility? Should we instead leave it up to the DOM ScriptLoader to hold scripts live manually and maintain that list and evict things if we pass a threshold or get a memory pressure event? If the loader did try to do that, how would it do that? ScriptSourceObjects aren't part of the public API, so currently we'd have to hold the root JSScript live. Would that keep around a lot more memory? I know that scripts can relazify to reduce memory usage, but I don't have a sense of how much a full source file of lazy JSScript scripts takes up.
Related to that, the debugger currently handles cases where some JSScripts have been GCed by re-parsing the source text, because it needs breakpoint position metadata for the scripts that were GCed. Holding the root JSScript live would actually avoid that too, if it's technically an option.
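
To make the loader-side option more concrete, here is a rough sketch of the retention policy discussed above: strongly hold anything with a URL, skip URL-less sources such as evals, evict the oldest entries past a size threshold, and drop everything on memory pressure. It is modeled in Python purely for brevity and is not Gecko code. `RetainedSource`, `SourceRetainer`, the byte budget, and all method names are hypothetical, and `size_bytes` stands in for whatever the real cost of keeping a root JSScript alive turns out to be.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class RetainedSource:
    """Hypothetical stand-in for whatever the loader would hold live."""
    url: Optional[str]
    introduction_type: str
    size_bytes: int


class SourceRetainer:
    """Holds URL-bearing sources until a byte budget or memory pressure evicts them."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.total_bytes = 0
        # Oldest entry on the left. Several sources may share one URL
        # (for example inline scripts), so entries are not keyed by URL.
        self._held: Deque[RetainedSource] = deque()

    def note_compiled(self, source: RetainedSource) -> bool:
        """Record a newly compiled source; returns whether it was eligible for retention."""
        if not source.url:
            # Sources without a useful URL are left to normal GC.
            return False
        self._held.append(source)
        self.total_bytes += source.size_bytes
        self._evict_to_budget()
        return True

    def _evict_to_budget(self) -> None:
        # Drop the oldest sources first until we are back under the threshold.
        while self.total_bytes > self.max_bytes and self._held:
            oldest = self._held.popleft()
            self.total_bytes -= oldest.size_bytes

    def on_memory_pressure(self) -> None:
        # Under memory pressure, release everything and fall back to GC behavior.
        self._held.clear()
        self.total_bytes = 0

    def sources(self) -> List[RetainedSource]:
        return list(self._held)
```

Oldest-first eviction is just the simplest policy to write down here; whether a threshold, memory pressure alone, or something smarter makes sense depends on the memory-cost question above.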