Closed Bug 1266389 Opened 8 years ago Closed 22 days ago

32-bit Firefox fragments its address space and OOMs in emunittest suite.

Categories

(Core :: JavaScript Engine, defect)

43 Branch
x86
Windows 10
defect

Tracking

RESOLVED INCOMPLETE

People

(Reporter: jujjyl, Unassigned)

References

Details

The emunittest suite contains a set of real-world asm.js and handwritten games repurposed as web tests/benchmarks.

Running any of the tests immediately after a fresh browser start succeeds on 32-bit Firefox. However, when the whole suite is run sequentially, 32-bit Firefox always OOMs in the middle of the run with a JS exception about not being able to allocate a large typed array for the asm.js heap. Allocating the same amount of memory in smaller slices does succeed, which suggests address space fragmentation is happening here.
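
A minimal sketch of that kind of probe (the sizes and function name are illustrative, not the suite's actual code): if one contiguous allocation fails but the same total amount succeeds as small slices, the process still has memory available, just no contiguous virtual address range large enough.

// Hypothetical probe to separate the two failure modes.
function probeFragmentation(totalBytes, sliceBytes) {
  try {
    var big = new ArrayBuffer(totalBytes);   // one large contiguous allocation
    return "large allocation succeeded";
  } catch (e) {
    // Large allocation failed; retry the same total amount in small slices.
    var slices = [];
    try {
      for (var allocated = 0; allocated < totalBytes; allocated += sliceBytes) {
        slices.push(new ArrayBuffer(sliceBytes));
      }
      return "only small slices succeeded -> likely address space fragmentation";
    } catch (e2) {
      return "small slices also failed -> genuinely out of memory";
    }
  }
}

// e.g. probeFragmentation(256 * 1024 * 1024, 16 * 1024 * 1024);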

64-bit Firefox can run the whole suite several times in a sequential torture mode without OOMing, and looking at about:memory and DMD afterwards, there do not appear to be any memory leaks causing an issue here.
Bug 1266393 describes a potential solution proposed by Johnny Stenback. Another would be to aggressively compact natively allocated memory, or to pool allocations more aggressively, but it is not yet known exactly which memory would need to be compacted. Bug 1266378 was filed to discuss tooling to figure this out.
If DMD had a way to visualize the address space (bug 1266378), it would likely hit bug 1241166 when used on the emunittest suite.
Component: General → Untriaged
Product: Firefox → Core
Address space fragmentation is a big issue in 32-bit Firefox. For users running a 32-bit Firefox on 32-bit Windows (I don't know what percentage of our userbase that is, bsmedberg might have that information), there's only 2GB of usable address space available. For users running 32-bit Firefox on 64-bit Windows there's 4GB of usable address space, but fragmentation is still an issue. We use jemalloc for most memory allocation, which is pretty good about not fragmenting, but I believe the asm.js code calls VirtualAlloc directly.
asm.js only calls VirtualAlloc on 64-bit where fragmentation isn't an issue (and we do the protection tricks that require VirtualAlloc).  On 32-bit, we simply use the memory that has already been calloc()d for the ArrayBuffer passed in.
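For context, on 32-bit the asm.js heap is simply an ArrayBuffer the page allocates and passes in at link time, so the single large contiguous allocation happens in ordinary script. A rough sketch of that pattern (the module below is a made-up minimal example, not code from the suite):

function TinyAsmModule(stdlib, foreign, heap) {
  "use asm";
  // The heap view aliases the ArrayBuffer supplied by the caller; per the
  // comment above, on 32-bit the engine reuses that calloc()d memory as-is.
  var u32 = new stdlib.Uint32Array(heap);
  function poke(i, v) {
    i = i | 0;
    v = v | 0;
    u32[i >> 2] = v;
  }
  return { poke: poke };
}

// This 256 MB ArrayBuffer is the kind of allocation that fails once the
// 32-bit address space is fragmented; the size is illustrative.
var heap = new ArrayBuffer(256 * 1024 * 1024);
var exports = TinyAsmModule(window, {}, heap);
exports.poke(0, 42);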
OOM crash reports do contain information about address space fragmentation, at least on Windows. I'm not really sure how to get that, though.

It would probably be useful to get about:memory reports after test runs. They should give some insight into whether it is JS or jemalloc that is getting more fragmented.
See Also: → defrag
(In reply to Andrew McCreight [:mccr8] from comment #5)
> OOM crash reports do contain information about address space fragmentation,
> at least on Windows. I'm not really sure how to get that, though.

I discussed this in bug 1266378 comment 2; it's in the minidumps, and we have C++ code in Socorro's stackwalker that uses it to calculate some things.
Are you certain that the problem is address space fragmentation instead of heap fragmentation? Before we get too far down the path of diagnosing virtual space fragmentation or leakage, let's prove that's the cause.
Flags: needinfo?(jujjyl)
Good question - can you be more specific about how these are distinguished?

The data point I have is that generally one of the asm.js pages will fail to load when creating the large 128MB-512MB asm.js ArrayBuffer, throwing a JS out of memory exception. However, allocating the same amount of memory on the JS side as smaller typed arrays, e.g. in 16MB or 32MB chunks, will succeed. I am not sure which kind of fragmentation to label this as. (When testing this, I typically write a u32 every few hundred kilobytes into the array after allocation to make sure the array must be committed immediately, though I am not sure if that is needed or relevant.)
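
A rough sketch of that touch-after-allocation step, assuming the intent is only to force the pages to actually be committed (the stride and fill value are arbitrary):

// Touch the freshly allocated buffer at a coarse stride so the OS has to
// commit the pages rather than merely reserve them.
function touchBuffer(buffer, strideBytes) {
  var view = new Uint32Array(buffer);
  var strideWords = (strideBytes / 4) | 0;
  for (var i = 0; i < view.length; i += strideWords) {
    view[i] = 0xdeadbeef;  // any value works; the write itself is what matters
  }
}

// e.g. touchBuffer(new ArrayBuffer(128 * 1024 * 1024), 256 * 1024);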

There is some noise when testing this in emunittest suite, since sometimes the end result is a direct crash instead (bug 1248788, bug 1267571).

We do have a games partner who reported that overall about 8% of their game launches run into the same problem. When I asked them to run the same kind of "allocate the memory as smaller chunks" test to differentiate between really being out of memory and just being out of a big linear range, they came back saying that 100% of the failures they see are of the out-of-big-linear-space kind: the small chunk allocations always succeeded.
Flags: needinfo?(jujjyl)
Component: Untriaged → JavaScript Engine
Jukka: if you have specific crash reports that they've encountered, we could look at the memory info and diagnose more properly.
The data from the games partner comes from analytics they gather from users running in the wild, so they don't receive crash reports either :/ We don't know how many of their users run into real browser crashes; they only see the ones that fail gracefully with a JS OOM exception.
If you have specific URLs of games that they're seeing these crashes on we could try querying for crash reports by URL.
Has this been relieved at all by jonco's recent work to compact all the different types of gc things, instead of only JSObjects? I think it would have only just reached dev edition in the merge a couple days ago. I suppose probably not since emscripten-generated code isn't using very many GC things...
Depends on: 1277066, 1294017
Testing current Nightly 32-bit on the emunittest suite, which now has a public version accessible at

https://s3.amazonaws.com/mozilla-games/emunittest-public/index.html

the issues in 32-bit Firefox do persist and it is not able to get through the whole suite without OOMing. With Michael's work in bug 1277066 things are looking much better; however, I haven't been able to run the full suite yet due to a limitation in bug 1294017.
Severity: normal → S3
Status: NEW → RESOLVED
Closed: 22 days ago
Resolution: --- → INCOMPLETE