Closed Bug 1423616 Opened 7 years ago Closed 1 year ago

Determine source of XDR corruption

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED INCOMPLETE
Tracking Status
firefox-esr52 - wontfix
firefox59 - wontfix
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- wontfix

People

(Reporter: tcampbell, Assigned: tcampbell)

References

(Depends on 1 open bug)

Details

(Keywords: sec-audit)

Bug 1418894 seems to reducing a number of wild-ptr crash signatures, and crashes that do get through (the mitigation can't verify 100%) seem to suggest that our XDR data is corrupted. We need to continue investigating to find a root cause.

Marking S-S due to the large number of wild-ptr crashes that result when mitigation isn't sufficient. As we narrow in on the root causes we risk discovering an attack strategy.
Group: core-security
Assignee: nobody → tcampbell
Priority: -- → P1
Blocks: 1373934
Group: core-security → javascript-core-security
Keywords: sec-audit
The mitigation for XDR-related crashes landed in Bug 1418894 has made a significant impact on a number of wild-ptr crash signatures. These bugs seem to originate from the XDR transcode buffer being corrupt (reason unknown, but I suspect StartupCache has threading problems). The result is that XDR loading builds invalid JavaScript state leading to objects being silently casted to the wrong type and crashing. The mitigation makes it unlikely for garbage data to transcode successfully and instead reporting the cache entry is empty (leading to normal startup instead of a crash).

The remaining crashes in Bug 1373934 occur while reading from transcode buffer byte-by-byte and walking off the end of a page. Despite the scary markAtom GC signature, these are read violations in sequential order of chrome code script. In theory, this is a potential read gadget for pointers, but risk is probably on the lower side.

I will put together an ESR52 patch for it and we should consider uplifting for stability reasons. Patch is reasonably low risk and helps remove a lot of confusing crash reports. It made quite a dent in crashes from 58.0b8 to 58.0b9.
Ted, do you mean that you've already landed what you think you can for 59 beta, and want to make a corresponding patch for esr?   Can you link to something that shows me the crash reports you're referring to? (This sounds great by the way)
Flags: needinfo?(tcampbell)
To summarize, there are two mitigation bugs landed: Bug 1418894 (FF58), Bug 1373934 (FF60). The early bug reduced crashes in 58/59 and I believe would be safe enough to uplift to ESR *if* that made sense. On the other-hand, I'm not seeing any of the relevant crash signatures on current ESR crash-stats.

Bug 1418894 has been be in beta for a while and rebases on ESR-52 with minor changes. I'd be comfortable landing it in ESR-52.
Daniel, do you think we should uplift?

The security impact if we don't is unclear (My feeling is we'll be on ESR60 soon enough, but I don't have experience here). If so, should I just uplift request on Bug 1418894? All these crash signatures and mitigations got tangled up in a mix of sec-severity. In general the mitigations are additional general checks and aren't pointing to a specific security issue (since we don't know the exact why).
Flags: needinfo?(tcampbell) → needinfo?(dveditz)
(In reply to Ted Campbell [:tcampbell] from comment #3)
> To summarize, there are two mitigation bugs landed: Bug 1418894 (FF58), Bug
> 1373934 (FF60). 

That second one should be: Bug 1438645 (FF60)
Doesn't sound like there's anything left to do for 59 here.

I'm feeling a bit ambivalent about ESR52 given the comment that we aren't really seeing these crashes there with the kind of volume we were getting on 58/59. And the crashes happen around startup, so they'd be difficult to exploit. Based on that, I'm going to call this wontfix but would listen to any arguments for why this is actually high-enough severity on ESR52 to warrant further consideration.
Without seeing any crashes on ESR-52 we don't know that we need this nor that the combined set of patches needed would be appropriate for that branch without reworking them. I agree with the wontfix for 52 and 59.
Flags: needinfo?(dveditz)
Is there still more investigation to do here? What's the remaining volume of crashes? (nbp says about 23,000 over 6 months, which would be medium-low; perhaps we mark this "stalled" and stop tracking it for now.)

Do we think the remaining crashes are still wild-ptr crashes caused by threading, or something else? (failed mmap I/O, for example.)
Flags: needinfo?(tcampbell)
Priority: P1 → P2
Updating tracking flags as we get closer to the 64 release.
Flags: needinfo?(tcampbell)
Depends on: 1548356
Severity: normal → S3
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → INCOMPLETE
Group: javascript-core-security
You need to log in before you can comment on or make changes to this bug.