Closed Bug 1041148 Opened 5 years ago Closed 5 years ago
GC hazard with worker XMLHttp
With the patch for bug 1039551, I'm getting Windows-only, debug-only Mochitest-4 GC crashes. See the stack trace below. We're crashing here: NS_IMPL_CYCLE_COLLECTION_TRACE_BEGIN_INHERITED(XMLHttpRequest, nsXHREventTarget) NS_IMPL_CYCLE_COLLECTION_TRACE_JSVAL_MEMBER_CALLBACK(mStateData.mResponse) NS_IMPL_CYCLE_COLLECTION_TRACE_END Because mStateData.mResponse points to 0x4b4b4b4b memory (freed GC thing). When we assign this value in XMLHttpRequest::GetResponse: mStateData.mResponse = STRING_TO_JSVAL(str); It's still valid AFAICS. Somehow we're not tracing it and the GC frees it. Not sure how to debug this... CC experts, does anything about this code look wrong? 08:26:29 INFO - 0 mozjs.dll!js::gc::ArenaHeader::getAllocKind() [Heap.h:a4db87a48b24 : 509 + 0x42] 08:26:29 INFO - 1 mozjs.dll!js::gc::GetGCThingTraceKind(void const *) [jsgcinlines.h:a4db87a48b24 : 122 + 0xd] 08:26:29 INFO - 2 mozjs.dll!js::GCThingTraceKind(void *) [jsfriendapi.cpp:a4db87a48b24 : 649 + 0x5] 08:26:29 INFO - 3 xul.dll!xpc_GCThingIsGrayCCThing(void *) [nsXPConnect.cpp:a4db87a48b24 : 261 + 0x8] 08:26:29 INFO - 4 xul.dll!SnowWhiteKiller::Trace(JS::Heap<JS::Value> *,char const *,void *) [nsCycleCollector.cpp:a4db87a48b24 : 2649 + 0x9] 08:26:29 INFO - 5 xul.dll!mozilla::dom::workers::XMLHttpRequest::cycleCollection::Trace(void *,TraceCallbacks const &,void *) [XMLHttpRequest.cpp:a4db87a48b24 : 1615 + 0x12] 08:26:29 INFO - 6 xul.dll!SnowWhiteKiller::~SnowWhiteKiller() [nsCycleCollector.cpp:a4db87a48b24 : 2618 + 0x13]
(In reply to Jan de Mooij [:jandem] from comment #0) > We're crashing here: This is in dom/workers/XMLHttpRequest.cpp btw. I can reproduce this reliably on Windows so I'd be happy to test a patch or something. It's just hard to debug because I'm not very familiar with the CC and the XMLHttpRequest code. STR (on Windows): (1) Update to inbound revision a4db87a48b24 (includes the patch for bug 1039551 before it was backed out) (2) Build with --enable-debug --enable-optimize (3) Run: ./mach mochitest-plain dom/workers/test/ --start-at test_xhr.html It should hit an assertion failure.
We should be calling mozilla::HoldJSObjects(this) in that branch, no? Looks like the only caller of it for this class is UpdateState, which is presumably not being called here...
(In reply to Boris Zbarsky [:bz] from comment #2) > We should be calling mozilla::HoldJSObjects(this) in that branch, no? Looks > like the only caller of it for this class is UpdateState, which is > presumably not being called here... Wow thanks, I didn't know about that function. Glad I didn't spend a day or two debugging this :) But with that call it no longer crashes. I'll post a patch.
This seems to go back a while...
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Attachment #8459167 - Flags: review?(bzbarsky)
Comment on attachment 8459167 [details] [diff] [review] Patch r=me, and bonus points for mStateData.mResponse.setString(str) too.
Attachment #8459167 - Flags: review?(bzbarsky) → review+
Taking the bonus points.
Comment on attachment 8459187 [details] [diff] [review] Patch v2 [Security approval request comment] > How easily could an exploit be constructed based on the patch? It's not trivial but it's pretty easy to see what's going on. > Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No. > Which older supported branches are affected by this flaw? All. > Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Should apply or be very simple to backport. > How likely is this patch to cause regressions; how much testing does it need? Very unlikely; doesn't need a lot of testing.
Attachment #8459187 - Flags: sec-approval?
sec-approval+ for checkin on 8/5 or later.
Attachment #8459187 - Flags: sec-approval? → sec-approval+
(In reply to Al Billings [:abillings] from comment #9) > sec-approval+ for checkin on 8/5 or later. Hm, this blocks bug 1039551. I guess we can wait a few weeks though; glad it isn't blocking other bugs.
I'm trying to limit our exposure window. Once we check in, "bad people" may notice the issue and it is a sec-critical. We're shipping 31 tomorrow and then the six week clock starts.
Is there some reason not to just call HoldJSObjects in the ctor? That would also make it harder to reverse engineer an exploit from the patch. The main thread XHR also tries to be clever about it, but I think it is a bad idea unless there's some compelling reason, like they rarely hold JS objects and the overhead of a hash table look up on object creation is too much.
Summary: Possible GC hazard with workers XMLHttpRequest? → GC hazard with worker XMLHttpRequest
bz, do you agree we should move the HoldJSObjects call to the constructor for the main thread and worker thread XMLHttpRequest? I can post a patch to do that.
Yes, I think we should do that.
(In reply to Jan de Mooij [:jandem] from comment #13) > I can post a patch to do that. Thanks! I should have done that before, as we've had various trouble with HoldJSObjects in XHR.
We should fix this on ESR24 and ESR31 as well.
Comment on attachment 8465288 [details] [diff] [review] Alternative fix r=me
Attachment #8465288 - Flags: review?(bzbarsky) → review+
This is okay to check in now.
Whiteboard: [checkin on 8/5]
Comment on attachment 8465288 [details] [diff] [review] Alternative fix Fix Landed on Version: 34 Approval Request Comment [Feature/regressing bug #]: old [User impact if declined]: sec bug [Describe test coverage new/current, TBPL]: this code is run on TBPL [Risks and why]: low, it just makes some behavior more consistent [String/UUID change made/needed]: none
Comment on attachment 8465288 [details] [diff] [review] Alternative fix Approving across the board so that this can land close to concurrently on all branches.
Attachment #8465288 - Flags: approval-mozilla-esr31?
Attachment #8465288 - Flags: approval-mozilla-esr31+
Attachment #8465288 - Flags: approval-mozilla-esr24?
Attachment #8465288 - Flags: approval-mozilla-beta?
Attachment #8465288 - Flags: approval-mozilla-beta+
Attachment #8465288 - Flags: approval-mozilla-aurora?
Attachment #8465288 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/0a77721ad4ad https://hg.mozilla.org/releases/mozilla-beta/rev/7b29fabbf26a https://hg.mozilla.org/releases/mozilla-esr31/rev/7d4e5ae524c7 https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/e435d92972d7 https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/720aa2dd84c8 I confirmed with mccr8 over IRC that this isn't actually needed on esr24 as it's a regression from bug 928312 which landed on 28+.
(In reply to Andrew McCreight [:mccr8] from comment #20) > https://hg.mozilla.org/integration/mozilla-inbound/rev/bb6a8372586f Thanks for taking care of this while I was away!
Marking this as qe-verify- as there's no test case(s) that can be used to verify this code change. If this was still an issue, it would have been spotted under tbpl. I also double checked with :jandem just to see if there's anything that can be tested/verified manually. If there's anything else that QE can do to test/verify this change, please let us know and needinfo!
You need to log in before you can comment on or make changes to this bug.