Closed Bug 640265 Opened 12 years ago Closed 12 years ago

GC during cx->calloc can cause invalid shape in a new JSObject


(Core :: JavaScript Engine, defect)

Not set



Tracking Status
firefox6 --- fixed
firefox7 --- fixed
firefox8 --- fixed
status1.9.2 --- unaffected


(Reporter: billm, Assigned: billm)



(Whiteboard: [sg:critical?][fixed-in-tracemonkey][qa-])


(1 file, 1 obsolete file)

If I understand bug 635137 correctly, cx->malloc is allowed to GC. I assume this means that cx->calloc can also GC. If that is true, then we have a problem. In JSObject::getEmptyShape, we use cx->calloc to allocate the array of empty shapes for an object. This happens before we set the object's |map| field to anything valid. Consequently, a GC during the cx->calloc could cause us to try to mark the invalid map as a shape.

To prove this, sticking a JS_GC call right before the cx->calloc call in JSObject::getEmptyShape causes an immediate crash on shell startup. It only works on opt builds, because in debug builds we initialize |map| to NULL right after allocating the object.

I'm mostly trying to get a sense of what our invariants are here. Is cx->calloc really allowed to GC? If it is, then we should add a GCZEAL option to GC every time a cx->malloc-related function executes.
Here's another, related, problem. The last lines of InitScopeForObject say the following:

    /* The GC nulls map initially. It should still be null on error. */
    return false;

However, the GC only NULLs map in debug builds, as far as I can tell. So I think we might be missing the map initialization here, too.
Yes, cx->*alloc() can GC (I am not a fan, but thats how it was decided to do it a while back).
Attached patch patch (obsolete) — Splinter Review
I talked this over with Andreas and Brendan. Eventually we decided that cx->malloc should not be running a GC. If some OOM error reporter tries to GC, that's a bug. This patch asserts if that ever happens.

Andreas, I didn't really understand how you want to fix the error reporters. Could you file a bug on that?
Assignee: general → wmccloskey
Attachment #518219 - Flags: review?(gal)
This seems sg:critical?, whether we fix it here or in a different bug.
Whiteboard: [sg:critical?]
The patch only asserts, doesn't fix anything. We need DOM-side changes to really fix this. Not sure this is sg:crit. Bill, do we have proof for that?
All I know is that if we GC during this particular cx->calloc, then we will write to memory at undefined locations. I'm not sure how easy it would be to control the location, or what gets written, but it seems possible.

Could you file a bug to fix it on the DOM side, Andreas? I don't understand your plan to fix it.

Alternatively, I could change the assert to simply return if we're reporting an error. That would cause one of these bad GCs to fail silently.
This bug's basically bug 635137 comment 0, but it was claimed there that any of the cx->malloc/cx->calloc/etc. functions could GC, and that was how it should be.  Has something changed?
What about simply disabling the GC when it is invoked from cx->malloc error reporter? It would also help to address the bug on branches.
(In reply to comment #7)
> This bug's basically bug 635137 comment 0, but it was claimed there that any of
> the cx->malloc/cx->calloc/etc. functions could GC, and that was how it should
> be.  Has something changed?

I think Igor first, then several of us in Bill's cube, realized that nesting GC in cx->*alloc is Not Good.

Andreas, I just wanted to remind you about this. I think the choice is that you can r+ this patch and file a bug to fix the error reporter, or else I can change the patch to bypass GC during error reporting.
Andreas, seems you think we need more work here, can you please elaborate and r- the patch if we don't want the patch that's already attached?
Brendan should review this. I think he liked this approach, and I think its wrong. He has seniority, so I am happy to yield on it :)
Attachment #518219 - Flags: review?(gal) → review?(brendan)
(In reply to comment #12)
> Brendan should review this. I think he liked this approach, and I think its
> wrong. He has seniority, so I am happy to yield on it :)

Do you disagree with the idea that cx->calloc shouldn't be allowed to GC? Or with the decision about whether we should just skip over the GC during error reporting versus ensuring we never GC during error reporting?

The problem is that if we stick with the current approach, someone will need to write some code in the error reporter, and you're the only candidate right now.
Whiteboard: [sg:critical?] → [sg:critical?][needs review]
Attached patch patchSplinter Review
I changed the patch so that it ignores GC requests during OOM reporting. This is the least invasive way to fix the problem.

Also, just to know for sure, I added GCs in malloc, calloc, and realloc. Just running jit-tests, I got over 20 failures. Some of them were dupes, but I saw at least several distinct bugs. I think that taking this patch is far easier than fixing all these.
Attachment #518219 - Attachment is obsolete: true
Attachment #518219 - Flags: review?(brendan)
Attachment #529145 - Flags: review?(dmandelin)
Comment on attachment 529145 [details] [diff] [review]

Review of attachment 529145 [details] [diff] [review]:

A couple of nits, fix if you agree they are valid.

::: js/src/jscntxt.cpp
@@ +826,2 @@

A C++ helper class here would be good. Obviously, it's not that important in this particular case, but if had a tiny class for temporarily setting variables, we could just use it in every case like this.

::: js/src/jscntxt.h
@@ +711,5 @@
+     * OOM reporting (in js_ReportOutOfMemory). If a GC is requested while
+     * reporting the OOM, we assert.
+     */
+    bool                 inOOMReport;

Comment left over from the first version? I think it wants to say we don't GC while reporting OOM.

@@ +723,5 @@
     void reduceGCTriggerBytes(uint32 amount);
      * Call the system malloc while checking for GC memory pressure and
+     * reporting OOM error when cx is not null. We cannot GC from here.

I can imagine someone reading "We cannot GC from here" as "We must not GC from here", although clearly "cannot GC" is correct by precise meaning of "cannot". Maybe, "GC from here will be prevented" or something?
Attachment #529145 - Flags: review?(dmandelin) → review+
Whiteboard: [sg:critical?][needs review] → [sg:critical?][fixed-in-tracemonkey]
Depends on: 658864
Closed: 12 years ago
Resolution: --- → FIXED
Can't tell if 1.9.2 is affected or not. Parts of the patch would apply but I don't know if the older engine could potentially GC during error reporting.
status1.9.2: --- → ?
Target Milestone: --- → mozilla6
blocking1.9.2: --- → ?
blocking1.9.2: ? → .21+
Sorry, I should have replied sooner. The immediate problem that was reported here definitely does not exist on 1.9.2. The way shapes work has totally changed, and I don't think we have anything like the emptyShapes array (at least I couldn't find one in the code).

So I don't think there's anything to fix here.

 Can we remove this for 1.9.2 based on what Bill says above?
Yep, I'll mark it unaffected.
blocking1.9.2: .23+ → ---
qa- as QA cannot verify this fix. Please reset to qa+ if a testcase can be provided to verify the fix.
Whiteboard: [sg:critical?][fixed-in-tracemonkey] → [sg:critical?][fixed-in-tracemonkey][qa-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.