Closed Bug 1726639 Opened 3 years ago Closed 3 years ago

Bogus hazard reported fillInAfterSwap on desktop

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

RESOLVED INVALID

People

(Reporter: sfink, Unassigned)

Details

(Keywords: sec-other)

I just ran the hazard analysis locally, expecting it to be clean, and I got the following hazard:

Function '_ZN2js12NativeObject15fillInAfterSwapEP9JSContextN2JS6HandleIPS0_EES5_NS4_INS3_13StackGCVectorINS3_5ValueENS_15TempAllocPolicyEEEEEPv$uint8 js::NativeObject::fillInAfterSwap(JSContext*, JS::Handle<js:
:NativeObject*>, js::NativeObject*, JS::Handle<JS::StackGCVector<JS::Value> >, void*)' has unrooted 'old' of type 'js::NativeObject*' live across GC call '_ZN2js12NativeObject28changeNumFixedSlotsAfterSwapEP9JS
ContextN2JS6HandleIPS0_EEj$uint8 js::NativeObject::changeNumFixedSlotsAfterSwap(JSContext*, JS::Handle<js::NativeObject*>, uint32)' at js/src/vm/JSObject.cpp:1310
    ...
GC Function: _ZN2js12NativeObject28changeNumFixedSlotsAfterSwapEP9JSContextN2JS6HandleIPS0_EEj$uint8 js::NativeObject::changeNumFixedSlotsAfterSwap(JSContext*, JS::Handle<js::NativeObject*>, uint32)
    uint8 js::Shape::replaceShape(JSContext*, JS::Handle<JSObject*>, js::EnumFlags<js::ObjectFlag>, js::TaggedProto, uint32)
    js::BaseShape* js::BaseShape::get(JSContext*, JSClass*, JS::Realm*, JS::Handle<js::TaggedProto>)
    js::BaseShape* js::Allocate(JSContext*) [with T = js::BaseShape; js::AllowGC allowGC = js::CanGC]
    js::gc::Cell* js::AllocateTenuredImpl(JSContext*, uint8, size_t) [with js::AllowGC allowGC = js::CanGC; size_t = long unsigned int]
    uint8 js::gc::GCRuntime::checkAllocatorState(JSContext*, uint8) [with js::AllowGC allowGC = js::CanGC]
    uint8 js::gc::GCRuntime::gcIfNeededAtAllocation(JSContext*)
    uint8 js::gc::GCRuntime::gcIfRequested()
    void js::gc::GCRuntime::minorGC(int32, uint8)
    GC

That hazard looks valid to me. I don't see why it isn't being reported in treeherder.

The relevant code is

/* static */
bool NativeObject::fillInAfterSwap(JSContext* cx, HandleNativeObject obj,
                                   NativeObject* old,
                                   HandleValueVector values) {
  // This object has just been swapped with some other object, and its shape
  // no longer reflects its allocated size. Correct this information and
  // fill the slots in with the specified values.
  MOZ_ASSERT(obj->slotSpan() == values.length());
  MOZ_ASSERT(!IsInsideNursery(obj));

  // Make sure the shape's numFixedSlots() is correct.
  size_t nfixed =
      gc::GetGCKindSlots(obj->asTenured().getAllocKind(), obj->getClass());
  if (nfixed != obj->shape()->numFixedSlots()) {
    if (!NativeObject::changeNumFixedSlotsAfterSwap(cx, obj, nfixed)) {
      return false;
    }
    MOZ_ASSERT(obj->shape()->numFixedSlots() == nfixed);
  }

  uint32_t oldDictionarySlotSpan =
      obj->inDictionaryMode() ? obj->dictionaryModeSlotSpan() : 0;

  Zone* zone = obj->zone();
  if (obj->hasDynamicSlots()) {
    ObjectSlots* slotsHeader = obj->getSlotsHeader();
    size_t size = ObjectSlots::allocSize(slotsHeader->capacity());
    zone->removeCellMemory(old, size, MemoryUse::ObjectSlots);
    js_free(slotsHeader);
    obj->setEmptyDynamicSlots(0);
  }
  ...

so we have obj passed in as a Handle, but old as a bare object, and old is used after NativeObject::changeNumFixedSlotsAfterSwap is called. It seems like a clear hazard. I don't know why it wouldn't be reported, unless somehow the analysis thinks it's only being called from within the GC or something. (Which is definitely not the case.)

This was touched by bug 1719457, but in an irrelevant way. The relevant code seems to have been introduced in bug 1395509.

Regressed by: 1395509
Group: core-security → javascript-core-security

There is an AutoSuppressGC in the JSObject::swap caller. This could definitely be cleaned up or made more obvious though...

Oh, thank you! I missed that. Now I need to diagnose why that isn't getting handled properly in my local run.

Current guess: in my local run, I have to keep rerunning the build until it stops OOMing. I bet it's losing some information, specifically a call edge here that it needs to see that this is always within an AutoSuppressGC scope. I guess I really need to figure out why it's OOMing on my threadripper.

Per comment 2 it sounds like this is not a real hazard, so I'm going to mark this sec-other. Please adjust as appropriate if I am misunderstanding something.

Keywords: sec-other

(In reply to Andrew McCreight [:mccr8] from comment #4)

Per comment 2 it sounds like this is not a real hazard, so I'm going to mark this sec-other. Please adjust as appropriate if I am misunderstanding something.

That was correct, but I have now tracked this down and this was a result of stale state on my desktop and therefore not something that will happen in automation. I will open this bug up and mark it INVALID.

_ZN2js12NativeObject15fillInAfterSwapEP9JSContextN2JS6HandleIPS0_EES5_NS4_INS3_13StackGCVectorINS3_5ValueENS_15TempAllocPolicyEEEEE is correctly known to the analysis to be in a no-GC region. But the hazard is being reported in _ZN2js12NativeObject15fillInAfterSwapEP9JSContextN2JS6HandleIPS0_EES5_NS4_INS3_13StackGCVectorINS3_5ValueENS_15TempAllocPolicyEEEEEPv.

The only difference there is the "Pv" at the end, which is a final void* parameter. That was removed in bug 1719457.

So what happened here is that I did a build before rebasing on top of that bug and recorded a function definition with the final void* parameter. I then rebased and did an incremental analysis, where it correctly recognized that the new, now void*-less function is in a GC suppression region. But the analysis loops over all known functions, which includes stale cruft lingering in the database. The old version of the function is not called in a GC suppression region, since it is no longer called at all, and therefore hazards in its body are reported as hazards.

To make this closer to correct, I could sweep through the DB and expire everything in a recompiled compilation unit or something. Which still wouldn't be perfect, because particularly unified compilation units can shift around. I guess the better fix would be to not do incremental analyses, or at least be aware of this risk when doing them.

Group: javascript-core-security
Status: NEW → RESOLVED
Closed: 3 years ago
No longer regressed by: 1395509
Resolution: --- → INVALID
Summary: Hazard in fillInAfterSwap → Bogus hazard reported fillInAfterSwap on desktop
You need to log in before you can comment on or make changes to this bug.