Closed Bug 1600895 Opened 2 years ago Closed 2 years ago

ThreadSanitizer: data race [@ updateEdge<js::BaseShape>] vs. [@ isForwarded]

Categories

(Core :: JavaScript: GC, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla76
Tracking Status
firefox-esr68 --- unaffected
firefox72 --- wontfix
firefox74 --- wontfix
firefox75 --- wontfix
firefox76 --- fixed

People

(Reporter: decoder, Assigned: jonco)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [retriggered][stockwell fixed:patch])

Attachments

(2 files, 3 obsolete files)

The attached crash information was detected while running CI tests with ThreadSanitizer on mozilla-central revision 66531295716a.

Filing as requested by :jonco because this race is not expected by JS devs.

General information about TSan reports

Why fix races?

Data races are undefined behavior and can cause crashes as well as correctness issues. Compiler optimizations can cause racy code to have unpredictable and hard-to-reproduce behavior.

Rating

If you think this race can cause crashes or correctness issues, it would be great to rate the bug appropriately as P1/P2 and/or indicating this in the bug. This makes it a lot easier for us to assess the actual impact that these reports make and if they are helpful to you.

False Positives / Benign Races

Typically, races reported by TSan are not false positives [1], but it is possible that the race is benign. Even in this case it would be nice to come up with a fix if it is easily doable and does not regress performance. Every race that we cannot fix will have to remain on the suppression list and slows down the overall TSan performance. Also note that seemingly benign races can possibly be harmful (also depending on the compiler, optimizations and the architecture) [2][3].

[1] One major exception is the involvement of uninstrumented code from third-party libraries.
[2] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[3] How to miscompile programs with "benign" data races: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf

Suppressing unfixable races

If the bug cannot be fixed, then a runtime suppression needs to be added in mozglue/build/TsanOptions.cpp. The suppressions match on the full stack, so it should be picked such that it is unique to this particular race. The bug number of this bug should also be included so we have some documentation on why this suppression was added.

Assignee: nobody → jcoppeard
Priority: -- → P1
See Also: → 1617287

This is a race during compacting GC between updating a shape's base shape and testing whether the shape is forwarded when updating an object group. This is possible because a shape's base shape pointer is stored in the first word of the shape, which is also used for the forwarded flag. Updating the shape's base shape pointer will clear the forwarded flag. However we only update the pointers in ummoved GC things or the new copy of a GC moved thing so this flag will never be set.

So I think this is a 'benign' race. It should possible to reorder the shape layout to stop this happening though.

This makes sure we will detect use of the contents of moved GC things during the update phase of the GC, not just when we return to the mutator as previously. Annoyingly we need to preserve contents for native objects with fixed elements because the elements flags are stored there and these may be accessed from other objects if they are COW elements.

This caught a use of unforwarded scripts during invalidation caused by OOM during sweeping type information.

Depends on D63871

I think the race is benign: we're checking isForwarded() and this will be false both before and after we update the base shape pointer. We can avoid the race altogether though by rearranging the shape fields. There's already some unused space in Shape due to alignment so we can move this to the start of the object where the GC use it for flags without distrubing anything else.

Depends on D63872

Attachment #9128519 - Attachment is obsolete: true

The problem here is concerns the fact that during compacting GC we parallelise updates to cells. This is important to reduce pause times as we can't return to the mutator until this work is complete. The race occurs as described in comment 3 when two updates are occurring in parallel: an unmoved GC thing has the GC thing pointer in its first word updated, and another cell tests a pointer to the first GC thing to see if it is forwarded. Whichever order these operations occur in the result is the same, however this is still a data race.

I looked at other GC things this could apply to and this can affect JSObjects and Scopes as well as Shapes. The JSObject case is interesting since we have JSObject to JSObject pointers which means we can't solve this by updating cells in separate phases.

Depends on: 1618131

I'll move the patches to bug 1618131 as they don't actually fix this race.

Comment on attachment 9128517 [details]
Bug 1600895 - Assert that compacting GC only ever updates unmoved GC things or new copies of moved things and never the old copy r?jandem

Revision D63871 was moved to bug 1618131. Setting attachment 9128517 [details] to obsolete.

Attachment #9128517 - Attachment is obsolete: true

Comment on attachment 9128518 [details]
Bug 1600895 - Poision moved GC thing contents sooner r?jandem

Revision D63872 was moved to bug 1618131. Setting attachment 9128518 [details] to obsolete.

Attachment #9128518 - Attachment is obsolete: true
Duplicate of this bug: 1622969
Duplicate of this bug: 1624027

There are 82 total failures since March 19th, on linux64-tsan opt.

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=294174804&repo=autoland&lineNumber=1796

[task 2020-03-21T07:09:49.056Z] 07:09:49 INFO - TEST-OK | dom/abort/tests/test_abort_controller_fetch.html | took 1292ms
[task 2020-03-21T07:09:49.240Z] 07:09:49 INFO - TEST-START | dom/abort/tests/test_event_listener_leaks.html
[task 2020-03-21T07:09:51.601Z] 07:09:51 INFO - GECKO(2347) | MEMORY STAT | vsize 130550465MB | residentFast 564MB
[task 2020-03-21T07:09:51.703Z] 07:09:51 INFO - TEST-OK | dom/abort/tests/test_event_listener_leaks.html | took 2462ms
[task 2020-03-21T07:09:51.893Z] 07:09:51 INFO - TEST-START | Shutdown
[task 2020-03-21T07:09:51.893Z] 07:09:51 INFO - Passed: 20
[task 2020-03-21T07:09:51.895Z] 07:09:51 INFO - Failed: 0
[task 2020-03-21T07:09:51.895Z] 07:09:51 INFO - Todo: 0
[task 2020-03-21T07:09:51.897Z] 07:09:51 INFO - Mode: e10s
[task 2020-03-21T07:09:51.898Z] 07:09:51 INFO - Slowest: 2461ms - /tests/dom/abort/tests/test_event_listener_leaks.html
[task 2020-03-21T07:09:51.900Z] 07:09:51 INFO - SimpleTest FINISHED
[task 2020-03-21T07:09:51.902Z] 07:09:51 INFO - TEST-INFO | Ran 1 Loops
[task 2020-03-21T07:09:51.918Z] 07:09:51 INFO - SimpleTest FINISHED
[task 2020-03-21T07:10:02.222Z] 07:10:02 INFO - GECKO(2347) | 1584774602211 Marionette TRACE Received observer notification xpcom-will-shutdown
[task 2020-03-21T07:10:02.223Z] 07:10:02 INFO - GECKO(2347) | 1584774602213 Marionette INFO Stopped listening on port 2828
[task 2020-03-21T07:10:02.224Z] 07:10:02 INFO - GECKO(2347) | 1584774602213 Marionette DEBUG Marionette stopped listening
[task 2020-03-21T07:10:03.471Z] 07:10:03 INFO - TEST-INFO | Main app process: exit 0
[task 2020-03-21T07:10:03.472Z] 07:10:03 INFO - runtests.py | Application ran for: 0:01:04.739370
[task 2020-03-21T07:10:03.473Z] 07:10:03 INFO - zombiecheck | Reading PID log: /tmp/tmpzF_pFIpidlog
[task 2020-03-21T07:10:03.473Z] 07:10:03 INFO - ==> process 2347 launched child process 2362
[task 2020-03-21T07:10:03.474Z] 07:10:03 INFO - ==> process 2347 launched child process 2405
[task 2020-03-21T07:10:03.474Z] 07:10:03 INFO - ==> process 2347 launched child process 2459
[task 2020-03-21T07:10:03.475Z] 07:10:03 INFO - ==> process 2347 launched child process 2513
[task 2020-03-21T07:10:03.475Z] 07:10:03 INFO - ==> process 2347 launched child process 2537
[task 2020-03-21T07:10:03.476Z] 07:10:03 INFO - zombiecheck | Checking for orphan process with PID: 2513
[task 2020-03-21T07:10:03.476Z] 07:10:03 INFO - zombiecheck | Checking for orphan process with PID: 2362
[task 2020-03-21T07:10:03.476Z] 07:10:03 INFO - zombiecheck | Checking for orphan process with PID: 2459
[task 2020-03-21T07:10:03.477Z] 07:10:03 INFO - zombiecheck | Checking for orphan process with PID: 2405
[task 2020-03-21T07:10:03.477Z] 07:10:03 INFO - zombiecheck | Checking for orphan process with PID: 2537
[task 2020-03-21T07:10:03.478Z] 07:10:03 INFO - Stopping web server
[task 2020-03-21T07:10:03.498Z] 07:10:03 INFO - Stopping web socket server
[task 2020-03-21T07:10:03.519Z] 07:10:03 INFO - Stopping ssltunnel
[task 2020-03-21T07:10:03.543Z] 07:10:03 WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2020-03-21T07:10:03.543Z] 07:10:03 INFO - runtests.py | Running tests: end.

[task 2020-03-21T07:10:43.365Z] 07:10:43 INFO - GECKO(2638) | WARNING: ThreadSanitizer: data race (pid=2638)
[task 2020-03-21T07:10:43.370Z] 07:10:43 INFO - GECKO(2638) | Read of size 8 at 0x7fba8d391bb0 by main thread:
[task 2020-03-21T07:10:43.373Z] 07:10:43 INFO - GECKO(2638) | #0 baseUnowned /builds/worker/checkouts/gecko/js/src/vm/Shape.h:863:10 (libxul.so+0x6871d73)
[task 2020-03-21T07:10:43.374Z] 07:10:43 INFO - GECKO(2638) | #1 unowned /builds/worker/checkouts/gecko/js/src/vm/Shape.h:853:22 (libxul.so+0x6871d73)
[task 2020-03-21T07:10:43.376Z] 07:10:43 INFO - GECKO(2638) | #2 js::Shape::fixupShapeTreeAfterMovingGC() /builds/worker/checkouts/gecko/js/src/vm/Shape.cpp:1968:39 (libxul.so+0x6871d73)
[task 2020-03-21T07:10:43.377Z] 07:10:43 INFO - GECKO(2638) | #3 js::Shape::fixupAfterMovingGC() /builds/worker/checkouts/gecko/js/src/vm/Shape.cpp:1994:5 (libxul.so+0x68721a7)
[task 2020-03-21T07:10:43.378Z] 07:10:43 INFO - GECKO(2638) | #4 UpdateCellPointers<js::Shape> /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:2175:9 (libxul.so+0x6c5f5ca)
[task 2020-03-21T07:10:43.380Z] 07:10:43 INFO - GECKO(2638) | #5 UpdateArenaPointersTyped<js::Shape> /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:2182:5 (libxul.so+0x6c5f5ca)
[task 2020-03-21T07:10:43.381Z] 07:10:43 INFO - GECKO(2638) | #6 UpdateArenaPointers /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:2198:5 (libxul.so+0x6c5f5ca)
[task 2020-03-21T07:10:43.382Z] 07:10:43 INFO - GECKO(2638) | #7 UpdateArenaListSegmentPointers(js::gc::GCRuntime*, ArenaListSegment const&) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:2222:5 (libxul.so+0x6c5f5ca)
[task 2020-03-21T07:10:43.384Z] 07:10:43 INFO - GECKO(2638) | #8 js::gc::GCRuntime::updateCellPointers(JS::Zone*, mozilla::EnumSet<js::gc::AllocKind, unsigned long>) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:2374:5 (libxul.so+0x6c5de47)
[task 2020-03-21T07:10:43.385Z] 07:10:43 INFO - GECKO(2638) | #9 updateAllCellPointers /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:2435:3 (libxul.so+0x6c5ff63)
[task 2020-03-21T07:10:43.386Z] 07:10:43 INFO - GECKO(2638) | #10 js::gc::GCRuntime::updateZonePointersToRelocatedCells(JS::Zone*) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:2474:3 (libxul.so+0x6c5ff63)
[task 2020-03-21T07:10:43.388Z] 07:10:43 INFO - GECKO(2638) | #11 js::gc::GCRuntime::compactPhase(JS::GCReason, js::SliceBudget&, js::gc::AutoGCSession&) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:6130:7 (libxul.so+0x6c78802)
[task 2020-03-21T07:10:43.389Z] 07:10:43 INFO - GECKO(2638) | #12 js::gc::GCRuntime::incrementalSlice(js::SliceBudget&, mozilla::Maybe<JSGCInvocationKind> const&, JS::GCReason, js::gc::AutoGCSession&) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:6577:13 (libxul.so+0x6c7a71a)
[task 2020-03-21T07:10:43.390Z] 07:10:43 INFO - GECKO(2638) | #13 js::gc::GCRuntime::gcCycle(bool, js::SliceBudget, mozilla::Maybe<JSGCInvocationKind> const&, JS::GCReason) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:6940:3 (libxul.so+0x6c7c049)
[task 2020-03-21T07:10:43.392Z] 07:10:43 INFO - GECKO(2638) | #14 js::gc::GCRuntime::collect(bool, js::SliceBudget, mozilla::Maybe<JSGCInvocationKind> const&, JS::GCReason) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:7123:9 (libxul.so+0x6c7d5e5)
[task 2020-03-21T07:10:43.393Z] 07:10:43 INFO - GECKO(2638) | #15 js::gc::GCRuntime::gcSlice(JS::GCReason, long) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:7215:3 (libxul.so+0x6c7de6a)
[task 2020-03-21T07:10:43.394Z] 07:10:43 INFO - GECKO(2638) | #16 JS::IncrementalGCSlice(JSContext*, JS::GCReason, long) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:8045:21 (libxul.so+0x6c81627)
[task 2020-03-21T07:10:43.396Z] 07:10:43 INFO - GECKO(2638) | #17 nsJSContext::GarbageCollectNow(JS::GCReason, nsJSContext::IsIncremental, nsJSContext::IsShrinking, long) /builds/worker/checkouts/gecko/dom/base/nsJSEnvironment.cpp:1170:5 (libxul.so+0x263e99a)
[task 2020-03-21T07:10:43.397Z] 07:10:43 INFO - GECKO(2638) | #18 InterSliceGCRunnerFired(mozilla::TimeStamp, void*) /builds/worker/checkouts/gecko/dom/base/nsJSEnvironment.cpp:1741:3 (libxul.so+0x2644ce2)
[task 2020-03-21T07:10:43.398Z] 07:10:43 INFO - GECKO(2638) | #19 operator() /builds/worker/checkouts/gecko/dom/base/nsJSEnvironment.cpp:2330:22 (libxul.so+0x265412f)
[task 2020-03-21T07:10:43.400Z] 07:10:43 INFO - GECKO(2638) | #20 std::_Function_handler<bool (mozilla::TimeStamp), DOMGCSliceCallback(JSContext*, JS::GCProgress, JS::GCDescription const&)::$_18>::_M_invoke(std::_Any_data const&, mozilla::TimeStamp&&) /builds/worker/fetches/clang/bin/../lib/gcc/x86_64-unknown-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/std_function.h:301:9 (libxul.so+0x265412f)
[task 2020-03-21T07:10:43.401Z] 07:10:43 INFO - GECKO(2638) | #21 operator() /builds/worker/fetches/clang/bin/../lib/gcc/x86_64-unknown-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/std_function.h:706:14 (libxul.so+0xa88b5e)
[task 2020-03-21T07:10:43.403Z] 07:10:43 INFO - GECKO(2638) | #22 mozilla::IdleTaskRunner::Run() /builds/worker/checkouts/gecko/xpcom/threads/IdleTaskRunner.cpp:58:14 (libxul.so+0xa88b5e)
[task 2020-03-21T07:10:43.404Z] 07:10:43 INFO - GECKO(2638) | #23 nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1220:14 (libxul.so+0xaa9fc1)
[task 2020-03-21T07:10:43.405Z] 07:10:43 INFO - GECKO(2638) | #24 NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:481:10 (libxul.so+0xaaea62)
[task 2020-03-21T07:10:43.407Z] 07:10:43 INFO - GECKO(2638) | #25 mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:87:21 (libxul.so+0x132b289)
[task 2020-03-21T07:10:43.408Z] 07:10:43 INFO - GECKO(2638) | #26 RunInternal /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:315:10 (libxul.so+0x12bc3ac)
[task 2020-03-21T07:10:43.409Z] 07:10:43 INFO - GECKO(2638) | #27 RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:308:3 (libxul.so+0x12bc3ac)
[task 2020-03-21T07:10:43.411Z] 07:10:43 INFO - GECKO(2638) | #28 MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:290:3 (libxul.so+0x12bc3ac)
[task 2020-03-21T07:10:43.412Z] 07:10:43 INFO - GECKO(2638) | #29 nsBaseAppShell::Run() /builds/worker/checkouts/gecko/widget/nsBaseAppShell.cpp:137:27 (libxul.so+0x47830d3)
[task 2020-03-21T07:10:43.413Z] 07:10:43 INFO - GECKO(2638) | #30 nsAppStartup::Run() /builds/worker/checkouts/gecko/toolkit/components/startup/nsAppStartup.cpp:271:30 (libxul.so+0x62c99c5)
[task 2020-03-21T07:10:43.415Z] 07:10:43 INFO - GECKO(2638) | #31 XREMain::XRE_mainRun() /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4614:22 (libxul.so+0x63dc5fa)
[task 2020-03-21T07:10:43.416Z] 07:10:43 INFO - GECKO(2638) | #32 XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4749:8 (libxul.so+0x63dd1e3)
[task 2020-03-21T07:10:43.417Z] 07:10:43 INFO - GECKO(2638) | #33 XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4800:21 (libxul.so+0x63dd534)
[task 2020-03-21T07:10:43.419Z] 07:10:43 INFO - GECKO(2638) | #34 mozilla::BootstrapImpl::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/Bootstrap.cpp:45:12 (libxul.so+0x63e8022)
[task 2020-03-21T07:10:43.420Z] 07:10:43 INFO - GECKO(2638) | #35 do_main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:217:22 (firefox+0xc8a33)
[task 2020-03-21T07:10:43.421Z] 07:10:43 INFO - GECKO(2638) | #36 main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:331:16 (firefox+0xc8a33)
[task 2020-03-21T07:10:43.423Z] 07:10:43 INFO - GECKO(2638) | Previous write of size 8 at 0x7fba8d391bb0 by thread T11:
[task 2020-03-21T07:10:43.425Z] 07:10:43 INFO - GECKO(2638) | [failed to restore the stack]
[task 2020-03-21T07:10:43.425Z] 07:10:43 INFO - GECKO(2638) | Thread T11 'JS Helper' (tid=2656, running) created by main thread at:
[task 2020-03-21T07:10:43.426Z] 07:10:43 INFO - GECKO(2638) | #0 pthread_create /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:967:3 (firefox+0x571ab)
[task 2020-03-21T07:10:43.426Z] 07:10:43 INFO - GECKO(2638) | #1 js::Thread::create(void* ()(void), void*) /builds/worker/checkouts/gecko/js/src/threading/posix/PosixThread.cpp:52:7 (libxul.so+0x6629d93)
[task 2020-03-21T07:10:43.426Z] 07:10:43 INFO - GECKO(2638) | #2 bool js::Thread::init<void (&)(void*), js::HelperThread*>(void (&)(void*), js::HelperThread*&&) /builds/worker/checkouts/gecko/js/src/threading/Thread.h:91:12 (libxul.so+0x6727f02)
[task 2020-03-21T07:10:43.427Z] 07:10:43 INFO - GECKO(2638) | #3 js::GlobalHelperThreadState::ensureInitialized() /builds/worker/checkouts/gecko/js/src/vm/HelperThreads.cpp:1158:27 (libxul.so+0x6722b31)
[task 2020-03-21T07:10:43.427Z] 07:10:43 INFO - GECKO(2638) | #4 js::EnsureHelperThreadsInitialized() /builds/worker/checkouts/gecko/js/src/vm/HelperThreads.cpp:94:30 (libxul.so+0x6722812)
[task 2020-03-21T07:10:43.428Z] 07:10:43 INFO - GECKO(2638) | #5 JSRuntime::init(JSContext*, unsigned int) /builds/worker/checkouts/gecko/js/src/vm/Runtime.cpp:200:32 (libxul.so+0x68346a2)
[task 2020-03-21T07:10:43.428Z] 07:10:43 INFO - GECKO(2638) | #6 js::NewContext(unsigned int, JSRuntime*) /builds/worker/checkouts/gecko/js/src/vm/JSContext.cpp:170:17 (libxul.so+0x6771d2d)
[task 2020-03-21T07:10:43.428Z] 07:10:43 INFO - GECKO(2638) | #7 JS_NewContext(unsigned int, JSRuntime*) /builds/worker/checkouts/gecko/js/src/jsapi.cpp:391:10 (libxul.so+0x65b6cd4)
[task 2020-03-21T07:10:43.429Z] 07:10:43 INFO - GECKO(2638) | #8 mozilla::CycleCollectedJSContext::Initialize(JSRuntime*, unsigned int) /builds/worker/checkouts/gecko/xpcom/base/CycleCollectedJSContext.cpp:128:16 (libxul.so+0x9b306b)
[task 2020-03-21T07:10:43.429Z] 07:10:43 INFO - GECKO(2638) | #9 XPCJSContext::Initialize() /builds/worker/checkouts/gecko/js/xpconnect/src/XPCJSContext.cpp:1104:32 (libxul.so+0x18bda2a)
[task 2020-03-21T07:10:43.429Z] 07:10:43 INFO - GECKO(2638) | #10 XPCJSContext::NewXPCJSContext() /builds/worker/checkouts/gecko/js/xpconnect/src/XPCJSContext.cpp:1300:23 (libxul.so+0x18be587)
[task 2020-03-21T07:10:43.434Z] 07:10:43 INFO - runtests.py | Waiting for browser...
[task 2020-03-21T07:10:43.435Z] 07:10:43 INFO - GECKO(2638) | #11 nsXPConnect::InitJSContext() /builds/worker/checkouts/gecko/js/xpconnect/src/nsXPConnect.cpp:80:25 (libxul.so+0x18f8903)
[task 2020-03-21T07:10:43.436Z] 07:10:43 INFO - GECKO(2638) | #12 xpc::InitializeJSContext() /builds/worker/checkouts/gecko/js/xpconnect/src/nsXPConnect.cpp:98:35 (libxul.so+0x18f89bf)
[task 2020-03-21T07:10:43.438Z] 07:10:43 INFO - GECKO(2638) | #13 XREMain::XRE_mainRun() /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4387:5 (libxul.so+0x63dbf4f)
[task 2020-03-21T07:10:43.438Z] 07:10:43 INFO - GECKO(2638) | #14 XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4749:8 (libxul.so+0x63dd1e3)
[task 2020-03-21T07:10:43.438Z] 07:10:43 INFO - GECKO(2638) | #15 XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp:4800:21 (libxul.so+0x63dd534)
[task 2020-03-21T07:10:43.441Z] 07:10:43 INFO - GECKO(2638) | #16 mozilla::BootstrapImpl::XRE_main(int, char**, mozilla::BootstrapConfig const&) /builds/worker/checkouts/gecko/toolkit/xre/Bootstrap.cpp:45:12 (libxul.so+0x63e8022)
[task 2020-03-21T07:10:43.441Z] 07:10:43 INFO - GECKO(2638) | #17 do_main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:217:22 (firefox+0xc8a33)
[task 2020-03-21T07:10:43.441Z] 07:10:43 INFO - GECKO(2638) | #18 main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:331:16 (firefox+0xc8a33)
[task 2020-03-21T07:10:43.442Z] 07:10:43 INFO - GECKO(2638) | SUMMARY: ThreadSanitizer: data race /builds/worker/checkouts/gecko/js/src/vm/Shape.h:863:10 in baseUnowned
[task 2020-03-21T07:10:43.442Z] 07:10:43 INFO - GECKO(2638) | ==================
[task 2020-03-21T07:10:43.443Z] 07:10:43 INFO - GECKO(2638) | Exiting due to channel error.
[task 2020-03-21T07:10:43.444Z] 07:10:43 INFO - GECKO(2638) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=19.1251) [GFX1-]: Receive IPC close with reason=AbnormalShutdown
[task 2020-03-21T07:10:43.444Z] 07:10:43 INFO - GECKO(2638) | Exiting due to channel error.
[task 2020-03-21T07:10:43.444Z] 07:10:43 INFO - GECKO(2638) | Exiting due to channel error.
[task 2020-03-21T07:10:44.493Z] 07:10:44 INFO - GECKO(2638) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=20.9684)
[task 2020-03-21T07:10:44.500Z] 07:10:44 INFO - TEST-INFO | Main app process: killed by SIGIOT
[task 2020-03-21T07:10:44.500Z] 07:10:44 INFO - Buffered messages finished
[task 2020-03-21T07:10:44.501Z] 07:10:44 ERROR - TEST-UNEXPECTED-FAIL | automation.py | application terminated with exit code -6
[task 2020-03-21T07:10:44.501Z] 07:10:44 INFO - runtests.py | Application ran for: 0:00:39.098104
[task 2020-03-21T07:10:44.501Z] 07:10:44 INFO - zombiecheck | Reading PID log: /tmp/tmpHUrlTIpidlog
[task 2020-03-21T07:10:44.501Z] 07:10:44 INFO - ==> process 2638 launched child process 2653
[task 2020-03-21T07:10:44.502Z] 07:10:44 INFO - ==> process 2638 launched child process 2691
[task 2020-03-21T07:10:44.502Z] 07:10:44 INFO - ==> process 2638 launched child process 2743
[task 2020-03-21T07:10:44.502Z] 07:10:44 INFO - ==> process 2638 launched child process 2797
[task 2020-03-21T07:10:44.502Z] 07:10:44 INFO - zombiecheck | Checking for orphan process with PID: 2797
[task 2020-03-21T07:10:44.502Z] 07:10:44 INFO - zombiecheck | Checking for orphan process with PID: 2691
[task 2020-03-21T07:10:44.503Z] 07:10:44 INFO - zombiecheck | Checking for orphan process with PID: 2653
[task 2020-03-21T07:10:44.503Z] 07:10:44 INFO - zombiecheck | Checking for orphan process with PID: 2743
[task 2020-03-21T07:10:44.503Z] 07:10:44 INFO - Traceback (most recent call last):
[task 2020-03-21T07:10:44.503Z] 07:10:44 INFO - File "/builds/worker/workspace/build/tests/mochitest/runtests.py", line 2871, in doTests
[task 2020-03-21T07:10:44.503Z] 07:10:44 INFO - e10s=options.e10s
[task 2020-03-21T07:10:44.504Z] 07:10:44 INFO - File "/builds/worker/workspace/build/tests/mochitest/runtests.py", line 2388, in runApp
[task 2020-03-21T07:10:44.504Z] 07:10:44 INFO - raise exc(value).with_traceback(tb)
[task 2020-03-21T07:10:44.504Z] 07:10:44 INFO - AttributeError: 'error' object has no attribute 'with_traceback'
[task 2020-03-21T07:10:44.504Z] 07:10:44 ERROR - Automation Error: Received unexpected exception while running application

Jonco can you please take a look? There are over 300 failures with the above error, most were classified with bug 1622969

Flags: needinfo?(jcoppeard)
Whiteboard: [stockwell needswork:owner]
Pushed by choller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ea747f9c106f
Add additional suppression for race variant. r=jonco
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell fixed:patch]

(In reply to Jon Coppeard (:jonco) from comment #7)
I've looked into this some more and I don't think the problem is exactly as described in comment 7, because we don't update shapes in parallel on separate threads.

The issue is a race between updating an ObjectGroup and a Shape. The shape is not moved and has its base shape update; the object group has a pointer to the shape and tests whether it is forwarded. As before, I don't think this causes problems because whichever order these events happen in the outcome is the same.

I also now think that bug 1622969 is s separate issue.

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

We're hoping to remove ObjectGroup in the future. Hopefully that will also remove this race.

Flags: needinfo?(jcoppeard)
You need to log in before you can comment on or make changes to this bug.