The answer is that I don't really know. We might want to just copy WebKit/Blink behaviour, but there's no spec describing exactly what that behaviour is.
The issue is that under the old spec, the default value for transform-style is flat, so we'd expect <div id="cubeFields"> to break the preserve-3d chain here. Unfortunately WebKit never actually did this: it skips over 'no-op' elements (if they're not a containing block?), which is why this still works in WebKit/Blink. This is described in .
The new Editor's Draft (ED) of the spec attempts to resolve this by adding a new default value (auto), which effectively inherits the parent's value. It also changes the wording around preserve-3d to say that an element participates in a 3D rendering context if its containing block, rather than its parent, has transform-style:preserve-3d.
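To make the difference between the two wordings concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions, not how Gecko, WebKit or Blink actually implement this: the Element class, the two helper functions, and the "cube" and "face" elements are all hypothetical, and the containing block is set by hand rather than computed from positioning. It assumes a flat wrapper like cubeFields that is not the containing block of its child.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical, heavily simplified stand-in for a DOM element."""
    name: str
    parent: Optional["Element"] = None
    transform_style: str = "flat"  # default under the old spec
    # Assumption: the containing block is supplied by hand. A real engine
    # derives it from positioning, transforms, and so on.
    containing_block: Optional["Element"] = None


def participates_by_parent(el: Element) -> bool:
    """Old wording: look at the parent's transform-style."""
    return el.parent is not None and el.parent.transform_style == "preserve-3d"


def participates_by_containing_block(el: Element) -> bool:
    """New ED wording: look at the containing block's transform-style."""
    cb = el.containing_block if el.containing_block is not None else el.parent
    return cb is not None and cb.transform_style == "preserve-3d"


# Hypothetical structure: cube (preserve-3d) > cubeFields (flat) > face.
cube = Element("cube", transform_style="preserve-3d")
cube_fields = Element("cubeFields", parent=cube)  # 'no-op' wrapper, flat
# Assume face is absolutely positioned and cubeFields is not positioned,
# so the containing block of face is cube rather than cubeFields.
face = Element("face", parent=cube_fields, containing_block=cube)

print(participates_by_parent(face))            # False: the flat parent breaks the chain
print(participates_by_containing_block(face))  # True: the wrapper is skipped
```

In this model the parent rule flattens the face, while the containing-block rule skips the wrapper, which is roughly the WebKit/Blink outcome described above.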
I don't think anyone actually implements the new value for transform-style, and there are a whole bunch of spec issues filed for parts that just aren't implementable. See , , .
I've also filed  to suggest that WebKit/Blink should just do what we do (and what the original spec said), since it's simpler, but I don't think it's getting much traction.
I suspect we'll end up having to copy WebKit's behaviour and use the containing block to determine preserve-3d, so that's probably the work required here. It'll require someone to spend some time figuring out the existing behaviour around stacking-context ancestors that aren't containing blocks, etc., since the specs are of no help.