Bug 1796126 Comment 22 Edit History

My interest here is purely in identifying whether we have any coding patterns that are particularly vulnerable to a data race on Family 23 Model 1 Stepping 1 CPUs, since this could affect a future release.

As an example of a hot code path that would be very vulnerable to errata that cause stale loads: there is at least one place in the code where we use std::push_heap to insert items into a list while another thread is reading it. This rapidly hits the same addresses with store and load operations in several different orderings, which makes it more likely to hit CPU errata. Presumably there are x86 lock-prefixed instructions executing around this time, and not necessarily on the writing thread. If erratum 1021 caused a stale load on the writing thread, it would corrupt the list being inserted into. (This is not to be confused with the expected behavior of stale values seen by the reading thread; x86 lockless programming is fun like that.)
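
To make the writing-thread failure mode concrete, here is a minimal single-threaded sketch. It is not the actual code path and it does not reproduce the erratum. `LoadModel`, `sift_up` and `push_and_check` are hypothetical names; `sift_up` is a hand-written stand-in for the sift-up that std::push_heap performs, with each parent load routed through a model that can be told to return a stale value.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CPU load of heap[i]. If stale_index >= 0, the
// load at that index returns stale_value instead of the current contents.
// This only simulates the *effect* of a stale load; it is an assumption for
// illustration, not a model of how the erratum is triggered.
struct LoadModel {
    long stale_index;  // -1 means every load is fresh
    int stale_value;
    int operator()(const std::vector<int>& heap, std::size_t i) const {
        if (stale_index >= 0 && static_cast<std::size_t>(stale_index) == i) {
            return stale_value;
        }
        return heap[i];
    }
};

// Hand-written sift-up for a max-heap whose last element was just appended.
// With fresh loads it leaves the vector in valid heap order, like std::push_heap.
void sift_up(std::vector<int>& heap, const LoadModel& load) {
    std::size_t child = heap.size() - 1;
    const int value = heap[child];
    while (child > 0) {
        const std::size_t parent = (child - 1) / 2;
        const int parent_value = load(heap, parent);  // the load that could be stale
        if (parent_value >= value) {
            break;
        }
        heap[child] = parent_value;  // store derived from the loaded value
        child = parent;
    }
    heap[child] = value;
}

// Append value, sift it up, and report whether the heap invariant still holds.
bool push_and_check(std::vector<int> heap, int value, const LoadModel& load) {
    heap.push_back(value);
    sift_up(heap, load);
    return std::is_heap(heap.begin(), heap.end());
}
```

With fresh loads, pushing 8 onto {9, 5, 7} yields a valid heap. If the load of index 1 returns a stale 10 instead of 5, the writer stops sifting too early and leaves 8 below 5, so the structure is corrupted by the writer itself, independent of anything the reader sees.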

For erratum 1091, the most likely comparable data structure I can imagine is a map implementation using unaligned (packed) structs. With alignment enabled, struct {int32 key; void *value;} takes 16 bytes on x86_64; without it, the struct takes 12 bytes, and half the elements of a vector of these structs would hold unaligned pointers. That makes it very possible to get stale data for loads of pointers that cross a 4K boundary in a data race.
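
The layout arithmetic can be checked with a short sketch. `AlignedEntry`, `PackedEntry`, `Counts` and `count_hazards` are hypothetical names, not anything from our tree, and the numbers assume 8-byte pointers (x86_64) and a compiler that honors `#pragma pack` (GCC, Clang and MSVC do). It only counts which pointer fields are misaligned or straddle a 4 KiB boundary; it does not attempt to trigger the erratum.

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: 4-byte key, 4 bytes of padding, 8-byte pointer (16 bytes on x86_64).
struct AlignedEntry {
    std::int32_t key;
    void* value;
};

// Packed layout: no padding, so the pointer starts at offset 4 (12 bytes on x86_64).
#pragma pack(push, 1)
struct PackedEntry {
    std::int32_t key;
    void* value;
};
#pragma pack(pop)

struct Counts {
    std::size_t unaligned;    // pointers not aligned to alignof(void*)
    std::size_t crossing_4k;  // pointers whose bytes span a 4 KiB boundary
};

// For `count` packed entries laid out contiguously from byte address `base`,
// count the misaligned `value` fields and those straddling a 4 KiB boundary.
Counts count_hazards(std::uintptr_t base, std::size_t count) {
    Counts c{0, 0};
    for (std::size_t i = 0; i < count; ++i) {
        const std::uintptr_t first =
            base + i * sizeof(PackedEntry) + offsetof(PackedEntry, value);
        const std::uintptr_t last = first + sizeof(void*) - 1;
        if (first % alignof(void*) != 0) {
            ++c.unaligned;
        }
        if (first / 4096 != last / 4096) {
            ++c.crossing_4k;
        }
    }
    return c;
}
```

For a page-aligned array of 1024 packed entries (three pages), this gives 512 misaligned pointers and one pointer straddling a 4 KiB boundary, so a large map built this way would contain such a pointer roughly once every three pages.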
