Bug 1830193 Comment 5 Edit History


I wonder if we are running into variance due to worker history and burst-vs-base rates.
For example, for Azure "premium ssd": https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#premium-ssd-size
Its "P10" 128 GB disk *can* do 3,500 IOPS in burst, but only 500 IOPS at base rate. Suspiciously, that roughly matches the bimodal character of the run lengths: items/s here is bimodal at ~1800/s and ~360/s.
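A quick back-of-envelope comparison of the two ratios (just the numbers above, nothing Azure-specific; the "items/s" figures are the observed ones from these runs):

```python
# Does the observed bimodal throughput line up with Azure P10
# burst vs. base IOPS? Rough sanity check only.
burst_iops, base_iops = 3500, 500           # P10 documented rates
fast_items_s, slow_items_s = 1800, 360      # observed bimodal items/s

iops_ratio = burst_iops / base_iops         # burst/base
observed_ratio = fast_items_s / slow_items_s  # fast/slow

print(f"IOPS burst/base ratio:    {iops_ratio:.1f}x")
print(f"observed fast/slow ratio: {observed_ratio:.1f}x")
```

So the documented burst/base gap is ~7x and the observed gap is ~5x; not an exact match, but close enough to be suspicious if the workload isn't purely IOPS-bound.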

Regardless, many of these runs are ~2m faster now with the patch, but unfortunately they are also sometimes ~2m slower. Tuning the various parallelism parameters might help the slow case, but I want to make sure there's no easier solution first. (Ramdisks? These with-gpu VM configs look like they have a ton of RAM.)
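For the ramdisk idea, something like the following is what I have in mind; a minimal sketch, with a made-up mount point and size, assuming the VM really does have RAM to spare:

```shell
# Mount a tmpfs "ramdisk" and point scratch I/O at it.
# Path and size here are illustrative, not a recommendation.
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk

# Then redirect the hot working directory at it, e.g.:
#   export TMPDIR=/mnt/ramdisk
```

tmpfs is backed by page cache, so unused capacity costs nothing, but anything written there is gone on reboot; that should be fine for ephemeral test artifacts.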

It looks like (according to the internet) it can generally be tricky to get good disk perf on Azure, because of latency overhead when writing to a managed disk that's "secretly" auto-serialized to blobs over the network, and that we therefore ideally want to stick to "temporary disks" for ephemeral stuff like this.
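If someone with an Azure worker handy wants to check, something along these lines should show whether a temporary (resource) disk is present; hedged heavily, since the exact device paths and default mount point vary by image and provisioning agent:

```shell
# On many Azure Linux images, udev rules from the Azure agent expose
# the ephemeral "temporary disk" via symlinks like this:
ls -l /dev/disk/azure/resource* 2>/dev/null

# The agent / cloud-init often mounts it at /mnt (or /mnt/resource):
findmnt /mnt
```

If that disk exists on these workers, moving the build/test scratch dirs onto it (or onto tmpfs, per above) would sidestep the managed-disk network path entirely.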

Does any of this track? I'm outside my expertise once we get into Azure stuff. :)