Embargo has lifted! So here's that teardown I've promised! I won't get into how this happened, but AMD graciously sent me two test setups this year. A retail Zen3 setup back in January, and an engineering Zen4 setup in August. The motivation here is to let me optimize y-cruncher for these...
www.mersenneforum.org
“But anyway, we live in a very strange world now. AMD has AVX512, but Intel does not for mainstream... If you told me this a few years ago, I'd have looked at you funny. But here we are... Despite the "double-pumping", Zen4's AVX512 is not only competitive with Intel's, it outright beats it in many ways. Intel does remain ahead in a few areas though.”
From another place:
“Zen4 AVX512 is mostly double-pumped: a 256-bit native hardware that processes two halves of the 512-bit register.”
So they saved area, complexity, and power by double-pumping on 256-bit hardware (there are four 256-bit units in total, each for a different purpose), yet they still found a way to beat Intel’s original, hotter implementation of AVX-512. It’s a pretty great implementation.
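For anyone who wants to picture what "double-pumped" means, here is a rough conceptual sketch in Python. It is not how Zen4 is actually wired internally; it only models the idea from the quote: a 512-bit register treated as two 256-bit halves that go through the same 256-bit unit one after the other. The 8 x 64-bit lane layout and all function names are my own assumptions for illustration.

```python
# Conceptual model only (assumption): a 512-bit register is 8 lanes of 64 bits,
# and the hypothetical execution unit is 256 bits wide (4 lanes per pass).
LANES_512 = 8
LANES_256 = 4
MASK64 = (1 << 64) - 1  # keep each lane at 64 bits, wrapping on overflow


def add_256(a, b):
    """One pass through the 256-bit unit: four independent 64-bit adds."""
    assert len(a) == len(b) == LANES_256
    return [(x + y) & MASK64 for x, y in zip(a, b)]


def add_512_double_pumped(a, b):
    """A 512-bit add issued as two 256-bit passes: low half, then high half."""
    assert len(a) == len(b) == LANES_512
    low = add_256(a[:LANES_256], b[:LANES_256])
    high = add_256(a[LANES_256:], b[LANES_256:])
    return low + high


def add_512_native(a, b):
    """Reference: the same add done across all 8 lanes in a single pass."""
    assert len(a) == len(b) == LANES_512
    return [(x + y) & MASK64 for x, y in zip(a, b)]


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [MASK64, 1, 2, 3, 4, 5, 6, 7]  # lane 0 overflows and wraps to 0
result = add_512_double_pumped(a, b)
print(result)
print(result == add_512_native(a, b))
```

The result is identical either way; the difference is that the double-pumped version needs two passes through the narrower unit instead of one pass through a unit twice as wide. Lane-wise operations like this add split cleanly because the two halves never interact. Operations that move data between the halves do not split this simply, and the sketch does not model them.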