Regarding hitting a target/specific reference contrast ratio… My concern would be about consistency across a wide range of user contrast goals and what could happen within a zone.
For example, depending on subject matter, I push contrast across a very wide range. For some photos I aim for a very low-contrast, almost ethereal look. Others I punch up considerably, especially B&W and often color. Sometimes it’s a little of both, and sometimes within small areas that could fall within a single zone. What I’m concerned about is how local dimming handles that, specifically its predictability. While in post, I would not want the display’s local dimming/brightening algorithm making judgments on its own and counteracting what I’m trying to achieve, often in multiple local areas of the frame, and resulting in unfaithful prints. As an aside, if my math is correct, the 576 zones in Apple’s XDR display each come out to an area of around 3/4 square inch (I don’t know if they’re square or rectangular).
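For anyone who wants to check that zone-size estimate, here is a rough sketch. The 32-inch diagonal and 16:9 aspect ratio are my assumptions about the panel, and it treats the zones as evenly tiling the whole screen, which may not match the real layout.

```python
import math

# Assumptions (not stated in the post): 32-inch diagonal, 16:9 aspect ratio,
# and zones that evenly tile the full panel area.
DIAGONAL_IN = 32.0
ASPECT_W, ASPECT_H = 16, 9
ZONES = 576  # zone count quoted in the post

# Scale the aspect-ratio triangle so its hypotenuse equals the diagonal.
diag_units = math.hypot(ASPECT_W, ASPECT_H)
width_in = DIAGONAL_IN * ASPECT_W / diag_units
height_in = DIAGONAL_IN * ASPECT_H / diag_units

panel_area = width_in * height_in
zone_area = panel_area / ZONES

print(f"panel: {width_in:.1f} x {height_in:.1f} in, {panel_area:.0f} sq in")
print(f"area per zone: {zone_area:.2f} sq in")
```

Under those assumptions it works out to roughly 0.76 square inch per zone, so the 3/4 figure looks about right.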
I think I’d need to use an XDR display for a while in order to become a believer that the tech makes a significant/noticeable difference for processing photos. OTOH, at a $5K price (without stand), that’s not going to happen. I have zero complaints about my ASD meeting my needs. I would still love 6K res in a 32” display, though.
I think I understand what you're worried about and, if so, I can answer this conceptually (though not practically...more on that at the end).
The purpose of local dimming is not to enhance contrast beyond that of your source material. Rather, it's to adjust the monitor so it comes closer to your source material. [With just 12 zones, the Dell is going to have a hard time doing that.] Consider an LCD monitor with a peak brightness of 500 nits and a minimum brightness (aka black level), due to 0.1% bleed-through, of 0.5 nits. This gives it a contrast ratio of 1000:1 (IIUC this is typical for IPS LCD panels).
Now suppose your source material is designed to have a minimum black level of 0.005 nits when the peak brightness is 500 nits (100,000:1 contrast ratio). When you view it on your LCD panel, you're not going to see 0.005 nits; you're going to see black areas that are 100x brighter. Thus your LCD isn't going to show you faithfully what your final post-production scenes would look like on a reference display. In particular, you're not going to be able to adjust details in the dark areas, because you're not going to see the details caused by variations in light level below 0.5 nits. However, with sufficiently fine-grained local dimming, the monitor will selectively reduce the backlight behind those darkest pixels by 100-fold so that, with the 0.1% bleed, they have a local brightness of 0.005 nits instead of 0.5 nits.
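To make the arithmetic above concrete, here is a minimal sketch using the same numbers (500 nits peak, 0.1% bleed-through, backlight dimmed 100-fold behind the dark area). The function name and the simple "black level = backlight x bleed" model are my own simplification, not how any particular monitor computes it. It also assumes the bright and dark pixels sit in different zones; inside a single zone the backlight is shared, so the ratio there is still capped at the panel's native contrast.

```python
def black_level(backlight_nits: float, bleed_fraction: float) -> float:
    """Light leaking through a 'black' pixel for a given backlight level."""
    return backlight_nits * bleed_fraction

PEAK_NITS = 500.0       # full backlight, from the example above
BLEED = 0.001           # 0.1% bleed-through
DIMMING_FACTOR = 100.0  # backlight reduction behind the darkest pixels

# No local dimming: the whole panel shares one backlight level.
native_black = black_level(PEAK_NITS, BLEED)
native_contrast = PEAK_NITS / native_black

# Local dimming: the zone behind the dark area runs at 1/100 of full backlight,
# while a bright area in another zone still reaches the full 500 nits.
dimmed_black = black_level(PEAK_NITS / DIMMING_FACTOR, BLEED)
effective_contrast = PEAK_NITS / dimmed_black

print(f"native black:  {native_black:.3f} nits, contrast {native_contrast:,.0f}:1")
print(f"dimmed black:  {dimmed_black:.3f} nits, contrast {effective_contrast:,.0f}:1")
```

This gives about 0.5 nits and 1,000:1 without dimming, and about 0.005 nits and 100,000:1 with it.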
In sum, with an IPS panel whose static contrast ratio is 1000:1, and with source material whose contrast ratio is significantly higher, your starting point is unfaithful to your source material. Local dimming, properly implemented with sufficient granularity (i.e., not the 12 zones on the Dell), makes your monitor less unfaithful.
More broadly, with darker blacks, photos have a different, richer look. You're not going to see that look when you do your post-production on a standard LCD.
So it's not a question of choosing between a standard LCD that, while not optimum, will at least be faithful to your intentions, and one with local dimming that, while fancier, might not be. The standard LCD isn't faithful either (if your material is high contrast). I guess the advantage of the standard LCD is that its unfaithfulness may be more predictable. Though I suspect once you get to know a monitor that has local dimming, you will be able to understand and predict its unfaithfulness as well.
But here are the practical issues that I can't address:
(1) Your goal is not to have your artistic intentions accurately reflected when your work is displayed on a reference monitor; it's to have them accurately reflected when they're physically printed. And I don't know what the contrast ratio is for photographic prints. [I'm reminded here of sound engineers who used to mix on inexpensive speakers because they knew their stuff would most likely be played back on inexpensive speakers (car radios, boom boxes, etc.). This typically resulted in music with bumped-up lows and highs to compensate for the limited frequency range of said speakers, which sounded painfully bright and boomy when played back on a high-fidelity system. So if the contrast ratio of photographic prints isn't great, maybe doing post-production on a standard LCD wouldn't be a bad choice.]
(2) How well is local dimming implemented, even when the granularity is high enough for it to have the potential to work well?