Question 1

What is MAI-Voice-2 and what does it support?

Accepted Answer

MAI-Voice-2 is Microsoft's most expressive text-to-speech model to date, announced on June 2, 2026. It supports fifteen languages, per-turn emotion tags, code-switching for Hindi-English and Spanish-English, and zero-shot voice cloning from a five-second clip. It is available in Azure AI Foundry.

Question 2

Why does Microsoft say MAI-Voice-2 is not recommended for production?

Accepted Answer

The model is in public preview, which means no service-level agreement, no guaranteed uptime, and no committed latency. Features listed as coming soon are not yet available, and certain capabilities remain constrained. Teams building production pipelines therefore treat the preview label as an operational failure mode to design around, not as a disclaimer to skim past.

Question 3

Why isn't a strong benchmark score enough for production TTS?

Accepted Answer

TTS evaluation measures single-sample perceptual quality. It does not measure correctness at scale, consistency across reruns, or graceful failure with automatic recovery, so a model can win a preference test and still fail on proper nouns, pacing across long scripts, or emotion tags that behave differently across languages.
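
One way to make "correctness at scale" concrete is to score every output in a batch against its input script instead of judging a single sample. The sketch below assumes each audio file has already been transcribed back to text by some speech-to-text step (not shown, and not something the source specifies). The function names and the 10% threshold are illustrative choices, not an established standard.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0

    # Classic Levenshtein dynamic program over word tokens,
    # keeping only the previous row to save memory.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def failure_rate(pairs, threshold: float = 0.1) -> float:
    """Share of (script, transcript) pairs whose WER exceeds the threshold.

    The threshold is an assumed value; tune it for your own content.
    """
    if not pairs:
        return 0.0
    failures = sum(1 for ref, hyp in pairs if word_error_rate(ref, hyp) > threshold)
    return failures / len(pairs)
```

Running the same script several times and comparing the resulting rates gives a rough view of consistency across reruns, which a one-off preference test does not capture.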

Question 4

What does a production voice pipeline require beyond the model?

Accepted Answer

It needs transcript verification to confirm audio matches the input, retry logic when a model returns degraded output, model-switching when a provider is down or in preview, and a validation layer that catches errors before audio ships. No single TTS model provides these.

Question 5

How does Onepin handle MAI-Voice-2?

Accepted Answer

Onepin sits above the model layer, routing jobs across 100+ TTS models, running quality validation on every output, and handling retries and model-switching automatically. Once MAI-Voice-2 moves out of preview, Onepin routes to it; while it carries no SLA, Onepin routes around it.
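
The "route around it while it carries no SLA" behavior can be pictured as a filter over a model catalog. The sketch below is an illustration of that idea only and makes no claim about how Onepin implements routing. `ModelInfo`, `eligible_models`, the priority values, and every catalog entry other than MAI-Voice-2 are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ModelInfo:
    name: str
    has_sla: bool          # False for preview models with no SLA
    healthy: bool = True   # False when the provider is currently down
    priority: int = 0      # lower value means preferred (assumed convention)


def eligible_models(catalog: Iterable[ModelInfo], require_sla: bool = True) -> List[ModelInfo]:
    """Return healthy models in preference order, skipping no-SLA models if required."""
    candidates = [
        m for m in catalog
        if m.healthy and (m.has_sla or not require_sla)
    ]
    return sorted(candidates, key=lambda m: m.priority)


# Illustrative catalog; only the MAI-Voice-2 preview status comes from the text above.
CATALOG = [
    ModelInfo("mai-voice-2", has_sla=False, priority=0),
    ModelInfo("model-b", has_sla=True, priority=1),
    ModelInfo("model-c", has_sla=True, healthy=False, priority=2),
    ModelInfo("model-d", has_sla=True, priority=3),
]

production_route = [m.name for m in eligible_models(CATALOG)]
experimental_route = [m.name for m in eligible_models(CATALOG, require_sla=False)]
```

In this arrangement, moving a model out of preview is a one-field change to the catalog (`has_sla=True`), after which it becomes eligible for production routes without touching the calling code.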

Microsoft Just Launched MAI-Voice-2. They Also Said Don't Use It in Production.

What MAI-Voice-2 Ships

The Gap Microsoft Put on Paper

Why This Pattern Repeats

What a Production Pipeline Requires

The Architecture That Handles This

What to Do Now

Frequently Asked Questions