Models Overview

Explore Sanas’s Human ↔ Human and Human ↔ Machine models and their capabilities. Sanas delivers AI models across two categories: Human ↔ Human and Human ↔ Machine. Noise Cancellation and Speech Enhancement are currently available. Accent Translation, Language Translation, and Speech Intelligence are coming soon.

Who is listening?

A person is listening

You need a Human ↔ Human model. Voice quality and naturalness matter.

A machine is listening

You need a Human ↔ Machine model. Relative Word Error Rate Reduction (RWERR) and ASR accuracy matter.

Agentic Speech Enhancement Models

Model	Use Case	Description	Best For:
AGENTIC_VI_G_SE	Agentic — Voice Isolation (General) — Human ↔ Machine	Removes all background noise and non-primary voices for complete speaker isolation	Voice agents, IVR, and phone bots that need a single, noise-free speaker isolated from everything else on the line
AGENTIC_VI_GT_SE	Agentic — Voice Isolation (Telephony) — Human ↔ Machine	Telephony-optimized variant of VI_G for 8kHz narrowband audio	Telephony-based voice agents and IVR in contact centers running on narrowband (8kHz) call audio
AGENTIC_ST_SE	Agentic — Standard — Human ↔ Machine	Removes background noise while keeping all human speech audible	Multi-party calls or meeting/triage agents where overheard background speech still adds useful context
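The table above implies a simple selection rule, sketched here in Python. The model identifiers come from the table; the function name, its parameters, and the decision order are illustrative assumptions, not part of any Sanas SDK.

```python
from enum import Enum


class AgenticModel(str, Enum):
    """Model identifiers as listed in the table above."""
    VI_G = "AGENTIC_VI_G_SE"
    VI_GT = "AGENTIC_VI_GT_SE"
    STANDARD = "AGENTIC_ST_SE"


def pick_agentic_model(sample_rate_hz: int, keep_background_speech: bool) -> AgenticModel:
    """Hypothetical helper: choose a Human <-> Machine model.

    Assumption: background speech is kept only by the Standard model,
    and 8kHz narrowband audio is best served by the telephony variant.
    """
    if keep_background_speech:
        # Standard removes noise but keeps all human speech audible.
        return AgenticModel.STANDARD
    if sample_rate_hz <= 8000:
        # Telephony-optimized voice isolation for narrowband audio.
        return AgenticModel.VI_GT
    # General voice isolation for wider-band audio.
    return AgenticModel.VI_G


if __name__ == "__main__":
    print(pick_agentic_model(8000, keep_background_speech=False).value)
    print(pick_agentic_model(16000, keep_background_speech=True).value)
```

This sketch does not handle resampling. If the Standard model is chosen for 8kHz call audio, check the model's documented sample rate before sending audio to it.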

Speech Enhancement Models

Model	Use Case	Description	Best For:
SE1.2	SE Enhanced (Telco) — Human ↔ Human	Enhances voice quality for narrowband telephony audio and adds comfort noise; mainly intended for Telco use cases	Telco and carrier-grade voice applications needing lightweight enhancement with natural-sounding comfort noise
SE2.1	SE Standard — Human ↔ Human	Restores and enhances voice quality for telephony audio, outputting 8kHz	High-volume contact centers and IVR systems needing lightweight, low-latency voice cleanup on telephony audio
SE2.2	SE Full-Fidelity — Human ↔ Human	Full-Fidelity speech enhancement with bandwidth extension to 24kHz	Premium contact centers, conferencing, and telemedicine where full-fidelity audio quality is essential
VI_G_SE	Voice Isolation (General) — Human ↔ Human	Isolates intended speech by removing background noise and voices	Contact center agents, conferencing, and live/gaming voice chat where listeners need clean, isolated speech

Accent Translation Models

Model	Use Case	Description	Best For:
AT5.2	Accent Translation — Human ↔ Human	Accent Translation modifies global accents in real-time, allowing your teams to be instantly understood while preserving what makes every voice unique.	Contact centers and global teams who need instant, natural-sounding accent translation in meetings and daily calls

Language Translation

API	Use Case	Description	Best For:
LT	Language Translation — Human ↔ Human	Real-time language translation that preserves your speakers’ voices, tone, and intent.	Cross-language customer support and international meetings that need real-time translation while preserving the speaker’s voice and tone

Hear samples and learn more about each model’s specifications, use cases, and code examples below.

Human ↔ Human

Speech Enhancement · Enhanced (Telco)

SE1.2 — Enhances voice quality for narrowband telephony audio and adds comfort noise. Mainly intended for Telco use cases.

Speech Enhancement · Standard

SE2.1 — Restores and enhances voice quality for telephony audio. Low CPU footprint.

Latency: 120ms
Sample rate: 16kHz → 8kHz

Speech Enhancement · Full-Fidelity

SE2.2 — Full-fidelity speech enhancement with bandwidth extension to 24kHz.

Latency: 160ms
Sample rate: 16kHz → 24kHz

Speech Enhancement · Voice Isolation (General)

VI_G_SE — Isolates intended speech by removing background noise and voices. Optimized for human listeners.

Latency: ~40ms
Sample rate: Up to 24kHz
Range: Primary speaker within ~1m

Accent Translation

AT5.2 — Accent Translation modifies global accents in real-time, allowing your teams to be instantly understood while preserving what makes every voice unique.

Latency: ~200ms

Language Translation

LT — Real-time language translation that preserves your speakers’ voices, tone, and intent.

Latency: ~3-5s
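The latencies listed above can guide model choice when an application has an end-to-end delay budget. The following is a minimal sketch with the figures copied from this page. SE1.2 is omitted because no latency is listed for it, and LT uses the upper end of its ~3-5s range. The dictionary and function names are hypothetical.

```python
# Approximate latencies in milliseconds, taken from the listings above.
# LT is recorded at the upper bound of its ~3-5s range (an assumption
# made here to keep the filter conservative).
HUMAN_TO_HUMAN_LATENCY_MS = {
    "SE2.1": 120,
    "SE2.2": 160,
    "VI_G_SE": 40,
    "AT5.2": 200,
    "LT": 5000,
}


def models_within_budget(budget_ms: int) -> list[str]:
    """Return model names whose listed latency fits the budget, fastest first."""
    fitting = [
        (latency, name)
        for name, latency in HUMAN_TO_HUMAN_LATENCY_MS.items()
        if latency <= budget_ms
    ]
    return [name for _, name in sorted(fitting)]


if __name__ == "__main__":
    print(models_within_budget(150))
```

The listed figures are approximate and cover model latency only. Network and codec delay would have to be added separately.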

Human ↔ Machine

Agentic Speech Enhancement · Voice Isolation (General)

AGENTIC_VI_G_SE — Removes background noise and distant voices for complete voice isolation of the primary speaker’s audio stream.

Latency: ~100ms
Sample rate: 16kHz
Relative Word Error Rate Reduction (RWERR): 5–30% (average)

Agentic Speech Enhancement · Voice Isolation (Telephony)

AGENTIC_VI_GT_SE — Telephony-optimized variant of Voice Isolation for 8kHz narrowband audio.

Latency: ~100ms
Sample rate: 8kHz
Relative Word Error Rate Reduction (RWERR): 5–30% (average)

Agentic Speech Enhancement · Standard

AGENTIC_ST_SE — Removes background noise while preserving all human speech for multi-speaker environments.

Latency: ~100ms
Sample rate: 16kHz
Relative Word Error Rate Reduction (RWERR): 5–30% (average)
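To make the RWERR figures concrete, here is a small sketch of how a relative word error rate reduction is commonly computed: the drop in WER after enhancement, divided by the baseline WER. This page does not spell out Sanas's exact measurement procedure, so treat the formula and the function name as assumptions based on the usual definition.

```python
def relative_wer_reduction(baseline_wer: float, enhanced_wer: float) -> float:
    """Relative WER reduction as a fraction of the baseline WER.

    Assumes the common definition (baseline - enhanced) / baseline.
    Both inputs are word error rates expressed as fractions (0.20 = 20%).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - enhanced_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical numbers: ASR WER falls from 20% to 16% after enhancement,
    # which is a relative reduction of about 20%.
    reduction = relative_wer_reduction(0.20, 0.16)
    print(f"RWERR: {reduction:.0%}")
```

Under this definition, a 20% relative reduction on a 20% baseline WER corresponds to an absolute improvement of 4 percentage points.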
