The Voice Isolation Agentic NC (AGENTIC_VI_G_NC) model is designed to improve Automatic Speech Recognition (ASR) accuracy in noisy environments. Unlike traditional noise cancellation optimized for human perception, this model preserves the acoustic and phonetic information that ASR systems depend on while removing conversational disruptions.

Hear the Difference

The model removes environmental background noise and distant human voices from the primary speaker’s audio stream, so the ASR system processes only the primary speaker’s input.
Before · Primary speaker + background noise
After · Primary speaker only

Performance Benchmarks

The following transcripts are generated from the audio samples above. The Oracle Transcript is the ground truth — the actual words spoken. The Source Transcript is the raw audio passed directly through the ASR system. The Sanas Transcript is the audio processed through the Sanas model first, then passed through the same ASR system. The Word Error Rate (WER) is calculated relative to the Oracle Transcript (0% WER), based on insertion, deletion, and substitution errors — bolded words below indicate these errors. WER percentage for each is shown at the bottom of the table.

Test Environment

  • Background: Contact center with background office chatter and ambient noise
  • ASR System: Deepgram Nova3 Streaming
Oracle Transcript

good morning welcome to sun bank fraud prevention desk this is partha speaking how can i assist i completely understand let’s check your account immediately uh for verification may i have the last four digits of your card thank you i will review your account i see an o t p multiple request attempted today but no transaction went through did anyone recently ask for your o t p or card details good please remember we will never ask for your o t p to keep your account safe i recommend blocking the card immediately and issuing a replacement would you like me to proceed done your card has been blocked successfully your replacement card will reach your uh reach your registered address in three to five business days okay yes i’m also enabling security alerts for your account you are welcome anything else i can help you with today

Source Transcript

good morning welcome to sunbank fraud prevention desk this is speaking how can i assist i completely understand let’s check your account immediately for okay to continue using yes i’m also enabling security alerts for your account will you write me the changes to you you are welcome anything else i can help you with changing your account

Sanas Transcript

good morning welcome to sunbank fraud prevention desk this is speaking how can i assist i completely understand let’s check your account immediately for verification may i have the last four digits of your card thank you i will review your account i will see an o t p request attempted today but no transaction went through did anyone recently ask for your o t p or card details good september we will never ask for your o t p to keep your account safe i recommend blocking the card immediately and issue a replacement would you like me to proceed done your card has been blocked successfully your replacement card will reach your reach your registered address in three to five business days okay i’m also enabling security alerts for your account you are welcome anything else i can help you with today

Source Transcript WER: 75.3%
Sanas Transcript WER: 7.3%
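The WER figures above follow the usual definition: the number of substituted, deleted, and inserted words divided by the number of words in the reference (Oracle) transcript. The sketch below shows one common way to compute it with a word-level edit distance. The function name `word_error_rate` and the short example strings are illustrative assumptions; the exact text normalization used for the published numbers is not documented here, so this is not expected to reproduce them exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative strings only: one deletion ("four") and one substitution ("card" -> "car").
    oracle = "may i have the last four digits of your card"
    asr_output = "may i have the last digits of your car"
    print(f"{word_error_rate(oracle, asr_output):.1%}")  # 20.0%
```

Two errors against a ten-word reference give 20% WER. Because the denominator is the reference length, a hypothesis with many insertions can exceed 100%.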

Key Features

Improved ASR Accuracy

Improves WER across multiple ASR systems on noisy data, with no degradation on clean audio.

ASR-Agnostic Design

Works seamlessly with any ASR pipeline, open-source or commercial, without requiring retraining or modification.

Enhanced Turn-Taking

Reduces false triggers from ambient sounds (background chatter, environmental noise) that cause agents to interrupt.

Real-World Robustness

Trained on diverse acoustic environments to handle production variability from call centers to mobile devices.

Specifications

Model ID: AGENTIC_VI_G_NC
Category: Agentic Noise Cancellation
Type: Human ↔ Machine

Streaming latency (end-to-end processing time): ~100 ms
Sample rate: 16 kHz
Average Word Error Rate (WER) in noisy conditions: 5-30%

Use Cases

Voice Agents

Improves speech recognition in noisy environments by isolating a single speaker.

IVR (Interactive Voice Response) Systems

Enhances recognition accuracy for automated phone systems.

Phone Assistants

Optimizes speech-to-text in telephony conditions for hands-free use.

ASR Preprocessing

Purpose-built enhancement for downstream ASR systems.
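As a rough illustration of the preprocessing role described above, the sketch below composes an enhancement step in front of an unchanged ASR callable. Every name here (`Enhancer`, `Transcriber`, `with_preprocessing`, and the stand-in functions) is hypothetical and is not part of the Sanas SDK; a real integration would route audio through the processor created in the Code Example section.

```python
from typing import Callable

AudioChunk = bytes
Enhancer = Callable[[AudioChunk], AudioChunk]  # hypothetical: audio in, cleaned audio out
Transcriber = Callable[[AudioChunk], str]      # hypothetical: any ASR system's entry point


def with_preprocessing(enhance: Enhancer, transcribe: Transcriber) -> Transcriber:
    """Wrap an ASR callable so every chunk is enhanced before transcription."""
    def pipeline(chunk: AudioChunk) -> str:
        return transcribe(enhance(chunk))
    return pipeline


if __name__ == "__main__":
    # Stand-ins for demonstration only: a pass-through enhancer and a fake ASR.
    def passthrough(chunk: AudioChunk) -> AudioChunk:
        return chunk

    def fake_asr(chunk: AudioChunk) -> str:
        return f"received {len(chunk)} bytes"

    asr_with_enhancement = with_preprocessing(passthrough, fake_asr)
    print(asr_with_enhancement(b"\x00" * 320))  # received 320 bytes
```

The ASR callable is passed in untouched, which mirrors the claim that the downstream system needs no retraining or modification.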

Code Example

Create an audio processor with the Voice Isolation Agentic NC model:
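import sanas_remote_sdk

# `sdk` is an already initialized SDK instance; see the Quick Start for setup.
audio_params = sanas_remote_sdk.AudioParams()
audio_params.modelName = "AGENTIC_VI_G_NC"  # Voice Isolation Agentic NC model ID
audio_params.sampleRate = 16000  # 16 kHz input

# Returns the processor together with a result for the creation call.
processor, create_result = sdk.CreateAudioProcessor(audio_params)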
Sample rate: 16 kHz for best quality. Audio with sample rates greater than 16 kHz will be downsampled.
For full setup and initialization, see the Quick Start.
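Since audio above 16 kHz is downsampled, it can be useful to check a file's sample rate before streaming it. The following is a minimal sketch using only Python's standard `wave` module. The helpers `check_sample_rate` and `make_silent_wav` are illustrative names, not Sanas SDK functions, and the sketch assumes uncompressed WAV input.

```python
import io
import wave

TARGET_SAMPLE_RATE = 16000  # 16 kHz, the recommended input rate


def check_sample_rate(wav_file):
    """Return (sample_rate, will_be_downsampled) for a WAV path or file object."""
    with wave.open(wav_file, "rb") as wav:
        rate = wav.getframerate()
    return rate, rate > TARGET_SAMPLE_RATE


def make_silent_wav(sample_rate, num_frames=160):
    """Build a short mono 16-bit silent WAV in memory (for demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * num_frames)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(check_sample_rate(make_silent_wav(48000)))  # (48000, True)
    print(check_sample_rate(make_silent_wav(16000)))  # (16000, False)
```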

Next Steps

Quick Start

Get up and running with Sanas SDK in under 5 minutes.

API Reference

Full SDK documentation for classes, enums, and callbacks.

Processing Multiple Streams

Handle multiple concurrent audio streams.