The Standard Agentic NC (AGENTIC_ST_NC) model is designed to improve Automatic Speech Recognition (ASR) accuracy in noisy multi-speaker environments. Unlike traditional noise cancellation optimized for human perception, this model preserves the acoustic and phonetic information that ASR systems depend on, including distant human speech, while removing environmental noise.

Hear the Difference

Removes environmental background noise while preserving distant human speech, ensuring ASR captures every speaker in multi-speaker environments where background conversations carry context.
Before · Multiple speakers + background noise
After · All speakers preserved, noise removed

Performance Benchmarks

The following transcripts are generated from the audio samples above. The Oracle Transcript is the ground truth — the actual words spoken. The Source Transcript is the raw audio passed directly through the ASR system. The Sanas Transcript is the audio processed through the Sanas model first, then passed through the same ASR system. The Word Error Rate (WER) is calculated relative to the Oracle Transcript (0% WER), based on insertion, deletion, and substitution errors — bolded words below indicate these errors. WER percentage for each is shown at the bottom of the table.

Test Environment

  • Background: Indoor environment with TV audio playing at moderate volume
  • ASR System: Deepgram Nova3 Streaming
Oracle Transcript (0% WER)
hi my name is partho i’m here recording audio for snc dataset here i am uh playing the stock market news in moderate volume i’ll give a pause for few seconds okay i will talk again for a few minutes then i will again give a pause so uh we would try to record in more different scenarios let’s give a pause for another five seconds maybe okay uh they’re showing advertisement now so i’m going to stop the recording here

Source Transcript (44.4% WER)
hi my name is barto i’m here recording audio for s n c dataset here i am uh playing the stock market news in moderate volume i’ll give a pause for a few seconds wait for it till they’re enter i think so not a taker right now in terms of like push move our goals but let’s quickly slip into a topic okay i will talk again for a few minutes then i will again give a pause so we would try to record in more different scenarios let’s give a pause for another five seconds maybe okay they’re showing advertisement now so i’m going to stop the recording here

Sanas Transcript (11.1% WER)
hi my name is barto i’m here recording audio for s m c dataset here i am uh playing the stock market news in moderate volume i’ll give a pause for few seconds okay i will talk again for a few minutes then i will again give a pause so we would try to record in more different scenarios let’s give a pause for another five seconds maybe okay they’re showing advertisement now so i’m going to stop the recording here
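
The WER definition above (insertions, deletions, and substitutions relative to the Oracle Transcript) can be sketched as a word-level edit distance. This is a minimal illustration and not the scoring tool behind the published figures: the page does not say which text normalization or tokenization was applied, so numbers from this sketch may differ from the ones reported. The function names word_error_rate and relative_wer_reduction are hypothetical helpers introduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, processed_wer: float) -> float:
    """Fraction of the baseline WER removed by preprocessing (hypothetical helper)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - processed_wer) / baseline_wer


# One substitution ("partho" -> "barto") in a five-word reference.
print(word_error_rate("hi my name is partho", "hi my name is barto"))  # 0.2
```

Applied to the figures reported above, relative_wer_reduction(0.444, 0.111) comes out at roughly 0.75, that is, about three quarters of the baseline errors removed in this one sample.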

Key Features

Improved ASR Accuracy

Improves WER across multiple ASR systems on noisy data, with no degradation on clean audio.

ASR-Agnostic Design

Works seamlessly with any ASR pipeline, open-source or commercial, without requiring retraining or modification.

Preserves All Speech

Keeps all human voices intact while removing environmental noise — ideal for multi-speaker transcription.

Real-World Robustness

Trained on diverse acoustic environments to handle production variability from call centers to mobile devices.

Specifications

Model ID: AGENTIC_ST_NC
Category: Agentic Noise Cancellation
Type: Human ↔ Machine

Streaming latency (end-to-end processing time): ~100ms
Sample rate: 16kHz
Average Word Error Rate (WER) in noisy conditions: 5-30%

Use Cases

Customer Service Bots

Reduces transcription errors in call center environments where multi-speaker context matters.

Real-Time Transcription

Provides clean audio input for live transcription services that need to capture all speakers.

ASR Preprocessing

Purpose-built enhancement for downstream ASR systems.

Code Example

Create an audio processor with the Standard Agentic NC model:
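import sanas_remote_sdk

# Describe the audio stream and select the Standard Agentic NC model.
audio_params = sanas_remote_sdk.AudioParams()
audio_params.modelName = "AGENTIC_ST_NC"
audio_params.sampleRate = 16000  # Hz

# sdk is the initialized SDK instance (see the Quick Start for setup).
processor, create_result = sdk.CreateAudioProcessor(audio_params)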
Sample rate: 16 kHz for best quality. Audio with sample rates greater than 16 kHz will be downsampled.
For full setup and initialization, see the Quick Start.
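
Because audio above 16 kHz is downsampled, it can help to check a recording's sample rate before handing it to the processor. The sketch below uses only Python's standard wave module and does not call the Sanas SDK. The helper name describe_wav and the TARGET_SAMPLE_RATE constant are assumptions made for illustration, and the check covers only uncompressed PCM WAV files.

```python
import wave

# 16 kHz is the sample rate this page recommends for AGENTIC_ST_NC.
TARGET_SAMPLE_RATE = 16000


def describe_wav(path: str) -> dict:
    """Report basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        return {
            "sample_rate": sample_rate,
            "channels": wav_file.getnchannels(),
            "duration_seconds": wav_file.getnframes() / sample_rate,
            # True means the audio would be downsampled to 16 kHz.
            "above_target": sample_rate > TARGET_SAMPLE_RATE,
        }
```

A file reported with above_target set to True is still accepted according to the note above. The check only makes the downsampling step visible when comparing results across recordings.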

Next Steps

Quick Start

Get up and running with Sanas SDK in under 5 minutes.

API Reference

Full SDK documentation for classes, enums, and callbacks.

Processing Multiple Streams

Handle multiple concurrent audio streams.