Agentic NC Standard - Sanas Developer Hub

The Standard Agentic NC (AGENTIC_ST_NC) model is designed to improve Automatic Speech Recognition (ASR) accuracy in noisy multi-speaker environments. Unlike traditional noise cancellation optimized for human perception, this model preserves the acoustic and phonetic information that ASR systems depend on, including distant human speech, while removing environmental noise.

Hear the Difference

The model removes environmental background noise while preserving distant human speech, so the ASR system captures every speaker in multi-speaker environments where background conversations carry context.

Before · Multiple speakers + background noise

After · All speakers preserved, noise removed

Performance Benchmarks

The following transcripts are generated from the audio samples above. The Oracle Transcript is the ground truth: the actual words spoken. The Source Transcript is produced by passing the raw audio directly through the ASR system. The Sanas Transcript is produced by processing the audio through the Sanas model first, then passing it through the same ASR system. The Word Error Rate (WER) is calculated relative to the Oracle Transcript (0% WER) from insertion, deletion, and substitution errors; bolded words below indicate these errors. The WER for each transcript is shown in the last row of the table.
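As a rough illustration of how a WER figure like the ones in the table is obtained, the sketch below computes word-level edit distance between a reference and a hypothesis and divides by the reference length. It assumes plain whitespace tokenization with no text normalization; the normalization behind the published numbers is not described on this page, so this sketch is not guaranteed to reproduce them exactly. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# "snc" transcribed as "s n c": one substitution plus two insertions
# against a five-word reference.
wer = word_error_rate(
    "recording audio for snc dataset",
    "recording audio for s n c dataset",
)
print(f"{wer:.1%}")  # 60.0%
```

Because the denominator is the reference length, a hypothesis with many inserted words (such as TV speech picked up in the background) can push WER up sharply even when the primary speaker is transcribed correctly.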

Test Environment

Background: Indoor environment with TV audio playing at moderate volume
ASR System: Deepgram Nova3 Streaming

Oracle Transcript	Source Transcript	Sanas Transcript
hi my name is partho i’m here recording audio for snc dataset here i am uh playing the stock market news in moderate volume i’ll give a pause for few seconds okay i will talk again for a few minutes then i will again give a pause so uh we would try to record in more different scenarios let’s give a pause for another five seconds maybe okay uh they’re showing advertisement now so i’m going to stop the recording here	hi my name is barto i’m here recording audio for s n c dataset here i am uh playing the stock market news in moderate volume i’ll give a pause for a few seconds wait for it till they’re enter i think so not a taker right now in terms of like push move our goals but let’s quickly slip into a topic okay i will talk again for a few minutes then i will again give a pause so we would try to record in more different scenarios let’s give a pause for another five seconds maybe okay they’re showing advertisement now so i’m going to stop the recording here	hi my name is barto i’m here recording audio for s m c dataset here i am uh playing the stock market news in moderate volume i’ll give a pause for few seconds okay i will talk again for a few minutes then i will again give a pause so we would try to record in more different scenarios let’s give a pause for another five seconds maybe okay they’re showing advertisement now so i’m going to stop the recording here
—	44.4% WER	11.1% WER

Key Features

Improved ASR Accuracy

Delivers Relative Word Error Rate Reduction (RWERR) across multiple ASR systems on noisy data, with no degradation on clean audio.

ASR-Agnostic Design

Works seamlessly with any ASR pipeline, open-source or commercial, without requiring retraining or modification.

Preserves All Speech

Keeps all human voices intact while removing environmental noise, which makes it well suited to multi-speaker transcription.

Real-World Robustness

Trained on diverse acoustic environments to handle production variability from call centers to mobile devices.

Specifications

Model ID: AGENTIC_ST_NC
Category: Agentic Noise Cancellation
Type: Human ↔ Machine

Streaming latency (end-to-end processing time): ~100 ms
Sample rate: 16 kHz
RWERR in noisy conditions (average): 5-30%
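
RWERR is not defined on this page. The sketch below assumes the common definition: the drop in WER relative to the unprocessed baseline, `(baseline - processed) / baseline`. The function name `relative_wer_reduction` is hypothetical, and the inputs are the two WER values from the benchmark table above.

```python
def relative_wer_reduction(baseline_wer: float, processed_wer: float) -> float:
    """Relative WER reduction, assuming RWERR = (baseline - processed) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - processed_wer) / baseline_wer


# WER values from the benchmark table: 44.4% (source) and 11.1% (Sanas).
rwerr = relative_wer_reduction(44.4, 11.1)
print(f"{rwerr:.0%}")  # 75%
```

That result comes from a single recording, so it should not be read against the 5-30% figure, which is stated as an average across noisy conditions.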

Use Cases

Customer Service Bots

Reduces transcription errors in call center environments where multi-speaker context matters.

Real-Time Transcription

Provides clean audio input for live transcription services that need to capture all speakers.

ASR Preprocessing

Purpose-built enhancement for downstream ASR systems.

Code Example

Create an audio processor with the Standard Agentic NC model:

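import sanas_remote_sdk

# `sdk` is the initialized SDK instance; see the Quick Start for setup.
# Select the Standard Agentic NC model and a 16 kHz input sample rate.
audio_params = sanas_remote_sdk.AudioParams()
audio_params.modelName = "AGENTIC_ST_NC"
audio_params.sampleRate = 16000

# Returns the processor together with a creation result.
processor, create_result = sdk.CreateAudioProcessor(audio_params)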

Sample rate: 16 kHz for best quality. Audio with sample rates greater than 16 kHz will be downsampled.
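
To check in advance whether a recording will be downsampled, a minimal sketch like the following may help. It uses only the Python standard library `wave` module and does not call the Sanas SDK; the helper names are hypothetical, and the in-memory silent WAV exists only to make the example self-contained.

```python
import io
import wave

TARGET_SAMPLE_RATE = 16000  # 16 kHz, per the note above


def wav_sample_rate(wav_bytes: bytes) -> int:
    """Read the sample rate from the header of WAV data held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getframerate()


def will_be_downsampled(sample_rate: int) -> bool:
    """True when the input rate is above 16 kHz."""
    return sample_rate > TARGET_SAMPLE_RATE


def make_silent_wav(sample_rate: int, seconds: float = 0.1) -> bytes:
    """Build a short mono 16-bit silent WAV, used here only as test input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(sample_rate * seconds))
    return buf.getvalue()


rate = wav_sample_rate(make_silent_wav(48000))
print(rate, will_be_downsampled(rate))  # 48000 True
```

The page does not say how audio below 16 kHz is handled, so this check covers only the downsampling case.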

For full setup and initialization, see the Quick Start.

Next Steps

Quick Start

Get up and running with the Sanas SDK in under 5 minutes.

API Reference

Full SDK documentation for classes, enums, and callbacks.

Processing Multiple Streams

Handle multiple concurrent audio streams.
