SE - Standard - Sanas Developer Hub

The Speech Enhancement Standard (SE2.1) model restores and enhances voice quality for telephony audio. It takes 16 kHz input and outputs high-fidelity 8 kHz audio, correcting codec degradation, packet loss, reverb, and background noise. It is purpose-built for contact center and telephony environments with limited CPU budgets.

Hear the Difference

Before/after audio samples (original audio vs. audio enhanced with SE Standard) cover five conditions:

Various Accents
Codec Degradation
Packet Loss
Overlapping Speech
Background Noise

Key Features

Removes Noise and Background Speech

Eliminates background noise and background speech while preserving the primary speaker.

Restores Voice Fidelity

Restores the foreground voice to high fidelity, enhancing clarity, articulation, energy, and vocal presence.

Corrects Degradation

Corrects codec degradation, packet-loss artifacts from poor connections, reverb, and other room-acoustics effects.

Low CPU Footprint

Designed for production environments with limited CPU resources.

Specifications

Model ID: SE2.1
Category: Speech Enhancement
Type: Human ↔ Human
Streaming latency: 120 ms
Input / Output sample rate: 16 kHz → 8 kHz
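To put the 120 ms streaming latency and the 16 kHz and 8 kHz sample rates in concrete terms, the sketch below converts that duration into sample and byte counts. It is illustrative arithmetic only: the 16-bit mono PCM format and all names here are assumptions, not part of the Sanas SDK.

```python
# Figures taken from the specifications above.
STREAMING_LATENCY_MS = 120
INPUT_RATE_HZ = 16_000
MODEL_OUTPUT_RATE_HZ = 8_000

# Assumption for illustration: 16-bit (2-byte) mono PCM samples.
BYTES_PER_SAMPLE = 2


def samples_in(duration_ms: int, sample_rate_hz: int) -> int:
    """Number of audio samples covering duration_ms at sample_rate_hz."""
    return sample_rate_hz * duration_ms // 1000


def bytes_in(duration_ms: int, sample_rate_hz: int) -> int:
    """Size in bytes of that audio under the assumed PCM format."""
    return samples_in(duration_ms, sample_rate_hz) * BYTES_PER_SAMPLE


if __name__ == "__main__":
    for label, rate in (("input", INPUT_RATE_HZ), ("model output", MODEL_OUTPUT_RATE_HZ)):
        n = samples_in(STREAMING_LATENCY_MS, rate)
        b = bytes_in(STREAMING_LATENCY_MS, rate)
        print(f"{label}: {n} samples ({b} bytes) per {STREAMING_LATENCY_MS} ms")
```

Under these assumptions, 120 ms corresponds to 1920 samples at 16 kHz and 960 samples at 8 kHz.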

Use Cases

Contact Centers

Enhance agent-customer call quality in telephony environments with limited bandwidth.

Telephony Systems

Restore voice clarity for 8kHz telephony pipelines degraded by compression and network conditions.

IVR Systems

Improve audio quality for automated phone systems operating on narrowband audio.

Voice Recording

Clean up and restore archived or low-quality voice recordings.

Known Limitations

Speaker identity may occasionally differ slightly from the original.
Although pronunciations are unchanged, there may be a subtle perception of an Americanized accent.

Code Example

Create an audio processor with the SE Standard model:

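# Assumes `sanas_remote_sdk` is imported and `sdk` is initialized as in the Quick Start.

# Configure the processor for the SE Standard model.
audio_params = sanas_remote_sdk.AudioParams()
audio_params.modelName = "SE2.1"
audio_params.sampleRate = 16000  # input sample rate in Hz

# Returns the processor together with a creation result to check.
processor, create_result = sdk.CreateAudioProcessor(audio_params)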

Input sample rate: 16 kHz. The model outputs 8 kHz optimized for telephony, but Sanas Cloud upsamples it to 16 kHz to match the input sample rate.
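Since the processor expects 16 kHz input, it can help to confirm a recording's sample rate before streaming it. The following is a minimal sketch using Python's standard `wave` module, assuming the source audio is a WAV file. The helper names and the constant are hypothetical and not part of the Sanas SDK.

```python
import wave

# Input sample rate stated above for the SE Standard model.
REQUIRED_INPUT_RATE_HZ = 16_000


def wav_sample_rate(path: str) -> int:
    """Return the sample rate (Hz) recorded in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_valid_se_input(path: str) -> bool:
    """True if the WAV file already matches the required 16 kHz input rate."""
    return wav_sample_rate(path) == REQUIRED_INPUT_RATE_HZ
```

A file that fails this check would need to be resampled to 16 kHz before being passed to the processor.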

For full setup and initialization, see the Quick Start.

Next Steps

Quick Start

Get up and running with Sanas SDK in under 5 minutes.

API Reference

Full SDK documentation for classes, enums, and callbacks.

Processing Multiple Streams

Handle multiple concurrent audio streams.
