The Speech Enhancement Ultra (SE2.2) model delivers full-fidelity speech enhancement with bandwidth extension. It takes 16 kHz input and outputs ultra-fidelity 24 kHz audio, providing all the restorative capabilities of SE Standard while extending bandwidth for a richer, more natural listening experience. It is designed for environments with larger CPU budgets.

Hear the Difference

Before/after audio samples (original audio vs. enhanced with SE Ultra) cover:
  • Various Accents
  • Codec Degradation
  • Packet Loss
  • Overlapping Speech
  • Background Noise

Key Features

Bandwidth Extension

Extends audio to ultra-fidelity 24 kHz for a richer, more natural voice experience beyond standard telephony quality.

Removes Noise and Background Speech

Eliminates background noise and background speech while preserving the primary speaker's voice.

Restores Voice Fidelity

Restores the foreground voice to high fidelity, enhancing clarity, articulation, energy, and vocal presence.

Corrects Degradation

Repairs codec degradation and packet loss from poor connections, and reduces reverb and other room-acoustics effects.
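
To put the bandwidth extension in concrete terms, the sketch below applies the Nyquist limit (half the sample rate) to the 16 kHz input and 24 kHz output rates quoted on this page. The function name is illustrative and not part of the Sanas SDK; the figures are theoretical upper bounds, not measurements of the model's output.

```python
def nyquist_bandwidth_hz(sample_rate_hz: int) -> float:
    """Return the highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2


# Sample rates taken from this page; everything else is plain arithmetic.
input_bw = nyquist_bandwidth_hz(16_000)   # bandwidth available in the input
output_bw = nyquist_bandwidth_hz(24_000)  # bandwidth available in the output
extra_bw = output_bw - input_bw           # additional headroom after extension

print(f"{input_bw:.0f} Hz -> {output_bw:.0f} Hz (+{extra_bw:.0f} Hz)")
```

In other words, a 16 kHz stream can carry content up to 8 kHz, while a 24 kHz stream has room for content up to 12 kHz.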

Specifications

Model ID: SE2.2
Category: Speech Enhancement
Type: Human ↔ Human

Streaming latency: 160 ms
Input / Output sample rate: 16 kHz → 24 kHz
Note: Remote Server currently outputs at 16 kHz. Full 24 kHz output support is coming soon.
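
For buffer planning, it can be useful to translate the 160 ms latency figure into sample counts at each rate. The sketch below is arithmetic on the published numbers only; it does not describe the model's internal frame size, and the helper name is hypothetical.

```python
def samples_in_window(sample_rate_hz: int, window_ms: int) -> int:
    """Number of audio samples spanning window_ms at sample_rate_hz."""
    return sample_rate_hz * window_ms // 1000


LATENCY_MS = 160  # streaming latency quoted in the specifications

in_samples = samples_in_window(16_000, LATENCY_MS)   # samples at the input rate
out_samples = samples_in_window(24_000, LATENCY_MS)  # samples at the output rate

print(f"{LATENCY_MS} ms = {in_samples} samples in, {out_samples} samples out")
```

At these rates, every 2 input samples correspond to 3 output samples, so an output buffer for 24 kHz audio needs 1.5 times the sample capacity of the matching input buffer.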

Use Cases

Contact Centers

Premium voice quality for agent-customer calls where clarity and presence matter most.

Conferencing

Ultra-fidelity audio for video conferences and virtual meetings.

Voice Recording

Restore and upscale archived or degraded voice recordings to studio-like quality.

Telemedicine

Crystal-clear audio for doctor-patient consultations where every word counts.

Known Limitations

  • Speaker identity may occasionally differ slightly from the original.
  • Pronunciation is unchanged, but listeners may perceive a subtle Americanized accent.
  • Requires a larger CPU budget compared to SE Standard (SE2.1).

Code Example

Create an audio processor with the SE Ultra model:

import sanas_remote_sdk

# `sdk` is an already-initialized SDK instance (see the Quick Start).
audio_params = sanas_remote_sdk.AudioParams()
audio_params.modelName = "SE2.2"   # SE Ultra
audio_params.sampleRate = 16000    # input sample rate in Hz

processor, create_result = sdk.CreateAudioProcessor(audio_params)

Input sample rate: 16 kHz. The model outputs 24 kHz ultra-fidelity audio, but Sanas Cloud downsamples it to 16 kHz to match the input sample rate.

For full setup and initialization, see the Quick Start.
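
Because the model expects 16 kHz input, it can help to verify a recording's sample rate before handing it to the processor. The following is a minimal sketch using only Python's standard `wave` module; the helper names are hypothetical and not part of the Sanas SDK, and it assumes the audio is stored as an uncompressed WAV file.

```python
import wave

# Input rate stated on this page for SE2.2.
EXPECTED_INPUT_RATE_HZ = 16_000


def wav_sample_rate(path: str) -> int:
    """Read the sample rate (in Hz) from a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_valid_se_ultra_input(path: str) -> bool:
    """Return True if the WAV file matches the expected 16 kHz input rate."""
    return wav_sample_rate(path) == EXPECTED_INPUT_RATE_HZ
```

A caller might run `is_valid_se_ultra_input("call.wav")` before streaming and resample the file first if it returns `False`.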

Next Steps

Quick Start

Get up and running with the Sanas SDK in under 5 minutes.

API Reference

Full SDK documentation for classes, enums, and callbacks.

Processing Multiple Streams

Handle multiple concurrent audio streams.