[Olympus developers 172]: Re: Audioserver VAD

Antoine Raux antoine.raux at gmail.com
Wed Nov 18 14:17:36 EST 2009

Hi Jose,

The settings for AudioServer are in the AudioServer.cfg file, which is 
located in the Configurations folder you're using (i.e. under 
Below is the example for the LetsGoPublic system.

You can see there are several possible VADs (energy-based vs. GMM-based), 
so it depends which one you're using. The energy-based VAD can be tuned 
by changing the power_threshold value.
If you set "log = 1" (as is the case here), you should see the kind of 
values you're getting for energy (at least I think so). Based on those, 
you should be able to set the threshold. Note that another option is to 
tune your audio device input volume. For GMM-based VAD, I don't think 
there's an easy way to tune it (short of retraining the GMMs on your own 
data).

Hope this helps.

# This is a configuration file for the Audio_Server for LetsGoPublic

# Sample rate
sps = 8000

# Host and port number of Sphinx recognition engines
engine_list = male:localhost:9990,female:localhost:9991,dtmf:localhost:9992

# Enable/disable logging
log_full_session_input = 1
log = 1

# Configuration for the Voice Activity Detector
#vad = power
#vad_config = window_width=800,power_threshold=10000
vad = GMM
vad_config = sampling_rate=8000, fe_frame_rate=100, fe_window_length=0.0256, fe_fft_size=512, fe_num_filters=35, fe_lower_filter_freq=130, fe_upper_filter_freq=3800, fe_normalize_c0=1, prior_noise_level=8, prior_speech_level=15, snr_estimation_step=80000, window_width=20
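
To try the energy-based VAD instead of the GMM one, the two commented-out 
lines above could be enabled roughly as shown here. This is only a sketch 
that reuses the values already present in this file. It assumes that a 
higher power_threshold makes the VAD trigger less easily, so the logged 
energy values should be checked before settling on a number.

# Energy-based VAD (comment out the "vad = GMM" line and its vad_config)
vad = power
vad_config = window_width=800,power_threshold=10000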

José David Lopes wrote:
> I am trying to connect another ASR to Olympus. I found some 
> difficulties because VAD that comes with Audioserver is very 
> sensitive. It triggers both speech and pause events very quickly.
> Is it possible to tune the sensitivity of this parameter?
> José David Lopes
