Technical discussions about the implementation and research of speech recognition algorithms.
mfcc calculation--please help - jasi...@yahoo.com - Nov 8 2:39:00 2005
Hi everybody,
I am new to this group and see that many of the members know a lot about speaker
identification. I am a graduate student doing my final-year project on speaker
recognition. The task is to perform speaker recognition on movie clips. I have no previous
experience with speech processing.
I was able to remove silence, environmental sounds, etc. from the audio signal to a satisfactory
extent. The next step is MFCC calculation and classification. I have tried many times, but
the MFCC vectors I get do not seem to be correct (I am not sure, actually). The vectors
for different speakers do not seem to be distinguishably different, and those belonging to the
same speaker don’t even seem to be sufficiently similar.
Has anybody else run into this problem? Any suggestions are very welcome.
I read somewhere that we should perform some cepstral normalization on the MFCC vectors. But I
don’t know how. Can anybody help please?
Or is it the case that by just looking at the vectors we cannot determine the similarity or
dissimilarity of the MFCC vectors?
Anybody with MFCC experience, please help.
Thanks in advance
jasine

(You need to be a member of speech-recognition -- send a blank email to speech-recognition-subscribe@yahoogroups.com )
Re: mfcc calculation--please help - Raghavendra K S - Nov 21 5:28:00 2005
Hi Jasine,
Please visit www.etsi.org. Source files for Mel cepstrum calculation are available there.
Although I cannot give you the exact name and URL, searching for "Distributed Speech Recognition
Front-end" will lead you to a floating-point C code base.
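
For orientation, here is a minimal sketch of the usual MFCC steps for a single frame: Hamming window, power spectrum, triangular mel filterbank, log, and DCT. It is not the ETSI front-end; the function names, the 20-filter / 13-coefficient defaults, and the naive DFT are assumptions chosen for illustration, and pre-emphasis and framing are left out.

```python
import cmath
import math

def hz_to_mel(f):
    # Common mel-scale formula (one of several variants in use).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame and return |DFT|^2 for bins 0..N/2.
    Uses a naive O(N^2) DFT to stay dependency-free."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spec.append(abs(s) ** 2)
    return spec

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising slope
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling slope
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def dct2(values, num_coeffs):
    """Unnormalized DCT-II, keeping the first num_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(num_coeffs)]

def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=13):
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(filt, spec)) for filt in bank]
    # Small floor avoids log(0) for empty or silent bands.
    log_energies = [math.log(e + 1e-10) for e in energies]
    return dct2(log_energies, num_coeffs)

# Hypothetical test frame: 256 samples of a 440 Hz tone at 8 kHz.
frame = [math.sin(2 * math.pi * 440 * i / 8000.0) for i in range(256)]
coeffs = mfcc_frame(frame, 8000)
```

Comparing the output of a sketch like this with that of a reference implementation on the same frame is one way to check whether your own MFCC code is behaving sensibly.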
Please note that normalizing the variance of the cepstrum is meant for speaker-independent
recognition. In fact, normalization eliminates most of the speaker-dependent features,
rendering the vectors useless for your purpose. To normalize, you subtract the mean of each
cepstral coefficient and divide by its standard deviation (the square root of its variance).
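
A minimal sketch of cepstral mean and variance normalization as it is commonly formulated: for each coefficient, subtract the mean over the utterance and divide by the standard deviation (the square root of the variance). The function name `cmvn` and the `eps` guard are assumptions for illustration, not part of any particular library.

```python
import math

def cmvn(frames, eps=1e-10):
    """Normalize each cepstral coefficient to zero mean and unit variance.

    frames: list of equal-length lists of floats, one MFCC vector per frame,
            all taken from the same utterance.
    eps:    small constant (an assumption) that avoids division by zero
            when a coefficient is constant across frames.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dims)]
        for f in frames
    ]

# Hypothetical two-frame, two-coefficient example.
normalized = cmvn([[1.0, 10.0], [3.0, 30.0]])
```

Dropping the division and keeping only the mean subtraction gives plain cepstral mean normalization, which mainly removes a constant channel offset.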
Mel coefficients do produce substantial differences between speakers. I advise you to plot
the vectors so that you can see the difference visually and convince yourself of it. That will
also help you judge the feasibility of using Mel coefficients.
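
Besides plotting, a rough numeric check is possible. Single MFCC frames vary a great deal with the sound being spoken, so the sketch below averages many frames per speaker before comparing. The helper names and the toy numbers are hypothetical, and a plain Euclidean distance between mean vectors is only a sanity check, not a speaker classifier.

```python
import math

def mean_vector(frames):
    """Average a list of MFCC vectors coefficient by coefficient."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy data: two frames per speaker, two coefficients per frame.
speaker_a = [[1.0, 2.0], [3.0, 4.0]]
speaker_b = [[5.0, 7.0], [5.0, 7.0]]

distance = euclidean(mean_vector(speaker_a), mean_vector(speaker_b))
print("distance between speaker means:", distance)
```

If distances between clips of the same speaker are not clearly smaller than distances between different speakers, that points to a problem in the feature extraction or to too few frames per clip.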
This is all I can say on the topic, since I am not familiar with speaker recognition.
Best of luck.
Regards
~rAGU