[Speech Recognition] Intelligent Voice-Recognition Access Control System Based on a MATLAB GUI (with MATLAB Source Code), Issue 596
Posted by 紫极神光
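I. Introduction
This article designs and implements a text-dependent voiceprint recognition system in MATLAB that can determine a speaker's identity.
1 System Principles
a. Voiceprint recognition
With the progress of artificial intelligence in recent years, many mobile apps have introduced a voiceprint lock feature, which relies mainly on voiceprint recognition technology. Voiceprint recognition, also called speaker recognition, differs from speech recognition: speech recognition determines what was said, whereas speaker recognition determines who said it.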
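b. Mel-frequency cepstral coefficients (MFCC)
Mel-frequency cepstral coefficients (MFCC) are among the most widely used features in speech signal processing.
Experiments show that the human ear behaves like a filter bank that attends only to certain frequencies in the spectrum. The ear's perception of frequency is not linear in Hz; it is approximately linear on the Mel scale.
MFCC takes this auditory characteristic into account: the linear spectrum is first mapped onto the perceptually motivated, nonlinear Mel spectrum and then transformed to the cepstrum. The conversion from ordinary frequency $f$ (in Hz) to Mel frequency is commonly written as:
$$\mathrm{Mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$$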
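To make the Hz-to-Mel mapping concrete, the short sketch below converts a few frequencies to the Mel scale and back, and places filter centres evenly on the Mel axis. It assumes the common 2595·log10(1 + f/700) form of the relation; the helper names `hz2mel` and `mel2hz`, the 0 to 4000 Hz range, and the choice of 20 filters are illustrative and do not come from the original source.

```matlab
% Hz <-> Mel conversion, assuming the common 2595*log10(1 + f/700) form.
% hz2mel and mel2hz are hypothetical helper names used only in this sketch.
hz2mel = @(f) 2595 * log10(1 + f / 700);
mel2hz = @(m) 700 * (10 .^ (m / 2595) - 1);

f = [100 500 1000 4000];      % frequencies in Hz
m = hz2mel(f);                % corresponding Mel values
f_back = mel2hz(m);           % round trip back to Hz

% Centre frequencies of 20 filters spaced evenly on the Mel scale (0-4000 Hz).
% 22 points are generated because the first and last are the band edges.
centres = mel2hz(linspace(hz2mel(0), hz2mel(4000), 22));
centres = centres(2:end-1);   % drop the two band edges
```

Because the spacing is uniform in Mel rather than in Hz, the centres crowd together at low frequencies and spread out at high frequencies, which mirrors the ear's finer resolution at the low end.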
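c. Vector quantization (VQ)
The system uses vector quantization to compress the MFCC features extracted from speech.
Vector quantization (VQ) is a lossy data compression method based on block coding. Quantization is also a step in multimedia compression formats such as JPEG and MPEG-4, although baseline JPEG quantizes each coefficient separately rather than as a vector. The basic idea of VQ is to group several scalar values into a vector and quantize that vector as a whole in the vector space, which compresses the data while losing little information.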
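The idea of replacing many feature vectors with a small codebook can be sketched with plain k-means, as below. This is only an illustration under stated assumptions: the data are synthetic two-dimensional points standing in for MFCC frames, the codebook size `K = 2` is arbitrary, and the original project may well build its codebook differently (for example with the LBG splitting algorithm).

```matlab
% Vector quantization sketch: compress N feature vectors into K codewords
% with plain k-means. The data are synthetic stand-ins for MFCC frames.
t = (1:100)';
A = [cos(t), sin(t)];             % 100 points on the unit circle
X = [A; A + 6];                   % N-by-D matrix: two well-separated groups
K = 2;                            % codebook size (assumed)

% Initialise the codebook with K vectors spread across the data set
N = size(X, 1);
C = X(round(linspace(1, N, K)), :);

for iter = 1:20
    % squared Euclidean distance from every vector to every codeword
    d = zeros(N, K);
    for k = 1:K
        diffk = bsxfun(@minus, X, C(k, :));
        d(:, k) = sum(diffk .^ 2, 2);
    end
    [~, idx] = min(d, [], 2);     % index of the nearest codeword per vector
    % move each codeword to the mean of the vectors assigned to it
    for k = 1:K
        if any(idx == k)
            C(k, :) = mean(X(idx == k, :), 1);
        end
    end
end

% Average squared error when each vector is replaced by its codeword
distortion = mean(min(d, [], 2));
```

After training, only the K rows of `C` need to be stored for a speaker instead of all N feature vectors, and `distortion` measures how much information that compression discards.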
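3 System Structure
The overall structure of the system is shown in the figure below; it comprises a training process and a recognition process:
– Training
The speech signal is first preprocessed, then MFCC feature parameters are extracted and compressed by vector quantization to obtain a codebook for the speaker's utterance. The same speaker says the same content several times, and repeating this training process builds up a codebook library.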
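The text does not say what the preprocessing consists of. A typical front end before MFCC extraction applies pre-emphasis, splits the signal into overlapping frames, and windows each frame; the sketch below shows those steps on a synthetic tone. The pre-emphasis coefficient 0.97, the frame length of 256 samples, and the 50% overlap are assumptions chosen for illustration, not values from the original project.

```matlab
% Typical front-end steps before MFCC extraction (assumed, not from the source):
% pre-emphasis, framing, and Hamming windowing.
Fs = 8000;                          % sampling rate in Hz, as in the source code
x = sin(2*pi*440*(0:Fs-1)'/Fs);     % one second of a 440 Hz tone as a stand-in signal

y = filter([1 -0.97], 1, x);        % pre-emphasis: y(n) = x(n) - 0.97*x(n-1)

frameLen = 256;                     % samples per frame (assumed)
hop = 128;                          % frame shift (assumed, 50% overlap)
nFrames = floor((numel(y) - frameLen) / hop) + 1;

n = (0:frameLen-1)';
w = 0.54 - 0.46*cos(2*pi*n/(frameLen-1));   % Hamming window computed by hand

frames = zeros(frameLen, nFrames);  % one windowed frame per column
for k = 1:nFrames
    start = (k-1)*hop + 1;
    frames(:, k) = y(start:start+frameLen-1) .* w;
end
```

Each column of `frames` would then go through an FFT, a Mel filter bank, a logarithm, and a DCT to produce one MFCC vector.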
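– Recognition
During recognition, the speech signal is likewise preprocessed and its MFCC features are extracted; the Euclidean distance between these features and the codebooks in the training library is then computed. If the distance is below a chosen threshold, the speaker and the spoken content are judged to match the trained codebook library, and the match succeeds.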
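The matching step can be sketched as follows: for each enrolled codebook, average the Euclidean distance from every test feature vector to its nearest codeword, then accept only if the best average falls below the threshold. The variables `codebooks`, `feats`, and `threshold`, and all their values, are made up for this example; how the original system aggregates distances is not stated in the text.

```matlab
% Recognition sketch: average distance from test features to each codebook.
% codebooks, feats and threshold are illustrative values, not from the source.
codebooks = { [0 0; 1 1], [5 5; 6 6] };   % one K-by-D codebook per enrolled entry
feats = [0.1 -0.1; 0.9 1.2; 0.2 0.1];     % T-by-D feature vectors of the test utterance
threshold = 0.5;                          % acceptance threshold (assumed)

score = zeros(1, numel(codebooks));
for s = 1:numel(codebooks)
    C = codebooks{s};
    dmin = zeros(size(feats, 1), 1);
    for t = 1:size(feats, 1)
        diffs = bsxfun(@minus, C, feats(t, :));
        dmin(t) = min(sqrt(sum(diffs .^ 2, 2)));   % distance to the nearest codeword
    end
    score(s) = mean(dmin);                % average distortion for this codebook
end

[best, who] = min(score);                 % closest codebook and its score
accepted = best < threshold;              % accept only if it is close enough
if accepted
    disp('Password correct');
else
    disp('Password incorrect');
end
```

The threshold trades false accepts against false rejects, so in practice it would be tuned on recordings from both enrolled and non-enrolled speakers.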
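4 Test Experiments
The system displays "Password correct" only when both the speaker and the spoken content match the codebook library; otherwise it displays "Password incorrect". This provides the voiceprint-lock functionality.
II. Source Code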
function varargout = GUI(varargin)
gui_Singleton = 1;
gui_State = struct('gui_Name', mfilename, ...
'gui_Singleton', gui_Singleton, ...
'gui_OpeningFcn', @GUI_OpeningFcn, ...
'gui_OutputFcn', @GUI_OutputFcn, ...
'gui_LayoutFcn', [] , ...
'gui_Callback', []);
if nargin && ischar(varargin{1})
gui_State.gui_Callback = str2func(varargin{1});
end
if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
% --- Executes just before GUI is made visible.
function GUI_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% varargin command line arguments to GUI (see VARARGIN)
% Choose default command line output for GUI
handles.output = hObject;
% Update handles structure
guidata(hObject, handles);
% UIWAIT makes GUI wait for user response (see UIRESUME)
% uiwait(handles.figure1);
% --- Outputs from this function are returned to the command line.
function varargout = GUI_OutputFcn(hObject, eventdata, handles)
% Get default command line output from handles structure
varargout{1} = handles.output;
% --- Executes on button press in trainrec.
function trainrec_Callback(hObject, eventdata, handles)
% trainrec() is defined elsewhere in the project and returns the speaker number
speaker_id = trainrec();
set(handles.train_current,'string','Hurray, DONE!');
speaker_iden = sprintf('you are speaker number %d', speaker_id);
% set(handles.speaker,'string',speaker_iden);
set(handles.access,'BackgroundColor','blue');
set(handles.access,'string','YOU HAVE ACCESS, TRAIN COMMANDS NOW!');
% if access_ == 1
% set(handles.access,'string','YOU HAVE ACCESS, TRAIN COMMANDS NOW!');
% else
% set(handles.access,'string','YOU DONT HAVE ACCESS,SPEAKER NOT RECOGNIZED!');
% end
% --- Executes on button press in command.
function command_Callback(hObject, eventdata, handles)
% Recognize spoken commands with the trained network stored in backp.mat
% (V: input-to-hidden weights, W: hidden-to-output weights).
load backp.mat W V;
Nseconds = 1;                 % length of each recording in seconds
recObj = audiorecorder;       % default settings: 8000 Hz, 8-bit, mono
labels = {'Ahead', 'Stop', 'Back', 'Left', 'Right'};   % order of the output neurons
lamps = [handles.ahead, handles.stop, handles.back, handles.left, handles.right];
while true   % as in the original, runs until interrupted with Ctrl+C
    fprintf('Say any word immediately after hitting Enter\n');
    input('');
    recordblocking(recObj, Nseconds);
    x = getaudiodata(recObj);
    % 12th-order LPC gives 13 coefficients, matching the 13 input nodes
    kk = lpc(x, 12);
    z = double(kk(:)) / max(kk);   % scale by the largest coefficient
    % forward pass through the two-layer network
    y = tansig(V*z);
    o = tansig(W*y);
    % the output neuron with the largest activation selects the command
    [~, k] = max(o);
    disp(upper(labels{k}));
    set(lamps(k), 'BackgroundColor', 'green');
    set(handles.command, 'string', labels{k});
    pause(2);
    % reset all indicators before the next recording
    set(lamps, 'BackgroundColor', 'white');
end
function traincommands()
% Build the LPC training set for the five command words and prepare the
% normalized inputs Z and the target matrix D for the back-propagation network.
samp = 6;     % recordings per word
words = 5;    % number of command words
base = 'C:\Users\Rezetane\Desktop\HRI Proj\Speech-Recognition-master\data\train_commands\';
% folder order defines the output-neuron order: ahead, stop, back, left, right
names = {'ahead', 'stop', 'back', 'left', 'right'};
s = zeros(words*samp, 13);
for w = 1:words
    for i = 1:samp
        filename = fullfile(base, names{w}, sprintf('s%d.wav', i));
        fprintf('Reading %s\n', filename);
        x = audioread(filename);
        % 12th-order LPC: 13 coefficients per recording
        s((w-1)*samp + i, :) = lpc(x, 12);
    end
end
% Interleave the rows so the words cycle ahead, stop, back, left, right
S = zeros(words*samp, 13);
for i = 1:samp
    for w = 1:words
        S((i-1)*words + w, :) = s((w-1)*samp + i, :);
    end
end
save speechp.mat S
trai_pairs = 30;   % 5 words x 6 recordings
out_neurons = 5;   % number of words
hid_neurons = 6;   % hidden-layer size
in_nodes = 13;     % 13 LPC coefficients per input
eata = 0.1; emax = 0.001; q = 1; e = 0; lamda = .7;
t = trai_pairs/out_neurons;
Z = double(S / max(S(:)));   % scale by the largest coefficient in the set
% Bipolar one-hot targets: +1 for the spoken word, -1 elsewhere
dummy = 2*eye(out_neurons) - 1;
D = repmat(dummy, samp, 1);  % one target row per training pair
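The listing stops after the inputs `Z` and targets `D` are prepared, so the training loop that produces `W` and `V` is not shown. The sketch below is one plausible way to train a 13-6-5 network by online back-propagation, not the author's code. It assumes `tanh` in place of `tansig` (the two are mathematically equivalent, and `tanh` needs no toolbox), uses synthetic stand-in data instead of the LPC features, picks an arbitrary deterministic weight initialisation and epoch limit, and ignores `lamda`, whose role the listing does not reveal.

```matlab
% Back-propagation sketch for a 13-6-5 tanh network (assumed training loop;
% the original listing ends before this step). Z and D are synthetic here.
in_nodes = 13; hid_neurons = 6; out_neurons = 5;
eata = 0.1;  emax = 0.001;                      % values taken from the listing
Z = repmat(eye(out_neurons, in_nodes), 6, 1);   % 30 stand-in input rows
D = repmat(2*eye(out_neurons) - 1, 6, 1);       % bipolar one-hot targets

% Small deterministic initial weights (an arbitrary choice for this sketch)
V = 0.1 * sin(reshape(1:hid_neurons*in_nodes, hid_neurons, in_nodes));
W = 0.1 * cos(reshape(1:out_neurons*hid_neurons, out_neurons, hid_neurons));

maxEpochs = 200;                                % assumed upper limit
Ehist = zeros(1, maxEpochs);
for epoch = 1:maxEpochs
    E = 0;
    for p = 1:size(Z, 1)
        z = Z(p, :)';  d = D(p, :)';
        y = tanh(V*z);                          % hidden-layer output
        o = tanh(W*y);                          % network output
        err = d - o;
        E = E + 0.5 * sum(err .^ 2);            % accumulate squared error
        delta_o = err .* (1 - o .^ 2);          % output-layer error signal
        delta_y = (W' * delta_o) .* (1 - y .^ 2);   % back-propagated to hidden layer
        W = W + eata * delta_o * y';            % gradient-descent updates
        V = V + eata * delta_y * z';
    end
    Ehist(epoch) = E;
    if E < emax                                 % stop once the error is small enough
        Ehist = Ehist(1:epoch);
        break
    end
end
% save backp.mat W V   % would store the weights that command_Callback loads
```

Plotting `Ehist` shows whether the error is falling; if it stalls, the learning rate `eata` or the hidden-layer size are the first things to adjust.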
III. Results
IV. Notes
Version: MATLAB 2014a