Acoustic modelling from the signal domain using CNNs

Posted 2021-10-28 dream-and-truth

3. Neural network architecture

This section describes the network architecture used in the paper and its key features. It first introduces two new types of layer: the network-in-network (NIN) nonlinearity and the statistics extraction layer.

3.1 Network-in-Network nonlinearity

[Figure 1]

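As Figure 1 shows, the NIN nonlinearity is a many-to-many nonlinearity built from two block-diagonal matrices. Within a single layer, all NIN blocks share their parameters and operate on non-overlapping parts of the input.
Inside each NIN block, the transformation block \(U_1\) is an \(m\times k\) matrix that maps an input of size \(m\) into a higher-dimensional space of dimension \(k\), followed by a ReLU nonlinearity; \(U_2\) is a \(k\times n\) matrix that maps the resulting \(k\)-dimensional vector into an \(n\)-dimensional space, followed by another ReLU. The paper calls these NIN blocks "micro neural network blocks".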
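To make the shapes concrete, here is a minimal sketch of one NIN layer in plain Python. It follows only the description above (two matrices, each followed by a ReLU, shared across non-overlapping chunks of the input). The sizes `m = 4`, `k = 16`, `n = 2`, the random weights, and the helper names `vec_mat`, `nin_block`, `nin_layer` and `random_matrix` are assumptions made for illustration, not values or code from the paper.

```python
import random


def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def vec_mat(x, U):
    """Row vector x (length r) times matrix U (r rows, c columns)."""
    rows, cols = len(U), len(U[0])
    assert len(x) == rows
    return [sum(x[i] * U[i][j] for i in range(rows)) for j in range(cols)]


def nin_block(x, U1, U2):
    """One micro block: m -> k (ReLU) -> n (ReLU)."""
    hidden = relu(vec_mat(x, U1))     # size m -> size k
    return relu(vec_mat(hidden, U2))  # size k -> size n


def nin_layer(layer_input, U1, U2):
    """Apply the same (U1, U2) to each non-overlapping chunk of size m."""
    m = len(U1)
    assert len(layer_input) % m == 0
    out = []
    for start in range(0, len(layer_input), m):
        out.extend(nin_block(layer_input[start:start + m], U1, U2))
    return out


def random_matrix(rows, cols, rng):
    """Hypothetical initialisation: uniform weights in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
m, k, n = 4, 16, 2            # illustrative sizes only
U1 = random_matrix(m, k, rng)
U2 = random_matrix(k, n, rng)

layer_input = [rng.uniform(-1.0, 1.0) for _ in range(3 * m)]  # 3 chunks
layer_output = nin_layer(layer_input, U1, U2)
print(len(layer_input), "->", len(layer_output))
```

With three chunks of size `m`, the layer output has `3 * n` entries, because every chunk passes through the same pair of matrices.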

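If the NIN blocks share their weights within a layer, each column of \(U_1\) can be interpreted as a one-dimensional convolution kernel of size \(m\), applied with a stride of \(m\).
My understanding of this, with \(x\) a row vector of length \(m\) (one input chunk) and \(u_i\) the \(i\)-th column of \(U_1\):
\[ x \cdot U_{(m,k)} = x \cdot [u_1, u_2, \cdots, u_k] = [x \cdot u_1, x \cdot u_2, \cdots, x \cdot u_k] \]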
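The equivalence can be checked numerically. The sketch below multiplies each non-overlapping chunk of a signal by an \(m \times k\) matrix, then repeats the computation as \(k\) separate 1-D convolutions (implemented as cross-correlation, with no kernel flip) of kernel size \(m\) and stride \(m\), and compares the two results. The sizes, the random data and the function names `conv1d` and `blockwise_matmul` are assumptions made for illustration.

```python
import random


def conv1d(signal, kernel, stride):
    """Valid 1-D cross-correlation of signal with kernel at the given stride."""
    m = len(kernel)
    return [
        sum(signal[s + i] * kernel[i] for i in range(m))
        for s in range(0, len(signal) - m + 1, stride)
    ]


def blockwise_matmul(signal, U):
    """Multiply each non-overlapping length-m chunk of signal by U (m x k)."""
    m, k = len(U), len(U[0])
    return [
        [sum(signal[s + i] * U[i][j] for i in range(m)) for j in range(k)]
        for s in range(0, len(signal) - m + 1, m)
    ]


rng = random.Random(0)
m, k, num_blocks = 4, 3, 5    # illustrative sizes only
U = [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(m)]
signal = [rng.uniform(-1.0, 1.0) for _ in range(m * num_blocks)]

# View 1: shared matrix applied to every chunk (num_blocks rows, k columns).
block_out = blockwise_matmul(signal, U)

# View 2: column j of U used as a kernel of size m with stride m.
conv_out = [
    conv1d(signal, [U[i][j] for i in range(m)], stride=m)
    for j in range(k)
]

max_diff = max(
    abs(block_out[b][j] - conv_out[j][b])
    for b in range(num_blocks)
    for j in range(k)
)
print("largest difference between the two views:", max_diff)
```

Entry `block_out[b][j]` is the response of chunk `b` to column `j`, which is the same number as position `b` in the output of kernel `j`.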

[Figure 2]

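Figure 2 compares how the objective function converges for the network proposed in the paper and for the MFCC-based baseline system. The objective of the proposed network converges faster, and its value after convergence is also better.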
