Face Recognition and Speech Synthesis Based on the Baidu AI Open Platform
Posted by jcdjor
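A Project on Face Recognition and Speech Synthesis with Baidu AI
Project requirements
(1) Face recognition
A photo of a person is uploaded from the web page. The Java back end receives the image, decodes it, and calls the cloud platform API to detect facial attributes. The platform returns the person's age, gender, attractiveness score, and similar information, which is sent back to the web page for display.
(2) Face comparison
Two photos are uploaded from the web page. The Java back end receives and decodes them, calls the cloud platform API to compare the faces, and returns a similarity score.
(3) Speech recognition
An audio file is uploaded from the web page. The back end checks the file format and transcodes the file if it is not WAV, then calls the platform API to recognize it and returns the recognized text to the web page for display.
(4) Speech synthesis
Text and a voice type are submitted from the web page. After receiving them, the back end calls the platform API to generate audio data and converts the data into an MP3 file that can be downloaded from the web page.
Project design
The project uses a client / server / platform architecture. The client displays the feature pages, uploads data, and shows the processing results. The server receives data from the client, transcodes it, calls the platform APIs, and returns the results. The platform receives data from the server and performs face recognition, face comparison, speech recognition, and speech synthesis.
Overall architecture
Overall logic
Front-end design (home page, face detection, face comparison, speech recognition, and speech synthesis)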
index.html
<!DOCTYPE html>
<html lang="zh-cn">
<head>
<meta charset="UTF-8">
<title>人工智能 未来已来</title>
<link rel="stylesheet" href="css/button.min.css" />
<link rel="stylesheet" href="css/style.css" />
<script type='text/javascript' src='js/jquery-1.11.1.min.js'></script>
<script type='text/javascript' src='js/jquery.particleground.min.js'></script>
<script type='text/javascript' src='js/ai.js'></script>
</head>
<body>
<div id="context">
<div class="intro">
<div class="position">
<h1>人工智能 未来已来</h1>
<!--start button, nothing above this is necessary -->
<div class="svg-wrapper">
<svg height="140" width="450" xmlns="http://www.w3.org/2000/svg">
<rect id="shape" height="140" width="300" />
<div id="text">
<a href="face_recognition.html"><span class="spot"></span>人脸检测</a>
</div>
</svg>
</div>
<div class="svg-wrapper">
<svg height="140" width="450" xmlns="http://www.w3.org/2000/svg">
<rect id="shape" height="140" width="300" />
<div id="text">
<a href="face_match.html"><span class="spot"></span>人脸比对</a>
</div>
</svg>
</div>
<!--Next button -->
<div class="svg-wrapper">
<svg height="140" width="450" xmlns="http://www.w3.org/2000/svg">
<rect id="shape" height="140" width="300" />
<div id="text">
<a href="speech_recognition.html"><span class="spot"></span>语音识别</a>
</div>
</svg>
</div>
<!--Next button -->
<div class="svg-wrapper">
<svg height="140" width="450" xmlns="http://www.w3.org/2000/svg">
<rect id="shape" height="140" width="300" />
<div id="text">
<a href="speech_produce.html"><span class="spot"></span>语音合成</a>
</div>
</svg>
</div>
<!--End button -->
</div>
</div>
</div>
</body>
</html>
face_recognition.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>人脸识别</title>
<meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1,maximum-scale=1,user-scalable=no" />
<script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script>
<script src="js/jquery.particleground.min.js" type="text/javascript" charset="utf-8"></script>
<script src="js/ai.js" type="text/javascript" charset="utf-8"></script>
<link rel="stylesheet" type="text/css" href="css/style.css" />
<style type="text/css">
.cent_bg {
width: 80%;
height: 22em;
border: 1px solid rgba(255, 255, 255, 0.5);
margin: auto;
padding: 3% 6%;
border-radius: 10%;
font-size: 1.5em;
line-height: 2em;
position: relative;
}
.cent {
margin-top: 1.5em;
overflow-y: auto;
height: 80%;
text-indent: 2em;
text-align: left;
padding-right: 0.2em;
}
::-webkit-scrollbar {
width: 10px;
background-color: rgba(255,255,255,0.2);
}
/* Scrollbar track: inset shadow + rounded corners */
::-webkit-scrollbar-track {
border-radius: 5px;
background-color: transparent;
}
/* Scrollbar thumb: inset shadow + rounded corners */
::-webkit-scrollbar-thumb {
border-radius: 10px;
background-color: rgba(255,255,255,0.7);
}
.btn{
margin-top: 20px;
width: 70%;
height:80px;
transition: 0.4s;
border-radius: 1em;
border: 1px solid rgba(255, 255, 255, 0.5);
border-top:none;
z-index: 66;
background: rgba(255, 255, 255, 0.2);
}
.btn:hover{
border: 1px solid rgba(255, 255, 255, 0.8) !important;
border-top:none !important;
}
.btn span{
width: 40%;
height: 84%;
margin-top:6px;
line-height: 38px;
display: inline-block;
border: 1px solid #009FFD;
color: #fff;
background: rgba(255,255,255,0.4);
border-radius: 4px;
cursor: pointer;
}
.btn span:nth-of-type(1){
margin-right: 15px;
}
.btn span:nth-of-type(2){
margin-left: 15px;
}
.img{
width: 70%;
height: 100%;
float: left;
position: relative;
}
.xinxi{
width: 30%;
height: 100%;
float: left;
}
.biankuang{
width: 92%;
height: 150%;
top: -100px;
position: absolute;
background: url(img/biankuang.png) no-repeat;
background-size: 100% 100%;
}
.xinxi{
text-align: left;
}
.xinxi span{
margin-top: 50px;
}
.xinxi span a{
color: #FFFFFF;
text-decoration: none;
}
#neirong{
color: red;
}
</style>
</head>
<body>
<div id="context">
<div class="intro">
<div class="cent_bg">
<div class="img">
<div class="biankuang">
<img src="img/img111.jpg" id='img' style="width: 83%;height: 55%;position: absolute;left: 49px;top: 150px;"/>
</div>
</div>
<div class="xinxi">
<span style="display: block;">性别:<a href="" id="sex"></a></span>
<span style="display: block;">年龄:<a href="" id="age"></a><span style="margin-left: 10px;">岁</span></span>
<span style="display: block;">表情:<a href="" id="expression"></a></span>
<span style="display: block;">颜值:<a href="" id="beauty"></a></span>
<span style="display: block;"><a href="" id="neirong"></a></span>
</div>
</div>
<div class="btn">
<form id="renlian" method="post">
<span style="position: relative;">提交图片
<input type="file" name="image" value="" id="uploading" onchange="test()" style="opacity: 0;width: 100%;position: absolute;height: 100%;display: block;top: 0px;" />
</span>
<span id="tijiao">开始识别</span>
</form>
</div>
</div>
</div>
</body>
</html>
<script type="text/javascript">
function test() {
var input = document.getElementById("uploading");
var file = input.files[0];
// Compare the extension of the selected file case-insensitively
var filePath = input.value;
var fileFormat = filePath.substring(filePath.lastIndexOf(".")).toLowerCase();
if(!/^\.(png|jpg|jpeg)$/.test(fileFormat)) {
alert('上传错误,文件格式必须为:png/jpg/jpeg');
return;
}
// Register the handler before starting the read, then preview the image
var fr = new FileReader();
fr.onload = function(e) {
document.getElementById("img").src = this.result;
};
fr.readAsDataURL(file);
}
$("#tijiao").click(function() {
$.ajax({
type: "post",
url: basePath + "/FaceDetect",
dataType: "json",
data: new FormData($('#renlian')[0]),
processData: false,
contentType: false,
beforeSend: function() {
uploading = true;
},
success: function(res) {
if(res.status=="200"){
$("#neirong").text("");
$("#sex").text(res.data.gender);
$("#age").text(res.data.age);
$("#expression").text(res.data.expression);
$("#beauty").text(res.data.beauty);
}else{
$("#neirong").text("无法识别");
}
},
error(xhr,status,error){
$("#neirong").text("后台服务异常");
return;
}
})
})
</script>
face_match.html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>人脸比对</title>
<script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script>
<script src="js/jquery.particleground.min.js" type="text/javascript" charset="utf-8"></script>
<script src="js/ai.js" type="text/javascript" charset="utf-8"></script>
<link rel="stylesheet" type="text/css" href="css/style.css" />
<style type="text/css">
#dianji:hover{
transition: 0.5s;
background: skyblue;
}
.sss{
display: inline-block;
line-height:100px ;
border-radius: 100%;
width: 200px;
margin: auto;
height: 200px;
border: 1px solid white;
margin-top: 10%;
}
</style>
</head>
<body>
<div id="context">
<div class="intro">
<div style="overflow: hidden; position: relative;top: 40%; overflow: hidden;margin: auto;text-align: center;">
<div style="border: 1px solid white; width: 30%; height: 400px; float: left;"><img id="ig1" src="" alt="" style="width: 100%; height: 100%"/></div>
<div class="sss">相似度</div>
<div style="border: 1px solid white; width: 30%; height: 400px; float: right;"><img id="ig2" src="" alt="" style="width: 100%; height: 100%;"/></div>
</div>
<form id="tttt" method="post" style="width: 80%; margin:40px auto;">
<div style="width: 40%;float: left; position: relative;">
<span style="position: absolute;left: 0px;background: skyblue;display: inline-block;height: 40px;line-height: 40px;width: 160px;">
点击上传
</span>
<input style="opacity: 0; position: absolute;left: 0px; height: 40px;" id="uploading" type="file" onchange="upload1()" name="image1">
</div>
<div style="width: 40%;float: right;position: relative;">
<span id="" style="position: absolute;right: 0px;background: skyblue;display: inline-block;height: 40px;line-height: 40px;width: 160px;">
点击上传
</span>
<input id="up" type="file" onchange="upload2()" name="image2" style="opacity: 0; position: absolute;right: 0px; height: 40px;" >
</div>
</form>
<div id="dianji" style="border: 1px solid white;width: 140px;height: 40px;line-height: 40px;border-radius: 15px; margin: auto;">开始比对</div>
</div>
</div>
</body>
<script type="text/javascript">
function upload1() {
var input = document.getElementById("uploading");
var file = input.files[0];
// Compare the extension of the selected file case-insensitively
var filePath = input.value;
var fileFormat = filePath.substring(filePath.lastIndexOf(".")).toLowerCase();
if(!/^\.(png|jpg|jpeg)$/.test(fileFormat)) {
alert('上传错误,文件格式必须为:png/jpg/jpeg');
return;
}
// Register the handler before starting the read, then preview the image
var fr = new FileReader();
fr.onload = function(e){
document.getElementById("ig1").src = this.result;
};
fr.readAsDataURL(file);
}
function upload2() {
var input = document.getElementById("up");
var file = input.files[0];
// Compare the extension of the selected file case-insensitively
var filePath = input.value;
var fileFormat = filePath.substring(filePath.lastIndexOf(".")).toLowerCase();
if(!/^\.(png|jpg|jpeg)$/.test(fileFormat)) {
alert('上传错误,文件格式必须为:png/jpg/jpeg');
return;
}
// Register the handler before starting the read, then preview the image
var fr = new FileReader();
fr.onload = function(e){
document.getElementById("ig2").src = this.result;
};
fr.readAsDataURL(file);
}
$(function() {
$("#dianji").click(function() {
$.ajax({
url: basePath+"/FaceMatch",
type: 'post',
cache: false,
data: new FormData($('#tttt')[0]),
processData: false,
contentType: false,
dataType: "json",
beforeSend: function() {
uploading = true;
},
success: function(data) {
if(data.status==200){
document.querySelector(".sss").innerHTML="相似度<br>"+data.data.score;
}else{
$(".sss").html("相似度<br><span style='color:red'>"+data.msg+"</span>");
}
},
error(xhr,status,error){
$(".sss").html("相似度<br><span style='color:red'>后台服务异常</span>");
return;
}
});
});
});
</script>
</html>
speech_recognition.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>语音识别</title>
<meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1,maximum-scale=1,user-scalable=no" />
<script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script>
<script src="js/jquery.particleground.min.js" type="text/javascript" charset="utf-8"></script>
<script src="js/ai.js" type="text/javascript" charset="utf-8"></script>
<link rel="stylesheet" type="text/css" href="css/style.css" />
<style type="text/css">
.cent_bg {
width: 80%;
height: 22em;
border: 1px solid rgba(255, 255, 255, 0.5);
margin: auto;
padding: 3% 6%;
border-radius: 10%;
font-size: 1.5em;
line-height: 2em;
position: relative;
}
.cent {
margin-top: 1.5em;
overflow-y: auto;
height: 80%;
text-indent: 2em;
text-align: left;
padding-right: 0.2em;
}
::-webkit-scrollbar {
width: 10px;
background-color: rgba(255, 255, 255, 0.2);
}
/* Scrollbar track: inset shadow + rounded corners */
::-webkit-scrollbar-track {
border-radius: 5px;
background-color: transparent;
}
/* Scrollbar thumb: inset shadow + rounded corners */
::-webkit-scrollbar-thumb {
border-radius: 10px;
background-color: rgba(255, 255, 255, 0.7);
}
#ttttt{
display: none;
}
.btn {
margin-top: 20px;
width: 70%;
height: 80px;
transition: 0.4s;
border-radius: 1em;
border: 1px solid rgba(255, 255, 255, 0.5);
border-top: none;
z-index: 66;
overflow: hidden;
position: relative;
background: rgba(255, 255, 255, 0.2);
}
.btn:hover {
border: 1px solid rgba(255, 255, 255, 0.8) !important;
border-top: none !important;
}
.btn span {
width: 40%;
height: 84%;
margin-top: 6px;
line-height: 38px;
display: inline-block;
border: 1px solid #009FFD;
color: #fff;
background: rgba(255, 255, 255, 0.4);
border-radius: 4px;
cursor: pointer;
}
.btn span:nth-of-type(1) {
margin-right: 15px;
position: relative;
}
.btn span:nth-of-type(2) {
margin-left: 15px;
}
input{
width: 100%;
height: 100%;
border: 1px solid red;
position: absolute;
top: 0;
left: 0;
opacity: 0;
}
.img{
width: 60px;
height: 60px;
margin: auto;
text-align: center;
position: relative;
}
.img img{
width: 100%;
height: 100%;
position: absolute;
top: 0;
left: 0;
}
.mmm{
width: 100%;
height: 100%;
position: absolute;
top: 0;
left: 0;
z-index: 999;
display: none;
cursor: pointer;
background: #b9fff4;
color: aqua;
line-height: 80px;
}
</style>
</head>
<body>
<div id="context">
<div class="intro">
<div class="cent_bg">
<h3>语音识别内容:</h3>
<div class="cent">
</div>
</div>
<div class="btn">
<span>上传文件
</span>
<span>开始识别</span>
<div class="mmm">
正在识别...
</div>
</div>
<form id="ttttt" action="" method="post">
<input type="file" name="voice" value="">
</form>
</div>
</div>
</body>
</html>
<script type="text/javascript">
$(function() {
$(".btn span:nth-of-type(1)").click(function() {
$('#ttttt input[name="voice"]').click();
})
$(".btn span:nth-of-type(2)").click(function() {
if(document.querySelector("input").value==""){
return alert("未选择文件");
}
$(".mmm").css("display","block");
$(".cent").html('<div class="img"><img src="img/timg.gif"/></div>');
$.ajax({
url: basePath + "/VoiceRecognize",
type: 'post',
cache: false,
data: new FormData($('#ttttt')[0]),
processData: false,
contentType: false,
dataType: "json",
beforeSend: function() {
uploading = true;
},
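success: function(data) {
// Hide the "recognizing" overlay whatever the outcome
$(".mmm").css("display","none");
if (data.status=="200") {
$(".cent").text(data.data.text);
} else{
$(".cent").text("未能识别 "+data.msg);
}
},
error: function(xhr, status, error) {
// Without this handler the overlay would stay visible after a failed request
$(".mmm").css("display","none");
$(".cent").text("后台服务异常");
}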
});
});
});
</script>
speech_produce.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>语音合成</title>
<meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1,maximum-scale=1,user-scalable=no" />
<script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script>
<script src="js/jquery.particleground.min.js" type="text/javascript" charset="utf-8"></script>
<script src="js/ai.js" type="text/javascript" charset="utf-8"></script>
<link rel="stylesheet" type="text/css" href="css/style.css" />
<style type="text/css">
#error {
width: 60%;
height: 120px;
line-height: 120px;
text-align: center;
font-size: 1.5em;
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -65%);
z-index: 999;
}
.cent_bg {
width: 80%;
height: 22em;
border: 1px solid rgba(255, 255, 255, 0.5);
margin: auto;
padding: 3% 6%;
border-radius: 10%;
font-size: 1.5em;
line-height: 2em;
position: relative;
}
.cent {
margin-top: 1em;
width: 100%;
background: rgba(0, 0, 0, .5);
overflow-y: auto;
height: 80%;
color: white;
font-size: 1.3em;
text-indent: 1em;
padding: .8em;
border: none;
resize: none;
outline-color: white;
}
::-webkit-scrollbar {
width: 10px;
background-color: rgba(255, 255, 255, 0.2);
}
/* Scrollbar track: inset shadow + rounded corners */
::-webkit-scrollbar-track {
border-radius: 5px;
background-color: transparent;
}
/* Scrollbar thumb: inset shadow + rounded corners */
::-webkit-scrollbar-thumb {
border-radius: 10px;
background-color: rgba(255, 255, 255, 0.7);
}
.btn {
width: 70%;
height: 40px;
transition: 0.4s;
border-bottom-left-radius: 1em;
border-bottom-right-radius: 1em;
border: 1px solid rgba(255, 255, 255, 0);
border-top: none;
z-index: 66;
background: rgba(255, 255, 255, 0.2);
padding: 5px 0px;
}
.btn #btnSub:hover {
border: 1px solid rgba(255, 255, 255) !important;
color: white;
}
.btn #btnSub {
width: 40%;
height: 100%;
line-height: 25px;
display: inline-block;
border: 1px solid #009FFD;
color: #fff;
background: rgba(255, 255, 255, 0.4);
border-radius: 4px;
font-size: 1.1em;
cursor: pointer;
}
#select {
margin: auto;
margin-top: 10px;
width: 70%;
height: 30px;
border-top-left-radius: 1em;
border-top-right-radius: 1em;
transition: 0.4s;
border-top: none;
z-index: 66;
background: rgba(255, 255, 255, 0.2);
}
#selectBox {
width: 70%;
height: 30px;
margin: 0 auto;
}
#mantext,
#womantext {
font-size: 1.25em;
line-height: 30px;
float: left;
}
#mantext {
text-align: center;
}
#woman {}
.inputBox {
width: 50%;
height: 30px;
float: left;
}
#man,
#woman {
margin-top: 6px;
display: block;
float: left;
width: 20px;
height: 20px;
}
</style>
</head>
<body>
<div id="error">请在此输入文本内容</div>
<div id="context">
<div class="intro">
<form method="post" enctype="multipart/form-data" id="Voice">
<div class="cent_bg">
<h3>请输入要合成的语音文本:</h3>
<textarea id="Content" class="cent" name="text"></textarea>
</div>
<div id="select">
<div id="selectBox">
<div class="inputBox">
<label id="mantext" for="man" style="float: right;">男声</label><input id="man" type="radio" name="voiceType" checked value="1" style="float: right;" />
</div>
<div class="inputBox">
<input id="woman" type="radio" name="voiceType" value="2" /><label id="womantext" for="woman">女声</label>
</div>
</div>
</div>
<div class="btn">
<button id="btnSub" type="submit">合成语音</button>
</div>
</form>
</div>
</div>
</body>
<script>
$(function() {
var cor=0;
var stop=setInterval(function(){
$('#error').fadeToggle(700);
cor++;
if(cor==3){
clearInterval(stop);
}
},700);
$('#btnSub').on('click', function() {
cor=0;
var Text = $('#Content').val();
if(Text.length == 0) {
var stop=setInterval(function(){
$('#error').fadeToggle(700);
cor++;
if(cor==4){
clearInterval(stop);
}
},700);
return false;
}
});
// Server request URL
$('#Voice').attr('action', basePath+"/VoiceGen");
});
</script>
</html>
ai.js
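// Base URL of the back-end server (the scheme is required; without it the browser treats the value as a relative path)
var basePath = "http://127.0.0.1:8080/AIProject";
// Background effect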
$(document).ready(function() {
$('#context').particleground({
dotColor: '#5cbdaa',
lineColor: '#5cbdaa'
});
$('.intro').css({
'margin-top': -($('.intro').height() / 2)
});
});
style.css
/********* CSS reset *********/
html,
body,
div,
span,
applet,
object,
iframe,
h1,
h2,
h3,
h4,
h5,
h6,
p,
blockquote,
pre,
a,
abbr,
acronym,
address,
big,
cite,
code,
del,
dfn,
em,
img,
ins,
kbd,
q,
s,
samp,
small,
strike,
strong,
sub,
sup,
tt,
var,
b,
u,
i,
center,
dl,
dt,
dd,
ol,
ul,
li,
fieldset,
form,
label,
legend,
table,
caption,
tbody,
tfoot,
thead,
tr,
th,
td,
article,
aside,
canvas,
details,
embed,
figure,
figcaption,
footer,
header,
hgroup,
menu,
nav,
output,
ruby,
section,
summary,
time,
mark,
audio,
video {
margin: 0;
padding: 0;
border: 0;
font-size: 100%;
font: inherit;
vertical-align: baseline;
}
article,
aside,
details,
figcaption,
figure,
footer,
header,
hgroup,
menu,
nav,
section {
display: block;
}
body {
line-height: 1;
}
ol,
ul {
list-style: none;
}
blockquote,
q {
quotes: none;
}
blockquote:before,
blockquote:after,
q:before,
q:after {
content: '';
content: none;
}
table {
border-collapse: collapse;
border-spacing: 0;
}
/* particleground demo */
*{
-webkit-box-sizing: border-box;
-moz-box-sizing: border-box;
box-sizing: border-box;
}
html,
body {
width: 100%;
height: 100%;
/*overflow: scroll;*/
}
/********* End of CSS reset *********/
body {
background: #202AA3;
font-family: 'Montserrat', sans-serif;
color: #fff;
line-height: 1.3;
-webkit-font-smoothing: antialiased;
}
#particles {
width: 100%;
height: 100%;
overflow: hidden;
}
.intro {
position: absolute;
left: 0;
top: 50%;
padding: 0 20px;
width: 100%;
text-align: center;
}
h1 {
text-transform: uppercase;
font-size: 85px;
font-weight: 700;
letter-spacing: 0.015em;
}
h1::after {
content: '';
width: 60%;
display: block;
background: #fff;
height: 10px;
margin: 30px auto;
line-height: 1.1;
}
p {
margin: 0 0 30px 0;
font-size: 24px;
}
.btn {
display: inline-block;
padding: 15px 30px;
border: 2px solid #fff;
text-transform: uppercase;
letter-spacing: 0.015em;
font-size: 18px;
font-weight: 700;
line-height: 1;
color: #fff;
text-decoration: none;
-webkit-transition: all 0.4s;
-moz-transition: all 0.4s;
-o-transition: all 0.4s;
transition: all 0.4s;
}
.btn:hover {
color: #005544;
border-color: #005544;
}
@media only screen and (max-width: 1000px) {
h1 {
font-size: 70px;
}
}
@media only screen and (max-width: 800px) {
h1 {
font-size: 48px;
}
h1::after {
height: 8px;
}
}
@media only screen and (max-width: 568px) {
.intro {
padding: 0 10px;
}
h1 {
font-size: 30px;
}
h1::after {
height: 6px;
}
p {
font-size: 18px;
}
.btn {
font-size: 16px;
}
}
@media only screen and (max-width: 320px) {
h1 {
font-size: 28px;
}
h1::after {
height: 4px;
}
}
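API specification
Data exchange format: JSON
Request data: in addition to each API's own request parameters, the following parameters must also be sent (otherwise a 403 status code is returned):
Response data format: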
{"status": "200","msg":"","data": {"namename":"user","password":"password"}}
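A server-side helper could produce the envelope shown above. The following is a minimal sketch, not the project's actual `ResponseData` class, whose source is not shown in this article. The names `EnvelopeSketch`, `render`, and `escape` are hypothetical, only string-valued `data` fields are handled, and the escaping covers only backslashes and double quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch of the response envelope; not the project's ResponseData class. */
final class EnvelopeSketch {

    /** Escapes backslashes and double quotes so a value can sit inside a JSON string. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Renders {"status":...,"msg":...,"data":{...}} for string-valued data fields. */
    static String render(String status, String msg, Map<String, String> data) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : data.entrySet()) {
            fields.add("\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"");
        }
        return "{\"status\":\"" + escape(status) + "\",\"msg\":\"" + escape(msg)
                + "\",\"data\":" + fields + "}";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order
        Map<String, String> data = new LinkedHashMap<>();
        data.put("gender", "male");
        data.put("age", "25");
        System.out.println(render("200", "", data));
    }
}
```

In a real service a JSON library would normally do the serialization; the hand-written version here only keeps the sketch free of dependencies.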
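(1) Face recognition
API name: FaceDetect
Request parameters:
Response parameters:
(2) Face comparison
API name: FaceMatch
Request parameters:
Response parameters:
(3) Speech recognition
API name: VoiceRecognize
Request parameters:
Response parameters:
(4) Speech generation
API name: VoiceGen
Response parameters:
An audio file in MP3 format
Notes on requests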
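- Request body format: set Content-Type to application/json and format the request body as JSON.
- Base64 encoding: images in a request must be Base64-encoded, that is, the image data is encoded as a string and that string is sent in place of an image URL. Obtain the image's binary data first, then encode it as Base64. Note that the Base64 string must not include the image header, such as data:image/jpg;base64,
- Image formats: PNG, JPG, JPEG, and BMP are currently supported; GIF is not.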
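The Base64 note above can be illustrated with the JDK's own `java.util.Base64`. This is a sketch under assumptions: `Base64Sketch`, `encode`, and `stripDataUriHeader` are hypothetical names and are not the project's `Base64Util`. It shows both encoding raw bytes and removing a `data:...;base64,` header from a string that already carries one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper; not the project's Base64Util. */
final class Base64Sketch {

    /** Encodes raw image bytes as Base64 text with no header in front. */
    static String encode(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    /** Removes a leading "data:...;base64," header if one is present. */
    static String stripDataUriHeader(String value) {
        String marker = ";base64,";
        if (value.startsWith("data:")) {
            int idx = value.indexOf(marker);
            if (idx >= 0) {
                return value.substring(idx + marker.length());
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real caller would pass the uploaded image content
        byte[] bytes = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encode(bytes));
        System.out.println(stripDataUriHeader("data:image/jpg;base64," + encode(bytes)));
    }
}
```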
Example code
1. Face recognition example code
// Configure the request parameters
HashMap<String, String> options = new HashMap<String, String>();
options.put("face_field", "age,gender,glasses,beauty,expression");
options.put("max_face_num", "2");
options.put("face_type", "LIVE");
// Convert the uploaded image to Base64
String image = Base64Util.part2Base64(imagePart);
String imageType = "BASE64";
// Call the API; it returns JSON data
JSONObject json = client.detect(image, imageType, options);
// Process the response data
Map<String, Object> map = new HashMap<>();
// Take the first entry of the face list
JSONObject result = json.getJSONObject("result").getJSONArray("face_list").getJSONObject(0);
// Response data: gender
JSONObject genderObj = result.getJSONObject("gender");
String genderStr = genderObj.getString("type");
if(genderObj.getDouble("probability") >= 0.6) {// convert the label only when the probability is at least 0.6
if("female".equals(genderStr)) {
genderStr = "女";
}else if("male".equals(genderStr)) {
genderStr = "男";
}
}
map.put("gender", genderStr);
// Return the API data
return ResponseData.success(map);
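The gender handling above follows a simple rule: the platform's label is converted to a display label only when the reported probability is at least 0.6, and is otherwise passed through unchanged. A minimal sketch of that rule as a standalone function follows. `GenderLabelSketch` and `label` are hypothetical names, and the display labels are supplied by the caller, so the ones used in `main` are placeholders.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the label-conversion rule used in the listing above. */
final class GenderLabelSketch {

    /** Threshold taken from the listing above. */
    static final double MIN_PROBABILITY = 0.6;

    /**
     * Returns the display label for a platform label, or the platform label
     * itself when the probability is too low or no display label is known.
     */
    static String label(String type, double probability, Map<String, String> labels) {
        if (probability < MIN_PROBABILITY) {
            return type;
        }
        return labels.getOrDefault(type, type);
    }

    public static void main(String[] args) {
        // Placeholder display labels chosen for this example only
        Map<String, String> labels = new HashMap<>();
        labels.put("female", "F");
        labels.put("male", "M");
        System.out.println(label("female", 0.93, labels));
        System.out.println(label("male", 0.41, labels));
    }
}
```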
2. Face comparison example code
// Convert both uploaded images to Base64
String image1 = Base64Util.part2Base64(imagePart1);
String image2 = Base64Util.part2Base64(imagePart2);
// Wrap the images in platform request objects
MatchRequest req1 = new MatchRequest(image1, "BASE64");
MatchRequest req2 = new MatchRequest(image2, "BASE64");
ArrayList<MatchRequest> requests = new ArrayList<MatchRequest>();
requests.add(req1);
requests.add(req2);
// Compare the faces
JSONObject json = client.match(requests);
// Process the response data
Map<String, Object> map = new HashMap<>();
// Similarity score; the front end reads it from data.score
double score = json.getJSONObject("result").getDouble("score");
map.put("score", score);
return ResponseData.success(map);
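The raw score returned above is usually turned into something a person can read. The sketch below formats a score for display and applies a decision threshold. It rests on assumptions that should be checked against the platform documentation: that the score lies between 0 and 100, and that 80 is a reasonable cut-off. `MatchScoreSketch` and its methods are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical helper for presenting a face-comparison score. */
final class MatchScoreSketch {

    /** Assumed cut-off on an assumed 0-100 scale; not taken from this article. */
    static final double ASSUMED_THRESHOLD = 80.0;

    /** Formats the score with two decimals and a percent sign. */
    static String format(double score) {
        return String.format(Locale.ROOT, "%.2f%%", score);
    }

    /** Returns true when the score reaches the given threshold. */
    static boolean likelySamePerson(double score, double threshold) {
        return score >= threshold;
    }

    public static void main(String[] args) {
        double score = 92.5;
        System.out.println(format(score));
        System.out.println(likelySamePerson(score, ASSUMED_THRESHOLD));
    }
}
```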
3. Speech recognition example code
// File type
String fileType = voicePart.getContentType();
if(fileType.endsWith("mp3")) {
fileType = MP3;
}else if(fileType.endsWith("wav")) {
fileType = WAV;
}else {
return ResponseData.fail("请上传mp3、wav音频");
}
try {
// Get the audio input stream
InputStream is = voicePart.getInputStream();
// Save a temporary audio file named after the current timestamp
String filename = new SimpleDateFormat("yyyyMMddHHmmssSSS").format(Calendar.getInstance().getTime());
File tmpVoice = new File(workspace + File.separator + filename+fileType);
VoiceUtil.saveVoiceFile(is, tmpVoice);
// Transcode MP3 uploads to WAV before recognition
if(fileType.equals(MP3)) {
File mp3File = tmpVoice;
tmpVoice = new File(tmpVoice.getPath().replace(MP3, WAV));
if(!VoiceUtil.mp3ToWav(mp3File, tmpVoice)) {
return ResponseData.fail("mp3音频文件错误,请用wav音频。");
}
mp3File.delete();
}
// Call the Baidu API
JSONObject json = client.asr(tmpVoice.getPath(), "wav", 16000, null);
tmpVoice.delete();
Integer status = json.getInt("err_no"); // status code
if(status != 0) {
// Handle an error response
String msg = json.getString("err_msg");
log.warn("百度接口调用响应异常,error_code:"+status + " error_msg:"+msg);
return ResponseData.fail(msg);
}
// Process the response data
Map<String, Object> map = new HashMap<>();
// Get the result list
JSONArray jsonArray = json.getJSONArray("result");
// Recognized text
String text = jsonArray.getString(0);
map.put("text", text);
return ResponseData.success(map);
} catch (Exception e) {
log.warn("识别音频文件错误:", e);
}
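The listing above decides the file type from the upload's content type. An alternative, shown here only as a sketch and not as what the project does, is to inspect the first bytes of the file. A WAV file begins with `RIFF` and has `WAVE` at byte offset 8. An MP3 file usually begins with an `ID3` tag or with an MPEG frame-sync pattern (eleven set bits). `AudioFormatSketch` and `detect` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical format check based on leading bytes instead of the content type. */
final class AudioFormatSketch {

    enum Format { WAV, MP3, UNKNOWN }

    /** Classifies audio data from its leading bytes. */
    static Format detect(byte[] head) {
        if (matches(head, 0, "RIFF") && matches(head, 8, "WAVE")) {
            return Format.WAV;
        }
        if (matches(head, 0, "ID3")) {
            return Format.MP3;
        }
        // MPEG audio frame sync: 0xFF followed by a byte whose top three bits are set
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xE0) == 0xE0) {
            return Format.MP3;
        }
        return Format.UNKNOWN;
    }

    /** Returns true when the ASCII text appears in data at the given offset. */
    private static boolean matches(byte[] data, int offset, String ascii) {
        byte[] expected = ascii.getBytes(StandardCharsets.US_ASCII);
        if (data.length < offset + expected.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] wavHead = {'R', 'I', 'F', 'F', 0, 0, 0, 0, 'W', 'A', 'V', 'E'};
        System.out.println(detect(wavHead));
    }
}
```

Checking the bytes does not depend on what the browser reports as the content type, which can vary between clients for the same file.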
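4. Speech synthesis example code
// Request parameters
HashMap<String, Object> options = new HashMap<String, Object>();
// Speed, 0-9; the default 5 is medium speed
options.put("spd", "5");
// Pitch, 0-9; the default 5 is medium pitch
options.put("pit", "5");
// Voice selection: 0 = female, 1 = male, 3 = emotional synthesis (度逍遥), 4 = emotional synthesis (度丫丫); the default is the standard female voice
options.put("per", voiceType);
// Call the Baidu API
TtsResponse res = client.synthesis(text, "zh", 1, options);
byte[] data = res.getData();
return data;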
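The byte array returned above still has to reach the user as an MP3 file. How the project does this is not shown, so the following is only a sketch of one possible step: writing the bytes to a `.mp3` file with `java.nio.file.Files`. `Mp3SaveSketch`, `save`, and the file name are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that stores synthesized audio as an .mp3 file. */
final class Mp3SaveSketch {

    /** Writes the audio bytes to dir/baseName.mp3 and returns the file path. */
    static Path save(byte[] audio, Path dir, String baseName) throws IOException {
        if (audio == null || audio.length == 0) {
            // Treat missing data as an error instead of writing an empty file
            throw new IllegalArgumentException("no audio data");
        }
        Files.createDirectories(dir);
        Path target = dir.resolve(baseName + ".mp3");
        return Files.write(target, audio);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tts-sketch");
        // Stand-in bytes; a real caller would pass the synthesized audio
        Path file = save(new byte[] {1, 2, 3}, dir, "speech");
        System.out.println(file.getFileName() + " " + Files.size(file) + " bytes");
    }
}
```

The null check matters because a synthesis call that fails may not return any audio bytes; the exact failure behavior of the SDK should be confirmed in its documentation.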
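Demo
Source code download: https://github.com/jcdjor/AIProject
PS: Comments, suggestions, and downloads for study are all welcome. Some notes on the source code follow.
Copyright notice: as declared on 冷魅蘇's blog since April 24, 2019, the copyright of this article is shared by the author and CSDN. Reposting is welcome; if you use this article or its code, please credit it in a prominent place where it is used. For other questions or suggestions, leave a comment below, contact QQ (1414782205), or send email to [email protected].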