Is there a way to use the SpeechRecognizer API directly for speech input?

Posted: 2011-06-25 21:59:27

The Android Dev site gives an example of speech input using the built-in Google Speech Input Activity. That activity displays a pre-configured popup with a mic and passes its results back via onActivityResult().

My question: is there a way to use the SpeechRecognizer class directly to do speech input, without displaying the canned activity? This would let me build my own Activity for voice input.
Answer 1: Here is code that uses the SpeechRecognizer class (taken from here and here):
import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.view.View;
import android.view.View.OnClickListener;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.widget.Button;
import android.widget.TextView;
import java.util.ArrayList;
import android.util.Log;

public class VoiceRecognitionTest extends Activity implements OnClickListener {

    private TextView mText;
    private SpeechRecognizer sr;
    private static final String TAG = "MyStt3Activity";

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        Button speakButton = (Button) findViewById(R.id.btn_speak);
        mText = (TextView) findViewById(R.id.textView1);
        speakButton.setOnClickListener(this);
        sr = SpeechRecognizer.createSpeechRecognizer(this);
        sr.setRecognitionListener(new listener());
    }

    class listener implements RecognitionListener {
        public void onReadyForSpeech(Bundle params) {
            Log.d(TAG, "onReadyForSpeech");
        }
        public void onBeginningOfSpeech() {
            Log.d(TAG, "onBeginningOfSpeech");
        }
        public void onRmsChanged(float rmsdB) {
            Log.d(TAG, "onRmsChanged");
        }
        public void onBufferReceived(byte[] buffer) {
            Log.d(TAG, "onBufferReceived");
        }
        public void onEndOfSpeech() {
            Log.d(TAG, "onEndOfSpeech");
        }
        public void onError(int error) {
            Log.d(TAG, "error " + error);
            mText.setText("error " + error);
        }
        public void onResults(Bundle results) {
            Log.d(TAG, "onResults " + results);
            ArrayList<String> data = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            String str = "";
            for (int i = 0; i < data.size(); i++) {
                Log.d(TAG, "result " + data.get(i));
                str += data.get(i);
            }
            mText.setText("results: " + String.valueOf(data.size()));
        }
        public void onPartialResults(Bundle partialResults) {
            Log.d(TAG, "onPartialResults");
        }
        public void onEvent(int eventType, Bundle params) {
            Log.d(TAG, "onEvent " + eventType);
        }
    }

    public void onClick(View v) {
        if (v.getId() == R.id.btn_speak) {
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, "voice.recognition.test");
            intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
            sr.startListening(intent);
        }
    }
}
Define main.xml with a button, and grant the RECORD_AUDIO permission in the manifest.
Comments:

- Came across this while searching for something else. Although it's an old question, I think posting an answer will help others. Copied from ***.com/questions/6316937/… :)
- It always outputs 5 or 4 or error 7.
- Result-5 should be accepted as the answer; it has been 3 years now.
- For clarity, granting the RECORD_AUDIO permission looks like <uses-permission android:name="android.permission.RECORD_AUDIO" />.
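The bare numbers in the comments above (error 5, 4, 7) are the SpeechRecognizer error constants passed to onError(). As a reference, here is a small plain-Java sketch that maps them to their names; the numeric values are hardcoded from the SpeechRecognizer documentation so it runs outside Android, and in app code you would compare against the constants themselves:

```java
// Maps SpeechRecognizer error codes to their constant names.
// Numeric values are taken from the Android SpeechRecognizer docs
// so this runs without the Android SDK; in an app, use the constants.
class SpeechErrors {
    public static String name(int error) {
        switch (error) {
            case 1: return "ERROR_NETWORK_TIMEOUT";
            case 2: return "ERROR_NETWORK";
            case 3: return "ERROR_AUDIO";
            case 4: return "ERROR_SERVER";
            case 5: return "ERROR_CLIENT";
            case 6: return "ERROR_SPEECH_TIMEOUT";
            case 7: return "ERROR_NO_MATCH";
            case 8: return "ERROR_RECOGNIZER_BUSY";
            case 9: return "ERROR_INSUFFICIENT_PERMISSIONS";
            default: return "UNKNOWN(" + error + ")";
        }
    }

    public static void main(String[] args) {
        // The code the commenters kept seeing:
        System.out.println(name(7));
    }
}
```

Logging the name instead of the raw integer makes failures like ERROR_NO_MATCH (nothing recognized) or ERROR_INSUFFICIENT_PERMISSIONS much easier to diagnose.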
Answer 2:
Also make sure to request the appropriate permission from the user at runtime. I was getting an error 9 return value, INSUFFICIENT_PERMISSIONS, even though I had the proper RECORD_AUDIO permission listed in the manifest.

Following the sample code here, I was able to get the permission from the user, and then the speech recognizer returned good responses.

For example, I placed this block in my activity's onCreate(), before calling any SpeechRecognizer methods, though it could sit elsewhere in the UI flow:
protected void onCreate(Bundle savedInstanceState) {
    ...
    if (ContextCompat.checkSelfPermission(this,
            Manifest.permission.RECORD_AUDIO)
            != PackageManager.PERMISSION_GRANTED) {

        // Should we show an explanation?
        if (ActivityCompat.shouldShowRequestPermissionRationale(this,
                Manifest.permission.RECORD_AUDIO)) {
            // Show an explanation to the user *asynchronously* -- don't block
            // this thread waiting for the user's response! After the user
            // sees the explanation, try again to request the permission.
        } else {
            // No explanation needed, we can request the permission.
            ActivityCompat.requestPermissions(this,
                    new String[]{Manifest.permission.RECORD_AUDIO},
                    527);
            // 527 is an app-defined int constant (the request code);
            // the callback method below gets the result of the request.
        }
    }
    ...
}
Then, in the activity, provide a callback method for the permission request:
@Override
public void onRequestPermissionsResult(int requestCode,
        String permissions[], int[] grantResults) {
    switch (requestCode) {
        case 527: {
            // If the request is cancelled, the result arrays are empty.
            if (grantResults.length > 0
                    && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                // Permission was granted: safe to start speech recognition.
            } else {
                // Permission denied: disable the functionality that
                // depends on this permission.
            }
            return;
        }
        // other 'case' lines to check for other
        // permissions this app might request
    }
}
One other thing I had to change in preetha's example code above is how the result text is retrieved in the onResults() method. To get the actual text of the recognized speech (rather than its size, which the original code prints), either print the value of the constructed string str or take one of the values in the ArrayList (data). For instance:
.setText(data.get(0));
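The list handed to onResults() contains candidate transcriptions ordered best-first, per the RESULTS_RECOGNITION documentation, so index 0 is the top hypothesis. A plain-Java sketch of extracting it safely (the helper itself is illustrative, not part of any Android API):

```java
import java.util.ArrayList;
import java.util.List;

// RESULTS_RECOGNITION delivers candidates ordered best-first,
// so index 0 is the recognizer's top hypothesis. Guard against
// a null or empty list before indexing.
class BestResult {
    public static String best(List<String> data) {
        return (data == null || data.isEmpty()) ? "" : data.get(0);
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        data.add("open settings");   // top hypothesis
        data.add("open sentence");   // lower-ranked alternative
        System.out.println(best(data)); // open settings
    }
}
```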
Answer 3: You can use the SpeechRecognizer class, though I am not aware of any sample code for it beyond this previous SO question. It is new in API level 8 (Android 2.2), however, so it was not widely usable at the time of writing.
Comments:

- I wrote a test app that tries to launch SpeechRecognizer.startListening(), with the listener methods implemented, but nothing happens.

Answer 4: You can do it like this:
import android.app.Activity
import androidx.appcompat.app.AppCompatActivity
import android.os.Bundle
import kotlinx.android.synthetic.main.activity_main.*
import android.widget.Toast
import android.content.ActivityNotFoundException
import android.speech.RecognizerIntent
import android.content.Intent

class MainActivity : AppCompatActivity() {

    private val REQ_CODE = 100

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        speak.setOnClickListener {
            val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "ar-JO") // Locale.getDefault()
            intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Need to speak")
            try {
                startActivityForResult(intent, REQ_CODE)
            } catch (a: ActivityNotFoundException) {
                Toast.makeText(applicationContext,
                        "Sorry, your device is not supported",
                        Toast.LENGTH_SHORT).show()
            }
        }
    }

    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        super.onActivityResult(requestCode, resultCode, data)
        when (requestCode) {
            REQ_CODE -> {
                if (resultCode == Activity.RESULT_OK && data != null) {
                    val result = data
                            .getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
                    println("result: $result")
                    text.text = result!![0]
                }
            }
        }
    }
}
The layout can be as simple as:
<?xml version = "1.0" encoding = "utf-8"?>
<RelativeLayout xmlns:android = "http://schemas.android.com/apk/res/android"
xmlns:app = "http://schemas.android.com/apk/res-auto"
xmlns:tools = "http://schemas.android.com/tools"
android:layout_width = "match_parent"
android:layout_height = "match_parent"
tools:context = ".MainActivity">
<LinearLayout
android:layout_width = "match_parent"
android:gravity = "center"
android:layout_height = "match_parent">
<TextView
android:id = "@+id/text"
android:textSize = "30sp"
android:layout_width = "wrap_content"
android:layout_height = "wrap_content"/>
</LinearLayout>
<LinearLayout
android:layout_width = "wrap_content"
android:layout_alignParentBottom = "true"
android:layout_centerInParent = "true"
android:orientation = "vertical"
android:layout_height = "wrap_content">
<ImageView
android:id = "@+id/speak"
android:layout_width = "wrap_content"
android:layout_height = "wrap_content"
android:background = "?selectableItemBackground"
android:src = "@android:drawable/ic_btn_speak_now"/>
</LinearLayout>
</RelativeLayout>
The other way you asked about is longer, but gives you more control and spares you Google's helper dialog:

1- First you need to add the permissions in the Manifest file:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO"/>

2- I combined all of the answers above as follows:

Create a RecognitionListener class, like this:
private val TAG = "Driver-Assistant"

class Listener(context: Context) : RecognitionListener {

    private var ctx = context

    override fun onReadyForSpeech(params: Bundle?) {
        Log.d(TAG, "onReadyForSpeech")
    }
    override fun onRmsChanged(rmsdB: Float) {
        Log.d(TAG, "onRmsChanged")
    }
    override fun onBufferReceived(buffer: ByteArray?) {
        Log.d(TAG, "onBufferReceived")
    }
    override fun onPartialResults(partialResults: Bundle?) {
        Log.d(TAG, "onPartialResults")
    }
    override fun onEvent(eventType: Int, params: Bundle?) {
        Log.d(TAG, "onEvent")
    }
    override fun onBeginningOfSpeech() {
        Toast.makeText(ctx, "Speech started", Toast.LENGTH_LONG).show()
    }
    override fun onEndOfSpeech() {
        Toast.makeText(ctx, "Speech finished", Toast.LENGTH_LONG).show()
    }
    override fun onError(error: Int) {
        val string = when (error) {
            6 -> "No speech input"
            4 -> "Server sends error status"
            8 -> "RecognitionService busy."
            7 -> "No recognition result matched."
            1 -> "Network operation timed out."
            2 -> "Other network related errors."
            9 -> "Insufficient permissions"
            5 -> "Other client side errors."
            3 -> "Audio recording error."
            else -> "unknown!!"
        }
        Toast.makeText(ctx, "sorry, error occurred: $string", Toast.LENGTH_LONG).show()
    }
    override fun onResults(results: Bundle?) {
        Log.d(TAG, "onResults $results")
        val data = results!!.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
        display.text = data!![0]
    }
}
In the main file you need to define the SpeechRecognizer, attach the listener above to it, and don't forget to request the runtime permission; all of this is below:
lateinit var sr: SpeechRecognizer
lateinit var display: TextView

class MainActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        display = text

        if (ContextCompat.checkSelfPermission(this,
                        Manifest.permission.RECORD_AUDIO)
                != PackageManager.PERMISSION_GRANTED) {
            if (ActivityCompat.shouldShowRequestPermissionRationale(this,
                            Manifest.permission.RECORD_AUDIO)) {
                // Show a rationale to the user asynchronously, then request again.
            } else {
                ActivityCompat.requestPermissions(this,
                        arrayOf(Manifest.permission.RECORD_AUDIO),
                        527)
            }
        }

        sr = SpeechRecognizer.createSpeechRecognizer(this)
        sr.setRecognitionListener(Listener(this))

        speak.setOnClickListener {
            val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "ar-JO") // Locale.getDefault()
            sr.startListening(intent)
        }
    }

    override fun onRequestPermissionsResult(requestCode: Int, permissions: Array<out String>, grantResults: IntArray) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults)
        when (requestCode) {
            527 -> if (grantResults.isNotEmpty()
                    && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                Toast.makeText(this, "Permission granted", Toast.LENGTH_SHORT).show()
            } else {
                Toast.makeText(this, "Permission not granted", Toast.LENGTH_SHORT).show()
            }
        }
    }
}
Answer 5:

package com.android.example.speechtxt;

import androidx.appcompat.app.AppCompatActivity;
import androidx.core.content.ContextCompat;
import android.Manifest;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.net.Uri;
import android.os.Build;
import android.os.Bundle;
import android.provider.Settings;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.view.MotionEvent;
import android.view.View;
import android.widget.RelativeLayout;
import android.widget.Toast;
import java.util.ArrayList;
import java.util.Locale;

public class MainActivity extends AppCompatActivity {

    private RelativeLayout relativeLayout;
    private SpeechRecognizer speechRecognizer;
    private Intent speechintent;
    String keeper = "";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        checkVoiceCommandPermission();

        relativeLayout = findViewById(R.id.touchscr);
        speechRecognizer = SpeechRecognizer.createSpeechRecognizer(getApplicationContext());
        speechintent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        speechintent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        speechintent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());

        speechRecognizer.setRecognitionListener(new RecognitionListener() {
            @Override
            public void onReadyForSpeech(Bundle params) { }
            @Override
            public void onBeginningOfSpeech() { }
            @Override
            public void onRmsChanged(float rmsdB) { }
            @Override
            public void onBufferReceived(byte[] buffer) { }
            @Override
            public void onEndOfSpeech() { }
            @Override
            public void onError(int error) { }

            @Override
            public void onResults(Bundle results) {
                ArrayList<String> speakedStringArray = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                if (speakedStringArray != null) {
                    keeper = speakedStringArray.get(0);
                    Toast.makeText(getApplicationContext(), "" + keeper, Toast.LENGTH_SHORT).show();
                }
            }

            @Override
            public void onPartialResults(Bundle partialResults) { }
            @Override
            public void onEvent(int eventType, Bundle params) { }
        });

        relativeLayout.setOnTouchListener(new View.OnTouchListener() {
            @Override
            public boolean onTouch(View v, MotionEvent event) {
                switch (event.getAction()) {
                    case MotionEvent.ACTION_DOWN:
                        speechRecognizer.startListening(speechintent);
                        keeper = "";
                        break;
                    case MotionEvent.ACTION_UP:
                        speechRecognizer.stopListening();
                        break;
                }
                // Return true so the rest of the gesture (including ACTION_UP)
                // keeps being delivered to this listener.
                return true;
            }
        });
    }

    private void checkVoiceCommandPermission() {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
            if (!(ContextCompat.checkSelfPermission(MainActivity.this, Manifest.permission.RECORD_AUDIO) == PackageManager.PERMISSION_GRANTED)) {
                Intent intent = new Intent(Settings.ACTION_APPLICATION_DETAILS_SETTINGS, Uri.parse("package:" + getPackageName()));
                startActivity(intent);
                finish();
            }
        }
    }
}
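The touch handler above implements push-to-talk: start listening on ACTION_DOWN, stop on ACTION_UP (note the handler must return true from ACTION_DOWN, or the rest of the gesture is never delivered). The dispatch decision itself can be sketched in plain Java; the action values are hardcoded from the MotionEvent documentation (ACTION_DOWN == 0, ACTION_UP == 1), and the helper is illustrative rather than part of any Android API:

```java
// Push-to-talk dispatch: decide whether to start or stop listening
// from the touch action. Values per the MotionEvent documentation
// (ACTION_DOWN == 0, ACTION_UP == 1); in app code use the constants.
class PushToTalk {
    static final int ACTION_DOWN = 0;
    static final int ACTION_UP = 1;

    enum Command { START, STOP, NONE }

    public static Command onTouch(int action) {
        switch (action) {
            case ACTION_DOWN: return Command.START; // finger down: startListening()
            case ACTION_UP:   return Command.STOP;  // finger up: stopListening()
            default:          return Command.NONE;  // ignore moves, cancels, etc.
        }
    }

    public static void main(String[] args) {
        System.out.println(onTouch(ACTION_DOWN)); // START
    }
}
```

Separating the decision from the side effects also makes this fragment of the touch logic unit-testable without a device.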