Voice Detection in Android Application

【Posted】: 2011-10-31 14:26:07 【Question】:

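Description


My application records sound from the phone's microphone. I am using the standard Android class (android.media.AudioRecord) to do that. The application has two buttons, "Start" and "Stop": when I press Start, the application starts recording, and when I press Stop, it stops recording and gives me back a buffer with the voice data in .wav format. Everything works fine.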

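Problem


I want to change my application so that, as soon as it starts, it begins analyzing the sound coming from the microphone. While the user stays silent, the application keeps analyzing. When the user starts talking, the application starts recording the sound from the microphone, and when the user finishes talking, the application stops recording and gives me back the same buffer, with the voice data in .wav format.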

Questions


    How can I detect that the user has started talking?
    How can I detect that the user has stopped talking?

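【Comments】:

I think you will have to do most of the sound work yourself. Android does provide some limited functionality to access the microphone and record, but nothing for heavy computation or speech-to-text.

@sqrfv Thanks for your comment, +1. Any other suggestions?

【Answer 1】: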

Just add this code to your application and you will detect when the user starts talking and when they stop.

// Imports needed by this snippet.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import android.media.AudioRecord;
import android.media.MediaRecorder;
import android.os.Bundle;
import android.os.Environment;
import android.util.Log;

// Place this method in your Activity. It assumes the class constants RECORDER_SAMPLERATE,
// RECORDER_CHANNELS, RECORDER_AUDIO_ENCODING and RECORDER_BPP are defined.
@Override
public void onCreate(Bundle savedInstanceState)
{
    super.onCreate(savedInstanceState);
    setContentView(R.layout.main);

    // Get the minimum buffer size required for the successful creation of an AudioRecord object.
    int bufferSizeInBytes = AudioRecord.getMinBufferSize( RECORDER_SAMPLERATE,
                                                          RECORDER_CHANNELS,
                                                          RECORDER_AUDIO_ENCODING
                                                         );
    // Initialize Audio Recorder.
    AudioRecord audioRecorder = new AudioRecord( MediaRecorder.AudioSource.MIC,
                                                 RECORDER_SAMPLERATE,
                                                 RECORDER_CHANNELS,
                                                 RECORDER_AUDIO_ENCODING,
                                                 bufferSizeInBytes
                                                );
    // Start Recording.
    audioRecorder.startRecording();

    int numberOfReadBytes   = 0;
    byte audioBuffer[]      = new byte[bufferSizeInBytes];
    boolean recording       = false;
    float tempFloatBuffer[] = new float[3];
    int tempIndex           = 0;
    int totalReadBytes      = 0;
    byte totalByteBuffer[]  = new byte[60 * 44100 * 2];

    // While data come from microphone.
    // Note: this loop blocks the calling thread until an utterance has been captured.
    while( true )
    {
        float totalAbsValue = 0.0f;
        short sample        = 0;

        numberOfReadBytes = audioRecorder.read( audioBuffer, 0, bufferSizeInBytes );
        if( numberOfReadBytes <= 0 )
        {
            // read() returned an error code or no data; stop analyzing.
            Log.e("TAG", "AudioRecord.read failed: " + numberOfReadBytes);
            break;
        }

        // Analyze Sound: mean absolute amplitude of the 16-bit little-endian samples just read.
        for( int i=0; i+1<numberOfReadBytes; i+=2 )
        {
            // Mask the low byte so that its sign extension does not corrupt the high byte.
            sample = (short)( (audioBuffer[i] & 0xff) | (audioBuffer[i + 1] << 8) );
            // Divide as float; integer division would truncate each term.
            totalAbsValue += Math.abs( sample ) / (numberOfReadBytes / 2.0f);
        }

        // Analyze temp buffer: sum of the loudness of the last three reads.
        tempFloatBuffer[tempIndex%3] = totalAbsValue;
        float temp                   = 0.0f;
        for( int i=0; i<3; ++i )
            temp += tempFloatBuffer[i];

        // Silence, and nothing recorded yet: keep listening.
        if( (temp >= 0 && temp <= 350) && recording == false )
        {
            Log.i("TAG", "1");
            tempIndex++;
            continue;
        }

        // Loud enough: the user started talking.
        if( temp > 350 && recording == false )
        {
            Log.i("TAG", "2");
            recording = true;
        }

        // Also stop when the next read would not fit into totalByteBuffer.
        boolean bufferFull = totalReadBytes + numberOfReadBytes > totalByteBuffer.length;

        // Silence after speech (or buffer full): the user stopped talking.
        if( ((temp >= 0 && temp <= 350) || bufferFull) && recording == true )
        {
            Log.i("TAG", "Save audio to file.");

            // Save audio to file.
            String filepath = Environment.getExternalStorageDirectory().getPath();
            File file = new File(filepath,"AudioRecorder");
            if( !file.exists() )
                file.mkdirs();

            String fn = file.getAbsolutePath() + "/" + System.currentTimeMillis() + ".wav";

            long totalAudioLen  = totalReadBytes;
            long totalDataLen   = totalAudioLen + 36;
            long longSampleRate = RECORDER_SAMPLERATE;
            int channels        = 1;
            long byteRate       = RECORDER_BPP * RECORDER_SAMPLERATE * channels/8;
            byte finalBuffer[]  = new byte[totalReadBytes + 44];

            finalBuffer[0] = 'R';  // RIFF/WAVE header
            finalBuffer[1] = 'I';
            finalBuffer[2] = 'F';
            finalBuffer[3] = 'F';
            finalBuffer[4] = (byte) (totalDataLen & 0xff);
            finalBuffer[5] = (byte) ((totalDataLen >> 8) & 0xff);
            finalBuffer[6] = (byte) ((totalDataLen >> 16) & 0xff);
            finalBuffer[7] = (byte) ((totalDataLen >> 24) & 0xff);
            finalBuffer[8] = 'W';
            finalBuffer[9] = 'A';
            finalBuffer[10] = 'V';
            finalBuffer[11] = 'E';
            finalBuffer[12] = 'f';  // 'fmt ' chunk
            finalBuffer[13] = 'm';
            finalBuffer[14] = 't';
            finalBuffer[15] = ' ';
            finalBuffer[16] = 16;  // 4 bytes: size of 'fmt ' chunk
            finalBuffer[17] = 0;
            finalBuffer[18] = 0;
            finalBuffer[19] = 0;
            finalBuffer[20] = 1;  // format = 1 (PCM)
            finalBuffer[21] = 0;
            finalBuffer[22] = (byte) channels;
            finalBuffer[23] = 0;
            finalBuffer[24] = (byte) (longSampleRate & 0xff);
            finalBuffer[25] = (byte) ((longSampleRate >> 8) & 0xff);
            finalBuffer[26] = (byte) ((longSampleRate >> 16) & 0xff);
            finalBuffer[27] = (byte) ((longSampleRate >> 24) & 0xff);
            finalBuffer[28] = (byte) (byteRate & 0xff);
            finalBuffer[29] = (byte) ((byteRate >> 8) & 0xff);
            finalBuffer[30] = (byte) ((byteRate >> 16) & 0xff);
            finalBuffer[31] = (byte) ((byteRate >> 24) & 0xff);
            finalBuffer[32] = (byte) (channels * RECORDER_BPP / 8);  // block align
            finalBuffer[33] = 0;
            finalBuffer[34] = (byte) RECORDER_BPP;  // bits per sample
            finalBuffer[35] = 0;
            finalBuffer[36] = 'd';
            finalBuffer[37] = 'a';
            finalBuffer[38] = 't';
            finalBuffer[39] = 'a';
            finalBuffer[40] = (byte) (totalAudioLen & 0xff);
            finalBuffer[41] = (byte) ((totalAudioLen >> 8) & 0xff);
            finalBuffer[42] = (byte) ((totalAudioLen >> 16) & 0xff);
            finalBuffer[43] = (byte) ((totalAudioLen >> 24) & 0xff);

            // Append the recorded PCM data after the 44-byte header.
            System.arraycopy( totalByteBuffer, 0, finalBuffer, 44, totalReadBytes );

            try
            {
                FileOutputStream out = new FileOutputStream(fn);
                try
                {
                    out.write(finalBuffer);
                }
                finally
                {
                    out.close();
                }
            }
            catch (IOException e)
            {
                // Also covers FileNotFoundException from the constructor.
                e.printStackTrace();
            }

            tempIndex++;
            break;
        }

        // -> Recording sound here.
        Log.i( "TAG", "Recording Sound." );
        System.arraycopy( audioBuffer, 0, totalByteBuffer, totalReadBytes, numberOfReadBytes );
        totalReadBytes += numberOfReadBytes;

        tempIndex++;
    }

    // Release the microphone once the loop has finished.
    audioRecorder.stop();
    audioRecorder.release();
}

Check this link.
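
The start/stop decision in the code above can be separated from the audio capture and file handling. The following is a minimal sketch of that decision only, assuming the same approach as the answer (sum the loudness of the last few reads and compare it with a threshold). The class name `SpeechGate`, the `Action` enum, and the method `feed` are hypothetical names introduced for illustration; the values 350 and 3 are taken from the answer's code.

```java
/** Hypothetical helper: decides silence/speech from per-buffer loudness values. */
final class SpeechGate {
    /** What the caller should do with the buffer that was just measured. */
    enum Action { IGNORE, RECORD, FINISH }

    private final float threshold;   // sum of the window above this counts as speech
    private final float[] window;    // loudness of the most recent buffers
    private int count = 0;           // number of buffers seen so far
    private boolean recording = false;

    SpeechGate(float threshold, int windowSize) {
        this.threshold = threshold;
        this.window = new float[windowSize];
    }

    /** Feeds the loudness of one buffer and returns the resulting action. */
    Action feed(float loudness) {
        // Overwrite the oldest entry, then sum the whole window.
        window[count % window.length] = loudness;
        count++;
        float sum = 0f;
        for (float v : window) {
            sum += v;
        }

        if (sum > threshold) {
            recording = true;        // speech started or is continuing
            return Action.RECORD;
        }
        if (recording) {
            recording = false;       // first quiet window after speech
            return Action.FINISH;
        }
        return Action.IGNORE;        // still silent
    }

    public static void main(String[] args) {
        SpeechGate gate = new SpeechGate(350f, 3);
        float[] loudness = {10f, 20f, 400f, 500f, 30f, 5f, 5f, 5f};
        for (float l : loudness) {
            System.out.println(l + " -> " + gate.feed(l));
        }
    }
}
```

Because the window sums several reads, a single loud buffer keeps the gate open for a few more reads, so short pauses between words do not end the recording immediately. The threshold depends on the device and the environment, so 350 should be treated as a starting point to tune.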

【Comments】:

I tried this code, but it does not work for me. I get the exception below. Please help me solve this problem.
04-06 15:03:01.729: E/AndroidRuntime(16174): Caused by: java.lang.IllegalArgumentException: 0Hz is not a supported sample rate.
04-06 15:03:01.729: E/AndroidRuntime(16174): at android.media.AudioRecord.audioParamCheck(AudioRecord.java:265)
04-06 15:03:01.729: E/AndroidRuntime(16174): at android.media.AudioRecord.<init>(AudioRecord.java:223)
04-06 15:03:01.729: E/AndroidRuntime(16174): at com.test.recording.RecordingActivity.onCreate(RecordingActivity.java:32)
04-06 15:03:01.729: E/AndroidRuntime(16174): at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:1615)

private static final int RECORDER_SAMPLERATE: which value should I give it here?

@Dipali private static final int RECORDER_BPP = 16; private static int RECORDER_SAMPLERATE = 8000; private static int RECORDER_CHANNELS = AudioFormat.CHANNEL_IN_MONO; private static int RECORDER_AUDIO_ENCODING = AudioFormat.ENCODING_PCM_16BIT;

The website is not accessible.

【Answer 2】:

It is better to use private static final int RECORDER_SAMPLERATE = 8000; it worked for me. I think it will help you.
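
If a fixed rate is not accepted on a particular device, one option is to probe a few candidates and use the first one that works. The sketch below is plain Java so it can run anywhere: `SampleRatePicker`, `pick`, and the candidate list are hypothetical names and values chosen for illustration, and the `IntUnaryOperator` parameter is a stand-in for the real check. On Android that check would be `AudioRecord.getMinBufferSize(rate, channelConfig, audioFormat)`, which returns a negative error code when the parameters are not supported.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical helper: picks the first sample rate that a device accepts. */
final class SampleRatePicker {
    /** Candidate rates in Hz, tried in order (an illustrative list). */
    static final int[] CANDIDATES = {44100, 22050, 16000, 11025, 8000};

    /**
     * Returns the first candidate for which minBufferSize reports a positive
     * buffer size, or -1 if none of the candidates is accepted.
     */
    static int pick(int[] candidates, IntUnaryOperator minBufferSize) {
        for (int rate : candidates) {
            // A rate of 0 (for example an unset constant) is never valid.
            if (rate > 0 && minBufferSize.applyAsInt(rate) > 0) {
                return rate;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a device that only supports 8000 Hz and 16000 Hz.
        IntUnaryOperator fakeDevice =
                rate -> (rate == 8000 || rate == 16000) ? 1024 : -2;
        System.out.println(pick(CANDIDATES, fakeDevice)); // prints 16000
    }
}
```

In an app, the call could look like `pick(CANDIDATES, rate -> AudioRecord.getMinBufferSize(rate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT))`, with the chosen rate then used both for the `AudioRecord` constructor and for the WAV header.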

【Comments】:

【Answer 3】:

It works fine when I replace totalAbsValue += Math.abs( sample ) / (numberOfReadBytes/2) with totalAbsValue += (float)Math.abs( sample ) / ((float)numberOfReadBytes/(float)2).
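
The reason is that both operands of the original division are integers, so each term is truncated before it is added to the total; for quiet samples the term becomes 0. A minimal sketch of the loudness computation with float division, assuming 16-bit little-endian mono PCM as in the answer's code (`Loudness` and `meanAbs` are hypothetical names used only for this illustration):

```java
/** Hypothetical helper: loudness of a buffer of 16-bit little-endian PCM. */
final class Loudness {
    /** Returns the mean absolute amplitude of the samples in pcm[0..length). */
    static float meanAbs(byte[] pcm, int length) {
        int samples = length / 2;
        if (samples == 0) {
            return 0f;
        }
        float total = 0f;
        for (int i = 0; i + 1 < length; i += 2) {
            // Mask the low byte so its sign extension does not corrupt the high byte.
            short sample = (short) ((pcm[i] & 0xff) | (pcm[i + 1] << 8));
            // Float division keeps the fractional part of every term.
            total += Math.abs(sample) / (float) samples;
        }
        return total;
    }

    public static void main(String[] args) {
        // Four samples: 100, -100, 300, -300 (little-endian byte pairs).
        byte[] pcm = {100, 0, (byte) 0x9c, (byte) 0xff,
                      0x2c, 0x01, (byte) 0xd4, (byte) 0xfe};
        System.out.println(meanAbs(pcm, pcm.length));

        // The difference between the two divisions for a quiet sample:
        System.out.println(100 / 1024);          // integer division yields 0
        System.out.println(100 / (float) 1024);  // float division keeps the fraction
    }
}
```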

【Comments】:
