How to Set Sample Rate on Text to Speech - Android

Posted: 2017-04-24 04:31:59

Question:

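For my text-to-speech output I need a sample rate of about 32000 Hz, a pitch of 1 and a speech rate of 0.2. I have already set the pitch and the speech rate, but I cannot find a way to set the sample rate.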

// Note: the third constructor argument is the engine package name.
tts = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
        @Override
        public void onInit(int status) {
            if (status != TextToSpeech.ERROR) {
                tts.setLanguage(Locale.US);
                tts.setSpeechRate(0.2f);
                tts.setPitch(1f);
            }
        }
    }, TextToSpeech.Engine.KEY_FEATURE_NETWORK_SYNTHESIS);

I use AudioTrack to set the sample rate, but that takes a long time, because I first have to call synthesizeToFile on the TTS and then play the file through AudioTrack.

HashMap<String, String> myHasRead = new HashMap<String, String>();
myHasRead.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, outPutS);
String StorePath = Environment.getExternalStorageDirectory().getAbsolutePath();
File myF = new File(StorePath + "/tempAudio.wav");
try {
    myF.createNewFile();
} catch (IOException e) {
    e.printStackTrace();
}
// The listener's onDone() starts playback once synthesis has finished.
tts.setOnUtteranceProgressListener(new TtsUtteranceListener());
tts.synthesizeToFile("Bla Bla bla", myHasRead, StorePath + "/tempAudio.wav");

....

private class TtsUtteranceListener extends UtteranceProgressListener {
        @Override
        public void onStart(String utteranceId) {
        }

        @Override
        public void onDone(String utteranceId) {
            // Synthesis to file has finished, so the file can be played.
            playWav();
        }

        @Override
        public void onError(String utteranceId) {
        }
    }

    public void playWav() {
        // Play the synthesized file at 32000 Hz, mono, 16-bit PCM.
        int minBufferSize = AudioTrack.getMinBufferSize(32000, AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT);
        int bufferSize = 512;
        AudioTrack at = new AudioTrack(AudioManager.STREAM_MUSIC, 32000, AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize, AudioTrack.MODE_STREAM);
        String filepath = Environment.getExternalStorageDirectory().getAbsolutePath();

        int i = 0;
        byte[] s = new byte[bufferSize];
        try {
            FileInputStream fin = new FileInputStream(filepath + "/tempAudio.wav");
            DataInputStream dis = new DataInputStream(fin);

            at.play();
            // Copy the file into the track in 512-byte chunks until end of file.
            while ((i = dis.read(s, 0, bufferSize)) > -1) {
                at.write(s, 0, i);
            }
            at.stop();
            at.release();
            dis.close();
            fin.close();

        } catch (FileNotFoundException e) {
            // TODO
            e.printStackTrace();
        } catch (IOException e) {
            // TODO
            e.printStackTrace();
        }
    }

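Is there any way to set the sample rate on the TTS directly, something like tts.setSampleRate(32000);, or to get a stream from the TTS into AudioTrack, something like DataInputStream dis = new DataInputStream(tts.speak("bla bla bla").getDataInputStream);? In short, I need a chipmunk-style text-to-speech voice on Android, either without synthesizeToFile or by streaming the TTS audio data straight into AudioTrack without saving the TTS output.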

Answer 1:

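You cannot set the sample rate on the TTS directly.

I did something similar in a project (I didn't use TTS there), and it may help you.

Playing a recording back as different voice types: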

waveSampling = 90000; (chipmunk)
waveSampling = 24200; ("slow motion")
waveSampling = 30000; ("BANE", the Batman character)
waveSampling = 18000; (ghost)
waveSampling = 70000; (bee)
waveSampling = 60000; (female)
waveSampling = 37000; (normal)
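The values above work because PCM samples played at a rate other than the one they were captured at come out faster and higher (or slower and lower) by the ratio of the two rates. The sketch below only illustrates that arithmetic. The class and method names are hypothetical, and the 37000 Hz recording rate is an assumption taken from the value labelled "normal", since the answer does not state the actual recording rate.

```java
// Sketch of the arithmetic behind playing PCM audio at a different sample rate.
// All names here are hypothetical and exist only for illustration.
final class PlaybackRateMath {

    /** Speed factor: above 1 plays faster and higher, below 1 slower and lower. */
    static double speedFactor(int recordedRateHz, int playbackRateHz) {
        return (double) playbackRateHz / recordedRateHz;
    }

    /** Pitch shift in semitones (12 semitones per octave, one octave per doubling). */
    static double semitoneShift(int recordedRateHz, int playbackRateHz) {
        return 12.0 * Math.log(speedFactor(recordedRateHz, playbackRateHz)) / Math.log(2.0);
    }

    /** Duration in milliseconds of frameCount frames played at playbackRateHz. */
    static long playbackMillis(long frameCount, int playbackRateHz) {
        return frameCount * 1000L / playbackRateHz;
    }

    public static void main(String[] args) {
        int assumedRecordedRate = 37000; // assumption: not stated in the answer
        int[] playbackRates = {90000, 24200, 30000, 18000, 70000, 60000, 37000};
        for (int rate : playbackRates) {
            System.out.printf("%d Hz: speed x%.2f, pitch %+.1f semitones%n",
                    rate,
                    speedFactor(assumedRecordedRate, rate),
                    semitoneShift(assumedRecordedRate, rate));
        }
    }
}
```

Speed and pitch always change together with this approach. Changing pitch without changing duration needs a different technique, which is outside what the answer describes.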

void playRecord() throws IOException {
            // Size the buffer for the rate actually used for playback
            // (the original asked for the minimum at 8000 Hz).
            int minBufferSize = AudioTrack.getMinBufferSize(waveSampling, AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT);
            int bufferSize = 512;
            // at, waveSampling and filename are fields of the enclosing class.
            at = new AudioTrack(AudioManager.STREAM_MUSIC, waveSampling, AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize, AudioTrack.MODE_STREAM);

            int i = 0;
            byte[] s = new byte[bufferSize];
            try {
                FileInputStream fin = new FileInputStream(Environment.getExternalStorageDirectory().getAbsolutePath() + "/Voice Changer/temp/" + filename + ".wav");
                DataInputStream dis = new DataInputStream(fin);

                at.play();
                // Stream the file into the track in 512-byte chunks.
                while ((i = dis.read(s, 0, bufferSize)) > -1) {
                    at.write(s, 0, i);
                }
                at.stop();
                at.release();
                dis.close();
                fin.close();

                openmenu();

            } catch (FileNotFoundException e) {
                // TODO
                e.printStackTrace();
            } catch (IOException e) {
                // TODO
                e.printStackTrace();
            }
    }
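One detail to check in a playback loop like this: if the file is a real WAV file, its header is written to the AudioTrack as if it were audio, and the sample rate stored in the file is never consulted. The following is a minimal sketch, assuming a standard RIFF/WAVE layout with a "fmt " chunk before the "data" chunk. WavHeaderReader and WavInfo are hypothetical names, not part of Android or of the answer's project.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads a RIFF/WAVE header and stops at the first sample byte.
final class WavHeaderReader {

    static final class WavInfo {
        final int sampleRate;
        final int channels;
        final int bitsPerSample;
        final int dataLength; // size of the PCM payload in bytes

        WavInfo(int sampleRate, int channels, int bitsPerSample, int dataLength) {
            this.sampleRate = sampleRate;
            this.channels = channels;
            this.bitsPerSample = bitsPerSample;
            this.dataLength = dataLength;
        }
    }

    /** Parses chunks until "data"; afterwards the stream points at the PCM samples. */
    static WavInfo readHeader(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        byte[] id = new byte[4];
        dis.readFully(id);
        if (!"RIFF".equals(ascii(id))) throw new IOException("Not a RIFF file");
        readIntLE(dis); // overall RIFF size, not needed here
        dis.readFully(id);
        if (!"WAVE".equals(ascii(id))) throw new IOException("Not a WAVE file");

        int sampleRate = -1, channels = -1, bits = -1;
        while (true) {
            dis.readFully(id); // throws EOFException if no data chunk exists
            int size = readIntLE(dis);
            String chunk = ascii(id);
            if ("fmt ".equals(chunk)) {
                readShortLE(dis);            // audio format tag (1 = PCM)
                channels = readShortLE(dis);
                sampleRate = readIntLE(dis);
                readIntLE(dis);              // byte rate
                readShortLE(dis);            // block align
                bits = readShortLE(dis);
                skipFully(dis, size - 16);   // any extension bytes
            } else if ("data".equals(chunk)) {
                if (sampleRate < 0) throw new IOException("data chunk before fmt chunk");
                return new WavInfo(sampleRate, channels, bits, size);
            } else {
                skipFully(dis, size + (size & 1)); // chunks are padded to even length
            }
        }
    }

    private static String ascii(byte[] b) {
        return new String(b, StandardCharsets.US_ASCII);
    }

    // WAV fields are little-endian, while DataInputStream reads big-endian.
    private static int readIntLE(DataInputStream dis) throws IOException {
        return Integer.reverseBytes(dis.readInt());
    }

    private static int readShortLE(DataInputStream dis) throws IOException {
        return Short.reverseBytes(dis.readShort()) & 0xFFFF;
    }

    private static void skipFully(DataInputStream dis, int n) throws IOException {
        while (n > 0) {
            int skipped = dis.skipBytes(n);
            if (skipped <= 0) throw new EOFException("Unexpected end of WAV header");
            n -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            WavInfo info = readHeader(in);
            System.out.println(info.sampleRate + " Hz, " + info.channels + " channel(s), "
                    + info.bitsPerSample + " bit, " + info.dataLength + " data bytes");
        }
    }
}
```

After readHeader returns, the same stream can be handed to the read/write loop, and the playback rate passed to AudioTrack can be chosen relative to the rate the file reports rather than by trial and error.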

Saving the audio:

public void save() throws IOException {
        Random r = new Random();
        // Random number in [65, 80) used to make the file name distinct.
        final int i1 = r.nextInt(80 - 65) + 65;
        File tempfile2 = new File(Environment.getExternalStorageDirectory().getAbsolutePath() + "/Voice Changer/temp/" + i1 + filename + ".wav");

        savedfile = Environment.getExternalStorageDirectory().getAbsolutePath() + "/Voice Changer/" + "VOICE CHANGER" + i1 + filename + ".mp3";

        Toast.makeText(this, "File Saved", Toast.LENGTH_SHORT).show();

        // Wrap the raw recording (tempfile, a field) in a WAV container.
        rawToWave(tempfile, tempfile2);

        File wavFile = new File(Environment.getExternalStorageDirectory().getAbsolutePath() + "/Voice Changer/temp/" + i1 + filename + ".wav");
        IConvertCallback callback = new IConvertCallback() {
            @Override
            public void onSuccess(File convertedFile) {
                File newfile = new File(Environment.getExternalStorageDirectory().getAbsolutePath() + "/Voice Changer/" + "VOICE CHANGER" + i1 + filename + ".mp3");
                File savedmp3 = new File(Environment.getExternalStorageDirectory().getAbsolutePath() + "/Voice Changer/temp/" + i1 + filename + ".mp3");
                Toast.makeText(MainActivity.this, "SUCCESS: " + newfile.getPath(), Toast.LENGTH_LONG).show();

                try {
                    // Copy the converted MP3 out of the temp folder.
                    copyit(savedmp3, newfile);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }

            @Override
            public void onFailure(Exception error) {
                Toast.makeText(MainActivity.this, "ERROR: " + error.getMessage(), Toast.LENGTH_LONG).show();
            }
        };
        Toast.makeText(this, "Converting audio file...", Toast.LENGTH_SHORT).show();
        // Convert the WAV file to MP3 with the AndroidAudioConverter library.
        AndroidAudioConverter.with(this)
                .setFile(wavFile)
                .setFormat(cafe.adriel.androidaudioconverter.model.AudioFormat.MP3)
                .setCallback(callback)
                .convert();
    }

The output will be an .mp3 file. If you want faster output, you can use the .wav format instead.
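For the .wav route, the raw PCM only needs a 44-byte header in front of it, and the sample rate declared in that header is what any player will use. The sketch below is not the answer's rawToWave, which is not shown. WavWriter and wrapPcm16 are hypothetical names, and 16-bit PCM is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: prepends a canonical 44-byte WAV header to 16-bit PCM data.
final class WavWriter {

    /**
     * Returns a complete WAV file image. Declaring a sample rate higher than the
     * one the PCM was captured at makes players render it faster and higher.
     */
    static byte[] wrapPcm16(byte[] pcm, int sampleRateHz, int channels) {
        int blockAlign = channels * 2; // bytes per frame at 16 bits per sample
        ByteBuffer bb = ByteBuffer.allocate(44 + pcm.length).order(ByteOrder.LITTLE_ENDIAN);
        bb.put(ascii("RIFF")).putInt(36 + pcm.length).put(ascii("WAVE"));
        bb.put(ascii("fmt ")).putInt(16)       // fmt chunk is 16 bytes for PCM
          .putShort((short) 1)                 // format tag 1 = PCM
          .putShort((short) channels)
          .putInt(sampleRateHz)
          .putInt(sampleRateHz * blockAlign)   // byte rate
          .putShort((short) blockAlign)
          .putShort((short) 16);               // bits per sample
        bb.put(ascii("data")).putInt(pcm.length).put(pcm);
        return bb.array();
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] silence = new byte[32000 * 2]; // one second of mono silence at 32000 Hz
        byte[] wav = wrapPcm16(silence, 32000, 1);
        System.out.println("WAV image size: " + wav.length + " bytes");
    }
}
```

Writing the returned array to a file with a .wav extension skips the MP3 conversion step entirely, which is the speed-up the answer alludes to.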

Comments:

I want to change the sample rate of the TTS speech output without using synthesizeToFile, or to stream the TTS audio data directly into AudioTrack without saving the TTS output.
