How to incorporate SSML into Python

Posted 2023-03-25


[Title]: How to incorporate SSML into Python [Posted]: 2016-08-06 15:06:02 [Question]:

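I need to use SSML to play an audio file with the <audio> tag in my Alexa Skill (as per Amazon's instructions).

The problem is that I don't know how to use SSML with Python. I know I can use it with Java, but I want to build my skills in Python. I've looked all over and haven't found a working example of SSML in a Python script or program. Does anyone know of one?

[Comments on the question]:

[Answer 1]: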

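This was asked two years ago, but maybe someone will benefit from the following.

I just checked: if you use the Alexa Skills Kit SDK for Python, you can simply add SSML to your response, for example: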

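from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import is_request_type

sb = SkillBuilder()

@sb.request_handler(can_handle_func=is_request_type("LaunchRequest"))
def launch_request_handler(handler_input):
    # Single quotes on the outside so the SSML attribute can keep its double quotes.
    speech_text = 'Wait for it 3 seconds<break time="3s"/> Buuuu!'

    return handler_input.response_builder.speak(speech_text).response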

Hope this helps.

[Discussion]:

[Answer 2]:

The SSML audio goes in the response.outputSpeech.ssml property. Here is an example object with the other required parameters removed:

{
  "response": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak>Welcome to Car-Fu. <audio src='https://carfu.com/audio/carfu-welcome.mp3' /> You can order a ride, or request a fare estimate. Which will it be?</speak>"
    }
  }
}

Further reference:

JSON Interface Reference for Custom Skills
Speech Synthesis Markup Language (SSML) Reference
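
For a skill that runs as a plain AWS Lambda function without the SDK, the same object can be returned as a Python dict. What follows is a minimal sketch under a few assumptions: `build_ssml_response` is a hypothetical helper name (not part of any Alexa API), the audio URL is the placeholder from the example above, and a real skill would route on the incoming request type instead of always returning the same speech.

```python
import json


def build_ssml_response(ssml_body, should_end_session=True):
    """Return an Alexa-style response dict whose speech is SSML.

    build_ssml_response is a hypothetical helper name, not an Alexa API.
    """
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "SSML",
                # The markup is wrapped in a single <speak> root element.
                "ssml": "<speak>" + ssml_body + "</speak>",
            },
            "shouldEndSession": should_end_session,
        },
    }


def lambda_handler(event, context):
    # Placeholder audio URL copied from the JSON example in this answer.
    body = (
        "Welcome to Car-Fu. "
        '<audio src="https://carfu.com/audio/carfu-welcome.mp3" /> '
        "Which will it be?"
    )
    return build_ssml_response(body, should_end_session=False)


if __name__ == "__main__":
    # Print the JSON that would be sent back to Alexa.
    print(json.dumps(lambda_handler({}, None), indent=2))
```

Lambda serializes the returned dict to JSON, so the double quotes inside the audio tag are escaped automatically and need no manual handling.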

[Discussion]:

[Answer 3]:

Install ssml-builder with "pip install ssml-builder", then use it:

from ssml_builder.core import Speech

speech = Speech()
speech.add_text('sample text')
ssml = speech.speak()
print(ssml)

[Discussion]:

[Answer 4]:

These comments really helped a lot in figuring out how to make SSML work with ask-sdk-python. Instead of

speech_text = "Wait for it 3 seconds<break time="3s"/> Buuuu!" - from wmatt's comment

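I defined variables representing the start and end of each tag I was using

# Tag definitions not shown in the original answer are assumed here (Alexa's whisper effect).
ssml_start, ssml_end = '<speak>', '</speak>'
whispered_s, whispered_e = '<amazon:effect name="whispered">', '</amazon:effect>'
speech_text = ssml_start + whispered_s + "Here are the latest alerts from MMDA" + whispered_e + ssml_end

Using single quotes and concatenating these strings into the speech output worked! Thank you all very much!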
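
The concatenation approach can be wrapped in a small helper so that opening and closing tags always match and plain text is escaped. This is only a sketch using the standard library; `tag` and `text` are hypothetical helper names, not part of ask-sdk-python.

```python
from xml.sax.saxutils import escape, quoteattr


def tag(element, content="", **attrs):
    """Build one SSML element; content is inserted as-is and may hold nested tags."""
    attr_text = "".join(
        " {}={}".format(key, quoteattr(value)) for key, value in attrs.items()
    )
    if content == "":
        # Empty elements such as <break/> are written as self-closing tags.
        return "<{}{}/>".format(element, attr_text)
    return "<{0}{1}>{2}</{0}>".format(element, attr_text, content)


def text(value):
    """Escape &, < and > so plain text cannot break the markup."""
    return escape(value)


whispered = tag(
    "amazon:effect",
    text("Here are the latest alerts from MMDA"),
    name="whispered",
)
speech_text = tag(
    "speak",
    text("Wait for it 3 seconds") + tag("break", time="3s") + whispered,
)

if __name__ == "__main__":
    print(speech_text)
```

Escaping matters most when the spoken text comes from user input or a slot value, where a stray ampersand would otherwise make the SSML invalid.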

[Discussion]:

[Answer 5]:

An SSML package for Python already exists.

You can install it with pip as follows:



    $ pip install pyssml
    or
    $ pip3 install pyssml

An example is at the link below:

http://blog.naver.com/chandong83/221145083125 Sorry, it is in Korean.



    # -*- coding: utf-8 -*-
    # for amazon
    import re
    import os
    import sys
    import time
    from boto3 import client
    from botocore.exceptions import BotoCoreError, ClientError
    import vlc
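    from pyssml.PySSML import PySSML
    # Needed for AmazonSpeech() below; the module path is assumed from the pyssml package layout.
    from pyssml.AmazonSpeech import AmazonSpeech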


    # Amazon Polly service function
    # if isSSML is True, SSML format
    # else Text format
    def aws_polly(text, isSSML = False):
        voiceid = 'Joanna'

        try:
            polly = client("polly", region_name="ap-northeast-2")

            if isSSML:
                textType = 'ssml'
            else:
                textType = 'text'

            response = polly.synthesize_speech(
                    TextType=textType,
                    Text=text,
                    OutputFormat="mp3",
                    VoiceId=voiceid)

            # get Audio Stream (mp3 format)
            stream = response.get("AudioStream")

            # save the audio Stream File
            with open('aws_test_tts.mp3', 'wb') as f:
                data = stream.read()
                f.write(data)


            # VLC play audio
            # non block
            p = vlc.MediaPlayer('./aws_test_tts.mp3')
            p.play()

        except ( BotoCoreError, ClientError) as err:
            print(str(err))


    if __name__ == '__main__':
        # normal pyssml
        #s = PySSML()

        # amazon speech ssml
        s = AmazonSpeech()

        # normal 
        s.say('i am normal')

        #  speed is very slow
        s.prosody({'rate': "x-slow"}, 'i am very slow')

        #  volume is very loud
        s.prosody({'volume': 'x-loud'}, 'my voice is very loud')

        #  take a one sec
        s.pause('1s')

        #  pitch is very high
        s.prosody({'pitch': 'x-high'}, 'my tone is very high')

        # Amazon-specific effect
        s.whisper('i am whispering')
        # print to convert to ssml format
        print(s.ssml())

        # request aws polly and play
        aws_polly(s.ssml(), True)

        # Wait while playback.
        time.sleep(50)

[Discussion]:

[Answer 6]:

The question is a bit vague, but I did manage to figure out how to incorporate SSML into a Python script. Here is a snippet that plays some audio:

  if 'Item' in intent['slots']:
    chosen_item = intent['slots']['Item']['value']
    session_attributes = create_attributes(chosen_item)

    speech_output =  '<speak> Here is something to play ' + \
    chosen_item + \
    '<audio src="https://s3.amazonaws.com/example/example.mp3" /> </speak>'

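[Discussion]:

User BMW pointed to the correct answer. When you set the type parameter of the outputSpeech JSON object to SSML and use ssml instead of text, you can use SSML tags (as described in the Speech Synthesis Markup Language (SSML) Reference).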
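
The switch between the two forms can be sketched in a few lines of Python. `output_speech` is a hypothetical helper name; the only point is that the type value and the key holding the message change together.

```python
def output_speech(message, is_ssml=False):
    """Return an outputSpeech object for plain text or for SSML markup."""
    if is_ssml:
        # SSML: type is "SSML" and the markup goes under the "ssml" key.
        return {"type": "SSML", "ssml": message}
    # Plain text: type is "PlainText" and the message goes under "text".
    return {"type": "PlainText", "text": message}


if __name__ == "__main__":
    print(output_speech("Hello"))
    print(output_speech('<speak>Hello<break time="1s"/>again</speak>', is_ssml=True))
```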
