Qbot 5. Integrating DALL·E Image Generation / Deploying Disco Diffusion Locally

Posted 2023-02-17 zstar-_

This project is intended to be maintained and updated over the long term. Stars are welcome: https://github.com/zstar1003/Qbot

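Preface

The arrival of AI painting has shaken up the art world. This post tries to bring AI painting into a QQ bot.

DALL·E Image Generation

Like GPT-3, DALL·E is an OpenAI product, and an official API is available for it.
The official documentation describes how to call it. Here is a Python example: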

import openai

openai.api_key = 'your_api_key'
response = openai.Image.create(
  prompt="a white siamese cat",
  n=1,
  size="1024x1024"
)
image_url = response['data'][0]['url']
print(image_url)
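
When `n` is greater than 1, the response should carry one entry per image. The following is a minimal sketch for collecting all of the URLs, assuming the response keeps the `{'data': [{'url': ...}]}` shape used in the example above. `extract_image_urls` is a hypothetical helper name, not part of the openai package, and the response here is a stand-in so the snippet runs without an API key.

```python
def extract_image_urls(response):
    """Collect every image URL from a DALL·E style response.

    Assumes a dict-like response shaped like
    {'data': [{'url': '...'}, ...]}, as in the example above.
    """
    return [item["url"] for item in response.get("data", []) if "url" in item]


# Stand-in response for illustration; a real one would come from openai.Image.create.
fake_response = {"data": [{"url": "https://example.com/a.png"},
                          {"url": "https://example.com/b.png"}]}
print(extract_image_urls(fake_response))
```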

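Sending Images from Qbot

The QQ bot is built on the go-cqhttp framework, which defines a kind of CQ code that can be used to send images.
The official documentation gives the exact format: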

[CQ:image,file=http://baidu.com/1.jpg,type=show,id=40004]

Here, the file parameter points to the URL of an image on the network.
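
As a precaution, a URL can be escaped before it is placed inside a CQ code, since `&`, `[`, `]` and `,` have special meaning there. The sketch below uses the entity replacements as I understand them from the CQ code documentation; treat the exact table as an assumption and check it against the go-cqhttp docs. `escape_cq_value` and `build_cq_image` are hypothetical helper names.

```python
def escape_cq_value(value):
    """Escape characters that are special inside a CQ code parameter."""
    # '&' must be replaced first so the entities added below are not escaped again.
    return (value.replace("&", "&amp;")
                 .replace("[", "&#91;")
                 .replace("]", "&#93;")
                 .replace(",", "&#44;"))


def build_cq_image(url):
    """Build a CQ image code whose file parameter points at a network image."""
    return "[CQ:image,file=" + escape_cq_value(url) + "]"


print(build_cq_image("http://baidu.com/1.jpg"))
print(build_cq_image("https://example.com/img.png?a=1&b=2,3"))
```

A URL without special characters passes through unchanged, so the helper produces the same CQ code as plain string concatenation in the simple case.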

So the image URL returned by DALL·E can be assigned directly to the CQ code parameter. The relevant function is as follows:

import requests

# cqhttp_url is the base URL of the go-cqhttp HTTP API, defined elsewhere in the project.


# Send an image message to a group
def send_group_message_image(gid, pic_path, uid, msg):
    try:
        message = "[CQ:image,file=" + pic_path + "]"
        if msg != "":
            message = msg + '\n' + message
        message = str('[CQ:at,qq=%s]\n' % uid) + message  # @-mention the sender
        res = requests.post(url=cqhttp_url + "/send_group_msg",
                            params={'group_id': int(gid), 'message': message}).json()
        if res["status"] == "ok":
            print("Group message sent")
        else:
            print("Failed to send group message, error: " + str(res['wording']))
    except Exception as error:
        print("Failed to send group message")
        print(error)

Testing a Local Deployment of Disco Diffusion

Installation

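Calling the API directly as above is quick and convenient, but each account only has a quota of $18, and generating a single 1024x1024 image costs $0.02. Running AI painting locally removes the quota limit.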
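
To put the quota in perspective, the two figures above translate into a fixed number of images. A small worked calculation, using only the numbers stated in this post:

```python
# Work in cents to avoid floating-point rounding.
quota_cents = 18 * 100        # $18 account quota, as stated above
cost_per_image_cents = 2      # $0.02 per 1024x1024 image, as stated above

max_images = quota_cents // cost_per_image_cents
print(max_images)  # 900
```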

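Looking at the mainstream open-source AI painting tools available now, the most widely used are novelai and Disco Diffusion.
novelai ships a well-packaged webui, which does not lend itself to secondary development, so I went with discoart, which is based on Disco Diffusion.
Project: https://github.com/jina-ai/discoart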

The project documentation says it can be installed as follows:

pip install discoart

In my testing, however, this method failed, so what worked better was to copy the whole project and install it with setup.py:

python setup.py install

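Note: it is best to create a fresh virtual environment for the installation. PyTorch has to be installed manually at version 1.13.1, because the PyTorch build installed by default does not support the GPU.

Without GPU support, running the project fails with an error saying half-precision inference is not supported. Whether the GPU is usable can be checked like this: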

import torch

print(torch.cuda.is_available())

Usage

Once installed, it is fairly easy to call. Here is an example:

from discoart import create

da = create(
    text_prompts='spring morning, a painting of Chinese water town , There are green trees on the bank, created by Makoto Shinkai and Hayao Miyazaki,Evgeny Lushpin, popular on cgsociety,ultrawide angle, soft light, 8K, fairy tales, dreams, tranquility, HD pictures',
    skip_steps=200,
    width_height=[400, 200]
)

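When run, it prints the following set of parameters. Any of them can be set in the same way as above, and those left unset use their default values.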

┌────────────────────────────┬────────────────────────────────────────────────┐
│                   Argument │ Value                                          │
├────────────────────────────┼────────────────────────────────────────────────┤
│                 batch_name │ None                                           │
│                 batch_size │ 1                                              │
│                 clamp_grad │ True                                           │
│                  clamp_max │ 0.05                                           │
│              clip_denoised │ False                                          │
│        clip_guidance_scale │ 5000                                           │
│                clip_models │ ['ViT-B-32::openai', 'ViT-B-16::openai',       │
│                            │ 'RN50::openai']                                │
│      clip_models_schedules │ None                                           │
│                 cut_ic_pow │ 1.0                                            │
│               cut_icgray_p │ [0.2]*400+[0]*600                              │
│               cut_innercut │ [4]*400+[12]*600                               │
│               cut_overview │ [12]*400+[4]*600                               │
│        cut_schedules_group │ None                                           │
│               cutn_batches │ 4                                              │
│            diffusion_model │ 512x512_diffusion_uncond_finetune_008100       │
│     diffusion_model_config │ None                                           │
│    diffusion_sampling_mode │ ddim                                           │
│               display_rate │ 1                                              │
│                        eta │ 0.8                                            │
│                    gif_fps │ 20                                             │
│             gif_size_ratio │ 0.5                                            │
│               image_output │ True                                           │
│                 init_image │ None                                           │
│                 init_scale │ 1000                                           │
│                  n_batches │ 4                                              │
│             name_docarray* │ discoart-a52a19258c3b11ed83433868935d4197      │
│        on_misspelled_token │ ignore                                         │
│                perlin_init │ False                                          │
│                perlin_mode │ mixed                                          │
│                   rand_mag │ 0.05                                           │
│            randomize_class │ True                                           │
│                range_scale │ 150                                            │
│                  sat_scale │ 0                                              │
│                  save_rate │ 20                                             │
│                      seed* │ 4017468948                                     │
│                 skip_event │ None                                           │
│                skip_steps* │ 200                                            │
│                      steps │ 250                                            │
│                 stop_event │ None                                           │
│           text_clip_on_cpu │ False                                          │
│              text_prompts* │ spring morning, a painting of Chinese water    │
│                            │ town , There are green trees on the bank,      │
│                            │ created by Makoto Shinkai and Hayao            │
│                            │ Miyazaki,Evgeny Lushpin, popular on            │
│                            │ cgsociety,ultrawide angle, soft light, 8K,     │
│                            │ fairy tales, dreams, tranquility, HD pictures  │
│     transformation_percent │ [0.09]                                         │
│ truncate_overlength_prompt │ False                                          │
│                   tv_scale │ 0                                              │
│    use_horizontal_symmetry │ False                                          │
│        use_secondary_model │ True                                           │
│      use_vertical_symmetry │ False                                          │
│             visualize_cuts │ False                                          │
│              width_height* │ [400, 200]                                     │
└────────────────────────────┴────────────────────────────────────────────────┘
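
Several values in the table, such as `[4]*400+[12]*600`, are written as Python list expressions. The sketch below simply expands two of them. Reading the result as one value per diffusion timestep (1000 in total) is my interpretation, not something stated in the output above.

```python
# The two schedule expressions copied from the table above.
cut_overview = [12] * 400 + [4] * 600
cut_innercut = [4] * 400 + [12] * 600

print(len(cut_overview))                     # 1000
print(cut_overview[0], cut_innercut[0])      # 12 4
print(cut_overview[999], cut_innercut[999])  # 4 12

# At every position the two schedules add up to the same total.
totals = {a + b for a, b in zip(cut_overview, cut_innercut)}
print(totals)  # {16}
```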

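After the run, a separate folder is created to store the results. It contains the intermediate images of the drawing process, and the final image is marked done.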

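Taking advantage of this, the intermediate images can be converted into a GIF animation, producing the effect shown in the official documentation.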
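
One way to do the conversion by hand is sketched below, assuming the intermediate images are PNG files in a single folder and that the final image has "done" in its file name. The file names in the demo are hypothetical, since the exact naming scheme is not shown in this post. `collect_frames` and `frames_to_gif` are illustrative helpers, and writing the GIF requires Pillow.

```python
import re
import tempfile
from pathlib import Path


def collect_frames(folder):
    """Return progress PNGs in natural order, leaving out the final 'done' image."""
    def natural_key(path):
        # Split digits from text so that 'step-10' sorts after 'step-2'.
        return [int(tok) if tok.isdigit() else tok
                for tok in re.split(r"(\d+)", path.name)]

    frames = [p for p in Path(folder).glob("*.png") if "done" not in p.name]
    return sorted(frames, key=natural_key)


def frames_to_gif(frames, out_path, fps=20):
    """Write the frames to an animated GIF (requires Pillow)."""
    from PIL import Image  # imported lazily so collect_frames works without Pillow

    if not frames:
        raise ValueError("no frames to write")
    images = [Image.open(p).convert("RGB") for p in frames]
    images[0].save(out_path, save_all=True, append_images=images[1:],
                   duration=int(1000 / fps), loop=0)


# Demo of the ordering only, using empty placeholder files with made-up names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("step-10.png", "step-2.png", "done.png"):
        (Path(tmp) / name).touch()
    ordered_names = [p.name for p in collect_frames(tmp)]
print(ordered_names)  # ['step-2.png', 'step-10.png']
```

The default of 20 frames per second mirrors the `gif_fps` value in the parameter table. With real output, `frames_to_gif(collect_frames(result_folder), "progress.gif")` would write the animation, where `result_folder` is the folder created by the run.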

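Limitations

The biggest problems with local deployment are that it is slow and demands fairly capable hardware.

On my machine, running with the default parameters raises an error: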

【error】RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

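The cause is insufficient GPU memory. It only ran successfully after I reduced the size of the generated image.
So for a group bot, which needs to respond promptly, local drawing is not a good fit unless strong hardware is available.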
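
The manual "reduce the size and try again" step can be automated with a simple fallback loop. This is only a sketch: `render_with_fallback` is a hypothetical helper, the candidate sizes are arbitrary, and the renderer used here is a stand-in that simulates the memory error so the snippet runs without a GPU.

```python
def render_with_fallback(render, sizes):
    """Try each [width, height] in order until one renders without a RuntimeError.

    `render` is any callable taking a width_height list.
    Returns a (size, result) tuple for the first size that succeeds.
    """
    last_error = None
    for size in sizes:
        try:
            return size, render(size)
        except RuntimeError as error:  # e.g. the cuDNN error shown above
            print("failed at", size, ":", error)
            last_error = error
    raise RuntimeError("all candidate sizes failed") from last_error


# Stand-in renderer: pretends that anything wider than 512 runs out of memory.
def fake_render(width_height):
    if width_height[0] > 512:
        raise RuntimeError("Unable to find a valid cuDNN algorithm to run convolution")
    return "image at %dx%d" % tuple(width_height)


size, result = render_with_fallback(fake_render, [[1280, 768], [640, 384], [400, 200]])
print(size, result)
```

With discoart, the renderer could be a small wrapper such as `lambda wh: create(text_prompts=prompt, width_height=wh)`, where `prompt` is whatever text the bot received. Each failed attempt still costs time, so this does not change the conclusion about responsiveness.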
