Writing Code with AI -- Testing GitHub Copilot

Posted by 织网者Eric



(Screenshot: the GitHub Copilot website)

To use Copilot, you first need VS Code with the Copilot extension installed. Even then, you can't use it right away: you also have to apply for a technical-preview account. A few days ago I finally received the acceptance email, so today, being home, I opened VS Code to try it out.

Scraping an image

The first experiment: scrape the CSDN logo.

I created a new file and started typing a comment. By the time I had typed "retri", Copilot already knew what I wanted and completed the whole sentence. Impressive!


I pressed Tab and the line was completed.

Next I typed "import", and Copilot was already suggesting that I import urllib. Impressive!

After just the three letters "def", it suggested the entire function. Impressive!


Then I typed "if":

Full of anticipation, I ran the program. It crashed:

AttributeError: module 'urllib' has no attribute 'urlopen'
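The cause is that `urllib.urlopen` is the Python 2 API; in Python 3, `urlopen` lives in the `urllib.request` submodule, so Copilot had apparently produced Python 2-style code. A minimal Python 3 fix (my own sketch, not Copilot's output) looks like this:

```python
# In Python 3, urlopen moved into urllib.request;
# a bare `import urllib` no longer exposes it.
from urllib.request import urlopen

def fetch(url):
    # Return the raw bytes of the response body.
    with urlopen(url) as resp:
        return resp.read()
```

Note that `urlopen` also understands `data:` URLs, which makes the function easy to try without hitting the network.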

I'm more familiar with the requests module anyway, so I changed the first line to `import requests` and regenerated the code. The result:

# find logo in csdn.net
# retrieve the logo and save it in the same directory

import requests
from bs4 import BeautifulSoup
import os

def get_logo(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')
    logo = soup.find('link', rel='shortcut icon')
    if logo:
        logo_url = logo['href']
        r = requests.get(logo_url)
        if r.status_code == 200:
            with open(os.path.basename(logo_url), 'wb') as f:
                f.write(r.content)
                print('logo saved')
        else:
            print('logo not found')
    else:
        print('logo not found')
        

if __name__ == "__main__":
    get_logo('http://www.csdn.net')

When I ran it, the downloaded image turned out to be the favicon, not the logo 😦 (which makes sense: the `rel='shortcut icon'` link is, by definition, the site's favicon).

Fine, fine. If Copilot can't find the logo, I'll find it myself.

So I changed the comment to:

# download and save https://img-home.csdnimg.cn/images/20201124032511.png
# to the same folder

The generated code:

# download and save https://img-home.csdnimg.cn/images/20201124032511.png
# to the same folder

import requests
from bs4 import BeautifulSoup
import os

def download():
    url = 'https://img-home.csdnimg.cn/images/20201124032511.png'
    root = './img'
    path = root + url.split('/')[-1]
    try:
        if not os.path.exists(root):
            os.mkdir(root)
        if not os.path.exists(path):
            r = requests.get(url)
            with open(path, 'wb') as f:
                f.write(r.content)
                f.close()
                print('file saved')
        else:
            print('file already exists')
    except:
        print('download failed')


if __name__ == '__main__':
    download()
    

This time it finally downloaded the CSDN logo.

I figured the earlier failure might be because CSDN uses anti-scraping tricks (the logo image isn't even named "logo"), which is why it couldn't be found. That gave me an idea: why not try to scrape Xiaomi's famous 2-million-yuan logo?

Given the comment, the generated code was:

# find logo from https://www.mi.com/index.html
# and download it to local

import os
import requests
from bs4 import BeautifulSoup

def download():
    url = 'https://www.mi.com/index.html'
    r = requests.get(url)
    r.raise_for_status()
    soup = BeautifulSoup(r.text, 'html.parser')
    img = soup.find('img', class_='logo')
    print(img)
    img_url = img['src']
    img_name = img['alt']
    img_path = os.path.join('.', img_name)
    if not os.path.exists(img_path):
        with open(img_path, 'wb') as f:
            f.write(requests.get(img_url).content)
            print('Downloading %s' % img_name)
    else:
        print('%s already exists' % img_name)


if __name__ == '__main__':
    download()


Result: logo not found. I opened the Xiaomi site and stared at it for a long time without spotting the logo's URL myself. No wonder Copilot couldn't find it either. Apparently a 2-million-yuan logo isn't something a mere mortal can just grab!
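A likely explanation: mi.com renders its page with JavaScript and applies the logo via CSS, so the static HTML that requests sees contains no `<img class="logo">` at all. BeautifulSoup's `find` then returns `None`, and the generated code's next line, `img['src']`, blows up on it. The toy snippet below (my own illustration, not Xiaomi's actual markup) shows the failure mode:

```python
from bs4 import BeautifulSoup

# Toy HTML standing in for a JS/CSS-rendered page: the logo is a
# background image on a <div>, so there is no <img class="logo">.
html = '<html><body><div class="header-logo"></div></body></html>'
soup = BeautifulSoup(html, 'html.parser')

img = soup.find('img', class_='logo')
print(img)  # None -- so img['src'] would raise TypeError
```

Copilot's code never guards against the `None` case, which is a bug worth remembering when accepting its suggestions.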

Pandas filtering

For the second example, I wanted a filter program: read two CSV files, subtract the second from the first, and save the result as a third.

The generated code:

# read two csv files using pandas, 
# subtract the data in the second file from the first file,
# and write the result to a new csv file.

import pandas as pd
import csv
import sys

def subtract():
    # read the two csv files
    df1 = pd.read_csv(sys.argv[1])
    df2 = pd.read_csv(sys.argv[2])
    # subtract the second file from the first file
    df3 = df1 - df2
    # write the result to a new csv file
    df3.to_csv(sys.argv[3], index=False)

if __name__ == '__main__':
    subtract()
    

It failed at runtime:

TypeError: unsupported operand type(s) for -: 'str' and 'str'
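The error is expected: `df1 - df2` performs element-wise, label-aligned subtraction, which works for numeric columns but raises TypeError the moment a column holds strings. A minimal illustration with toy data (not my actual CSVs):

```python
import pandas as pd

# Numeric columns subtract element-wise just fine.
num1 = pd.DataFrame({'x': [3, 5]})
num2 = pd.DataFrame({'x': [1, 2]})
print(num1 - num2)  # x column becomes [2, 3]

# String columns have no '-' operator, reproducing the error above.
txt1 = pd.DataFrame({'x': ['a', 'b']})
txt2 = pd.DataFrame({'x': ['c', 'd']})
try:
    txt1 - txt2
except TypeError as e:
    print('TypeError:', e)
```

So the comment I wrote ("subtract the data") was ambiguous; Copilot took it literally as arithmetic.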

So I revised the comment and regenerated:

# read two csv files, a.csv and b.csv
# remove every row in a.csv that appears in b.csv
# write the result to a new csv file.

import os
import csv
import pandas as pd

def main():
    # get the current working directory
    cwd = os.getcwd()
    # get the path to the files
    a_path = os.path.join(cwd, "a.csv")
    b_path = os.path.join(cwd, "b.csv")
    # read the files into pandas dataframes
    a_df = pd.read_csv(a_path)
    b_df = pd.read_csv(b_path)
    # remove rows from a_df that appear in b_df
    a_df = a_df[~a_df.index.isin(b_df.index)]
    # write the result to a new csv file
    a_df.to_csv("a_minus_b.csv", index=False)

if __name__ == "__main__":
    main()

This time it actually filtered by index, which is interesting, but still not what I wanted. Forget it; I'll go look on Stack Overflow.
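The index-based version is wrong because `a_df.index.isin(b_df.index)` compares row positions (0, 1, 2, ...), not row contents, so it simply drops the first `len(b_df)` rows of a.csv. What the comment asked for is an anti-join on content. One common way to sketch it is a left merge with `indicator=True` (the column names below are made up for the example):

```python
import pandas as pd

# Toy stand-ins for a.csv and b.csv.
a = pd.DataFrame({'id': [1, 2, 3, 4], 'v': ['w', 'x', 'y', 'z']})
b = pd.DataFrame({'id': [2, 4], 'v': ['x', 'z']})

# Left merge on all shared columns; rows found only in `a`
# are tagged 'left_only' in the _merge indicator column.
merged = a.merge(b, how='left', indicator=True)
result = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')
print(result)  # keeps rows (1, 'w') and (3, 'y')
```

This removes rows of `a` whose full contents appear in `b`, regardless of where they sit in either file.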

LeetCode test

I tested it with LeetCode problem 14. The result:

# 14. Longest Common Prefix
# Easy

# Write a function to find the longest common prefix string amongst an array of strings.

# If there is no common prefix, return an empty string "".

 

# Example 1:

# Input: strs = ["flower","flow","flight"]
# Output: "fl"
# Example 2:

# Input: strs = ["dog","racecar","car"]
# Output: ""
# Explanation: There is no common prefix among the input strings.
 

# Constraints:

# 1 <= strs.length <= 200
# 0 <= strs[i].length <= 200
# strs[i] consists of only lower-case English letters.

class Solution(object):
    def longestCommonPrefix(self, strs):
        """
        :type strs: List[str]
        :rtype: str
        """
        # everything above was copy-pasted from
        # https://leetcode.com/problems/longest-common-prefix/
        # everything below was generated by Copilot
        if not strs:
            return ""
        if len(strs) == 1:
            return strs[0]
        min_len = min([len(s) for s in strs])
        for i in range(min_len):
            for j in range(len(strs)):
                if strs[j][i] != strs[0][i]:
                    return strs[0][:i]
        return strs[0][:min_len]

if __name__ == "__main__":
    a = Solution()
    print(a.longestCommonPrefix(["flower","flow","flight"]))
    print(a.longestCommonPrefix(["dog","racecar","car"]))
    print(a.longestCommonPrefix(["a","b","c"]))
    print(a.longestCommonPrefix(["a"]))
    print(a.longestCommonPrefix([""]))
    print(a.longestCommonPrefix([]))

Output:

fl

a

It looks very impressive: it can even solve LeetCode problems. But I strongly suspect Copilot was trained on LeetCode solutions, in which case solving LeetCode problems is no surprise.
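For comparison, Copilot's character-by-character scan is correct but verbose. A more compact version (my own, not Copilot's) leans on `zip`, which stops at the shortest string, and `set`, which detects the first position where the strings disagree:

```python
def longest_common_prefix(strs):
    # zip(*strs) yields tuples of the i-th characters across all strings,
    # truncated at the shortest string; the common prefix ends at the
    # first position holding more than one distinct character.
    prefix = []
    for chars in zip(*strs):
        if len(set(chars)) != 1:
            break
        prefix.append(chars[0])
    return ''.join(prefix)
```

An empty input list also works: `zip(*[])` yields nothing, so the function returns `""` without a special case.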

Machine learning code

Same procedure: I wrote the comment, and Copilot produced the code:

# read iris datasets from sklearn.datasets
# predict the class of the samples in the test set
# return the predicted classes
# print accuracy, f1 score, recall, precision


from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, f1_score, recall_score, precision_score

def predict(X_train, y_train, X_test, y_test, k):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    print("Accuracy: {:.2f}".format(accuracy_score(y_test, y_pred)))
    print("F1 score: {:.2f}".format(f1_score(y_test, y_pred)))
    print("Recall: {:.2f}".format(recall_score(y_test, y_pred)))
    print("Precision: {:.2f}".format(precision_score(y_test, y_pred)))

def main():
    iris = load_iris()
    X = iris.data
    y = iris.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    predict(X_train, y_train, X_test, y_test, 5)


if __name__ == "__main__":
    main()

It errored again:

ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].

I modified the predict function as follows:

def predict(X_train, y_train, X_test, y_test, k):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    print("Accuracy: {:.2f}".format(accuracy_score(y_test, y_pred)))
    print("F1 score: {:.2f}".format(f1_score(y_test, y_pred, average='macro')))
    print("Recall: {:.2f}".format(recall_score(y_test, y_pred, average='macro')))
    print("Precision: {:.2f}".format(precision_score(y_test, y_pred, average='macro')))

Another run, and it passed.
Despite the mistakes, I still think this is pretty good: it saves me a lot of typing, and the few errors are quick to fix.

Summary

To get good results from Copilot, you need to give it very explicit instructions. For example, to download an image, tell it the image's URL. Quite often, after fiddling for a while, you would have been better off just searching for code online: it can handle common problems, but not necessarily uncommon ones. Still, this is only the beginning, and I believe Copilot will keep getting more accurate. When that day comes, programmers who use Copilot well won't need to work overtime, and those who don't will be working overtime every day.

Even with Copilot, you still need to understand the code yourself. Take the pandas filter above: the code is actually wrong, but it happened to pass a test. If you don't understand why it works and ship it straight to production, the consequences are easy to imagine.

During testing I also found its speed very unstable; it often hung for a long time with no response, possibly due to network issues. I hope Copilot eventually runs servers in China so the latency becomes predictable.
