Benchmarking cache dictionary leads to "Unexpected EOF while reading bytes"


【Posted】: 2021-02-03 23:24:10 【Question】:

I have installed ClickHouse version 20.8.3.18 and python3 on a VM to stress-test cache dictionaries. After querying a certain number of entries with clickhouse_driver, I get the error

Unexpected EOF while reading bytes

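Is this caused by a bug in the driver or Python, or by the cache on the system being maxed out? For example, it happens on a machine with 32 GB of RAM and a 256 GB SSD for a file of 203 columns and 10000 rows; a CSV file of roughly 66 MB seems small for such an error. The query I am running is: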

SELECT  
    dictGet('CacheDictionary', 'date', toUInt64(number)) AS date, 
    SUM(dictGet('CacheDictionary', 'filterColumn', toUInt64(number))) AS val, 
    AVG(dictGet('CacheDictionary', 'filterColumn', toUInt64(number))) AS avg 
FROM numbers(1, 10000) 
GROUP BY date
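To find the entry count at which the error first appears, the row count in this query can be made a parameter. The following is a minimal sketch, assuming a hypothetical helper `build_query` (it is not part of the original code, and the author's own `stringGen` is not shown) that reproduces the query above for any number of rows:

```python
def build_query(dictionary, attribute, rows):
    """Return the aggregate dictGet query above for keys 1..rows.

    build_query is a hypothetical helper for illustration; the names
    passed in must match the dictionary definition on the server.
    """
    if rows < 1:
        raise ValueError('rows must be at least 1')
    key = 'toUInt64(number)'
    value = "dictGet('{d}', '{a}', {k})".format(d=dictionary, a=attribute, k=key)
    return (
        "SELECT dictGet('{d}', 'date', {k}) AS date, "
        "SUM({v}) AS val, AVG({v}) AS avg "
        "FROM numbers(1, {n}) GROUP BY date"
    ).format(d=dictionary, k=key, v=value, n=rows)


# Doubling row counts to try in turn until the error appears.
sizes = [2 ** p for p in range(0, 14)]
queries = [build_query('CacheDictionary', 'filterColumn', n) for n in sizes]
```

Each string in `queries` can then be passed to `client.execute` in order; the first size that fails brackets the threshold between it and the previous size.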

A sample entry from the CSV file is:

20000,2021-02-05,6867,0.5314826651111791,OA9SMRN54LC3MTDW,D6S8AYXZ3JVSHPCY,12UQV1JR87MT00EP,3WBT23MA2QN6URA7,YGKJR5577BP6S3AD,2T90WPW1REOZA0L9,JQG8Z6FXXIX2788M,OAOVV1YX3A6HKQV8,FISBMOAHEXHAAKEY,XAULW5F90T3VEMUL,RAAZ5TM5XL7GRC1F,B16JEGDHXUXFI2R9,DETSZ7BR45CRAIA7,Z2X53PAQYCSBHPU3,SRISC0ZLWXC2DP34,KO2M3044JX5JCB74,ML776REFIX3Z1L78,ND6PXBOR135SWFSB,ZF4K45N2AIGFAK0L,RFE3EHCKC5EPYE2V,NJKM5T8UUD5NRDPX,O57IQW0670LP00I9,F0EBZ3BXHPETCFSY,RUZ7VH2IM0DIZ4UC,08BP467WG7ROEHTJ,9LSTNLUA240T2K4D,5L4PIRKMK746QW5Q,2VX3SER8ULU93NZG,Z0MZ9C3TTPR6WFDV,KB32XWCR67AWGSIB,PDM8QJ34X4EOTVN1,P7TUVP8Q1YF9S746,YDFDBCG6S2EXYPNW,55RN0F4UMGF3ABQZ,RRF895J8LQSLI48U,54OQWCJODIEQLRQF,D5ZJPGAG7CCO4LWA,UQDWEXPI184UUJQD,3QF6QAS32ITRL8JH,FPQ324RO04LNVAMO,ZJ6QCWNQCBQOE7F5,6OWVEVWHNSZILC6E,GIUD29OIFF3LUCCX,VGBJHKW32BUNUSDH,908TDRODVZIIC5O8,UCIU38BXEREJMO4M,5LKJ23ER4CKUZ88J,A1GBKPPM10L8X5RM,BB3SAVWF3CNBDXHO,279MIC1OXTDS2PFP,J6UVFJE8RGFK4LDN,3CE12GT27GX0WVWU,PNNTRLDFVJQ0TCRK,MI7XOHWUQX3W938H,LKZPV4K0BA6OE3R0,YJMLI82UBLSZWP7U,JORNKD1MSVECXBRF,CO5KKJIL1FHEYA11,GXVXWDOI538WCLC0,OPODB2R2ITSX0E6J,3VE7SOJZL3DKIES7,5LPXB17GJ94S86HL,UQ0DZVUDMBD39LC3,KSSVOBUKMZC7T89M,P6YL0WW22NOM5A36,RA46SZF4ZLO5YWUM,TUTMJ34X4040USXX,09HPKJAD58P3FVMP,DM0NJVFYKR2653HH,HP869NM4Y2EBE3ND,RVKP40RPBOPB6RPQ,WI3QXYA5XIWJUFUK,770L6U5KAEPKKJC1,2H0XNUDM41QBAZWB,8AWJ2Y7RB9F2WTT0,Y6T3PIPLU3FCBZCU,CY8SCO15RNUWQU2B,DRC88XH21J9ADT6Z,MLZ2JN7F8MXVBHBI,2YSUVHRL4V0EVHXF,Y0U12EBQSEVE6W6X,A6RRJY191S0JOXJH,4F12P4K0SJ6EDKSD,THCRJ2ZEXGM1RUM4,PF0OUAULUNIW0W9X,EK1249WXC0C2KKY8,11WEDAAJL7BL4T4U,4K8OP1WXSN1MIXPF,8D0WNN1672A6WK07,5RLYH7K00ZSR1LL2,EKEXBG87U1X6UOLL,YWK3V1F7MTAF9T19,XZ8ZF0XO5V8TCBPS,A3RX8X8A8I11Z8X3,77P2Q5WRSTL4ERAI,00BGNPDYFSVG5F81,5KTUM76C42VTP4I7,TA933GZZN8OQ20QJ,612WNQ74RDHMBWX3,D41HNOBPX11GFYWO,OGR4A0EPCSS00XL6,QIOH165Y5JGKJMFC,TF2R9TFC5TJN2PER,TYNXWI46H7I83O77,JMD5DOEV4U628SDK,D7ECJH43FEC77UCJ,FKA9AT5J20QI3MQP,7QSU0I8VRRLUMD7R,6OJ1O2XI2QJXP6W2,UD2QVJXNUFRCAO43,GS3TZUW8U6Z8EWWQ,QD79GBSO6D6GCAZ1,GQ5TUY2FMJSNMTRK,OGOYL2PD64E2DOOQ,Q733OU5P7J7SAFS1,GBS7
MV5QOMQ4E89N,SB8MIQ1P37HMQZBJ,Z6G96BM7FL4150H3,05PS81HW528971RM,6F3KFLYT0345GI43,G65CDWEORNH3OUCY,12F43L99AZ84PDWR,GQQVWMTMS471WAWD,F1DFWRJ1F9M9MUTT,1M734H07IQAW49Q3,OPSRG5J7370227XE,BIPNR22KFF71MKQN,PV7DWGCQF5551FKT,YPGQVGUP37MRJY2B,RILKP96QV69WBW2D,4RXDCJURAVCQEGLX,XGIPC0AK1K0I6KDP,HMSE306L5NAK62LC,YAZHMS2UHGMWIB44,RZCAVUM45YTNV23T,3B7K07XPRTE8OMW1,FTP48ED5DQ4K3DM8,WW419RRJ2WU1F15L,85FWD49J0ARSUGI9,4U4768ANPCJ46K5P,EJ24BNUA6OZMUDEL,6Z27W6BN36GO8QWU,5AMZ4UU819GSI454,KMNIEJ2V5PI83KGP,APT4CYG8M5FM0BSW,IME5VRP08W468DZE,6BT4W0ZAW6C7993L,DRD6Q4P8BZVDG37U,2R1OEWQFV5J597AF,CKS41A6PXKVYICAG,OQYZ9UOQRVS3LLTF,JA3PZSAXFCJVZVLB,J23BP73T6GNC0Z08,GWOJXMXDVHCRE51Y,I826DE6KEVQK2PFC,6FF5LWM61KCM4C9K,P16P80EIX2X87OZO,O5GEOEO72CDV4GAX,UMKFUKMV6U0L5PM5,U64YI4G53LR3SC6J,CLML8KPAL697KYYJ,LMH2W0STEJ5H2J2S,AL61EP61ZR3GOPN3,Z3AEUMZSX4MQJ6M6,IS5RFEWIJ8XHYNK0,TNE1BS4JYN280PIF,67IER2YS6N2XHEW1,63P3O4X42T2INRT4,XYV043108XRK7Y4S,RW0HN600K0GQXF4Y,BZ1ZE6IBB4B72A81,QHAINYDIZX7838YI,7FFCKG3XJSZ2DIHJ,DF6C1OMPC1ETFPDZ,1EJ3EW0TXKVBC88R,WX6HG8FD021VFZ2S,W4OB9NZRODSTM96M,6GDA3L5CLBPVTPWQ,1Y4U7BL9UHPBJVIX,Y31SUUZ0JF2AXZWO,PL2I18PA0SVXG85E,TEY1HC97QMZ5YXMI,T49EVLLM43AI4OG3,0SDNMLWY85Z7NENX,4446QKGO8UL6RERT,IMEAM22I51GT4ZHY,HUCLC93NIUG0C5R0,5VPBRUUVMBXP7HJY,XCOOPM3JU5VHQ94T,3LRZGAF451G9XDIN,Y6VIN1E31NYRLA2N,RAROO2EM5Q9NJRG9,NUQ2QJ9M6T5KRCHK,WQKKQK8UBB30GRWI,20SOMMKD08FYAENW,1G9K4UFWAI8Q7Z8K,XLG898A4MQXZHVYR,FPT67A7VDLVZEWYH,6DQ6417FF07FORXZ,10RUAPY5KGAYBZZD

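I have posted part of the code, which tries to find the maximum number of cache items that can be stored, together with the query executed for each item. In selectBenchmark, string corresponds to the query above. Each parameter is fairly self-explanatory (xmlFile is the dictionary created in /etc/lib/clickhouse-server).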

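import csv
import os
import uuid

import numpy as np
from clickhouse_driver import Client

# createCSV, clickhouseDictionary, stringGen and transformedMLE are helper
# functions from the full script that are not shown in this excerpt.


def cacheMaxItems(csvRead, xmlFile, benchmarkType, columnStepSize, rowStepSize):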
    maxCache = []
    os.system('rm -f ' + csvRead)
    os.system('bash /root/restartCH.sh')
    for j in range(1, 13):
        outputCSV = '/root/results' + benchmarkType + '/cacheResults' + str(j*columnStepSize) + '.csv'  
        with open(outputCSV, 'w') as fp:
            wr = csv.writer(fp)  
            wr.writerow([benchmarkType + ': Number of rows', 'Loading time', 'Mean', 'Variance', 'Skewness', 'Number of Columns: ' + str(j*columnStepSize)])
        for i in range(1, 10000):
            if i%5 == 0:
                os.system('bash /root/restartCH.sh')
            createCSV(10000, j*columnStepSize, csvRead)
            try:
                clickhouseDictionary(rowStepSize*i*j*columnStepSize, j*columnStepSize, xmlFile, csvRead, 'Cache')
                # All three benchmark types forward their own name unchanged;
                # an unknown type is reported explicitly instead of leaving
                # `results` undefined.
                if benchmarkType not in ('Random', 'Consecutive', 'CPU'):
                    raise ValueError('Unknown benchmarkType: ' + benchmarkType)
                results = selectBenchmark(i*rowStepSize, j*columnStepSize, benchmarkType, 'Cache')
                results.insert(0, i*rowStepSize)
                with open(outputCSV, 'a') as fp:
                    wr = csv.writer(fp)  
                    wr.writerow(results)

                print('Successfully loaded and queried cache of size ' + str(rowStepSize*i*j*columnStepSize) + '.')
            except Exception as ex:
                print(ex)
                os.system('rm -f ' + csvRead)
                os.system('bash /root/restartCH.sh')
                maxCache.append([j*columnStepSize, (i-1)*rowStepSize])
                print(maxCache)
                break
    return maxCache


def selectBenchmark(numberOfRows, numberOfColumns, benchmarkType, dictType):
    client = Client('localhost', port=9000, database='system')
    client.execute('SYSTEM RELOAD DICTIONARY ' + dictType + 'Dictionary')
    loadingTime = client.last_query.elapsed
    client.execute('SELECT dictGet(\'' + dictType + 'Dictionary\', \'random0\', toUInt64(1))', query_id=str(uuid.uuid4()))
    loadingTime += client.last_query.elapsed
    loop = True
    j = 0  # number of completed measurement rounds
    while loop:
        times = []
        for i in range(0, 31):
            query_id = str(uuid.uuid4())
            string = stringGen(numberOfRows, numberOfColumns, benchmarkType, dictType)
            client.execute(string, query_id = query_id)
            times.append(client.last_query.elapsed)  
        if max(times) > loadingTime:
            loadingTime = max(times)
        stats = transformedMLE(times)
        redactedTimes = [x for x in times if (stats[0]-3*np.sqrt(stats[1])) < x < (stats[0]+3*np.sqrt(stats[1]))]
        if len(times) - len(redactedTimes) <= 3:
            loop = False
        elif j > 15:
            print('High variance query')
            loop = False
        j+=1
    result = transformedMLE(redactedTimes)
    loadingTime = loadingTime - result[0]
    result.insert(0, loadingTime)
    client.disconnect()
    return result

The restartCH.sh file is

service clickhouse-server forcerestart

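because a cache overflow often blocks the restart command. The server error log shows no output, which suggests the problem lies with the Python driver, possibly in reading the large amount of data returned. I also get "Killed" as the Python output, which also points to a cache problem; that is to be expected, since I am benchmarking cache dictionaries.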
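Two of these observations can be checked from the Python side. The `except` branch in `cacheMaxItems` prints only the exception message, so the call stack is lost. A bare "Killed" is what a shell prints when a process receives SIGKILL, which on Linux is commonly sent by the kernel's out-of-memory killer (an assumption here, which `dmesg` can confirm or rule out). The following minimal sketch uses only the standard library; `run_logged`, `peak_rss_kb` and `failing_step` are hypothetical names, and `failing_step` merely stands in for a driver call:

```python
import resource
import traceback


def peak_rss_kb():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def run_logged(step, *args):
    """Call step(*args); return (result, None) or (None, traceback text)."""
    try:
        return step(*args), None
    except Exception:
        # format_exc keeps the whole call stack, not just the message.
        return None, traceback.format_exc()


def failing_step():
    # Stand-in for a driver call that fails while reading the response.
    raise EOFError('Unexpected EOF while reading bytes')


result, trace = run_logged(failing_step)
print(trace)
print('peak RSS so far:', peak_rss_kb(), 'kB')
```

Logging `peak_rss_kb()` after each benchmark iteration would show whether the Python process itself grows towards the machine's memory limit before it is killed.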

【Comments】:

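Can you provide the call stack of your Python error? Also check the related issues: github.com/mymarilyn/clickhouse-driver/…

@vladimir I have edited the post accordingly, but I found no similar issue or answer on GitHub. One issue was resolved through the logs, whereas my error does not appear in the logs; the other concerns asynchronous queries.

【Answer 1】: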

"Unexpected EOF while reading bytes" is a Python driver error.

Check clickhouse-server.log for the real error.
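Because the benchmark in the question already passes a `query_id` to its `client.execute` calls, the server log can be searched for that identifier. The following is a minimal sketch, assuming the usual text log layout in which each line carries the query ID in braces and the level in angle brackets; `find_log_lines` is a hypothetical helper, and the sample lines are synthetic placeholders, not real server output:

```python
import re

LEVELS = ('Error', 'Fatal', 'Warning')


def find_log_lines(lines, query_id=None, levels=LEVELS):
    """Yield log lines at the given levels, optionally for one query ID."""
    level_re = re.compile(r'<(%s)>' % '|'.join(map(re.escape, levels)))
    for line in lines:
        if query_id is not None and '{%s}' % query_id not in line:
            continue
        if level_re.search(line):
            yield line.rstrip('\n')


# Synthetic lines in the assumed layout, for illustration only.
sample = [
    '2021.02.03 23:24:09.000000 [ 1 ] {id-1} <Debug> placeholder message\n',
    '2021.02.03 23:24:10.000000 [ 1 ] {id-1} <Error> placeholder message\n',
    '2021.02.03 23:24:10.000000 [ 2 ] {id-2} <Error> placeholder message\n',
]
hits = list(find_log_lines(sample, query_id='id-1'))
for hit in hits:
    print(hit)
```

For a real check, an open file object for the server log (by default under `/var/log/clickhouse-server/`, which is an assumption about the installation) can be passed in place of `sample`, together with the `query_id` of the query that failed.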

20.8.3.18 is not supported; please upgrade to 20.8.12.2.

【Discussion】:

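On CentOS 7.6 this is the latest package (at least according to sudo yum update). No error corresponding to the EOF error shows up in clickhouse-server.log, and nothing is logged at the time the EOF error occurs.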
