为什么hashlib比sha256的其他代码更快?如何让我的代码接近hashlib性能?

Posted

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了为什么hashlib比sha256的其他代码更快?如何让我的代码接近hashlib性能?相关的知识,希望对你有一定的参考价值。

Bellow是一个代码,用于将hashlib.sha256()与我的sha256_test()函数进行比较,该函数在哈希速率性能方面用原始python编写。

from time import time_ns as time
import hashlib

def pad512(bytes_):
    L       = len(bytes_)*8
    K       = 512 - ((L + 1) % 512)
    padding = (1 << K) | L
    return bytes_ + padding.to_bytes((K + 1)//8, 'big')

def mpars (M):
    chunks = []
    while M:
        chunks.append(M[:64])
        M = M[64:]
    return chunks

def sha256_transform(H, Kt, W):
    a, b, c, d, e, f, g, h = H
    # Step 1: Looping
    for t in range(0, 64):
        T1 = h + g1(e) + Ch(e, f, g) + Kt[t] + W[t]
        T2 = (g0(a) + Maj(a, b, c))
        h = g
        g = f
        f = e
        e = (d + T1) & 0xffffffff
        d = c
        c = b
        b = a
        a = (T1 + T2) & 0xffffffff
    # Step 2: Updating Hashes
    H[0] = (a + H[0]) & 0xffffffff
    H[1] = (b + H[1]) & 0xffffffff
    H[2] = (c + H[2]) & 0xffffffff
    H[3] = (d + H[3]) & 0xffffffff
    H[4] = (e + H[4]) & 0xffffffff
    H[5] = (f + H[5]) & 0xffffffff
    H[6] = (g + H[6]) & 0xffffffff
    H[7] = (h + H[7]) & 0xffffffff
    return H

Ch   = lambda x, y, z: (z ^ (x & (y ^ z)))
##    """The x input chooses if the output is from y or z.
##    Ch(x,y,z)=(x∧y)⊕(¬x∧z)"""
Maj  = lambda x, y, z: (((x | y) & z) | (x & y))
##    """The result is set according to the majority of the 3 inputs.
##    Maj(x, y,z) = (x ∧ y) ⊕ (x ∧ z) ⊕ ( y ∧ z)"""

ROTR = lambda x, y: (((x & 0xffffffff) >> (y & 31)) | (x << (32 - (y & 31)))) & 0xffffffff
SHR  = lambda x, n: (x & 0xffffffff) >> n

s0   = lambda x: (ROTR(x, 7) ^ ROTR(x, 18) ^ SHR(x, 3))
s1   = lambda x: (ROTR(x, 17) ^ ROTR(x, 19) ^ SHR(x, 10))

g0   = lambda x: (ROTR(x, 2) ^ ROTR(x, 13) ^ ROTR(x, 22))
g1   = lambda x: (ROTR(x, 6) ^ ROTR(x, 11) ^ ROTR(x, 25))

def sha256_test (bytes_):
    #Parameters
    initHash = [
                0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
                0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19,
                ]
    Kt = [
        0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
        0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
        0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
        0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
        0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
        0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
        0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
        0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
        ]

    padM   = pad512(bytes_)
    chunks = mpars(padM)
    # Preparing Initaial Hashes
    H = initHash    
    # Starting the Main Loop
    for chunk in chunks:
        W = []
        # Step 1: Preparing Wt
        for t in range(0, 16):
            W.append((((((chunk[4*t] << 8) | chunk[4*t+1]) << 8) | chunk[4*t+2]) << 8) | chunk[4*t+3])
        for t in range(16, 64):
            W.append((s1(W[t-2]) + W[t-7] + s0(W[t-15]) + W[t-16]) & 0xffffffff)
        # Step 2: transform the hash
        H = sha256_transform(H, Kt, W)
        # Step 3: Give Out the digest
        Hash = b''
        for j in H:
            Hash += (j.to_bytes(4, byteorder='big'))

    return Hash

if __name__ == "__main__":

    k = 10000
    M = bytes.fromhex('00000000000000000001d2c45d09a2b4596323f926dcb240838fa3b47717bff6') #block #548867
    start = time()
    for i in range(0, k):
        o1 = sha256_test(sha256_test(M))
    end    = time()
    endtns1 = (end-start)/k
    endts1  = endtns1 * 1e-9
    print('@sha256_TESTs() Each iteration takes:  {} (ns) and {} (sec).'.format(endtns1, endts1))
    print('@sha256_TESTs() Calculated Hash power: {} (h/s)'.format(int(2/endts1)))

    start = time()
    for i in range(0, k):
        o2 = hashlib.sha256(hashlib.sha256(M).digest()).digest()
    end    = time()
    endtns2 = (end-start)/k
    endts2  = endtns2 * 1e-9
    print('@hashlib.sha256() Each iteration takes:  {} (ns) and {} (sec).'.format(endtns2, endts2))
    print('@hashlib.sha256() Calculated Hash power: {} (Kh/s)'.format(int(2/endts2/1024)))

    print('Outputs Match       : ', o1 == o2)
    print('hashlib is ~{} times faster'.format(int(endtns1/endtns2)))

在计算哈希率时,1 Kilo Hash被认为是1000哈希或1024哈希?!

如果我对计算哈希率是正确的,我得出结论,我的PC可以使用我自己的~900 (h/s)函数生成sha256_test()的哈希率,而hashlib.sha256()使用~300 Kh/s优于此值。

首先,我想了解hashlib卓越性能背后的机制。当我在hashlib.py中读取代码时,其中的代码不多,我无法理解如何计算哈希值。有可能看到hashlib.sha256()背后的代码吗?

其次,是否有可能改进我的代码,以便它可以接近300 (Kh/s)的性能?我已经阅读过有关Cython的内容,我只是不确定它能够改进这样一个算法的能力。

第三,技术上可能比python中的hashlib快吗?

答案

好好看看hashlib.py对你说实话不会太过帮助,但它可能会给你一个hint。你所做的是纯python代码,而hashlib依赖于C实现,并且可以轻松地围绕纯python运行。即你需要看看this。因此,如果你想接近这些数字,你需要研究cython,C,C ++或Rust。

以上是关于为什么hashlib比sha256的其他代码更快?如何让我的代码接近hashlib性能?的主要内容,如果未能解决你的问题,请参考以下文章

python基础整理笔记

hashlib模块

python 模块之hashlib

Python中hashlib模块的使用

hashlib模块

hashlib模块