Basic Algorithm Optimization: Fast Modular Multiplication

Posted 2022-08-04 mutourend


1. Introduction

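This post summarizes Yuval Domb's 2022 paper "Fast Modular Multiplication".

Modular multiplication is arguably the most computationally expensive arithmetic primitive in any cryptographic system. The paper proposes an efficient, hardware-friendly algorithm that, to the author's knowledge, outperforms the algorithms known so far.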

The standard modulo-prime multiplication problem in $\mathbb{F}_s$ is:
$$ r = a\cdot b \bmod s \tag{1} $$
where $a, b, r \in \mathbb{F}_s$, $s$ is prime, and standard $\mathbb{Z}$-algebra is used.
Equivalently:
$$ a\cdot b = l\cdot s + r \tag{2} $$
where $l \in \mathbb{Z}$ is chosen such that $0 \leq r < s$.
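As a quick sanity check of equations (1) and (2), here is a minimal Python sketch. The function name `quotient_and_remainder` and the numeric values are illustrative assumptions, not taken from the paper; it uses ordinary integer division as a reference, which is exactly the operation the paper wants to avoid in hardware.

```python
def quotient_and_remainder(a: int, b: int, s: int) -> tuple[int, int]:
    """Return (l, r) with a*b == l*s + r and 0 <= r < s, as in equation (2)."""
    return divmod(a * b, s)


# Illustrative values only (not from the paper): a small prime and two operands below it.
s = 97
a, b = 45, 88
l, r = quotient_and_remainder(a, b, s)

assert a * b == l * s + r   # equation (2)
assert 0 <= r < s
assert r == (a * b) % s     # equation (1)
```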

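The paper's main aim is an efficient, hardware-friendly way to compute (1).

All variables are represented in base $d$, with each element of $\mathbb{F}_s$ using $n$ digits, where:
$$ n = \left\lceil \log_d s \right\rceil \tag{3} $$

From here on, we simply set $d = 2$, so all elements are represented in binary.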
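The digit count in equation (3) can be computed without floating-point logarithms. The sketch below is a hypothetical helper (`num_digits` is not a name from the paper) and relies on the fact that, for $s \geq 2$, $\lceil \log_d s \rceil$ equals the number of base-$d$ digits of the largest element $s-1$.

```python
def num_digits(s: int, d: int = 2) -> int:
    """Digits needed in base d for every element 0..s-1, i.e. ceil(log_d s)."""
    if s < 2 or d < 2:
        raise ValueError("expects s >= 2 and d >= 2")
    n, largest = 0, s - 1
    while largest > 0:
        largest //= d   # strip one base-d digit per iteration
        n += 1
    return n


# For d = 2 this agrees with Python's built-in bit length of s - 1.
assert num_digits(97) == (97 - 1).bit_length()
```

For binary operands, `(s - 1).bit_length()` is therefore a convenient shortcut for $n$.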

Although the paper focuses on modulo-prime multiplication, the method generalizes to any computation $a \bmod s$ with $a < s^2$, where $s$ may be any value, prime or not.

2. Main Contributions

The paper shows how to combine:

the Barrett reduction algorithm (see Barrett's 1987 paper "Implementing the Rivest Shamir and Adleman Public Key Encryption Algorithm on a Standard Digital Signal Processor"),
good parameter selection, and
a simple bounding technique

to approximate the quotient $l$ to within a small constant error that is independent of $n$, however large $n$ is.

Surprisingly, the resulting reduction algorithm resembles Montgomery's modular multiplication algorithm (see Montgomery's 1985 paper "Modular Multiplication Without Trial Division"), but it requires no coordinate translation.

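The bounding technique can also reduce computational complexity further in specific scenarios of interest, at the cost of a larger constant error. This post does not cover that.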

3. Reduction Scheme

3.1 Assuming $l$ is approximately known

Assume $l$ is approximately known, with the approximation denoted $\hat{l}$, such that:
$$ l - \lambda \leq \hat{l} \leq l \tag{4} $$
where $\lambda = O(1)$ is a known constant.

If $\lambda = 0$, then clearly:
$$ ab[2n-1:0] - \hat{l}s[2n-1:0] = r[n-1:0] \tag{5} $$
where the values in square brackets denote bit locations and sizes.

Note that when $\lambda = 0$ the remainder $r$ is at most $n$ bits long, so the remaining most-significant (ms) bits of the right-hand side of (5) must be $0$.

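With simple bit manipulation, this can be written as a long addition:

$$ ab[2n-1:0] + \overline{\hat{l}s[2n-1:0]} + 1 = r[n-1:0] $$

where the overbar is the bit-inversion operator and the $1$ is the initial carry bit.
Looking closely at this long addition, however, only $ab[n-1:0]$ and $\hat{l}s[n-1:0]$ are needed to complete the computation, which saves nearly half of the work. The resulting adder is a fixed-width adder, that is, $n + n \rightarrow n$, meaning any overflow into the ms bits can be ignored. Equivalently, it is a fixed-width subtractor, that is, $n - n \rightarrow n$, whose result is interpreted as an unsigned integer.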
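The fixed-width adder described above can be sketched in Python as follows. The function names are hypothetical, and the exact quotient is supplied by integer division purely for illustration (the $\lambda = 0$ case); the point is only that the low $n$ bits of each product are enough to recover $r$.

```python
def sub_fixed_width(x: int, y: int, n: int) -> int:
    """n + n -> n adder computing x - y as x + ~y + 1, dropping any overflow."""
    mask = (1 << n) - 1
    y_inverted = ~y & mask                  # bit-inversion restricted to n bits
    return (x + y_inverted + 1) & mask      # carry-in of 1; overflow is ignored


def reduce_exact(a: int, b: int, s: int, l_hat: int) -> int:
    """Recover r from the n low bits of a*b and l_hat*s, assuming l_hat == l."""
    n = (s - 1).bit_length()                # n = ceil(log2 s)
    mask = (1 << n) - 1
    ab_low = (a * b) & mask                 # ab[n-1:0]
    ls_low = (l_hat * s) & mask             # (l_hat * s)[n-1:0]
    return sub_fixed_width(ab_low, ls_low, n)


# Illustrative values only: s = 97 gives n = 7.
a, b, s = 45, 88, 97
assert reduce_exact(a, b, s, (a * b) // s) == (a * b) % s
```

The result is correct because $r < s \leq 2^n$, so reducing the difference modulo $2^n$ loses nothing.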

Denote the multiplier that produces these products as $n \times n \rightarrow n_{\text{lsb}}$, where $n_{\text{lsb}}$ refers to the $n$ least-significant bits of the full product. Both $a\cdot b$ and $\hat{l}\cdot s$ can be generated by an $n \times n \rightarrow n_{\text{lsb}}$ multiplier.
Moreover, if $s$ is constant, $\hat{l}\cdot s$ can be generated by a constant $n \times n \rightarrow n_{\text{lsb}}$ multiplier.
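To make the $n \times n \rightarrow n_{\text{lsb}}$ multiplier concrete, here is a rough software model using truncated shift-and-add. The name `mul_lsb` and the shift-and-add structure are assumptions made for illustration; the paper does not prescribe a particular multiplier architecture in this section.

```python
def mul_lsb(x: int, y: int, n: int) -> int:
    """Return the n least-significant bits of x*y via truncated shift-and-add."""
    mask = (1 << n) - 1
    acc = 0
    for i in range(n):
        if (y >> i) & 1:
            # Bits of (x << i) at position n and above never affect the result,
            # so each partial product is truncated before accumulation.
            acc = (acc + ((x << i) & mask)) & mask
    return acc


# Illustrative values only: both products needed for the lambda = 0 reduction.
a, b, s = 45, 88, 97
n = (s - 1).bit_length()
l = (a * b) // s                            # exact quotient, for illustration
r = (mul_lsb(a, b, n) - mul_lsb(l, s, n)) & ((1 << n) - 1)
assert r == (a * b) % s
```

Roughly half of the partial-product bits of a full multiplier are never formed, which is where the saving over a full $n \times n \rightarrow 2n$ product comes from.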

When $\lambda \neq 0$:
$$ ab - \hat{l}s = r + \lambda s \tag{6} $$
The number of bits needed to represent the right-hand side of (5) is then:
$$ \left\lceil \log_2(r+\lambda s) \right\rceil \leq n + \left\lceil \log_2\frac{r+\lambda s}{s} \right\rceil \leq n + \left\lceil \log_2(1+\lambda) \right\rceil \tag{7} $$
So if $\lambda = 1$, only $1$ extra bit is needed.
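The bit-width bound can be exercised with a short Python sketch. Two things here are assumptions of the sketch rather than statements from the paper: the helper name `reduce_with_error`, and the final loop of conditional subtractions used to strip the leftover multiple of $s$. The width computation uses the identity $\lceil \log_2(1+\lambda) \rceil = $ `lam.bit_length()` for integer $\lambda \geq 0$.

```python
def reduce_with_error(a: int, b: int, s: int, l_hat: int, lam: int) -> int:
    """Recover a*b mod s when l - lam <= l_hat <= l, using n + extra bits."""
    n = (s - 1).bit_length()                # n = ceil(log2 s)
    extra = lam.bit_length()                # ceil(log2(1 + lam))
    mask = (1 << (n + extra)) - 1
    # Fixed-width subtraction on the low n + extra bits only.
    t = (((a * b) & mask) - ((l_hat * s) & mask)) & mask
    # Assumed correction step: t = r + k*s with 0 <= k <= lam.
    for _ in range(lam):
        if t >= s:
            t -= s
    return t


# Illustrative values only: the quotient estimate is one too small (lam = 1).
a, b, s = 45, 88, 97
l = (a * b) // s
assert reduce_with_error(a, b, s, l - 1, lam=1) == (a * b) % s
```

With `lam=1` the intermediate value `t` occupies at most $n+1$ bits, matching the bound above.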

3.2 Approximating $l$ with the Barrett reduction algorithm

The approximation of $l$ is obtained with Barrett's modular reduction algorithm.
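For orientation, the sketch below shows the textbook form of Barrett's quotient estimate, with a precomputed constant $m = \lfloor 2^{2n} / s \rfloor$ and a single shift by $2n$ bits. This is an assumption for illustration and not necessarily the parameter choice made in the paper, whose formula is not included in this post. With this choice and $ab < 2^{2n}$, the estimate satisfies $l - 1 \leq \hat{l} \leq l$, that is, $\lambda = 1$.

```python
def barrett_quotient(a: int, b: int, s: int) -> int:
    """Textbook Barrett estimate of l = floor(a*b / s), off by at most 1 (never too large)."""
    n = (s - 1).bit_length()                # n = ceil(log2 s), so a*b < 2**(2n)
    m = (1 << (2 * n)) // s                 # precomputed once per modulus
    return (a * b * m) >> (2 * n)           # multiply and shift replace the division


# Illustrative values only.
a, b, s = 45, 88, 97
l = (a * b) // s
l_hat = barrett_quotient(a, b, s)
assert l - 1 <= l_hat <= l
```

The only division is in computing `m`, which depends on $s$ alone and can be done ahead of time when the modulus is fixed.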
