How to perform bitwise operations on files in Linux?

Posted 2023-02-23

Asked: 2011-10-16 20:57:30

[Question]:

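I want to perform some bitwise operations on files in Linux (for example, XOR two files), but I don't know how to do it. Is there a command for this?

Any help would be appreciated.

[Comments on the question]:

Will you commit to accepting an answer?

Read a block, apply the operator to all the data in the block, write the block.

@unkulunkulu @phihag Yes, of course. I didn't know about that; I only just found out. Thanks ;)

[Answer 1]: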

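You can map the file with mmap, apply the bitwise operation to the mapped memory, and then unmap and close it.

Alternatively, reading a block into a buffer, applying the operation to the buffer, and writing the buffer back out works too.

Here is an example that inverts all bits (in C, not C++; everything except the error handling is identical):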
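The buffer-based alternative can be sketched roughly as follows. This is a minimal illustration in Python, not part of the original answer: the function name `invert_file_in_place` and the 4096-byte block size are assumptions chosen for the example, and bit inversion stands in for whatever operation is needed.

```python
import os
import tempfile


def invert_file_in_place(path, blocksize=4096):
    """Invert every bit of the file at `path`, one block at a time."""
    with open(path, 'r+b') as f:
        while True:
            pos = f.tell()            # where the current block starts
            block = f.read(blocksize)
            if not block:
                break                 # reached EOF
            f.seek(pos)               # go back and overwrite the block
            f.write(bytes(b ^ 0xFF for b in block))
            f.seek(pos + len(block))  # reposition before reading again


# Small demonstration on a temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x00\x0f\xff')
invert_file_in_place(tmp.name)
with open(tmp.name, 'rb') as f:
    result = f.read()
os.remove(tmp.name)
print(result)  # b'\xff\xf0\x00'
```

Only one block is held in memory at a time, so this works for files larger than RAM, at the cost of explicit seeking that the mmap approach avoids.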

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char* argv[]) {
    if (argc != 2) {
        printf("Usage: %s file\n", argv[0]);
        exit(1);
    }

    int fd = open(argv[1], O_RDWR);
    if (fd == -1) {
        perror("Error opening file for writing");
        exit(2);
    }

    struct stat st;
    if (fstat(fd, &st) == -1) {
        perror("Can't determine file size");
        exit(3);
    }

    /* Map the whole file; with MAP_SHARED, stores are written back to it. */
    char* file = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (file == MAP_FAILED) {
        perror("Can't map file");
        exit(4);
    }

    for (ssize_t i = 0;i < st.st_size;i++) {
        /* Binary operation goes here.
        If speed is an issue, you may want to do it on a 32 or 64 bit value at
        once, and handle any remaining bytes in special code. */
        file[i] = ~file[i];
    }

    munmap(file, st.st_size);
    close(fd);
    return 0;
}
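The comment inside the loop suggests operating on 32- or 64-bit values and handling leftover bytes separately. The sketch below shows that split in Python purely for illustration; the name `invert_wordwise` and the choice of 8-byte words are assumptions, and the speed benefit applies to the C version rather than to this Python rendering.

```python
import struct

WORD = struct.Struct('<Q')       # one unsigned 64-bit value (8 bytes)
ALL_ONES = 0xFFFFFFFFFFFFFFFF


def invert_wordwise(data: bytes) -> bytes:
    """Invert all bits, 8 bytes at a time, then the remaining tail bytes."""
    out = bytearray(len(data))
    num = len(data) // WORD.size             # number of complete words
    for k in range(num):
        (value,) = WORD.unpack_from(data, k * WORD.size)
        WORD.pack_into(out, k * WORD.size, value ^ ALL_ONES)
    # Handle the len(data) - num * WORD.size bytes that do not fill a word
    for i in range(num * WORD.size, len(data)):
        out[i] = data[i] ^ 0xFF
    return bytes(out)


print(invert_wordwise(b'\x00' * 10) == b'\xff' * 10)  # True
```

The second loop is the part that is easy to forget: without it, up to seven trailing bytes would be left unmodified.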

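[Comments]:

Thank you very much for your answer. Would you mind explaining the first solution a bit more?

@Sina I added an example program (I don't trust my C++ skills, so I wrote it in C; you may want to C++-ify the error handling). Does that help?

Thanks a lot, very useful :)

@Sina First, calculate how many entries you have (num = st.st_size / sizeof(entry), where entry is int32_t or similar). Cast file to a pointer to the desired type, and replace st.st_size with num in the for loop. Don't forget to handle the remaining st.st_size - num * sizeof(entry) bytes!

@Sina You can use strtol to convert a hex string argument to a number.

There is no way to write to the head of a file without rewriting it completely. You can append to a memory-mapped file.

[Answer 2]: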

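A quick internet search turned up Monolith, an open-source program dedicated to XORing two files. I found it because Bruce Schneier blogged about it, and its purpose seems to be of a legal nature.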

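[Comments]:

[Answer 3]:

Thanks to "phihag", this code performs a binary operation on two files. Example 1: you have two files and want to compare them, so you XOR them. Example 2: you downloaded a file with jdownloader or something similar and moved the unfinished download to another folder; the download manager then continued with the missing parts and created another file. You now have two separate files that complete each other. If you OR these two files, you get one complete file.

Warning: the larger file will be overwritten with the result of the operation.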

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <string.h>

int main(int argc, char* argv[])
{
    int FP1 = 0, FP2 = 0;
    struct stat St1, St2;
    char *File1 = NULL, *File2 = NULL;

    if (argc != 4)
    {
        printf("Usage: %s File1 File2 Operator\n", argv[0]);
        exit(1);
    }

    //Opening and mapping File1
    FP1 = open(argv[1], O_RDWR);
    if (FP1 == -1)
    {
        perror("Error opening file1 for writing");
        exit(2);
    }

    if (fstat(FP1, &St1) == -1)
    {
        perror("Can't determine file1 size");
        exit(3);
    }

    File1 = (char*) mmap(NULL, St1.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, FP1, 0);
    if (File1 == MAP_FAILED)
    {
        perror("Can't map file1");
        exit(4);
    }
    //======================

    //Opening and mapping File2
    FP2 = open(argv[2], O_RDWR);
    if (FP2 == -1)
    {
        perror("Error opening file2 for writing");
        exit(2);
    }

    if (fstat(FP2, &St2) == -1)
    {
        perror("Can't determine file2 size");
        exit(3);
    }

    File2 = (char*) mmap(NULL, St2.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, FP2, 0);
    if (File2 == MAP_FAILED)
    {
        perror("Can't map file2");
        exit(4);
    }
    //======================

    //Binary operations: the result goes into the larger file,
    //over as many leading bytes as the smaller file has
    ssize_t i = 0;
    switch (*(argv[3]))
    {
        case '|':
            if (St1.st_size <= St2.st_size)
                for (i = 0; i < St1.st_size; i ++)
                    File2[i] = File1[i] | File2[i];
            else
                for (i = 0; i < St2.st_size; i ++)
                    File1[i] = File1[i] | File2[i];
            break;
        case '&':
            if (St1.st_size <= St2.st_size)
                for (i = 0; i < St1.st_size; i ++)
                    File2[i] = File1[i] & File2[i];
            else
                for (i = 0; i < St2.st_size; i ++)
                    File1[i] = File1[i] & File2[i];
            break;
        case '^':
            if (St1.st_size <= St2.st_size)
                for (i = 0; i < St1.st_size; i ++)
                    File2[i] = File1[i] ^ File2[i];
            else
                for (i = 0; i < St2.st_size; i ++)
                    File1[i] = File1[i] ^ File2[i];
            break;
        default:
            //perror() would append an unrelated errno message here
            fprintf(stderr, "Unknown binary operator\n");
            exit(5);
    }
    //======================

    munmap(File1, St1.st_size);
    munmap(File2, St2.st_size);
    close(FP1);
    close(FP2);

    //Renaming the changed file and make output.
    //rename() replaces the original system("mv ...") call, which could
    //overflow Buffer and broke on file names containing quotes.
    char Buffer[1024];
    const char *Changed = (St1.st_size <= St2.st_size) ? argv[2] : argv[1];
    if (snprintf(Buffer, sizeof(Buffer), "%s-Mapped", Changed) >= (int) sizeof(Buffer))
    {
        fprintf(stderr, "File name too long to rename.\n");
        exit(6);
    }
    if (rename(Changed, Buffer) == -1)
    {
        perror("Unable to rename the new file.");
        exit(6);
    }
    printf("%s is mapped.\n", Changed);
    //======================

    return 0;
}
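To make the two use cases concrete, here is a small in-memory sketch in Python. It is not part of the original answer: the helper name `combine` is hypothetical, and the OR example assumes that the parts missing from each partial download are filled with zero bytes, which is what makes OR reconstruct the full content.

```python
import operator

OPS = {'|': operator.or_, '&': operator.and_, '^': operator.xor}


def combine(a: bytes, b: bytes, op: str) -> bytes:
    """Apply `op` over the common prefix; keep the tail of the longer input."""
    if op not in OPS:
        raise ValueError('Unknown binary operator')
    n = min(len(a), len(b))
    longer = a if len(a) > len(b) else b
    head = bytes(OPS[op](x, y) for x, y in zip(a, b))
    return head + longer[n:]


# Example 1: XOR of identical data is all zeros, so any non-zero byte
# in the result marks a position where the inputs differ.
print(combine(b'abc', b'abd', '^'))  # b'\x00\x00\x07'

# Example 2: two partial downloads whose holes are zero-filled.
first = b'HELLO' + bytes(5)
second = bytes(5) + b'WORLD'
print(combine(first, second, '|'))   # b'HELLOWORLD'
```

Like the C program, the result has the length of the longer input, and bytes beyond the shorter input are left unchanged.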

[Comments]:

[Answer 4]:

For those who prefer a Python script:

#!/usr/bin/env python3

import sys

blocksize = 4096

with open(sys.argv[1], 'rb') as input1, \
     open(sys.argv[2], 'rb') as input2, \
     open(sys.argv[3], 'wb') as output:
    while True:
        block1 = input1.read(blocksize)
        block2 = input2.read(blocksize)
        if not block1 and not block2:
            break  # reached EOF in both files
        premature_eof = len(block1) != len(block2)
        if premature_eof:
            sys.stderr.write('Premature EOF, truncating to shorter file\n')
            n = min(len(block1), len(block2))
            block1, block2 = block1[:n], block2[:n]
        # convert to large integer
        int1 = int.from_bytes(block1, 'big')
        int2 = int.from_bytes(block2, 'big')
        # apply logical operator: xor
        int_o = int1 ^ int2
        # convert back to binary, keeping leading zero bytes
        output.write(int_o.to_bytes(len(block1), 'big'))
        if premature_eof:
            break  # stop once the shorter file is exhausted

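For files of different lengths, it emits a warning and stops. In some applications it may be preferable to pad the shorter input with zero bytes, or to wrap around to the beginning of the input file. This can be achieved on the command line by concatenating the shorter file with itself or with output from /dev/zero.

[Comments]: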
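The padding and wrap-around alternatives can also be handled inside the script. The following is a minimal in-memory sketch; the function names `xor_padded` and `xor_wrapped` are made up for this example, and for large files the same logic would need to be applied block by block.

```python
from itertools import cycle


def xor_padded(a: bytes, b: bytes) -> bytes:
    """XOR two inputs, padding the shorter one with zero bytes."""
    n = max(len(a), len(b))
    a = a.ljust(n, b'\x00')
    b = b.ljust(n, b'\x00')
    return bytes(x ^ y for x, y in zip(a, b))


def xor_wrapped(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key`, restarting `key` from its beginning as needed."""
    if not key:
        raise ValueError('key must not be empty')
    return bytes(x ^ k for x, k in zip(data, cycle(key)))


print(xor_padded(b'\x01\x02\x03', b'\xff'))       # b'\xfe\x02\x03'
print(xor_wrapped(b'\x01\x02\x03', b'\xff\x00'))  # b'\xfe\x02\xfc'
```

With zero padding, the bytes beyond the shorter input pass through unchanged; with wrap-around, every byte of the longer input is modified.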
