Prevent removing duplicates from ZipOutputStream

Posted 2023-03-06

Asked: 2021-03-24 20:21:29

Question:

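I am using this function to zip a bunch of files. The problem is that if two files have different content but the same original name, only one of them ends up in the zip. How can I prevent this by adding a number to the file name, before the extension, for duplicate names (for example file1.txt)?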

    ZipOutputStream zipOut = new ZipOutputStream(response.getOutputStream());

    files.forEach(file -> {
        final ZipEntry zipEntry = new ZipEntry(Objects.requireNonNull(file.getOriginalName()));
        zipOut.putNextEntry(zipEntry);
        IOUtils.copy(file.getInputStream(), zipOut);
        file.getInputStream().close();
        zipOut.closeEntry();
    });

For example:

    files.add("hello");
    files.add("hello");
    files.add("hello");
    files.add("name");
    files.add("name");
    files.add("name");

    files.add("hello22");
    files.add("name");

then the result should be:

"hello", "hello1", "hello2", "name", "name1", "name2", "name3"
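
As background to the question above: in the standard java.util.zip implementation, putNextEntry rejects a second entry whose name is already in the archive and throws a ZipException, which is one plausible reason why only one of the same-named files ends up in the zip. Below is a minimal sketch that reproduces this with an in-memory stream. The class DuplicateEntryDemo and the method writeSameNameTwice are hypothetical names used only for illustration, and the servlet response from the question is replaced by a ByteArrayOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not taken from the question.
class DuplicateEntryDemo {

    // Tries to write two entries with the same name into an in-memory zip.
    // Returns the message of the ZipException, or null if none was thrown.
    static String writeSameNameTwice(String entryName) {
        try (ZipOutputStream zipOut = new ZipOutputStream(new ByteArrayOutputStream())) {
            zipOut.putNextEntry(new ZipEntry(entryName));
            zipOut.write("first".getBytes(StandardCharsets.UTF_8));
            zipOut.closeEntry();

            try {
                // Same entry name a second time
                zipOut.putNextEntry(new ZipEntry(entryName));
            } catch (ZipException e) {
                return e.getMessage();
            }
            zipOut.write("second".getBytes(StandardCharsets.UTF_8));
            zipOut.closeEntry();
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeSameNameTwice("hello.txt"));
    }
}
```

If the exception is swallowed or aborts the response midway, the client may see an archive containing only the first of the same-named files, so every entry needs a unique name before it is passed to putNextEntry.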

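Comments:

Scan your file list before writing to the stream? Or, if you are editing an existing zip, one option is to make a copy of the zip (for example as a temporary file), so that you can read from the copy to check for duplicates and write to the other one without any conflicts.

I don't want to load anything into memory. How can I check for duplicates? Could you please provide some code?

Answer 1: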

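Assuming you are creating a new zip file rather than editing an existing zip archive, you can iterate over your files list and, whenever you find a duplicate, note the new name in a HashMap or similar structure. Then, in your zip method, you can simply check the HashMap and use the updated name: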

//Assumed imports: java.io.IOException, java.io.InputStream, java.io.UncheckedIOException,
//java.util.HashMap, java.util.zip.ZipEntry, java.util.zip.ZipOutputStream,
//plus an IOUtils class (for example org.apache.commons.io.IOUtils)

//HashMap to store updated names (original name -> new name)
HashMap<String, String> duplicateNameMap = new HashMap<>();

//Compare each file with every later file to find duplicate names:
for (int i = 0; i < files.size(); i++) {
    String originalName = files.get(i).getOriginalName();
    for (int j = i + 1; j < files.size(); j++) {
        //If a duplicate exists then save the new name to the HashMap:
        if (originalName.equals(files.get(j).getOriginalName())) {
            //Use substring to get the file name and the extension (the extension keeps its dot)
            int dot = originalName.lastIndexOf(".");
            //If the file has no extension (".doc" etc.), use the whole name and an empty extension
            String name = dot < 0 ? originalName : originalName.substring(0, dot);
            String extension = dot < 0 ? "" : originalName.substring(dot);
            //Use a method to count the number of previous files with the same name and set the correct duplicate number
            String duplicateNumber = fixDuplicateName(duplicateNameMap, originalName);

            //Store the new name in the hashmap using the old name as a key
            duplicateNameMap.put(originalName, name + duplicateNumber + extension);
        }
    }
}

ZipOutputStream zipOut = new ZipOutputStream(response.getOutputStream());

//Then when saving files we can check the hashmap and update the names accordingly
files.forEach(file -> {
    String name = file.getOriginalName();
    //Check the HashMap (keyed by name, not by file) to see if there is a duplicate:
    if (duplicateNameMap.containsKey(name)) {
        //Grab the new name from the hashmap
        String newName = duplicateNameMap.get(name);
        //Remove that entry from the hashmap so that it is not used again
        duplicateNameMap.remove(name, newName);
        //Assign the new name to be used
        name = newName;
    }
    //Open the input stream once and close it automatically after copying
    try (InputStream in = file.getInputStream()) {
        final ZipEntry zipEntry = new ZipEntry(name);
        zipOut.putNextEntry(zipEntry);
        IOUtils.copy(in, zipOut);
        zipOut.closeEntry();
    } catch (IOException e) {
        //A lambda passed to forEach cannot throw the checked IOException
        throw new UncheckedIOException(e);
    }
});

Here is the method used to count the number of duplicates and return the correct duplicate number:

public static String fixDuplicateName(HashMap<String, String> duplicateNameMap, String name) {
    //Start the count at 1 (The first duplicate should always be 1)
    int count = 1;
    //Find out if there is more than 1 duplicate in the hashmap and increase the count if needed
    for (String key : duplicateNameMap.keySet()) {
        if (key.equals(name)) {
            count++;
        }
    }
    return count + "";
}

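Note that a small side effect of doing it this way is that the first file will get "1" added, the second "2", and so on, while the last one keeps the original name, but this is hardly an issue. If the order of the files in your files list matters, you can easily fix it by adding every name to the hash map, even when it is not a duplicate, and then changing the starting value of int count = 1; in the fixDuplicateName method so that non-duplicates are labelled correctly. Then, when writing the zip, just take every file name from the hashmap.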
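
The ordering fix described above (keep the first occurrence unchanged and number only the later ones) can also be sketched as a single pass over the names. This is a minimal sketch, not the answer author's code: the class UniqueNames and the methods uniqueNames and withSuffix are hypothetical names, and it assumes the entry names are available as a List<String>. It tracks the names already handed out in a HashSet, so it also covers three or more files with the same name, a case that a map holding one replacement per original name cannot represent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the original answer.
class UniqueNames {

    // Returns one entry name per input name, in the same order. The first
    // occurrence is kept as-is; later ones get a counter before the extension.
    static List<String> uniqueNames(List<String> originalNames) {
        Map<String, Integer> lastSuffix = new HashMap<>(); // original name -> last counter used
        Set<String> used = new HashSet<>();                // entry names already handed out
        List<String> result = new ArrayList<>();

        for (String original : originalNames) {
            String candidate = original;
            int n = lastSuffix.getOrDefault(original, 0);
            // Keep trying the next counter until the candidate is not taken yet.
            while (!used.add(candidate)) {
                n++;
                candidate = withSuffix(original, n);
            }
            lastSuffix.put(original, n);
            result.add(candidate);
        }
        return result;
    }

    // Inserts the counter before the last dot: "file.txt" -> "file1.txt".
    // Names with no dot, or only a leading dot, get the counter appended.
    static String withSuffix(String name, int n) {
        int dot = name.lastIndexOf('.');
        if (dot <= 0) {
            return name + n;
        }
        return name.substring(0, dot) + n + name.substring(dot);
    }
}
```

Only the names are held in memory, not the file contents. For the list in the question this yields hello, hello1, hello2, name, name1, name2, hello22, name3; each returned name would then be passed to new ZipEntry(...) in the same order as the files.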

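Comments:

Thank you, I will accept your solution, but could you also cover the case I recently added to my question? Basically a 1 should be added to the name, but not for the first one.

You can do this using substring and a method that counts the duplicates. I have edited my answer to show how it works. Note that the HashMap now needs to have a String as the key: HashMap<String, String> duplicateNameMap = new HashMap<>();

***.com/questions/66799128/… can you help me?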
