C#: How do I tell when all threads or tasks have completed when traversing a file structure? [duplicate]

Posted: 2021-07-19 02:42:27

Question:

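This app processes image files that meet certain criteria, namely a resolution greater than 1600x1600. The source directory tree may contain more than 8,000 files, but not all of them meet the resolution criteria. The tree can be at least 4 or 5 levels deep.

I've "tasked" the actual conversion process. However, I can't tell when the last task has finished.

I really don't want to create an array of tasks, because it would contain thousands of files that don't meet the resolution criteria. It also seems wasteful to open an image file, check its resolution, add it (or not) to the task array, and then open it again when the file is processed.

Here's the code.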

using System;
using System.Data;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace ResizeImages2
{
    public partial class Form1 : Form
    {
        private string _destDir;
        private string _sourceDir;
        private StreamWriter _logfile;
        private int _numThreads;

        private void button1_Click(object sender, EventArgs e)
        {
            _numThreads = 0;
            if (_logfile is null)
                _logfile = new StreamWriter("c:\\temp\\imagesLog.txt", append: true);

            _destDir = "c:\\inetpub\\wwwroot\\mywebsite\\";
            _sourceDir = "c:\\images\\";
            Directory.CreateDirectory(_destDir);
            var root = new DirectoryInfo(_sourceDir);
            WalkDirectoryTree(root); // <-- async, so it returns before all the threads are complete. Can't close the logfile here.
            //_logfile.Close();
        }

        private async void WalkDirectoryTree(System.IO.DirectoryInfo root)
        {
            System.IO.FileInfo[] files = null;
            System.IO.DirectoryInfo[] subDirs = null;
            files = root.GetFiles("*-*-*.jpg"); // looks like a SKU and is a jpg
            if (files == null) return;
            foreach (System.IO.FileInfo fi in files)
            {
                _numThreads++;
                await Task.Run(() =>
                {
                    CreateImage(fi);
                    _numThreads--;
                    return true;
                });
            }

            // Now find all the subdirectories under this directory.
            subDirs = root.GetDirectories();

            foreach (System.IO.DirectoryInfo dirInfo in subDirs)
            {
                // async void: the recursive call is fire-and-forget and cannot be awaited
                WalkDirectoryTree(dirInfo);
            }
        }

        private void CreateImage(FileSystemInfo f)
        {
            var originalBitmap = new Bitmap(f.FullName);
            if (originalBitmap.Width <= 1600 || originalBitmap.Height <= 1600) return;
            var new_wid = 1600;
            var new_hgt = 1600;
            using (var bm = new Bitmap(new_wid, new_hgt))
            {
                // Destination parallelogram: upper-left, upper-right, lower-left corners.
                Point[] points =
                {
                    new Point(0, 0),
                    new Point(new_wid, 0),
                    new Point(0, new_hgt),
                };
                using (var gr = Graphics.FromImage(bm))
                {
                    gr.DrawImage(originalBitmap, points);
                }

                bm.SetResolution(96, 96);
                bm.Save($"{_destDir}{f.Name}", ImageFormat.Jpeg);
                bm.Dispose();
                originalBitmap.Dispose();
            }
            _logfile.WriteLine(f.Name);
        }
    }
}

Update: final result

    private async void button1_Click(Object sender, EventArgs e)
    {
        _logfile = new StreamWriter("c:\\temp\\imagesLog.txt", append: true);

        _sourceDir = $"c:\\sourceImages\\";
        _destDir = $"c:\\inetpub\\wwwroot\\images\\";
        var jpgFiles = Directory.EnumerateFiles(_sourceDir, "*-*-*.jpg", SearchOption.AllDirectories);
        var myPOpts = new ParallelOptions { MaxDegreeOfParallelism = 10 };
        await Task.Run(() => // keeps the UI thread free so the interface can be updated (usually with the file being processed)
        {
            // Parallel.ForEach returns only after every file has been handled.
            Parallel.ForEach(jpgFiles, myPOpts, f =>
            {
                CreateImage(new FileInfo(f));
            });
            _logfile.Close();
        });
    }

    private void CreateImage(FileSystemInfo f)
    {
        var originalBitmap = new Bitmap(f.FullName);
        if (originalBitmap.Width <= 1600 || originalBitmap.Height <= 1600)
        {
            originalBitmap.Dispose();
            return;
        }
        // CreateImage runs on worker threads, so marshal the UI update to the UI thread.
        BeginInvoke((Action)(() => tbCurFile.Text = f.FullName));
        var new_wid = 1600;
        var new_hgt = 1600;
        using (var bm = new Bitmap(new_wid, new_hgt))
        {
            Point[] points =
            {
                new Point(0, 0),
                new Point(new_wid, 0),
                new Point(0, new_hgt),
            };
            var scount = 1;
            var saved = false;
            // Why the while/try/catch? Because we are copying files across the internet from a Dropbox in Sync Only mode.
            // This means we are only working with a pointer until we actually open the file, and sometimes
            // the app gets ahead of itself. It tries to draw the new image before the original file is completely open,
            // so let's see if we can mitigate that here.
            while (!saved && scount < 5)
            {
                try
                {
                    using (var gr = Graphics.FromImage(bm))
                    {
                        gr.DrawImage(originalBitmap, points);
                    }

                    saved = true;
                }
                catch
                {
                    scount++;
                }
            }

            bm.SetResolution(96, 96);
            scount = 1;
            saved = false;
            while (!saved && scount < 5)
            {
                try
                {
                    bm.Save($"{_destDir}{f.Name}", ImageFormat.Jpeg);
                    saved = true;
                }
                catch
                {
                    scount++;
                }
            }
            bm.Dispose();
            originalBitmap.Dispose();
        }
        // StreamWriter is not thread-safe; serialize writes from the parallel workers.
        lock (_logfile)
        {
            _logfile.WriteLine($"{_collectionId}\\{f.Name}");
        }
    }
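The final version above can close the log file right after the loop because Parallel.ForEach is a blocking call: it returns only once every iteration has finished. The console sketch below illustrates that guarantee under simplified assumptions; `ParallelCompletionDemo` and `ProcessAll` are hypothetical names, and a range of integers stands in for the image files.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelCompletionDemo
{
    // Hypothetical stand-in for the image loop: "processes" each item on the
    // thread pool and returns how many items were handled.
    public static int ProcessAll(IEnumerable<int> items, int maxParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
        int processed = 0;

        Parallel.ForEach(items, options, item =>
        {
            Thread.Sleep(1);                        // simulate some work
            Interlocked.Increment(ref processed);   // thread-safe counter
        });

        // Parallel.ForEach has returned, so every iteration has completed.
        return processed;
    }

    public static async Task Main()
    {
        // Task.Run keeps the calling (UI) thread free while the loop runs.
        int count = await Task.Run(() => ProcessAll(Enumerable.Range(0, 200), 10));
        Console.WriteLine($"All {count} items done.");
    }
}
```

Because the count is read only after Parallel.ForEach returns, no separate "am I done yet" bookkeeping is needed; the same holds for cleanup such as closing a log file.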

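Comments:

You don't have to create an array. You can use any IEnumerable, including a class that yields tasks/files/whatever.

You can't use await inside Parallel.ForEach anyway, so it won't solve your problem. But await Task.Run(() => ...) inside a foreach that is never awaited is also a very bad solution: you will create 8000 tasks and kill your app's performance. Instead, you need to enqueue the files into a thread-safe queue, create a limited number of workers (4? 8?) and let them process all the files. When all the workers are idle because the queue is empty, you're done. It's called the Producer-Consumer pattern.

Here is another option for handling async in parallel execution: devblogs.microsoft.com/pfxteam/…

Docs: "The EnumerateFiles and GetFiles methods differ as follows: When you use EnumerateFiles, you can start enumerating the collection of names before the whole collection is returned. When you use GetFiles, you must wait for the whole array of names to be returned before you can access the array."

Thank you all! It was this simple: var jpgFiles = Directory.EnumerateFiles(fbd.SelectedPath, "*-*-*.jpg", SearchOption.AllDirectories); Parallel.ForEach(jpgFiles, f => { CreateImage(new FileInfo(f)); });

Answer 1: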

A quick and dirty way to handle this is to use interlocked operations on a shared variable.


    private int _numberOfOutstandingOperations;

    private async void button1_Click(object sender, EventArgs e)
    {
        // Start at 1 so the count cannot reach zero while the tree is still being walked.
        Interlocked.Exchange(ref _numberOfOutstandingOperations, 1);

        var root = new DirectoryInfo(_sourceDir); // same root as in the question
        await WalkDirectoryTree(root);
        CompleteOperation(); // releases the initial count
    }

    private async Task WalkDirectoryTree(System.IO.DirectoryInfo root)
    {
        // ...
        foreach (System.IO.FileInfo fi in files)
        {
            Interlocked.Increment(ref _numberOfOutstandingOperations);
            _ = Task.Run(() =>
            {
                CreateImage(fi);
                CompleteOperation();
            });
        }

        foreach (System.IO.DirectoryInfo dirInfo in subDirs)
        {
            await WalkDirectoryTree(dirInfo);
        }
    }

    private void CompleteOperation()
    {
        if (Interlocked.Decrement(ref _numberOfOutstandingOperations) == 0)
        {
            // Everything is done, signal a mutex or complete a TaskCompletionSource or whatever
        }
    }

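The trick is to initialize the operation count to 1, so that completion is signaled only when all tasks have finished and all directories have been traversed.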
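The placeholder in CompleteOperation can be filled in several ways. The sketch below is one possibility and not the answerer's code: a hypothetical `PendingOperations` class starts its count at 1 and completes a TaskCompletionSource when the count reaches zero, so the caller can simply await `Completion`. All names here are assumptions made for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not from the original answer): counts outstanding
// operations and exposes a Task that completes when the count hits zero.
public sealed class PendingOperations
{
    // Start at 1: the extra count belongs to the code that schedules work,
    // so the counter cannot reach zero while scheduling is still in progress.
    private int _count = 1;

    private readonly TaskCompletionSource<bool> _done =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public Task Completion => _done.Task;

    // Call before starting each unit of work.
    public void Add() => Interlocked.Increment(ref _count);

    // Call when a unit of work ends, and once more when scheduling is finished.
    public void Complete()
    {
        if (Interlocked.Decrement(ref _count) == 0)
            _done.TrySetResult(true);
    }
}

public static class PendingOperationsDemo
{
    // Schedules `jobs` fire-and-forget tasks and waits for all of them.
    public static async Task<int> RunAsync(int jobs)
    {
        var pending = new PendingOperations();
        int finished = 0;

        for (int i = 0; i < jobs; i++)
        {
            pending.Add();
            _ = Task.Run(() =>
            {
                try { Interlocked.Increment(ref finished); }   // the "work"
                finally { pending.Complete(); }
            });
        }

        pending.Complete();          // release the initial count of 1
        await pending.Completion;    // completes after the last task finishes
        return finished;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync(100));
    }
}
```

Without the initial count of 1, a fast task could drive the counter to zero before the next one was scheduled, and completion would be signaled too early.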

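The producer-consumer pattern suggested in the comments can be sketched with BlockingCollection from System.Collections.Concurrent. This is a minimal illustration under assumptions, not code from the thread: `ProducerConsumerDemo`, `Run`, and the `handle` callback (standing in for CreateImage) are hypothetical names, and the queue capacity of 100 is arbitrary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ProducerConsumerDemo
{
    // Hypothetical sketch: one producer fills a bounded queue and a fixed
    // number of workers drain it. `handle` stands in for CreateImage.
    public static int Run(IEnumerable<string> paths, int workerCount, Action<string> handle)
    {
        int handled = 0;

        using (var queue = new BlockingCollection<string>(boundedCapacity: 100))
        {
            // Consumers: each worker loops until the queue is marked complete and empty.
            Task[] workers = Enumerable.Range(0, workerCount)
                .Select(_ => Task.Run(() =>
                {
                    foreach (string path in queue.GetConsumingEnumerable())
                    {
                        handle(path);
                        Interlocked.Increment(ref handled);
                    }
                }))
                .ToArray();

            // Producer: Add blocks while the queue is full.
            foreach (string path in paths)
                queue.Add(path);

            queue.CompleteAdding();   // tells the workers no more items are coming
            Task.WaitAll(workers);    // returns once every worker has drained the queue
        }

        return handled;
    }

    public static void Main()
    {
        var paths = Enumerable.Range(1, 500).Select(i => $"img-{i}.jpg");
        int n = Run(paths, workerCount: 4, handle: _ => Thread.Sleep(1));
        Console.WriteLine($"{n} files handled by 4 workers.");
    }
}
```

Here "all done" is simply the point where Task.WaitAll returns. In a WinForms handler, `Run` would itself need to be wrapped in Task.Run and awaited so that the UI thread is not blocked, and a production version would also need to decide what happens when `handle` throws.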