C#: how do I tell when all threads or tasks have completed when traversing a file structure? [duplicate]
Asked: 2021-07-19 02:42:27
Question:
This application processes image files that meet a certain criterion, namely a resolution greater than 1600x1600. There may be more than 8,000 files in the source directory tree, but not all of them meet the resolution criterion. The tree can be at least 4 or 5 levels deep.
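I have "tasked" out the actual conversion process. However, I can't tell when the last task has finished.
I really don't want to create an array of tasks, because it would contain thousands of files that do not meet the resolution criterion. It also seems wasteful to open each image file, check its resolution, decide whether to add it to the task array, and then open it again when the file is processed.
Here is the code.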
using System;
using System.Data;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace ResizeImages2
{
    public partial class Form1 : Form
    {
        private string _destDir;
        private string _sourceDir;
        private StreamWriter _logfile;
        private int _numThreads;

        private void button1_Click(object sender, EventArgs e)
        {
            _numThreads = 0;
            if (_logfile is null)
                _logfile = new StreamWriter("c:\\temp\\imagesLog.txt", append: true);
            _destDir = "c:\\inetpub\\wwwroot\\mywebsite\\";
            _sourceDir = "c:\\images\\";
            Directory.CreateDirectory(_destDir);
            var root = new DirectoryInfo(_sourceDir);
            WalkDirectoryTree(root); // <-- async, so it returns before all the threads are complete. Can't close the log file here.
            //_logfile.Close();
        }

        private async void WalkDirectoryTree(System.IO.DirectoryInfo root)
        {
            System.IO.FileInfo[] files = null;
            System.IO.DirectoryInfo[] subDirs = null;
            files = root.GetFiles("*-*-*.jpg"); // looks like a SKU and is a jpg
            if (files == null) return;
            foreach (System.IO.FileInfo fi in files)
            {
                _numThreads++;
                await Task.Run(() =>
                {
                    CreateImage(fi);
                    _numThreads--;
                    return true;
                });
            }
            // Now find all the subdirectories under this directory.
            subDirs = root.GetDirectories();
            foreach (System.IO.DirectoryInfo dirInfo in subDirs)
            {
                WalkDirectoryTree(dirInfo); // async void: fire-and-forget, nothing can await it
            }
        }

        private void CreateImage(FileSystemInfo f)
        {
            var originalBitmap = new Bitmap(f.FullName);
            if (originalBitmap.Width <= 1600 || originalBitmap.Height <= 1600) return;
            // new_wid/new_hgt were not declared in the posted snippet;
            // 1600 is assumed here because it matches the target bitmap size.
            var new_wid = 1600;
            var new_hgt = 1600;
            using (var bm = new Bitmap(1600, 1600))
            {
                Point[] points =
                {
                    new Point(0, 0),
                    new Point(new_wid, 0),
                    new Point(0, new_hgt),
                };
                using (var gr = Graphics.FromImage(bm))
                {
                    gr.DrawImage(originalBitmap, points);
                }
                bm.SetResolution(96, 96);
                bm.Save($"{_destDir}{f.Name}", ImageFormat.Jpeg);
                bm.Dispose();
            }
            originalBitmap.Dispose();
            _logfile.WriteLine(f.Name);
        }
    }
}
Update: final result
private async void button1_Click(object sender, EventArgs e)
{
    _logfile = new StreamWriter("c:\\temp\\imagesLog.txt", append: true);
    _sourceDir = "c:\\sourceImages\\";
    _destDir = "c:\\inetpub\\wwwroot\\images\\";
    var jpgFiles = Directory.EnumerateFiles(_sourceDir, "*-*-*.jpg", SearchOption.AllDirectories);
    var myPOpts = new ParallelOptions { MaxDegreeOfParallelism = 10 };
    await Task.Run(() => // allows the user interface to be updated (usually with the file being processed)
    {
        // Parallel.ForEach returns only after every file has been processed.
        Parallel.ForEach(jpgFiles, myPOpts, f =>
        {
            CreateImage(new FileInfo(f));
        });
        _logfile.Close();
    });
}

private void CreateImage(FileSystemInfo f)
{
    var originalBitmap = new Bitmap(f.FullName);
    if (originalBitmap.Width <= 1600 || originalBitmap.Height <= 1600)
    {
        originalBitmap.Dispose();
        return;
    }
    // Note: this runs on a worker thread; WinForms controls are normally updated via Invoke.
    tbCurFile.Text = f.FullName;
    var new_wid = 1600;
    var new_hgt = 1600;
    using (var bm = new Bitmap(new_wid, new_hgt))
    {
        Point[] points =
        {
            new Point(0, 0),
            new Point(new_wid, 0),
            new Point(0, new_hgt),
        };
        var scount = 1;
        var saved = false;
        // Why the while/try/catch? Because we are copying files across the internet from a Dropbox in sync-only mode.
        // This means we are only working with a pointer until we actually open the file, and sometimes
        // the app gets ahead of itself: it tries to draw the new image before the original file is completely open.
        // So let's see if we can mitigate that here.
        while (!saved && scount < 5)
        {
            try
            {
                using (var gr = Graphics.FromImage(bm))
                {
                    gr.DrawImage(originalBitmap, points);
                }
                saved = true;
            }
            catch
            {
                scount++;
            }
        }
        bm.SetResolution(96, 96);
        scount = 1;
        saved = false;
        while (!saved && scount < 5)
        {
            try
            {
                bm.Save($"{_destDir}{f.Name}", ImageFormat.Jpeg);
                saved = true;
            }
            catch
            {
                scount++;
            }
        }
        bm.Dispose();
    }
    originalBitmap.Dispose();
    _logfile.WriteLine($"{_collectionId}\\{f.Name}");
}
Comments:
You don't have to create an array. You can use any IEnumerable, including a class that yields tasks/files/whatever.
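The suggestion above can be sketched with a C# iterator. This is only an illustration: LazyFileWalker and Walk are hypothetical names that do not appear in the question, and the sketch assumes every directory in the tree is readable. Each matching file is handed to the caller as soon as it is found, so no array of all files is built.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LazyFileWalker
{
    // Hypothetical helper: walks the tree depth-first and yields each
    // matching file as it is found, without collecting them first.
    public static IEnumerable<FileInfo> Walk(DirectoryInfo root, string pattern)
    {
        // Files directly inside this directory.
        foreach (FileInfo file in root.EnumerateFiles(pattern))
            yield return file;

        // Then recurse into each subdirectory, still lazily.
        foreach (DirectoryInfo dir in root.EnumerateDirectories())
            foreach (FileInfo file in Walk(dir, pattern))
                yield return file;
    }
}
```

A consumer such as a foreach loop or Parallel.ForEach can take the result of Walk(root, "*-*-*.jpg") directly and start processing while the traversal is still in progress.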
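You can't use await inside Parallel.ForEach anyway, so it won't solve your problem. But await Task.Run(() => ...) inside a foreach whose enclosing method is never awaited is also a super bad solution: you will create 8,000 tasks and kill your application's performance. Instead, you need to enqueue the files into a thread-safe queue, create a limited number of worker threads (4? 8?), and let them process all the files. When all workers are idle because the queue is empty, you are done. It's called the Producer-Consumer pattern.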
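A minimal sketch of the producer-consumer arrangement described in this comment, using BlockingCollection from System.Collections.Concurrent. BoundedWorkers, ProcessAllAsync, and the queue capacity of 100 are assumptions made for illustration and do not come from the question. The sketch also assumes the handler does not throw; if every worker failed, the producer could block on a full queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BoundedWorkers
{
    // Hypothetical helper: feeds items through a bounded queue to a fixed
    // number of workers and completes when every item has been handled.
    public static async Task ProcessAllAsync<T>(
        IEnumerable<T> items, int workerCount, Action<T> handle)
    {
        using (var queue = new BlockingCollection<T>(boundedCapacity: 100))
        {
            // Consumers: each drains the queue until it is marked complete and empty.
            Task[] workers = Enumerable.Range(0, workerCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    foreach (T item in queue.GetConsumingEnumerable())
                        handle(item);
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            // Producer: Add blocks while the queue is full, so memory stays bounded.
            Task producer = Task.Run(() =>
            {
                try
                {
                    foreach (T item in items)
                        queue.Add(item);
                }
                finally
                {
                    queue.CompleteAdding(); // lets the consumers run to completion
                }
            });

            // "All done" is simply: the producer and every worker have finished.
            await Task.WhenAll(workers.Concat(new[] { producer }));
        }
    }
}
```

With this shape, the click handler could await ProcessAllAsync(jpgFiles, 8, path => CreateImage(new FileInfo(path))) and close the log file on the next line, because the returned task completes only after the last file has been handled.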
Here is another option for handling async in parallel execution: devblogs.microsoft.com/pfxteam/…
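The link above is truncated, so the following is not taken from that post. It is a sketch of one common way to run asynchronous work with a concurrency limit, using SemaphoreSlim and Task.WhenAll. ThrottledAsync, ForEachAsync, and RunOneAsync are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledAsync
{
    // Hypothetical helper: runs an async body for every item with at most
    // maxConcurrency bodies in flight, and completes when all have finished.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> body)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var running = new List<Task>();
            foreach (T item in items)
            {
                await gate.WaitAsync(); // wait for a free slot
                running.Add(RunOneAsync(item, body, gate));
            }
            await Task.WhenAll(running); // completes after the last body finishes
        }
    }

    private static async Task RunOneAsync<T>(T item, Func<T, Task> body, SemaphoreSlim gate)
    {
        try { await body(item); }
        finally { gate.Release(); } // free the slot even if the body throws
    }
}
```

Note that this version does keep one Task object per item in a list, which is the kind of bookkeeping the question wanted to avoid; the list holds only task references, and items that do nothing complete almost immediately.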
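Docs: "The EnumerateFiles and GetFiles methods differ as follows: When you use EnumerateFiles, you can start enumerating the collection of names before the whole collection is returned. When you use GetFiles, you must wait for the whole array of names to be returned before you can access the array."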
Thanks, everyone! It's this simple: var jpgFiles = Directory.EnumerateFiles(fbd.SelectedPath, "*-*-*.jpg", SearchOption.AllDirectories); Parallel.ForEach(jpgFiles, f => { CreateImage(new FileInfo(f)); });
Answer 1:
A quick and dirty way to handle this is to use interlocked operations on a shared variable.
private int _numberOfOutstandingOperations;

private async void button1_Click(object sender, EventArgs e)
{
    Interlocked.Exchange(ref _numberOfOutstandingOperations, 1);
    await WalkDirectoryTree(root);
    CompleteOperation();
}

private async Task WalkDirectoryTree(System.IO.DirectoryInfo root)
{
    // ...
    foreach (System.IO.FileInfo fi in files)
    {
        Interlocked.Increment(ref _numberOfOutstandingOperations);
        _ = Task.Run(() =>
        {
            CreateImage(fi);
            CompleteOperation();
        });
    }
    foreach (System.IO.DirectoryInfo dirInfo in subDirs)
        await WalkDirectoryTree(dirInfo);
}

private void CompleteOperation()
{
    if (Interlocked.Decrement(ref _numberOfOutstandingOperations) == 0)
    {
        // Everything is done, signal a mutex or complete a TaskCompletionSource or whatever
    }
}
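The trick is to initialize the operation count to 1, so that completion is signaled only when all tasks have finished and all directories have been traversed.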
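One way to fill in the "complete a TaskCompletionSource" branch is sketched below. OutstandingOperations and its members are hypothetical names, not part of the answer. The counter starts at 1 for the traversal itself, each started operation adds 1, and the task completes when the count reaches 0.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper illustrating the "start at 1" counter.
public sealed class OutstandingOperations
{
    private int _count = 1; // the extra 1 represents the traversal itself
    private readonly TaskCompletionSource<bool> _done =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    // Completes once the traversal and every started operation have finished.
    public Task Completion => _done.Task;

    // Call before starting each background operation.
    public void Begin() => Interlocked.Increment(ref _count);

    // Call when an operation ends, and once more when the traversal ends.
    public void End()
    {
        if (Interlocked.Decrement(ref _count) == 0)
            _done.TrySetResult(true);
    }

    // Starts an action on the thread pool and tracks it.
    public void Run(Action action)
    {
        Begin();
        _ = Task.Run(() =>
        {
            try { action(); }
            finally { End(); } // count down even if the action throws
        });
    }
}
```

In the click handler this would be used by calling Run for each file during the traversal, calling End once when the traversal returns, and then awaiting Completion before closing the log file. Because the count cannot reach 0 until that final End call, tasks that finish early cannot signal completion prematurely.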