Use GPU/TPL on C# code to speed up things, taking 40 minutes

Posted 2023-04-12

【Title】: use GPU/TPL on C# code to speed up things, taking 40 minutes 【Posted】: 2019-10-24 02:24:04 【Question】:

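I want to run some calculations on a text file that has one digit ("0" or "1") per line and almost 1 million lines.

I want to check how many times a sequence occurs in the whole file. The sequences are built according to a sequence length. For example, my file is:

01100101011.... up to 1 million digits (each digit on its own line)

Code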

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

public class Program
{
    static void Main(string[] args)
    {
        Stopwatch time = new Stopwatch();
        time.Start();
        try
        {
            // File name and sequence length are hard-coded here; normally they come from the user
            string data = "", fileName = "10.txt";  // this file has almost 1 million records
            int first = 0, last = 0;

            // Read the file and join its lines into one string,
            // so "data" = "1001011001010100101 ... up to 1 million"
            data = string.Join("", File.ReadAllLines(fileName));
            last = Convert.ToInt32("15"); // sequence length
            int l = data.Length;    // computed once so it is not re-evaluated on every iteration

            // A List is used because an array is sometimes not filled to its full length
            // and ends up with null values at the end
            List<string> dataList = new List<string>();
            while (first + last < l + 1)
            {
                dataList.Add(data.Substring(first, last));
                first++;
            }
            // Convert the list to an array, so the array has values and no nulls;
            // Array.FindAll() is used on it later
            string[] dataArray = dataList.ToArray(), value;

            // Prepare a file to write the results to
            StreamWriter sw = new StreamWriter(fileName.Substring(0, fileName.Length - 4) + "Results.txt");

            // THIS IS THE PART THAT TAKES around 40 minutes
            for (int j = 0; j < dataArray.Length; j++)
            {
                // Find every element equal to dataArray[j] and collect the matches in an array
                value = Array.FindAll(dataArray, str => str.Equals(dataArray[j]));
                // value.Length is the number of occurrences of that sequence in the whole array
                sw.WriteLine(value.Length);
            }
            sw.Close();
            time.Stop();
            Console.WriteLine("Time : " + time.Elapsed);
            Console.ReadLine();
        }
        catch (Exception ex)
        {
            Console.WriteLine("Exception " + ex.StackTrace);
            Console.ReadLine();
        }
    }
}

Suppose I set sequence length = 3; my program then builds this array:

dataArray = "011", "110", "100", "001", "010", "101", "011"

using String.Substring(). Now I just want to count the frequency of each array element.

Data in the result .txt file

011 - 2

110 - 0

100 - 0

001 - 0

010 - 0

101 - 0

011 - 2

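This looks simple, but it is not: I cannot convert the sequences to int, because they are sequences and I do not want to lose the leading zeros.

Right now my program has to loop 1 million (each element) × 1 million (compared against every element of the array) = 1 trillion times, which takes about 40 minutes. I would like to know how to make it faster. I do not know how to use Parallel.For or the TPL. It should finish in a few seconds.

My system specs

32 GB RAM

i7-5820K 3.30 GHz

64-bit

2x NVIDIA GTX 970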

【Answer 1】:

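If I understand your code and question correctly, you need to slide a window (of length N, called last in your original code) over the text and count how many times each substring occurs in the text.

If so, the code below processes a million-character file in about 0.292 seconds, and you do not need parallelism or a GPU at all.

The idea is to tally the chunk counts in a Dictionary as the window slides over the text.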

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

public class Program
{
    // Counts how often each substring of length chunkLength occurs in data.
    static Dictionary<string, int> CountChunks(string data, int chunkLength)
    {
        var chunkCounts = new Dictionary<string, int>();
        var l = data.Length;
        // "<=" so that the final window is counted too, as in the question's loop
        for (var i = 0; i <= l - chunkLength; i++)
        {
            var chunk = data.Substring(i, chunkLength);
            int count = 0;
            chunkCounts.TryGetValue(chunk, out count);  // count stays 0 for a new chunk
            chunkCounts[chunk] = count + 1;
        }
        return chunkCounts;
    }

    static void Main(string[] args)
    {
        var time = new Stopwatch();
        time.Start();
        var fileName = "10.txt";
        // ReadAllLines drops the line breaks (the file has one digit per line)
        var data = string.Join("", File.ReadAllLines(fileName));
        var chunkCounts = CountChunks(data, 15);
        using (var sw = new StreamWriter(fileName.Substring(0, fileName.Length - 4) + "Results.txt"))
        {
            foreach (var pair in chunkCounts)
            {
                sw.WriteLine($"{pair.Key} - {pair.Value}");
            }
        }
        time.Stop();
        Console.WriteLine("Time : " + time.Elapsed);
    }
}

The output file 10Results.txt looks like

011100000111100 - 34
111000001111000 - 37
110000011110001 - 27
100000111100010 - 28
000001111000101 - 37
000011110001010 - 36
000111100010100 - 44
001111000101001 - 35
011110001010011 - 41
111100010100110 - 42

and so on.

Edit: Here is the equivalent Python program. It is a bit slower, at about 0.9 seconds.

import time
from collections import Counter

t0 = time.time()
c = Counter()
# strip() removes the line breaks (the file has one digit per line)
data = ''.join(line.strip() for line in open('10.txt'))
l = 15
# "+ 1" so that the final window is counted too
for i in range(0, len(data) - l + 1):
    c[data[i : i + l]] += 1

with open('10Results2.txt', 'w') as outf:
    for key, value in c.items():
        print(f'{key} - {value}', file=outf)

print(time.time() - t0)
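On the concern about losing leading zeros when converting to int: every window has the same known length, so the integer value identifies the window uniquely, and the zeros can be restored by zero-padding when printing. The following is only a sketch of that idea in Python, to match the program above. The names count_windows and format_window are made up for illustration, and it has not been timed against the versions above.

```python
def count_windows(bits: str, k: int) -> list[int]:
    """Return counts where counts[v] is the number of k-digit windows with binary value v."""
    counts = [0] * (1 << k)          # one slot per possible k-digit pattern (32768 for k = 15)
    mask = (1 << k) - 1              # keeps only the lowest k bits
    value = 0
    for i, ch in enumerate(bits):
        # Shift the previous window left by one digit and append the new digit.
        value = ((value << 1) | int(ch == "1")) & mask
        if i >= k - 1:               # a full window is available from index k - 1 onward
            counts[value] += 1
    return counts


def format_window(value: int, k: int) -> str:
    """Turn the integer back into a k-digit string, restoring leading zeros."""
    return format(value, f"0{k}b")


if __name__ == "__main__":
    sample = "011001011"             # the nine-digit example implied by the question's dataArray
    for value, n in enumerate(count_windows(sample, 3)):
        if n:
            print(format_window(value, 3), "-", n)
```

Because the key is a small integer, a flat list can replace the dictionary, and no substring is allocated per position.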

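【Answer 2】:

A for loop gives you terrible performance here, because every iteration has to perform a million string comparisons. I suggest using a dictionary instead of a list, storing each sequence as the key and its count as the value. That should perform far better than the while/for loops. You only need to adjust the code slightly for performance, and you may not even need the GPU or the TPL unless using them is your only goal. Something like the following should get you going.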

        string keyString = string.Empty;
        Dictionary<string, int> dataList = new Dictionary<string, int>();
        while (first + last < l + 1)
        {
            keyString = data.Substring(first, last);
            if (dataList.ContainsKey(keyString))
            {
                // Sequence seen before: increment its count
                dataList[keyString] = dataList[keyString] + 1;
            }
            else
            {
                // First occurrence of this sequence
                dataList.Add(keyString, 1);
            }
            first++;
        }

The rest of the code just prints this dictionary.
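Printing the dictionary gives one line per distinct sequence, whereas the program in the question writes one count per window position, in file order. If that per-position output is what is needed, a second pass can look each window up in the finished counts. Here is a minimal sketch in Python, assuming the whole digit string fits in memory; the name per_position_counts is made up for illustration.

```python
from collections import Counter


def per_position_counts(bits: str, k: int) -> list[int]:
    """For each window position, return how often that window occurs in bits."""
    windows = [bits[i:i + k] for i in range(len(bits) - k + 1)]
    totals = Counter(windows)               # first pass: one count per distinct window
    return [totals[w] for w in windows]     # second pass: constant-time lookup per position


if __name__ == "__main__":
    # Sample digits implied by the question's dataArray for sequence length 3.
    for count in per_position_counts("011001011", 3):
        print(count)
```

Both passes are linear in the number of windows, so the million-by-million comparison in the original loop is avoided.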
