CsvHelper stream too long

【Title】CsvHelper stream too long 【Posted】2021-11-06 16:42:28 【Question】:

I'm having a problem saving a large amount of data (> 2 GB) to Azure Blob Storage with CsvHelper: I get the error "Stream was too long." Can anyone help me fix this? Thanks in advance! Here is my code:

public static void EXPORT_CSV(DataTable dt, string fileName, ILogger log)
{
    try
    {
        // Retrieve storage account from connection string.
        var cnStorage = Environment.GetEnvironmentVariable("cnStorage");
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(cnStorage);
        // Create the blob client.
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
        // Retrieve reference to a previously created container.
        CloudBlobContainer container = blobClient.GetContainerReference("dataexport");
        bool exists = container.CreateIfNotExists();
        // Retrieve reference to a blob named "myblob".
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);

        var stream = new MemoryStream();

        using (var writer = new StreamWriter(stream))
        using (var csvWriter = new CsvWriter(writer, CultureInfo.InvariantCulture))
        {
            csvWriter.Configuration.TypeConverterOptionsCache.GetOptions<DateTime>().Formats = new[] { "dd/MM/yyyy" };
            foreach (DataColumn column in dt.Columns)
            {
                csvWriter.WriteField(column.ColumnName);
            }

            csvWriter.NextRecord();

            foreach (DataRow row in dt.Rows)
            {
                for (var i = 0; i < dt.Columns.Count; i++)
                {
                    csvWriter.WriteField(row[i]);
                }
                csvWriter.NextRecord();
            }
            csvWriter.Flush();
            writer.Flush();
            stream.Position = 0;

            log.LogInformation($"C# BatchDataExportCSVsegnalazioni START UploadFromStream at: {DateTime.Now}");
            blockBlob.UploadFromStream(stream);
            log.LogInformation($"C# BatchDataExportCSVsegnalazioni END UploadFromStream at: {DateTime.Now}");
        }
    }
    catch (Exception ex)
    {
        log.LogError("Error upload BatchDataExportCSVsegnalazioni: " + ex.Message);
    }
}

【Comments】:

Take a look at ***.com/questions/41225364/… (alternative to MemoryStream for large data volumes). MemoryStream replacement? may also help, especially this answer, where ghord recommends Microsoft.IO.RecyclableMemoryStream.

【Answer 1】:

The error is probably caused by using a MemoryStream for the large data, not by CsvHelper itself. See whether the problem can be solved in one of the following ways:

    Write the data directly to a FileStream instead of a MemoryStream.

    using (var fileStream = File.Create(path))
    // or:
    // using (var fileStream = new FileStream(filePath, FileMode.OpenOrCreate))
    using (var writer = new StreamWriter(fileStream))
    using (var csvWriter = new CsvWriter(writer, CultureInfo.InvariantCulture))
    {
        // ... write the CSV to the file stream here ...
    }
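If you write to a local file first, the CSV still has to reach Blob Storage afterwards. A minimal follow-up sketch, assuming the `path` variable from the snippet above and the `blockBlob` reference from the question's code:

```csharp
// Hypothetical follow-up to the FileStream approach: upload the finished
// file to the blob. UploadFromFile streams the file to storage in blocks,
// so the whole CSV never has to be held in memory at once.
blockBlob.UploadFromFile(path);
```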

(or)

    You can write the file into Azure Storage through a block blob client, using the Azure.Storage.Blobs assembly and the extension methods in the Azure.Storage.Blobs.Specialized namespace:

See Handling Large Files in Azure with Blob Storage Streaming.

For example:

 var stream = blob.OpenWrite();

See also Do's and Don'ts for Streaming File Uploads to Azure Blob Storage with .NET.
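Put together, a minimal sketch of this approach with the v12 SDK might look like the following. The connection-string variable, container name, and blob name are taken from the question's code but are otherwise assumptions; `OpenWrite(overwrite: true)` streams blocks to storage as they are written, so the full CSV never has to fit in memory:

```csharp
using System;
using System.Globalization;
using System.IO;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;
using CsvHelper;

// Sketch only: connection string and names mirror the question's setup.
var container = new BlobContainerClient(
    Environment.GetEnvironmentVariable("cnStorage"), "dataexport");
container.CreateIfNotExists();
BlockBlobClient blob = container.GetBlockBlobClient("export.csv");

// OpenWrite returns a Stream that uploads in blocks as data arrives.
using (Stream blobStream = blob.OpenWrite(overwrite: true))
using (var writer = new StreamWriter(blobStream))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    // Write the header row, then rows, exactly as in the question's loop.
    csv.WriteField("Id");
    csv.WriteField("Name");
    csv.NextRecord();
}
```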

【Comments】:

【Answer 2】:

I solved the problem by writing directly to Azure Blob Storage with blob.OpenWriteAsync():

public static async Task UPLOAD_CSVAsync(DataTable dt, string fileName, ILogger log)
{
    try
    {
        // Retrieve storage account from connection string.
        var cnStorage = Environment.GetEnvironmentVariable("cnStorage");
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(cnStorage);
        // Create the blob client.
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
        // Retrieve reference to a previously created container.
        CloudBlobContainer container = blobClient.GetContainerReference("dataexport");
        bool exists = container.CreateIfNotExists();
        // Retrieve reference to a blob named "fileName".
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);

        log.LogInformation($"C# BatchExpCSVsegnalazioni START UploadFromStream at: {DateTime.Now}");
        await WriteDataTableToBlob(dt, blockBlob);
        log.LogInformation($"C# BatchExpCSVsegnalazioni END UploadFromStream at: {DateTime.Now}");
    }
    catch (Exception ex)
    {
        log.LogError("error upload BatchExpCSVsegnalazioni: " + ex.Message);
    }
}

public static async Task WriteDataTableToBlob(DataTable dt, CloudBlockBlob blob)
{
    using (var writer = await blob.OpenWriteAsync())
    using (var streamWriter = new StreamWriter(writer))
    using (var csvWriter = new CsvWriter(streamWriter, CultureInfo.InvariantCulture))
    {
        csvWriter.Configuration.TypeConverterOptionsCache.GetOptions<DateTime>().Formats = new[] { "dd/MM/yyyy" };
        foreach (DataColumn column in dt.Columns)
        {
            csvWriter.WriteField(column.ColumnName);
        }
        csvWriter.NextRecord();

        foreach (DataRow row in dt.Rows)
        {
            for (var i = 0; i < dt.Columns.Count; i++)
            {
                csvWriter.WriteField(row[i]);
            }
            csvWriter.NextRecord();
        }
        csvWriter.Flush();
    }
}

【Comments】:
