Importing data from a (large) Excel file into a DataGridView and then into a database: why does the insert take so long and not save all the data?

【Posted】: 2019-10-30 17:56:21 【Question】:

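In this case I import data from a large file (about 37 MB) into a DataGridView. The table in the Excel file looks like this:

After loading the data from Excel into the DataGridView, I insert it into a MySQL database: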

foreach (DataGridViewRow row in datagrdStatus_order.Rows)
{
    string constring = "datasource = localhost; port = 3306; username = root; password = ";
    using (MySqlConnection con = new MySqlConnection(constring))
    {
        using (MySqlCommand cmd = new MySqlCommand("INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME", con))
        {
            cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
            cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
            cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
            cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
            cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
            cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
            cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
            cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
            cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
            cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);
            con.Open();
            cmd.ExecuteNonQuery();
            con.Close();
        }
    }
}

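The problems start when I run this import code.

    Even though I turned off the ContextSwitchDeadlock exception (and I insert into only 1 table), inserting this data takes about... 10 minutes.

    Although the program raises no error or exception, it does not insert all the data. In this case I can see that only 8 statuses were imported instead of all 11.

My question: why does inserting into the database take so long and not save all the data? How can I reduce the time it takes to insert the data into the MySQL db and still save all of it?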

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using MySql.Data.MySqlClient;
using System.Collections;
using System.Data.OleDb;
using System.IO;
using System.Configuration;

namespace ControlDataBase
{
    public partial class New_Tables : Form
    {
        public New_Tables()
        {
            InitializeComponent();
        }
        Form1 frm1 = (Form1)Application.OpenForms["Form1"];

        private void btnClose_Click(object sender, EventArgs e)
        {
            this.Close();
        }

        private void ImportData_Click(object sender, EventArgs e)
        {
            using (OpenFileDialog ofd = new OpenFileDialog() { Filter = "Excel Files|*.xlsx;*.xlsm;*.xlsb;*.xltx;*.xltm;*.xls;*.xlt;*.xls;*.xml;*.xml;*.xlam;*.xla;*.xlw;*.xlr;", ValidateNames = true })
            {
                if (ofd.ShowDialog() == DialogResult.OK)
                {
                    FileInfo fi = new FileInfo(ofd.FileName);
                    string FileName1 = ofd.FileName;

                    string excel = fi.FullName;

                    if (ofd.FileName.EndsWith(".xlsx"))
                    {
                        StrConn = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excel + ";Extended Properties=\"Excel 12.0;\"";
                    }

                    if (ofd.FileName.EndsWith(".xls"))
                    {
                        StrConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excel + ";Extended Properties=\"Excel 1.0;HDR=Yes;IMEX=1\"";
                    }
                    OleDbConnection oledbconn = new OleDbConnection(StrConn);

                    OleDbDataAdapter dta5 = new OleDbDataAdapter("SELECT * FROM [Order_status$]", oledbconn);
                    oledbconn.Open();

                    DataSet dsole5 = new DataSet();
                    dta5.Fill(dsole5, "Order_status$");
                    datagrdStatus_order.DataSource = dsole5.Tables["Order_status$"];

                    oledbconn.Close();

                    foreach (DataGridViewRow row in datagrdStatus_order.Rows)
                    {
                        string constring = "datasource = localhost; port = 3306; username = root; password = ";
                        using (MySqlConnection con = new MySqlConnection(constring))
                        {
                            using (MySqlCommand cmd = new MySqlCommand("INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME", con))
                            {
                                cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
                                cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
                                cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
                                cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
                                cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
                                cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
                                cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
                                cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
                                cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
                                cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);
                                con.Open();
                                cmd.ExecuteNonQuery();
                                con.Close();
                            }
                        }
                    }

                    connection.Close();
                    MessageBox.Show("The data are imported correctly");

                    loaddataalldatagridview();
                }
            }
        }

        private void loaddataalldatagridview()
        {
            frm1.loaddata5();
        }
    }
}

EDIT:

I modified the code based on @Matt_Johnson's answer:

1) With a for loop:

            string constring = "datasource = localhost; port = 3306; username = root; password = ";
            string query5 = "INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME";

            using (MySqlConnection con = new MySqlConnection(constring))
            {
                using (MySqlCommand cmd = new MySqlCommand(query5, con))
                {
                    cmd.Parameters.Add("@ID_WORKER", MySqlDbType.Int32);
                    cmd.Parameters.Add("@FNAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@LNAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@DESC_ORDER", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@ORDER_NUMBER", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@MODULES_NAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@PROJECT_NAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@AMOUNT_OF_PRODUCTS", MySqlDbType.Int32);
                    cmd.Parameters.Add("@BEGIN_DATE", MySqlDbType.DateTime);
                    cmd.Parameters.Add("@END_DATE", MySqlDbType.DateTime);
                    con.Open();

                    for (int i = 0; i < datagrdStatus_order.Rows.Count + 1; i++)
                    {
                        cmd.Parameters["@ID_WORKER"].Value = datagrdStatus_order.Rows[i].Cells[0].Value;
                        cmd.Parameters["@FNAME"].Value = datagrdStatus_order.Rows[i].Cells[1].Value;
                        cmd.Parameters["@LNAME"].Value = datagrdStatus_order.Rows[i].Cells[2].Value;
                        cmd.Parameters["@DESC_ORDER"].Value = datagrdStatus_order.Rows[i].Cells[3].Value;
                        cmd.Parameters["@ORDER_NUMBER"].Value = datagrdStatus_order.Rows[i].Cells[4].Value;
                        cmd.Parameters["@MODULES_NAME"].Value = datagrdStatus_order.Rows[i].Cells[5].Value;
                        cmd.Parameters["@PROJECT_NAME"].Value = datagrdStatus_order.Rows[i].Cells[6].Value;
                        cmd.Parameters["@AMOUNT_OF_PRODUCTS"].Value = datagrdStatus_order.Rows[i].Cells[7].Value;
                        cmd.Parameters["@BEGIN_DATE"].Value = datagrdStatus_order.Rows[i].Cells[8].Value;
                        cmd.Parameters["@END_DATE"].Value = datagrdStatus_order.Rows[i].Cells[9].Value;
                        cmd.ExecuteNonQuery();
                    }
                    con.Close();
                }
            }
            MessageBox.Show("Imported correctly");
            loaddataalldatagridview();

2) With a foreach loop:

            string constring = "datasource = localhost; port = 3306; username = root; password = ";
            string query5 = "INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME";

            using (MySqlConnection con = new MySqlConnection(constring))
            {
                using (MySqlCommand cmd = new MySqlCommand(query5, con))
                {
                    cmd.Parameters.Add("@ID_WORKER", MySqlDbType.Int32);
                    cmd.Parameters.Add("@FNAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@LNAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@DESC_ORDER", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@ORDER_NUMBER", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@MODULES_NAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@PROJECT_NAME", MySqlDbType.VarChar);
                    cmd.Parameters.Add("@AMOUNT_OF_PRODUCTS", MySqlDbType.Int32);
                    cmd.Parameters.Add("@BEGIN_DATE", MySqlDbType.DateTime);
                    cmd.Parameters.Add("@END_DATE", MySqlDbType.DateTime);
                    con.Open();

                    foreach (DataGridViewRow row in datagrdStatus_order.Rows)
                    {
                        cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
                        cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
                        cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
                        cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
                        cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
                        cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
                        cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
                        cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
                        cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
                        cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);
                        cmd.ExecuteNonQuery();
                    }
                    con.Close();
                }
            }
            MessageBox.Show("Imported correctly");
            loaddataalldatagridview();

I still do not have a working answer. Here is a link to the shared file for download: https://drive.google.com/file/d/1LE7phZwyT7VR3NJc6bA-n1reCJ_-X3u9/view?usp=sharing

Maybe it helps.

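【Comments】:

Some suggestions: batch the inserts instead of doing them one at a time; take advantage of async-await in the event handler; use a transaction; and use one connection instead of creating and disposing one for every row.

It is probably too large. Try a paged GET and chunked/batched inserts/updates.

You need to import the Excel data into a DataTable. The DataGridView can then be bound to the DataTable, which displays the result and allows direct editing. The DataTable can be used to update the MySql table quickly. As their names imply, a DataGridView is a view object, meant for viewing; a DataTable holds the actual data and is a database object. Each has its own strengths.

【Answer 1】: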

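Looking at the code for things that affect performance, two points stand out to me:

AddWithValue is expensive because it uses reflection, and sometimes it does the wrong thing. Instead, add the parameters with their names and data types, without supplying values. Then supply the values as needed.

You are connecting to and disconnecting from the database for every row. That is very expensive in terms of network resources. Instead, connect to the database once, execute your commands, and then close the connection. You can even reuse the command and simply change the parameter values for each row.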

using (MySqlConnection con ...)
{
    using (MySqlCommand cmd ...)
    {
        // ... define the parameters and add them to the command,
        // ... without adding values to them yet

        con.Open();
        foreach (...)
        {
            // ... now set values of the parameters for the row

            cmd.ExecuteNonQuery();
        }

        con.Close();
    }
}

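Note that I deliberately did not provide complete code, because most of it is already available in other Stack Overflow answers, on external sites, and in the product documentation. I think that if you try, you should be able to complete the puzzle. Good luck.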
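
For readers who want the skeleton above filled in, here is a minimal sketch rather than the answerer's own code. It assumes the MySql.Data package, iterates the bound DataTable instead of the grid rows (so the grid's empty "new row" placeholder is never touched), and takes the connection string and the INSERT ... SELECT statement from the question as arguments. The class name `OrderStatusImport` and the method `InsertAll` are hypothetical.

```csharp
using System;
using System.Data;
using MySql.Data.MySqlClient;

static class OrderStatusImport
{
    // Parameter name, source column, and MySQL type, as used in the question's code.
    static readonly (string Param, string Column, MySqlDbType Type)[] Map =
    {
        ("@ID_WORKER", "ID_WORKER", MySqlDbType.Int32),
        ("@FNAME", "FNAME", MySqlDbType.VarChar),
        ("@LNAME", "LNAME", MySqlDbType.VarChar),
        ("@DESC_ORDER", "DESC_ORDER", MySqlDbType.VarChar),
        ("@ORDER_NUMBER", "ORDER_NUMBER", MySqlDbType.VarChar),
        ("@MODULES_NAME", "NAME", MySqlDbType.VarChar),
        ("@PROJECT_NAME", "PROJECT_NAME", MySqlDbType.VarChar),
        ("@AMOUNT_OF_PRODUCTS", "AMOUNT_OF_PRODUCTS", MySqlDbType.Int32),
        ("@BEGIN_DATE", "BEGIN_DATE", MySqlDbType.DateTime),
        ("@END_DATE", "END_DATE", MySqlDbType.DateTime),
    };

    // Hypothetical helper: one connection, one command, values swapped per row.
    // Returns the total number of rows the server reports as affected.
    public static int InsertAll(DataTable table, string connectionString, string insertSql)
    {
        int affected = 0;
        using (var con = new MySqlConnection(connectionString))
        using (var cmd = new MySqlCommand(insertSql, con))
        {
            // Declare every parameter once, with a type but no value.
            foreach (var m in Map)
                cmd.Parameters.Add(m.Param, m.Type);

            con.Open();
            foreach (DataRow row in table.Rows)
            {
                // Only the values change from row to row.
                foreach (var m in Map)
                    cmd.Parameters[m.Param].Value = row[m.Column];
                affected += cmd.ExecuteNonQuery();
            }
        } // disposing the connection closes it
        return affected;
    }
}
```

Because `ExecuteNonQuery` returns the number of affected rows, comparing the returned total with `table.Rows.Count` is one way to see whether some source rows produced no insert at all, for example because `INSERT IGNORE` skipped them or the `SELECT` matched nothing.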

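【Comments】:

This looks like a great idea, but I tried defining the parameters with cmd.Parameters.Add(...); and then ran that code with both a foreach and a for loop. I still get the same result. Please see my edited question (I just added a link). Maybe it helps.

Please edit your question again to show the updated code. Thanks.

OK, I just edited the question. I added the code in the "EDIT" section. What do you think?

Hmm.. the for loop example looks OK at first glance. The for-each example is still calling AddWithValue, though. How were the results affected? You could consider using the parameter index (cmd.Parameters[0].Value instead of cmd.Parameters["@ID_WORKER"].Value) to save the lookup cost.

There is also a problem with the index. You go from 0 to

【Answer 2】:

    Export to a CSV file.
    Import it via LOAD DATA LOCAL INFILE ...

Both steps are very fast. No "code" is needed unless you need to manipulate the format of the columns.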
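
If the export step has to happen inside the application rather than in Excel, the following is one possible sketch of writing the already loaded DataTable to a CSV file. It uses only the .NET base class library. The class name `CsvExport`, the empty-string handling of NULL values, and the staging table named in the closing comment are assumptions for illustration, not part of the answer.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;

static class CsvExport
{
    // Quote a field only if it contains a comma, a quote, or a line break;
    // embedded quotes are doubled.
    static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Culture-independent text form; NULLs become empty fields (an assumption).
    static string Format(object value)
    {
        if (value == null || value == DBNull.Value) return "";
        if (value is DateTime dt)
            return dt.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    // Writes a header line followed by one line per DataRow, using "\n" line ends.
    public static void Write(DataTable table, string path)
    {
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(false)))
        {
            writer.NewLine = "\n";
            writer.WriteLine(string.Join(",",
                table.Columns.Cast<DataColumn>().Select(c => Escape(c.ColumnName))));
            foreach (DataRow row in table.Rows)
                writer.WriteLine(string.Join(",",
                    row.ItemArray.Select(v => Escape(Format(v)))));
        }
    }
}

// A matching import statement might look like this (hypothetical staging table):
//   LOAD DATA LOCAL INFILE 'order_status.csv'
//   INTO TABLE try1.order_status_staging
//   FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
//   LINES TERMINATED BY '\n'
//   IGNORE 1 LINES;
```

Since the question's target table stores IDs that are looked up through joins, the file would most likely have to be loaded into a staging table first and resolved with a single INSERT ... SELECT afterwards; that step is not shown here.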

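【Comments】:

【Answer 3】:

Try the code below. Because of the way you open and close the SQL connection on every row, you may have hit the maximum number of sessions, as @Nkosi said above.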

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using MySql.Data.MySqlClient;
using System.Collections;
using System.Data.OleDb;
using System.IO;
using System.Configuration;

namespace ControlDataBase
{
    public partial class New_Tables : Form
    {
        public New_Tables()
        {
            InitializeComponent();
        }
        Form1 frm1 = (Form1)Application.OpenForms["Form1"];

        private void btnClose_Click(object sender, EventArgs e)
        {
            this.Close();
        }

        private void ImportData_Click(object sender, EventArgs e)
        {
            using (OpenFileDialog ofd = new OpenFileDialog() { Filter = "Excel Files|*.xlsx;*.xlsm;*.xlsb;*.xltx;*.xltm;*.xls;*.xlt;*.xls;*.xml;*.xml;*.xlam;*.xla;*.xlw;*.xlr;", ValidateNames = true })
            {
                if (ofd.ShowDialog() == DialogResult.OK)
                {
                    FileInfo fi = new FileInfo(ofd.FileName);
                    string FileName1 = ofd.FileName;

                    string excel = fi.FullName;

                    if (ofd.FileName.EndsWith(".xlsx"))
                    {
                        StrConn = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excel + ";Extended Properties=\"Excel 12.0;\"";
                    }

                    if (ofd.FileName.EndsWith(".xls"))
                    {
                        StrConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excel + ";Extended Properties=\"Excel 1.0;HDR=Yes;IMEX=1\"";
                    }
                    OleDbConnection oledbconn = new OleDbConnection(StrConn);

                    OleDbDataAdapter dta5 = new OleDbDataAdapter("SELECT * FROM [Order_status$]", oledbconn);
                    oledbconn.Open();

                    DataSet dsole5 = new DataSet();
                    dta5.Fill(dsole5, "Order_status$");
                    datagrdStatus_order.DataSource = dsole5.Tables["Order_status$"];

                    oledbconn.Close();
                    string constring = "datasource = localhost; port = 3306; username = root; password = ";
                    using (MySqlConnection con = new MySqlConnection(constring))
                    {
                        // Open the connection once, before the loop.
                        con.Open();
                        foreach (DataGridViewRow row in datagrdStatus_order.Rows)
                        {
                            using (MySqlCommand cmd = new MySqlCommand("INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME", con))
                            {
                                cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
                                cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
                                cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
                                cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
                                cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
                                cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
                                cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
                                cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
                                cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
                                cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);

                                cmd.ExecuteNonQuery();
                            }
                        }
                        con.Close();
                    }

                    connection.Close();
                    MessageBox.Show("The data are imported correctly");

                    loaddataalldatagridview();
                }
            }
        }

        private void loaddataalldatagridview()
        {
            frm1.loaddata5();
        }
    }
}

Also, bulk inserts are definitely worth looking into, unless there is a reason to insert each row individually.
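
A cheaper intermediate step, short of a true bulk insert, is the transaction suggested in the question's comments: all per-row statements are grouped into one commit instead of being committed one by one under autocommit. The sketch below assumes the MySql.Data package and a transactional storage engine such as InnoDB; the helper name `RunInTransaction` and its `bind` callback are hypothetical.

```csharp
using System;
using System.Data;
using MySql.Data.MySqlClient;

static class TransactionalInsert
{
    // Hypothetical helper. `con` must already be open and `cmd` must be
    // attached to it with its parameters declared; `bind` copies one
    // DataRow's values into the command's parameters.
    public static void RunInTransaction(MySqlConnection con, MySqlCommand cmd,
                                        DataTable table, Action<MySqlCommand, DataRow> bind)
    {
        using (MySqlTransaction tx = con.BeginTransaction())
        {
            cmd.Transaction = tx;
            try
            {
                foreach (DataRow row in table.Rows)
                {
                    bind(cmd, row);
                    cmd.ExecuteNonQuery();
                }
                tx.Commit();   // a single commit for the whole import
            }
            catch
            {
                tx.Rollback(); // leave the table unchanged if any row fails
                throw;
            }
        }
    }
}
```

Whether this helps noticeably depends on the server configuration and on how expensive the INSERT ... SELECT itself is, so it is worth timing with and without the transaction.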

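【Comments】:

Thanks for your answer, but it did not help enough. I get the same result as before the change. Please see my edited question (I just added a link). Maybe it helps.

【Answer 4】:

Try going from Excel straight to the database, then bind the DataGridView from SQL. A DataGridView uses more memory, and moving the data back to the SQL database also takes time.

【Comments】: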
