Importing data from a (large) Excel file into a DataGridView and then into a database: why does inserting into the database take so long and not save all the data?
[Posted]: 2019-10-30 17:56:21
[Question]: In this case I am importing data from a large file (about 37 MB) into a DataGridView. The table in the Excel file looks like this: [screenshot of the Excel table]
After loading the data from Excel into the DataGridView, I insert that data into a MySQL database:
foreach (DataGridViewRow row in datagrdStatus_order.Rows)
{
    string constring = "datasource = localhost; port = 3306; username = root; password = ";
    using (MySqlConnection con = new MySqlConnection(constring))
    {
        using (MySqlCommand cmd = new MySqlCommand("INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME", con))
        {
            cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
            cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
            cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
            cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
            cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
            cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
            cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
            cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
            cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
            cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);
            con.Open();
            cmd.ExecuteNonQuery();
            con.Close();
        }
    }
}
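The problems start when I run this import code.
Even with the ContextSwitchDeadlock exception turned off (and inserting into only one table), inserting this data takes about... 10 minutes.
Although the program raises no errors or exceptions, it does not insert all the data. In this case I can see that only 8 statuses were imported instead of all 11.
My question: why does inserting into the database take so long, and why is not all the data saved? How can I reduce the time it takes to insert the data into the MySQL database and make sure all of it is saved?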
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using MySql.Data.MySqlClient;
using System.Collections;
using System.Data.OleDb;
using System.IO;
using System.Configuration;

namespace ControlDataBase
{
    public partial class New_Tables : Form
    {
        public New_Tables()
        {
            InitializeComponent();
        }

        Form1 frm1 = (Form1)Application.OpenForms["Form1"];

        private void btnClose_Click(object sender, EventArgs e)
        {
            this.Close();
        }

        private void ImportData_Click(object sender, EventArgs e)
        {
            using (OpenFileDialog ofd = new OpenFileDialog() { Filter = "Excel Files|*.xlsx;*.xlsm;*.xlsb;*.xltx;*.xltm;*.xls;*.xlt;*.xls;*.xml;*.xml;*.xlam;*.xla;*.xlw;*.xlr;", ValidateNames = true })
            {
                if (ofd.ShowDialog() == DialogResult.OK)
                {
                    FileInfo fi = new FileInfo(ofd.FileName);
                    string FileName1 = ofd.FileName;
                    string excel = fi.FullName;
                    if (ofd.FileName.EndsWith(".xlsx"))
                    {
                        StrConn = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excel + ";Extended Properties=\"Excel 12.0;\"";
                    }
                    if (ofd.FileName.EndsWith(".xls"))
                    {
                        StrConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excel + ";Extended Properties=\"Excel 1.0;HDR=Yes;IMEX=1\"";
                    }
                    OleDbConnection oledbconn = new OleDbConnection(StrConn);
                    OleDbDataAdapter dta5 = new OleDbDataAdapter("SELECT * FROM [Order_status$]", oledbconn);
                    oledbconn.Open();
                    DataSet dsole5 = new DataSet();
                    dta5.Fill(dsole5, "Order_status$");
                    datagrdStatus_order.DataSource = dsole5.Tables["Order_status$"];
                    oledbconn.Close();
                    foreach (DataGridViewRow row in datagrdStatus_order.Rows)
                    {
                        string constring = "datasource = localhost; port = 3306; username = root; password = ";
                        using (MySqlConnection con = new MySqlConnection(constring))
                        {
                            using (MySqlCommand cmd = new MySqlCommand("INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME", con))
                            {
                                cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
                                cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
                                cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
                                cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
                                cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
                                cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
                                cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
                                cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
                                cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
                                cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);
                                con.Open();
                                cmd.ExecuteNonQuery();
                                con.Close();
                            }
                        }
                    }
                    connection.Close();
                    MessageBox.Show("The data are imported correctly");
                    loaddataalldatagridview();
                }
            }
        }

        private void loaddataalldatagridview()
        {
            frm1.loaddata5();
        }
    }
}
EDIT:
I modified the code based on @Matt_Johnson's answer:
1) With a for loop:
string constring = "datasource = localhost; port = 3306; username = root; password = ";
string query5 = "INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME";
using (MySqlConnection con = new MySqlConnection(constring))
{
    using (MySqlCommand cmd = new MySqlCommand(query5, con))
    {
        cmd.Parameters.Add("@ID_WORKER", MySqlDbType.Int32);
        cmd.Parameters.Add("@FNAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@LNAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@DESC_ORDER", MySqlDbType.VarChar);
        cmd.Parameters.Add("@ORDER_NUMBER", MySqlDbType.VarChar);
        cmd.Parameters.Add("@MODULES_NAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@PROJECT_NAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@AMOUNT_OF_PRODUCTS", MySqlDbType.Int32);
        cmd.Parameters.Add("@BEGIN_DATE", MySqlDbType.DateTime);
        cmd.Parameters.Add("@END_DATE", MySqlDbType.DateTime);
        con.Open();
        for (int i = 0; i < datagrdStatus_order.Rows.Count + 1; i++)
        {
            cmd.Parameters["@ID_WORKER"].Value = datagrdStatus_order.Rows[i].Cells[0].Value;
            cmd.Parameters["@FNAME"].Value = datagrdStatus_order.Rows[i].Cells[1].Value;
            cmd.Parameters["@LNAME"].Value = datagrdStatus_order.Rows[i].Cells[2].Value;
            cmd.Parameters["@DESC_ORDER"].Value = datagrdStatus_order.Rows[i].Cells[3].Value;
            cmd.Parameters["@ORDER_NUMBER"].Value = datagrdStatus_order.Rows[i].Cells[4].Value;
            cmd.Parameters["@MODULES_NAME"].Value = datagrdStatus_order.Rows[i].Cells[5].Value;
            cmd.Parameters["@PROJECT_NAME"].Value = datagrdStatus_order.Rows[i].Cells[6].Value;
            cmd.Parameters["@AMOUNT_OF_PRODUCTS"].Value = datagrdStatus_order.Rows[i].Cells[7].Value;
            cmd.Parameters["@BEGIN_DATE"].Value = datagrdStatus_order.Rows[i].Cells[8].Value;
            cmd.Parameters["@END_DATE"].Value = datagrdStatus_order.Rows[i].Cells[9].Value;
            cmd.ExecuteNonQuery();
        }
        con.Close();
    }
}
MessageBox.Show("Imported correctly");
loaddataalldatagridview();
2) With a foreach loop:
string constring = "datasource = localhost; port = 3306; username = root; password = ";
string query5 = "INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME";
using (MySqlConnection con = new MySqlConnection(constring))
{
    using (MySqlCommand cmd = new MySqlCommand(query5, con))
    {
        cmd.Parameters.Add("@ID_WORKER", MySqlDbType.Int32);
        cmd.Parameters.Add("@FNAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@LNAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@DESC_ORDER", MySqlDbType.VarChar);
        cmd.Parameters.Add("@ORDER_NUMBER", MySqlDbType.VarChar);
        cmd.Parameters.Add("@MODULES_NAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@PROJECT_NAME", MySqlDbType.VarChar);
        cmd.Parameters.Add("@AMOUNT_OF_PRODUCTS", MySqlDbType.Int32);
        cmd.Parameters.Add("@BEGIN_DATE", MySqlDbType.DateTime);
        cmd.Parameters.Add("@END_DATE", MySqlDbType.DateTime);
        con.Open();
        foreach (DataGridViewRow row in datagrdStatus_order.Rows)
        {
            cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
            cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
            cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
            cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
            cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
            cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
            cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
            cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
            cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
            cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);
            cmd.ExecuteNonQuery();
        }
        con.Close();
    }
}
MessageBox.Show("Imported correctly");
loaddataalldatagridview();
I still don't have a working answer. Here is a link to the shared file for download: https://drive.google.com/file/d/1LE7phZwyT7VR3NJc6bA-n1reCJ_-X3u9/view?usp=sharing
Maybe it helps.
[Comments]:
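Some suggestions. Insert in batches rather than one row at a time. Make use of async-await in the event handler. Use a transaction. Use one connection instead of creating and disposing one for every row.
It may simply be too large. Try a paged GET and chunked/batched inserts/updates.
You need to import the Excel data into a DataTable. The DataGridView can then be bound to the DataTable to display the results and allow direct editing, and the DataTable can be used to update the MySQL table quickly. As the names suggest, a DataGridView is a view object, meant for viewing; a DataTable holds the actual data and is a database object. Each has its own strengths.
[Answer 1]: Looking at the code for things that affect performance, two points stand out to me:
1) AddWithValue is expensive because it uses reflection, and it sometimes infers the wrong type. Instead, add the parameters with their names and data types, without supplying values. Then supply the values as needed.
2) You are connecting to and disconnecting from the database for every row. In terms of network resources, that is very expensive. Instead, connect to the database once, execute your commands, and then close the connection. You can even reuse the command and simply change the parameter values for each row.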
using (MySqlConnection con ...)
{
    using (MySqlCommand cmd ...)
    {
        ... define the parameters and add them to the command,
        ... without adding values to them yet
        con.Open();
        foreach (...)
        {
            ... now set values of the parameters for the row
            cmd.ExecuteNonQuery();
        }
        con.Close();
    }
}
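Note that I deliberately did not provide complete code, since most of it is already available in other Stack Overflow answers, on external sites, and in the product documentation. I think that if you try, you should be able to finish the puzzle. Good luck.
[Comments]: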
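For readers who want to see the two points above put together, here is a minimal sketch, not the answerer's own code. It assumes the Excel sheet has already been loaded into a DataTable with the column names used in the question, and that `insertSql` is the question's INSERT statement; the class name `OrderStatusImporter`, the `Map` table, and the added transaction are illustrative assumptions. Wrapping the loop in one transaction is the suggestion from the question's comments, not part of this answer.

```csharp
using System;
using System.Data;
using MySql.Data.MySqlClient;

// Hypothetical helper, for illustration only.
public static class OrderStatusImporter
{
    // Parameter name, MySQL type, and source column, taken from the question's code.
    public static readonly (string Param, MySqlDbType Type, string Column)[] Map =
    {
        ("@ID_WORKER", MySqlDbType.Int32, "ID_WORKER"),
        ("@FNAME", MySqlDbType.VarChar, "FNAME"),
        ("@LNAME", MySqlDbType.VarChar, "LNAME"),
        ("@DESC_ORDER", MySqlDbType.VarChar, "DESC_ORDER"),
        ("@ORDER_NUMBER", MySqlDbType.VarChar, "ORDER_NUMBER"),
        ("@MODULES_NAME", MySqlDbType.VarChar, "NAME"),
        ("@PROJECT_NAME", MySqlDbType.VarChar, "PROJECT_NAME"),
        ("@AMOUNT_OF_PRODUCTS", MySqlDbType.Int32, "AMOUNT_OF_PRODUCTS"),
        ("@BEGIN_DATE", MySqlDbType.DateTime, "BEGIN_DATE"),
        ("@END_DATE", MySqlDbType.DateTime, "END_DATE"),
    };

    // Runs insertSql once per DataTable row over a single open connection,
    // a single reusable command, and a single transaction.
    // Returns the total number of rows the server reports as affected.
    public static int Import(DataTable table, string connectionString, string insertSql)
    {
        int affected = 0;
        using (var con = new MySqlConnection(connectionString))
        using (var cmd = new MySqlCommand(insertSql, con))
        {
            // Declare every parameter once, with an explicit type and no value yet.
            foreach (var m in Map)
            {
                cmd.Parameters.Add(m.Param, m.Type);
            }

            con.Open();
            using (MySqlTransaction tx = con.BeginTransaction())
            {
                cmd.Transaction = tx;
                foreach (DataRow row in table.Rows)
                {
                    // Only the values change from row to row.
                    foreach (var m in Map)
                    {
                        cmd.Parameters[m.Param].Value = row[m.Column];
                    }
                    affected += cmd.ExecuteNonQuery();
                }
                tx.Commit();
            }
        }
        return affected;
    }
}
```

Comparing the returned count with `table.Rows.Count` is one way to notice rows that were skipped without any error: an INSERT IGNORE ... SELECT inserts nothing when its WHERE clause matches no rows, and IGNORE also suppresses duplicate-key failures.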
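This looks like a great idea, but I tried defining the parameters like cmd.Parameters.Add(...); and then ran that code with both a foreach and a for loop. I still get the same result. See my edited question (I just added a link). Maybe it helps.
Please edit your question again to show the updated code. Thanks.
OK, I just edited the question. I added the code in the "EDIT" section. What do you think?
Hmm.. the for loop example looks fine at first glance. The foreach example is still calling AddWithValue, though. How were the results affected? You might consider using parameter indexes (cmd.Parameters[0].Value instead of cmd.Parameters["@ID_WORKER"].Value) to save the lookup cost.
The indexing is also a problem. You go from 0 to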
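[Answer 2]:
1) Export to a CSV file.
2) Import it with LOAD DATA LOCAL INFILE ...
Both steps are very fast. No "code" is needed unless you need to manipulate the format of the columns.
[Comments]: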
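If the data is already sitting in a DataTable rather than in Excel itself, the export step can be sketched in a few lines of C#. This is only an illustration under assumptions: the class `CsvExport` and its methods are hypothetical names, the values are assumed not to contain backslashes (MySQL's default escape character in LOAD DATA), and the staging table named in the trailing comment does not appear in the original post.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper, for illustration only.
public static class CsvExport
{
    // Quotes a field only when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled.
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Formats one cell: NULL becomes \N (read back as NULL by LOAD DATA with the
    // default escape character), dates use the MySQL DATETIME literal format.
    public static string FormatValue(object value)
    {
        if (value == null || value == DBNull.Value) return "\\N";
        if (value is DateTime dt) return dt.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
        return Escape(Convert.ToString(value, CultureInfo.InvariantCulture));
    }

    // Returns the whole table as CSV text: one header line, then one line per row.
    public static string ToCsv(DataTable table)
    {
        var sb = new StringBuilder();
        sb.Append(string.Join(",", table.Columns.Cast<DataColumn>().Select(c => Escape(c.ColumnName))));
        sb.Append('\n');
        foreach (DataRow row in table.Rows)
        {
            sb.Append(string.Join(",", row.ItemArray.Select(FormatValue)));
            sb.Append('\n');
        }
        return sb.ToString();
    }
}

// A matching import statement could look like this (the staging table name is an assumption):
//   LOAD DATA LOCAL INFILE 'order_status.csv'
//   INTO TABLE try1.order_status_staging
//   FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
//   LINES TERMINATED BY '\n'
//   IGNORE 1 LINES;
```

Because the question's INSERT resolves IDs through joins, the CSV would most likely be loaded into a staging table first and then moved into order_status with a single INSERT ... SELECT; that second step is not shown here.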
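[Answer 3]: Try the code below. Because of the way you were opening and closing the SQL connection on every row, you may have hit the maximum number of sessions, as @Nkosi said above.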
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using MySql.Data.MySqlClient;
using System.Collections;
using System.Data.OleDb;
using System.IO;
using System.Configuration;

namespace ControlDataBase
{
    public partial class New_Tables : Form
    {
        public New_Tables()
        {
            InitializeComponent();
        }

        Form1 frm1 = (Form1)Application.OpenForms["Form1"];

        private void btnClose_Click(object sender, EventArgs e)
        {
            this.Close();
        }

        private void ImportData_Click(object sender, EventArgs e)
        {
            using (OpenFileDialog ofd = new OpenFileDialog() { Filter = "Excel Files|*.xlsx;*.xlsm;*.xlsb;*.xltx;*.xltm;*.xls;*.xlt;*.xls;*.xml;*.xml;*.xlam;*.xla;*.xlw;*.xlr;", ValidateNames = true })
            {
                if (ofd.ShowDialog() == DialogResult.OK)
                {
                    FileInfo fi = new FileInfo(ofd.FileName);
                    string FileName1 = ofd.FileName;
                    string excel = fi.FullName;
                    if (ofd.FileName.EndsWith(".xlsx"))
                    {
                        StrConn = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excel + ";Extended Properties=\"Excel 12.0;\"";
                    }
                    if (ofd.FileName.EndsWith(".xls"))
                    {
                        StrConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + excel + ";Extended Properties=\"Excel 1.0;HDR=Yes;IMEX=1\"";
                    }
                    OleDbConnection oledbconn = new OleDbConnection(StrConn);
                    OleDbDataAdapter dta5 = new OleDbDataAdapter("SELECT * FROM [Order_status$]", oledbconn);
                    oledbconn.Open();
                    DataSet dsole5 = new DataSet();
                    dta5.Fill(dsole5, "Order_status$");
                    datagrdStatus_order.DataSource = dsole5.Tables["Order_status$"];
                    oledbconn.Close();
                    string constring = "datasource = localhost; port = 3306; username = root; password = ";
                    using (MySqlConnection con = new MySqlConnection(constring))
                    {
                        con.Open();
                        foreach (DataGridViewRow row in datagrdStatus_order.Rows)
                        {
                            using (MySqlCommand cmd = new MySqlCommand("INSERT IGNORE INTO try1.order_status(ID_WORKER, ID_ORDER, ID_MODULE, ID_PROJECT, AMOUNT_OF_PRODUCTS, BEGIN_DATE, END_DATE) SELECT workers.ID_WORKER, orders.ID_ORDER, module.ID_MODULE, projects.ID, @AMOUNT_OF_PRODUCTS, @BEGIN_DATE, @END_DATE FROM try1.workers INNER JOIN try1.orders INNER JOIN try1.modules INNER JOIN try1.projects WHERE workers.FNAME = @FNAME AND workers.LNAME = @LNAME AND workers.ID_WORKER = @ID_WORKER AND orders.DESC_ORDER = @DESC_ORDER AND orders.ORDER_NUMBER = @ORDER_NUMBER AND modules.NAME = @MODULES_NAME AND projects.PROJECT_NAME = @PROJECT_NAME", con))
                            {
                                cmd.Parameters.AddWithValue("@ID_WORKER", row.Cells["ID_WORKER"].Value);
                                cmd.Parameters.AddWithValue("@FNAME", row.Cells["FNAME"].Value);
                                cmd.Parameters.AddWithValue("@LNAME", row.Cells["LNAME"].Value);
                                cmd.Parameters.AddWithValue("@DESC_ORDER", row.Cells["DESC_ORDER"].Value);
                                cmd.Parameters.AddWithValue("@ORDER_NUMBER", row.Cells["ORDER_NUMBER"].Value);
                                cmd.Parameters.AddWithValue("@MODULES_NAME", row.Cells["NAME"].Value);
                                cmd.Parameters.AddWithValue("@PROJECT_NAME", row.Cells["PROJECT_NAME"].Value);
                                cmd.Parameters.AddWithValue("@AMOUNT_OF_PRODUCTS", row.Cells["AMOUNT_OF_PRODUCTS"].Value);
                                cmd.Parameters.AddWithValue("@BEGIN_DATE", row.Cells["BEGIN_DATE"].Value);
                                cmd.Parameters.AddWithValue("@END_DATE", row.Cells["END_DATE"].Value);
                                cmd.ExecuteNonQuery();
                            }
                        }
                        con.Close();
                    }
                    connection.Close();
                    MessageBox.Show("The data are imported correctly");
                    loaddataalldatagridview();
                }
            }
        }

        private void loaddataalldatagridview()
        {
            frm1.loaddata5();
        }
    }
}
Also, batch inserts are definitely worth looking into, unless there is a reason to insert each row individually.
[Comments]:
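As a rough idea of what batching could look like, the sketch below splits rows into fixed-size chunks and builds one multi-row INSERT per chunk. It is an illustration only: the class `BatchInsert`, its method names, and the staging table in the usage note are hypothetical, and the statement it builds is a plain INSERT ... VALUES, not the question's INSERT ... SELECT with joins.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class BatchInsert
{
    // Splits source into consecutive lists of at most size items.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var batch = new List<T>(size);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }
        // Emit the final, shorter batch if there is one.
        if (batch.Count > 0) yield return batch;
    }

    // Builds a multi-row statement such as
    //   INSERT INTO t (A, B) VALUES (@A0, @B0), (@A1, @B1)
    // Parameter names are the column name followed by the row index within the batch.
    public static string BuildInsert(string table, IReadOnlyList<string> columns, int rowCount)
    {
        var rows = Enumerable.Range(0, rowCount)
            .Select(r => "(" + string.Join(", ", columns.Select(c => "@" + c + r)) + ")");
        return "INSERT INTO " + table + " (" + string.Join(", ", columns) + ") VALUES " + string.Join(", ", rows);
    }
}
```

For example, `BatchInsert.BuildInsert("try1.order_status_staging", new[] { "FNAME", "LNAME" }, 2)` produces a statement with two value tuples; each batch then needs one parameter per column per row before it is executed once. Sending a few hundred rows per statement instead of one usually cuts the number of round trips considerably, though the best batch size depends on the server's packet limits.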
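Thanks for your answer, but it did not help enough. I get the same result as before the change. See my edited question (I just added a link). Maybe it helps.
[Answer 4]: Try going from Excel to the database first, and then bind the DataGridView from SQL. The DataGridView takes up more memory, and writing its contents back to the SQL database also takes time.
[Comments]: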