How do you correctly and quickly compare DataRows / DataTables?
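【Title】: How do you correctly and quickly compare Datarows / Datatables? 【Posted】: 2021-08-03 04:35:37 【Question】:
Update, to explain the kind of DataTable comparison I am doing: "I am comparing two DataTables with the same columns. One DataTable was originally pulled from an external server and inserted locally. Since then, only the last 6 months of records are pulled from the external DB (for various reasons), and that data is compared with the local data (for the same 6-month date range) to see whether a DataRow has changed and needs to be deleted or added. The row identifier (PKey) is essentially a SalesID + LineRow match. The other columns are the values compared to decide whether the row needs to be re-added or deleted because the incoming columns differ from the current ones. Local rows that the incoming data no longer contains are deleted as well.
So basically I want an Exclusive Left Join [insert that data] and an Exclusive Right Join [delete that data]."
I have been doing some DB coding as well as JSON pulling, and I wonder what the standard/correct way of doing this is. I started with a 2 hour comparison time (on dummy DB tables), got it down to 1 hour, and then to 1 second (after applying my janky method to the DB table comparison). I eventually used it in the live pull, and the results seemed correct and consistent, so I began testing on dummy data, going from 1 hour to 26 minutes and finally down to a fraction of a second.
The first and most obvious idea was to use two nested foreach iterations. Intuitively this seemed like it would be slow, but I thought it might be acceptable given how fast Add is and how quickly you can compare JSON tokens while traversing JArrays. The code was something like the following: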
DataTable dtQueryItemsDiff = dtItems.Clone();
DataTable dtItemsDiff = dtItems.Clone();
int maxRowCountCache = dtItems.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name")).Count();
int rowcountCCache = 0;
var query = dtQuery.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name"));
foreach (DataRow drDTI in dtItems.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name")))
{
    int innerrowcount = 0;
    bool rowfound = false;
    if (query.Count() != 0)
    {
        // Linear scan of the other table for every outer row: O(n * m) overall.
        foreach (DataRow drDTQ in query)
        {
            // Match on the logical key: SalesID + LineNumber.
            if (drDTI["SalesID"].ToString() == drDTQ["SalesID"].ToString() && drDTI["LineNumber"].ToString() == drDTQ["LineNumber"].ToString())
            {
                rowfound = true;
                break;
            }
            innerrowcount++;
        }
    }
    else
    {
        // Nothing left to compare against: every remaining row is a difference.
        dtItemsDiff.ImportRow(drDTI);
        continue;
    }
    if (rowfound == true)
    {
        // Shrink the search space by removing the matched row.
        query.ElementAt(innerrowcount).Delete();
    }
    else
    {
        dtItemsDiff.ImportRow(drDTI);
    }
    rowcountCCache++;
    BeginInvoke(new MethodInvoker(delegate
    {
        lblDataLoadC.Text = rowcountCCache.ToString() + " / " + maxRowCountCache.ToString();
    }));
}
// Whatever is still in query was never matched by a row in dtItems.
if (query.Count() != 0)
{
    foreach (DataRow drDTQ in query)
    {
        dtQueryItemsDiff.ImportRow(drDTQ);
    }
}
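This takes quite a long time, about 1 to 1.5 hours depending on how the data is sorted, etc. The upside is that I can change the code at a fine-grained level, it gives me the non-matching data from both tables, and it shrinks the query that is being searched. It was still not fast enough for me, so I tried a LINQ search without shrinking the list (deleting and then searching was slower than just searching). That took about 40 to 50 minutes and looked like this: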
int maxRowCountCache = dtItems.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name")).Count();
int rowcountCCache = 0;
dtItems.AcceptChanges();
foreach (DataRow drDTI in dtItems.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name")))
{
    // Where(...).FirstOrDefault() still scans the other table once per outer row.
    var checkIfRecordInIDB = progSettings.query.AsEnumerable().Where(row => row.Field<string>("CardRecordID") == drDTI["CardRecordID"].ToString()
        && row.Field<string>("Date") == drDTI["Date"].ToString() && row.Field<string>("SaleID") == drDTI["SaleID"].ToString()
        && row.Field<string>("ItemID") == drDTI["ItemID"].ToString() && row.Field<Int64>("LineNumber") == Convert.ToInt64(drDTI["LineNumber"].ToString())).FirstOrDefault();
    if (checkIfRecordInIDB != null)
    {
        drDTI.Delete();
    }
    rowcountCCache++;
    BeginInvoke(new MethodInvoker(delegate
    {
        lblDataLoadC.Text = rowcountCCache.ToString() + " / " + maxRowCountCache.ToString();
    }));
}
dtItems.AcceptChanges();
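The upside of this version is that it is a bit lazier, faster and more concise, but it only gives you the data from one table, much like Except does. Except is exactly what I tried next, with about 100,000 rows of dummy data, and it took 26 minutes and 35 seconds.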
dtItems.Rows.Clear();
query.Rows.Clear();
Thread start = new Thread(timerAndUIupdate);
start.Start();
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "72421ee8-459b-46fb-bf5a-f51e80976e5a", "Pioneer 1kg (FT), RRP $42", "100115", 1, 25.0, "N");
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "8885a911-8d32-4dfe-93e5-2e453fd54db9", "Decaf Beans 250g FT", "1002302", 2, 2.0, "N");
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "e3aa4b15-b774-4f6a-ac21-77fa05a4332f", "P&R Cups 06oz (1000)", "30056", 3, 1.0, "N");
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 4, 1.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "72421ee8-459b-46fb-bf5a-f51e80976e5a", "Pioneer 1kg (FT), RRP $42", "100115", 1, 25.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "8885a911-8d32-4dfe-93e5-2e453fd54db9", "Decaf Beans 250g FT", "1002302", 2, 2.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "e3aa4b15-b774-4f6a-ac21-77fa05a4332f", "P&R Cups 06oz (1000)", "30056", 3, 1.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 4, 1.0, "N");
for (int i = 1; i < 100000; i++)
{
    dtItems.Rows.Add("Bennett St Dairy", "ed0c8d30-6469-4e13-af5a-36d7357a4a70", "2019-07-01", "2019-07-01", "8b909a4b-a07b-4a06-bebc-6a3387433aaf", "c8cc1115-da02-42cf-b427-accc1b6d07e3", "Trailblazer 1Kg, RRP $44", "10011", i, (i * 4), "N");
    query.Rows.Add("Bennett St Dairy", "ed0c8d30-6469-4e13-af5a-36d7357a4a70", "2019-07-01", "2019-07-01", "8b909a4b-a07b-4a06-bebc-6a3387433aaf", "c8cc1115-da02-42cf-b427-accc1b6d07e3", "Trailblazer 1Kg, RRP $44", "10011", i, (i * 4), "N");
}
dtItems.Rows.Add("Air Coffee International Cafe Pty Ltd", "bb4fa724-9759-4c60-93fe-70fbdfd00417", "2019-07-01", "2019-07-01", "b972f020-3740-4ef2-941f-78b1a9edefa8", "0be54733-ac0e-43f9-8ea5-204c7cdb5f48", "Custom 1kg", "100116", 1, 4.0, "N");
dtItems.Rows.Add("Allure Cafe & Co.", "f76f383f-e9f4-45c9-bb93-81102629b9c3", "2019-07-01", "2019-07-01", "2ad0667f-2254-4df5-8b24-eb36736cabb0", "6edc584b-a8eb-4f0b-a449-dbcb76a40a24", "Porter St 1Kg, RRP $40", "100111", 1, 10.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "6edc584b-a8eb-4f0b-a449-dbcb76a40a24", "Porter St 1Kg, RRP $40", "100111", 1, 30.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 2, 12.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "401ce902-e158-4f21-85a5-3312c32457fc", "Lids 06/08/12oz (White) (1000)", "30062", 3, 7.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "9b80c825-6e9f-4f6b-9c77-f3378cc220e4", "4-Cup Cardboard Holders (300)", "41003", 4, 1.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "ea4c906e-fab1-4b15-8845-619f20e53c6a", "Organic Panela 1kg", "20014", 5, 2.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "bb3e1c10-9e67-46d3-99b4-17df45dead90", "Chocolate Powder 1Kg, RRP $25", "20034", 6, 1.0, "N");
query.Rows.Add("Aussie Bites Cafe", "30389aca-9089-4b37-9a1e-5fbc3c2af485", "2019-07-01", "2019-07-01", "85df1af6-3d1e-4e04-8fe9-d90462a59d4c", "ea89ade4-c7ff-4d79-abcd-dcdbb8122562", "X Blend 1Kg, RRP $40", "100112", 1, 4.0, "N");
query.Rows.Add("Aussie Bites Cafe", "30389aca-9089-4b37-9a1e-5fbc3c2af485", "2019-07-01", "2019-07-01", "85df1af6-3d1e-4e04-8fe9-d90462a59d4c", "21fe57ad-08f9-4c8b-81d0-d7b88b291571", "webfreight", "webfreight", 2, 1.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "6edc584b-a8eb-4f0b-a449-dbcb76a40a24", "Porter St 1Kg, RRP $40", "100111", 1, 30.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 2, 1.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "401ce902-e158-4f21-85a5-3312c32457fc", "Lids 06/08/12oz (White) (1000)", "30062", 3, 2.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "9b80c825-6e9f-4f6b-9c77-f3378cc220e4", "4-Cup Cardboard Holders (300)", "41003", 4, 1.0, "N");
Stopwatch pullTime = new();
pullTime.Start();
BeginInvoke(new MethodInvoker(delegate
{
    lblTimerAddRowEnd.Text = "Start Time,Except: " + pullTime.Elapsed.ToString("mm\\:ss\\.ff");
}));
var orderedDtItems = dtItems.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name"));
var orderedDtquery = query.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name"));
// Rows of dtItems that have no fully equal row in query.
DataTable excepteditems = orderedDtItems.Except(orderedDtquery, DataRowComparer.Default).CopyToDataTable();
BeginInvoke(new MethodInvoker(delegate
{
    labelControl1.Text = "End Time,Except: " + pullTime.Elapsed.ToString("mm\\:ss\\.ff");
}));
BeginInvoke(new MethodInvoker(delegate
{
    dgvResults.DataSource = excepteditems;
    btnStart.Enabled = true;
    simpleButton1.Enabled = true;
}));
The UI updater code below runs on its own thread and was used for all of the test comparisons:
private void timerAndUIupdate()
{
    Stopwatch pullTime = new();
    pullTime.Start();
    // Refresh the timer label every 500 ms until the comparison re-enables btnStart.
    do
    {
        Thread.Sleep(500);
        BeginInvoke(new MethodInvoker(delegate
        {
            lblTimer.Text = "Timer: " + pullTime.Elapsed.ToString("mm\\:ss\\.ff");
            Application.DoEvents();
        }));
    }
    while (btnStart.Enabled == false);
    pullTime.Stop();
    BeginInvoke(new MethodInvoker(delegate
    {
        lblTimer.Text = "Timer: " + pullTime.Elapsed.ToString("mm\\:ss\\.ff");
        Application.DoEvents();
    }));
}
The result in the WinForms window looked like this (screenshot not preserved).
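One likely reason Except is so slow on this dummy data: as far as I can tell from the .NET reference source, `DataRowComparer.Default` derives its hash code from the first column only, and 99,999 of the test rows share the same first value ("Bennett St Dairy"). Nearly every row would then land in the same hash bucket, so Except falls back to comparing rows one by one. The following is a minimal sketch of a hash-based alternative, not the original code. It assumes a reduced, hypothetical schema (Name, SaleID, LineNumber, Quantity), and the helpers `MakeTable` and `RowKey` are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical reduced schema; the real tables have more columns.
static DataTable MakeTable()
{
    var t = new DataTable();
    t.Columns.Add("Name", typeof(string));
    t.Columns.Add("SaleID", typeof(string));
    t.Columns.Add("LineNumber", typeof(long));
    t.Columns.Add("Quantity", typeof(double));
    return t;
}

// Value-tuple key over every column that takes part in the comparison,
// so the hash depends on all of them rather than on the first column alone.
static (string, string, long, double) RowKey(DataRow r) =>
    (r.Field<string>("Name"), r.Field<string>("SaleID"),
     r.Field<long>("LineNumber"), r.Field<double>("Quantity"));

var dtItems = MakeTable();
var query = MakeTable();
for (int i = 1; i < 100000; i++)
{
    // Same first column on every row, as in the dummy data above.
    dtItems.Rows.Add("Bennett St Dairy", "8b909a4b", (long)i, i * 4.0);
    query.Rows.Add("Bennett St Dairy", "8b909a4b", (long)i, i * 4.0);
}
dtItems.Rows.Add("Mad Hatter Wine Co", "e16cbbac", 2L, 12.0); // quantity differs
query.Rows.Add("Mad Hatter Wine Co", "e16cbbac", 2L, 1.0);

// Hash the keys of one table once, then probe with the rows of the other.
var queryKeys = new HashSet<(string, string, long, double)>(
    query.AsEnumerable().Select(r => RowKey(r)));
List<DataRow> onlyInItems = dtItems.AsEnumerable()
    .Where(r => !queryKeys.Contains(RowKey(r)))
    .ToList();

Console.WriteLine(onlyInItems.Count); // prints 1
```

Unlike Except, this filter does not collapse duplicate rows in `dtItems`. If set semantics matter, the keys of the kept rows can be tracked in a second HashSet.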
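Then I tried my janky method, which turned out to be very fast and seems fairly accurate. Because it only takes a fraction of a second, I can run it several times to get the new rows, the old rows that should not be deleted, and the old rows that should be deleted. The code looks like this: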
dtItems.Rows.Clear();
query.Rows.Clear();
Thread start = new Thread(timerAndUIupdate);
start.Start();
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "72421ee8-459b-46fb-bf5a-f51e80976e5a", "Pioneer 1kg (FT), RRP $42", "100115", 1, 25.0, "N");
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "8885a911-8d32-4dfe-93e5-2e453fd54db9", "Decaf Beans 250g FT", "1002302", 2, 2.0, "N");
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "e3aa4b15-b774-4f6a-ac21-77fa05a4332f", "P&R Cups 06oz (1000)", "30056", 3, 1.0, "N");
dtItems.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 4, 1.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "72421ee8-459b-46fb-bf5a-f51e80976e5a", "Pioneer 1kg (FT), RRP $42", "100115", 1, 25.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "8885a911-8d32-4dfe-93e5-2e453fd54db9", "Decaf Beans 250g FT", "1002302", 2, 2.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "e3aa4b15-b774-4f6a-ac21-77fa05a4332f", "P&R Cups 06oz (1000)", "30056", 3, 1.0, "N");
query.Rows.Add("4 Beans Cafe", "2af0f4bf-52ea-44fb-b1b3-36181fe7bfdf", "2019-07-01", "2019-07-01", "7fc4f98a-35af-4da3-afe3-f7cfcd922ea7", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 4, 1.0, "N");
for (int i = 1; i < 100000; i++)
{
    dtItems.Rows.Add("Bennett St Dairy", "ed0c8d30-6469-4e13-af5a-36d7357a4a70", "2019-07-01", "2019-07-01", "8b909a4b-a07b-4a06-bebc-6a3387433aaf", "c8cc1115-da02-42cf-b427-accc1b6d07e3", "Trailblazer 1Kg, RRP $44", "10011", i, (i * 4), "N");
    query.Rows.Add("Bennett St Dairy", "ed0c8d30-6469-4e13-af5a-36d7357a4a70", "2019-07-01", "2019-07-01", "8b909a4b-a07b-4a06-bebc-6a3387433aaf", "c8cc1115-da02-42cf-b427-accc1b6d07e3", "Trailblazer 1Kg, RRP $44", "10011", i, (i * 4), "N");
}
dtItems.Rows.Add("Air Coffee International Cafe Pty Ltd", "bb4fa724-9759-4c60-93fe-70fbdfd00417", "2019-07-01", "2019-07-01", "b972f020-3740-4ef2-941f-78b1a9edefa8", "0be54733-ac0e-43f9-8ea5-204c7cdb5f48", "Custom 1kg", "100116", 1, 4.0, "N");
dtItems.Rows.Add("Allure Cafe & Co.", "f76f383f-e9f4-45c9-bb93-81102629b9c3", "2019-07-01", "2019-07-01", "2ad0667f-2254-4df5-8b24-eb36736cabb0", "6edc584b-a8eb-4f0b-a449-dbcb76a40a24", "Porter St 1Kg, RRP $40", "100111", 1, 10.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "6edc584b-a8eb-4f0b-a449-dbcb76a40a24", "Porter St 1Kg, RRP $40", "100111", 1, 30.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 2, 12.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "401ce902-e158-4f21-85a5-3312c32457fc", "Lids 06/08/12oz (White) (1000)", "30062", 3, 7.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "9b80c825-6e9f-4f6b-9c77-f3378cc220e4", "4-Cup Cardboard Holders (300)", "41003", 4, 1.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "ea4c906e-fab1-4b15-8845-619f20e53c6a", "Organic Panela 1kg", "20014", 5, 2.0, "N");
dtItems.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "bb3e1c10-9e67-46d3-99b4-17df45dead90", "Chocolate Powder 1Kg, RRP $25", "20034", 6, 1.0, "N");
query.Rows.Add("Aussie Bites Cafe", "30389aca-9089-4b37-9a1e-5fbc3c2af485", "2019-07-01", "2019-07-01", "85df1af6-3d1e-4e04-8fe9-d90462a59d4c", "ea89ade4-c7ff-4d79-abcd-dcdbb8122562", "X Blend 1Kg, RRP $40", "100112", 1, 4.0, "N");
query.Rows.Add("Aussie Bites Cafe", "30389aca-9089-4b37-9a1e-5fbc3c2af485", "2019-07-01", "2019-07-01", "85df1af6-3d1e-4e04-8fe9-d90462a59d4c", "21fe57ad-08f9-4c8b-81d0-d7b88b291571", "webfreight", "webfreight", 2, 1.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "6edc584b-a8eb-4f0b-a449-dbcb76a40a24", "Porter St 1Kg, RRP $40", "100111", 1, 30.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "51e1a867-4079-4a3c-9ddc-e93d87d80b46", "P&R Cups 12oz (1000)", "30058", 2, 1.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "401ce902-e158-4f21-85a5-3312c32457fc", "Lids 06/08/12oz (White) (1000)", "30062", 3, 2.0, "N");
query.Rows.Add("Mad Hatter Wine Co", "49340e5f-c7ef-41d9-9f1b-200711e6e629", "2021-07-28", "2021-07-28", "e16cbbac-c319-45f3-ac53-89d979fbcdc1", "9b80c825-6e9f-4f6b-9c77-f3378cc220e4", "4-Cup Cardboard Holders (300)", "41003", 4, 1.0, "N");
Stopwatch pullTime = new();
pullTime.Start();
BeginInvoke(new MethodInvoker(delegate
{
    lblTimerAddRowEnd.Text = "Start Time,Except: " + pullTime.Elapsed.ToString("mm\\:ss\\.ff");
}));
var orderedDtItems = dtItems.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name"));
var orderedDtquery = query.AsEnumerable().OrderBy(row => Convert.ToDateTime(row.Field<String>("Date"))).ThenBy(row => row.Field<String>("Name"));
dtOnlyNewRows.Rows.Clear();
HashSet<String> orderedDtItemsHS = new();
HashSet<String> orderedDtqueryHS = new();
HashSet<String> orderedDtItemsHSRemains = new();
HashSet<String> orderedDtqueryHSRemains = new();
// Builds the composite comparison key for one row from the compared columns.
string RowKey(DataRow dr) =>
    dr["CardRecordID"].ToString() + "⌁" + dr["Date"].ToString() + "⌁" + dr["SaleID"].ToString() + "⌁" + dr["ItemID"].ToString()
    + "⌁" + dr["LineNumber"].ToString() + "⌁" + dr["Quantity"].ToString();
// Index every row key of the query table.
foreach (DataRow dr in orderedDtquery)
{
    string key = RowKey(dr);
    orderedDtqueryHSRemains.Add(key);
    orderedDtqueryHS.Add(key);
}
// Rows in dtItems whose key is not in query are new rows.
foreach (DataRow dr in orderedDtItems)
{
    string key = RowKey(dr);
    orderedDtItemsHSRemains.Add(key);
    orderedDtItemsHS.Add(key);
    bool added = orderedDtqueryHSRemains.Add(key);
    if (added == false)
    {
        // Key already present in query: the row exists on both sides.
        orderedDtqueryHSRemains.Remove(key);
    }
    else if (added == true)
    {
        // Key was unknown to query: import the row and undo the probe insert.
        dtOnlyNewRows.ImportRow(dr);
        orderedDtqueryHSRemains.Remove(key);
    }
}
// Rows in query whose key is not in dtItems are either leftovers or deletions.
foreach (DataRow dr in orderedDtquery)
{
    string key = RowKey(dr);
    bool added = orderedDtItemsHSRemains.Add(key);
    if (added == false)
    {
        orderedDtItemsHSRemains.Remove(key);
    }
    else if (added == true)
    {
        DateTime rowTime = Convert.ToDateTime(dr["date"].ToString());
        if (rowTime <= MonthCutOff)
        {
            // Older than the pulled date range: keep it.
            dtOnlyLeftoverRows.ImportRow(dr);
        }
        else
        {
            // Inside the pulled date range but missing from dtItems: delete it.
            dtOnlyDeleteRows.ImportRow(dr);
        }
        orderedDtItemsHSRemains.Remove(key);
    }
}
Debug.WriteLine(dtOnlyNewRows.Rows.Count.ToString());
BeginInvoke(new MethodInvoker(delegate
{
    labelControl1.Text = "End Time,Except: " + pullTime.Elapsed.ToString("mm\\:ss\\.ff");
}));
pullTime.Stop();
BeginInvoke(new MethodInvoker(delegate
{
    dgvRowsRemaing.DataSource = dtOnlyLeftoverRows;
    dgvResults.DataSource = dtOnlyNewRows;
    dgvDeleteRows.DataSource = dtOnlyDeleteRows;
    btnStart.Enabled = true;
}));
The end result looked like this (screenshot not preserved).
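Ignoring duplicate keys, the three loops of the hash set approach appear to boil down to two set differences plus a date split. The sketch below restates that idea on plain value tuples instead of concatenated strings, which avoids depending on a separator character or on how `ToString()` formats numbers and dates. It is an illustration under assumptions, not the original code: the four-field rows are hypothetical, `incoming` and `local` are assumed to stand for `dtItems` and `query` respectively, and `cutoff` stands in for `MonthCutOff`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows; the real key is built from more columns.
(string SaleID, int Line, double Qty, DateTime Date)[] incoming =
{
    ("A", 1, 25.0, new DateTime(2021, 7, 28)),
    ("A", 2, 12.0, new DateTime(2021, 7, 28)), // quantity changed
    ("B", 1, 4.0,  new DateTime(2021, 7, 28)), // new sale
};
(string SaleID, int Line, double Qty, DateTime Date)[] local =
{
    ("A", 1, 25.0, new DateTime(2021, 7, 28)),
    ("A", 2, 1.0,  new DateTime(2021, 7, 28)), // stale quantity
    ("C", 1, 4.0,  new DateTime(2019, 7, 1)),  // older than the cutoff
};
var cutoff = new DateTime(2021, 2, 1); // assumed stand-in for MonthCutOff

// Value tuples compare and hash by their elements, so they work as set members.
var incomingSet = incoming.ToHashSet();
var localSet = local.ToHashSet();

// incoming minus local: rows to insert.
var onlyNew = incoming.Where(r => !localSet.Contains(r)).ToList();

// local minus incoming: split by date into rows to keep and rows to delete.
var missing = local.Where(r => !incomingSet.Contains(r)).ToList();
var leftover = missing.Where(r => r.Date <= cutoff).ToList();
var toDelete = missing.Where(r => r.Date > cutoff).ToList();

Console.WriteLine($"{onlyNew.Count} new, {leftover.Count} leftover, {toDelete.Count} to delete");
// prints "2 new, 1 leftover, 1 to delete"
```

A changed row shows up twice here, once as new (the incoming version) and once as a deletion (the stale local version), which matches the delete-and-re-add behavior described in the question.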
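After all that explanation, here are my questions:
- What am I doing wrong in the other approaches, and can they be made faster?
- If my janky approach is not OK, how should I compare DataTables instead?
- Is a janky approach acceptable as long as it works and is fast?
- What problems might my janky approach have?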
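Edit: 2021-08-03 11:25 PM AEST (Australian Eastern Standard Time)
The suggested code is neater and faster. This is what it looks like when applied to my dummy data in Windows Forms (screenshot not preserved).
It is 3 times faster, with less messy and shorter code. This is exactly what I wanted, thank you.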
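【Comments】:
You never really stated clearly what "compare DataTables" means. I am sure I could work it out eventually by reading the code, but basic DataTable code tends to be a horrible mess of casts and string-based column access. Do you mean "perform a full outer join on the two DataTables, then find all the places where the equivalent (same column name?) values differ between the tables"?
So basically I want an Exclusive Left Join [insert that data] and an Exclusive Right Join [delete that data]. I have updated my original post.
Are you happy to do this by primary key only, or do you have to compare all the data to detect changes in matching rows (and decide which side wins)? For a left side of [ID:1,Name:Mark, ID:2,Name:Luke, ID:3,Name:John] and a right side of [ID:2,Name:Luke, ID:3,Name:Mary, ID:4,Name:Mark], what would the final state be? (Note that the name for ID 3 has changed.)
I think the PKey would be enough, but I do not want the intersecting data to be rewritten (for various reasons). There is also no real PKey, because the data is mined from MYOB in a collection model (which I have no control over), but a logical PKey exists as the combination of 2 columns. So I want to compare the data on the remaining columns to see whether an incoming row should be added, both for rows whose PKey does not exist in the local DB and for rows where the incoming row may already exist. If they share the same PKey, the right-side data is deleted, i.e. the left side wins, so in your example ID:3,Name:John would replace ID:3,Name:Mary.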
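【Answer 1】:
I would do this by indexing the DataTables with a pair of dictionaries. A DataTable can have a primary key defined and can perform fast lookups that use a dictionary internally, but working with DataTables is generally pretty ugly stuff, so there is no need to add more PK ugliness.
So we have some DataTable on the right side, which was downloaded from the DB, and you have determined that the columns "Foo" and "Bar" are the PK. Foo is a string and Bar is an int:
Dim rIndex = New Dictionary(Of ValueTuple(Of String, Integer), DataRow)
For Each r As DataRow In rightDt.Rows
    Dim key = (r.Field(Of String)("Foo"), r.Field(Of Integer)("Bar"))
    rIndex(key) = r
Next r
We have some file that has been read into the left DataTable. The file's columns happen to be called Wit (string) and Woo (int):
Dim lIndex = New Dictionary(Of ValueTuple(Of String, Integer), DataRow)
For Each r As DataRow In leftDt.Rows
    Dim key = (r.Field(Of String)("Wit"), r.Field(Of Integer)("Woo"))
    lIndex(key) = r
Next r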
Now, it would probably make life easier if we also stored the keys in a HashSet as we go; this represents the union of left and right:
Dim allKeys As New HashSet(Of ValueTuple(Of String, Integer))
Dim rIndex = New Dictionary(Of ValueTuple(Of String, Integer), DataRow)
For Each r As DataRow In rightDt.Rows
    Dim key = (r.Field(Of String)("Foo"), r.Field(Of Integer)("Bar"))
    rIndex(key) = r
    allKeys.Add(key)
Next r
Dim lIndex = New Dictionary(Of ValueTuple(Of String, Integer), DataRow)
For Each r As DataRow In leftDt.Rows
    Dim key = (r.Field(Of String)("Wit"), r.Field(Of Integer)("Woo"))
    lIndex(key) = r
    allKeys.Add(key)
Next r
All that remains is to enumerate allKeys, ask each dictionary whether it contains the key, and decide what to do:
For Each k In allKeys
    Dim inL = lIndex.ContainsKey(k)
    Dim inR = rIndex.ContainsKey(k)
    If inL AndAlso inR Then
        Dim updateRo = lIndex(k) 'update the db using this datarow
        ...
    ElseIf inL Then
        Dim insertRo = lIndex(k) 'insert this row to the db
        ...
    Else
        Dim deleteRo = rIndex(k) 'delete this row from the db
        ...
    End If
Next k
--
Ha, I just realized my brain was still in VB mode. Here is the C# version of the above:
var allKeys = new HashSet<(string, int)>();
var rIndex = new Dictionary<(string, int), DataRow>();
foreach(DataRow r in rightDt.Rows)
{
    var key = (r.Field<string>("Foo"), r.Field<int>("Bar"));
    rIndex[key] = r;
    allKeys.Add(key);
}
var lIndex = new Dictionary<(string, int), DataRow>();
foreach(DataRow r in leftDt.Rows)
{
    var key = (r.Field<string>("Wit"), r.Field<int>("Woo"));
    lIndex[key] = r;
    allKeys.Add(key);
}
foreach(var k in allKeys)
{
    var inL = lIndex.ContainsKey(k);
    var inR = rIndex.ContainsKey(k);
    if(inL && inR)
    {
        var updateRo = lIndex[k]; //update the db using this datarow
        ...
    }
    else if(inL)
    {
        var insertRo = lIndex[k]; //insert this row to the db
        ...
    }
    else
    {
        var deleteRo = rIndex[k]; //delete this row from the db
        ...
    }
}
You can see a working example at https://dotnetfiddle.net/3jfrPl
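The loop above treats every key found on both sides as an update. Since the comments mention that intersecting rows should not be rewritten unless something changed, one possible extension is to compare the non-key columns before deciding. The following is a minimal sketch of that idea, using the ID/Name sample from the comments. The single-column key, the `MakeTable` and `ValuesDiffer` helpers, and the result lists are assumptions made for illustration; a two-column key would use a value tuple as in the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical two-column schema taken from the example in the comments.
static DataTable MakeTable(params (int Id, string Name)[] rows)
{
    var t = new DataTable();
    t.Columns.Add("ID", typeof(int));
    t.Columns.Add("Name", typeof(string));
    foreach (var (id, name) in rows) t.Rows.Add(id, name);
    return t;
}

// True when a non-key column differs; only "Name" is compared here.
static bool ValuesDiffer(DataRow l, DataRow r) =>
    !object.Equals(l["Name"], r["Name"]);

var leftDt = MakeTable((1, "Mark"), (2, "Luke"), (3, "John"));  // incoming data
var rightDt = MakeTable((2, "Luke"), (3, "Mary"), (4, "Mark")); // local data

// ToDictionary throws on duplicate keys, unlike the indexer assignment above.
var lIndex = leftDt.AsEnumerable().ToDictionary(r => r.Field<int>("ID"));
var rIndex = rightDt.AsEnumerable().ToDictionary(r => r.Field<int>("ID"));

var toInsert = new List<int>();
var toUpdate = new List<int>();
var toDelete = new List<int>();

foreach (var pair in lIndex)
{
    if (!rIndex.TryGetValue(pair.Key, out var rRow))
        toInsert.Add(pair.Key);          // only on the left: insert
    else if (ValuesDiffer(pair.Value, rRow))
        toUpdate.Add(pair.Key);          // on both sides but changed: left wins
    // on both sides and identical: nothing to write
}
foreach (var key in rIndex.Keys)
{
    if (!lIndex.ContainsKey(key))
        toDelete.Add(key);               // only on the right: delete
}

Console.WriteLine($"insert: {string.Join(",", toInsert)}"); // insert: 1
Console.WriteLine($"update: {string.Join(",", toUpdate)}"); // update: 3
Console.WriteLine($"delete: {string.Join(",", toDelete)}"); // delete: 4
```

ID 2 is identical on both sides, so it is left alone, while ID 3 is rewritten with the left-side value (John replaces Mary), as described in the comments.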
【Comments】:
LOL, I had started coding it in VB until I read your sentence "Here is the C#" >.
Yes, this is correct, and both the performance and the code are neater.