Entity Framework 6: GroupBy followed by OrderBy and First() takes too long

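I really need help with this. After hours of searching I could not find a relevant answer.

MySQL, Entity Framework 6, and a database with several million records. A record looks like this: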

Index            int(11)     NOT NULL
TaskID           int(11)     NOT NULL
DeviceID         bigint(20)  NOT NULL
Comments         longtext    NULL
ExtendedResults  longtext    NULL
RunResult        int(11)     NOT NULL
JobResult        int(11)     NOT NULL
JobResultValue   double      NOT NULL
ReporterID       bigint(20)  NOT NULL
FieldID          bigint(20)  NOT NULL
TimeOfRun        datetime    NOT NULL

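I need to get all records for a specific TaskID, group them by DeviceID, and order each group by TimeOfRun, so that I get the latest record for each DeviceID within that TaskID.

Here is my code: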

List<JobsRecordHistory> newH = db.JobsRecordHistories.AsNoTracking().Where(x => x.TaskID == taskID).GroupBy(x => x.DeviceID).
                Select(x => x.OrderByDescending(y => y.TimeOfRun).FirstOrDefault()).ToList();

But this is the generated query:

{SELECT
`Apply1`.`Index`, 
`Apply1`.`TaskID`, 
`Apply1`.`DEVICEID1` AS `DeviceID`, 
`Apply1`.`RunResult`, 
`Apply1`.`JobResult`, 
`Apply1`.`JobResultValue`, 
`Apply1`.`ExtendedResults`, 
`Apply1`.`Comments`, 
`Apply1`.`ReporterID`, 
`Apply1`.`FieldID`, 
`Apply1`.`TimeOfRun`
FROM (SELECT
`Project2`.`p__linq__0`, 
`Project2`.`DeviceID`, 
(SELECT
`Project3`.`Index`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `Index`, 
(SELECT
`Project3`.`TaskID`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `TaskID`, 
(SELECT
`Project3`.`DeviceID`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `DEVICEID1`, 
(SELECT
`Project3`.`RunResult`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `RunResult`, 
(SELECT
`Project3`.`JobResult`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `JobResult`, 
(SELECT
`Project3`.`JobResultValue`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `JobResultValue`, 
(SELECT
`Project3`.`ExtendedResults`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `ExtendedResults`, 
(SELECT
`Project3`.`Comments`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `Comments`, 
(SELECT
`Project3`.`ReporterID`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `ReporterID`, 
(SELECT
`Project3`.`FieldID`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `FieldID`, 
(SELECT
`Project3`.`TimeOfRun`
FROM `JobsRecordHistories` AS `Project3`
 WHERE (`Project3`.`TaskID` = @p__linq__0) AND (`Project2`.`DeviceID` = `Project3`.`DeviceID`)
 ORDER BY 
`Project3`.`TimeOfRun` DESC LIMIT 1) AS `TimeOfRun`
FROM (SELECT
@p__linq__0 AS `p__linq__0`, 
`Distinct1`.`DeviceID`
FROM (SELECT DISTINCT 
`Extent1`.`DeviceID`
FROM `JobsRecordHistories` AS `Extent1`
 WHERE `Extent1`.`TaskID` = @p__linq__0) AS `Distinct1`) AS `Project2`) AS `Apply1`}

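This takes far too long. I admit my SQL is not strong, but if I insert ToList() after the Where clause I get the result much faster. That is still not the right approach, because the database then sends a lot of unneeded data to my application, and it is still slow: about 30 seconds for 40K records.

I also tried: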

Dictionary<long, DateTime> DeviceIDAndTime = db.JobsRecordHistories.AsNoTracking().Where(x => x.TaskID == taskID).GroupBy(x => x.DeviceID)
                .Select(g => new DeviceIDaAndTime { deviceID = g.Key, timeOfRun = g.Max(gi => gi.TimeOfRun) }).ToDictionary(x => x.deviceID, x => x.timeOfRun);

so that I could use the dictionary like this:

                List<JobsRecordHistory> newH = db.JobsRecordHistories.AsNoTracking().Where(x => DeviceIDAndTime.Keys.Contains(x.DeviceID) && x.TimeOfRun == DeviceIDAndTime[x.DeviceID]).ToList();

But I get this error:

Additional information: LINQ to Entities does not recognize the method 'System.DateTime get_Item(Int64)' method, and this method cannot be translated into a store expression.

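As I understand it, this makes sense: the provider cannot translate the dictionary indexer (get_Item) into SQL. When TimeOfRun is compared with a dictionary value, LINQ to Entities needs a single concrete value at the time the query is composed, not a per-row lookup into an in-memory collection.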
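One way around the untranslatable indexer is to keep the per-device maximum as a query and join it back to the records, instead of materializing a Dictionary first. The sketch below shows the shape of that join with LINQ to Objects so it can run without a database. The trimmed-down JobsRecordHistory class and the LatestByJoin helper are hypothetical stand-ins, and whether the MySQL provider for EF6 turns the same expression (with db.JobsRecordHistories as the source) into an efficient statement is an assumption that has not been verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for the JobsRecordHistory entity.
public class JobsRecordHistory
{
    public int Index { get; set; }
    public int TaskID { get; set; }
    public long DeviceID { get; set; }
    public DateTime TimeOfRun { get; set; }
}

public static class LatestByJoin
{
    // Returns, for one task, the row(s) with the greatest TimeOfRun per DeviceID.
    // 'source' is an in-memory sequence here; with EF6 it would be db.JobsRecordHistories.
    public static List<JobsRecordHistory> LatestPerDevice(
        IEnumerable<JobsRecordHistory> source, int taskID)
    {
        var forTask = source.Where(x => x.TaskID == taskID);

        // Step 1: (DeviceID, max TimeOfRun) per device, kept as a query rather than a Dictionary.
        var latestTimes = forTask
            .GroupBy(x => x.DeviceID)
            .Select(g => new { DeviceID = g.Key, TimeOfRun = g.Max(r => r.TimeOfRun) });

        // Step 2: join back on both columns, so no indexer call appears in the expression.
        return forTask
            .Join(latestTimes,
                  r => new { r.DeviceID, r.TimeOfRun },
                  l => new { l.DeviceID, l.TimeOfRun },
                  (r, l) => r)
            .ToList();
    }
}
```

Note that if two rows of the same device share the same maximum TimeOfRun, this join returns both of them, so callers that need exactly one row per device would have to break the tie themselves.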

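I find it strange that I could not find any related post and that nobody else seems to have run into this. I guess I am missing something.

Any help is appreciated, thanks.

Answer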

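I finally figured it out and improved the performance. I needed a query plus a subquery, and MAX rather than ORDER BY, because I do not care about the order of the results, only about the largest TimeOfRun. Things got simpler still once I noticed that a larger Index column (my auto-incrementing primary key) means newer data, so instead of MAX(TimeOfRun) I used MAX(Index), although I am fairly sure MAX(TimeOfRun) would work the same way.

Here is my LINQ: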

var historyQuery = db.JobsRecordHistories.AsNoTracking().Where(y => y.TaskID == taskID &&
                                    db.JobsRecordHistories.Where(x => x.TaskID == taskID).GroupBy(x => x.DeviceID).Select(g => g.Max(i => i.Index)).Contains<int>(y.Index));

This is the generated SQL:

{SELECT
`Extent1`.`Index`, 
`Extent1`.`TaskID`, 
`Extent1`.`DeviceID`, 
`Extent1`.`RunResult`, 
`Extent1`.`JobResult`, 
`Extent1`.`JobResultValue`, 
`Extent1`.`ExtendedResults`, 
`Extent1`.`Comments`, 
`Extent1`.`ReporterID`, 
`Extent1`.`FieldID`, 
`Extent1`.`TimeOfRun`
FROM `JobsRecordHistories` AS `Extent1`
 WHERE (`Extent1`.`TaskID` = @p__linq__0) AND (EXISTS(SELECT
1 AS `C1`
FROM (SELECT
`Extent2`.`DeviceID` AS `K1`, 
MAX(`Extent2`.`Index`) AS `A1`
FROM `JobsRecordHistories` AS `Extent2`
 WHERE `Extent2`.`TaskID` = @p__linq__1
 GROUP BY 
`Extent2`.`DeviceID`) AS `GroupBy1`
 WHERE `GroupBy1`.`A1` = `Extent1`.`Index`))}
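The MAX(Index) shortcut only returns the latest run when a higher Index always goes with a later TimeOfRun inside each task and device. The following sketch reproduces the same selection with LINQ to Objects and adds a small check for that assumption. The trimmed-down JobsRecordHistory class and both helper methods are hypothetical and exist only for illustration; nothing here touches a real database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for the JobsRecordHistory entity.
public class JobsRecordHistory
{
    public int Index { get; set; }
    public int TaskID { get; set; }
    public long DeviceID { get; set; }
    public DateTime TimeOfRun { get; set; }
}

public static class LatestByMaxIndex
{
    // In-memory equivalent of the query above: keep the rows of the task whose
    // Index is the highest Index of their device.
    public static List<JobsRecordHistory> LatestPerDevice(
        IEnumerable<JobsRecordHistory> source, int taskID)
    {
        var newestIndexes = new HashSet<int>(
            source.Where(x => x.TaskID == taskID)
                  .GroupBy(x => x.DeviceID)
                  .Select(g => g.Max(r => r.Index)));

        return source
            .Where(y => y.TaskID == taskID && newestIndexes.Contains(y.Index))
            .ToList();
    }

    // True when, in every (TaskID, DeviceID) group, the row with the highest
    // Index also carries the latest TimeOfRun, i.e. the shortcut is safe.
    public static bool IndexOrderMatchesTimeOrder(IEnumerable<JobsRecordHistory> source)
    {
        return source
            .GroupBy(x => new { x.TaskID, x.DeviceID })
            .All(g => g.OrderByDescending(r => r.Index).First().TimeOfRun
                      == g.Max(r => r.TimeOfRun));
    }
}
```

If IndexOrderMatchesTimeOrder returns false on a sample of the data (for example because old runs were imported later), grouping on MAX(TimeOfRun) is the safer choice.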

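I hope this helps someone, because I spent a day and a half googling, reading SQL queries and LINQ, debugging, and optimizing.

Another answer

Here is a version in query syntax rather than the lambda-based approach. I have not tested it locally, but you may see improved SQL generation. At the very least, it might point you in the right direction.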

using System;
using System.Data.Entity;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace EF.CodeFirst
{
    [TestClass]
    public class UnitTest1
    {
        [TestMethod]
        public void TestMethod1()
        {
            using (var db = new TestDbContext())
            {
                var taskId = 1;
                var query = from job in db.JobRecordHistories
                    where job.TaskId == taskId
                    orderby job.TimeOfRun descending
                    group job by job.DeviceId
                    into deviceGroup
                    select deviceGroup;

                foreach (var deviceGroup in query)
                {
                    foreach (var jobRecordHistory in deviceGroup)
                    {
                        Console.WriteLine("DeviceId '{0}', TaskId '{1}', Runtime '{2}'", jobRecordHistory.DeviceId,
                            jobRecordHistory.TaskId, jobRecordHistory.TimeOfRun);
                    }
                }
            }
        }
    }

    public class TestDbContext : DbContext
    {
        public DbSet<JobRecordHistory> JobRecordHistories { get; set; }
    }

    public class JobRecordHistory
    {
        public int Id { get; set; }
        public int TaskId { get; set; }
        public int DeviceId { get; set; }
        public DateTime TimeOfRun { get; set; }
    }
}
