Linq queries in Entity Framework 4. Horrible performance

Posted: 2011-04-15 09:30:44

Question:

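In my project I use Entity Framework 4 to work with data. I ran into horrible performance problems with a simple query. When I looked in the profiler at the SQL query generated by EF4, I was shocked.

I have some tables in my entity data model:

It looks simple enough. I'm trying to select all product items from a specified category, together with all related navigation properties.

I wrote this LINQ query: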

ObjectSet<ProductItem> objectSet = ...; 
int categoryId = ...; 

var res = from pi in objectSet.Include("Product").Include("Inventory").Include("Inventory.Storage")
          where pi.Product.CategoryId == categoryId
          select pi;

EF generated this SQL query:

SELECT   [Project1].[pintId1]          AS [pintId], 
[Project1].[pintId]           AS [pintId1], 
[Project1].[intProductId]     AS [intProductId], 
[Project1].[nvcSupplier]      AS [nvcSupplier], 
[Project1].[ nvcArticle]      AS [ nvcArticle], 
[Project1].[nvcBarcode]       AS [nvcBarcode], 
[Project1].[bIsActive]        AS [bIsActive], 
[Project1].[dtDeleted]        AS [dtDeleted], 
[Project1].[pintId2]          AS [pintId2], 
[Project1].[nvcName]          AS [nvcName], 
[Project1].[intCategoryId]    AS [intCategoryId], 
[Project1].[ncProductType]    AS [ncProductType], 
[Project1].[C1]               AS [C1], 
[Project1].[pintId3]          AS [pintId3], 
[Project1].[intProductItemId] AS [intProductItemId], 
[Project1].[intStorageId]     AS [intStorageId], 
[Project1].[dAmount]          AS [dAmount], 
[Project1].[mPrice]           AS [mPrice], 
[Project1].[dtModified]       AS [dtModified], 
[Project1].[pintId4]          AS [pintId4], 
[Project1].[nvcName1]         AS [nvcName1], 
[Project1].[bIsDefault]       AS [bIsDefault] 
FROM     (SELECT [Extent1].[pintId]         AS [pintId], 
[Extent1].[intProductId]   AS [intProductId], 
[Extent1].[nvcSupplier]    AS [nvcSupplier], 
[Extent1].[ nvcArticle]    AS [ nvcArticle], 
[Extent1].[nvcBarcode]     AS [nvcBarcode], 
[Extent1].[bIsActive]      AS [bIsActive], 
[Extent1].[dtDeleted]      AS [dtDeleted], 
[Extent2].[pintId]         AS [pintId1], 
[Extent3].[pintId]         AS [pintId2], 
[Extent3].[nvcName]        AS [nvcName], 
[Extent3].[intCategoryId]  AS [intCategoryId], 
[Extent3].[ncProductType]  AS [ncProductType], 
[Join3].[pintId1]          AS [pintId3], 
[Join3].[intProductItemId] AS [intProductItemId], 
[Join3].[intStorageId]     AS [intStorageId], 
[Join3].[dAmount]          AS [dAmount], 
[Join3].[mPrice]           AS [mPrice], 
[Join3].[dtModified]       AS [dtModified], 
[Join3].[pintId2]          AS [pintId4], 
[Join3].[nvcName]          AS [nvcName1], 
[Join3].[bIsDefault]       AS [bIsDefault], 
CASE 
WHEN ([Join3].[pintId1] IS NULL) THEN CAST(NULL AS int) 
ELSE 1 
END AS [C1] 
FROM   [ProductItem] AS [Extent1] 
INNER JOIN [Product] AS [Extent2] 
ON [Extent1].[intProductId] = [Extent2].[pintId] 
LEFT OUTER JOIN [Product] AS [Extent3] 
ON [Extent1].[intProductId] = [Extent3].[pintId] 
LEFT OUTER JOIN (SELECT [Extent4].[pintId]           AS [pintId1], 
[Extent4].[intProductItemId] AS [intProductItemId], 
[Extent4].[intStorageId]     AS [intStorageId], 
[Extent4].[dAmount]          AS [dAmount], 
[Extent4].[mPrice]           AS [mPrice], 
[Extent4].[dtModified]       AS [dtModified], 
[Extent5].[pintId]           AS [pintId2], 
[Extent5].[nvcName]          AS [nvcName], 
[Extent5].[bIsDefault]       AS [bIsDefault] 
FROM   [Inventory] AS [Extent4] 
INNER JOIN [Storage] AS [Extent5] 
ON [Extent4].[intStorageId] = [Extent5].[pintId]) AS [Join3] 
ON [Extent1].[pintId] = [Join3].[intProductItemId] 
WHERE  [Extent2].[intCategoryId] = 8 /* @p__linq__0 */) AS [Project1] 
ORDER BY [Project1].[pintId1] ASC, 
[Project1].[pintId] ASC, 
[Project1].[pintId2] ASC, 
[Project1].[C1] ASC

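With 7000 records in the database and ~1000 records in the specified category, this query takes about 10 seconds to execute. That is no surprise when you look at this: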

FROM [ProductItem] AS [Extent1]
INNER JOIN [Product] AS [Extent2]
ON [Extent1].[intProductId] = [Extent2].[pintId]
LEFT OUTER JOIN [Product] AS [Extent3]
ON [Extent1].[intProductId] = [Extent3].[pintId]
***LEFT OUTER JOIN (SELECT ....***

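A nested select inside a join... that's horrible. I tried changing the LINQ query, but the generated SQL query stays the same.

A solution that uses stored procedures is not acceptable for me, because I'm using a SQL Compact database.

Comments:

Your English isn't bad :) The question is good too. +1
You can use imgur.com to share images.
What are the Includes for? Why not just from pi in objectSet where pi.Product.CategoryId == categoryId select pi
Would performance be better with a hand-written SQL query?
+1 for comparing against hand-written SQL. It's hard to know how SQL Compact performs in general (I certainly don't know).

Answer 1:

You are calling Include("Product").Include("Inventory").Include("Inventory.Storage") and wondering why so many records are fetched and why the SQL query is so large? Please make sure you understand what the Include method does. If you want a simpler query, use the following: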

var res =
    from pi in objectSet
    where pi.Product.CategoryId == categoryId 
    select pi;

Note, however, that this may lazy-load Products, Inventories and Storages, which can cause more queries to be sent as you iterate over those child collections.
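To make the lazy-loading cost concrete, here is a minimal sketch that does not touch Entity Framework at all. FakeDatabase and LazyLoadingDemo are made-up stand-ins (not EF types) that only count how many "queries" are issued when each item's child collection is read on demand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database: it only counts the queries it receives.
public class FakeDatabase
{
    public int QueryCount { get; private set; }

    // Made-up data: product item id -> names of the storages holding inventory.
    private readonly Dictionary<int, List<string>> inventoriesByItem =
        new Dictionary<int, List<string>>
        {
            { 1, new List<string> { "Main storage", "Backup storage" } },
            { 2, new List<string> { "Main storage" } },
            { 3, new List<string>() }
        };

    // One "query" that returns the ids of all product items.
    public List<int> LoadProductItemIds()
    {
        QueryCount++;
        return inventoriesByItem.Keys.ToList();
    }

    // One "query" per call, the way lazy loading fetches a child collection
    // the first time it is accessed.
    public List<string> LoadInventories(int productItemId)
    {
        QueryCount++;
        return inventoriesByItem[productItemId];
    }
}

public static class LazyLoadingDemo
{
    // Returns the number of queries needed to read every item's inventories lazily.
    public static int CountQueries(FakeDatabase db)
    {
        foreach (int id in db.LoadProductItemIds())
        {
            // Touching the child collection costs one extra query per item.
            db.LoadInventories(id);
        }
        return db.QueryCount;
    }
}
```

With the three items above, LazyLoadingDemo.CountQueries(new FakeDatabase()) returns 4: one query for the items plus one per item. With ~1000 items in a category the same pattern would mean roughly a thousand small round trips, so dropping the Includes trades one large query for many small ones.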

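Comments:

+1 Good point: with Product:ProductItem (1:*) and ProductItem:Inventory (1:*), a single product will load a lot of extra (possibly unneeded) data... no wonder it's slow...

Answer 2:

I think the problem is the Inventory collection on the Storage element. Your query restricts the selected Product, ProductItem and Inventory items to those of the specified CategoryId. However, in order to populate the Inventory collection of the Storage element, the query must also return all Inventory rows that use the same StorageId (and then all the corresponding ProductItem and Product rows for those extra Inventory records).

I would start by removing the Inventory collection from the Storage element, or by removing the corresponding Include.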
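To get a rough feel for what removing an Include can save, the sketch below counts the rows that a flattened LEFT OUTER JOIN from product items to inventories returns. It uses LINQ to Objects over made-up in-memory ids, not Entity Framework; JoinRowCountDemo and its parameter names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinRowCountDemo
{
    // itemIds: primary keys of the selected product items.
    // inventoryItemIds: for each inventory row, the id of the product item it belongs to.
    // Returns the number of rows a flattened left outer join would produce.
    public static int CountFlattenedRows(int[] itemIds, int[] inventoryItemIds)
    {
        var rows =
            from item in itemIds
            join inv in inventoryItemIds on item equals inv into matches
            // DefaultIfEmpty keeps items without inventories as a single row
            // whose child side is null, like the NULL columns in SQL.
            from m in matches.Select(x => (int?)x).DefaultIfEmpty()
            select new { Item = item, Inventory = m };

        return rows.Count();
    }
}
```

For example, CountFlattenedRows(new[] { 1, 2, 3 }, new[] { 1, 1, 1, 2 }) returns 5: item 1 appears three times, item 2 once, and item 3 once with an empty child side. Without the join there would be only 3 rows, and every repeated row also repeats all the parent columns.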
