Optimizing SQL query to return Record with tags
Posted: 2015-03-20 00:11:19
Question: I am looking for help optimizing a query I wrote for SQL Server. Given this database schema:
The TradeLead object. A record in this table is a small article.
CREATE TABLE [dbo].[TradeLeads]
(
[TradeLeadID] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
Title nvarchar(250),
Body nvarchar(max),
CreateDate datetime,
EditDate datetime,
CreateUser nvarchar(250),
EditUser nvarchar(250),
[Views] INT NOT NULL DEFAULT(0)
)
This is the cross-reference table that links TradeLead articles to Industry records.
CREATE TABLE [dbo].[TradeLeads_Industries]
(
[ID] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
[TradeLeadID] INT NOT NULL,
[IndustryID] INT NOT NULL
)
Finally, the schema for the Industry object. These are essentially just tags, but users cannot enter them; the database will hold a fixed set.
CREATE TABLE [dbo].[Industries]
(
IndustryID INT NOT NULL PRIMARY KEY identity(1,1),
Name nvarchar(200)
)
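The procedure I am writing searches for specific TradeLead records. Users will be able to search for keywords in the TradeLead title, filter by a date range, and search for TradeLeads tagged with specific industries.
The database will most likely contain around 1,000,000 TradeLead articles and about 30 industry tags.
Here is the query I came up with: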
DECLARE @Title nvarchar(50);
SET @Title = 'Testing';
-- User defined table type containing a list of IndustryIDs. Would prob have around 5 selections max.
DECLARE @Selectedindustryids IndustryIdentifierTable_UDT;
DECLARE @Start DATETIME;
SET @Start = NULL;
DECLARE @End DATETIME;
SET @End = NULL;
SELECT *
FROM (
    -- Subquery to return all the tradeleads that match a user's criteria.
    -- These fields can be null.
    SELECT TradeLeadID,
           Title,
           Body,
           CreateDate,
           CreateUser,
           [Views]
    FROM TradeLeads
    WHERE (@Title IS NULL OR Title LIKE '%' + @Title + '%')
      AND (@Start IS NULL OR CreateDate >= @Start)
      AND (@End IS NULL OR CreateDate <= @End)
) AS FTL
INNER JOIN (
    -- Subquery to return the TradeLeadID for each TradeLead record with related IndustryIDs
    SELECT TI.TradeLeadID
    FROM TradeLeads_Industries TI
    -- Left join the selected IndustryIDs to the Cross reference table to get the TradeLeadIDs that are associated with a specific industry.
    LEFT JOIN @Selectedindustryids SIDS
        ON SIDS.IndustryID = TI.IndustryID
    -- It's possible the user has not selected any IndustryIDs to search for.
    WHERE (NOT EXISTS (SELECT 1 FROM @Selectedindustryids) OR SIDS.IndustryID IS NOT NULL)
    -- Group by to reduce the amount of records.
    GROUP BY TI.TradeLeadID
) AS SelectedIndustries
    ON SelectedIndustries.TradeLeadID = FTL.TradeLeadID;
With about 600,000 TradeLead records and with an average of 4 IndustryIDs attached to each one, the query takes around 8 seconds to finish on a local machine. I would like to get it as fast as possible. Any tips or insight would be appreciated.
Answer 1: A few points here.
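Constructs like (@Start IS NULL OR CreateDate >= @Start) can lead to a problem known as parameter sniffing: one cached plan has to serve every combination of filters, so it is rarely a good fit for the filters actually supplied. Two ways to work around it:
- Add OPTION (RECOMPILE) to the end of the query.
- Use dynamic SQL to include only the criteria the user asked for.
I would favour the second approach for this data.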
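As a rough illustration of the dynamic SQL approach, the sketch below appends a predicate only for each filter that was supplied and runs the result through sp_executesql, so the values stay parameterized. It assumes the tables from the question exist and that IndustryIdentifierTable_UDT has an IndustryID column; the @sql variable is a name made up for this example.

```sql
-- Sketch only: assumes the question's tables and the IndustryIdentifierTable_UDT
-- type (with an IndustryID column) already exist. @sql is a hypothetical name.
DECLARE @Title nvarchar(50) = N'Testing';
DECLARE @Start datetime = NULL;
DECLARE @End datetime = NULL;
DECLARE @Selectedindustryids IndustryIdentifierTable_UDT;

DECLARE @sql nvarchar(max) = N'
SELECT t.TradeLeadID, t.Title, t.Body, t.CreateDate, t.CreateUser, t.[Views]
FROM dbo.TradeLeads AS t
WHERE 1 = 1';

-- Append a predicate only for the filters the user actually supplied.
IF @Title IS NOT NULL
    SET @sql += N' AND t.Title LIKE N''%'' + @Title + N''%''';
IF @Start IS NOT NULL
    SET @sql += N' AND t.CreateDate >= @Start';
IF @End IS NOT NULL
    SET @sql += N' AND t.CreateDate <= @End';
IF EXISTS (SELECT 1 FROM @Selectedindustryids)
    SET @sql += N'
  AND EXISTS (SELECT 1
              FROM dbo.TradeLeads_Industries AS ti
              INNER JOIN @Selectedindustryids AS sids
                  ON sids.IndustryID = ti.IndustryID
              WHERE ti.TradeLeadID = t.TradeLeadID)';

-- Values are passed as parameters, never concatenated into the statement text.
-- A table-valued parameter must be declared READONLY.
EXEC sys.sp_executesql
    @sql,
    N'@Title nvarchar(50), @Start datetime, @End datetime, @Selectedindustryids IndustryIdentifierTable_UDT READONLY',
    @Title = @Title,
    @Start = @Start,
    @End = @End,
    @Selectedindustryids = @Selectedindustryids;
```

Each distinct combination of filters produces its own statement text, so each can get its own cached plan. Passing parameters that a given statement does not reference is harmless, which keeps the EXEC call the same for every combination.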
Next, the query can be rewritten with EXISTS to make it more efficient (assuming the user has supplied industry IDs):
select
TradeLeadID,
Title,
Body,
CreateDate,
CreateUser,
[Views]
from
dbo.TradeLeads t
where
Title LIKE '%' + @Title + '%' and
CreateDate >= @Start and
CreateDate <= @End and
exists (
select
'x'
from
dbo.TradeLeads_Industries ti
inner join
@Selectedindustryids sids
on ti.IndustryID = sids.IndustryID
where
t.TradeLeadID = ti.TradeLeadID
);
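If dynamic SQL is not an option, the first workaround can be combined with the EXISTS rewrite while keeping every filter optional. The following is a sketch under the same assumptions as the question (existing tables, an IndustryIdentifierTable_UDT type with an IndustryID column), not a tested drop-in replacement.

```sql
-- Sketch only: optional filters kept in a single static statement,
-- recompiled on every execution so the plan fits the supplied values.
DECLARE @Title nvarchar(50) = N'Testing';
DECLARE @Start datetime = NULL;
DECLARE @End datetime = NULL;
DECLARE @Selectedindustryids IndustryIdentifierTable_UDT;

SELECT t.TradeLeadID, t.Title, t.Body, t.CreateDate, t.CreateUser, t.[Views]
FROM dbo.TradeLeads AS t
WHERE (@Title IS NULL OR t.Title LIKE N'%' + @Title + N'%')
  AND (@Start IS NULL OR t.CreateDate >= @Start)
  AND (@End IS NULL OR t.CreateDate <= @End)
  AND (
        -- No industries selected: do not filter on industry at all.
        NOT EXISTS (SELECT 1 FROM @Selectedindustryids)
        OR EXISTS (SELECT 1
                   FROM dbo.TradeLeads_Industries AS ti
                   INNER JOIN @Selectedindustryids AS sids
                       ON sids.IndustryID = ti.IndustryID
                   WHERE ti.TradeLeadID = t.TradeLeadID)
      )
OPTION (RECOMPILE);
```

One behavioural difference from the query in the question: when no industries are selected, this version also returns TradeLeads that have no industry rows at all, whereas the original INNER JOIN drops them. The recompile cost is paid on every call, which is the trade-off against the dynamic SQL approach.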
Finally, you will want at least one index on the dbo.TradeLeads_Industries table. The following are candidates:
(TradeLeadID, IndustryID)
(IndustryID, TradeLeadID)
Testing will tell you whether one or both are useful.
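The two candidates could be created and compared roughly as follows. The index names are invented for this example; only the column lists come from the answer.

```sql
-- Hypothetical index names; the column orders are the two candidates listed above.
CREATE NONCLUSTERED INDEX IX_TradeLeads_Industries_TradeLeadID_IndustryID
    ON dbo.TradeLeads_Industries (TradeLeadID, IndustryID);

CREATE NONCLUSTERED INDEX IX_TradeLeads_Industries_IndustryID_TradeLeadID
    ON dbo.TradeLeads_Industries (IndustryID, TradeLeadID);

-- Report logical reads and elapsed time for whatever runs next in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the search query here, then check the actual execution plan to see
-- which index (if either) is used, and drop the one that is not.
```

Comparing the logical reads on TradeLeads_Industries before and after each index is usually a more stable measure than wall-clock time on a local machine.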