Optimizing SQL query to return Record with tags

Asked: 2015-03-20 00:11:19

Question:

I am looking for help optimizing a query I have written for SQL Server. Given this database schema:

The TradeLead object. A record in this table is a small article.

CREATE TABLE [dbo].[TradeLeads]
(
    [TradeLeadID] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    Title nvarchar(250),
    Body nvarchar(max),
    CreateDate datetime,
    EditDate datetime,
    CreateUser nvarchar(250),
    EditUser nvarchar(250), 
    [Views] INT NOT NULL DEFAULT(0)

)

This is the cross-reference table that links TradeLead articles to Industry records.

CREATE TABLE [dbo].[TradeLeads_Industries]
(
    [ID] INT NOT NULL PRIMARY KEY IDENTITY(1,1), 
    [TradeLeadID] INT NOT NULL, 
    [IndustryID] INT NOT NULL
)

Finally, the schema for the Industry object. These are essentially just tags, but users cannot enter them; the database will hold a fixed set.

CREATE TABLE [dbo].[Industries]
(
    IndustryID INT NOT NULL PRIMARY KEY identity(1,1),
    Name nvarchar(200)
)

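The procedure I am writing searches for specific TradeLead records. Users will be able to search for keywords in the title of a TradeLead, search by date range, and search for TradeLeads that carry specific Industry tags.

The database will most likely hold around 1,000,000 TradeLead articles and about 30 industry tags.

This is the query I have come up with: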

DECLARE @Title nvarchar(50);
SET @Title = 'Testing';
-- User defined table type containing a list of IndustryIDs. Would prob have around 5 selections max.
DECLARE @Selectedindustryids IndustryIdentifierTable_UDT;
DECLARE @Start DATETIME;
SET @Start = NULL;
DECLARE @End DATETIME;
SET @End = NULL;


SELECT *
FROM (
    -- Subquery to return all the TradeLeads that match a user's criteria.
    -- Any of these filter values can be NULL.
    SELECT TradeLeadID,
           Title,
           Body,
           CreateDate,
           CreateUser,
           [Views]
    FROM TradeLeads
    WHERE (@Title IS NULL OR Title LIKE '%' + @Title + '%')
      AND (@Start IS NULL OR CreateDate >= @Start)
      AND (@End IS NULL OR CreateDate <= @End)
) AS FTL
INNER JOIN (
    -- Subquery to return the TradeLeadID for each TradeLead record with related IndustryIDs.
    SELECT TI.TradeLeadID
    FROM TradeLeads_Industries TI
    -- Left join the selected IndustryIDs to the cross-reference table to get the
    -- TradeLeadIDs that are associated with a specific industry.
    LEFT JOIN @Selectedindustryids SIDS
        ON SIDS.IndustryID = TI.IndustryID
    -- It's possible the user has not selected any IndustryIDs to search for.
    WHERE (NOT EXISTS (SELECT 1 FROM @Selectedindustryids) OR SIDS.IndustryID IS NOT NULL)
    -- Group by to reduce the number of records.
    GROUP BY TI.TradeLeadID
) AS SelectedIndustries
    ON SelectedIndustries.TradeLeadID = FTL.TradeLeadID



With about 600,000 TradeLead records and an average of 4 IndustryIDs attached to each one, the query takes around 8 seconds to finish on a local machine. I would like to make it as fast as possible. Any tips or insight would be appreciated.

Answer 1:

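A few points here.

Using constructs like (@Start IS NULL OR CreateDate >= @Start) can cause a problem called parameter sniffing: the plan compiled for one combination of parameter values is cached and then reused for combinations it suits poorly. Two ways of working around it are:

    Add Option (Recompile) to the end of the query.
    Use dynamic SQL to include only the criteria that the user has asked for.

For this data, I would favour the second method.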
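As a rough illustration of the dynamic SQL approach, the sketch below appends a predicate only for each filter that was actually supplied and then runs the statement through sys.sp_executesql with real parameters. The variable names come from the question; the @sql variable and the `WHERE 1 = 1` scaffold are assumptions made for the example, and the industry filter is left out to keep it short.

```sql
-- Sketch only: assumes dbo.TradeLeads exists as defined in the question.
DECLARE @Title nvarchar(50) = N'Testing';
DECLARE @Start datetime = NULL;
DECLARE @End   datetime = NULL;

-- Base statement; "WHERE 1 = 1" lets every optional filter start with AND.
DECLARE @sql nvarchar(max) = N'
SELECT TradeLeadID, Title, Body, CreateDate, CreateUser, [Views]
FROM dbo.TradeLeads
WHERE 1 = 1';

-- Append a predicate only for the filters the user supplied.
IF @Title IS NOT NULL
    SET @sql += N' AND Title LIKE N''%'' + @Title + N''%''';
IF @Start IS NOT NULL
    SET @sql += N' AND CreateDate >= @Start';
IF @End IS NOT NULL
    SET @sql += N' AND CreateDate <= @End';

-- Values are passed as parameters, never concatenated into the string.
EXEC sys.sp_executesql
    @sql,
    N'@Title nvarchar(50), @Start datetime, @End datetime',
    @Title = @Title, @Start = @Start, @End = @End;
```

Because each combination of filters produces a different statement text, each one gets its own cached plan, which is the point of the technique. The first option is simpler to try: keep the original query unchanged and append OPTION (RECOMPILE) as its last line, at the cost of a compile on every execution.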

Next, the query can be rewritten with exists to make it more efficient (assuming the user has entered industry IDs):

select
    TradeLeadID, 
    Title, 
    Body, 
    CreateDate, 
    CreateUser, 
    [Views]
from
    dbo.TradeLeads t
where
    Title LIKE '%' + @Title + '%' and
    CreateDate >= @Start and
    CreateDate <= @End and
    exists (
        select
            'x'
        from
            dbo.TradeLeads_Industries ti
                inner join
            @Selectedindustryids sids
                on ti.IndustryID = sids.IndustryID
        where
            t.TradeLeadID = ti.TradeLeadID
    );

Finally, you will want at least one index on the dbo.TradeLeads_Industries table. The following are candidates:

(TradeLeadID, IndustryID)
(IndustryID, TradeLeadID)

Testing will tell you whether one or both are useful.
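For reference, the two candidates could be created along these lines. The index names are made up for the example, and whether either index earns its keep depends on the plans you actually get.

```sql
-- Sketch only: index names are hypothetical.
-- Candidate 1: supports lookups by TradeLeadID, such as the correlated EXISTS.
CREATE NONCLUSTERED INDEX IX_TradeLeads_Industries_TradeLeadID_IndustryID
    ON dbo.TradeLeads_Industries (TradeLeadID, IndustryID);

-- Candidate 2: supports starting from the selected IndustryIDs.
CREATE NONCLUSTERED INDEX IX_TradeLeads_Industries_IndustryID_TradeLeadID
    ON dbo.TradeLeads_Industries (IndustryID, TradeLeadID);

-- Report logical reads and elapsed time so runs can be compared.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
```

Run the search query with each index in place (and with both), then compare the reported reads and timings alongside the actual execution plan.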
