MySQL - Slow Query when adding multiple derived tables - Optimization

Posted: 2018-07-10 15:15:42

Question:

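In my query, the two derived tables at the bottom make it extremely slow. As written, the query takes about 45-55 seconds to run. When I remove just one of the derived tables (it does not matter which one), the query time drops to 0.1-0.3 seconds. My question: is there a problem with having multiple derived tables? Is there a better way to do this? My indexes all seem to be correct, and I will also include the EXPLAIN for this query.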

select t.name as team, u.name as "REP NAME", 
  count(distinct activity.id) as "TOTAL VISITS", 
  count(distinct activity.account_id) as "UNIQUE VISITS",
  count(distinct placement.id) as "COMMITMENTS ADDED",

  CASE WHEN 
    count(distinct activity.account_id) = 0 THEN (count(distinct 
    placement.id) / 1) 
    else (cast(count(distinct placement.id) as decimal(10,2)) / 
    cast(count(distinct activity.account_id) as decimal(10,2)))
  end as "UNIQUE VISIT TO COMMITMENT %",

  case when o.mode='basic' then count(distinct placement.id) else 
    count(distinct(case when placement.commitmentstatus='fullfilled' 
    then placement.id else 0 end)) 
  end as "COMMITMENTS FULFILLED",

  case when o.mode='basic' then 1 else 
    (CASE WHEN 
     count(distinct placement.id) = 0 THEN (count(distinct(case when 
     placement.commitmentstatus='fullfilled' then placement.id else 0 
    end)) / 1) 
    else (cast(count(distinct(case when 
    placement.commitmentstatus='fullfilled' then placement.id else 0 
    end)) as decimal(10,2)) / cast(count(distinct placement.id) as 
    decimal(10,2)))
    end) end as "COMMITMENT TO FULFILLMENT %"

from lpmysqldb.users u
left join lpmysqldb.teams t on t.team_id=u.team_id
left join lpmysqldb.organizations o on o.id=t.org_id
left join (select * from lpmysqldb.activity where 
  org_id='555b918ae4b07b6ac5050852' and completed_at>='2018-05-01' and 
  completed_at<='2018-06-01' and tag='visit' and accountname is not 
  null and (status='active' or status='true' or status='1')) as 
  activity on activity.user_id=u.id
left join (select * from lpmysqldb.placements where 
  orgid='555b918ae4b07b6ac5050852' and placementdate>='2018-05-01' and 
  placementdate<='2018-06-01' and (status IN ('1','active','true') or 
  status is null)) as placement on placement.userid=u.id

where u.org_id='555b918ae4b07b6ac5050852' 
  and (u.status='active' or u.status='true' or u.status='1')
  and istestuser!='1'
group by u.org_id, t.name, u.id, u.name, o.mode
order by count(distinct activity.id) desc

Thanks for your help!

Edit: I changed the two bottom joins from joining subqueries to joining the tables directly. It still produces the same result.

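Comments:

To be able to optimize your query, I have to understand it. I don't. Or should I say: I can't? I don't know what your tables look like, why they are designed that way, or what you are trying to accomplish with the query. If you ask me, my first thought is: can I get rid of those two subqueries? There is a chance that one of them has to be re-evaluated for every row of the main query's table.

The basis of it is to get a list of users, then loop through and get all of their record data in two specific tables: activity and placements. Then there are a couple of extra columns that combine the two. Does that help?

Can't you remove the subqueries and make them normal joins? I think that is possible.

I have tried that, with the same results. I will try again and post it below the EXPLAIN. Are subqueries usually more efficient than derived tables?

Sorry? No, you get derived tables (assembled at query execution time) because you are using subqueries. It is better to join against existing tables in the database, because they already exist. The result may be the same, I grant that, but a join against a database table is almost always better than a join against a subquery (there are always exceptions).

Answer 1: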

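The query below is a slightly restructured version of yours. It can probably be simplified further: the last two subqueries pre-aggregate your counts and distinct counts per user, so the outer query can use those column names directly instead of repeating every count(distinct ...) expression.

I also tried to simplify the division by multiplying the given count by 1.00 to force decimal-based precision.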
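As a quick aside on the `* 1.00` trick, before the restructured query itself: in MySQL the `/` operator already returns a decimal for integer operands (integer division is the separate `DIV` operator), so the multiplication is harmless but probably not required there. The habit is more useful on databases where integer `/` integer truncates, such as SQL Server or PostgreSQL. The statements below are a minimal sketch with made-up literals and column aliases, not part of the original answer.

```sql
-- Made-up literals; the aliases are illustrative only.
select
    3 / 2         as plain_division,    -- 1.5000 with the default div_precision_increment of 4
    3 * 1.00 / 2  as forced_decimal,    -- also a decimal result
    3 div 2       as integer_division;  -- 1, fractional part discarded

-- For a non-negative count, GREATEST(cnt, 1) gives the same divisor as
-- "CASE WHEN cnt = 0 THEN 1 ELSE cnt END" (here cnt is the literal 0).
select 5 / greatest(0, 1) as ratio_when_count_is_zero;
```

Note that `GREATEST` returns NULL if any argument is NULL, so it is only a drop-in replacement for the `CASE` when the count cannot be NULL.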

select 
      t.name as team, 
      u.name as "REP NAME", 
      Activity.DistIdCnt as "TOTAL VISITS", 
      Activity.UniqAccountCnt as "UNIQUE VISITS",
      Placement.DistIdCnt as "COMMITMENTS ADDED",

      Placement.DistIdCnt / 
         CASE WHEN Activity.UniqAccountCnt = 0 
           THEN 1.00 
           ELSE Activity.UniqAccountCnt * 1.00 
           end as "UNIQUE VISIT TO COMMITMENT %",

      case when o.mode = 'basic' 
           then Placement.DistIdCnt
           else Placement.DistFulfillCnt
           end as "COMMITMENTS FULFILLED",

      case when o.mode = 'basic' 
           then 1 
           else ( Placement.DistFulfillCnt /
                     CASE when Placement.DistIdCnt = 0
                          then 1.00
                          ELSE Placement.DistIdCnt * 1.00
                          END )
           END as "COMMITMENT TO FULFILLMENT %"
   from 
      lpmysqldb.users u
         left join lpmysqldb.teams t 
            on u.team_id = t.team_id
            left join lpmysqldb.organizations o 
               on t.org_id = o.id
        left join
        ( select
                user_id,
                count(*) as AllRecs,
                count( distinct id ) DistIdCnt,
                count( distinct account_id) as UniqAccountCnt
             from
                lpmysqldb.activity
             where 
                    org_id = '555b918ae4b07b6ac5050852' 
                and completed_at>='2018-05-01' 
                and completed_at<='2018-06-01' 
                and tag='visit' 
                and accountname is not null 
                and status IN ( '1', 'active', 'true') 
             group by
                user_id ) activity 
            on u.id = activity.user_id

         left join 
         ( select
                 userid,
                 count(*) AllRecs,
                 count(distinct id) as DistIdCnt,
                 count(distinct( case when commitmentstatus = 'fullfilled' 
                                      then id 
                                      else 0 end )) DistFulfillCnt
              from 
                 lpmysqldb.placements
              where 
                     orgid = '555b918ae4b07b6ac5050852'
                 and placementdate >= '2018-05-01' 
                 and placementdate <= '2018-06-01' 
                 and ( status is null OR status IN ('1','active','true') )
              group by
                 userid ) as placement 
            on u.id = placement.userid
   where 
         u.org_id = '555b918ae4b07b6ac5050852' 
     and u.status IN ( 'active', 'true', '1')
     and istestuser != '1'
   group by 
      u.org_id, 
      t.name, 
      u.id, 
      u.name, 
      o.mode
   order by 
      activity.DistIdCnt desc
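
A plausible reason the pre-aggregated form helps (an assumption, since the EXPLAIN output is not shown here): in the original query, activity and placements are two independent one-to-many joins on the same user, so every activity row of a user is paired with every placement row of that user before `count(distinct ...)` removes the duplicates. Removing either join removes the multiplication, which would match the drop from 45-55 seconds to 0.1-0.3 seconds. The diagnostic below is a sketch for estimating that fan-out; the filters are copied from the question, while the aliases `a` and `p` and the output column names are invented for illustration.

```sql
-- Estimate how many joined rows each user produces in the original query.
select
    u.id,
    coalesce(a.cnt, 0) as activity_rows,
    coalesce(p.cnt, 0) as placement_rows,
    -- a LEFT JOIN keeps one row even without a match, hence the floor of 1
    greatest(coalesce(a.cnt, 0), 1) * greatest(coalesce(p.cnt, 0), 1) as joined_rows
from lpmysqldb.users u
left join (
    select user_id, count(*) as cnt
    from lpmysqldb.activity
    where org_id = '555b918ae4b07b6ac5050852'
      and completed_at >= '2018-05-01'
      and completed_at <= '2018-06-01'
      and tag = 'visit'
      and accountname is not null
      and status in ('1', 'active', 'true')
    group by user_id
) a on a.user_id = u.id
left join (
    select userid, count(*) as cnt
    from lpmysqldb.placements
    where orgid = '555b918ae4b07b6ac5050852'
      and placementdate >= '2018-05-01'
      and placementdate <= '2018-06-01'
      and (status is null or status in ('1', 'active', 'true'))
    group by userid
) p on p.userid = u.id
where u.org_id = '555b918ae4b07b6ac5050852'
order by joined_rows desc
limit 20;
```

If a handful of users show very large `joined_rows` values, the row multiplication is the likely culprit. One behavioral difference to keep in mind: with pre-aggregated left joins, a user with no matching rows gets NULL rather than 0 in the count columns, so wrapping them as `coalesce(Activity.DistIdCnt, 0)` is one way to get the zeros back.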

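Finally, your inner queries run against all users. If you have a large number of inactive users, you may be able to exclude them from each inner query by adding a join with these conditions, for example...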

( ...
              from 
                 lpmysqldb.placements 
                    JOIN lpmysqldb.users u2
                       on placements.userid = u2.id
                      and u2.status IN ( 'active', 'true', '1')
                      and u2.istestuser != '1'
              where … ) as placement
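
The same idea applied to the activity subquery might look like the sketch below. The filters are copied from the restructured query above; the alias `a` is introduced here. Because users also has `org_id` and `status` columns, the filter columns have to be qualified once the join is added, otherwise MySQL reports them as ambiguous.

```sql
-- Sketch: activity pre-aggregation restricted to active, non-test users.
select
    a.user_id,
    count(distinct a.id)         as DistIdCnt,
    count(distinct a.account_id) as UniqAccountCnt
from lpmysqldb.activity a
join lpmysqldb.users u2
  on a.user_id = u2.id
 and u2.status in ('active', 'true', '1')
 and u2.istestuser != '1'
where a.org_id = '555b918ae4b07b6ac5050852'
  and a.completed_at >= '2018-05-01'
  and a.completed_at <= '2018-06-01'
  and a.tag = 'visit'
  and a.accountname is not null
  and a.status in ('1', 'active', 'true')   -- qualified: users also has a status column
group by a.user_id;
```

Whether this pays off depends on how many activity rows belong to inactive or test users; if that share is small, the extra join may cost more than it saves.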

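Comments:

If istestuser only ever holds 0 or 1, it would be better to write u2.istestuser = 0 and then add INDEX(istestuser, status, id).

istestuser can be 0, 1, or null, so it is easier to check != 1. But thanks again.

@nickbrleet - OK, but that makes my suggested INDEX useless.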
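
One detail worth checking in this exchange: in SQL, `NULL != 1` evaluates to NULL rather than true, so a `WHERE istestuser != '1'` filter drops rows where istestuser is NULL as well as rows where it is 1. The sketch below shows the three cases with literals, using MySQL's NULL-safe `<=>` operator for comparison. The index name `idx_users_test_status` is hypothetical and the column order is taken from the comment above.

```sql
-- How each possible istestuser value behaves under the two predicates.
select
    0 != 1            as zero_ne,       -- 1: row kept
    1 != 1            as one_ne,        -- 0: row dropped
    null != 1         as null_ne,       -- NULL: row dropped by WHERE
    not (null <=> 1)  as null_safe_ne;  -- 1: row kept

-- Hypothetical index from the comment; it is most useful with an equality
-- test such as istestuser = 0, less so with a != predicate.
alter table lpmysqldb.users
    add index idx_users_test_status (istestuser, status, id);
```

If rows with a NULL istestuser are meant to count as real users, `(istestuser is null or istestuser != '1')` or `not (istestuser <=> '1')` expresses that explicitly.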
