How to build hierarchy paths with hierarchical subqueries
Posted: 2014-03-02 11:17:39
Question:
Edit: I have added more information by introducing the location entity, to explain why I am trying to use subqueries.
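In an Oracle 11g database I have a hierarchically structured table of elements that will eventually contain several million rows. Each row has an indexed foreign key pointing to its parent row, and cycles are not allowed. Elements also have a name and a type. In addition, there is another entity, location, which is similar to element (hierarchical, with a foreign key to its parent, plus a name). A top element (a root) can be placed in a location (the two are connected through LocationId). So there are 2 entities:
Location:
Id [NUMBER(9,0), PK]
ParentId [NUMBER(9,0), FK]
Name [VARCHAR2(200)]
Element:
Id [NUMBER(9,0), PK]
LocationId [NUMBER(9,0), FK]
ParentId [NUMBER(9,0), FK]
TypeId [NUMBER(9,0), FK]
Name [VARCHAR2(200)]
Now suppose the tables contain the following data:
Location: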
Id | ParentId | Name
----------------------------------
100 | null | TopLocation
101 | 100 | Level1Location
102 | 101 | Level2Location
Element:
Id | LocationId | ParentId | TypeId | Name
----------------------------------------------------
1 | 102 | null | 10 | TopParent
2 | null | 1 | 11 | Level1Child
3 | null | 2 | 11 | Level2Child
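What I want to do is write a query for elements that, in addition to the 4 basic element columns, returns the full paths of parent ids, names and type ids, plus the location ids and names of the top element. So if I fetch the element with id 3 (this condition may also be more complex and involve multiple columns not specified here), the query must return: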
Id | ParentId | TypeId | Name | IdsPath | TypeIdsPath | NamesPath | LocIdsPath | LocNamesPath
---------------------------------------------------------------------------------------------------------------------------------------------------------------
3 | 2 | 11 | Level2Child | /3/2/1 | /11/11/10 | /Level2Child/Level1Child/TopParent | /102/101/100 | /Level2Location/Level1Location/TopLocation
First I wrote Oracle hierarchical queries that return the required paths for location and element.
Location:
select
SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
SYS_CONNECT_BY_PATH(Name, '/') NamesPath
from
loc
where
connect_by_isleaf = 1
CONNECT BY PRIOR ParentId = Id
start with Id = 102
Element:
select
SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
SYS_CONNECT_BY_PATH(TypeId, '/') TypeIdsPath,
SYS_CONNECT_BY_PATH(Name, '/') NamesPath
from
ele
where
connect_by_isleaf = 1
CONNECT BY PRIOR ParentId = Id
start with Id = 3
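The trouble starts when I want to use these queries as subqueries joined into the base select: the start with condition cannot simply be replaced by a join condition, because the hierarchical query then does a full table scan: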
select
e.*,
elePath.IdsPath,
elePath.TypeIdsPath,
elePath.NamesPath,
locPath.IdsPath as LocIdsPath,
locPath.NamesPath as LocNamesPath
from
ele e
left join (
--full table scan!
select
CONNECT_BY_ROOT(Id) Id,
Id as TopEleId,
SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
SYS_CONNECT_BY_PATH(TypeId, '/') TypeIdsPath,
SYS_CONNECT_BY_PATH(Name, '/') NamesPath
from ele
where
connect_by_isleaf = 1
CONNECT BY PRIOR ParentId = Id
) elePath on elePath.Id = e.Id
left join (
--full table scan!
select
CONNECT_BY_ROOT(Id) Id,
SYS_CONNECT_BY_PATH(Id, '/') IdsPath,
SYS_CONNECT_BY_PATH(Name, '/') NamesPath
from loc
where
connect_by_isleaf = 1
CONNECT BY PRIOR ParentId = Id
) locPath on locPath.Id = elePath.TopEleId
where
e.Id = 3
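I also cannot use a scalar subquery, because the query must return multiple paths, not just one. Any suggestions? Am I heading in the right direction, or should I add some fields to the element table and cache all the paths I need? (They will not be updated often.)
Thanks!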
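Answer 1:
You are traversing the hierarchy in reverse, so you can simply use the connect_by_root() operator to get the column values of the root row, which here is the row the traversal starts from.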
clear screen;
column IdPath format a11;
column TypeIdPath format a11
column NamePath format a35;
with t1(id1, parent_id, type_id, Name1) as(
select 1, null, 10, 'TopParent' from dual union all
select 2, 1 , 11, 'Level1Child' from dual union all
select 3, 2 , 11, 'Level2Child' from dual
)
select connect_by_root(id1) as id1
, connect_by_root(parent_id) as ParentId
, connect_by_root(type_id) as Typeid
, connect_by_root(name1) as name1
, sys_connect_by_path(id1, '/') as IdPath
, sys_connect_by_path(type_id, '/') as TypeIdPath
, sys_connect_by_path(name1, '/') as NamePath
from t1
where connect_by_isleaf = 1
start with id1 = 3
connect by id1 = prior parent_id
Result:
id1 ParentId TypeId Name1 IdPath TypeIdPath NamePath
---------------------------------------------------------------------------
3 2 11 Level2Child /3/2/1 /11/11/10 /Level2Child/Level1Child/TopParent
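Because connect_by_root() reports the row each traversal started from, the start with predicate does not have to be a single id. The following is a minimal sketch on the same sample data, assuming the real filter can be written directly in start with; the condition type_id = 11 is only an illustrative stand-in and does not come from the question.

```sql
-- Same sample data as above; the type_id = 11 filter is an assumed example condition.
with t1(id1, parent_id, type_id, Name1) as(
  select 1, null, 10, 'TopParent'   from dual union all
  select 2, 1   , 11, 'Level1Child' from dual union all
  select 3, 2   , 11, 'Level2Child' from dual
)
select connect_by_root(id1)            as id1      -- the row this walk started from
     , sys_connect_by_path(id1, '/')   as IdPath   -- ids from that row up to the top parent
     , sys_connect_by_path(name1, '/') as NamePath -- names along the same walk
  from t1
 where connect_by_isleaf = 1                       -- keep only the row where the walk reached the top
 start with type_id = 11                           -- every matching row starts its own walk
connect by id1 = prior parent_id
```

On the sample data this should yield one row per starting element: path /2/1 for element 2 and path /3/2/1 for element 3.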
Edit #1
One way to get the desired output is to use scalar subqueries:
with Locations(Id1, ParentId, Name1) as(
select 100, null, 'TopLocation' from dual union all
select 101, 100 , 'Level1Location' from dual union all
select 102, 101 , 'Level2Location' from dual
),
elements(id1, LocationId, parent_id, type_id, Name1) as(
select 1, 102, null, 10, 'TopParent' from dual union all
select 2, null, 1 , 11, 'Level1Child' from dual union all
select 3, null, 2 , 11, 'Level2Child' from dual
)
select e.*
, (select sys_connect_by_path(l.id1, '/')
from locations l
where connect_by_isleaf = 1
start with l.id1 = e.locationid
connect by l.id1 = prior parentid) as LocIdPath
, (select sys_connect_by_path(l.name1, '/')
from locations l
where connect_by_isleaf = 1
start with l.id1 = e.locationid
connect by l.id1 = prior parentid) as LocNamePath
from ( select connect_by_root(id1) as id1
, connect_by_root(parent_id) as ParentId
, connect_by_root(type_id) as Typeid
, connect_by_root(name1) as name1
, sys_connect_by_path(id1, '/') as IdPath
, sys_connect_by_path(type_id, '/') as TypeIdPath
, sys_connect_by_path(name1, '/') as NamePath
, locationid
from elements
where connect_by_isleaf = 1
start with id1 = 3
connect by id1 = prior parent_id ) e
Result:
ID1 PARENTID TYPEID NAME1 IDPATH TYPEIDPATH NAMEPATH LOCATIONID LOCIDPATH LOCNAMEPATH
---------- ---------- ----------- ----------- ----------- ----------------------------------- ---------- ------------- -------------------------------------------
3 2 11 Level2Child /3/2/1 /11/11/10 /Level2Child/Level1Child/TopParent 102 /102/101/100 /Level2Location/Level1Location/TopLocation
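Comments:
Thanks for your answer. I had already thought of that, but maybe the example was too simple. In the real case I have to get the paths of the current entity (element in this case) plus the paths of some entities related to the element by foreign key. For example, each element also has a foreign key LocationId, and locations are hierarchical too, so I need those paths as well. That is why I asked how to do this with subqueries.
@PavleGartner Then you need to provide more information: a sample of the master/detail data and the exact output you are after. If that is the requirement, it can be done in a subquery, but more precise information is needed. In your example you outer join the result set returned by an inline view to the table that the inline view is based on. There is no LocationId column, nor the table containing that column that your question mentions.
Thanks for your effort. I have added the location concept to the original question to make it clearer.
Thanks, this works. Do you know how the correlated location subqueries affect performance? Is the optimizer smart enough not to traverse the hierarchy twice in your query?
@PavleGartner Actually, because the hierarchy is traversed in reverse, the Locations table is accessed four times in this case (additional nested loops are required), and the Elements table is accessed twice. However, if appropriate indexes are in place, there should be no significant performance penalty. If they are not, create an index on the foreign key.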
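The indexing advice in the last comment can be sketched as follows. This is only an illustration under assumptions: the table names ele and loc are taken from the queries in the question, the index names are hypothetical, and whether each index already exists (the question says the element parent key is indexed) has to be checked against the real schema.

```sql
-- Assumed table and column names from the question; index names are hypothetical.
-- The upward walk (connect by Id = prior ParentId) looks rows up by Id,
-- which the primary key already indexes. Foreign-key indexes mainly help
-- joins on these columns and top-down walks.
create index loc_parentid_ix   on loc (ParentId);
create index ele_locationid_ix on ele (LocationId);

-- Inspect how the optimizer accesses the table for one upward walk.
explain plan for
select sys_connect_by_path(Id, '/') as IdsPath
  from loc
 where connect_by_isleaf = 1
 start with Id = 102
connect by Id = prior ParentId;

-- Show the plan that was just explained.
select * from table(dbms_xplan.display);
```

Comparing the plan before and after creating an index is a direct way to see how many times each table is visited and whether it is reached through an index or a full scan.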