How to get column-level dependencies in a view
【Posted】: 2018-01-21 08:58:09
【Question】: I've done some research on this but haven't found a solution yet. What I want to get is the column-level dependencies in a view. So, suppose we have a table like this:
create table TEST(
first_name varchar(10),
last_name varchar(10),
street varchar(10),
number int
)
And a view like this:
create view vTEST
as
select
first_name + ' ' + last_name as [name],
street + ' ' + cast(number as varchar(max)) as [address]
from dbo.TEST
I'd like to get a result like this:
column_name depends_on_column_name depends_on_table_name
----------- --------------------- --------------------
name first_name dbo.TEST
name last_name dbo.TEST
address street dbo.TEST
address number dbo.TEST
I've tried the sys.dm_sql_referenced_entities function, but referencing_minor_id is always 0 for views.
select
referencing_minor_id,
referenced_schema_name + '.' + referenced_entity_name as depends_on_table_name,
referenced_minor_name as depends_on_column_name
from sys.dm_sql_referenced_entities('dbo.vTEST', 'OBJECT')
referencing_minor_id depends_on_table_name depends_on_column_name
-------------------- --------------------- ----------------------
0 dbo.TEST NULL
0 dbo.TEST first_name
0 dbo.TEST last_name
0 dbo.TEST street
0 dbo.TEST number
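The same is true for sys.sql_expression_dependencies and the obsolete sys.sql_dependencies.
So am I missing something, or is it simply impossible?
There are some related questions (Find the real column name of an alias used in a view?), but as I said, I haven't found a working solution yet.
EDIT 1: I tried using the DAC to check whether this information is stored somewhere in the System Base Tables, but found nothing.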
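【Comments】:
mssqltips.com/sqlservertip/2999/… WITH SCHEMABINDING can chain dependencies, but I'm not sure whether it lets you build a result like this.
I don't think there is a practical pure T-SQL solution. You may find some useful information in this question: parsing TSQL.
DBA Stack Exchange has a similar question that uses sys.sql_dependencies and sys.sql_expression_dependencies. Unfortunately, the former is in maintenance mode and the latter doesn't cut it. dba.stackexchange.com/questions/77813
Inspired by the comment here, you could also try running sp_helptext on VIEW_COLUMN_USAGE in the information schema. For me, VIEW_COLUMN_USAGE also uses sys.sql_dependencies, but I'm still stuck on SQL Server 2008, so I don't know whether this applies to newer versions.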
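【Answer 1】:
Unfortunately, SQL Server does not explicitly store the mapping between source table columns and view columns. I suspect the main reason is simply the potential complexity of views (expression columns, functions called on those columns, nested queries, etc.).
The only ways I can think of to determine the mapping between view columns and source columns are to parse the query associated with the view or to parse the view's execution plan.
The approach I've outlined here focuses on the second option and relies on the fact that SQL Server avoids generating output lists for columns a query doesn't need.
The first step is to get the list of dependent tables and their associated columns required by the view. This can be done through the standard system tables in SQL Server.
Next, we enumerate all of the view's columns via a cursor.
For each view column, we create a temporary wrapper stored procedure that selects only the single column in question from the view. Because only a single column is requested, SQL Server will retrieve only the information needed to output that single view column.
The newly created procedure runs the query in format-only mode (SET FMTONLY ON) and therefore causes no actual I/O operations on the database, but it does generate an estimated execution plan when executed. Once the query plan has been generated, we query the output lists from the execution plan. Since we know which view column was selected, we can associate the output list with the view column in question. We can refine the association further by only associating columns that form part of the original dependency list, which eliminates expression outputs from the result set.
Note that with this method, if the view needs to join different tables together to produce the output, all columns required to produce the output are returned, even those not used directly in the column expression, because they are still required.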
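To see the column pruning this approach relies on without creating any objects, one option is to request the estimated plan for a single-column select by hand. This is only a sketch for inspection in a client such as SSMS or sqlcmd (GO is the client's batch separator, not T-SQL), and it assumes the dbo.vTEST view from the question. It does not replace the wrapper procedure, because the plan comes back as a result set rather than in a variable.

```sql
-- Return the estimated plan as XML instead of executing the statements that follow.
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO
-- Only one view column is requested, so the OutputList elements in the plan
-- should reference only the base columns needed to compute [name].
SELECT [name] FROM dbo.vTEST;
GO
SET SHOWPLAN_XML OFF;
GO
```

Comparing the OutputList elements for [name] and for [address] shows which base columns each view column pulls in.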
The stored procedure below demonstrates the approach described above:
CREATE PROCEDURE ViewGetColumnDependencies
(
@viewName NVARCHAR(50)
)
AS
BEGIN
CREATE TABLE #_suppress_output
(
result NVARCHAR(500) NULL
);
DECLARE @viewTableColumnMapping TABLE
(
[ViewName] NVARCHAR(50),
[SourceObject] NVARCHAR(50),
[SourceObjectColumnName] NVARCHAR(50),
[ViewAliasColumn] NVARCHAR(50)
)
-- Get list of dependent tables and their associated columns required for the view.
INSERT INTO @viewTableColumnMapping
(
[ViewName]
,[SourceObject]
,[SourceObjectColumnName]
)
SELECT v.[name] AS [ViewName]
,'[' + OBJECT_NAME(d.referenced_major_id) + ']' AS [SourceObject]
,c.[name] AS [SourceObjectColumnName]
FROM sys.views v
LEFT OUTER JOIN sys.sql_dependencies d ON d.object_id = v.object_id
LEFT OUTER JOIN sys.columns c ON c.object_id = d.referenced_major_id AND c.column_id = d.referenced_minor_id
WHERE v.[name] = @viewName;
DECLARE @aliasColumn NVARCHAR(50);
-- Next, we enumerate all of the view's columns via a cursor.
DECLARE ViewColumnNameCursor CURSOR FOR
SELECT aliases.name AS [AliasName]
FROM sys.views v
LEFT OUTER JOIN sys.columns AS aliases on v.object_id = aliases.object_id -- c.column_id=aliases.column_id AND aliases.object_id = object_id('vTEST')
WHERE v.name = @viewName;
OPEN ViewColumnNameCursor
FETCH NEXT FROM ViewColumnNameCursor
INTO @aliasColumn
DECLARE @tql_create_proc NVARCHAR(MAX);
DECLARE @queryPlan XML;
WHILE @@FETCH_STATUS = 0
BEGIN
/*
For each view column, we create a temporary wrapper stored procedure that
only selects the single column in question from view. The stored procedure
will run the query in format only mode and will therefore not cause any
actual I/O operations on the database, but it will generate an estimated
execution plan when executed.
*/
SET @tql_create_proc = 'CREATE PROCEDURE ___WrapView
AS
SET FMTONLY ON;
SELECT CONVERT(NVARCHAR(MAX), [' + @aliasColumn + ']) FROM [' + @viewName + '];
SET FMTONLY OFF;';
EXEC (@tql_create_proc);
-- Execute the procedure to generate a query plan. The insert into the temp table is only done to
-- suppress the empty result set from being displayed as part of the output.
INSERT INTO #_suppress_output
EXEC ___WrapView;
-- Get the query plan for the wrapper procedure that was just executed.
SELECT @queryPlan = [qp].[query_plan]
FROM [sys].[dm_exec_procedure_stats] AS [ps]
JOIN [sys].[dm_exec_query_stats] AS [qs] ON [ps].[plan_handle] = [qs].[plan_handle]
CROSS APPLY [sys].[dm_exec_query_plan]([qs].[plan_handle]) AS [qp]
WHERE [ps].[database_id] = DB_ID() AND OBJECT_NAME([ps].[object_id], [ps].[database_id]) = '___WrapView'
-- Drop the wrapper procedure
DROP PROCEDURE ___WrapView
/*
After the query plan is generated, we query the output lists from the execution plan.
Since we know which view column was selected we can now associate the output list to
view column in question. We can further refine the association by only associating
columns that form part of our original dependency list, this will eliminate expression
outputs from the result set.
*/
;WITH QueryPlanOutputList AS
(
SELECT T.X.value('local-name(.)', 'NVARCHAR(max)') as Structure,
T.X.value('./@Table[1]', 'NVARCHAR(50)') as [SourceTable],
T.X.value('./@Column[1]', 'NVARCHAR(50)') as [SourceColumnName],
T.X.query('*') as SubNodes
FROM @queryPlan.nodes('*') as T(X)
UNION ALL
SELECT QueryPlanOutputList.structure + N'/' + T.X.value('local-name(.)', 'nvarchar(max)'),
T.X.value('./@Table[1]', 'NVARCHAR(50)') as [SourceTable],
T.X.value('./@Column[1]', 'NVARCHAR(50)') as [SourceColumnName],
T.X.query('*')
FROM QueryPlanOutputList
CROSS APPLY QueryPlanOutputList.SubNodes.nodes('*') as T(X)
)
UPDATE @viewTableColumnMapping
SET ViewAliasColumn = @aliasColumn
FROM @viewTableColumnMapping CM
INNER JOIN
(
SELECT DISTINCT QueryPlanOutputList.Structure
,QueryPlanOutputList.[SourceTable]
,QueryPlanOutputList.[SourceColumnName]
FROM QueryPlanOutputList
WHERE QueryPlanOutputList.Structure like '%/OutputList/ColumnReference'
) SourceColumns ON CM.[SourceObject] = SourceColumns.[SourceTable] AND CM.SourceObjectColumnName = SourceColumns.SourceColumnName
FETCH NEXT FROM ViewColumnNameCursor
INTO @aliasColumn
END
CLOSE ViewColumnNameCursor;
DEALLOCATE ViewColumnNameCursor;
DROP TABLE #_suppress_output
SELECT *
FROM @viewTableColumnMapping
ORDER BY [ViewAliasColumn]
END
The stored procedure can now be executed as follows:
EXEC dbo.ViewGetColumnDependencies @viewName = 'vTEST'
【Comments】:
【Answer 2】: I was playing around with this but didn't have time to take it further. Maybe this will help:
-- Returns all table columns called in the view and the objects they pull from
SELECT
v.[name] AS ViewName
,d.[referencing_id] AS ViewObjectID
,c.[name] AS ColumnNames
,OBJECT_NAME(d.referenced_id) AS ReferencedTableName
,d.referenced_id AS TableObjectIDsReferenced
FROM
sys.views v
INNER JOIN sys.sql_expression_dependencies d ON d.referencing_id = v.[object_id]
INNER JOIN sys.objects o ON d.referencing_id = o.[object_id]
INNER JOIN sys.columns c ON d.referenced_id = c.[object_id]
WHERE v.[name] = 'vTEST'
-- Returns all output columns in the view
SELECT
OBJECT_NAME([object_id]) AS ViewName
,[object_id] AS ViewObjectID
,[name] AS OutputColumnName
FROM sys.columns
WHERE OBJECT_ID('vTEST') = [object_id]
-- Get the view definition
SELECT
VIEW_DEFINITION
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_NAME = 'vTEST'
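【Comments】:
【Answer 3】: This is a solution based on the query plan.
Advantages:
Almost any select query can be processed.
No schema binding is required.
Disadvantages:
It has not been tested properly.
It can break suddenly if Microsoft changes the XML query plan.
The core idea is that every column expression in the XML query plan is defined in a "DefinedValue" node. The first subnode of "DefinedValue" is a reference to the output column, and the second is the expression. The expression is computed from input columns and constant values. As mentioned above, this is based only on empirical observation and needs proper testing.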
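The "DefinedValue" observation can be tried in isolation on a small XML fragment. The sketch below uses a hand-written, heavily simplified fragment (not a real plan; the names Expr1006, [testdb] and [TEST] are only illustrative) and lists, for each DefinedValue, the output column together with the base columns referenced inside its expression.

```sql
-- Hand-written, simplified showplan fragment; real plans contain many more
-- nodes and attributes.
DECLARE @plan XML = N'
<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <DefinedValue>
    <ColumnReference Column="Expr1006" />
    <ScalarOperator>
      <Arithmetic Operation="ADD">
        <ScalarOperator>
          <Identifier>
            <ColumnReference Database="[testdb]" Schema="[dbo]" Table="[TEST]" Column="first_name" />
          </Identifier>
        </ScalarOperator>
        <ScalarOperator>
          <Identifier>
            <ColumnReference Database="[testdb]" Schema="[dbo]" Table="[TEST]" Column="last_name" />
          </Identifier>
        </ScalarOperator>
      </Arithmetic>
    </ScalarOperator>
  </DefinedValue>
</ShowPlanXML>';

-- The first child of DefinedValue names the output column; every
-- ColumnReference under the ScalarOperator child is an input to the expression.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT dv.n.value('(ColumnReference/@Column)[1]', 'nvarchar(128)') AS output_column,
       src.n.value('@Table', 'nvarchar(128)')  AS source_table,
       src.n.value('@Column', 'nvarchar(128)') AS source_column
FROM @plan.nodes('//DefinedValue') AS dv(n)
CROSS APPLY dv.n.nodes('ScalarOperator//ColumnReference') AS src(n);
```

In a real plan an expression can itself refer to another ExprNNNN column, which is why the full solution walks the nodes recursively instead of stopping after one level.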
Here is an example invocation:
exec dbo.GetColumnDependencies 'select * from dbo.vTEST'
target_column_name | source_column_name | const_value
---------------------------------------------------
address | Expr1007 | NULL
name | Expr1006 | NULL
Expr1006 | NULL | ' '
Expr1006 | [testdb].[dbo].first_name | NULL
Expr1006 | [testdb].[dbo].last_name | NULL
Expr1007 | NULL | ' '
Expr1007 | [testdb].[dbo].number | NULL
Expr1007 | [testdb].[dbo].street | NULL
Here is the code. First, get the XML query plan.
declare @select_query as varchar(4000) = 'select * from dbo.vTEST' -- IT'S YOUR QUERY HERE.
declare @select_into_query as varchar(4000) = 'select top (1) * into #foo from (' + @select_query + ') as src'
, @xml_plan as xml = null
, @xml_generation_tries as tinyint = 10
;
while (@xml_plan is null and @xml_generation_tries > 0) -- There is no guarantee that the plan will be cached.
begin
execute (@select_into_query);
select @xml_plan = pln.query_plan
from sys.dm_exec_query_stats as qry
cross apply sys.dm_exec_sql_text(qry.sql_handle) as txt
cross apply sys.dm_exec_query_plan(qry.plan_handle) as pln
where txt.text = @select_into_query
;
set @xml_generation_tries -= 1; -- Count the attempt so the loop ends if the plan is never cached.
end
if (@xml_plan is null
) begin
raiserror(N'Can''t extract XML query plan from cache.' ,15 ,0);
return;
end
;
Next comes the main query. Its largest part is a recursive common table expression for column extraction.
with xmlnamespaces(default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
,'http://schemas.microsoft.com/sqlserver/2004/07/showplan' as shp -- Used in .query() for predictive namespace using.
)
, cte_column_dependencies as
(
The seed of the recursion is a query that extracts the columns of the #foo table, which stores 1 row of the select query of interest.
select
(select foo_col.info.query('./ColumnReference') for xml raw('shp:root') ,type) -- Because .value() can't extract an attribute from the root node.
as target_column_info
, (select foo_col.info.query('./ScalarOperator/Identifier/ColumnReference') for xml raw('shp:root') ,type)
as source_column_info
, cast(null as xml) as const_info
, 1 as iteration_no
from @xml_plan.nodes('//Update/SetPredicate/ScalarOperator/ScalarExpressionList/ScalarOperator/MultipleAssign/Assign')
as foo_col(info)
where foo_col.info.exist('./ColumnReference[@Table="[#foo]"]') = 1
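The recursive part searches for "DefinedValue" nodes that contain a dependent column and extracts all "ColumnReference" and "Const" subnodes used in the column expression. It is overcomplicated by the XML-to-SQL conversions.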
union all
select
(select internal_col.info.query('.') for xml raw('shp:root') ,type)
, source_info.column_info
, source_info.const_info
, prev_dependencies.iteration_no + 1
from @xml_plan.nodes('//DefinedValue/ColumnReference') as internal_col(info)
inner join cte_column_dependencies as prev_dependencies -- Filters by depended columns.
on prev_dependencies.source_column_info.value('(//ColumnReference/@Column)[1]' ,'nvarchar(4000)') = internal_col.info.value('(./@Column)[1]' ,'nvarchar(4000)')
and exists (select prev_dependencies.source_column_info.value('(.//@Schema)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Schema)[1]' ,'nvarchar(4000)'))
and exists (select prev_dependencies.source_column_info.value('(.//@Database)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Database)[1]' ,'nvarchar(4000)'))
and exists (select prev_dependencies.source_column_info.value('(.//@Server)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Server)[1]' ,'nvarchar(4000)'))
cross apply ( -- Because a result row can hold either a column or a constant, not both.
select (select source_col.info.query('.') for xml raw('shp:root') ,type) as column_info
, null as const_info
from internal_col.info.nodes('..//ColumnReference') as source_col(info)
union all
select null as column_info
, (select const.info.query('.') for xml raw('shp:root') ,type) as const_info
from internal_col.info.nodes('..//Const') as const(info)
) as source_info
where source_info.column_info is null
or (
-- Exclude the node itself, which '..//ColumnReference' also selects among its sources. I couldn't find a simple way to do this check in XQuery.
source_info.column_info.value('(//@Column)[1]' ,'nvarchar(4000)') <> internal_col.info.value('(./@Column)[1]' ,'nvarchar(4000)')
and (select source_info.column_info.value('(//@Schema)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Schema)[1]' ,'nvarchar(4000)')) is null
and (select source_info.column_info.value('(//@Database)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Database)[1]' ,'nvarchar(4000)')) is null
and (select source_info.column_info.value('(//@Server)[1]' ,'nvarchar(4000)') intersect select internal_col.info.value('(./@Server)[1]' ,'nvarchar(4000)')) is null
)
)
Finally, a select statement converts the XML into human-readable text.
select
-- col_dep.target_column_info
--, col_dep.source_column_info
--, col_dep.const_info
coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Server)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Database)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.target_column_info.value('(.//shp:ColumnReference/@Schema)[1]' ,'nvarchar(4000)') + '.' ,'')
+ col_dep.target_column_info.value('(.//shp:ColumnReference/@Column)[1]' ,'nvarchar(4000)')
as target_column_name
, coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Server)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Database)[1]' ,'nvarchar(4000)') + '.' ,'')
+ coalesce(col_dep.source_column_info.value('(.//shp:ColumnReference/@Schema)[1]' ,'nvarchar(4000)') + '.' ,'')
+ col_dep.source_column_info.value('(.//shp:ColumnReference/@Column)[1]' ,'nvarchar(4000)')
as source_column_name
, col_dep.const_info.value('(/shp:root/shp:Const/@ConstValue)[1]' ,'nvarchar(4000)')
as const_value
from cte_column_dependencies as col_dep
order by col_dep.iteration_no ,target_column_name ,source_column_name
option (maxrecursion 512) -- Safeguard against infinite recursion.
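【Comments】:
【Answer 4】: This solution only partially answers your question. It does not work for columns that are expressions.
You can use sys.dm_exec_describe_first_result_set to get column information:
@include_browse_information
If set to 1, each query is analyzed as if it had a FOR BROWSE option on the query. Additional key columns and source table information are returned.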
CREATE TABLE txu(id INT, first_name VARCHAR(10), last_name VARCHAR(10));
CREATE TABLE txd(id INT, id_fk INT, address VARCHAR(100));
CREATE VIEW v_txu
AS
SELECT t.id AS PK_id,
t.first_name AS name,
d.address,
t.first_name + t.last_name AS name_full
FROM txu t
JOIN txd d
ON t.id = d.id_fk
Main query:
SELECT name, source_database, source_schema,
source_table, source_column
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM v_txu', null, 1) ;
Output:
+-----------+--------------------+---------------+--------------+---------------+
| name | source_database | source_schema | source_table | source_column |
+-----------+--------------------+---------------+--------------+---------------+
| PK_id | fiddle_0f9d47226c4 | dbo | txu | id |
| name | fiddle_0f9d47226c4 | dbo | txu | first_name |
| address | fiddle_0f9d47226c4 | dbo | txd | address |
| name_full | null | null | null | null |
+-----------+--------------------+---------------+--------------+---------------+
DBFiddleDemo
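Since expression columns come back with NULL source information, the same function can at least flag which view columns still need one of the plan-based approaches. A minimal sketch, assuming the dbo.vTEST view from the question:

```sql
-- List the visible view columns for which browse mode could not resolve a
-- source column (typically expressions such as concatenations).
SELECT name AS view_column
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM dbo.vTEST', NULL, 1)
WHERE source_column IS NULL
  AND is_hidden = 0; -- skip the extra key columns that browse mode may add
```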
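【Comments】:
1- It does not work properly on databases that were transferred, for example, from 2008 to 2019. 2- There are also problems with views based on unions and aliases.
【Answer 5】: Everything you need is mentioned in the view's definition.
So we can extract this information with the following steps:
Assign the view definition to a string variable.
Split it on the comma (,).
Split the aliased expressions on the plus (+) operator by using CROSS APPLY with XML.
Use the system tables to get accurate information, such as the original table.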
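The third step, splitting an expression on the plus operator through XML, can be sketched on its own as follows. The variable @expr and its value are only an illustration taken from the question's view. Note that the cast to XML will fail if the expression contains characters such as < or &, so this is not a general T-SQL parser.

```sql
-- Illustrative expression taken from the question's view definition.
DECLARE @expr NVARCHAR(200) = N'first_name + '' '' + last_name';

-- Wrap each '+'-separated piece in an <M> element, then shred the elements into rows.
SELECT LTRIM(RTRIM(frag.n.value('.', 'nvarchar(100)'))) AS fragment
FROM (
    SELECT CAST('<M>' + REPLACE(@expr, '+', '</M><M>') + '</M>' AS XML) AS data
) AS x
CROSS APPLY x.data.nodes('/M') AS frag(n);
```

Each returned fragment is then matched against the column names from INFORMATION_SCHEMA.VIEW_COLUMN_USAGE, which is how the procedure discards pieces like the ' ' literal.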
Demo:
Create PROC psp_GetLevelDependsView (@sViewName varchar(200))
AS
BEGIN
Declare @stringToSplit nvarchar(1000),
@name NVARCHAR(255),
@dependsTableName NVARCHAR(50),
@pos INT
Declare @returnList TABLE ([Name] [nvarchar] (500))
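SELECT TOP 1 @dependsTableName= table_schema + '.'+ TABLE_NAME
FROM INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
WHERE VIEW_NAME = @sViewName -- Restrict to the requested view; otherwise an arbitrary view's table is picked.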
select @stringToSplit = definition
from sys.objects o
join sys.sql_modules m on m.object_id = o.object_id
where o.object_id = object_id( @sViewName)
and o.type = 'V'
WHILE CHARINDEX(',', @stringToSplit) > 0
BEGIN
SELECT @pos = CHARINDEX(',', @stringToSplit)
SELECT @name = SUBSTRING(@stringToSplit, 1, @pos-1)
INSERT INTO @returnList
SELECT @name
SELECT @stringToSplit = SUBSTRING(@stringToSplit, @pos+1, LEN(@stringToSplit)-@pos)
END
INSERT INTO @returnList
SELECT @stringToSplit
select COLUMN_NAME , b.Name as Expression
Into #Temp
FROM INFORMATION_SCHEMA.COLUMNS a , @returnList b
WHERE TABLE_NAME= @sViewName
And (b.Name) like '%' + ( COLUMN_NAME) + '%'
SELECT A.COLUMN_NAME as column_name,
Split.a.value('.', 'VARCHAR(100)') AS depends_on_column_name , @dependsTableName as depends_on_table_name
Into #temp2
FROM
(
SELECT COLUMN_NAME,
CAST ('<M>' + REPLACE(Expression, '+', '</M><M>') + '</M>' AS XML) AS Data
FROM #Temp
) AS A CROSS APPLY Data.nodes ('/M') AS Split(a);
SELECT b.column_name , a.COLUMN_NAME as depends_on_column_name , b.depends_on_table_name
FROM INFORMATION_SCHEMA.VIEW_COLUMN_USAGE a , #temp2 b
WHERE VIEW_NAME= @sViewName
and b.depends_on_column_name like '%' + a.COLUMN_NAME + '%'
drop table #Temp
drop table #Temp2
END
Test:
exec psp_GetLevelDependsView 'vTest'
Result:
column_name depends_on_column_name depends_on_table_name
----------- --------------------- --------------------
name first_name dbo.TEST
name last_name dbo.TEST
address street dbo.TEST
address number dbo.TEST
【Comments】:
Thanks for the answer, although I'd prefer to avoid parsing the view, since it can easily get too complex. I was hoping this information was stored somewhere in the system tables.