Select unique column names from a list of joined tables

【Posted】2013-05-10 02:51:28

【Question】:

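I have a list of tables that can all be joined on the same PK column. Because this list of tables can vary from project to project, I want to create a query that is dynamic enough to pull every unique column from those tables.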

For example, I have three tables below:
Table A (PK field, column1, column 2)
Table B (PK field, column3, column 4)
Table C (PK field, column5, column 5)

The three tables are joined on the "PK field" column, and I want the query output to look something like this:

PK field  column1  column2  column3  column4  column5
..data..  ..data.. ..data.. ..data.. ..data.. ..data..

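Eventually this query will become part of a SQL function or stored procedure (SP), so that the user can define the list of tables and the PK field up front, and executing it returns the expected output and data set.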

I was thinking of using the query below, but the result is not what I want:

SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = ''

Any suggestions on how I should design this SP or function would be much appreciated.

Thanks in advance.

DDL for two sample tables:

CREATE TABLE [dbo].[G_bDEM](
    [blaiseKey_code] [nvarchar](255) NULL,
    [qSex] [int] NULL,
    [qDOB] [datetime] NULL,
    [qDOBNR] [int] NULL,
    [qAge] [int] NULL,
    [qAgeNR] [int] NULL,
    [qAgeRange] [int] NULL,
    [qAge15OrOver] [int] NULL,
    [qNotEligible] [nvarchar](1) NULL,
    [qBornInNZ] [int] NULL,
    [qCountryOfBirth] [nvarchar](2) NULL,
    [qArriveNZYr] [int] NULL,
    [qArriveNZYrNR] [int] NULL,
    [qArriveNZMth] [int] NULL,
    [bDEM_BOP_qHowManyRaised] [int] NULL,
    [bDEM_BOP_q1stParentBornNZ] [int] NULL,
    [bDEM_BOP_q2ndParentBornNZ] [int] NULL,
    [bDEM_BOP_qHowManyParentBornNZ] [int] NULL,
    [qMaoriDescent] [int] NULL,
    [qSchQual] [int] NULL,
    [qSchQualOth] [nvarchar](200) NULL,
    [qSchQualOthNR] [int] NULL,
    [qSchQualYr] [int] NULL,
    [qSchQualYrNR] [int] NULL,
    [qPostSchQual] [int] NULL,
    [q3MthsStudy] [int] NULL,
    [qHighestQual] [int] NULL,
    [qHighestQualOth] [nvarchar](200) NULL,
    [qHighestQualOthNR] [int] NULL,
    [qHighestQualYr] [int] NULL,
    [qHighestQualYrNR] [int] NULL,
    [qWorkIntro] [nvarchar](1) NULL,
    [qDidPaidWork] [int] NULL,
    [qAwayFromWork] [int] NULL,
    [qFamilyBusWork] [int] NULL,
    [bDEM_WOR_qPaidWorkIntro] [nvarchar](1) NULL,
    [bDEM_WOR_qJobsNum] [int] NULL,
    [bDEM_WOR_qJobsNumNR] [int] NULL,
    [bDEM_WOR_tabDEM_T2_fTotMins] [int] NULL,
    [bDEM_WOR_q2JobsNoHrsIntro] [nvarchar](1) NULL,
    [bDEM_WOR_q2Jobs2HrsIntro] [nvarchar](1) NULL,
    [bDEM_WOR_q2Jobs1HrsIntro] [nvarchar](1) NULL,
    [bDEM_WOR_qOccupation] [nvarchar](200) NULL,
    [bDEM_WOR_qOccupationNR] [int] NULL,
    [bDEM_WOR_qMainTasks] [nvarchar](200) NULL,
    [bDEM_WOR_qMainTasksNR] [int] NULL,
    [bDEM_WOR_qFeelAboutJob] [int] NULL,
    [bDEM_WOR_qEmployArrangement] [int] NULL,
    [bDEM_WOR_qPermEmployee] [int] NULL,
    [qHasJobToStart] [int] NULL,
    [qLookedForWork] [int] NULL,
    [qJobSearchA] [int] NULL,
    [qJobSearchB] [int] NULL,
    [qJobSearchC] [int] NULL,
    [qJobSearchD] [int] NULL,
    [qJobSearchE] [int] NULL,
    [qJobSearchF] [int] NULL,
    [qJobSearchG] [int] NULL,
    [qJobSearchH] [int] NULL,
    [qJobSearchI] [int] NULL,
    [qJobSearchOth] [nvarchar](200) NULL,
    [qJobSearchOthNR] [int] NULL,
    [qCouldStartLastWk] [int] NULL,
    [qIncTotalAmt] [int] NULL,
    [fCountryName] [nvarchar](60) NULL
 ) ON [PRIMARY]

GO

CREATE TABLE [dbo].[G_bLWW](
    [blaiseKey_code] [nvarchar](255) NULL,
    [qThingsWorthwhileScale] [int] NULL
 ) ON [PRIMARY]

【Question comments】:

【Answer 1】:

This script generates dynamic SQL for any set of tables that share the same PK column name.

Query:

SET NOCOUNT ON

IF OBJECT_ID (N'dbo.A') IS NOT NULL
   DROP TABLE dbo.A

IF OBJECT_ID (N'dbo.B') IS NOT NULL
   DROP TABLE dbo.B

IF OBJECT_ID (N'dbo.C') IS NOT NULL
   DROP TABLE dbo.C

CREATE TABLE dbo.A (PK_field INT PRIMARY KEY, column1 INT, column2 INT)
CREATE TABLE dbo.B (PK_field INT PRIMARY KEY, column3 INT, column4 INT)
CREATE TABLE dbo.C (PK_field INT PRIMARY KEY, column5 INT, [column 6] INT)

INSERT INTO dbo.A (PK_field, column1, column2)
VALUES (1, 1, 2), (2, 1, 2) 

INSERT INTO dbo.B (PK_field, column3, column4)
VALUES (2, 3, 4) 

INSERT INTO dbo.C (PK_field, column5, [column 6])
VALUES (1, 5, 6), (3, 5, 6) 

DECLARE @SQL NVARCHAR(MAX)

;WITH cte AS 
(
    SELECT 
          column_name = '[' + c.name + ']'
        , table_name = '[' + s.name + '].[' + o.name + ']'
    FROM sys.columns c WITH (NOLOCK)
    JOIN sys.objects o WITH (NOLOCK) ON c.[object_id] = o.[object_id]
    JOIN sys.schemas s WITH (NOLOCK) ON o.[schema_id] = s.[schema_id]
    WHERE o.name IN ('A', 'B', 'C')
        AND s.name = 'dbo'
        AND o.[type] = 'U'  
), unicol AS (
    SELECT TOP 1 column_name 
    FROM cte 
    GROUP BY cte.column_name
    HAVING COUNT(cte.column_name) > 1
), cols AS 
(
    SELECT DISTINCT column_name 
    FROM cte    
), tbl AS 
(
    SELECT DISTINCT table_name
    FROM cte
), rs AS 
(
    SELECT 
          tbl.table_name
        , column_name = ISNULL(cte.column_name, cols.column_name + ' = NULL')
    FROM cols
    CROSS JOIN tbl
    LEFT JOIN cte ON cols.column_name = cte.column_name AND cte.table_name = tbl.table_name
), rs2 AS (
    SELECT uni = ' UNION ALL' + CHAR(13) + 'SELECT ' + STUFF((
        SELECT ', ' + rs.column_name
        FROM rs
        WHERE tbl.table_name = rs.table_name
        GROUP BY rs.column_name
        ORDER BY rs.column_name
        FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') + 
        ' FROM ' + table_name
    FROM tbl
) 
SELECT @SQL = 'SELECT 
' + STUFF((
    SELECT CHAR(13) + ', ' + ISNULL(unicol.column_name, cols.column_name + ' = MAX(' + cols.column_name + ')')
    FROM cols
    LEFT JOIN unicol ON cols.column_name = unicol.column_name
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, ' ')
 + ' 
FROM 
(' + STUFF((
    SELECT CHAR(10) + uni
    FROM rs2
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 11, '') + CHAR(13) + 
    ') t 
GROUP BY ' + (SELECT column_name FROM unicol)

PRINT @SQL

EXECUTE sys.sp_executesql @SQL

Output (the generated SQL):

SELECT 
      [column 6] = MAX([column 6])
    , [column1] = MAX([column1])
    , [column2] = MAX([column2])
    , [column3] = MAX([column3])
    , [column4] = MAX([column4])
    , [column5] = MAX([column5])
    , [PK_field] 
FROM (
    SELECT [column 6] = NULL, [column1], [column2], [column3] = NULL, [column4] = NULL, [column5] = NULL, [PK_field] FROM [dbo].[A]
     UNION ALL
    SELECT [column 6] = NULL, [column1] = NULL, [column2] = NULL, [column3], [column4], [column5] = NULL, [PK_field] FROM [dbo].[B]
     UNION ALL
    SELECT [column 6], [column1] = NULL, [column2] = NULL, [column3] = NULL, [column4] = NULL, [column5], [PK_field] FROM [dbo].[C]
) t 
GROUP BY [PK_field]

Result:

column 6    column1     column2     column3     column4     column5     PK_field
----------- ----------- ----------- ----------- ----------- ----------- -----------
6           1           2           NULL        NULL        5           1
NULL        1           2           3           4           NULL        2
6           NULL        NULL        NULL        NULL        5           3
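
The generated query works because each key occurs at most once per table, so within each PK_field group every column has at most one non-NULL value and MAX simply picks it. For comparison, here is a hand-written sketch of what should be an equivalent query for the three sample tables, using FULL OUTER JOIN instead of UNION ALL plus GROUP BY. It assumes the dbo.A, dbo.B and dbo.C tables created by the script above and is not part of the original answer.

```sql
-- Hand-written equivalent for the sample tables dbo.A, dbo.B and dbo.C.
-- COALESCE is needed because a key may be missing from any of the tables.
SELECT
      c.[column 6]
    , a.column1
    , a.column2
    , b.column3
    , b.column4
    , c.column5
    , PK_field = COALESCE(a.PK_field, b.PK_field, c.PK_field)
FROM dbo.A a
FULL OUTER JOIN dbo.B b ON b.PK_field = a.PK_field
FULL OUTER JOIN dbo.C c ON c.PK_field = COALESCE(a.PK_field, b.PK_field)
ORDER BY PK_field
```

The static form is easier to read but has to be rewritten whenever the table list changes, which is the problem the dynamic script is meant to avoid.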

Script update:

DECLARE @SQL NVARCHAR(2000) -> NVARCHAR(MAX)
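
This change presumably addresses the "Unclosed quotation mark" error reported in the comments below: a generated statement longer than 2000 characters is silently cut off when it is assigned to an NVARCHAR(2000) variable. A minimal sketch of that behavior (the variable names are illustrative only):

```sql
-- Assigning a 3000-character string to NVARCHAR(2000) truncates it without
-- raising an error; NVARCHAR(MAX) keeps the whole string.
DECLARE @short NVARCHAR(2000) = REPLICATE(N'x', 3000)
DECLARE @long  NVARCHAR(MAX)  = REPLICATE(CAST(N'x' AS NVARCHAR(MAX)), 3000)

SELECT short_len = LEN(@short)   -- 2000
     , long_len  = LEN(@long)    -- 3000
```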

Output for the sample DDL:

SELECT 
  [blaiseKey_code]
, [bDEM_BOP_q1stParentBornNZ] = MAX([bDEM_BOP_q1stParentBornNZ])
, [bDEM_BOP_q2ndParentBornNZ] = MAX([bDEM_BOP_q2ndParentBornNZ])
, [bDEM_BOP_qHowManyParentBornNZ] = MAX([bDEM_BOP_qHowManyParentBornNZ])
, [bDEM_BOP_qHowManyRaised] = MAX([bDEM_BOP_qHowManyRaised])
, [bDEM_WOR_q2Jobs1HrsIntro] = MAX([bDEM_WOR_q2Jobs1HrsIntro])
, [bDEM_WOR_q2Jobs2HrsIntro] = MAX([bDEM_WOR_q2Jobs2HrsIntro])
, [bDEM_WOR_q2JobsNoHrsIntro] = MAX([bDEM_WOR_q2JobsNoHrsIntro])
, [bDEM_WOR_qEmployArrangement] = MAX([bDEM_WOR_qEmployArrangement])
, [bDEM_WOR_qFeelAboutJob] = MAX([bDEM_WOR_qFeelAboutJob])
, [bDEM_WOR_qJobsNum] = MAX([bDEM_WOR_qJobsNum])
, [bDEM_WOR_qJobsNumNR] = MAX([bDEM_WOR_qJobsNumNR])
, [bDEM_WOR_qMainTasks] = MAX([bDEM_WOR_qMainTasks])
, [bDEM_WOR_qMainTasksNR] = MAX([bDEM_WOR_qMainTasksNR])
, [bDEM_WOR_qOccupation] = MAX([bDEM_WOR_qOccupation])
, [bDEM_WOR_qOccupationNR] = MAX([bDEM_WOR_qOccupationNR])
, [bDEM_WOR_qPaidWorkIntro] = MAX([bDEM_WOR_qPaidWorkIntro])
, [bDEM_WOR_qPermEmployee] = MAX([bDEM_WOR_qPermEmployee])
, [bDEM_WOR_tabDEM_T2_fTotMins] = MAX([bDEM_WOR_tabDEM_T2_fTotMins])
, [fCountryName] = MAX([fCountryName])
, [q3MthsStudy] = MAX([q3MthsStudy])
, [qAge] = MAX([qAge])
, [qAge15OrOver] = MAX([qAge15OrOver])
, [qAgeNR] = MAX([qAgeNR])
, [qAgeRange] = MAX([qAgeRange])
, [qArriveNZMth] = MAX([qArriveNZMth])
, [qArriveNZYr] = MAX([qArriveNZYr])
, [qArriveNZYrNR] = MAX([qArriveNZYrNR])
, [qAwayFromWork] = MAX([qAwayFromWork])
, [qBornInNZ] = MAX([qBornInNZ])
, [qCouldStartLastWk] = MAX([qCouldStartLastWk])
, [qCountryOfBirth] = MAX([qCountryOfBirth])
, [qDidPaidWork] = MAX([qDidPaidWork])
, [qDOB] = MAX([qDOB])
, [qDOBNR] = MAX([qDOBNR])
, [qFamilyBusWork] = MAX([qFamilyBusWork])
, [qHasJobToStart] = MAX([qHasJobToStart])
, [qHighestQual] = MAX([qHighestQual])
, [qHighestQualOth] = MAX([qHighestQualOth])
, [qHighestQualOthNR] = MAX([qHighestQualOthNR])
, [qHighestQualYr] = MAX([qHighestQualYr])
, [qHighestQualYrNR] = MAX([qHighestQualYrNR])
, [qIncTotalAmt] = MAX([qIncTotalAmt])
, [qJobSearchA] = MAX([qJobSearchA])
, [qJobSearchB] = MAX([qJobSearchB])
, [qJobSearchC] = MAX([qJobSearchC])
, [qJobSearchD] = MAX([qJobSearchD])
, [qJobSearchE] = MAX([qJobSearchE])
, [qJobSearchF] = MAX([qJobSearchF])
, [qJobSearchG] = MAX([qJobSearchG])
, [qJobSearchH] = MAX([qJobSearchH])
, [qJobSearchI] = MAX([qJobSearchI])
, [qJobSearchOth] = MAX([qJobSearchOth])
, [qJobSearchOthNR] = MAX([qJobSearchOthNR])
, [qLookedForWork] = MAX([qLookedForWork])
, [qMaoriDescent] = MAX([qMaoriDescent])
, [qNotEligible] = MAX([qNotEligible])
, [qPostSchQual] = MAX([qPostSchQual])
, [qSchQual] = MAX([qSchQual])
, [qSchQualOth] = MAX([qSchQualOth])
, [qSchQualOthNR] = MAX([qSchQualOthNR])
, [qSchQualYr] = MAX([qSchQualYr])
, [qSchQualYrNR] = MAX([qSchQualYrNR])
, [qSex] = MAX([qSex])
, [qThingsWorthwhileScale] = MAX([qThingsWorthwhileScale])
, [qWorkIntro] = MAX([qWorkIntro]) 
FROM 
(
SELECT [bDEM_BOP_q1stParentBornNZ], [bDEM_BOP_q2ndParentBornNZ], [bDEM_BOP_qHowManyParentBornNZ], [bDEM_BOP_qHowManyRaised], [bDEM_WOR_q2Jobs1HrsIntro], [bDEM_WOR_q2Jobs2HrsIntro], [bDEM_WOR_q2JobsNoHrsIntro], [bDEM_WOR_qEmployArrangement], [bDEM_WOR_qFeelAboutJob], [bDEM_WOR_qJobsNum], [bDEM_WOR_qJobsNumNR], [bDEM_WOR_qMainTasks], [bDEM_WOR_qMainTasksNR], [bDEM_WOR_qOccupation], [bDEM_WOR_qOccupationNR], [bDEM_WOR_qPaidWorkIntro], [bDEM_WOR_qPermEmployee], [bDEM_WOR_tabDEM_T2_fTotMins], [blaiseKey_code], [fCountryName], [q3MthsStudy], [qAge], [qAge15OrOver], [qAgeNR], [qAgeRange], [qArriveNZMth], [qArriveNZYr], [qArriveNZYrNR], [qAwayFromWork], [qBornInNZ], [qCouldStartLastWk], [qCountryOfBirth], [qDidPaidWork], [qDOB], [qDOBNR], [qFamilyBusWork], [qHasJobToStart], [qHighestQual], [qHighestQualOth], [qHighestQualOthNR], [qHighestQualYr], [qHighestQualYrNR], [qIncTotalAmt], [qJobSearchA], [qJobSearchB], [qJobSearchC], [qJobSearchD], [qJobSearchE], [qJobSearchF], [qJobSearchG], [qJobSearchH], [qJobSearchI], [qJobSearchOth], [qJobSearchOthNR], [qLookedForWork], [qMaoriDescent], [qNotEligible], [qPostSchQual], [qSchQual], [qSchQualOth], [qSchQualOthNR], [qSchQualYr], [qSchQualYrNR], [qSex], [qThingsWorthwhileScale] = NULL, [qWorkIntro] FROM [dbo].[G_bDEM]
 UNION ALL
SELECT [bDEM_BOP_q1stParentBornNZ] = NULL, [bDEM_BOP_q2ndParentBornNZ] = NULL, [bDEM_BOP_qHowManyParentBornNZ] = NULL, [bDEM_BOP_qHowManyRaised] = NULL, [bDEM_WOR_q2Jobs1HrsIntro] = NULL, [bDEM_WOR_q2Jobs2HrsIntro] = NULL, [bDEM_WOR_q2JobsNoHrsIntro] = NULL, [bDEM_WOR_qEmployArrangement] = NULL, [bDEM_WOR_qFeelAboutJob] = NULL, [bDEM_WOR_qJobsNum] = NULL, [bDEM_WOR_qJobsNumNR] = NULL, [bDEM_WOR_qMainTasks] = NULL, [bDEM_WOR_qMainTasksNR] = NULL, [bDEM_WOR_qOccupation] = NULL, [bDEM_WOR_qOccupationNR] = NULL, [bDEM_WOR_qPaidWorkIntro] = NULL, [bDEM_WOR_qPermEmployee] = NULL, [bDEM_WOR_tabDEM_T2_fTotMins] = NULL, [blaiseKey_code], [fCountryName] = NULL, [q3MthsStudy] = NULL, [qAge] = NULL, [qAge15OrOver] = NULL, [qAgeNR] = NULL, [qAgeRange] = NULL, [qArriveNZMth] = NULL, [qArriveNZYr] = NULL, [qArriveNZYrNR] = NULL, [qAwayFromWork] = NULL, [qBornInNZ] = NULL, [qCouldStartLastWk] = NULL, [qCountryOfBirth] = NULL, [qDidPaidWork] = NULL, [qDOB] = NULL, [qDOBNR] = NULL, [qFamilyBusWork] = NULL, [qHasJobToStart] = NULL, [qHighestQual] = NULL, [qHighestQualOth] = NULL, [qHighestQualOthNR] = NULL, [qHighestQualYr] = NULL, [qHighestQualYrNR] = NULL, [qIncTotalAmt] = NULL, [qJobSearchA] = NULL, [qJobSearchB] = NULL, [qJobSearchC] = NULL, [qJobSearchD] = NULL, [qJobSearchE] = NULL, [qJobSearchF] = NULL, [qJobSearchG] = NULL, [qJobSearchH] = NULL, [qJobSearchI] = NULL, [qJobSearchOth] = NULL, [qJobSearchOthNR] = NULL, [qLookedForWork] = NULL, [qMaoriDescent] = NULL, [qNotEligible] = NULL, [qPostSchQual] = NULL, [qSchQual] = NULL, [qSchQualOth] = NULL, [qSchQualOthNR] = NULL, [qSchQualYr] = NULL, [qSchQualYrNR] = NULL, [qSex] = NULL, [qThingsWorthwhileScale], [qWorkIntro] = NULL FROM [dbo].[G_bLWW]
) t 
GROUP BY [blaiseKey_code]

【Comments】:

Hi @Devart, your solution looks cool! But when I execute the code I get the error "Unclosed quotation mark after the character string". The column names don't contain any special characters, so I'm not sure what is stopping the query. The error message is: SELECT [bDEM_BOP_q1stParentBornNZ] = MAX([bDEM_BOP_q1stParentBornNZ]) , [bDEM_BOP_q2ndParentBornNZ] = MAX([bDEM_BOP_q2ndParentBornNZ]) , [bDEM_BOP_qHowManyParentBornNZ] = MAX([bDEM_BOP_qHowManyParentBornNZ])....Unclosed quotation mark after the character string 'qHighestQual'. Msg 102, Level 15, State 1, Line 2 Incorrect syntax near 'qHighestQual

【Answer 2】:

Try this:

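DECLARE
    @cols NVARCHAR(MAX)
  , @TableA SYSNAME = 'TableA'
  , @TableB SYSNAME = 'TableB'
  , @TableC SYSNAME = 'TableC'
  , @Pk SYSNAME

-- Find the primary-key column of the first table. If the table has no
-- PRIMARY KEY constraint, @Pk stays NULL and so does @query below.
SELECT @Pk = COLUMN_NAME
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
WHERE OBJECTPROPERTY(OBJECT_ID(QUOTENAME(CONSTRAINT_SCHEMA) + '.' + QUOTENAME(CONSTRAINT_NAME)), 'IsPrimaryKey') = 1
    AND TABLE_NAME = @TableA

-- Distinct non-key columns across the three tables. The key is excluded here
-- and selected once, table-qualified, to avoid an "ambiguous column" error.
SELECT
    @cols = STUFF((
            SELECT DISTINCT ', [' + c.COLUMN_NAME + ']'
            FROM INFORMATION_SCHEMA.COLUMNS c
            WHERE c.TABLE_NAME IN ( @TableA, @TableB, @TableC )
                AND c.COLUMN_NAME <> ISNULL(@Pk, '')
            FOR XML PATH('')
          ), 1, 2, '');

-- NVARCHAR(MAX) so that a long column list is not truncated.
DECLARE @query NVARCHAR(MAX)
SET @query = 'SELECT ' + @TableA + '.' + @Pk + ', ' + @cols
    + ' FROM ' + @TableA + ' JOIN ' + @TableB
    + ' ON ' + @TableA + '.' + @Pk + '=' + @TableB + '.' + @Pk
    + ' JOIN ' + @TableC + ' ON ' + @TableB + '.' + @Pk + '=' + @TableC
    + '.' + @Pk

EXEC (@query)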

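Don't forget the warning @Gordon gave about special HTML characters in column names.

【Comments】:

Hi, when I execute this statement against our tables, nothing is returned. I printed @query before the EXEC statement and nothing came back there either.

@Eric: do you get anything in @cols and @Pk?

The @cols parameter holds the list of columns, but @Pk has no value assigned.

【Answer 3】:

You can only do this as a stored procedure. A SQL query returns a specified set of columns, no more and no less. The only way to get a variable number of columns is to use dynamic SQL, and functions do not support dynamic SQL.

You need to construct a SQL statement by concatenating the column names from INFORMATION_SCHEMA.Columns. Something like this: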

declare @cols varchar(max);

select @cols = stuff((select distinct ', ['+c.column_name+']'
                      from INFORMATION_SCHEMA.Columns c
                      where c.table_name in (<list of tables here>)
                      for xml path ('')
                     ), 1, 2, '');

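This does not work for column names that contain special HTML characters such as "<" or "&", because FOR XML PATH('') returns them escaped as entities.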
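
A small sketch of that limitation and of one common workaround, the TYPE directive combined with .value(), which is the form Answer 1 uses. The table variable and its two sample names are made up for illustration:

```sql
-- Hypothetical sample names containing characters that XML has to escape.
DECLARE @names TABLE (name SYSNAME)
INSERT INTO @names (name) VALUES (N'a&b'), (N'c<d')

SELECT
    -- Plain FOR XML PATH(''): special characters come back as entities.
    escaped = STUFF((
        SELECT ', [' + name + ']'
        FROM @names
        ORDER BY name
        FOR XML PATH('')), 1, 2, '')
    -- TYPE + .value(): the XML is converted back to plain text.
  , intact = STUFF((
        SELECT ', [' + name + ']'
        FROM @names
        ORDER BY name
        FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')

-- escaped: [a&amp;b], [c&lt;d]
-- intact:  [a&b], [c<d]
```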

You can then build the complete query statement and execute it with exec() or sp_executesql.
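
As a rough sketch of that last step, assuming the dbo.A sample table from Answer 1 and a column list that has already been built: identifiers have to be concatenated into the statement text, while values can be passed to sp_executesql as real parameters. The @id parameter is only an illustration.

```sql
-- Assumed result of the column-list step for the sample table dbo.A.
DECLARE @cols NVARCHAR(MAX) = N'[PK_field], [column1], [column2]'

-- Column and table names are concatenated; the filter value is a parameter.
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT ' + @cols + N' FROM dbo.A WHERE PK_field = @id'

PRINT @sql   -- inspect the statement before running it

EXECUTE sys.sp_executesql @sql, N'@id INT', @id = 1
```

Printing the statement first makes problems such as a NULL or truncated string visible before anything is executed.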

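Another approach is to create a view that contains all the joins and all the columns, and let the SQL optimizer determine the best execution path.

【Comments】: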
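
A minimal sketch of such a view for the sample tables from Answer 1. The view name dbo.vw_ABC is hypothetical, and the LEFT JOINs assume that dbo.A contains every key; a different join type may fit other data better.

```sql
-- CREATE VIEW must be the first statement in its batch, hence the GO lines.
GO
CREATE VIEW dbo.vw_ABC
AS
SELECT
      a.PK_field
    , a.column1
    , a.column2
    , b.column3
    , b.column4
    , c.column5
    , c.[column 6]
FROM dbo.A a
LEFT JOIN dbo.B b ON b.PK_field = a.PK_field
LEFT JOIN dbo.C c ON c.PK_field = a.PK_field
GO

-- Callers then select only the columns they need.
SELECT PK_field, column1, column3
FROM dbo.vw_ABC
```

When a query against the view does not reference any column of a left-joined table, the optimizer may be able to leave that join out of the plan, but the view still has to be maintained by hand when the table list changes.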
