In Excel, how do I merge data from rows into a column?
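As shown in the screenshots, how can I merge the data in Figure 1 into the layout of column G in Figure 2? There is too much data for copy and paste to be practical. I would appreciate any help. Also, in Figure 2, how can I count in column I how many times each ID number in column F appears?
The approach I came up with uses Power Query. The basic idea is:
1. Load the source data into Power Query.
2. Unpivot all of the family member columns.
3. Each family member has only two fields (name and ID number), so integer-divide an index column by 2 to give both fields of the same family member the same index value.
4. Group on that value, which ensures that each family member's data ends up in its own sub-table.
5. Transpose the sub-table of each family member.
6. Delete the rows that are not needed.
7. Load the result to a table (to inspect the details) and also to the data model (for the statistics later).
The result looks like this (any number of family members is handled automatically; I added three family members for the screenshot):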
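The pairing idea described above (integer-dividing a running index by 2 so that a member's name and ID number share one group key) can be illustrated outside Power Query. The following is a minimal Python sketch, not the answerer's code: the function name `pair_members`, the dictionary keys, and the sample values are all hypothetical, and it assumes every member contributes exactly two non-empty cells.

```python
from itertools import groupby

# Hypothetical sample: fixed household-head fields, followed by the
# unpivoted member cells in the order name, ID number, name, ID number, ...
head = {"no": 1, "name": "Head A"}
member_cells = ["Member B", "ID-0001", "Member C", "ID-0002"]

def pair_members(cells):
    """Pair consecutive cells by using index // 2 as the group key."""
    indexed = list(enumerate(cells))  # plays the role of the 0-based index column
    pairs = []
    for _, group in groupby(indexed, key=lambda item: item[0] // 2):
        # Collect the values of one group into a single row (the "transpose" step)
        pairs.append(tuple(value for _, value in group))
    return pairs

# One output row per family member, each repeating the household-head fields
rows = [
    {**head, "member_name": member_name, "member_id": member_id}
    for member_name, member_id in pair_members(member_cells)
]
```

If a member were missing one of the two cells, the pairs would shift and the unpacking would fail, so the same caveat would seem to apply to any index-based pairing.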
The complete Power Query code is as follows:
let
    // Load the Excel table named "表1" from the current workbook
    源 = Excel.CurrentWorkbook(){[Name="表1"]}[Content],
    // Unpivot every column except the fixed household-head columns
    逆透视的列 = Table.UnpivotOtherColumns(源, {"序号", "姓名", "性别", "民族", "住址(乡、村、村小组)", "身份证号码"}, "属性", "值"),
    已添加索引 = Table.AddIndexColumn(逆透视的列, "索引", 0, 1),
    // Integer-divide the index by 2 so a member's name and ID number share one key
    已添加自定义 = Table.AddColumn(已添加索引, "自定义", each Number.IntegerDivide([索引],2)),
    // Group so that each family member gets its own sub-table
    分组的行 = Table.Group(已添加自定义, {"序号", "姓名", "性别", "民族", "住址(乡、村、村小组)", "身份证号码", "自定义"}, {{"计数", each _, type table [序号=number, 姓名=text, 性别=text, 民族=text, #"住址(乡、村、村小组)"=text, 身份证号码=text, 属性=text, 值=anynonnull, 索引=number, 自定义=number]}}),
    // Transpose the "值" column of each sub-table into a single row
    自定义1 = Table.TransformColumns(分组的行,{"计数",each Table.Transpose(Table.SelectColumns(_,"值"))}),
    #"展开的“计数”" = Table.ExpandTableColumn(自定义1, "计数", {"Column1", "Column2"}, {"Column1", "Column2"}),
    重命名的列 = Table.RenameColumns(#"展开的“计数”",{{"Column1", "家庭成员姓名1"}, {"Column2", "家庭成员身份证号"}}),
    // Keep only the columns needed in the final result
    删除的其他列 = Table.SelectColumns(重命名的列,{"序号", "姓名", "性别", "民族", "住址(乡、村、村小组)", "身份证号码", "家庭成员姓名1", "家庭成员身份证号"})
in
    删除的其他列
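[How to use the code]
1. Make sure you have Excel 2016 or later.
2. Make sure your source data is a table (the "Table" command on the Insert tab) and that it is named "表1" (you can check this in Name Manager).
3. On the Data tab, create a new blank query.
4. In the Query Editor window, click Advanced Editor.
5. Copy all of the code above and use it to replace the existing code. Done.
6. I have provided a sample file. With it, you only need to do steps 1-2 above and then click Refresh All on the Data tab to get the result.
Sample file: link: https://pan.baidu.com/s/1pXpQooeC07slNSwmSXcFNA
7. If you do not have Excel 2016 or later, you can download Power BI Desktop, but the code needs small changes and there are more steps. Those who are interested can look into it.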
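Reference answer A: I don't quite follow. Put the question in the body of an email (the subject must contain "excel"; I use that to tell real mail from spam so it isn't deleted by mistake), attach a sample Excel file showing both the current state and the desired result, and send it to yqch134@163.com. I'll take a look.
Reference answer B: First, my sympathies to everyone still working overtime during the National Day holiday.
If you need it, message me privately and I'll take a look. This is fairly hard to do with formulas and fairly easy with VBA.
That is not absolute, of course; it depends on your exact requirements.
It's easier with a file. If I have time tomorrow, I can take a look.
Reference answer C: This kind of operation is better suited to VBA; ordinary formulas or copy and paste don't handle it well. As for the second question, counting how many times an ID number appears is fairly simple, and answers are easy to find on forums or online. However, you haven't made clear exactly how you want the counting done.
Reference answer D: Honestly, you haven't explained the problem clearly. There is no real difference between Figure 1 and Figure 2!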
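None of the answers spell out the second part of the question (counting in column I how often each ID number in column F appears). As a rough illustration of the logic, roughly what a `COUNTIF` over column F would give, here is a minimal Python sketch. The function name `occurrence_counts` and the sample ID values are hypothetical, and the IDs are assumed to be stored as text.

```python
from collections import Counter

def occurrence_counts(id_numbers):
    """Return, for each row, how many times that row's ID appears in the column."""
    counts = Counter(id_numbers)  # one pass to tally every distinct ID
    return [counts[id_number] for id_number in id_numbers]

# Hypothetical column F values (kept as strings so long IDs are not rounded)
ids = ["A001", "B002", "A001", "C003", "A001"]
per_row = occurrence_counts(ids)
```

Keeping the IDs as strings is a deliberate choice here, since long ID numbers can lose digits when they are treated as numbers.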
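CSV to columns, joined with row-based data, parsed and output: can it be done efficiently?
I have a complicated SQL Server problem that I've been trying to solve, but I'm stuck, and I'm hoping I can get some help!
I have two tables of data, stored in different formats, that I need to bring together to create a specified output. To make matters worse, one of the tables has some key data stored as comma-separated values (I know this isn't how data should be stored. Have mercy, I didn't design these tables!).
Students table: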
| id | oldSkill | newSkill |
+----+-----------------------+--------------------------------------+
| 1 | Word | Excel,PowerPoint,Word |
| 2 | Excel,PowerPoint,Word | Excel,Outlook,PowerPoint,Word |
| 3 | PowerPoint,Word | Excel,PowerPoint,Word |
| 4 | Access,Excel | Access,Excel,Outlook,PowerPoint,Word |
| 5 | Outlook,Word | Excel,Outlook,PowerPoint,Word |
Skills table:
| id | skill | assignment |
+----+------------+------------+
| 1 | Word | B |
| 1 | Word | P |
| 2 | Excel | P |
| 2 | PowerPoint | B |
| 2 | PowerPoint | P |
| 2 | Word | P |
| 3 | PowerPoint | P |
| 3 | Word | P |
| 4 | Access | B |
| 4 | Excel | B |
| 4 | Access | P |
| 4 | Excel | P |
| 5 | Outlook | P |
| 5 | Word | B |
Here is what I've been asked to output:
| id | skill_1 | skill_1_primary | skill_1_backup | skill_2 | skill_2_primary | skill_2_backup | skill_3 | skill_3_primary | skill_3_backup | skill_4 | skill_4_primary | skill_4_backup | skill_5 | skill_5_primary | skill_5_backup |
|----|---------|-----------------|----------------|------------|-----------------|----------------|------------|-----------------|----------------|------------|-----------------|----------------|---------|-----------------|----------------|
| 1 | Excel | Y | (null) | PowerPoint | Y | (null) | Word | Y | Y | (null) | (null) | (null) | (null) | (null) | (null) |
| 2 | Excel | Y | (null) | Outlook | Y | (null) | PowerPoint | Y | Y | Word | Y | (null) | (null) | (null) | (null) |
| 3 | Excel | Y | (null) | PowerPoint | Y | (null) | Word | Y | (null) | (null) | (null) | (null) | (null) | (null) | (null) |
| 4 | Access | Y | Y | Excel | Y | Y | Outlook | Y | (null) | PowerPoint | Y | (null) | Word | Y | (null) |
| 5 | Excel | Y | (null) | Outlook | Y | (null) | PowerPoint | Y | (null) | Word | (null) | Y | (null) | (null) | (null) |
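To break it down, I need to:
- Output all of the items in the newSkill column of the Students table. These values need to be split into separate columns, each with corresponding flags that indicate whether the skill is a primary or a backup skill. Note that the newSkill column contains the oldSkill values.
- If a skill is old, take the flag values from the Skills table, where P is primary and B is backup.
- If a skill is new, just flag the Primary column with a 'Y' value.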
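The flagging rule in the list above can be written down compactly. This is only a sketch in Python of the rule as stated, for one student at a time; the function name `flag_skills` and the shape of the `assignments` argument are assumptions made for illustration, not part of the original tables.

```python
def flag_skills(old_csv, new_csv, assignments):
    """Flag each skill in new_csv as primary and/or backup.

    assignments: set of (skill, 'P' or 'B') pairs taken from the Skills
    table for this one student (a hypothetical in-memory representation).
    """
    old = set(old_csv.split(","))
    result = []
    for skill in new_csv.split(","):
        if skill in old:
            # Old skill: take the flags from the Skills table
            primary = "Y" if (skill, "P") in assignments else None
            backup = "Y" if (skill, "B") in assignments else None
        else:
            # New skill: always primary, never backup
            primary, backup = "Y", None
        result.append((skill, primary, backup))
    return result

# Student 5 from the sample data
student_5 = flag_skills(
    "Outlook,Word",
    "Excel,Outlook,PowerPoint,Word",
    {("Outlook", "P"), ("Word", "B")},
)
```

For student 5 this yields Word with no primary flag and a backup flag, which matches the requested output row for that student.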
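I've been trying to approach this from different angles (CTEs, pivots, cursors, etc.), and I've managed to split the CSV column values using a UDF, but taking the data from the rows of the Skills table and combining it with the Students data in the format they want is eluding me.
I've also set up a SQL Fiddle to build my test data for this post: http://sqlfiddle.com/#!6/e8d5a/1/0
Thanks in advance for any help or guidance... SQL is not one of my strongest skills. I could do this more easily in another language, but I've been asked to build it as a stored procedure. =P
Update: Following the suggestions posted in the comments, I've gotten a lot of it done. I just need help with the final output. I think it can be done using a pivot with dynamic SQL, but how to pivot and aggregate the three skill-related columns and number them as specified is eluding me.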
-- this pivots the skills table into a single row for each skill
select *
into #skillPiv
from
(
select id, skill, assignment,
'assignment_'+cast(row_number() over(partition by id, skill order by skill) as varchar(10)) rn
from skills
) d
pivot
(
max(assignment)
for rn in ([assignment_1], [assignment_2])
) piv
order by id;
-- this converts the student's oldSkills from CSV into rows and looks up the corresponding skill assignments in the #skillPiv table
with st(id, skill, oldSkill) as (
select id, LEFT(CAST(oldSkill as varchar(max)), CHARINDEX(',',oldSkill+',')-1),
STUFF(CAST(oldSkill as varchar(max)), 1, CHARINDEX(',',oldSkill+','), '')
from students
union all
select id, LEFT(CAST(oldSkill as varchar(max)), CHARINDEX(',',oldSkill+',')-1),
STUFF(CAST(oldSkill as varchar(max)), 1, CHARINDEX(',',oldSkill+','), '')
from st
where oldSkill > ''
)
select st.id
,st.skill
,CASE WHEN sp.assignment_1 = 'P' OR sp.assignment_2 = 'P'
THEN 'Y'
ELSE ''
END AS [primary]
,CASE WHEN sp.assignment_1 = 'B' OR sp.assignment_2 = 'B'
THEN 'Y'
ELSE ''
END AS [backup]
into #oldSkills
from st
inner join #skillPiv sp on st.id = sp.id and st.skill = sp.skill
order by id;
-- convert the newSkills column from CSV to rows and insert our default skill assignment values
with tmp(id, skill, newSkill) as (
select id, LEFT(CAST(newSkill as varchar(max)), CHARINDEX(',',newSkill+',')-1),
STUFF(CAST(newSkill as varchar(max)), 1, CHARINDEX(',',newSkill+','), '')
from students
union all
select id, LEFT(CAST(newSkill as varchar(max)), CHARINDEX(',',newSkill+',')-1),
STUFF(CAST(newSkill as varchar(max)), 1, CHARINDEX(',',newSkill+','), '')
from tmp
where newSkill > ''
)
select id
,skill
,'Y' as [primary]
,'' as [backup]
into #newSkills
from tmp
where skill NOT IN (
select skill from #oldSkills where id = tmp.id
)
order by id;
-- now combine #oldSkills and #newSkills into one table that has all the values we need
select *
into #studentSkills
from (
select * from #newSkills
UNION
select * from #oldSkills
) as ss;
select * from #studentSkills;
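I had trouble getting temp tables to work in SQL Fiddle, so I moved my test code to RexTester.
In my actual code, I use DelimitedSplit8K to parse the CSV values in the Students table.
The code above produces this final table: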
| id | skill | primary | backup |
|----|------------|---------|--------|
| 1 | Excel | Y | (null) |
| 1 | PowerPoint | Y | (null) |
| 1 | Word | Y | Y |
| 2 | Excel | Y | (null) |
| 2 | Outlook | Y | (null) |
| 2 | PowerPoint | Y | Y |
| 2 | Word | Y | (null) |
| 3 | Excel | Y | (null) |
| 3 | PowerPoint | Y | (null) |
| 3 | Word | Y | (null) |
| 4 | Access | Y | Y |
| 4 | Excel | Y | Y |
| 4 | Outlook | Y | (null) |
| 4 | PowerPoint | Y | (null) |
| 4 | Word | Y | (null) |
| 5 | Excel | Y | (null) |
| 5 | Outlook | Y | (null) |
| 5 | PowerPoint | Y | (null) |
| 5 | Word | (null) | Y |
Now I just need to pivot it into the desired output:
| id | skill_1 | skill_1_primary | skill_1_backup | skill_2 | skill_2_primary | skill_2_backup | skill_3 | skill_3_primary | skill_3_backup | skill_4 | skill_4_primary | skill_4_backup | skill_5 | skill_5_primary | skill_5_backup |
|----|---------|-----------------|----------------|------------|-----------------|----------------|------------|-----------------|----------------|------------|-----------------|----------------|---------|-----------------|----------------|
| 1 | Excel | Y | (null) | PowerPoint | Y | (null) | Word | Y | Y | (null) | (null) | (null) | (null) | (null) | (null) |
| 2 | Excel | Y | (null) | Outlook | Y | (null) | PowerPoint | Y | Y | Word | Y | (null) | (null) | (null) | (null) |
| 3 | Excel | Y | (null) | PowerPoint | Y | (null) | Word | Y | (null) | (null) | (null) | (null) | (null) | (null) | (null) |
| 4 | Access | Y | Y | Excel | Y | Y | Outlook | Y | (null) | PowerPoint | Y | (null) | Word | Y | (null) |
| 5 | Excel | Y | (null) | Outlook | Y | (null) | PowerPoint | Y | (null) | Word | (null) | Y | (null) | (null) | (null) |
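I appreciate any help. Thanks!
This design is really, really bad :-D
Still, if you have to stick with it, you can try this:
Note: I'm relying on your statement
"Note that the newSkill column contains the oldSkill values"
which I read as "there is no old skill that is not included in the new skills!"
The solution is fully inline and set-based: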
DECLARE @students TABLE(id INT,oldSkill VARCHAR(100),newSkill VARCHAR(100));
INSERT INTO @students VALUES
(1,'Word','Excel,PowerPoint,Word')
,(2,'Excel,PowerPoint,Word','Excel,Outlook,PowerPoint,Word')
,(3,'PowerPoint,Word','Excel,PowerPoint,Word')
,(4,'Access,Excel','Access,Excel,Outlook,PowerPoint,Word')
,(5,'Outlook,Word','Excel,Outlook,PowerPoint,Word');
DECLARE @skills TABLE(id INT, skill VARCHAR(100),assignment VARCHAR(1));
INSERT INTO @skills VALUES
(1,'Word','B')
,(1,'Word','P')
,(2,'Excel','P')
,(2,'PowerPoint','B')
,(2,'PowerPoint','P')
,(2,'Word','P')
,(3,'PowerPoint','P')
,(3,'Word','P')
,(4,'Access','B')
,(4,'Excel','B')
,(4,'Access','P')
,(4,'Excel','P')
,(5,'Outlook','P')
,(5,'Word','B');
-- The first CTE uses an XML trick to split the comma-separated values
WITH Step1 AS
(
SELECT id
,A.*
FROM @students AS s
OUTER APPLY(
SELECT CAST('<x>' + REPLACE(s.oldSkill,',','</x><x>') + '</x>' AS XML) AS OldSkillXml
,CAST('<x>' + REPLACE(s.newSkill,',','</x><x>') + '</x>' AS XML) AS NewSkillXml
) AS A
)
-- The second CTE gets the list of old skills together with their flags
,OldSkills AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY Step1.id ORDER BY (SELECT NULL)) AS OldSkillOrder
,Step1.id
,os.value('text()[1]','varchar(100)') AS Skill
,CASE WHEN (SELECT assignment FROM @skills AS s WHERE s.id=Step1.id AND s.skill=os.value('text()[1]','varchar(100)') AND s.assignment='P') IS NOT NULL THEN 'Y' END AS IsPrimary
,CASE WHEN (SELECT assignment FROM @skills AS s WHERE s.id=Step1.id AND s.skill=os.value('text()[1]','varchar(100)') AND s.assignment='B') IS NOT NULL THEN 'Y' END AS IsBackup
FROM Step1
OUTER APPLY Step1.OldSkillXml.nodes('x') AS A(os)
)
-- This CTE gets the list of new skills, all marked with IsPrimary='Y'
,NewSkills AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY Step1.id ORDER BY (SELECT NULL)) AS NewSkillOrder
,Step1.id
,ns.value('text()[1]','varchar(100)') AS Skill
,'Y' AS IsPrimary
,NULL AS IsBackup
FROM Step1
OUTER APPLY Step1.NewSkillXml.nodes('x') AS A(ns)
)
-- The intermediate list is your result before the pivot
,IntermediateList AS
(
SELECT ns.id
,ns.Skill
,ns.IsPrimary
,os.IsBackup
,ns.NewSkillOrder
FROM NewSkills AS ns
FULL OUTER JOIN OldSkills AS os ON os.id=ns.id AND os.Skill=ns.Skill
)
-- Here I use "conditional aggregation" (old-fashioned pivot), which is a great way to pivot several columns at once
SELECT id
,MAX(CASE WHEN NewSkillOrder = 1 THEN Skill END) AS skill_1
,MAX(CASE WHEN NewSkillOrder = 1 THEN IsPrimary END) AS skill_1_primary
,MAX(CASE WHEN NewSkillOrder = 1 THEN IsBackup END) AS skill_1_backup
,MAX(CASE WHEN NewSkillOrder = 2 THEN Skill END) AS skill_2
,MAX(CASE WHEN NewSkillOrder = 2 THEN IsPrimary END) AS skill_2_primary
,MAX(CASE WHEN NewSkillOrder = 2 THEN IsBackup END) AS skill_2_backup
,MAX(CASE WHEN NewSkillOrder = 3 THEN Skill END) AS skill_3
,MAX(CASE WHEN NewSkillOrder = 3 THEN IsPrimary END) AS skill_3_primary
,MAX(CASE WHEN NewSkillOrder = 3 THEN IsBackup END) AS skill_3_backup
,MAX(CASE WHEN NewSkillOrder = 4 THEN Skill END) AS skill_4
,MAX(CASE WHEN NewSkillOrder = 4 THEN IsPrimary END) AS skill_4_primary
,MAX(CASE WHEN NewSkillOrder = 4 THEN IsBackup END) AS skill_4_backup
,MAX(CASE WHEN NewSkillOrder = 5 THEN Skill END) AS skill_5
,MAX(CASE WHEN NewSkillOrder = 5 THEN IsPrimary END) AS skill_5_primary
,MAX(CASE WHEN NewSkillOrder = 5 THEN IsBackup END) AS skill_5_backup
FROM IntermediateList AS il
GROUP BY id;
Result
+----+---------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+---------+-----------------+----------------+
| id | skill_1 | skill_1_primary | skill_1_backup | skill_2 | skill_2_primary | skill_2_backup | skill_3 | skill_3_primary | skill_3_backup | skill_4 | skill_4_primary | skill_4_backup | skill_5 | skill_5_primary | skill_5_backup |
+----+---------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+---------+-----------------+----------------+
| 1 | Excel | Y | NULL | PowerPoint | Y | NULL | Word | Y | Y | NULL | NULL | NULL | NULL | NULL | NULL |
+----+---------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+---------+-----------------+----------------+
| 2 | Excel | Y | NULL | Outlook | Y | NULL | PowerPoint | Y | Y | Word | Y | NULL | NULL | NULL | NULL |
+----+---------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+---------+-----------------+----------------+
| 3 | Excel | Y | NULL | PowerPoint | Y | NULL | Word | Y | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+----+---------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+---------+-----------------+----------------+
| 4 | Access | Y | Y | Excel | Y | Y | Outlook | Y | NULL | PowerPoint | Y | NULL | Word | Y | NULL |
+----+---------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+---------+-----------------+----------------+
| 5 | Excel | Y | NULL | Outlook | Y | NULL | PowerPoint | Y | NULL | Word | Y | Y | NULL | NULL | NULL |
+----+---------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+------------+-----------------+----------------+---------+-----------------+----------------+
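Note: there is one difference. Your student 5 has NULL / Y for the skill "Word". I don't understand why this skill should not be "primary", since it is included in the new skills.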
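The "conditional aggregation" used in the final SELECT of the query above turns numbered long rows into one wide row per id. The same reshaping can be sketched in Python. This is an illustration of the idea only, not a translation of the T-SQL; the function name `pivot_skills` and the tuple layout are assumptions.

```python
def pivot_skills(rows, max_skills=5):
    """Pivot (id, order, skill, primary, backup) tuples into one wide dict per id."""
    wide = {}
    for student_id, order, skill, primary, backup in rows:
        record = wide.setdefault(student_id, {"id": student_id})
        if 1 <= order <= max_skills:
            # Each order number fills its own trio of columns
            record[f"skill_{order}"] = skill
            record[f"skill_{order}_primary"] = primary
            record[f"skill_{order}_backup"] = backup
    # Fixed column list so that unused slots appear as None (NULL in SQL)
    columns = ["id"] + [
        f"skill_{n}{suffix}"
        for n in range(1, max_skills + 1)
        for suffix in ("", "_primary", "_backup")
    ]
    return [{col: record.get(col) for col in columns} for record in wide.values()]

# Student 1 from the sample data, already numbered 1..3
long_rows = [
    (1, 1, "Excel", "Y", None),
    (1, 2, "PowerPoint", "Y", None),
    (1, 3, "Word", "Y", "Y"),
]
wide_rows = pivot_skills(long_rows)
```

As in the SQL, the number of skill slots is fixed in advance (five here); a variable number of slots would need the column list to be built dynamically.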