Add a new column to a dataframe using matching values of another dataframe [duplicate]

Posted 2023-02-19

Asked: 2016-08-30 06:37:22

Question:

I am trying to populate table1 with the matching val2 values from table2:

table1$New_val2 = table2[table2$pid==table1$pid,]$val2

but I get the warning

longer object length is not a multiple of shorter object length

which is fair enough, since the two tables do not have the same length.

Please point me to the correct way to do this.

Comments:

merge(table1, table2, by="pid"). You can add the all.x=TRUE argument if needed.

Hi, what if table2 has other columns but I only want to add col2?

merge(table1, table2[, c("pid", "col2")], by="pid")

Answer 1:

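I am not sure whether this is what you mean, but you could use:

newtable <- merge(table1, table2, by = "pid")

This creates a new table called newtable with 3 columns (assuming each input has pid plus one value column), holding the rows whose id, in this case "pid", matches in both tables.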
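As a minimal sketch of what that call returns, the snippet below builds two small tables and merges them. The sample values and the val1 column are assumptions made up for illustration; only pid and val2 come from the question.

```r
# Hypothetical sample data; pid and val2 follow the question, val1 is assumed.
table1 <- data.frame(pid = c(3, 1, 2, 4), val1 = c("c", "a", "b", "d"))
table2 <- data.frame(pid = c(1, 2, 3), val2 = c(10, 20, 30))

# Inner join on pid: only pids present in both tables are kept.
newtable <- merge(table1, table2, by = "pid")

print(names(newtable))  # the key column comes first, then the remaining columns
print(nrow(newtable))   # pid 4 has no match in table2, so that row is dropped
```

Note that merge() sorts the result on the key column by default (sort = TRUE), so the row order of table1 is not preserved.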

Answer 2:

merge(table1, table2[, c("pid", "val2")], by="pid")

Add the all.x=TRUE argument to keep every pid in table1, including those with no match in table2...

You were on the right track. Here is a way to do it with match...
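A small sketch of the all.x=TRUE variant, using made-up sample data (the val1 and other columns are assumptions for illustration):

```r
# Hypothetical sample data; table2 carries an extra column we do not want.
table1 <- data.frame(pid = c(3, 1, 2, 4), val1 = c("c", "a", "b", "d"))
table2 <- data.frame(pid = c(1, 2, 3),
                     val2 = c(10, 20, 30),
                     other = c("x", "y", "z"))

# Left join: keep every row of table1 and bring in only val2 from table2.
result <- merge(table1, table2[, c("pid", "val2")], by = "pid", all.x = TRUE)

print(result)  # pid 4 is kept, with NA in val2
```

Subsetting table2 to the key plus the wanted column before merging keeps unrelated columns out of the result.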

table1$val2 <- table2$val2[match(table1$pid, table2$pid)]
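To see why this works, here is a short sketch on made-up sample data (the values and the val1 column are assumptions). match() returns, for each table1$pid, the position of its first occurrence in table2$pid, or NA when there is none, so the result always has one entry per row of table1.

```r
# Hypothetical sample data; pid and val2 follow the question, val1 is assumed.
table1 <- data.frame(pid = c(3, 1, 2, 4), val1 = c("c", "a", "b", "d"))
table2 <- data.frame(pid = c(1, 2, 3), val2 = c(10, 20, 30))

# Row positions in table2 for each pid of table1 (NA where there is no match).
idx <- match(table1$pid, table2$pid)

# Indexing with idx yields a vector as long as table1, in table1's row order.
table1$val2 <- table2$val2[idx]

print(idx)
print(table1)
```

Unlike merge(), this leaves the row order of table1 untouched. If table2 contains duplicate pids, only the first match is used.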

Answer 3:

I am late to this, but in case someone else has the same question: this is exactly what dplyr's inner_join does.

table1.df <- dplyr::inner_join(table1, table2, by = "pid")

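The by argument specifies which column should be used to match rows.

Edit: I used to have a hard time remembering that it is [join], not [merge].

Comments:

I prefer this over merge() because the tables are not shuffled in the process, although the function is now called dplyr::inner_join().

pid now also needs to be in "", i.e. table1.df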
