Whitespaces appear when combining paste0 and format in R


【Title】: Whitespaces appear when combining paste0 and format in R 【Posted】: 2022-01-11 17:59:51 【Question】:

To display the results of a regression I ran, I have a tibble with estimates and the corresponding confidence intervals:

library(tidyverse)
library(magrittr)

mydata <- structure(list(term = structure(c(1L, 3L, 4L), .Label = c("Intercept", 
"Follow-up time (years)", "Age (years)", "Sex (male)", "Never smoker (reference)", 
"Current smoker", "Former smoker", "Obesity (=30 kg/m²)", "BMI (kg/m²)", 
"Diabetes", "Glucose (mmol/L)", "Glucose lowering medication use", 
"Hypertension", "Systolic blood pressure (mmHg)", "Diastolic blood pressure (mmHg)", 
"Antihypertensive medication use", "Hypercholesterolemia", "LDL cholesterol (mmol/L)", 
"Lipid lowering medication use", "Chronic kidney disease (mL/min/1.73m²)", 
"=90 (reference)", "60-89", "=60"), class = c("ordered", "factor"
)), estimate = c(518.38, 0.98, 1.07), conf_low = c(178.74, 0.93, 
0.96), conf_high = c(1503.36, 1.03, 1.19), label = c("518.38 (178.74-1503.36)", 
"  0.98 (  0.93-   1.03)", "  1.07 (  0.96-   1.19)")), row.names = c(NA, 
-3L), class = c("tbl_df", "tbl", "data.frame"))

mydata

# A tibble: 3 x 4
  term        estimate conf_low conf_high
  <ord>          <dbl>    <dbl>     <dbl>
1 Intercept     518.     179.     1503.  
2 Age (years)     0.98     0.93      1.03
3 Sex (male)      1.07     0.96      1.19

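To build a label containing the estimate and the 95% CI, I used paste0, and to make sure every number has two decimals, I used format. However, when these are combined, extra whitespace appears: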

mydata <- 
  mydata %>% 
  mutate(
    label=
      paste0(format(round(estimate, digits=2), nsmall=2), 
             " (", 
             format(round(conf_low, digits=2), nsmall=2), 
             "-", 
             format(round(conf_high, digits=2), nsmall=2), 
             ")", 
             sep="", collaps=""))

mydata
# A tibble: 3 x 5
  term        estimate conf_low conf_high label                    
  <ord>          <dbl>    <dbl>     <dbl> <chr>                    
1 Intercept     518.     179.     1503.   "518.38 (178.74-1503.36)"
2 Age (years)     0.98     0.93      1.03 "  0.98 (  0.93-   1.03)"
3 Sex (male)      1.07     0.96      1.19 "  1.07 (  0.96-   1.19)"

Why does this happen? Can I prevent it, or otherwise remove the whitespace, so that the format becomes "estimate (conf_low-conf_high)"?

【Comments】:

Typo: collaps="" >> collapse="".. not sure whether it matters, just spotted it ;-) 【Answer 1】:

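Add trim=TRUE to the format() calls. By default, format() pads every element of a vector to a common width so that the results line up; trim=TRUE suppresses that padding.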

mydata %>% 
  mutate(
    label=
      # paste0() has no sep argument, and collapse would merge all rows
      # into one string, so both are dropped here
      paste0(format(round(estimate, digits=2), nsmall=2, trim=TRUE), 
             " (", 
             format(round(conf_low, digits=2), nsmall=2, trim=TRUE), 
             "-", 
             format(round(conf_high, digits=2), nsmall=2, trim=TRUE), 
             ")"))
  
# A tibble: 3 × 5
  term        estimate conf_low conf_high label                    
  <ord>          <dbl>    <dbl>     <dbl> <chr>                    
1 Intercept     518.     179.     1503.   "518.38 (178.74-1503.36)"
2 Age (years)     0.98     0.93      1.03 "0.98 (0.93-1.03)"     
3 Sex (male)      1.07     0.96      1.19 "1.07 (0.96-1.19)"     
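The padding can be seen without the tibble at all. The following is a minimal sketch in base R, using a plain numeric vector `x` (a hypothetical stand-in for the `estimate` column) to compare the default behaviour of format() with trim=TRUE:

```r
# Hypothetical stand-in for the estimate column
x <- c(518.38, 0.98, 1.07)

# Default: every element is padded to the width of the widest one
padded <- format(round(x, digits = 2), nsmall = 2)

# trim = TRUE: no padding to a common width
trimmed <- format(round(x, digits = 2), nsmall = 2, trim = TRUE)

print(padded)   # "518.38" "  0.98" "  1.07"
print(trimmed)  # "518.38" "0.98"   "1.07"
```

Because format() is called on the whole column inside mutate(), the widest value in each column sets the width for all rows, which is why only the shorter numbers pick up leading spaces.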

【Comments】:

【Answer 2】:

1) Use sprintf

mydata %>% 
  mutate(label = sprintf("%.2f (%.2f-%.2f)", estimate, conf_low, conf_high))

giving:

# A tibble: 3 x 5
  term        estimate conf_low conf_high label                  
  <ord>          <dbl>    <dbl>     <dbl> <chr>                  
1 Intercept     518.     179.     1503.   518.38 (178.74-1503.36)
2 Age (years)     0.98     0.93      1.03 0.98 (0.93-1.03)       
3 Sex (male)      1.07     0.96      1.19 1.07 (0.96-1.19)       
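If the number of decimals needs to vary, the sprintf approach can be wrapped in a small helper. This is only a sketch: `fmt_ci` and its `digits` parameter are hypothetical names introduced here, not part of the original answer or of any package.

```r
# Hypothetical helper: builds "estimate (low-high)" labels with a
# configurable number of decimals, using sprintf() as in option 1
fmt_ci <- function(est, lo, hi, digits = 2) {
  num <- paste0("%.", digits, "f")            # e.g. "%.2f"
  fmt <- paste0(num, " (", num, "-", num, ")") # e.g. "%.2f (%.2f-%.2f)"
  sprintf(fmt, est, lo, hi)                    # vectorised over rows
}

labels <- fmt_ci(c(518.38, 0.98, 1.07),
                 c(178.74, 0.93, 0.96),
                 c(1503.36, 1.03, 1.19))
print(labels)
```

Inside a pipeline this would be called as `mutate(label = fmt_ci(estimate, conf_low, conf_high))`.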

2) or this variation, which uses fixed field widths and so produces slightly different (padded, aligned) output

mydata %>% 
  mutate(label = sprintf("%6.2f (%6.2f-%7.2f)", estimate, conf_low, conf_high))

giving:

# A tibble: 3 x 5
  term        estimate conf_low conf_high label                    
  <ord>          <dbl>    <dbl>     <dbl> <chr>                    
1 Intercept     518.     179.     1503.   "518.38 (178.74-1503.36)"
2 Age (years)     0.98     0.93      1.03 "  0.98 (  0.93-   1.03)"
3 Sex (male)      1.07     0.96      1.19 "  1.07 (  0.96-   1.19)"
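The difference between the two sprintf variants comes down to field width. As a rough illustration in base R (the vectors `est`, `lo` and `hi` are hypothetical copies of the three columns), the labels can be compared by length:

```r
# Hypothetical copies of the estimate, conf_low and conf_high columns
est <- c(518.38, 0.98, 1.07)
lo  <- c(178.74, 0.93, 0.96)
hi  <- c(1503.36, 1.03, 1.19)

# No field width: each number takes only the characters it needs
compact <- sprintf("%.2f (%.2f-%.2f)", est, lo, hi)

# Field widths 6, 6 and 7: shorter numbers are left-padded with spaces
aligned <- sprintf("%6.2f (%6.2f-%7.2f)", est, lo, hi)

print(nchar(compact))  # lengths differ between rows
print(nchar(aligned))  # every label has the same length
```

The widths 6, 6 and 7 happen to match the widest value in each column of this data set; other data would need different widths to stay aligned.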

【Comments】:
