R: mean of values in a column equalling 0 when it should not be


【Title】R: mean of values in a column equalling 0 when it should not be 【Posted】2020-07-02 01:45:27 【Question】:

I am trying to take a mean.

Here is a sample of the values:

c(0, 0, -0.00426086269980885, -0.0171883361787684, 0, 0.00144261723538186, 
-0.00433411019126062, 0.00144679156901439, 0.0100822459425949, 
0.0099816080711328, -0.0157359963029773, -0.00288315632388603, 
0.00432162324420649, -0.00721311620008525, 0.0215106455827421, 
-0.0071232124906615, -0.00717431689199532, -0.0101129941384546, 
-0.0072998774915094, -0.00734028331959991, -0.0029645801177054, 
-0.0361398560956312, 0.059535967918694, -0.014555000996034, -0.0237432369256014, 
0.00897325497629353, -0.0134834715407033, -0.00453065081556225, 
0.00904086737997201, 0, 0.00897325497629353, 0.0103658246132836, 
0.00293825901286748, 0.0014658983231568, 0.0131102996091697, 
-0.00434666950738372, 0.00867452721813411, 0.00575438823184449, 
0.00713072754549238, -0.00570050845244285, 0.032367744813719, 
-0.0224003315648895, -0.00426690479638658, -0.00856299509528746, 
-0.0320803144128314, -0.00593767207121676, -0.00898667487078519, 
0.042682540672784, 0, -0.00867452721813411, 0.0172873110304499, 
-0.0100533229008199, 0.0200065816431954, -0.0113814352297972, 
0.0156115438602651, 0.0153715651969968, 0, 0.0246591214936727, 
0.00540492616309773, -0.00134546290274429, -0.00541222291310994, 
0.0187980455874044, 0.00793919867215731, -0.0159539973780527, 
-0.046651009916145, 0.00421308190323977, 0.0111281244997983, 
0.0055117761038721, -0.00137509854859985, 0.020454826377061, 
0.00269273851511898, -0.0135490703741734, -0.00410940942736904, 
-0.0193576517224336, 0.015243665180396, -0.00137509854859985, 
0.0123205855863757, -0.0095722767888935, -0.00828015580561559, 
-0.00277119138772619, -0.00138848083750265, 0.00555484167808995, 
0.0137351720732788, -0.0347046257643022, 0, -0.0142236796686062, 
-0.00575438823184449, 0, 0, -0.00867452721813411, -0.0146401608111093, 
0.0218741489407392, -0.0116127724770649, -0.00878900309037522, 
0.0102463560867569, -0.0161622351615947, 0.010320036410862, 0.0188766242402405, 
-0.00721311620008525, -0.00435297650073618, -0.00292568578506813, 
-0.00585074407750774, 0.0014658983231568, -0.0147699819493079, 
0, 0.00148551649006734, -0.0164769253635333, 0.0209202799957451, 
-0.00444335463221179, 0.00740793474991719, 0, -0.00740793474991719, 
0, 0.00296442904837857, -0.00593767207121676, -0.00298210956584821, 
0.026550349587346, 0, 0.0272796463542244, -0.00141602427074616, 
-0.0171765353914815, -0.00578299799930626, 0.00144888780804564, 
0.00144679156901439, -0.00579976806975058, 0.0144497464625504, 
-0.0144497464625504, -0.00146833278868641, 0.0187805658940814, 
-0.01006776364878, -0.00579137089424808, 0.0384939243098099, 
0.0357196284321315, -0.0094943601026829, 0.0202220805588311, 
-0.0147926677568835, 0.00944319295810203, -0.0148726057600497, 
0.0108416357150576, 0, -0.0176824660638335, -0.00965148293373552, 
-0.00555008362262299, -0.0126141662743384, -0.0099250699919331, 
0.0113331189354406, 0.0112061173308309, -0.0126141662743384, 
-0.0257159599524019, -0.0428716904319435, -0.0198406947735075, 
-0.0249516254684341, 0.0493093277118515, -0.00906816810262079, 
-0.00455824865790699, -0.00917930778590481, 0.0480501078992965, 
0.0416403776341081, -0.0416403776341081, -0.0282534534900112, 
0.0119957450079324, -0.019571951737646, -0.0168558039774203, 
-0.0218682468675704, 0.010986868435193, 0.00623841086741272, 
0, 0.0260568262971015, -0.0167785024342884, -0.0046353562978485, 
0.00617110423899714, -0.00308772563813342, 0.00308772563813342, 
0.025766672432173, -0.0135648639295098, 0.00756476484650648, 
-0.00603631769659518, -0.00608669580781829, 0.0181231125874168, 
-0.0150819726937499, 0.0045580547547166, 0.0135239538217764, 
0.00594649918772649, 0.00296004162847652, 0, -0.0179066757819495, 
0.0398580862181479, 0.00144470138686437, -0.0204315317053574, 
0, -0.00148783852519152, 0.0132065679852476, 0.0259011384240697, 
0.00143081644809495, -0.00855138897808327, -0.00143021909304952, 
-0.0173503468010319, -0.00877614740995813, -0.0133237598228533, 
0, -0.0150138969829294, -0.00607746863551384, 0.0358831757773652, 
0.00877642986257587, 0, -0.0295548721011007, 0.00149665303570679, 
0.00894653507046961, 0.0132649050657818, -0.0103048634373053, 
0.00443695349100448, -0.0118535511460545, -0.0135036823406951, 
0.0268274421635484, 0.00877614740995813, 0.0187805658940814, 
-0.0202379188904631, 0.00582977447775868, -0.0175789894666245, 
0.00443042057528942, 0.0203721059791455, 0.0143051798548823, 
-0.00569239604256655, -0.00572498501310204, -0.00288779879921375, 
-0.00144053908850417, 0.00719492732034865, 0.00143021909304952, 
0.00427842492482444, 0.0113331189354406, 0.0153715651969968, 
-0.0267046841324374, -0.00285024843740267, 0, -0.00429066313026638, 
0.0156559472184012, 0.0278544400437779, -0.0166168754292784, 
0.00278663600103979, 0.0124570291793562, -0.00275209059726578, 
0.0109754865666725, -0.00136382925358358, 0.00272580100995512, 
-0.012337458323044, 0.00686607713930343, 0.00136757633846862, 
-0.00685666142910613, 0.0190935708950324, 0, 0, 0.00136125548202859, 
-0.0108563318590544, -0.0179015329138661, 0, 0.00554715557660046, 
0.0137163490936372, -0.00821215747689585, -0.0292664653895374, 
0)

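There are 260 values.

When I take the mean of these values in R, I get 0.

But it should not be 0. When I take this data set and average it in another tool (such as Excel), it returns a non-zero value. I am guessing this is because the result of the average is a very small number......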
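To give a sense of scale, a mean on the order of 1e-19 is far below the resolution of double-precision arithmetic for values around 0.01. The sketch below is only an illustration: `m` is a hypothetical stand-in for such a tiny mean, not a value taken from `df`. It shows a tolerance-based check rather than a comparison against an exact 0.

```r
# Hypothetical stand-in for a tiny mean like the one discussed here.
m <- 1e-19

m == 0                        # FALSE: the value is not exactly zero
isTRUE(all.equal(m, 0))       # TRUE: equal within all.equal()'s default tolerance
abs(m) < .Machine$double.eps  # TRUE: smaller than machine epsilon (about 2.2e-16)

2^-63                         # prints 1.084202e-19 at the default 7 digits
```

The last line is included because 2^-63 prints with the same leading digits as the non-zero mean reported in this thread, which is what accumulated rounding error typically looks like.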

Other findings:

I played around with some of the numeric options. Changing the digits option did not change my output the way I expected.

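I found that running a <- c(0, 0, -0.00426086269980885, -0.0171883361787684...) followed by mean(a) on the data provided results in 1.0842021724855044e-19.

What I am actually doing is taking the mean this way: mean(df$variable[1:260]), which is the same set of data as provided. The result is 0!!!

When I run identical(var1, var2), where var1 contains c(0, 0, -0.00426086269980885, -0.0171883361787684...) and var2 contains df$variable[1:260], it returns FALSE....

Running a quick loop to check whether the values are identical row by row gives multiple FALSEs.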
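The row-by-row check can be done without a loop. The sketch below uses two small hypothetical vectors (not the data from `df`) that differ only in the last bits, and contrasts the bit-for-bit test of `identical()` with the tolerance-based `all.equal()`.

```r
# Hypothetical vectors: the first elements differ only in the last bits.
var1 <- c(0.1 + 0.2, 0.5, 1/3)
var2 <- c(0.3,       0.5, 1/3)

identical(var1, var2)          # FALSE: bit-for-bit comparison
isTRUE(all.equal(var1, var2))  # TRUE: equal up to numerical tolerance
which(var1 != var2)            # positions that are not exactly equal
max(abs(var1 - var2))          # size of the largest difference
```

If the largest difference is around 1e-16 or smaller relative to the values, the two vectors hold the same data up to floating-point representation.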

I have made var1 and var2 identical:

f <- dput(df$var[1:260])  # this is f
g <- df$var[1:260]        # this is g
identical(f, g)
# returns TRUE

mean(f) #returns 0
mean(g) #returns 0

This still returns 0.

However,

mean(c(0, 0, -0.00426086269980885, -0.0171883361787684, 0, 0.00144261723538186, 
       -0.00433411019126062, 0.00144679156901439, 0.0100822459425949, 
       0.0099816080711328, -0.0157359963029773, -0.00288315632388603, 
       0.00432162324420649, -0.00721311620008525, 0.0215106455827421, 
       -0.0071232124906615, -0.00717431689199532, -0.0101129941384546, 
       -0.0072998774915094, -0.00734028331959991, -0.0029645801177054, 
       -0.0361398560956312, 0.059535967918694, -0.014555000996034, -0.0237432369256014, 
       0.00897325497629353, -0.0134834715407033, -0.00453065081556225, 
       0.00904086737997201, 0, 0.00897325497629353, 0.0103658246132836, 
       0.00293825901286748, 0.0014658983231568, 0.0131102996091697, 
       -0.00434666950738372, 0.00867452721813411, 0.00575438823184449, 
       0.00713072754549238, -0.00570050845244285, 0.032367744813719, 
       -0.0224003315648895, -0.00426690479638658, -0.00856299509528746, 
       -0.0320803144128314, -0.00593767207121676, -0.00898667487078519, 
       0.042682540672784, 0, -0.00867452721813411, 0.0172873110304499, 
       -0.0100533229008199, 0.0200065816431954, -0.0113814352297972, 
       0.0156115438602651, 0.0153715651969968, 0, 0.0246591214936727, 
       0.00540492616309773, -0.00134546290274429, -0.00541222291310994, 
       0.0187980455874044, 0.00793919867215731, -0.0159539973780527, 
       -0.046651009916145, 0.00421308190323977, 0.0111281244997983, 
       0.0055117761038721, -0.00137509854859985, 0.020454826377061, 
       0.00269273851511898, -0.0135490703741734, -0.00410940942736904, 
       -0.0193576517224336, 0.015243665180396, -0.00137509854859985, 
       0.0123205855863757, -0.0095722767888935, -0.00828015580561559, 
       -0.00277119138772619, -0.00138848083750265, 0.00555484167808995, 
       0.0137351720732788, -0.0347046257643022, 0, -0.0142236796686062, 
       -0.00575438823184449, 0, 0, -0.00867452721813411, -0.0146401608111093, 
       0.0218741489407392, -0.0116127724770649, -0.00878900309037522, 
       0.0102463560867569, -0.0161622351615947, 0.010320036410862, 0.0188766242402405, 
       -0.00721311620008525, -0.00435297650073618, -0.00292568578506813, 
       -0.00585074407750774, 0.0014658983231568, -0.0147699819493079, 
       0, 0.00148551649006734, -0.0164769253635333, 0.0209202799957451, 
       -0.00444335463221179, 0.00740793474991719, 0, -0.00740793474991719, 
       0, 0.00296442904837857, -0.00593767207121676, -0.00298210956584821, 
       0.026550349587346, 0, 0.0272796463542244, -0.00141602427074616, 
       -0.0171765353914815, -0.00578299799930626, 0.00144888780804564, 
       0.00144679156901439, -0.00579976806975058, 0.0144497464625504, 
       -0.0144497464625504, -0.00146833278868641, 0.0187805658940814, 
       -0.01006776364878, -0.00579137089424808, 0.0384939243098099, 
       0.0357196284321315, -0.0094943601026829, 0.0202220805588311, 
       -0.0147926677568835, 0.00944319295810203, -0.0148726057600497, 
       0.0108416357150576, 0, -0.0176824660638335, -0.00965148293373552, 
       -0.00555008362262299, -0.0126141662743384, -0.0099250699919331, 
       0.0113331189354406, 0.0112061173308309, -0.0126141662743384, 
       -0.0257159599524019, -0.0428716904319435, -0.0198406947735075, 
       -0.0249516254684341, 0.0493093277118515, -0.00906816810262079, 
       -0.00455824865790699, -0.00917930778590481, 0.0480501078992965, 
       0.0416403776341081, -0.0416403776341081, -0.0282534534900112, 
       0.0119957450079324, -0.019571951737646, -0.0168558039774203, 
       -0.0218682468675704, 0.010986868435193, 0.00623841086741272, 
       0, 0.0260568262971015, -0.0167785024342884, -0.0046353562978485, 
       0.00617110423899714, -0.00308772563813342, 0.00308772563813342, 
       0.025766672432173, -0.0135648639295098, 0.00756476484650648, 
       -0.00603631769659518, -0.00608669580781829, 0.0181231125874168, 
       -0.0150819726937499, 0.0045580547547166, 0.0135239538217764, 
       0.00594649918772649, 0.00296004162847652, 0, -0.0179066757819495, 
       0.0398580862181479, 0.00144470138686437, -0.0204315317053574, 
       0, -0.00148783852519152, 0.0132065679852476, 0.0259011384240697, 
       0.00143081644809495, -0.00855138897808327, -0.00143021909304952, 
       -0.0173503468010319, -0.00877614740995813, -0.0133237598228533, 
       0, -0.0150138969829294, -0.00607746863551384, 0.0358831757773652, 
       0.00877642986257587, 0, -0.0295548721011007, 0.00149665303570679, 
       0.00894653507046961, 0.0132649050657818, -0.0103048634373053, 
       0.00443695349100448, -0.0118535511460545, -0.0135036823406951, 
       0.0268274421635484, 0.00877614740995813, 0.0187805658940814, 
       -0.0202379188904631, 0.00582977447775868, -0.0175789894666245, 
       0.00443042057528942, 0.0203721059791455, 0.0143051798548823, 
       -0.00569239604256655, -0.00572498501310204, -0.00288779879921375, 
       -0.00144053908850417, 0.00719492732034865, 0.00143021909304952, 
       0.00427842492482444, 0.0113331189354406, 0.0153715651969968, 
       -0.0267046841324374, -0.00285024843740267, 0, -0.00429066313026638, 
       0.0156559472184012, 0.0278544400437779, -0.0166168754292784, 
       0.00278663600103979, 0.0124570291793562, -0.00275209059726578, 
       0.0109754865666725, -0.00136382925358358, 0.00272580100995512, 
       -0.012337458323044, 0.00686607713930343, 0.00136757633846862, 
       -0.00685666142910613, 0.0190935708950324, 0, 0, 0.00136125548202859, 
       -0.0108563318590544, -0.0179015329138661, 0, 0.00554715557660046, 
       0.0137163490936372, -0.00821215747689585, -0.0292664653895374, 
       0))

returns 1.0842021724855044e-19

The c(...) comes from dput(df$var[1:260])
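One possible explanation, offered as an assumption rather than something confirmed in this thread: by default `dput()` writes numbers with 15 significant digits, so pasting its output back into R can recreate values that differ from the originals in the last bits. The sketch below shows the effect on a single hypothetical value, using `deparse()`, which produces the same text that `dput()` prints.

```r
# Hypothetical value whose 16th and 17th significant digits matter.
x <- 0.1 + 0.2

txt15 <- deparse(x)                 # default deparse: 15 significant digits
x15   <- eval(parse(text = txt15))  # what pasting the dput() output recreates

txt15               # "0.3"
identical(x, x15)   # FALSE: the recreated value is not bit-for-bit the same
x - x15             # tiny, on the order of 1e-17
```

If that is what happened here, the column itself may average to exactly 0 while the pasted copy averages to a tiny non-zero number. For an exact copy, `dput(x, control = "digits17")` or `control = "hexNumeric"` (see `?.deparseOpts`) request more precise output, and `saveRDS()` / `readRDS()` avoid the text round trip altogether.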

Not quite sure what to do next......

Trimming seems to give a non-zero result, but that is not the approach I want...
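For reference, this is what the `trim` argument of `mean()` does. A minimal sketch on hypothetical data (not the values from this question):

```r
x <- c(-100, 1, 2, 4, 100)  # hypothetical data with two extreme values

mean(x)              # 1.4: plain arithmetic mean
mean(x, trim = 0.2)  # drops one value from each end: mean(c(1, 2, 4))
mean(x, trim = 0.5)  # trim >= 0.5 returns the median: 2
median(x)            # 2
```

Because trimming discards observations before averaging, it answers a different question than the plain mean, which is why it changes the result.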

【Question comments】:

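I get mean(v1) # [1] 1.084202e-19. Possibly on your system, because this is such a small number, it gets rounded to 0. Or, if this is the print format of a tibble, then it is just print-format output.

How did you get that? Maybe you used a second argument, e.g. mean(x, 1), which would return 0? For me, mean(x) also gives [1] 1.084202e-19.

I used mean(x), it is that simple...

In R, I get 1.084202e-19. When I copy that data into Excel, I get 9.34082E-20. When I copy that data from Excel back into R, I get 3.846154e-12. So there is rounding error in moving data between R and Excel. The only way I can get R to return 0 is to first set options(digits=0), which never changes the value, only what R prints to the console (see pi for a clear example). While I understand the frustration of different tools returning different values, I would hardly take Excel's answer as the literal truth (just the popular truth).

Hi all, what is the result of running getOption("digits")?

【Answer 1】: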

It is very close to 0. Here are some ways to manage the output:

x <- mean(c(0, 0, -0.00426086269980885, -0.0171883361787684,..... ))
> x
[1] 1.0842021724855044e-19
formatC(x, digits = 21, format = "f")
#You could also run below before mean()
options(digits = 21)

# generally good to ignore Excel for higher levels of accuracy - but R has its own nuances too,
# I forget but you could do a search on 'R decimal places' to understand the limitations. 
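The digits option only changes how a number is printed, not the value that is stored. A small sketch using pi makes this visible:

```r
old <- options(digits = 3)  # change only the print precision
print(pi)                   # prints 3.14
pi == 3.14                  # FALSE: the stored value is unchanged
sprintf("%.15f", pi)        # "3.141592653589793"
options(old)                # restore the previous setting
```

So if a mean prints as 0 at every digits setting, the stored value is likely exactly 0 rather than a small number hidden by rounding in the display.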

Added in response to your latest note:

> x <- mean(c(0, 0, -0.00426086269980885, -0.0171883361787684,..... ))
> x
[1] 1.084202e-19

> options(digits = '21')
> x
[1] 1.0842021724855044e-19

> formatC(x, digits = 40, format = "f")
[1] "0.0000000000000000001084202172485504434007"

# So also, for 
options(digits = 22)
y <- c(0, 0, -0.00426086269980885, -0.0171883361787684, 0, 0.00144261723538186,....)

> mean(y)
[1] 1.0842021724855044e-19

> formatC(mean(y), digits = 40, format = "f")
[1] "0.0000000000000000001084202172485504434007"

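The code above all works fine... not sure what the problem might be. Try restarting R(?), or try copying and pasting the data into a plain text file and then back into R(?) - I assume you are running this in R or RStudio...

【Comments】:

Strangely, running the mean after options(digits = 21) gives 0, and using formatC() still gives 0, just with more decimals... the places where 0s are returned are still the same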
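A result that stays 0 even with formatC() suggests the mean of the column is exactly zero, not merely displayed as zero. The sketch below shows how that can be checked directly. The vector `g` is a hypothetical column whose values cancel exactly, not the data from `df`.

```r
g <- c(0.25, -0.75, 0.5)  # hypothetical values that cancel exactly
m <- mean(g)

m == 0                                 # TRUE only if the mean is exactly zero
formatC(m, digits = 40, format = "f")  # all zeros when m is exactly 0
```

If `m == 0` is TRUE for the real column, no print setting will reveal extra digits, because there are none to show.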
