Econometrics: how to compute R-squared and the F-statistic
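As shown in the figure, I know that F = (ESS/k)/(RSS/(n-k-1)) and that R-squared = ESS/TSS = 1 - RSS/TSS. But how do I find ESS? Please give me the specific formula and the steps to solve it; a generous reward awaits!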
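Answer A: RSS = 342.5486 and the F-statistic is 87.3339. With N = 10, your degrees of freedom are 8 and K is 2, so you can solve the F formula for ESS. Do you remember the formula for adjusted R-squared? Since adjusted R-squared = 0.9504, you can recover R-squared from it.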
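A minimal sketch of the back-calculation the answer describes, using only the numbers quoted in it. It assumes n = 10 observations and k = 2 regressors excluding the intercept, so the residual degrees of freedom are n - k - 1 = 7. That reading is an assumption, since the regression output in the figure is not reproduced here; it is chosen because it reproduces the quoted adjusted R-squared of about 0.9504, which 8 degrees of freedom does not. The function names are illustrative.

```python
def ess_from_f(f_stat, rss, k, df_resid):
    """Solve F = (ESS / k) / (RSS / df_resid) for ESS."""
    return f_stat * k * rss / df_resid


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


rss = 342.5486       # residual sum of squares, from the answer
f_stat = 87.3339     # F-statistic, from the answer
n, k = 10, 2         # assumption: k counts slope coefficients only
df_resid = n - k - 1 # assumption: 7 residual degrees of freedom

ess = ess_from_f(f_stat, rss, k, df_resid)
tss = ess + rss                 # TSS = ESS + RSS
r2 = ess / tss                  # equivalently 1 - RSS/TSS
adj_r2 = adjusted_r2(r2, n, k)

print(f"ESS = {ess:.2f}, TSS = {tss:.2f}")
print(f"R-squared = {r2:.4f}, adjusted R-squared = {adj_r2:.4f}")
```

Going the other way, the adjusted R-squared formula can be inverted as R-squared = 1 - (1 - adjusted R-squared)(n - k - 1)/(n - 1), which is the route the answer hints at.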
python sklearn multiple linear regression display r-squared
Posted: 2017-06-21 09:13:00
Question: I have fitted my multiple linear regression and I want to see the adjusted R-squared. I know the score function shows me R-squared, but it is not adjusted.
import pandas as pd #import the pandas module
import numpy as np
df = pd.read_csv('/Users/jeangelj/Documents/training/linexdata.csv', sep=',')
df
    AverageNumberofTickets  NumberofEmployees  ValueofContract       Industry
 0                       1                 51            25750         Retail
 1                       9                 68            25000       Services
 2                      20                 67            40000       Services
 3                       1                124            35000         Retail
 4                       8                124            25000  Manufacturing
 5                      30                134            50000       Services
 6                      20                157            48000         Retail
 7                       8                190            32000         Retail
 8                      20                205            70000         Retail
 9                      50                230            75000  Manufacturing
10                      35                265            50000  Manufacturing
11                      65                296            75000       Services
12                      35                336            50000  Manufacturing
13                      60                359            75000  Manufacturing
14                      85                403            81000       Services
15                      40                418            60000         Retail
16                      75                437            53000       Services
17                      85                451            90000       Services
18                      65                465            70000         Retail
19                      95                491           100000       Services
from sklearn.linear_model import LinearRegression
model = LinearRegression()
X, y = df[['NumberofEmployees','ValueofContract']], df.AverageNumberofTickets
model.fit(X, y)
model.score(X, y)
>>0.87764337132340009
I checked it by hand: 0.87764 is the R-squared, whereas 0.863248 is the adjusted R-squared.
Answer 1: There are many ways to compute R^2 and adjusted R^2; here are a few of them, computed on the data you provided:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
X, y = df[['NumberofEmployees','ValueofContract']], df.AverageNumberofTickets
model.fit(X, y)
# SST = SSR + SSE (see the reference definitions)
# compute with formulas from the theory
yhat = model.predict(X)
SS_Residual = sum((y - yhat)**2)
SS_Total = sum((y - np.mean(y))**2)
r_squared = 1 - SS_Residual / SS_Total
# n = len(y) observations, p = X.shape[1] regressors (intercept not counted)
adjusted_r_squared = 1 - (1 - r_squared) * (len(y) - 1) / (len(y) - X.shape[1] - 1)
print(r_squared, adjusted_r_squared)
# 0.877643371323 0.863248473832
# compute with sklearn linear_model; I could not find a function in the documentation that computes adjusted R^2 directly
print(model.score(X, y), 1 - (1 - model.score(X, y)) * (len(y) - 1) / (len(y) - X.shape[1] - 1))
# 0.877643371323 0.863248473832
Another way:
# compute with statsmodels, adding the intercept manually
import statsmodels.api as sm
X1 = sm.add_constant(X)
result = sm.OLS(y, X1).fit()
# print(dir(result))
print(result.rsquared, result.rsquared_adj)
# 0.877643371323 0.863248473832
Another way:
# compute with statsmodels, another way, using the formula interface
import statsmodels.formula.api as smf
result = smf.ols(formula="AverageNumberofTickets ~ NumberofEmployees + ValueofContract", data=df).fit()
# print(result.summary())
print(result.rsquared, result.rsquared_adj)
# 0.877643371323 0.863248473832
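Since the page title also asks about the F-statistic, here is a small sketch of how the overall F-statistic follows from the R^2 reported in the question. It relies on the identity F = (R^2/k) / ((1 - R^2)/(n - k - 1)), which is F = (ESS/k)/(RSS/(n - k - 1)) with numerator and denominator divided by TSS. The values n = 20 and k = 2 are read off the data in the question; the function name is illustrative, not a library API.

```python
def f_from_r2(r2, n, k):
    """Overall F-statistic of an OLS fit with intercept, computed from R^2.

    n is the number of observations, k the number of regressors
    (intercept not counted).
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


r2 = 0.87764337132340009  # model.score(X, y) as reported in the question
n, k = 20, 2              # 20 rows, 2 regressors

f_stat = f_from_r2(r2, n, k)
print(f"F({k}, {n - k - 1}) = {f_stat:.2f}")
```

With the statsmodels fits, the same quantity should be available as result.fvalue, which makes a convenient cross-check.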
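Comments:
FYI, you can use model.coef_ in the formula instead of X.shape[1]. It is clearer to interpret that way.
Thank you very much!
@ManuelG That is not correct, even if you mean len(model.coef_) (which I assume you do); that would also include the constant term of the LR, which is not the case.
You can also do this with from sklearn.metrics import explained_variance_score, r2_score: explained_variance_score for r^2 and r2_score for adjusted r^2.
Answer 2: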
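from sklearn.linear_model import LinearRegression
# x_train and y_train are assumed to be defined already
regressor = LinearRegression(fit_intercept=False)
regressor.fit(x_train, y_train)
print(f'r_sqr value: {regressor.score(x_train, y_train)}')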
Comments:
Consider adding more detail or an explanation to your answer.
I'm not sure how this answer helps.