Multiple Linear Regression In Pandas Statsmodels: ValueError
Data: https://courses.edx.org/c4x/MITx/15.071x_2/asset/NBA_train.csv
I know how to fit these data to a multiple linear regression model using statsmodels.formula.api: import pandas
Solution 1:
When using sm.OLS(y, X), y is the dependent variable and X holds the independent variables.
In the formula W ~ PTS + oppPTS, W is the dependent variable and PTS and oppPTS are the independent variables.
Therefore, use
y = NBA['W']
X = NBA[['PTS', 'oppPTS']]
instead of
X = NBA['W']
y = NBA[['PTS', 'oppPTS']]
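A quick way to catch the swap before fitting is to check the shapes: y should be a single column and X a two-dimensional table with the same number of rows. The sketch below is only an illustration; check_ols_inputs is a hypothetical helper (it is not part of pandas or statsmodels), and the three-row frame holds made-up numbers standing in for NBA_train.csv.

```python
import pandas as pd


def check_ols_inputs(y, X):
    """Hypothetical helper: raise ValueError if y and X look swapped."""
    if getattr(y, "ndim", 1) != 1:
        raise ValueError("y should be a single column (the dependent variable)")
    if getattr(X, "ndim", 1) != 2:
        raise ValueError("X should be a 2-D table of independent variables")
    if len(y) != len(X):
        raise ValueError("y and X need the same number of rows")


# Made-up stand-in for the real data set (assumption, not actual NBA values).
NBA = pd.DataFrame({
    "W": [50, 30, 45],
    "PTS": [8500, 7900, 8300],
    "oppPTS": [8100, 8300, 8200],
})

y = NBA["W"]                  # dependent variable: one column
X = NBA[["PTS", "oppPTS"]]    # independent variables: two columns
check_ols_inputs(y, X)        # correct order, no exception

try:
    check_ols_inputs(X, y)    # swapped order
    swapped_ok = True
except ValueError as err:
    swapped_ok = False
    print("rejected:", err)
```

The check relies only on the ndim attribute and len, so it should work for pandas objects and NumPy arrays alike.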
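Putting it together:
import pandas as pd
import statsmodels.api as sm

NBA = pd.read_csv("NBA_train.csv")
y = NBA['W']                  # dependent variable
X = NBA[['PTS', 'oppPTS']]    # independent variables
X = sm.add_constant(X)        # add the intercept column; sm.OLS does not add one
model11 = sm.OLS(y, X).fit()
model11.summary()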
yields
                            OLS Regression Results
==============================================================================
Dep. Variable:                      W   R-squared:                       0.942
Model:                            OLS   Adj. R-squared:                  0.942
Method:                 Least Squares   F-statistic:                     6799.
Date:                Sat, 21 Mar 2015   Prob (F-statistic):               0.00
Time:                        14:58:05   Log-Likelihood:                -2118.0
No. Observations:                 835   AIC:                             4242.
Df Residuals:                     832   BIC:                             4256.
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const         41.3048      1.610     25.652      0.000        38.144    44.465
PTS            0.0326      0.000    109.600      0.000         0.032     0.033
oppPTS        -0.0326      0.000   -110.951      0.000        -0.033    -0.032
==============================================================================
Omnibus:                        1.026   Durbin-Watson:                   2.238
Prob(Omnibus):                  0.599   Jarque-Bera (JB):                0.984
Skew:                           0.084   Prob(JB):                        0.612
Kurtosis:                       3.009   Cond. No.                     1.80e+05
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.8e+05. This might indicate that there are strong multicollinearity or other numerical problems.