Statsmodels -- Weights In Robust Linear Regression
Solution 1:
RLM currently does not allow user-specified weights. Weights are used internally only to implement the iteratively reweighted least squares fitting method.
If the weights have the interpretation of variance weights to account for different variances across observations, then rescaling the data, both endog y and exog x, in analogy to WLS will produce the weighted parameter estimates.
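WLS uses this in its whiten method to rescale y and x:

    def whiten(self, X):
        X = np.asarray(X)
        if X.ndim == 1:
            # 1-D input (e.g. endog): scale each observation
            return X * np.sqrt(self.weights)
        elif X.ndim == 2:
            # 2-D input (e.g. exog): scale each row, broadcasting over columns
            return np.sqrt(self.weights)[:, None] * X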
I'm not sure whether all extra results that are available will be appropriate for the rescaled model.
Edit: follow-up based on comments
In WLS, the equivalence W*(Y_est - Y)^2 = (sqrt(W)*Y_est - sqrt(W)*Y)^2 means that the parameter estimates are the same regardless of how the weights are interpreted.
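A quick numerical check of this equivalence, using plain NumPy instead of statsmodels so the two computations stay visibly separate. The data and names here are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
w = rng.uniform(0.5, 3.0, size=n)  # arbitrary positive weights

# Weighted least squares via the weighted normal equations (X'WX) b = X'Wy
b_weighted = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# Ordinary least squares on data rescaled by sqrt(w)
sw = np.sqrt(w)
b_rescaled, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)

print(np.allclose(b_weighted, b_rescaled))
```

Both routes minimize the same sum of squares, which is why the estimates agree up to floating point error.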
In RLM we have a nonlinear objective function g((y - y_est) / sigma), for which this equivalence does not hold in general:
fw * g((y - y_est) / sigma) != g((y - y_est) * sw / sigma)
where fw are frequency weights, sw are scale or variance weights, and sigma is the estimated scale or standard deviation of the residual. (In general, we cannot find sw that would correspond to the fw.)
That means that in RLM we cannot use rescaling of the data to account for frequency weights.
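To make this concrete, here is a small sketch with Huber's rho function standing in for g. The threshold t = 1.345 is the usual default, and the function name huber_rho is just my own helper for this example. With fw = 2, the choice sw = sqrt(2) reproduces the frequency weight for a small residual, where g is quadratic, but not for a large residual, where g is linear, so no single sw works for all observations.

```python
import math

def huber_rho(z, t=1.345):
    """Huber's rho: quadratic near zero, linear in the tails."""
    if abs(z) <= t:
        return 0.5 * z**2
    return t * abs(z) - 0.5 * t**2

fw = 2.0             # frequency weight: the observation counts twice
sw = math.sqrt(fw)   # the scale weight that would work in the least squares case

# Small standardized residual: both sides fall in the quadratic region
small = 0.5
small_lhs = fw * huber_rho(small)
small_rhs = huber_rho(small * sw)

# Large standardized residual: both sides fall in the linear region
large = 3.0
large_lhs = fw * huber_rho(large)
large_rhs = huber_rho(large * sw)

print(math.isclose(small_lhs, small_rhs))  # equal in the quadratic region
print(math.isclose(large_lhs, large_rhs))  # not equal in the linear region
```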
Aside: The current development in statsmodels is to add different weight categories to GLM, in order to establish a pattern that can then be added to other models. The goal is to offer, similar to Stata, at least freq_weights, var_weights and prob_weights as options in the models.