Speed Of SVM Kernels? Linear Vs RBF Vs Poly
Solution 1:
Did you scale your data?
This can become an issue with SVM's. According to A Practical Guide to Support Vector Classification
Because kernel values usually depend on the inner products of feature vectors, e.g. the linear kernel and the polynomial kernel, large attribute values might cause numerical problems.
Now for an example, I will use the sklearn breast cancer dataset:
from time import time
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
data = load_breast_cancer()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
clf_lin = SVC(kernel='linear', C=1.0, gamma=0.1)
clf_rbf = SVC(kernerl='rbf', C=1.0, gamma=0.1)
start = time()
clf_lin.fit(X_train, y_train)
print("Linear Kernel Non-Normalized Fit Time: {0.4f} s".format(time() - start))
start = time()
clf_rbf.fit(X_train, y_train)
print("RBF Kernel Non-Normalized Fit Time: {0.4f} s".format(time() - start))
scaler = MinMaxScaler() # Default behavior is to scale to [0,1]
X = scaler.fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y)
start = time()
clf_lin.fit(X_train, y_train)
print("Linear Kernel Normalized Fit Time: {0.4f} s".format(time() - start))
start = time()
clf_rbf.fit(X_train, y_train)
print("RBF Kernel Normalized Fit Time: {0.4f} s".format(time() - start))
Outputs:
Linear Kernel Non-Normalized Fit Time: 0.8672
RBF Kernel Non-Normalized Fit Time: 0.0124
Linear Kernel Normalized Fit Time: 0.0021
RBF Kernel Normalized Fit Time: 0.0039
So you can see that in this dataset with shape (560, 30) we get a pretty drastic improvement in performance from a little scaling.
This behavior is dependent upon the features with large values. Think about working in infinitely dimensional space. As the values that you populate that infinitely dimensional space with get larger the space between their multidimensional products gets a lot bigger. I cannot stress that a lot enough. Read about The Curse of Dimensionality, and do read more than just the wiki entry I linked. This spacing is what makes the process take longer. The mathematics behind trying to separate the classes in this massive space just get drastically more complex, especially as the number of features and observations grow. Thus it is critical to always scale your data. Even if you are just doing a simple linear regression it is a good practice as you will remove any possible bias towards features with larger values.
Post a Comment for "Speed Of SVM Kernels? Linear Vs RBF Vs Poly"