sklearn multiclass svm function

Ask Time:2018-04-16T09:22:55         Author:april

I have multiclass labels and want to compute the accuracy of my model.
I am confused about which sklearn function I need to use. As far as I understand, the code below is only used for binary classification:

# dividing X, y into train and test data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# training a linear SVM classifier
from sklearn.svm import SVC
svm_model_linear = SVC(kernel='linear', C=1).fit(X_train, y_train)
svm_predictions = svm_model_linear.predict(X_test)

# model accuracy for X_test
accuracy = svm_model_linear.score(X_test, y_test)
print(accuracy)

As I understood from the linked question, "Which decision_function_shape for sklearn.svm.SVC when using OneVsRestClassifier?", for multiclass classification I should use OneVsRestClassifier with decision_function_shape (either ovr or ovo, checking which one works better):

from sklearn.multiclass import OneVsRestClassifier
svm_model_linear = OneVsRestClassifier(SVC(kernel='linear', C=1, decision_function_shape='ovr')).fit(X_train, y_train)

The main problem is that prediction time matters to me, but it takes about 1 minute to run the classifier and predict the data (on top of the time for feature reduction such as PCA, which also takes some time). Any suggestions to reduce the time for the multiclass SVM classifier?

Author:april, reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/49848453/sklearn-multiclass-svm-function
Vivek Kumar :

There are multiple things to consider here:

1) OneVsRestClassifier separates out all the labels and trains multiple SVM objects (one for each label) on the given data. So each time, only binary data is supplied to a single SVM object.

2) SVC internally uses libsvm, which has an 'OvO' strategy for multi-class output. But this point is of no use because of point 1: libsvm will only ever get binary data.

Even if it did get multi-class data, decision_function_shape is not taken into account during training; it only changes the shape of the decision function output. So it does not matter whether you provide decision_function_shape = 'ovr' or decision_function_shape = 'ovo'.

So it seems that you are looking at the problem wrong. decision_function_shape should not affect the speed. Try standardizing your data before fitting. SVMs work well with standardized data.
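
To make these points concrete, here is a minimal sketch, not a definitive recipe. It assumes the bundled digits dataset (10 classes) as a stand-in for your data, and the names `results` and `model` are only illustrative. It fits a plain SVC directly on multiclass labels behind a StandardScaler, once per decision_function_shape, and records the predictions, the shape of the decision function output, and the accuracy.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the digits dataset (10 classes) stands in for the asker's X, y.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

results = {}
for shape in ("ovr", "ovo"):
    # Standardize first, then fit SVC directly on the multiclass labels
    # (no OneVsRestClassifier wrapper is needed).
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel="linear", C=1, decision_function_shape=shape),
    )
    model.fit(X_train, y_train)
    results[shape] = {
        "pred": model.predict(X_test),
        "scores_shape": model.decision_function(X_test).shape,
        "accuracy": model.score(X_test, y_test),  # mean accuracy on X_test
    }

# Same predictions and accuracy either way; only the score layout differs.
print(np.array_equal(results["ovr"]["pred"], results["ovo"]["pred"]))
print(results["ovr"]["scores_shape"], results["ovo"]["scores_shape"])
print(results["ovr"]["accuracy"], results["ovo"]["accuracy"])
```

With the default settings, both runs should give the same predictions. The only visible difference is the decision function output: one column per class for 'ovr', and one column per pair of classes for 'ovo'. Whether standardizing actually shortens your fit time depends on your data, so time it on your own X and y.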
2018-04-16T10:33:51