Introduction to Support Vector Machines (SVM)


Introduction to Support Vector Machines (SVM)

Support Vector Machines (SVM) is a potent technique in the field of machine learning that can be applied to both classification and regression analysis. It is widely utilized in many areas, including bioinformatics, text classification, and picture classification. The capacity of SVM to handle high-dimensional datasets and non-linear classification issues is its main benefit. The idea of SVM will be introduced in this article, along with a description of how to use it in Python.

Support Vector Machines (SVM)

Definition

A machine learning algorithm called Support Vector Machines (SVM) identifies the ideal hyperplane for classifying data. The margin between the two classes is maximized by measuring the distance between the hyperplane and the nearest data points from each class. Support vectors are the data points most closely related to the hyperplane. By applying kernel functions to translate the input data into a higher-dimensional feature space where a linear hyperplane may be utilized to divide the classes, SVM can handle both linear and non-linear classification issues.

Syntax

from sklearn import svm
clf = svm.SVC(kernel='linear')
clf.fit(X_train, y_train)

Explanation of Syntax

  • ‘from sklearn import svm’ − The SVM module is imported from the Scikit-Learn library by this line.

  • ‘clf = svm.SVC(kernel=’linear’) − An SVM classifier is initialised with a linear kernel in this line. Polynomial, radial basis function (RBF), and sigmoid are some more kernel functions that can be used.

  • ‘clf.fit(X_train, y_train)’ − On the basis of the training data (X_train) and matching class labels (y_train), this line trains the SVM classifier.

Algorithm

  • Step 1: Data preprocessing − Any missing or superfluous features are removed from the supplied data by preprocessing.

  • Step 2: Feature Extraction − The input data is turned into a set of features that may be used to train the SVM classifier through the feature extraction process.

  • Step 3: Training the SVM Classifier − The input data are used to train the SVM classifier to identify the best hyperplane for categorizing the data.

  • Step 4: Testing the SVM Classifier − A collection of validation data is used to test the SVM classifier and assess its performance.

  • Step 5: Tuning the SVM Classifier − The SVM Classifier is tuned by adjusting its hyperparameters in order to enhance its performance on the validation data.

Approach

  • Approach 1 − Program to show Linear SVM

  • Approach 2 − Program to show Non Linear SVM

Approach 1: Program to show Linear SVM

Linear SVM is used when the input data can be separated by a linear hyperplane.Below is the code for same.

Example

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = svm.SVC(kernel='linear')
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Output

Accuracy: 1.0
  • Load the Dataset − We can use the scikit-learn library to load the dataset. The iris dataset is a popular dataset that can be used for classification tasks.

  • Split the Dataset into Training and Testing Sets − The dataset is split into a training set and a testing set using the train_test_split function from the scikit-learn library.

  • Create an SVM Classifier with a Non-Linear Kernel − We create an SVM classifier with a non-linear kernel using the svm. SVM function from the scikit-learn library.

  • Train the SVM Classifier − The SVM classifier is trained on the training data using the fit method.

  • Test the SVM Classifier − The SVM classifier is tested on the testing data using the predict method.

Approach 2: Program to show Non Linear SVM

When a linear hyperplane cannot be used to separate the input data, non-linear SVM is used. In this method, the input data are transformed into a higher-dimensional feature space using the kernel trick so that the data can be separated using a linear hyperplane.

Example

from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.metrics import accuracy_score

X, y = make_circles(n_samples=100, noise=0.1, factor=0.5, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = svm.SVC(kernel='rbf')
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
accuracy = accuracy score(y_test, y_pred)
print(“Accuracy:”,accuracy)

Output

Accuracy: 0.9
  • Load the Dataset −To create a toy dataset with two classes that cannot be separated by a linear hyperplane, we can use the scikit-learn library's make_circles method.

  • Create Training and Testing Sets from the Dataset − The train_test_split function from the scikit-learn library is used to divide the dataset into a training set and a testing set.

  • Incorporate a Non-Linear Kernel into an SVM Classifier − Using the svm.SVC function from the scikit-learn library, we build an SVM classifier that has a non-linear kernel. In this illustration, the RBF kernel is used.

  • Train the SVM Classifier − The fit approach is used to train the SVM classifier on the training data.

  • Test the SVM Classifier − The predict method is used to test the SVM classifier using the testing data.

Conclusion

SVMs are a potent machine learning technique that may be applied to both linear and non-linear classification applications in conclusion. They are widely employed in numerous industries and have numerous practical uses. In this post, we introduced SVMs, discussed their definition and syntax, and described the linear SVM and non-linear SVM Python implementation methodologies. Additionally, we offered the whole code for applying these methods to two datasets, showing their effectiveness using accuracy as a metric. Machine learning professionals can broaden their skill sets and use SVMs to solve a variety of real-world issues by learning about SVMs and how to implement them in Python.

Updated on: 12-Oct-2023

53 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements