SVMs can do linear classification, nonlinear classification, linear regression, etc., and are often compared with (non-neural-network) models such as logistic regression, linear regression, and decision trees.
Traditional linear classification: take the centroids of the two clusters of data and draw the perpendicular line between them (low accuracy) - left figure
SVM: fits not a single line but two parallel lines, making the street between them as wide as possible. It mainly focuses on the data points near the edges of the street (the support vectors), i.e. large margin classification - right figure
Before use, you need to scale the data set so that the decision boundary is not dominated by the features with the largest ranges.
However, some points must be tolerated on the wrong side of the boundary to improve generalization, i.e. soft margin classification.
In sklearn, the hyperparameter C controls this trade-off: the larger C, the lower the tolerance for margin violations; the smaller C, the higher the tolerance. C acts as a regularization term that controls the SVM's generalization ability and prevents overfitting. (Usually tuned with grid search.)
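For example, a minimal sketch of tuning C with GridSearchCV (the candidate values and 5-fold cross-validation below are illustrative assumptions, not from the text):

import numpy as np
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]
y = (iris["target"] == 2).astype(np.float64)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", LinearSVC(loss="hinge")),
])
param_grid = {"svc__C": [0.01, 0.1, 1, 10, 100]}  # assumed search range
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_)  # best C found by cross-validation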
SVM-specific loss function Hinge Loss
(LinearSVC is based on the liblinear library; it does not support kernel functions, but it is relatively simple and fast, with training complexity O(m * n))
Consistent with how SVMs work, only the vectors that fall near the decision boundary or cross it into the other class's region contribute to the loss, with either a linear penalty (hinge, loss="hinge") or a squared penalty (squared hinge, loss="squared_hinge").
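A minimal numpy sketch of the hinge loss (assuming labels t in {-1, +1} and raw decision scores s = w·x + b; the example values are made up):

import numpy as np

def hinge_loss(scores, targets):
    # Zero loss for points classified correctly and outside the margin (t * s >= 1);
    # linear penalty otherwise.
    return np.maximum(0.0, 1.0 - targets * scores)

# score 2.0 with target +1 -> loss 0; score 0.3 with target +1 -> loss 0.7
print(hinge_loss(np.array([2.0, 0.3]), np.array([1.0, 1.0])))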
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]                    # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)   # Iris virginica or not

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge")),
])
svm_clf.fit(X, y)
print(svm_clf.predict([[5.5, 1.7]]))
Classification of nonlinear data
There are two approaches: constructing high-dimensional (e.g. polynomial) features, and constructing similarity features.
Use high-dimensional features (the idea behind kernels): add squares, cubes, etc. of the original features to map the data into a higher-dimensional space.
from sklearn.preprocessing import PolynomialFeatures

polynomial_svm_clf = Pipeline([
    ("poly_features", PolynomialFeatures(degree=3)),
    ("scaler", StandardScaler()),
    ("svm_clf", LinearSVC(C=10, loss="hinge")),
])
polynomial_svm_clf.fit(X, y)
This kind of kernel trick greatly simplifies the model: there is no need to explicitly construct the high-dimensional features, yet much more complex situations can be handled.
But the more complex the model, the greater the risk of overfitting
SVC (based on the libsvm library; supports kernel functions, but is more complex and cannot handle very large data sets; training complexity O(m^2 * n) to O(m^3 * n))
SVC can be used directly with a polynomial kernel (coef0 controls the relative weight of high-degree versus low-degree terms).
from sklearn.svm import SVC

poly_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5)),
])
poly_kernel_svm_clf.fit(X, y)
Add similarity features
For example, the figure below places a Gaussian bell curve at landmarks for x1 and x2 respectively, computes each instance's similarity to them, and uses these similarity values as a new coordinate system (Gaussian RBF kernel, radial basis function).
Gamma (γ) controls how fat or thin the Gaussian curve is: the larger γ, the narrower the bell, so the distance between data points plays a stronger role.
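A minimal sketch of computing such similarity features by hand (the toy data, landmarks, and gamma value below are arbitrary assumptions, just to illustrate the transformation):

import numpy as np

def rbf_similarity(X, landmark, gamma):
    # Gaussian RBF: exp(-gamma * squared distance to the landmark)
    return np.exp(-gamma * np.sum((X - landmark) ** 2, axis=1))

X_demo = np.array([[-4.0], [-1.0], [0.0], [2.0]])   # 1-D toy data
landmarks = [np.array([-2.0]), np.array([1.0])]     # assumed landmarks
gamma = 0.3

# Each instance gets one new feature per landmark.
X_new = np.column_stack([rbf_similarity(X_demo, l, gamma) for l in landmarks])
print(X_new)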
rbf_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="rbf", gamma=5, C=0.001)),
])
rbf_kernel_svm_clf.fit(X, y)
The following are the effects of different gamma and C values.
SGDClassifier (supports massive / out-of-core data; time complexity O(m * n))
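A minimal sketch of a linear SVM-style classifier trained with stochastic gradient descent via SGDClassifier (loss="hinge" gives the linear SVM objective; the alpha value is an assumed choice):

import numpy as np
from sklearn import datasets
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]
y = (iris["target"] == 2).astype(np.float64)

sgd_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    # hinge loss => linear SVM objective, optimized incrementally with SGD
    ("sgd", SGDClassifier(loss="hinge", alpha=0.001, max_iter=1000, random_state=42)),
])
sgd_svm_clf.fit(X, y)
print(sgd_svm_clf.predict([[5.5, 1.7]]))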
SVM Regression
It tries to fit as many instances as possible inside the street (margin); the hyperparameter ε controls the street width: the larger ε, the wider the street.
Use LinearSVR
from sklearn.svm import LinearSVR

svm_reg = LinearSVR(epsilon=1.5)
svm_reg.fit(X, y)
Using SVR
from sklearn.svm import SVR

svm_poly_reg = SVR(kernel="poly", degree=2, C=100, epsilon=0.1)
svm_poly_reg.fit(X, y)
The weight vector w controls the width of the street by controlling the slope of the decision function h = wᵀx + b: the smaller ‖w‖, the wider the street. The goal is to make ‖w‖ as small as possible while keeping the number of instances that violate the margin as low as possible.
Hard margin linear SVM
optimize the target: minimize (1/2)·wᵀw over w, b
and guaranteed (subject to): t⁽ⁱ⁾ (wᵀ x⁽ⁱ⁾ + b) ≥ 1 for every instance i
Soft margin linear SVM
Add a new slack variable ζ⁽ⁱ⁾ ≥ 0 for each instance, with C as the regularization strength.
optimize the target: minimize (1/2)·wᵀw + C · Σᵢ ζ⁽ⁱ⁾ over w, b, ζ
and guaranteed (subject to): t⁽ⁱ⁾ (wᵀ x⁽ⁱ⁾ + b) ≥ 1 − ζ⁽ⁱ⁾ and ζ⁽ⁱ⁾ ≥ 0 for every instance i
The constraints are thus relaxed: individual instances may violate the margin, at the cost of only a small penalty C·ζ⁽ⁱ⁾.
Solved with the Lagrange multiplier method (the dual problem), where the α⁽ⁱ⁾ are the Lagrange multipliers; only the support vectors end up with α⁽ⁱ⁾ > 0.
Calculation results (from the dual solution):
ŵ = Σᵢ α⁽ⁱ⁾ t⁽ⁱ⁾ x⁽ⁱ⁾
b̂ = average over the support vectors (α⁽ⁱ⁾ > 0) of ( t⁽ⁱ⁾ − ŵᵀ x⁽ⁱ⁾ )
Kernelized SVM
Because, for the 2nd-degree polynomial mapping φ, φ(a)ᵀ φ(b) = (aᵀ b)².
Therefore, the dot product of the mapped (high-dimensional) vectors can be obtained by first doing the dot product in the original low-dimensional space and then squaring it, without ever performing the mapping.
This identity is the kernel trick: results that would otherwise require computation in the high-dimensional space can be calculated directly on the low-dimensional data.
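A minimal numpy sketch verifying this identity for the standard 2nd-degree mapping φ(x) = (x1², √2·x1·x2, x2²) (the example vectors are made up):

import numpy as np

def phi(x):
    # Explicit 2nd-degree polynomial mapping of a 2-D vector into 3-D space.
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

a = np.array([2.0, 3.0])
b = np.array([-1.0, 0.5])

lhs = phi(a) @ phi(b)      # dot product in the 3-D feature space
rhs = (a @ b) ** 2         # kernel computed in the original 2-D space
print(lhs, rhs)            # both print 0.25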
Several common kernels and their functions:
Linear: K(a, b) = aᵀ b
Polynomial: K(a, b) = (γ aᵀ b + r)ᵈ
Gaussian RBF: K(a, b) = exp(−γ ‖a − b‖²)
Sigmoid: K(a, b) = tanh(γ aᵀ b + r)
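A minimal numpy sketch of these kernel functions (the default parameter values and test vectors are arbitrary choices for illustration):

import numpy as np

def linear_kernel(a, b):
    return a @ b

def polynomial_kernel(a, b, gamma=1.0, r=1.0, d=3):
    return (gamma * (a @ b) + r) ** d

def rbf_kernel(a, b, gamma=1.0):
    return np.exp(-gamma * np.sum((a - b) ** 2))

def sigmoid_kernel(a, b, gamma=1.0, r=1.0):
    return np.tanh(gamma * (a @ b) + r)

a, b = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(linear_kernel(a, b), polynomial_kernel(a, b), rbf_kernel(a, b), sigmoid_kernel(a, b))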